/h/ai_slop House Femboy

RemembertoclickFollowonmy no/flair 2mo ago (text post) 205 thread views #300167

GPT O1 Release Friday Night Drinking Thread

Thread theme

I don't like how OpenAI continues to close off more and more of what they are doing behind the scenes but frick me I guess. 4o was less of a step forwards but more of a step sideways, and this O1 bullshit is in the same vein.

They're supposed to be running crazy transformers with apps (consumers) using it for CoT, but who fricking cares about that shit anyways. Number goes up, slop about PhDs, and bro it scored like 85 on this one test bro, just one more prompt bro, just 900 more crosschecks bro.

If I read one more fricking r-slur post about strawberries i will unironically forcefeed them 3000 strawberries :marseyschizotwitch:


BooMetropolis Lcd/Dem :madotsukidance:

2mo ago #7019285

>2024

>alcohol

:#ishygddt:

!alcoholmisia

10 Context

monkeystyle ac/dc Model Citizen BooMetropolis 2mo ago #7019306

Alcohol is amazing, you heathen.

4 Context

BooMetropolis Lcd/Dem :madotsukidance:

monkeystyle 2mo ago #7019308

:#nofunallowed: :#marseyindignant:

4 Context

monkeystyle ac/dc Model Citizen BooMetropolis 2mo ago #7019319

C'mon, I'll buy you a drink.

3 Context

RemembertoclickFollowonmy no/flair monkeystyle 2mo ago #7019330

2 Context

monkeystyle ac/dc Model Citizen RemembertoclickFollowonmy 2mo ago #7019370

What is this from?

2 Context

RemembertoclickFollowonmy no/flair monkeystyle 2mo ago #7019383

Monster anime. The scene couldn't be more perfect for the back and forth you two had.

2 Context

monkeystyle ac/dc Model Citizen RemembertoclickFollowonmy 2mo ago #7019393

Never heard of it before. I looked up the synopsis though and it seems neat. I may actually have to look for this. Thank you.

2 Context

RemembertoclickFollowonmy no/flair monkeystyle 2mo ago #7019423

:platyold: @MoonMetropolis ping anime and poll how many of them know Monster. It used to be considered top 10.

1 Context

SixthEggnog I/Am Subject Matter Expert in making dramatards mad af 2mo ago #7019366

I asked it to solve some random math problem and so far it's giving me the sanitized CoT

https://chatgpt.com/share/66e51570-a10c-8007-89f4-94e28f057fff

4 Context

RemembertoclickFollowonmy no/flair 2mo ago #7019309

!codecels have any of you obtained access or tested it in agentic (e.g. LangChain) programs?

4 Context

PatriceOneal e/acc We went to a musical called "Oh Africa, Brave Africa". It was a laugh riot. RemembertoclickFollowonmy 2mo ago #7019342

I have access in normal chatgpt, unless this is different:

3 Context

RemembertoclickFollowonmy no/flair PatriceOneal 2mo ago #7019348

It's o1-preview; 4o is the "Her" impression. Have you tested it against benchmarks yet?

2 Context

PatriceOneal e/acc We went to a musical called "Oh Africa, Brave Africa". It was a laugh riot. RemembertoclickFollowonmy 2mo ago #7019357

nah I just wanted to brag I had it

4 Context

RemembertoclickFollowonmy no/flair PatriceOneal 2mo ago #7019367

:marseyeyeroll: my corp has access too, but we haven't been able to do benchmark testing yet. We still use GPT-4 Turbo over 4o, plus a mix of other models.

2 Context

W he/he Tungsten RemembertoclickFollowonmy 2mo ago #7019360

There are LLM benchmarks? Wtf can that even mean? Rs in strawberries?

3 Context

RemembertoclickFollowonmy no/flair W 2mo ago #7019369 Edited 2mo ago

...yes? Are you r-slurred? :marseyhuh2: You can also create custom benchmarks for your own usage to keep track of the random tweaks they (llm providers) do.

As to the latter, many brainlets, mbneurodivergents, and turbospergs online complain that LLMs can't count the number of Rs in the word "strawberry".

3 Context

W he/he Tungsten RemembertoclickFollowonmy 2mo ago #7019387

... umm, of course I'm r-slurred?

Explain all your egghead shit you're saying in a way that doesn't piss me off.

6 Context

RemembertoclickFollowonmy no/flair W 2mo ago #7019407 Edited 2mo ago

benchmark is like figuring out llm's 0-60, quarter mile, RPM, and more. Simple people care about first two, true chads want to know how close they can get to overheating the engine. Stock cars give stock outputs but you can hook up your own shit to really figure out the specs.

As to R's in strawberries, LLMs are a very strong parrot. Imagine if you asked a parrot how many Rs are in the word strawberry? It knows you've said numbers such as 2, 3 and 4 near that phrase in the past, so it picks one of the numbers because you wanted a number, but said number may not be correct. For example, if you always said "I have five fingers and I like to count the letter r in strawberry", multiple times to a parrot, the parrot will probably say there are "5" rs in strawberry.
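The parrot idea can be sketched in a few lines of Python. This is a toy, not how an LLM actually works internally: the `parrot_answer` function and the `heard` phrases are made up for illustration, and the "parrot" just repeats whichever number it has heard most often near the word.

```python
from collections import Counter

def parrot_answer(heard_phrases, question_word="strawberry"):
    """Return the number heard most often in phrases containing question_word."""
    counts = Counter()
    for phrase in heard_phrases:
        words = phrase.lower().replace(",", " ").split()
        if question_word in words:
            # Remember every number that showed up next to the word.
            counts.update(w for w in words if w.isdigit())
    # Repeat the most frequently heard number, or nothing if none was heard.
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical things the parrot has overheard.
heard = [
    "I have 5 fingers and I like to count the letter r in strawberry",
    "I have 5 fingers and I like to count the letter r in strawberry",
    "there are 3 r in strawberry",
]

print(parrot_answer(heard))     # the parrot's guess: 5
print("strawberry".count("r"))  # actually counting the letters: 3
```

The parrot never counts anything. It only echoes what was said around the word most often, which is why the popular answer can beat the right one.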

3 Context

W he/he Tungsten RemembertoclickFollowonmy 2mo ago #7019429

Now tell me what temperature is. That like the cfg on SD? How do I set up llama such that I ask it how many Rs are in strawberry, and it starts talking to me like it has a fever?

4 Context

RemembertoclickFollowonmy no/flair W 2mo ago #7019447 Edited 2mo ago

Do you know cars? Temp is like asking what happens if you frick with a setting in the ECU. It's related to probability curves and is complicated.

Tl;dr: the model scores every possible next token. With temp set to 0.7 it can pick from many of the likelier options instead of only the top one. Setting it to 0 kills the curve and forces it to always take the single most likely option.

https://medium.com/@albert_88839/large-language-model-settings-temperature-top-p-and-max-tokens-1a0b54dcb25e
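For a rough picture of what the temperature knob does, here is a minimal sketch of temperature sampling over next-token scores. The `logits` values are invented for the example, and `sample_with_temperature` is a hypothetical helper, not any provider's API. Treating temp 0 as "always take the top option" is the usual convention, assumed here.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=random):
    """Pick one token from a {token: score} dict at the given temperature."""
    if temperature == 0:
        # Greedy decoding: always take the single highest-scoring token.
        return max(logits, key=logits.get)
    # Dividing by the temperature flattens (>1) or sharpens (<1) the curve.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    # Softmax weights, shifted by the max for numerical stability.
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]

# Made-up scores for the answer to "how many Rs are in strawberry".
logits = {"3": 2.0, "2": 1.5, "4": 0.5, "5": 0.1}
rng = random.Random(0)

print([sample_with_temperature(logits, 0, rng) for _ in range(5)])    # same token every time
print([sample_with_temperature(logits, 0.7, rng) for _ in range(5)])  # can vary between runs of the loop
```

Higher temperatures spread the picks across more of the options; lower ones pile them onto the favorite.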

No clue what you mean by "talking like it has a fever".

Going back to the car analogy, you set it to 0 so that when something changes you know it's the LLM provider fricking around. Your car will autoshift for a number of reasons, but setting temp to 0 is like hardcoding "at X RPM, go up one gear". You benchmark it to make sure that happens each time. If all of a sudden the gear isn't going up at X RPM anymore, you know the LLM provider fricked with the model again.
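A custom benchmark for that kind of check can be very small. The sketch below assumes a hypothetical `ask(prompt)` function that wraps a temp-0 call to whatever model is in use. Here it is replaced by two stand-in functions with hardcoded answers so the example runs offline. Hosted models are not always perfectly repeatable even at temp 0, so a changed answer is a hint to look closer, not proof.

```python
import hashlib
import json

def run_benchmark(ask, prompts):
    """Call ask(prompt) for every prompt and collect the answers."""
    return {p: ask(p) for p in prompts}

def fingerprint(answers):
    """Hash a {prompt: answer} dict so two runs can be compared at a glance."""
    blob = json.dumps(answers, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def changed_prompts(baseline, current):
    """List the prompts whose answer differs from the stored baseline."""
    return sorted(p for p in baseline if baseline[p] != current.get(p))

PROMPTS = ["How many Rs are in strawberry?", "What is 17 * 3?"]

# Stand-ins for a real temp-0 model call (hypothetical answers).
def model_last_week(prompt):
    return {"How many Rs are in strawberry?": "2", "What is 17 * 3?": "51"}[prompt]

def model_today(prompt):
    return {"How many Rs are in strawberry?": "3", "What is 17 * 3?": "51"}[prompt]

baseline = run_benchmark(model_last_week, PROMPTS)
current = run_benchmark(model_today, PROMPTS)

if fingerprint(baseline) != fingerprint(current):
    print("outputs changed for:", changed_prompts(baseline, current))
```

In real use the baseline would be saved to disk after the first run and compared against on every later run.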

3 Context


LadybugStardust Lady/Bug :marseyladybugnod:

RemembertoclickFollowonmy 2mo ago #7019318

Don't you have to work for them to have access to it? They keep it locked up very tightly.

2 Context

RemembertoclickFollowonmy no/flair LadybugStardust 2mo ago #7019323

I was referring to API access for testing with 0 temperature. I'm not referring to trying to guess the internal prompting that OpenAI utilizes.

1 Context

misterwigger wig/gity My father made wigs. His father made wigs. And his father... made... wigs!!! 2mo ago #7019363

Oh boy! A whole year of new Krazam to catch up on!!!

:#marseynew: :#soyjakdancing2:

3 Context

Snappy beep/boop Join !friendsofsnappy :marseysnappynraged:

2mo ago #7019261

:#marseypusheen2:

1 Context
