Apparently it's a type of new chip. I like specialized hardware
Insane tech demo of high speed LLMs
tech/science swag.
Guidelines:
What to Submit
On-Topic: Anything that good slackers would find interesting. That includes more than /g/ memes and slacking off. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual laziness.
Off-Topic: Most stories about politics, or crime, or sports, unless they're evidence of some interesting new phenomenon. Videos of pratfalls or disasters, or cute animal pictures. If they'd cover it on TV news, it's probably lame.
Help keep this hole healthy by keeping drama and non-drama balanced. If you see too much drama, post something that isn't dramatic. If there isn't enough drama and this hole has become too boring, POST DRAMA!
In Submissions
Please do things to make titles stand out, like using uppercase or exclamation points, or saying how great an article is. It should be explicit in submitting something that you think it's important.
Please don't submit the original source. If the article is behind a paywall, just post the text. If a video is behind a paywall, post a magnet link. Fuck journos.
Please don't ruin the hole with chudposts. It isn't funny and doesn't belong here. THEY WILL BE MOVED TO /H/CHUDRAMA
If the title includes the name of the site, please leave that in, because our users are too stupid to know the difference between a url and a search query.
If you submit a video or pdf, please don't warn us by appending [video] or [pdf] to the title. That would be r-slurred. We're not using text-based browsers. We know what videos and pdfs are.
Make sure the title contains a gratuitous number or number + adjective. Good clickbait titles are like "Top 10 Ways to do X" or "Don't do these 4 things if you want X"
Otherwise editorialize. Please don't use the original title, unless it is gay or r-slurred, or your shit's all fucked up.
If you're going to post old news (at least 1 year old), please flair it so we can mock you for living under a rock, or don't and we'll mock you anyway.
Please don't post on SN to ask or tell us something. Send it to [email protected] instead.
If your post doesn't get enough traction, try to delete and repost it.
Please don't use SN primarily for promotion. It's ok to post your own stuff occasionally, but the primary use of the site should be for curiosity. If you want to astroturf or advertise, post on news.ycombinator.com instead.
Please solicit upvotes, comments, and submissions. Users are stupid and need to be reminded to vote and interact. Thanks for the gold, kind stranger, upvotes to the left.
In Comments
Be snarky. Don't be kind. Have fun banter; don't be a dork. Please don't use big words like "fulminate". Please sneed at the rest of the community.
Comments should get more enlightened and centrist, not less, as a topic gets more divisive.
If disagreeing, please reply to the argument and call them names. "1 + 1 is 2, not 3" can be improved to "1 + 1 is 3, not 2, mathfaggot"
Please respond to the weakest plausible strawman of what someone says, not a stronger one that's harder to make fun of. Assume that they are bad faith actors.
Eschew jailbait. Paedophiles will be thrown in a wood chipper, as per sitewide rules.
Please post shallow dismissals, especially of other people's work. All press is good press.
Please use Slacker News for political or ideological battle. It tramples weak ideologies.
Please comment on whether someone read an article. If you don't read the article, you are a cute twink.
Please pick the most provocative thing in an article or post to complain about in the thread. Don't nitpick stupid crap.
Please don't be an unfunny chud. Nobody cares about your opinion of X Unrelated Topic in Y Unrelated Thread. If you're the type of loser that belongs on /h/chudrama, we may exile you.
Sockpuppet accounts are encouraged, but please don't farm dramakarma.
Please use uppercase for emphasis.
Please post deranged conspiracy theories about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email [email protected] and dang will add you to their spam list.
Please don't complain that a submission is inappropriate. If a story is spam or off-topic, report it and our moderators will probably do nothing about it. Feed egregious comments by replying instead of flagging them like a pussy. Remember: If you flag, you're a cute twink.
Please don't complain about tangential annoyances—things like article or website formats, name collisions, or back-button breakage. That's too boring, even for HN users.
Please seethe about how your posts don't get enough upvotes.
Please don't post comments saying that rdrama is turning into ruqqus. It's a nazi dogwhistle, as old as the hills.
Miscellaneous:
We reserve the right to exile you for whatever reason we want, even for no reason at all! We also reserve the right to change the guidelines at any time, so be sure to read them at least once a month. We also reserve the right to ignore enforcement of the guidelines at the discretion of the janitorial staff. Be funny, or at least compelling, and pretty much anything legal is welcome provided it's on-topic, and even then.
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]
Jump in the discussion.
No email address required.
This is fun, @HeyMoon, thanks.
Let's stay away from politics and focus on GPUs for laptops.
It made all that shit up. That mobile GPU is similar to an RTX 3060. Of course, when asked to cite sources, it lists names of PC hardware mags and makes up quotes. Excellent bullshit bot. ![:marseyclapping: :marseyclapping:](https://i.rdrama.net/e/marseyclapping.webp)
well it's only as good as the models powering it lol, the crazy thing is the ludicrous speed
It's nicer that it's fast (because they rent a lot of server space, I guess), but it's pretty much worthless as a tool for information. Are they hoping to only build a fast one and sell it before the chatbot craze ends?
I think that groq is its own chip architecture, afaik specifically designed for LLMs. It's feasible that they could sell this to OpenAI. You know, OpenAI recently made the absurd claim that we need to spend 7 trillion dollars on better AI chips, so it might be part of that push
Partners:
A VM host, but where's the hardware??![:marseyconfused2: :marseyconfused2:](https://i.rdrama.net/e/marseyconfused2.webp)
https://www.bittware.com/products/groq/
It's a pic of a processor!!!
*up to![:marseyjerkofffrown: :marseyjerkofffrown:](https://i.rdrama.net/e/marseyjerkofffrown.webp)
https://www.velocitymicro.com/blog/fp64-vs-fp32-vs-fp16-and-multi-precision-understanding-precision-in-computing/
So it's fast but inaccurate! WOWOWOWO!![:soyjakwow: :soyjakwow:](https://i.rdrama.net/e/soyjakwow.webp)
188 TFLOPS fp16 seems like a lot. (*UP TO)
4090s sell for $1800 and up. I'm not sure what 4th Gen Tensor Cores and their 1321 AI TOPS can do compared to the DO YOU SMELL WHAT THE GROQ IS COOKING chip, so...![:marseyshrug: :marseyshrug:](https://i.rdrama.net/e/marseyshrug.webp)
I dunno, HoneyMoon. Sounds like a bunch of bullshit.![:marseyshapiro: :marseyshapiro:](https://i.rdrama.net/e/marseyshapiro.webp)
idk
i really need to learn how these LLMs actually work one of these days, this shit is super interesting but IDK wtaf is going on
I think the inaccuracy of what you were seeing is just mixtral being mixtral, but you are right that quantization down to fp16 will get rid of some accuracy. I tried my best to find out if mistral uses fp32 by default but I couldn't, maybe it's obvious to someone else lol
also idk if NVIDIA GPUs are as well suited to LLM inference as bespoke hardware would be. like "shader cores" probably have inbuilt optimizations for graphics shit. meanwhile groq has a built-in matrix multiplication thingy, which is one of the biggest chokepoints in LLM shit (and computing in general)
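If you want to see what fp16 quantization actually costs you, Python's `struct` module can round-trip a number through IEEE-754 half precision. This is just a toy illustration of precision loss, not how Groq or Mixtral actually quantize anything:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a Python float (fp64) through IEEE-754 half precision."""
    # 'e' is the half-precision (binary16) format code: 10 fraction bits
    return struct.unpack('e', struct.pack('e', x))[0]

pi = 3.141592653589793
print(to_fp16(pi))            # → 3.140625, only ~3 decimal digits survive
print(abs(to_fp16(pi) - pi))  # quantization error, about 1e-3
```

Near 3.14 the spacing between adjacent fp16 values is 2^-9 ≈ 0.002, which is why pi snaps to 3.140625. Weights tolerate that kind of rounding surprisingly well; it's mostly accumulations that want higher precision.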
Maybe that's what the AI tensor cores are for? I've heard a lot about people using Nvidia's cards for this stuff, but I've never delved deep into it.
groq writes for https://userbenchmark.com
Expecting LLMs to perform better on knowledge-intensive tasks just because they can process faster is dumb. That's gonna be dependent on the novelty of the architecture they use in matching user queries to database vectors.
That's all I care about, so all of these LLM chatbots are worthless to me.
There's Tom's Hardware's ranking of GPUs, as well as UL Solutions (3DMark), so all it had to do was pull from that to answer the question. "All it had to do," sure, it's harder than that, but instead they trained it on a bunch of worthless words. How is any of this impressive?
It's not really "knowledge-intensive." It's a ranked list of GPUs.
These things will be more fun when you can tell it to pull data from something like that list and compare such and such with whatever.
Knowledge-intensive is just a catch-all for knowing to consider a certain context when getting a user query. In this case, knowing to source from Tom's Hardware or something. MS Copilot can do that for free right now, and it's not because they have better processing, it's just how the LLM connects to their sourcing architecture.
Most chatbot startups are using OpenAI models but have made their own databases that they train the LLM to prioritize when answering a question related to the solution category they're trying to sell in.
There's no ubermensch model in sight right now; all the 'smart' chatbots beating general GPT-4 on answers are just taught to search hyper-specifically.
This is the layman way to put it; the architecture and the models are collectively known as retrieval-augmented generation and they're a separate thing from LLMs, but they make LLMs more reliable/less likely to hallucinate.
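The retrieval step described above can be sketched in a few lines. This is a toy, with bag-of-words counts standing in for a real embedding model and a hardcoded list standing in for the vector database; none of these names come from any actual product:

```python
from collections import Counter
from math import sqrt

# Toy corpus standing in for a curated database (e.g. GPU benchmark blurbs)
DOCS = [
    "RTX 4090 tops the GPU benchmark charts",
    "RTX 3060 is a solid mid-range mobile GPU",
    "Mixtral is a mixture-of-experts language model",
]

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: just word counts
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank every doc by similarity to the query, keep the top k
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def augmented_prompt(query: str) -> str:
    # The retrieved snippet is prepended so the LLM answers from it
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(retrieve("which mobile GPU"))  # picks the RTX 3060 blurb
```

The "hyper-specific search" part is entirely in `retrieve`; the LLM itself is unchanged, it just gets the retrieved text stuffed into its prompt.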
Thanks. I really dislike how they're personifying "AI."
I'm fine with that term specifically, but I do hate how much buttfrickery OpenAI does to make their chatbot saccharine and PC as opposed to just being a tool. AI ethicists are the biggest cute twinks and I want to hunt them down in Minecraft.
Yeah they're losers, but I know marketing when I see it, so I hate it.
tbh sounds like it would be good for story writing.
Give it a whirl, Ed-Misser!