4chan user leaks Facebook's LLaMA, leaves personally identifiable information in the torrent

https://archived.moe/g/thread/91848262#p91850335

REQUESTING SEETHE FROM BLIND ABOUT THE MATTER IF PRESENT

Orange website does what it does best

:marseynerd:: In case it's not clear what's happening here (and from the comments it doesn't seem like it is), someone (not Meta) leaked the models and had the brilliant idea of advertising the magnet link through a GitHub pull request. The part about saving bandwidth is a joke. Meta employees may not have noticed, or they're still figuring out how to react, so the PR is still up.

(Disclaimer: I work at Meta, but have no relationship with the team that owns the models and have no internal information on this)

:marseynerd2:: It's not even clear someone has leaked the models. A random person put a download link on a PR; it could be anything.

:!marseynerd2:: >Meta employees may not have noticed, or they're still figuring out how to react

Given that the cat is out of the bag, if I were them, I would say the models are now publicly downloadable under the terms listed in the form. It's great PR, which, if the leak was unintentional, is a positive outcome from a bad situation.

:marseypirate:: Here is the magnet link for posterity: magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA

:marseygigaretard:: Thanks, but it's not working for me... Not that I could run it if I downloaded it.


Based r-slur makes a PR about it on GitHub to "save bandwidth"

:marseyneko:: lgtm *approves PR*

:marseynotes:: Good catch! This will save millions in bandwidth costs.

EDIT: I was wrong

BTW, if you want to run this (or other models), be aware that you'll need some heavy-duty hardware, if it's anything like the other model I looked into. You need enough VRAM to fit the entire model in memory, which gets insanely expensive.

An 80GB A100 costs around $16,000!!!!

OpenAI uses eight A100s working together, which, at those costs, makes the entire array come out to $128,000!!!!

Edit: Even renting compute is expensive. For OPT-175B, I calculated it would cost about $100.00 an hour to run.

Edit 2: I was reading about LLaMA, and it looks like my calculations were off, because this isn't a top-of-the-line model; it's a smaller "foundation" model for researchers. It's still probably firmly out of reach of consumer hardware, though I could be wrong!
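
Back-of-the-envelope sketch of those numbers, if anyone wants to check my math (the $12.50/hr per-GPU rental rate is an assumption I picked so the ~$100/hr figure works out, not a quoted price):

```python
# Rough cost math for the figures above; prices are this post's
# estimates, not official quotes.
A100_80GB_PRICE_USD = 16_000   # approximate price per 80GB A100
NUM_GPUS = 8                   # the 8-GPU setup mentioned above

print(f"Buying:  ${NUM_GPUS * A100_80GB_PRICE_USD:,}")      # Buying:  $128,000

# Assumed ~$12.50/hr per rented 80GB A100, which lands on the
# ~$100/hr figure for a node big enough to hold OPT-175B.
RENTAL_PER_GPU_HR = 12.50
print(f"Renting: ${NUM_GPUS * RENTAL_PER_GPU_HR:.2f}/hr")   # Renting: $100.00/hr
```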

Nah, you could run it for 30 cents an hour (you can get 2xA40s on vast.ai right now for that price, and the GPU memory necessary is about ~1GB per billion parameters, so only 65GB for LLaMA). Foundation models (GPT-3 vs InstructGPT) aren't any smaller, so if someone does instruction tuning/RLHF it won't change the calculations.

:#marseyshrug:

You probably know more about this than I do. I guess I was off on the part where you said

>the GPU memory necessary is about ~1GB per billion parameters

For OPT-175B, people said it required 350GB of VRAM to run, and I couldn't find any providers on vast.ai that went that high.

65GB is not that bad fr. Now I'm only seeing a 2xA100 instance atm, which comes out to $2.51 an hour.
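
Both VRAM figures can actually be right at once: the usual rule of thumb is ~2 bytes per parameter in fp16 and ~1 byte with 8-bit weights, so the ~1GB-per-billion number assumes 8-bit, while the 350GB OPT-175B number is fp16. Quick sketch (weights only; real runs need extra headroom for activations):

```python
# Weights-only VRAM estimate; ignores activation/KV-cache overhead.
def weight_vram_gb(params_billion: float, bytes_per_param: float) -> float:
    # 1e9 params * N bytes/param ~= N GB per billion parameters
    return params_billion * bytes_per_param

print(weight_vram_gb(65, 1))   # 65.0  -> LLaMA-65B at 1 byte/param (8-bit)
print(weight_vram_gb(175, 2))  # 350.0 -> OPT-175B at 2 bytes/param (fp16)
print(weight_vram_gb(65, 2))   # 130.0 -> LLaMA-65B in fp16 instead
```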

peanuts for institutional investors, too expensive for hobbyists :marseyitsover:

too expensive for poors

Never underestimate coomers

There's a thing called Petals which lets you run large models across multiple GPUs connected over the internet. It currently runs BLOOM (which sucks); maybe some 4chan waifu autists could stand up the same thing for this model.
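
For reference, their BLOOM example looks roughly like this; a minimal sketch, with the class and checkpoint names taken from the Petals README around this time (they may have changed since):

```python
# Minimal Petals sketch based on the project's BLOOM example.
# Layers are served by volunteer GPUs over the internet, not loaded locally.
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

model_name = "bigscience/bloom-petals"
tokenizer = BloomTokenizerFast.from_pretrained(model_name)
model = DistributedBloomForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A leaked model walks into a bar", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```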

The NVIDIA GPU alone is more than $16k, and I think the datacenter A100s are a specific NVIDIA flavor as well.

Stumbled upon this the other day; not sure how good it is, and it's not the same as GPT and co., but it claims to be able to run on CPU with decent performance: https://github.com/BlinkDL/RWKV-LM
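
The reason CPU inference is even plausible is that RWKV is recurrent: each token updates a fixed-size state instead of attending over the whole context, so per-token compute and memory stay flat. Toy sketch of that shape (not RWKV's actual time-mix/channel-mix update, just the general recurrent idea):

```python
import numpy as np

d = 8  # toy hidden size
rng = np.random.default_rng(0)
W_in = rng.normal(0, 0.1, (d, d))
W_state = rng.normal(0, 0.1, (d, d))

def step(state, token_embedding):
    # Fixed-size state update: per-token cost doesn't grow with context,
    # unlike attention, whose KV cache grows with every new token.
    return np.tanh(W_in @ token_embedding + W_state @ state)

state = np.zeros(d)
for _ in range(1000):            # 1000 tokens, constant memory throughout
    state = step(state, rng.normal(0, 1, d))
print(state.shape)               # (8,) no matter how long the sequence is
```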

Someone will hack it so it works on a calculator, just watch. We gon find out it runs on spaghetti code.
