REQUESTING SEETHE FROM BLIND ABOUT THE MATTER IF PRESENT
Orange website does what it does best
: In case it's not clear what's happening here (and from the comments it doesn't seem like it is), someone (not Meta) leaked the models and had the brilliant idea of advertising the magnet link through a GitHub pull request. The part about saving bandwidth is a joke. Meta employees may not have noticed, or are still figuring out how to react, so the PR is still up.
(Disclaimer: I work at Meta, but have no relationship with the team that owns the models and have no internal information on this)
: It's not even clear someone has leaked the models. A random person put a download link on a PR; it could be anything.
: >Meta employees may have not noticed or are still figuring out how to react
Given that the cat is out of the bag, if I were them, I would say it is now publicly downloadable under the terms listed in the form. It's great PR, which, if this was unintentional, is a positive outcome from a bad situation.
: Here is the magnet link for posterity: magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA
: Thanks, but it's not working for me... Not that I could run it if I downloaded it.
Based r-slur makes a PR about it on GitHub to "save bandwidth"
: lgtm *approves PR*
Jump in the discussion.
No email address required.
EDIT: I was wrong
BTW, if you want to run this (or other models), be aware that you'll need some heavy-duty hardware, if it's anything like the other model I looked into. You need enough VRAM to fit the entire model into memory, which is insanely expensive.
An 80GB A100 costs around $16,000!!!!
OpenAI uses eight A100s working together, which, at those costs, makes the entire array come out to $128,000!!!!
edit: Even renting compute is expensive. For OPT-175B, I calculated it would cost about $100 an hour.
Edit2: I was reading about LLaMA; looks like my calculations were off, because this isn't a top-of-the-line model, it's a smaller "foundation" model for researchers. However, it's still probably firmly out of reach of consumer electronics. I could be wrong though!
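The arithmetic above works out as a quick back-of-the-envelope script (the prices are the ballpark figures quoted in this thread, not authoritative):

```python
# Back-of-the-envelope hardware cost math from the thread.
# Figures are the thread's own ballpark numbers, not official pricing.
A100_80GB_PRICE = 16_000   # USD per 80GB A100, as quoted above
NUM_GPUS = 8               # "eight A100s working together"

array_cost = A100_80GB_PRICE * NUM_GPUS
print(f"8x A100 array: ${array_cost:,}")
```

which is where the $128,000 figure comes from.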
Nah, you could run it for 30 cents an hour (you can get 2x A40s on vast.ai now for that price, and the VRAM needed is about 1GB per billion parameters, so only 65GB for LLaMA). Foundation models (GPT-3 vs. InstructGPT) aren't any smaller, so if someone does instruction tuning/RLHF it won't change the calculations.
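The 1GB-per-billion-parameters rule of thumb assumes one byte per weight (i.e. 8-bit quantization); fp16 weights double it. A minimal sketch of that estimate:

```python
def vram_gb(params_billion: float, bytes_per_param: int = 1) -> float:
    """Rough VRAM needed just to hold the weights.

    1e9 params * N bytes/param is about N GB, so at int8 (1 byte)
    the rule of thumb is ~1GB per billion parameters.
    """
    return params_billion * bytes_per_param

print(vram_gb(65))       # LLaMA 65B at int8: ~65 GB
print(vram_gb(65, 2))    # same model at fp16: ~130 GB
print(vram_gb(175, 2))   # OPT-175B at fp16: ~350 GB
```

The last line roughly matches the 350GB figure people quote for OPT-175B elsewhere in this thread.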
You probably know more about this than I do; I guess I was off there. For OPT-175B, people said it required 350GB of VRAM to run, and I couldn't find any providers on vast that went that high.
75GB is not that bad, for real. Right now I'm only seeing a 2x A100 instance, which comes out to $2.51 an hour.
peanuts for institutional investors, too expensive for hobbyists
Never underestimate coomers
There's a thing called Petals which lets you run large models over multiple GPUs connected over the internet. It currently runs BLOOM (which sucks); maybe some 4chan waifu autists could make the same thing work for this model.
The Nvidia GPU alone is more than $16k, and I think the datacenter A100s are a specific Nvidia flavor as well.
Stumbled upon this the other day; not sure how good it is, and it's not the same as GPT and co., but it claims to be able to run on CPU with decent performance. https://github.com/BlinkDL/RWKV-LM
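The reason CPU inference is plausible for RWKV is that it replaces attention with a linear recurrence: generating each token needs only a small fixed-size state per layer instead of a key/value cache that grows with context. A heavily simplified sketch of that kind of recurrence (this omits RWKV's bonus and mixing terms, so it's the general idea, not the repo's exact formula):

```python
import math

def wkv_step(state, k, v, decay):
    """One step of a simplified RWKV-style recurrence.

    state = (num, den): exponentially decayed sums of e^k * v and e^k.
    Their ratio is a weighted average over the entire past, computed
    with O(1) memory no matter how long the sequence gets.
    """
    num, den = state
    num = math.exp(-decay) * num + math.exp(k) * v
    den = math.exp(-decay) * den + math.exp(k)
    return (num, den), num / den

# Feed a short (k, v) sequence; the state never grows.
state = (0.0, 0.0)
for k, v in [(0.1, 1.0), (0.2, 2.0), (0.0, 3.0)]:
    state, out = wkv_step(state, k, v, decay=0.5)
print(out)
```

Contrast with a transformer, where every generated token attends over all previous keys and values, so memory and compute per token grow with context length.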
https://colab.research.google.com
Someone will hack it so it works on a calculator, just watch. We gon find out it runs on spaghetti code.