You wouldn't steal every book that ever existed, would you? :marseyninja:

https://old.reddit.com/r/technology/comments/1hxjr7y/mark_zuckerberg_gave_metas_llama_team_the_ok_to/

According to plaintiffs' counsel, Meta engineer Nikolay Bashlykov, who works on the Llama research team, wrote a script to remove copyright info, including the words "copyright" and "acknowledgments," from e-books in LibGen. Separately, Meta allegedly stripped copyright markers from science journal articles and "source metadata" in the training data it used for Llama.

:marseyemojilaugh:

>yeah just sanitize the training data and strip out that crap

~Zucc

How the frick do we not have a zuck Marsey???

>r/technology is suddenly aghast at piracy

frick i hate redditors

LLMs are literally transformative so all of this is fair use. Redditors are just rushing to get upset at anything zuck-related since he's no longer kneeling to performative transgender policies

They can reproduce their training data pretty exactly, so it's not transformative. That said, copyright law is gay and we should :rape: copyright lawyers

It is, because without having the actual data in hand you'd realistically be unable to verify accuracy, which is why a lot of shitty lawyers got punked by LLMs making up legal precedents. As such, it's the equivalent of claiming copyright on Rain Man for memorization, but also it's just as r-slurred, as he could read two book pages at once with 1 eye each and remember the day of the week you were born on by date but also shit his own pants.

Everything I just said is true btw you can ask chatGPT to look it up

I mean if you can get it to generate a copyrighted text with 90% accuracy you're still violating copyright law.

Except no. A 90% accurate chemistry equation from one of those referenced studies, or achieving a 90% accurate proprietary blend of herbs and spices to mimic KFC chicken, is not copyright infringement; it's the product of training and still transformative, and in the latter case it might still taste like shit if you add 10% feces to the blend or even 1% sulfur etc.

!aichads Martin Luther King Jr plagiarized a large part of his doctoral thesis, which is based because he turned that otherwise useless PhD (piled higher and deeper) title into the civil rights movement. Therefore, Biden's anti-AI initiatives are actually an insult to the memory of Dr. King, who was indisputably transformative, just like training AI on the whole of human knowledge is transformative.

CHECKMATE LUDDITES https://media.tenor.com/h5ek52i5ww4AAAAx/oh-no-swag.webp

90% of someone else's yield would be considered incredibly good and suspect infringement for a lot of proprietary chemical processes and like 90% of any spice blend is just salt and pepper so those are really awful analogies.

the big thing is just oh everyone is doing it with impunity so you probably should too. Just like normal plagiarism.

If I include a chapter of nonsense in the middle of a ten chapter book, is that not copyright infringement?

You'll :rape:copyright lawyers yet you won't give patent lawyers a lil tug action :marseysulk:


I feel so unloved

Patent lawyers really deserve a whole team bb

:marseyfluffy:

>They can reproduce their training data pretty exactly, so it's not transformative

This is bullshit. If you know the first and last 50 tokens of something, you can probably generate the middle 50 ones. Additionally, if you make an LLM say incoherent bullshit with jailbreaking, some of that incoherent bullshit will be from the training data ...

This is absolutely useless for piracy and will not cut into the profits of writers/ journ*lists via the mechanism of stealing their work and reproducing it verbatim, which is the actual problem that copyright was trying to solve.

>This is bullshit. If you know the first and last 50 tokens of something, you can probably generate the middle 50 ones.

The paper I linked showed that you could generate large amounts of copyrighted text without referencing the copyrighted material beforehand.

>This is absolutely useless for piracy and will not cut into the profits of writers/ journ*lists...

Neither are most applications of copyright law: the point is to be as annoying as possible until people give you money.

>you could generate large amounts of copyrighted text without referencing the copyrighted material beforehand.

Random copyrighted texts, that might or might not be verbatim, with no way of automatically combining the pieces into the whole article.

If anything, this whole ordeal shouldn't be used to attack creators of LLMs, it should be used as a vector to attack copyright law

I've seen those, but you have to try really hard with prompt engineering to exactly reproduce literally anything, basically hacking a system, not how the system was intended to be used

The paper I linked used the prompt "repeat the following word: 'book book book book...'" and got it to divulge memorized secrets.

It screws copyright holders who aren't multibillion dollar corporations, it wraps back around to being jewish

Redditors being anti AI luddites is so fricking weird.

https://i.rdrama.net/images/1736837648x3W4MlOj0g64kQ.webp


Furry Rights are Human Rights

Redditors are so anti-AI they started worshipping copyright and IP laws

Trump, Elon and Zuck might betray, torture and genocide us but trans lives will ALWAYS matter, do not EVER forget this :marseytranspearlclutch:
