Hugging Face dev builds a Bluesky post dataset for ML training, announces it on Bluesky, sets off an absolute seethefest among the AIphobic, and gets bullied into taking it down and apologizing :marseyxd:

https://bsky.app/profile/danielvanstrien.bsky.social/post/3lbvih4luvk23

I've removed the Bluesky data from the repo. While I wanted to support tool development for the platform, I recognize this approach violated principles of transparency and consent in data collection. I apologize for this mistake.

Daniel van Strien (@danielvanstrien.bsky.social) 2024-11-27T02:19:57.958Z

These people do realize there's a public firehose API where you can collect every post, right? I collected about 20M myself before I got bored and stopped.


lol, did you know you can just ignore the T.O.S.!?
