
Logging every single Bluesky post :marseyevilgrin:

I'm logging every single Bluesky post.

Their API is designed like dogshit don't let the :marseytrain2:s that wrote it tell you otherwise.

It's like 10-15k posts per minute.

My log file is growing by like 100-200MB/hr of just text lol.
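Quick sanity check on those numbers, as a rough sketch. It assumes decimal megabytes and that the log is mostly one record per post. The helper names here are made up for illustration.

```python
MB = 1_000_000  # assumption: decimal megabytes, not MiB


def bytes_per_post(mb_per_hour, posts_per_minute):
    """Average bytes logged per post for a given log growth and post rate."""
    return mb_per_hour * MB / (posts_per_minute * 60)


def gb_per_day(mb_per_hour):
    """Daily log growth in GB if the hourly rate holds all day."""
    return mb_per_hour * 24 / 1000


# Extremes of the quoted ranges: slowest log growth at the highest post rate,
# and fastest log growth at the lowest post rate.
low = bytes_per_post(100, 15_000)
high = bytes_per_post(200, 10_000)

print(f"{low:.0f}-{high:.0f} bytes per post")
print(f"{gb_per_day(100):.1f}-{gb_per_day(200):.1f} GB/day")
```

So the quoted rates work out to somewhere around 111-333 bytes per logged post and roughly 2.4-4.8 GB per day, which is plausible for short text posts plus a bit of metadata.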

I don't get how they think Bluesky won't be used for AI training when there's an unauthenticated stream that lets you log absolutely everything.

I don't know if I'm violating the ToS because I don't care if I am.

Tell me if you want me to grep anything juicy.

:#marseyevilgrin:


!codecels behold this obscure :marseynonpotable: secret :marseyglow: hack https://docs.bsky.app/blog/jetstream
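For anyone who wants to try it, here is a minimal sketch of logging posts from Jetstream. Assumptions, not gospel: the endpoint host below is one of the public instances and may change, the event fields (`kind`, `commit`, `operation`, `collection`, `record`) are taken from my reading of the Jetstream docs, and `log_posts` needs the third-party `websockets` package. The function names are made up for this example.

```python
import json
from urllib.parse import urlencode

# Assumed public Jetstream instance; check the Jetstream docs for current hosts.
JETSTREAM_HOST = "wss://jetstream2.us-east.bsky.network/subscribe"
POST_COLLECTION = "app.bsky.feed.post"


def build_url(host=JETSTREAM_HOST, collections=(POST_COLLECTION,)):
    """Build a subscribe URL that filters the stream to the given collections."""
    query = urlencode([("wantedCollections", c) for c in collections])
    return f"{host}?{query}"


def extract_post(raw):
    """Return {did, rkey, text} for a newly created post event, else None."""
    event = json.loads(raw)
    if event.get("kind") != "commit":
        return None  # identity/account events carry no post record
    commit = event.get("commit") or {}
    if commit.get("operation") != "create":
        return None  # skip updates and deletes
    if commit.get("collection") != POST_COLLECTION:
        return None
    record = commit.get("record") or {}
    return {
        "did": event.get("did"),
        "rkey": commit.get("rkey"),
        "text": record.get("text", ""),
    }


def log_posts(path="posts.jsonl", url=None):
    """Append every new post to a JSON Lines file until the socket closes."""
    # Imported here so the parsing helpers work without the dependency.
    from websockets.sync.client import connect

    with connect(url or build_url()) as ws, open(path, "a", encoding="utf-8") as out:
        for raw in ws:
            post = extract_post(raw)
            if post is not None:
                out.write(json.dumps(post, ensure_ascii=False) + "\n")
```

Calling `log_posts()` would start appending to `posts.jsonl`. There is no reconnect or cursor handling here, so a dropped connection means missed posts.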


Reached out to the guy behind https://pullpush.io/ - @pullpush-actual

He's up for setting up something similar for Bluesky if you can find someone to cover hosting costs.


Google didn't show me this :marseyrain:

Twitter had a firehose API that only fancy people could use at high cost.

There's no way they maintain this long term.

It's ripe for extreme abuse. And the userbase is going to seethe hard when AI scrapers are using it lmao.

Every piece of text and every image is getting scraped and fed into LLM training in real time.

Additionally, the data coming in is pre-jannied, so even if it's getting censored later, somebody is going to do a report on what the users are actually trying to post, like fedposts etc.


Oh no, their posts that they're posting online on a public website are going to be used for AI.

:#marseysleep3:
