
[Massive Cope] Why Twitter Didn’t Go Down: From a Real Twitter SRE

https://matthewtejo.substack.com/p/why-twitter-didnt-go-down-from-a

:#marseyelonpaypig:

So here we go. Twitter supposedly dumped 80% of its workforce, and ex-employees and the media predicted it would collapse in on itself. We have all seen the articles ad nauseam at this point. Well, we have another article from someone who supposedly did some worky at Twitter.

Twitter supposedly lost around 80% of its workforce. Whatever the real number is, there are whole teams without engineers on them now. Yet the website goes on and the tweets keep coming. This left a lot of people wondering what exactly was going on with all those engineers and made it seem like it was all just bloat. I’d like to explain my little corner of Twitter (though it wasn’t so little) and some of the work that went on that kept this thing running.

For five years I was a Site Reliability Engineer (SRE) at Twitter. For four of those years I was the sole SRE for the Cache team. There were a few before me, and a bunch of people came and went on the team I worked with. But for four years I was the one responsible for automation, reliability and operations in the team. I designed and implemented most of the cowtools that are keeping it running, so I think I’m qualified to talk about it. (There might be only one or two other people.)

So basically this one guy and maybe 2 other people designed all the automation cowtools to help keep Twitter afloat. It gets more technical but, let's admit it, almost none of us really care.

This Substack post generated a 1,000-comment thread on HN:

Why Twitter Didn’t Go Down: From a Real Twitter SRE

Top comment

This left a lot wondering what exactly was going on with all those engineers and made it seem like it was all just bloat.

I was partly expecting the rest of the article to explain to me why exactly it wasn't just bloat. But it goes on talking about this 1~3-person cache SRE team that built solid infra automation that's really resilient to both hardware and software failures. If anything, the article might actually persuade me that it was all bloat.

This is where my :marseybrainlet: went as well. If 1-3 people can automate something that kept Twitter up and running, what the frick were the other 6,077 employees doing?

Twitter had 7,500 employees. Most of the roles you mention (security engineers, code reviewers, data architects, HR, internal audit teams, content moderators, scrum masters) are not bloat. So the question is: what were the other 7,000 people doing?

They had over a thousand moderators. So maybe your estimate of how many people is required is a bit off.

Twitter had a lot of people in marketing.

1-3 people doing SRE engineering to keep Twitter automated vs 1,000 moderators. :marseycatgirljanny:

I think the real question is: Twitter grew 3x on the headcount front with a flat stock price over the course of less than 5 years. What exactly were these thousands of employees actually doing, and why did the previous CEO think what they were doing was worth hiring them for? That's just basic accountability from a stockholder or employee perspective. That's apparently a ton of money being wasted on nothing at all.

Twitter used to experience significant downtime compared to all other major platforms, and one of the reasons was its lack of redundancy across everything. Headcount is one such thing, and it takes manpower to automate infrastructure as discussed in the post.

Sure, you can run the platform with 1/10 headcount with significantly degraded user experiences (say 98%). This is not a problem for startups, but people usually have higher expectations for established companies. As always, the last 2% is a hard problem, and businesses don't really want to deal with such an unreliable platform. You wanna onboard big advertisers who potentially spend $100M ARR? Then you need to assign a dedicated account manager to handle all customer escalations. PMs then triage and plan their feature requests and later engineers implement them. Which all adds up.

And they also use your competitors' products, like Google, FB, TikTok, etc. Twitter is a severe underdog here, so you need to support at least a minimal, essential subset of the features in those products to convince them to spend their money on Twitter. That alone takes hundreds of engineers, data scientists and PMs, thanks to modern ad serving stacks with massive complexity...

Does your product keep crashing? Just hire people. That is all you need to do. See that bum on the corner? Increased headcount. Bam. Solved. Yeah, Twitter is lacking in staff compared to Google, which runs a multitude of other products, or Facebook, which runs a multitude of other products. I keep seeing this weird misconception that Elon fired 80% of Twitter and that the 80% were all engineers and coders. :marseyshrug:

No matter if it was or not and for better or worse: If Twitter survives this without any major harm it will have profound consequences for the whole software industry.

:marseydeadinside2:

There are many, many posts like these basically arguing that "it runs fine now, and fine for the nth week, but :marseysal:" I'm lazy and won't link.

The most helpful thing to reflect on in these Twitter operational discussions is the difference between homeostasis and evolution. You can get rid of 80% of the work force and the existing homeostasis systems will keep things running smoothly despite known day-to-day chaos. Where you’re really going to run into trouble is inventing responses to novel chaos and gradually changing times.


I agree those were odd takes. I've likened firing most of the engineers to taking your hands off the wheel in a car. It won't crash immediately, but that doesn't mean the car can go driverless. With that said, there are differences between internal systems and something like Twitter on the public internet. I assume that Twitter is a system under constant attack. What happens when the next Log4Shell-level vulnerability comes out?


All this does is point out that smart people worked at Twitter who may now no longer work there, whether of their own accord or due to Elon’s bulldogging tactics.

Elon thinks he knows what he’s doing, but what he is going to be left with are people who are willing to work hard by his standards, but not necessarily smart.

The simple truth is Elon knows nothing about the actual work involved in tech. He knows words or elicits help from others on what to say that sounds like tech speak (RPCs!), but when it comes to being truly knowledgeable in this space, he is losing his most valuable assets because of his amazingly poor managerial and ownership style.

I know there are a lot of Elon fans on this site who will disagree with all of this, but his abilities have not at all been proven. Yes, he knows how to spend money to claim credit for technical advances, but until he actually has his hands dirty in the muck of the hard work of tech, he will always be a glorified self-promoter with no substance.

And Twitter will suffer for it.


I think people's expectations became so exaggerated it was inevitable they wouldn't be lived up to. I'm sure Twitter will experience degradation from the drastic cost-cutting, but it was never going to happen overnight and I'm not sure why news outlets were saying that (except that their sources were employees with slightly inflated senses of their own importance, which we're all sometimes guilty of). And people became really invested in the idea that a site cannot possibly stay up without dedicated SREs, as though tons of tech sites (including big names like Amazon) don't just devolve this work to their on-call rotations.


Now to :chudsey: talking about things being overblown

Because politics and ideologues are so prevalent and everyone's attacking 'the other' (both sides) for the own. And of course people in 'important' roles who've been laid off are going to say the company is doomed; these are probably the worst sources to go off. They're not going to say "oh yeah I didn't do much at all really, just bossed people around and spent my budget every year."

It's actually scary how many people, even engineers, put their reputation on the line saying Twitter wouldn't survive the weekend. It wasn't just Twitter employees. It's like a mass psychosis of some kind. It comes off as a kind of desperation, as though they need Elon to fail. Why? What's driving that response?


It wasn't only news outlets. A majority of my... politically-noisy tech friends on Facebook went through a recent phase of intensely posting about Twitter being on fire and collapsing. Now just because they're "in tech" doesn't mean they have any idea about Twitter, but they should at least know enough to know they don't know what's going to happen. Obviously they're not actually using their brains when posting comments like that. Point is, a lot of people opposed to Musk have been participating in a spiraling echo chamber of fairy tales and wishful thinking; it's not just journ*lists (although clearly they're printing lies with ulterior motives too, as usual).

I'm unable to find HNers discussing how the media sourcing recently fired employees is probably not the best unbiased source of information, but there are too many words.

Me rn :

:#!marseygiveup:

:#marseyyawn:

:#marseysleep:


lol, lmao
