Maggie Johnson-Pint (@maggiepint) / X

@maggiepint

Dog Person. DateTime weirdo. These days I work on planes. Forever ❤️JS. She/her @maggie.bsky.social @[email protected]

Woodinville, WA

Joined July 2014

Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
This is weird to say out loud, but I actually am kinda an expert in rate limiting, so I'm gonna explain some stuff. About half of incidents in large-scale production systems involve having more requests than you can serve. There are two categories of this kind of incident:
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
1. Top-down overload or "Reddit Hug of Death": This is what Bluesky experienced today - suddenly there was a HUGE demand surge and the servers just *couldn't* for a while. This also happens after Super Bowl ads or when pop stars announce tours or during DDoS attacks.
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
2. Bottom-up: This is the less obvious and more common scenario, when something inside the system fails and leaves it unable to serve normal load. If you lose a Redis cache and everything is reading from the DB, you will drastically reduce your ability to serve requests.
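To put rough numbers on the lost-cache case, here is a minimal sketch of the arithmetic. The traffic figure and the hit ratio are hypothetical, and the model assumes every cache miss becomes exactly one database read.

```python
def db_load_multiplier(cache_hit_ratio: float) -> float:
    """How many times more reads reach the DB if the cache disappears.

    Assumes each cache miss is one DB read (a simplification).
    """
    if not 0.0 <= cache_hit_ratio < 1.0:
        raise ValueError("cache_hit_ratio must be in [0, 1)")
    return 1.0 / (1.0 - cache_hit_ratio)


# Hypothetical workload: 10,000 reads per second, 95% served from cache.
reads_per_second = 10_000
hit_ratio = 0.95

db_reads_with_cache = reads_per_second * (1 - hit_ratio)  # only the misses
db_reads_without_cache = reads_per_second                 # every read

print(round(db_reads_with_cache))            # 500
print(db_reads_without_cache)                # 10000
print(round(db_load_multiplier(hit_ratio)))  # 20
```

Under these assumptions a database sized for 500 reads per second suddenly sees 20 times that, even though user traffic never changed.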
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
I don't know what happened at Twitter today, but I don't think Elon woke up and decided to shut it all down - my bet is some 'bottom-up' problem (but not necessarily the "DDoSed yourself" problem everyone is tweeting about - that could be an effect of getting limited, not the cause).
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
Anyways, hope this was informative to someone somewhere because it took a while to write 😂.
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
The best rate limiters are 'adaptive', and can change rate limits based on system stress, priority of requests, and other things. Twitter has a really good one because they had a really exceptional infra team until a year ago.
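One way to picture "adaptive" is a per-window request budget that shrinks as a stress signal rises and that reserves part of the budget for high-priority traffic. This is an illustrative sketch only, not how Twitter's limiter works. The class name, the 0.0 to 1.0 `stress` signal, the 10% floor, and the half-budget rule for low-priority requests are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveLimiter:
    base_limit: int      # requests allowed per window when the system is healthy
    stress: float = 0.0  # hypothetical signal: 0.0 = healthy, 1.0 = saturated
    used: int = 0        # requests admitted so far in the current window

    def current_limit(self) -> int:
        # Shrink the limit linearly with stress, but keep at least 10% of it.
        return max(1, int(self.base_limit * max(0.1, 1.0 - self.stress)))

    def allow(self, priority: str) -> bool:
        limit = self.current_limit()
        # Low-priority requests may only use the first half of the budget,
        # so the rest stays available for high-priority requests.
        budget = limit if priority == "high" else limit // 2
        if self.used < budget:
            self.used += 1
            return True
        return False

    def new_window(self, stress: float) -> None:
        # Called once per window with a fresh stress reading.
        self.stress = min(1.0, max(0.0, stress))
        self.used = 0


limiter = AdaptiveLimiter(base_limit=100)
print(limiter.current_limit())  # 100
limiter.new_window(stress=0.5)
print(limiter.current_limit())  # 50
limiter.new_window(stress=1.0)
print(limiter.current_limit())  # 10
```

In a real system the stress signal might come from queue depth, latency, or CPU, and priorities would likely be richer than two levels.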
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
Similarly, if a database replica, cloud region, or cluster goes down, you will be in a really tough spot for serving normal workload. And of course if a developer on one service writes code that suddenly slams another service, that's "DDoSing yourself" and is also bottom-up.
Maggie Johnson-Pint
@maggiepint
Aug 13, 2023
My husband quit tech and ran a home improvement business for about a year. He was actually pretty good at it, had more business than he could take. He went back to tech. Turns out it's unending 12 hour days and body pain for 1/3 the money. For the farmer folks.
Maggie Johnson-Pint
@maggiepint
Mar 13, 2023
Replying to @ask_aubry
He's gonna have a BIG surprise when he finds out the courts in pretty much all states won't let you take your kids out of state in divorce situations for basically any reason besides physical abuse.
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
Even if they don't crash, requests stack up waiting for completion - this is called 'backup' - which is what causes the slowness in the requests that do work. Backups have this bad effect of causing users to refresh the page, causing more requests and... more backups.
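The backup-and-refresh loop can be sketched with a toy simulation. The tick-based model, the parameter names, and the idea that a fixed fraction of queued requests get refreshed each tick are assumptions made for illustration.

```python
def simulate_backlog(arrivals_per_tick, capacity_per_tick, ticks, refresh_fraction=0.0):
    """Return the queue length after each tick.

    refresh_fraction is the (hypothetical) share of queued requests whose
    users hit refresh each tick, adding one duplicate request apiece.
    """
    queue = 0
    history = []
    for _ in range(ticks):
        retries = int(queue * refresh_fraction)  # refreshes caused by waiting
        queue += arrivals_per_tick + retries     # new work joins the queue
        queue -= min(queue, capacity_per_tick)   # the server drains what it can
        history.append(queue)
    return history


# 20% over capacity, nobody refreshes: the backlog grows linearly.
print(simulate_backlog(120, 100, 5))                        # [20, 40, 60, 80, 100]

# Same load, but half of the waiting users refresh each tick.
print(simulate_backlog(120, 100, 5, refresh_fraction=0.5))  # [20, 50, 95, 162, 263]

# Under capacity, no backlog forms at all.
print(simulate_backlog(80, 100, 3))                         # [0, 0, 0]
```

Even in this simple model the refreshes turn a steady backlog into an accelerating one, which is the feedback loop described above.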
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
In these scenarios, the rate limiter is the only thing standing between you and death - because of course if computers get hit with more requests than they can deal with, eventually they run out of memory (OOM) and crash.
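For readers who have not seen one, a token bucket is a common way to build the kind of limiter that rejects excess requests before they pile up in memory. This is a minimal single-threaded sketch, and the thread does not say which algorithm any particular service uses. The `clock` parameter is an assumption added so the example can be driven with a fake clock.

```python
import time


class TokenBucket:
    """Admit at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # reject now instead of queueing the request in memory


# Drive the bucket with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: fake_now[0])
print([bucket.allow() for _ in range(4)])  # [True, True, True, False]
fake_now[0] = 1.0                          # one second later: 2 tokens refilled
print([bucket.allow() for _ in range(3)])  # [True, True, False]
```

Rejected requests cost almost nothing to refuse, which is what keeps the server from accumulating work until it crashes.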
Maggie Johnson-Pint
@maggiepint
Feb 23, 2021
A lot of things you think are best practice are actually just your opinion.
Savvas Stephanides
@SavvasStephnds
Feb 22, 2021
Offend a programmer with a single tweet
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
My hypothesis - Twitter lost a big part of a critical back end system - maybe they stopped paying their GCP bill, maybe they lost a critical cache and everything was reading other data, I truly do not know.
Maggie Johnson-Pint
@maggiepint
Jul 2, 2023
Replying to @maggiepint
Another: "I'm a product developer - why do I care about an infra problem?"
1. If you handle this in code, you can do something other than give your users 'error'.
2. If you handle this in client code, you can save the entire infrastructure by never sending. Literal hero shit.
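Both points can be sketched as a small client-side circuit breaker: after repeated failures it stops sending for a cooldown period and serves a fallback instead of an error. The class, its thresholds, and the `send` and `fallback` callables are hypothetical. A real client would probably trip only on overload signals such as HTTP 429 or 503 rather than on every exception.

```python
import time


class ClientBreaker:
    def __init__(self, failure_threshold=3, cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = 0
        self.open_until = 0.0  # while clock() is below this, nothing is sent

    def call(self, send, fallback):
        if self.clock() < self.open_until:
            return fallback()  # breaker open: never send, spare the backend
        try:
            result = send()
        except Exception:      # assumption: any exception counts as a failure
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.open_until = self.clock() + self.cooldown_seconds
                self.failures = 0
            return fallback()  # show something better than a raw error
        self.failures = 0
        return result


# Usage with a fake clock and a backend that always fails.
fake_now = [0.0]
attempts = []

def failing_send():
    attempts.append(1)
    raise RuntimeError("backend overloaded")

breaker = ClientBreaker(clock=lambda: fake_now[0])
for _ in range(5):
    breaker.call(failing_send, lambda: "cached timeline")
print(len(attempts))  # 3
```

Five user actions produced only three requests, and the user saw cached content each time instead of an error.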