I’m starting a new company with a few good people and an exciting idea. I appreciate all the support. More updates, summaries, feels, etc. will follow. linkedin.com/posts/gwenshap…
Are we still obsessing over the mythical big data? Because the real problem in most orgs is 50,000,000 very small data sets, some barely maintained, loosely tied together between wiki, Jira, Slack, Google Drive, BigQuery, Postgres, Zendesk, Salesforce, and a bunch of Excel files.
Biggest mistake you've ever made in production?
Mine is from 25 years ago:
My manager asked me to "clean space on the database servers".
I found a bunch of files called ".log" taking a lot of space.
So I deleted them.
<Waiting for everyone who knows DBs to 🤦‍♀️>
Of course
Yesterday, I posted a riddle:
What happens when you add and remove a column from a table in Postgres 2000 times?
Answer:
After 1598 times, you get "ERROR: tables can have at most 1600 columns"
But... the table only has 2 columns when I get the error!
So... why?
Because
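The post is cut off here. As a rough illustration only, here is a toy model of the riddle. It assumes what the Postgres docs say about `DROP COLUMN` (the column is made invisible rather than physically removed), and that each added column takes a fresh attribute slot that is never handed back. The function name and the bookkeeping are hypothetical; this is not how Postgres is implemented internally.

```python
MAX_COLUMNS = 1600  # the limit quoted in the error message


def cycles_before_error(initial_columns: int = 2,
                        max_columns: int = MAX_COLUMNS) -> tuple[int, int]:
    """Toy model: count add+drop cycles that succeed before the limit is hit.

    Returns (successful_cycles, visible_columns_at_the_time_of_the_error).
    """
    used_slots = initial_columns     # slots ever handed out (assumed never reused)
    visible_columns = initial_columns
    cycles = 0
    while used_slots + 1 <= max_columns:
        used_slots += 1              # ADD COLUMN takes a new slot
        visible_columns += 1
        visible_columns -= 1         # DROP COLUMN hides the column, keeps the slot
        cycles += 1
    return cycles, visible_columns


print(cycles_before_error())
```

Under those assumptions, a table that starts with 2 columns runs out of slots after 1598 cycles while still showing only 2 columns, which matches the numbers in the riddle.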
Unpopular opinion: Most software architecture advice sucks because everyone just talks about the architectures in their most successful projects and coolest companies. It is all survivorship bias, hindsight bias, and cherry-picked results.
OK, I just learned about port reuse in macOS, and it is a bit wild. 🦓🐺🦏
Here's what happened:
1. Start Next.js app on port 3000.
2. Point browser to localhost:3000. Next looks good.
3. Start Rails app on port 3000.
4. Refresh. Rails looks good.
WTF?
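The post does not say here why both apps appear to work. One thing worth checking in a situation like the steps above (an assumption, not something the post confirms) is whether the two servers are listening on different loopback addresses, for example IPv4 `127.0.0.1` versus IPv6 `::1`. A minimal probe, with a hypothetical helper name:

```python
import socket


def probe_loopback(port: int, timeout: float = 0.5) -> dict[str, bool]:
    """Report whether something accepts TCP connections on each loopback address."""
    results = {}
    for family, host in ((socket.AF_INET, "127.0.0.1"), (socket.AF_INET6, "::1")):
        try:
            with socket.socket(family, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                # connect_ex returns 0 on success instead of raising
                results[host] = sock.connect_ex((host, port)) == 0
        except OSError:
            # e.g. IPv6 is not available on this machine
            results[host] = False
    return results


if __name__ == "__main__":
    print(probe_loopback(3000))
```

If both entries come back `True`, the next step would be to see which process owns each listener (for example with `lsof -i :3000`), since "localhost" in the browser may resolve to either address.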
Myth: Using UUID as the primary key will slow down inserts.
Fact: Not in Postgres.
I often recommend using UUIDs instead of integer sequences as primary keys. I was surprised to discover that many developers are uncomfortable with them and believe they will slow down inserts.
If you are using Postgres for embeddings, JSON or large text columns, or all of the above... then you are using TOAST, whether you know it or not.
TOAST can have a massive impact on performance. So let's talk about it a bit: 🧵
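Some back-of-the-envelope arithmetic for why embeddings end up in TOAST territory. This is a sketch under two assumptions that are not in the post: the threshold is the "normally 2 kB" the Postgres docs give for a default build, and the embedding is stored as a pgvector `vector`, which pgvector documents as 4 bytes per dimension plus 8 bytes. Function names are hypothetical.

```python
TOAST_THRESHOLD_BYTES = 2048  # assumed: "normally 2 kB" on a default build


def vector_bytes(dimensions: int) -> int:
    """Approximate stored size of a pgvector `vector` (assumed: 4 * dims + 8)."""
    return 4 * dimensions + 8


def exceeds_toast_threshold(value_bytes: int,
                            threshold: int = TOAST_THRESHOLD_BYTES) -> bool:
    """True if a single value is already wider than the assumed threshold."""
    return value_bytes > threshold


for dims in (384, 768, 1536):
    size = vector_bytes(dims)
    print(dims, size, exceeds_toast_threshold(size))
```

The threshold applies to the whole row, so a value that is wider than it on its own always triggers the TOAST code. Whether that value is then compressed or moved out of line depends on the column's storage strategy, which this sketch does not model.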
Now that I'm thinking about job queues, I don't think I have a solution I'm 100% happy with.
- RDBMS doesn't scale well
- Kafka is a mismatch in APIs and data model
- RabbitMQ was a bit of a DR nightmare last I tried
Does anyone have an OSS queue that has scaled to 100K+ workers?