Writing
Data platform mistakes with a price tag.
Practical essays for people who own shared data platforms and have to make the ugly parts reliable, explainable, and cheaper.
Retries Are Not Free
Retry policy is a cost-control problem, not just a reliability default.
When Iceberg Says 321 GB and S3 Says 46 TB
Table metadata can tell one story while object storage tells another. Orphan files are a budget, reliability, and compliance risk.
Why Chargeback Fails in Shared Data Platforms
Infrastructure tags tell you who owns the box, not who caused the bill.
Who Owns This Table?
Most data lake cost problems are ownership problems wearing a storage-bill costume.
The Full Rewrite Anti-Pattern in Data Lakes
Full rewrites are often the most expensive way to make a small logical change.