Data infrastructure, without the theatre
Valery Frolov
I write about building reliable, secure, and cost-aware data platforms, drawing mostly on years of running data infrastructure at Wix.
Featured writing
Failure modes worth catching early.
The best platform lessons are specific: which number lied, which ownership field was missing, which maintenance job reported green while the expensive part kept burning.
When Iceberg Says 321 GB and S3 Says 46 TB
Table metadata can tell one story while object storage tells another. Orphan files, which sit in storage but are referenced by no table metadata, are a budget, reliability, and compliance risk.
Who Owns This Table?
Most data lake cost problems are ownership problems wearing a storage-bill costume.
The Full Rewrite Anti-Pattern in Data Lakes
Full rewrites are often the most expensive way to make a small logical change.
Operating lens
Metadata is not enough.
Table formats, object stores, schedulers, catalogs, and finance reports each tell partial truths. The work is connecting them before the platform learns about a problem through the bill.