
Millions, Trillions, and Invoice Processing
🗂️ R2 Data Catalog: Managed Apache Iceberg tables with zero egress fees.
What happened – Cloudflare launched R2 Data Catalog public beta: a built-in Apache Iceberg catalog on every R2 bucket.
Why it matters – Lets teams query lakehouse data with ACID guarantees and zero egress without running separate metastore services.
⏱️ How Smartsheet Reduced Latency and Optimized Costs in Serverless Architecture.
What happened – Smartsheet fine-tuned AWS Lambda concurrency and storage, cutting P99 invocation latency 80 % and big-query cost 60 %.
Why it matters – Shows concrete tactics—provisioned concurrency, Graviton, S3 tiering—to slash latency and spend on mature serverless stacks.
🕵️ How Netflix Accurately Attributes eBPF Flow Logs.
What happened – Netflix replaced event-only mapping with FlowExporter heartbeats and regional collectors, eliminating misattribution across 5 M flows/sec.
Why it matters – Delivers trustworthy, near-real-time dependency graphs for security and incident triage without extra storage or cost.
🔍 How Discord Indexes Trillions of Messages.
What happened – Discord rebuilt search on Kubernetes-managed Elasticsearch cells, PubSub queues, and per-user shards, doubling indexing throughput and cutting p99 latency 50 %.
Why it matters – Blueprint for migrating from monolithic clusters to multi-cell search that scales to petabytes without dropping messages during failures.
🔐 How Meta Understands Data at Scale.
What happened – Meta embedded automated data classification and lineage checks into dev workflows, surfacing types and owners for millions of data assets instantly.
Why it matters – Inline metadata lets engineers meet privacy constraints early, reducing rework and accelerating safe product delivery.
📌 Improving Pinterest Search Relevance Using Large Language Models.
What happened – Pinterest deployed a fine-tuned cross-encoder to re-rank Pin-query pairs, boosting search precision and engagement metrics.
Why it matters – Shows affordable LLM distillation can lift search quality with a small model and sub-100 ms latency.
🚨 Anomaly Detection in Time Series Using Statistical Analysis at Booking.com
What happened – Booking.com built Granomaly, a service computing percentile-based prediction bands from Graphite history and publishing them as metrics.
Why it matters – Allows Grafana alerts that ignore past incidents yet catch drops within minutes—no heavy ML or external tools.
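To make the percentile-band idea concrete, here is a minimal Python sketch. It is not Granomaly's actual code: the function names, the 10th/90th percentile choice, and the sample history are all assumptions for illustration. It shows why a percentile band tends to shrug off a single past incident where a min/max band would not.

```python
from statistics import quantiles


def prediction_band(history, low_pct=10, high_pct=90):
    """Return (lower, upper) percentile bounds computed from past values.

    Hypothetical helper: the percentile choices are illustrative only.
    """
    # quantiles(n=100) yields 99 cut points; index p-1 is the p-th percentile.
    cuts = quantiles(history, n=100, method="inclusive")
    return cuts[low_pct - 1], cuts[high_pct - 1]


def is_anomalous(value, band):
    """True when the value falls outside the prediction band."""
    lower, upper = band
    return value < lower or value > upper


# Made-up history for one metric at the same time slot over previous weeks.
# The value 20 stands in for a past incident.
history = [100, 104, 98, 101, 97, 103, 99, 102, 20, 100]

band = prediction_band(history)
print(band)
print(is_anomalous(60, band))   # a sharp drop
print(is_anomalous(100, band))  # a typical value
```

With a min/max band the old incident would push the lower bound down to 20 and hide a drop to 60. The percentile band keeps the lower bound close to the normal range, so the drop still stands out.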
🧾 Advancing Invoice Document Processing at Uber Using GenAI
What happened – Uber’s TextSense platform applies GPT-4 plus OCR and Cadence workflows to extract 15-20 fields from multilingual invoices, halving manual work.
Why it matters – Case study of GenAI beating RPA—90 % header accuracy, 70 % handling-time drop, 25 % cost savings—via configurable, model-agnostic pipeline.
🌀 Overclocking dbt: Discord’s Custom Solution in Processing Petabytes of Data
What happened – Discord added env-based table aliasing, dbt-turbo partial parsing, and versioned backfills, cutting compile time 5× for 2,500 models.
Why it matters – Demonstrates how to extend dbt with macros and CI guardrails to support 100+ devs on petabyte warehouses.
This one is not a system design case study per se, but the scale is eyebrow-raising: Pinterest migrated 3.7 million lines of Flow code to TypeScript. It may be especially interesting if you are into TypeScript or frontend topics.
Got a link that belongs here, or any feedback? Reach out to me on LinkedIn, and I’ll check it out. Until next time – stay scalable! ✌️