Case Study: Skynet or Bust
Introduction
A company that is losing money has a Hail Mary idea: build an ambitious new offering that could either save or destroy the company, with only a few months to pull it off.
Background
A mobile advertising company saw early success with a few key partnerships, but after those partnerships collapsed the company was no longer profitable and its runway was vanishing fast. There had long been talk of entering real-time bidding as a way to turn things around. The challenges were numerous:
- 3 months to bring the product to market
- New technology
- Extremely aggressive performance demands
  - 120 ms round-trip latency
  - 9,000 requests per second on laptop-class hardware or less
- Limited threshold for downtime or errors
- Tight budget controls
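To put the performance demands in perspective, the two figures above can be combined with some back-of-the-envelope arithmetic. The sketch below is illustrative only: it uses Little's law to estimate how many requests are in flight at once, and it assumes, purely for illustration, that the full 120 ms is spent inside the system and that requests are handled one at a time on a single core. The function names are hypothetical.

```python
def in_flight_requests(requests_per_second: int, latency_ms: int) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return requests_per_second * latency_ms / 1000


def serial_budget_us(requests_per_second: int) -> float:
    """Average time available per request, in microseconds,
    if a single core handled every request one after another."""
    return 1_000_000 / requests_per_second


# Figures taken from the list of challenges above.
RPS = 9_000
LATENCY_MS = 120

concurrent = in_flight_requests(RPS, LATENCY_MS)
per_request_us = serial_budget_us(RPS)

print(f"Requests in flight at any moment: {concurrent:.0f}")
print(f"Serial per-request budget: {per_request_us:.1f} microseconds")
```

In practice, part of the round trip is network time, so the true in-flight count inside the system would be lower than this upper bound.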
Approach
There were many variables to work through and very little time. Technology issues were resolved through rapid proofs of concept and horizon planning. This allowed us to simplify scope and architecture to the absolute minimum needed to enter the market. Partnering closely with product management also helped reduce scope to a viable thin slice of functionality.
A key concern was how the bidding system would spend money. A simple risk analysis showed that a bug in how the system spent money could bankrupt the company in less than fifteen minutes.
The team adopted a more robust development approach to manage this risk and the performance demands. Key elements of that approach were test-driven development, performance testing, and WIP limits.
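As an illustration of how test-driven development applies to the spending risk, here is a minimal sketch of a hard spend cap with the tests that would be written first. It is not the team's actual code: the names SpendGuard, BudgetExceeded, and cap_micros are hypothetical, and amounts are assumed to be tracked as integer micro-dollars to avoid floating-point rounding.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a bid would push total spend past the cap."""


@dataclass
class SpendGuard:
    """Hypothetical hard cap on total spend, in integer micro-dollars."""
    cap_micros: int
    spent_micros: int = 0

    def authorize(self, bid_micros: int) -> None:
        """Record the bid's cost, or refuse it if the cap would be exceeded."""
        if bid_micros <= 0:
            raise ValueError("bid must be positive")
        if self.spent_micros + bid_micros > self.cap_micros:
            raise BudgetExceeded(
                f"bid of {bid_micros} exceeds remaining {self.remaining()}"
            )
        self.spent_micros += bid_micros

    def remaining(self) -> int:
        """Budget still available, in micro-dollars."""
        return self.cap_micros - self.spent_micros


# Tests written before the implementation, in test-first style.
def test_spend_up_to_cap_is_allowed():
    guard = SpendGuard(cap_micros=1_000_000)
    guard.authorize(400_000)
    guard.authorize(600_000)
    assert guard.remaining() == 0


def test_bid_over_cap_is_refused_and_not_recorded():
    guard = SpendGuard(cap_micros=1_000_000)
    guard.authorize(999_999)
    try:
        guard.authorize(2)
    except BudgetExceeded:
        pass
    else:
        raise AssertionError("expected BudgetExceeded")
    assert guard.spent_micros == 999_999
```

The point of the sketch is the ordering: the tests pin down the failure mode the risk analysis identified (overspending) before any bidding logic exists, so a regression in budgeting fails a test instead of spending real money.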
Results
The team successfully delivered the first real-time bidding system within 24 hours of the 3-month deadline. It operated without downtime, degradation, or defects, and within the desired operational costs.
The team also developed the first continuous delivery pipeline, which allowed immediate deployments instead of the twice-a-month release schedule other products followed. This pipeline demonstrated a path to saving an additional $120,000 a month in deployment costs.
Conclusion
The approach the team took to deliver this product was considered radical, slow, and inefficient by others, but the results showed the opposite. No team in the company's history had delivered anything that operated without bugs, outages, or degradation at launch, or had met such demanding SLAs.
As other teams began to work on the bidding system, it became clear how the different approaches led to different results. Other teams introduced bugs and outages regularly, and bugs introduced in budgeting caused overspending that drew the CEO's scrutiny.