This talk is not about Cosmos DB. Well, actually, maybe it is a little bit. But it's really a case study of tricky problem-solving.
Come on an adventure with me as I recount our efforts to tame an out-of-control database. Then, stroke your chin with me (I'll stroke my own) as I reflect on what we should have done differently, the factors that led us down the paths we explored, and the lessons we can generalise from the experience.
Our Cosmos DB instance was struggling to keep up with our fast-growing user base. Despite the rapid scaling, it should really have coped. "What's going on?" we asked each other. Over the course of several weeks, we tried a number of approaches, but the problem got steadily worse. Some of the things we tried were quite interesting, but they had varying levels of success. The eventual solution was infuriatingly simple, so why did it take us so long to work it out?
This talk will cover some real-world techniques for tuning and optimising Cosmos DB, but will focus on the process of solving complicated operational problems with new (i.e. "under-tooled") technologies.