Two Hidden Deadlocks in Cadence Matching: 1 Day, 2 Engineers, 6 Lines of Code

· 13 min read
Jakob Haahr Taankvist
Senior Software Engineer @ Uber
Eleonora Di Gregorio
Senior Software Engineer @ Uber

How the New Cadence Shard Manager Found and Mitigated Two Latent Deadlocks

We're rolling out a new Shard Manager service for Cadence that replaces the existing hash-ring-based routing, and it's coming to the open-source release soon. The new architecture gives us load balancing, graceful shard handovers, and the debuggability and observability we used to find the two deadlocks described in this post. During the rollout, the Shard Manager exposed two latent deadlocks in the Cadence Matching service. It then moved traffic to the healthy instance and kept the system running while two engineers fixed both deadlocks in a day, with six lines of code in total.

To understand how this happened, we first need to understand the new architecture.