I love Discord’s backend team they are doing some amazing stuff at their product and this is just one of them, that how they used Rust to scale Elixir to gain 11 million concurrent users. Grab you Coffee or Tea and follow along.
In the realm of software development, challenges often arise when dealing with large-scale data structures. Discord, a popular communication platform, faced one such challenge when optimizing its Member List feature. This blog explores the development process behind Discord’s SortedSet, a high-performance data structure, which allowed them to efficiently manage large sorted sets with fast mutation operations. Join us on this journey as we delve into the problem, explore various solutions, and ultimately discover how the power of Rust and Elixir combined to create an exceptional solution.
Discord’s Member List, which displays users on the platform, needed to be updated efficiently. However, constantly sending updates for the entire Member List resulted in unnecessary network traffic, CPU usage, and reduced battery life. To address this, Discord required a data structure capable of holding hundreds of thousands of entries, enabling fast mutation operations, and providing efficient indexing for additions and removals.
Elixir, the language of choice for Discord’s core services, possesses immutable data structures. While ideal for concurrency and code reasoning, this immutability presented challenges when handling large-scale mutations. Discord’s engineers embarked on a quest to find a suitable data structure that met their requirements. The search began with Elixir’s built-in data structures, such as MapSet and List. However, MapSet lacked ordering guarantees, ruling it out as an option. Wrapping a List and enforcing uniqueness and sorting showed promising results for small lists but proved impractical for larger lists due to slow insertion times.Next, they explored Erlang’s ordsets (Ordered Sets). While ordsets performed well for smaller lists, they faced performance issues when handling larger lists, falling short of the desired speed requirements.
As existing options failed to meet the performance demands, Discord’s engineers decided to delve into the realm of custom data structure design. They conceived the idea of chaining multiple small ordsets together to quickly access the correct ordset when needed. This concept resembled a Skip List, which served as the inspiration for their implementation. The initial incarnation of the new data structure, dubbed OrderedSet, involved a wrapper around a list of Cells. Each Cell contained a small ordset along with the first and last items and a count. Leveraging compile-time guards, the OrderedSet achieved favorable performance in worst-case scenarios, significantly outperforming raw ordsets and naive List implementations.
Although the initial implementation improved performance, it introduced a new worst-case scenario when inserting at the beginning of the list. To overcome this challenge, Discord’s engineers developed a more sophisticated approach. They allowed Cells to dynamically swell and split, inserting new Cells in the middle of the list. While slightly more expensive, this approach improved worst-case performance, as it required a Cell Split instead of numerous Cell operations.
Despite reaching impressive performance levels with Elixir, Discord sought to push the boundaries further. They turned to Rust, a systems programming language known for its zero-cost abstractions and mutable data structures. By leveraging Rust’s capabilities through NIFs (Native Implemented Functions), they aimed to achieve even greater performance gains.
Discord’s core services were primarily built in Elixir, making it essential to integrate Rust seamlessly into their existing Elixir-based infrastructure. Thanks to a remarkable Elixir project called Rustler, which facilitates safe and efficient Elixir-Rust interactions, Discord could create a well-behaved NIF. Rustler’s guarantees ensured that Rust would not crash the BEAM VM or leak memory.
The initial benchmarks of the Rust-backed SortedSet yielded extremely promising results. Adding an item to the set ranged from 0.4𝜇s to 2.85𝜇s, outperforming the previous OrderedSet implementation by a considerable margin. Discord expanded the benchmarking to include a wider range of Erlang Terms and functionality required for the Member List. The results were remarkable, with SortedSet demonstrating fast insertions across various set sizes, even with a million items.
Thanks to the introduction of SortedSet, Discord witnessed significant performance improvements across all guilds on their platform, without any adverse effects on memory usage. The collaboration between Rust and Elixir proved that they could coexist effectively, with Elixir handling real-time communication logic and Rust providing exceptional performance when required.
Discord’s journey to optimize their Member List led to the development of SortedSet, a high-performance data structure combining the power of Rust and Elixir. By exploring various options and pushing the boundaries of data structure design, Discord achieved their goal of efficiently managing large sorted sets with fast mutation operations. With SortedSet as an open-source library, the Discord team has contributed a valuable resource to the wider software development community, demonstrating the possibilities when combining different programming languages and leveraging their unique strengths.