Is Microsoft coming for Redis's business, worth hundreds of millions of dollars in annual revenue? Garnet, open source and far ahead in performance: Redis clients can connect to it without modification



Last updated: 3/21/2024 12:26 AM
Authors: 凌敏, 核子可乐
Category: .NET
Tags: .NET, C#, open source, Redis, Garnet


Redis and Dragonfly should feel a sense of crisis!

Microsoft open-sources Garnet, a new cache-store system

Microsoft recently open-sourced Garnet, a cache-store system. According to Badrish Chandramouli, senior principal researcher in the Microsoft Research Database Group, Garnet was built from scratch with performance as its core consideration, in particular thread scalability for throughput and low latency at the higher percentiles.

Specifically, Garnet has the following advantages:

  • Garnet adopts the popular RESP wire protocol as its starting point, so most users can connect to Garnet with unmodified Redis clients, which are available for most programming languages.
  • Garnet offers better scalability and throughput with many client connections and small batches, helping large applications and services cut operating costs.
  • Garnet shows better client latency at the 99th and 99.9th percentiles; stable latency at these higher percentiles is critical in real-world scenarios.
  • Garnet is built on the latest .NET technology and is cross-platform, extensible, and modern. It is designed to be easy to develop for and evolve without sacrificing performance in common scenarios. It draws on .NET's rich library ecosystem for API breadth and leaves open opportunities for optimization. Thanks to its careful use of .NET, Garnet achieves top-tier performance on both Linux and Windows.
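To make the first point concrete, here is a minimal sketch of how a client frames a command in RESP: an array of bulk strings, each prefixed with its length. The function name `encode_resp_command` is made up for illustration; real Redis clients do this framing internally, which is why a server that speaks RESP can be used by them unchanged.

```python
def encode_resp_command(*args):
    """Encode one command as a RESP array of bulk strings.

    Illustrative helper (hypothetical name); it follows the RESP framing
    rules: "*<count>\r\n" then "$<length>\r\n<bytes>\r\n" per argument.
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n" % len(data))  # bulk string length header
        parts.append(data + b"\r\n")          # payload plus terminator
    return b"".join(parts)


if __name__ == "__main__":
    # Bytes a client would write to the socket for: SET greeting hello
    print(encode_resp_command("SET", "greeting", "hello"))
```

The same bytes would be sent whether the server on the other end is Redis or another RESP-speaking store.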

Microsoft Research has reportedly been studying modern key-value database architectures since 2016. In 2018 Microsoft open-sourced FASTER, an embedded key-value library that outperformed existing systems by several orders of magnitude while focusing on a simple single-node, in-process key-value model.

Starting in 2021, Microsoft began building a new remote cache-store based on the needs of real use cases, with all the features needed to serve as a viable replacement for existing cache-stores. The challenge at the time was to keep, and ideally extend, the performance advantages achieved in the earlier work while adapting to a more realistic, general-purpose network setting. The result of this work is Garnet.

When asked which scenarios Garnet is suited to, Chandramouli said: "Any existing application that uses Redis, KeyDB, or Dragonfly as a cache-store is a good fit. Garnet can provide higher throughput and lower latency, cut costs by reducing the number of cache shards that need to be managed, and spill data to local disk or SSD to cache data sets larger than memory. In addition, Garnet is also suitable for a variety of new applications that want an extremely high-performance caching layer to improve performance and reduce the cost of back-end storage servers or databases."


In terms of API features, Garnet supports a wide range of APIs, including raw string, analytical, and object operations. It also provides a cluster mode with sharding, replication, and dynamic key migration. Garnet supports client-side RESP transactions and server-side stored procedures written in C#, and lets users define custom operations on both raw strings and new object types. All of this can be written in plain C#, which lowers the barrier to building custom extensions.

In terms of networking, storage, and clustering, Garnet uses a fast, pluggable network layer that leaves room for future extensions such as kernel-bypass stacks. It supports Transport Layer Security (TLS) for secure communication, as well as basic access control. Garnet's storage layer, called Tsavorite, is a fork of the open-source FASTER and provides strong database features: thread scalability, tiered storage support (memory, SSD, cloud storage), fast non-blocking checkpointing, recovery, operation logging for durability, multi-key transaction support, and better memory management and reuse. Garnet also supports a cluster mode of operation.

In addition to single-node execution, Garnet supports a cluster mode that lets users create and manage sharded and replicated deployments. Garnet also supports an efficient, dynamic key migration scheme to rebalance shards. Users can create and manage Garnet clusters with standard Redis cluster commands, and nodes gossip with one another to share and evolve cluster state. Overall, Garnet's cluster mode is a large feature that is still evolving, and Microsoft says it will share more details in subsequent articles.
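As a rough illustration of how sharding by key works in the Redis Cluster model, the sketch below maps a key to one of 16384 hash slots using CRC16 (XMODEM variant), honoring `{...}` hash tags. That Garnet follows exactly this convention is an assumption made here because it accepts standard Redis cluster commands; the function name `key_hash_slot` is illustrative.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in the Redis Cluster model


def key_hash_slot(key: bytes) -> int:
    """Map a key to a hash slot, Redis Cluster style.

    If the key contains a non-empty "{tag}", only the tag is hashed, so
    related keys can be forced into the same slot.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # ignore an empty "{}"
            key = key[start + 1:end]
    # binascii.crc_hqx with initial value 0 is the CRC16-XMODEM checksum.
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


if __name__ == "__main__":
    for k in (b"foo", b"{user1000}.following", b"{user1000}.followers"):
        print(k.decode(), "->", key_hash_slot(k))
```

Each node owns a range of slots; migrating keys to rebalance a cluster amounts to moving slots, and the keys in them, from one node to another.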

Chandramouli added in an email to The Stack: "We also look forward to feedback on Garnet's performance in various other real-world applications. In addition, we have a powerful C#-based stored procedure model that lets users write the custom transactions they care about. Finally, we see Garnet as an important vehicle for future innovation, including areas such as optimized disk IO, kernel-bypass networking, and vector databases."

What are the highlights of Garnet?

The rapid growth of cloud and edge computing has greatly increased the number and range of applications and services that work with data. At the same time, these applications demand higher efficiency, lower latency, and lower cost when accessing, updating, and transforming that data. They often incur significant operating expenses in their storage interactions, making this one of the most expensive and challenging platform areas today. A cache-store software layer, deployed as a separately scalable remote process, can reduce these costs and improve application performance. This has driven the growth of the cache-store industry, including well-known open-source systems such as Redis, Memcached, KeyDB, and Dragonfly.

Unlike traditional remote cache-stores that only support a simple get/set interface, modern caches need a rich API and feature set. They support raw strings, analytical data structures such as HyperLogLog, and complex data types such as sorted sets and hashes. They must also let users checkpoint and recover the cache, shard data, maintain replicas, and use transactions and custom extensions.

However, existing systems often deliver this rich feature set at a cost: they keep the system design simple but cannot fully exploit the latest hardware capabilities (such as multiple cores, tiered storage, and fast networks). In addition, many of these systems were not designed to be easily extensible by application developers or to run well across different platforms and operating systems.

According to Microsoft's introduction, Garnet's design rethinks the entire cache-store stack: from receiving packets off the network, to parsing and processing database operations, to performing storage interactions.

The following figure shows the overall architecture of Garnet. Garnet's network layer inherits a shared-memory design inspired by Microsoft's earlier ShadowFax research. TLS processing and storage interactions are performed on the IO completion thread, which avoids thread-switching overhead in the common case. With this approach, CPU cache coherence brings the data to the network processing logic, in contrast to traditional shuffle-based designs that require moving data around on the server.

Figure: Overall architecture of the Garnet project

Garnet's storage design consists of two Tsavorite key-value stores, bound together by a unified operation log. The first, called the "main store", is optimized for raw string operations and manages memory carefully to avoid garbage collection. The second is an optional "object store", optimized for complex objects and custom data types, covering popular types such as sorted sets, sets, hashes, lists, and geospatial data. Objects are kept on the heap in memory (which makes updates more efficient) and stored on disk in serialized form. In the future, Microsoft will also look at simplifying Garnet's maintenance through a unified index and log.

A distinctive feature of Garnet's design is its narrow Tsavorite storage API, on top of which the large, rich, and extensible RESP API surface is implemented. The storage API consists of read, upsert, delete, and atomic read-modify-write operations, implemented with asynchronous callbacks that let Garnet insert its own logic at multiple points during each operation. This storage API model also lets Garnet cleanly separate parsing and query processing from storage concerns such as concurrency, storage tiering, and checkpointing.
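The idea of a small storage API with callbacks can be sketched as follows. This is a toy in-memory model, not Tsavorite's actual API: the class `TinyStore`, its method names, and the `incr` helper are all invented for illustration, and a single lock stands in for the real system's concurrency machinery. It shows how a higher-level command (an INCR-like counter) can be expressed purely through a read-modify-write call that takes caller-supplied logic.

```python
import threading


class TinyStore:
    """Toy key-value store exposing four operations (hypothetical API)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # stand-in for real concurrency control

    def read(self, key):
        with self._lock:
            return self._data.get(key)

    def upsert(self, key, value):
        with self._lock:
            self._data[key] = value

    def delete(self, key):
        with self._lock:
            return self._data.pop(key, None) is not None

    def rmw(self, key, initial, update):
        """Atomic read-modify-write driven by caller-supplied callbacks.

        `initial()` produces the value when the key is absent;
        `update(old)` produces the new value when it is present.
        """
        with self._lock:
            if key in self._data:
                new_value = update(self._data[key])
            else:
                new_value = initial()
            self._data[key] = new_value
            return new_value


def incr(store, key, delta=1):
    """An INCR-like command built only from the store's rmw operation."""
    return store.rmw(key, initial=lambda: delta, update=lambda old: old + delta)


if __name__ == "__main__":
    store = TinyStore()
    incr(store, "visits")
    incr(store, "visits", 5)
    print(store.read("visits"))
```

The command layer never touches the store's internals; it only supplies callbacks, which is the separation of concerns the paragraph above describes.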

In addition, Garnet adds support for multi-key transactions based on two-phase locking. Users can use RESP client-side transactions (MULTI/EXEC) or server-side transactional stored procedures written in C#.
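For readers unfamiliar with two-phase locking, here is a generic sketch of the technique: a transaction first acquires locks on every key it touches (the growing phase), runs its body, and only then releases them (the shrinking phase). This illustrates the general idea only and says nothing about how Garnet implements it; `LockTable`, `run_transaction`, and the account example are hypothetical.

```python
import threading
from collections import defaultdict


class LockTable:
    """One lock per key, created on demand (illustrative helper)."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the dict of locks itself

    def lock_for(self, key):
        with self._guard:
            return self._locks[key]


def run_transaction(store, locks, keys, body):
    """Run body(store) while holding locks on all keys (two-phase locking)."""
    held = []
    try:
        # Growing phase: take every lock before doing any work. Sorting the
        # keys gives all transactions the same order, which avoids deadlock.
        for key in sorted(set(keys)):
            lock = locks.lock_for(key)
            lock.acquire()
            held.append(lock)
        return body(store)
    finally:
        # Shrinking phase: release only after the body has finished.
        for lock in reversed(held):
            lock.release()


def transfer(store, locks, src, dst, amount):
    def body(s):
        s[src] -= amount
        s[dst] += amount
    run_transaction(store, locks, [src, dst], body)


if __name__ == "__main__":
    store, locks = {"a": 1000, "b": 0}, LockTable()
    workers = [
        threading.Thread(
            target=lambda: [transfer(store, locks, "a", "b", 1) for _ in range(100)]
        )
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(store)
```

Because no lock is released until all have been acquired and the body has run, concurrent transfers over the same keys cannot interleave their reads and writes.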

Performance

The Microsoft research team presented key performance results comparing Garnet with other leading open-source cache-stores.

First, the team provisioned two Azure Standard F72s v2 virtual machines running Linux (Ubuntu 20.04), each with 72 vCPUs and 144 GiB of memory, with accelerated TCP enabled. One virtual machine runs the cache-store server under test, and the other is dedicated to issuing the workload. Microsoft used its own benchmarking tool, Resp.benchmark, to produce all of the performance results.

Microsoft compared Garnet to the latest open-source versions of Redis (v7.2), KeyDB (v6.3.4), and Dragonfly (v6.2.11). The experiments use keys drawn from a uniform random distribution (Garnet's shared-memory design benefits even more from skewed, non-uniform workloads). In these experiments the data is pre-loaded onto each server and fits in memory.

Experiment 1: Throughput with varying numbers of client sessions

The first experiment uses large batches of GET operations (4096 requests per batch) and small payloads (8-byte keys and values) to minimize network overhead, and gradually increases the number of client sessions to compare the systems. As the figure below shows, Garnet scales better than Redis and KeyDB and achieves higher throughput than all three baseline systems (the y-axis is logarithmic). Note that although Dragonfly scales similarly to Garnet, it is a purely in-memory system. In addition, Garnet's throughput remains strong relative to the other systems when the database size (i.e., the number of pre-loaded keys), at 256 million keys, is far larger than what fits in the processor caches.

Figure: Throughput (log scale) with varying numbers of client sessions, for database sizes of (a) 1024 keys and (b) 256 million keys.

Experiment 2: Throughput with varying batch sizes

The next experiment uses GET operations with a fixed number of client sessions (64) and varies the batch size. As in the previous experiment, two database sizes are tested. As the figure below shows, Garnet performs better even without batching, and its advantage grows once batching is enabled, even at small batch sizes. The payload size is the same as in Experiment 1, and the y-axis is again logarithmic.

Figure: Throughput (log scale) with varying batch sizes, for database sizes of (a) 1024 keys and (b) 256 million keys.
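For context on what a "batch" means here, the sketch below shows client-side batching at the RESP level: several GET requests are concatenated into one buffer and written to the socket in a single send, so the per-request network overhead is amortized. It is an illustration of the general technique, not of how Resp.benchmark is implemented; the helper names and the key format are made up.

```python
def encode_get(key: bytes) -> bytes:
    """RESP encoding of a single GET request."""
    return b"*2\r\n$3\r\nGET\r\n$%d\r\n%s\r\n" % (len(key), key)


def build_batches(keys, batch_size):
    """Yield one byte buffer per batch of up to `batch_size` GET requests."""
    for i in range(0, len(keys), batch_size):
        yield b"".join(encode_get(k) for k in keys[i:i + batch_size])


if __name__ == "__main__":
    # Ten 8-byte keys (hypothetical naming scheme), sent four per batch.
    keys = [b"k%07d" % i for i in range(10)]
    for n, payload in enumerate(build_batches(keys, 4)):
        print("batch", n, "requests:", payload.count(b"GET"), "bytes:", len(payload))
```

With a batch size of 1 every request pays for its own round trip; larger batches trade a little latency per request for much higher throughput.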

Experiment 3: Latency with varying numbers of client sessions

Next, the team measured client-side latency on each system. As the figure below shows, as the number of client sessions increases, Garnet's latency (in microseconds) is lower and more stable than that of the other systems at every percentile measured. In this experiment the workload is a mix of 80% GET and 20% SET operations, with no batching.

Figure: Latency at the (a) median, (b) 99th percentile, and (c) 99.9th percentile with varying numbers of client sessions.
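To clarify what these percentiles mean, here is a small sketch that computes the median, 99th, and 99.9th percentile of a list of latency samples using the nearest-rank definition. The definition chosen and the function name are assumptions made for illustration; the benchmark tool may compute percentiles differently.

```python
import math
from fractions import Fraction


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Exact rational arithmetic avoids float rounding errors for values
    # such as p = 99.9.
    rank = math.ceil(Fraction(str(p)) * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic latencies in microseconds: 1, 2, ..., 1000.
    latencies_us = list(range(1, 1001))
    for p in (50, 99, 99.9):
        print(f"p{p}: {percentile(latencies_us, p)} us")
```

The 99.9th percentile is set by the slowest one request in a thousand, which is why tail latency is much harder to keep low than the median.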

Experiment 4: Latency with varying batch sizes

Garnet's latency is tuned for adaptive client-side batching and for handling many sessions querying the system efficiently. Microsoft increased the batch size from 1 to 64 and plotted latency at different percentiles with 128 active client connections. As the figure below shows, Garnet's latency is low across the board. As in the previous experiment, the workload is a mix of 80% GET and 20% SET operations.

Figure: Latency at the (a) median, (b) 99th percentile, and (c) 99.9th percentile with varying batch sizes.

Developer: Redis needs major performance optimization!

Judging from the benchmark charts, Garnet's GET throughput is more than ten times that of Dragonfly. Although Garnet's 50th-percentile latency is slightly higher than Dragonfly's, its 99th-percentile latency is lower. Both Garnet and Dragonfly far outperform Redis in throughput and latency, which many developers take as a sign that Redis may need major performance optimization.

Developer hipadev23 said, "Garnet is indeed the first alternative to outperform Redis at both low and high concurrency levels, which is a remarkable achievement. Redis may require significant performance optimizations."

Developer mtmk believes Garnet is good news for anyone who needs to run a Redis-compatible server directly on Microsoft Windows Server without relying on WSL2. The memory usage problems of the earlier Redis port for Windows (now archived), which as far as mtmk knows were mainly caused by its use of memory-mapped files, would no longer apply.

Many developers still firmly choose Redis. In some respects Redis is more developer-friendly, and it has a longer, more stable track record. As for Garnet, the common concerns are licensing, pricing, updates, and maintenance. throwaway38375 said, "Redis should be more stable in terms of licensing agreements or product pricing, and it has survived billions of hours of production operations. Redis is also easier to install and understand." Another commenter wrote, "For such a project launched by Microsoft Research, my biggest concern is not the licensing agreement and product pricing, but the lack of updates (features, maintenance and even security updates)."

By the way: Garnet was developed using C#

During community discussions, many developers were surprised that the Garnet project was actually developed using C#.

Developer west0n said, "What surprised me the most was that the Garnet project was developed in C#, while Dragonfly was developed in C++, and Redis was developed in C." Developer whimsicalism put it more bluntly: "I'm surprised that Garnet, written in the garbage-collected language C#, defeated Redis and Dragonfly."

Some developers offered a more measured view. pjmlp argued that not all garbage-collected languages are alike, and that C# and .NET in fact provide performance-tuning options equivalent to those of C++. In his view, people should learn the tools properly rather than lump all garbage-collected languages together and dismiss them wholesale. [Webmaster's note: .NET is a platform and C# is a language that targets it; the relationship between C# and .NET is analogous to that between Java and the JDK.]

More specifically, MSIL and .NET were also designed to support C++, and languages such as C# and F# have ways to access those capabilities. Even where a capability is not exposed in the language syntax, developers can directly use the MSIL that C++/CLI generates.

What do you think? Feel free to share your views in the comments.

Reference links:

https://www.microsoft.com/en-us/research/blog/introducing-garnet-an-open-source-next-generation-faster-cache-store-for-accelerating-applications-and-services/

https://www.thestack.technology/microsoft-takes-on-redis-with-new-open-source-garnet-cache-store/

https://news.ycombinator.com/item?id=39752504
