c#百萬對象序列化深度剖析:如何在網絡傳輸中實現速度與體積的完美平衡

c#百萬對象序列化深度剖析:如何在網絡傳輸中實現速度與體積的完美平衡

在網絡通信中,數據序列化是將對象狀態轉換為可存儲或可傳輸的形式的過程,這對於tcp網絡傳輸尤為關鍵。在項目中,當需要處理幾十萬條數據的傳輸時,傳統的json序列化方式由於其冗餘的欄位名和字符串格式,導致了二進位包體積龐大,且序列化與反序列化的效率低下。為了解決這些問題,我們考慮採用更加高效的序列化方法,以減少包大小並提升處理速度。

最后更新 2023/12/11 下午10:29
沙漠尽头的狼
预计阅读 22 分钟
分类
.NET
标签
.NET C# 優化

目錄

  1. 本文背景
  2. 構建測試數據
  3. 方案對比
    • json序列化
    • 自定義二進位序列化
    • BinaryWriter\BinaryReader
    • ProtoBuf
    • MessagePack
    • 方案分析
    • 基準測試
  4. 總結

1. 本文背景

大家好,我是沙漠盡頭的狼。

在网络通信中,数据序列化是将对象状态转换为可存储或可传输的形式的过程,这对于TCP网络传输尤为关键。在项目中,当需要处理几十万条数据的传输时,传统的Json序列化方式由于其冗余的字段名和字符串格式,导致了二进制包体积庞大,且序列化与反序列化的效率低下。为了解决这些问题,我考虑采用更加高效的序列化方法,以减少包大小并提升处理速度。本文将探讨自定义二进制序列化、BinaryWriter/BinaryReaderMessagePackProtoBuf等4种序列化方法,并通过比较它们的性能,为大家提供我目前认为的最佳实践指南。

2. 構建測試數據

创建C#控制台程序,添加OrganizationMember两个类,类中包含基本的数据类型和List<T>,其他数组、字典可以自行扩展:

public class Organization
{
    public int Id { get; set; }

    public string[]? Tags { get; set; }

    public List<Member>? Members { get; set; }
}

public class Member
{
    public int Id { get; set; }

    public string? Name { get; set; }

    public string? Description { get; set; }

    public string? Address { get; set; }

    public double Value { get; set; }

    public long UpdateTime { get; set; }
}

創建測試數據100萬條,為了和標題遙相呼應:

public class BenchmarkTest
{
    /// <summary>
    /// 测试数据量
    /// </summary>
    private const int DataCount = 1000000;

    private static readonly Random RandomShared = new(DateTime.Now.Millisecond);

    /// <summary>
    /// 测试数据
    /// </summary>
    private static readonly Organization TestData = new()
    {
        Id = 1,
        Tags = Enumerable.Range(0, 5).Select(index => $"测试标签{index}").ToArray(),
        Members = Enumerable.Range(0, DataCount).Select(index => new Member()
        {
            Id = index,
            Name = $"测试名字{index}",
            Description = $"测试描述{RandomShared.Next(1, int.MaxValue)}",
            Address = $"测试地址{RandomShared.Next(1, int.MaxValue)}",
            Value = RandomShared.Next(1, int.MaxValue),
            UpdateTime = ((DateTimeOffset)DateTime.Now).ToUnixTimeMilliseconds()
        }).ToList()
    };
}

3. 方案對比

首先创建序列化接口ISerializeHelper,各种序列化提供程序需要分别实现:

public interface ISerializeHelper
{
    byte[] Serialize(Organization data);

    Organization? Deserialize(byte[] buffer);
}

再创建BenchmarkTest类,添加RunSerialize方法用于执行序列化提供程序,在此方法中依次调用提供程序的序列化和反序列方法,统一对测试方法进行耗时和组包大小打印输出统计:

public class BenchmarkTest
{
    // ...省略前面的代码
    private static void RunSerialize(ISerializeHelper helper)
    {
        Stopwatch sw = Stopwatch.StartNew();

        var buffer = helper.Serialize(TestData);

        sw.Stop();
        Log($"{helper.GetType().Name} Serialize {sw.ElapsedMilliseconds}ms {buffer.Length}byte");

        sw.Restart();

        var data = helper.Deserialize(buffer);

        sw.Stop();

        Log($"{helper.GetType().Name} Deserialize {sw.ElapsedMilliseconds}ms {data?.Members?.Count}项");
    }

    private static void Log(string log)
    {
        Console.WriteLine($"{DateTime.Now:yyyy-MM-dd HH:mm:ss fff}: {log}");
    }
}

3.1. json序列化

首先测试Json序列化,安装System.Text.Json包:

<PackageReference Include="System.Text.Json" Version="8.0.0" />

创建JsonSerializeHelper提供程序,并实现ISerializeHelper接口:

using ByteTest.Core.Models;
using System.Text.Json;

namespace ByteTest.Core.Helpers;

public class JsonSerializeHelper : ISerializeHelper
{
    public byte[] Serialize(Organization data)
    {
        return JsonSerializer.SerializeToUtf8Bytes(data);
    }

    public Organization? Deserialize(byte[] buffer)
    {
        var data = JsonSerializer.Deserialize<Organization>(buffer);
        return data;
    }
}

JsonSerializer.SerializeToUtf8Bytes 方法会直接生成字节数据,而不是先生成字符串然后再转换为字节,这对于一些需要将 JSON 数据写入文件或网络流的场景非常有用,因为这些场景通常需要字节数据而不是字符串。此外,由于避免了不必要的字符串分配,它还可以提高性能并减少内存压力。

再在BenchmarkTest类添加测试方法Test

public static void Test()
{
    RunSerialize(new JsonByteSerializeHelper());
}

在Program中调用Test()方法:

BenchmarkTest.Test();

程式輸出如下:

2023-12-10 22:28:24 880: JsonByteSerializeHelper Serialize 2813ms 196227181byte
2023-12-10 22:28:26 858: JsonByteSerializeHelper Deserialize 1964ms 1000000项

Json序列化100万条数据,需要2.8s,数据包187.14MB,真大、真慢。

3.2.自定義二進位序列化

接下來測試下自定義的二進位序列化,這是我原來常用的方式,目前看也是囉嗦,首先定義數據包欄位規範:

數據類型 二進位長度 說明
數字類型(short\ushort\int\uint\long\ulong\double等) 2\2\4\4\8\8\8 基本的數字類型是定長的
string 4+n 用int類型4個字節表示字符串二進位後的長度,n表示字符串二進位數組實際長度
T[]`List` 4+n 數組或列表和字符串類似,用int類型4個字節表示數組或列表二進位後的長度,n表示數組或列表二進位數組實際長度

添加CustomSerializeHelper实现接口ISerializeHelper

using ByteTest.Core.Models;

namespace ByteTest.Core.Helpers;

public class CustomSerializeHelper : ISerializeHelper
{
    public byte[] Serialize(Organization data)
    {
        // 1、计算Id
        var idBuffer = BitConverter.GetBytes(data.Id);

        // 2、计算Tag数组
        var tagBuffer = GetBytes(data.Tags);

        // 3、计算Members
        var membersBuffer = GetBytes(data.Members);

        return GetBytes(new[] { idBuffer, tagBuffer, membersBuffer });
    }

    public Organization? Deserialize(byte[] buffer)
    {
        var data = new Organization();
        var index = 0;

        data.Id = BitConverter.ToInt32(buffer, index);
        index += sizeof(int);

        data.Tags = GetTags(buffer, ref index);
        data.Members = GetMembers(buffer, ref index);
        return data;
    }

    /// <summary>
    /// 获取字符串列表byte[]
    /// </summary>
    /// <param name="data"></param>
    /// <returns></returns>
    private byte[] GetBytes(string[]? data)
    {
        var dataCount = data?.Length ?? 0;
        var dataCountBuffer = BitConverter.GetBytes(dataCount);

        if (dataCount <= 0)
        {
            return dataCountBuffer;
        }

        var dataValueBuffers = data!.Select(item => ByteHelper.GetBytes(item)!).ToArray();
        var dataValueBuffer = GetBytes(dataValueBuffers);
        return GetBytes(new[] { dataCountBuffer, dataValueBuffer });
    }

    /// <summary>
    /// 获取成功列表byte[]
    /// </summary>
    /// <param name="data"></param>
    /// <returns></returns>
    private byte[] GetBytes(List<Member>? data)
    {
        var dataCount = data?.Count ?? 0;
        var dataCountBuffer = BitConverter.GetBytes(dataCount);

        if (dataCount <= 0)
        {
            return dataCountBuffer;
        }

        var dataValueBuffers = data!.Select(item =>
        {
            var idBuffer = BitConverter.GetBytes(item.Id);
            var nameBuffer = ByteHelper.GetBytes(item.Name);
            var descriptionBuffer = ByteHelper.GetBytes(item.Description);
            var addressBuffer = ByteHelper.GetBytes(item.Address);
            var valueBuffer = BitConverter.GetBytes(item.Value);
            var updateTimeBuffer = BitConverter.GetBytes(item.UpdateTime);

            var buffer = GetBytes(new byte[][]
                { idBuffer, nameBuffer, descriptionBuffer, addressBuffer, valueBuffer, updateTimeBuffer });
            return buffer;
        }).ToArray();
        var dataValueBuffer = GetBytes(dataValueBuffers);
        return GetBytes(new[] { dataCountBuffer, dataValueBuffer });
    }

    private byte[] GetBytes(byte[][] data)
    {
        var dataBufferLen = data.Sum(itemBuffer => itemBuffer.Length);
        var dataBuffer = new byte[dataBufferLen];
        var dataIndex = 0;
        for (var i = 0; i < data.Length; i++)
        {
            var itemBuffer = data[i];
            Array.Copy(itemBuffer, 0, dataBuffer, dataIndex, itemBuffer.Length);

            dataIndex += itemBuffer.Length;
        }

        return dataBuffer;
    }

    private string[]? GetTags(byte[] buffer, ref int index)
    {
        var count = BitConverter.ToInt32(buffer, index);
        index += sizeof(int);

        if (count <= 0)
        {
            return default;
        }

        var data = new string[count];
        for (var i = 0; i < count; i++)
        {
            data[i] = GetString(buffer, ref index);
        }

        return data;
    }

    private List<Member>? GetMembers(byte[] buffer, ref int index)
    {
        var count = BitConverter.ToInt32(buffer, index);
        index += sizeof(int);

        if (count <= 0)
        {
            return default;
        }

        var data = new List<Member>();
        for (var i = 0; i < count; i++)
        {
            var people = new Member();

            people.Id = BitConverter.ToInt32(buffer, index);
            index += sizeof(int);

            people.Name = GetString(buffer, ref index);

            people.Description = GetString(buffer, ref index);

            people.Address = GetString(buffer, ref index);

            people.Value = BitConverter.ToDouble(buffer, index);
            index += sizeof(double);

            people.UpdateTime = BitConverter.ToInt64(buffer, index);
            index += sizeof(long);

            data.Add(people);
        }

        return data;
    }

    private string GetString(byte[] buffer, ref int index)
    {
        var count = BitConverter.ToInt32(buffer, index);
        index += sizeof(int);

        if (count <= 0)
        {
            return string.Empty;
        }

        var data = ByteHelper.DefaultEncoding.GetString(buffer, index, count);
        index += count;
        return data;
    }
}

代码不少,看个大概就好,主要用BitConverter.GetBytesBitConverter.ToXXX获取或设置基本数据类型的byte[]数据,对于多byte[]的复制,如果数据较小,用Array.Copy,如果数据较大建议用Buffer.BlockCopy

代码中使用到辅助类ByteHelper,用于计算字符串byte[]:

public static class ByteHelper
{
    public static Encoding DefaultEncoding = Encoding.UTF8;

    /// <summary>
    /// 获取字符串二进制数据:字符串二进制数据=4个字节int表示字符串实际值长度+n个字节表示字符串实际值
    /// </summary>
    /// <param name="str"></param>
    /// <returns></returns>
    public static byte[] GetBytes(string? str)
    {
        if (string.IsNullOrEmpty(str))
        {
            return BitConverter.GetBytes(0);
        }

        var strValueBuffer = DefaultEncoding.GetBytes(str);
        var strValueLen = strValueBuffer.Length;
        var strValueLenBuffer = BitConverter.GetBytes(strValueLen);

        var strBufferLen = sizeof(int) + strValueLen;
        var strBuffer = new byte[strBufferLen];

        var index = 0;
        Array.Copy(strValueLenBuffer, 0, strBuffer, index, sizeof(int));
        index += sizeof(int);

        Array.Copy(strValueBuffer, 0, strBuffer, index, strValueLen);

        return strBuffer;
    }
}

我们修改Test方法,加入CustomSerializeHelper测试:

public static void Test()
{
    var serializeHelpers = new List<ISerializeHelper>
    {
        new JsonSerializeHelper(),
        new CustomSerializeHelper()
    };

    serializeHelpers.ForEach(RunSerialize);
}

程式輸出如下:

2023-12-10 22:45:14 701: JsonSerializeHelper Serialize 2774ms 196225588byte
2023-12-10 22:45:16 613: JsonSerializeHelper Deserialize 1898ms 1000000项
2023-12-10 22:45:17 414: CustomSerializeHelper Serialize 801ms 92854209byte
2023-12-10 22:45:18 072: CustomSerializeHelper Deserialize 657ms 1000000项

能看出優化了不少,自定義二進位序列化減少1億個字節(100mb左右),序列化(組包)快2s左右,反序列化(解包)快1s多。

3.3. BinaryWriter\BinaryReader

BinaryWriterBinaryReader 类是用于以二进制格式写入和读取数据的类。它们分别提供了一系列的方法来写入和读取各种基本数据类型(如int, float, double, string等)的二进制表示。这些类通常与文件流(FileStream)一起使用,但也可以与其他类型的流(如MemoryStream)配合使用。

自定义的方式全手工操作,需要自己进行字节数组的复制,各种转换,有点原始,使用BinaryWriter\BinaryReader进行序列化操作应该二进制序列化的标准用法了。

创建BinarySerializeHelper类,并实现ISerializeHelper接口:

public class BinarySerializeHelper : ISerializeHelper
{
    public byte[] Serialize(Organization data)
    {
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream, ByteHelper.DefaultEncoding);
        writer.Write(data.Id);
        Write(writer, data.Tags);
        Write(writer, data.Members);

        return stream.ToArray();
    }

    public Organization Deserialize(byte[] buffer)
    {
        var data = new Organization();
        using var stream = new MemoryStream(buffer);
        using var reader = new BinaryReader(stream, ByteHelper.DefaultEncoding);
        data.Id = reader.ReadInt32();
        data.Tags = ReadStringList(reader);
        data.Members = ReadPeopleList(reader);

        return data;
    }

    private static void Write(BinaryWriter writer, string[]? data)
    {
        var count = data?.Length ?? 0;
        writer.Write(count);
        if (count <= 0)
        {
            return;
        }

        foreach (var item in data!)
        {
            writer.Write(item);
        }
    }

    private static void Write(BinaryWriter writer, List<Member>? data)
    {
        var count = data?.Count ?? 0;
        writer.Write(count);
        if (count > 0)
        {
            foreach (var item in data)
            {
                writer.Write(item.Id);
                writer.Write(item.Name ?? string.Empty);
                writer.Write(item.Description ?? string.Empty);
                writer.Write(item.Address ?? string.Empty);
                writer.Write(item.Value);
                writer.Write(item.UpdateTime);
            }
        }
    }

    private static string[]? ReadStringList(BinaryReader reader)
    {
        var count = reader.ReadInt32();
        if (count <= 0)
        {
            return default;
        }

        var values = new string[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = reader.ReadString();
        }

        return values;
    }

    private static List<Member>? ReadPeopleList(BinaryReader reader)
    {
        var count = reader.ReadInt32();
        if (count <= 0)
        {
            return default;
        }

        var values = new List<Member>();
        for (int i = 0; i < count; i++)
        {
            var item = new Member();
            item.Id = reader.ReadInt32();
            item.Name = reader.ReadString();
            item.Description = reader.ReadString();
            item.Address = reader.ReadString();
            item.Value = reader.ReadDouble();
            item.UpdateTime = reader.ReadInt64();

            values.Add(item);
        }

        return values;
    }
}

代码也不少,列表的序列化和反序列在上面自定义二进制序列化时就应该封装成方法,通过反射实现通用列表的序列化和反序列化,这一小节也是,不想再折腾了,我们在BenchmarkTest类的Test方法内加上BinarySerializeHelper,再运行程序:

2023-12-10 22:52:56 986: JsonSerializeHelper Serialize 2715ms 196225584byte
2023-12-10 22:52:58 910: JsonSerializeHelper Deserialize 1910ms 1000000项
2023-12-10 22:52:59 730: CustomSerializeHelper Serialize 819ms 92853722byte
2023-12-10 22:53:00 389: CustomSerializeHelper Deserialize 659ms 1000000项
2023-12-10 22:53:00 660: BinarySerializeHelper Serialize 269ms 83853707byte
2023-12-10 22:53:01 466: BinarySerializeHelper Deserialize 806ms 1000000项

BinaryWriter\BinaryReader序列化包又比自定义二进制序列化方式小8.5MB左右,序列化速度也快0.5s,反序列化稍微慢点,不错哟。

3.4. ProtoBuf

大家聽過或使用過google 的 protocol buffers吧?

本小节介绍使用protobuf-net库,这是一个在 .NET 环境中使用的库,它提供了对 Google 的 Protocol Buffers 数据序列化格式的支持。Protocol Buffers 是一种轻量级、高效的结构化数据序列化机制,通常用于跨服务或应用程序的通信,以及数据存储。

安装包protobuf-net

<PackageReference Include="protobuf-net" Version="3.2.30" />

给测试的类添加类序列化特性[ProtoContract]和属性序列化特性[ProtoMember(序列化顺序)]

[ProtoContract]
public class Organization
{
    [ProtoMember(1)] public int Id { get; set; }

    [ProtoMember(2)] public string[]? Tags { get; set; }

    [ProtoMember(3)] public List<Member>? Members { get; set; }
}

[ProtoContract]
public class Member
{
    [ProtoMember(1)] public int Id { get; set; }

    [ProtoMember(2)] public string? Name { get; set; }

    [ProtoMember(3)] public string? Description { get; set; }

    [ProtoMember(4)] public string? Address { get; set; }

    [ProtoMember(5)] public double Value { get; set; }

    [ProtoMember(6)] public long UpdateTime { get; set; }
}

添加ProtoBufSerializeHelper类,并实现ISerializeHelper接口:

using ByteTest.Core.Models;
using ProtoBuf;

namespace ByteTest.Core.Helpers;

public class ProtoBufSerializeHelper : ISerializeHelper
{
    public byte[] Serialize(Organization data)
    {
        using var stream = new MemoryStream();
        Serializer.Serialize(stream, data);
        return stream.ToArray();
    }

    public Organization? Deserialize(byte[] buffer)
    {
        using var stream = new MemoryStream(buffer);
        return Serializer.Deserialize<Organization>(stream);
    }
}

还有一步,给上面的Test方法加上ProtoBufSerializeHelper序列化方式,程序输出如下:

2023-12-10 23:01:17 478: JsonSerializeHelper Serialize 2767ms 196225803byte
2023-12-10 23:01:19 556: JsonSerializeHelper Deserialize 2064ms 1000000项
2023-12-10 23:01:20 350: CustomSerializeHelper Serialize 793ms 92853782byte
2023-12-10 23:01:21 012: CustomSerializeHelper Deserialize 662ms 1000000项
2023-12-10 23:01:21 271: BinarySerializeHelper Serialize 258ms 83853767byte
2023-12-10 23:01:22 086: BinarySerializeHelper Deserialize 815ms 1000000项
2023-12-10 23:01:22 629: ProtoBufSerializeHelper Serialize 542ms 88837248byte
2023-12-10 23:01:23 688: ProtoBufSerializeHelper Deserialize 1058ms 1000000项

咦,protobuf比binarywriter的序列化包還大,並且還慢,難道我用的不對?可能還需要添加壓縮算法吧,後面再研究了,我們繼續看最後一個messagepack,有使用問題歡迎指出。

3.5. MessagePack

介绍最后一种序列化包MessagePack,这是一种高效的二进制序列化格式,它允许数据在不同的系统之间进行快速且紧凑的传输。它类似于JSON,但是更小、更快、更节省空间。

需要安装包MessagePack包:

<PackageReference Include="MessagePack" Version="2.6.100-alpha" />

添加类MessagePackSerializeHelper,并实现接口ISerializeHelper

using ByteTest.Core.Models;
using MessagePack;

namespace ByteTest.Core.Helpers;

public class MessagePackSerializeHelper : ISerializeHelper
{
    // 这种方式需要在类和字段上添加特性,稍显麻烦,但添加压缩选项后,组包体积、组包和解包速度更快
    //readonly MessagePackSerializerOptions _options = MessagePackSerializerOptions.Standard.WithCompression(MessagePackCompression.Lz4BlockArray);

    // 这种方式不需要给传输对象添加特性,也可添加压缩选项
    readonly MessagePackSerializerOptions _options =
        MessagePack.Resolvers.ContractlessStandardResolver.Options.WithCompression(MessagePackCompression
            .Lz4BlockArray);

    public byte[] Serialize(Organization data)
    {
        return MessagePackSerializer.Serialize(data, _options);
    }

    public Organization? Deserialize(byte[] buffer)
    {
        return MessagePackSerializer.Deserialize<Organization>(buffer, _options);
    }
}

看上面的注釋代碼,提供的選項可能更優,壓縮體積更小,後面我們添加測試。

最後修改test方法:

   public static void Test(List<ISerializeHelper>? moreHelpers = null)
    {
        var serializeHelpers = new List<ISerializeHelper>
        {
            new JsonSerializeHelper(),
            new CustomSerializeHelper(),
            new BinarySerializeHelper(),
            new ProtoBufSerializeHelper(),
            new MessagePackSerializeHelper(),
        };
        if (moreHelpers?.Count() > 0)
        {
            serializeHelpers.AddRange(moreHelpers);
        }

        serializeHelpers.ForEach(RunSerialize);
    }

運行程式輸出:

2023-12-11 21:34:47 782: JsonSerializeHelper Serialize 2456ms 196225500byte
2023-12-11 21:34:51 215: JsonSerializeHelper Deserialize 3430ms 1000000项
2023-12-11 21:34:52 186: CustomSerializeHelper Serialize 970ms 92853911byte
2023-12-11 21:34:52 711: CustomSerializeHelper Deserialize 526ms 1000000项
2023-12-11 21:34:53 734: BinarySerializeHelper Serialize 1022ms 83853896byte
2023-12-11 21:34:54 354: BinarySerializeHelper Deserialize 620ms 1000000项
2023-12-11 21:34:55 170: ProtoBufSerializeHelper Serialize 815ms 88837377byte
2023-12-11 21:34:56 205: ProtoBufSerializeHelper Deserialize 1035ms 1000000项
2023-12-11 21:34:57 123: MessagePackSerializeHelper Serialize 917ms 43583878byte
2023-12-11 21:34:58 527: MessagePackSerializeHelper Deserialize 1403ms 1000000项

再切換上面的注釋選項,代碼修改如下:

using ByteTest.Core.Models;
using MessagePack;

namespace ByteTest.Core.Helpers;

public class MessagePackSerializeHelper : ISerializeHelper
{
    // 这种方式需要在类和字段上添加特性,稍显麻烦,但添加压缩选项后,组包体积、组包和解包速度更快
    readonly MessagePackSerializerOptions _options = MessagePackSerializerOptions.Standard.WithCompression(MessagePackCompression.Lz4BlockArray);

    // 这种方式不需要给传输对象添加特性,也可添加压缩选项
    //readonly MessagePackSerializerOptions _options =
    //    MessagePack.Resolvers.ContractlessStandardResolver.Options.WithCompression(MessagePackCompression
    //        .Lz4BlockArray);

    public byte[] Serialize(Organization data)
    {
        return MessagePackSerializer.Serialize(data, _options);
    }

    public Organization? Deserialize(byte[] buffer)
    {
        return MessagePackSerializer.Deserialize<Organization>(buffer, _options);
    }
}

并给传输类添加特性[MessagePackObject],需要序列化的属性添加特性[Key(序列化索引)]

using MessagePack;
using ProtoBuf;

namespace ByteTest.Core.Models;

[ProtoContract]
[MessagePackObject]
public class Organization
{
    [ProtoMember(1)] [Key(0)] public int Id { get; set; }

    [ProtoMember(2)] [Key(1)] public string[]? Tags { get; set; }

    [ProtoMember(3)] [Key(2)] public List<Member>? Members { get; set; }
}

[ProtoContract]
[MessagePackObject]
public class Member
{
    [ProtoMember(1)] [Key(0)] public int Id { get; set; }

    [ProtoMember(2)] [Key(1)] public string? Name { get; set; }

    [ProtoMember(3)] [Key(2)] public string? Description { get; set; }

    [ProtoMember(4)] [Key(3)] public string? Address { get; set; }

    [ProtoMember(5)] [Key(4)] public double Value { get; set; }

    [ProtoMember(6)] [Key(5)] public long UpdateTime { get; set; }
}

程式輸出:

2023-12-11 21:49:34 153: JsonSerializeHelper Serialize 2383ms 196226140byte
2023-12-11 21:49:37 736: JsonSerializeHelper Deserialize 3581ms 1000000项
2023-12-11 21:49:38 720: CustomSerializeHelper Serialize 983ms 92854251byte
2023-12-11 21:49:39 250: CustomSerializeHelper Deserialize 530ms 1000000项
2023-12-11 21:49:40 273: BinarySerializeHelper Serialize 1023ms 83854236byte
2023-12-11 21:49:40 907: BinarySerializeHelper Deserialize 632ms 1000000项
2023-12-11 21:49:41 660: ProtoBufSerializeHelper Serialize 754ms 88837717byte
2023-12-11 21:49:42 676: ProtoBufSerializeHelper Deserialize 1014ms 1000000项
2023-12-11 21:49:43 357: MessagePackSerializeHelper Serialize 681ms 38706475byte
2023-12-11 21:49:44 344: MessagePackSerializeHelper Deserialize 986ms 1000000项

這裡再貼出上一個選項messagepack的輸出:

2023-12-11 21:34:57 123: MessagePackSerializeHelper Serialize 917ms 43583878byte
2023-12-11 21:34:58 527: MessagePackSerializeHelper Deserialize 1403ms 1000000项

看出還是第一個選項序列化包體積和速度更優秀。

方案分析

100萬條測試數據,5種序列化方法測試統計數據列出表格如下:

序列化方法 Json 自定義二進位 Binary ProtoBuf MessagePack
序列化包大小(mb) 187.13 88.55 79.97 84.72 36.91
序列化用時(ms) 2383 983 1023 754 681
反序列化用時(ms) 3581 530 632 1014 986

通過上表分析,序列化後,messagepack的包最小,為36.91mb,json最大達到187.13mb,另三種在80mb左右;如果考慮序列化效率messagepack最好,反序列化效率竟然是我的自定義二進位方式最快?我們去掉json方式,再運行一次:

2023-12-11 21:55:47 813: CustomSerializeHelper Serialize 1263ms 92854890byte
2023-12-11 21:55:48 804: CustomSerializeHelper Deserialize 989ms 1000000项
2023-12-11 21:55:49 215: BinarySerializeHelper Serialize 410ms 83854875byte
2023-12-11 21:55:50 081: BinarySerializeHelper Deserialize 866ms 1000000项
2023-12-11 21:55:50 726: ProtoBufSerializeHelper Serialize 644ms 88838356byte
2023-12-11 21:55:51 725: ProtoBufSerializeHelper Deserialize 999ms 1000000项
2023-12-11 21:55:52 426: MessagePackSerializeHelper Serialize 701ms 38701799byte
2023-12-11 21:55:53 427: MessagePackSerializeHelper Deserialize 999ms 1000000项
序列化方法 自定義二進位 Binary ProtoBuf MessagePack
序列化包大小(mb) 88.55 79.97 84.72 36.91
序列化用時(ms) 1263 410 644 701
反序列化用時(ms) 989 866 999 999

組包大小不變,序列化使用binarywriter最快,反序列也是binaryreader,測試數據不可靠呀,我們使用基準測試。

基準測試

安裝benchmarkdotnet包用於基準測試:

<PackageReference Include="BenchmarkDotNet" Version="0.13.11" />

修改benchmarktest類:

using BenchmarkDotNet.Attributes;
using ByteTest.Core.Helpers;
using ByteTest.Core.Models;
using MessagePack;
using System.Diagnostics;

namespace ByteTest.Core.Test;

[MemoryDiagnoser, RankColumn]
public class BenchmarkTest
{
    // 省略测试数据代码

    //[Benchmark]
    //public void JsonByteSerialize()
    //{
    //    RunSerialize(new JsonSerializeHelper());
    //}

    [Benchmark]
    public void CustomSerialize()
    {
        RunSerialize(new CustomSerializeHelper());
    }

    [Benchmark]
    public void BinarySerialize()
    {
        RunSerialize(new BinarySerializeHelper());
    }

    [Benchmark]
    public void ProtoBufPackSerialize()
    {
        RunSerialize(new ProtoBufSerializeHelper());
    }

    [Benchmark]
    public void MessagePackSerialize()
    {
        RunSerialize(new MessagePackSerializeHelper());
    }
	
    // 省略统计相关代码
}

修改program.cs:

using BenchmarkDotNet.Running;
using ByteTest.Core.Test;

// 运行基准测试
BenchmarkRunner.Run<BenchmarkTest>();

// 普通测试
//BenchmarkTest.Test();

測試結果如下:

Method Mean Error StdDev Rank Gen0 Gen1 Gen2 Allocated
CustomSerialize 1.702 s 0.0120 s 0.0094 s 4 156000.0000 45000.0000 2000.0000 1230.31 MB
BinarySerialize 1.100 s 0.0101 s 0.0084 s 1 38000.0000 14000.0000 2000.0000 566.16 MB
ProtoBufPackSerialize 1.337 s 0.0190 s 0.0159 s 2 38000.0000 14000.0000 2000.0000 581.66 MB
MessagePackSerialize 1.544 s 0.0222 s 0.0197 s 3 68000.0000 29000.0000 1000.0000 449.67 MB

大致看下,體積肯定messagepack占優秀,網絡傳輸中分片更少,意味著網絡來回時間花費少;組包(封包)和解包(拆包)原生的binarywriter和binaryreader更優。

4. 總結

總的來說,數據包大小需要根據網絡環境和設備能力來合理設置,以確保高效的數據傳輸。同時,高效的組包和解包處理能力對於維持網絡傳輸性能也是至關重要的,前者可考慮messagepack進行壓縮,後者考慮原生binarywriter和binaryreader。

你有更好的方式推荐吗?欢迎留言讨论,您也可以给本文测试代码提PR哦,链接地址是ByteTest

參考

Keep Exploring

延伸阅读

更多文章
同分类 / 同标签 2026/2/7

aot使用經驗總結

從項目創建伊始,就應養成良好的習慣,即只要添加了新功能或使用了較新的語法,就及時進行 aot 發布測試。

继续阅读