** Contents **
- present context
- Build test data
- scheme comparison
- JSON serialization
- Custom binary serialization
- BinaryWriter\BinaryReader
- ProtoBuf
- MessagePack
- scheme analysis
- benchmarking
- summary
1. present context
Hello everyone, I am a wolf at the end of the desert.
在网络通信中,数据序列化是将对象状态转换为可存储或可传输的形式的过程,这对于TCP网络传输尤为关键。在项目中,当需要处理几十万条数据的传输时,传统的Json序列化方式由于其冗余的字段名和字符串格式,导致了二进制包体积庞大,且序列化与反序列化的效率低下。为了解决这些问题,我考虑采用更加高效的序列化方法,以减少包大小并提升处理速度。本文将探讨自定义二进制序列化、BinaryWriter/BinaryReader、MessagePack和ProtoBuf等4种序列化方法,并通过比较它们的性能,为大家提供我目前认为的最佳实践指南。
2. Build test data
创建C#控制台程序,添加Organization和Member两个类,类中包含基本的数据类型和List<T>,其他数组、字典可以自行扩展:
public class Organization
{
public int Id { get; set; }
public string[]? Tags { get; set; }
public List<Member>? Members { get; set; }
}
public class Member
{
public int Id { get; set; }
public string? Name { get; set; }
public string? Description { get; set; }
public string? Address { get; set; }
public double Value { get; set; }
public long UpdateTime { get; set; }
}
Create 1 million pieces of test data, in order to echo the title:
public class BenchmarkTest
{
/// <summary>
/// 测试数据量
/// </summary>
private const int DataCount = 1000000;
private static readonly Random RandomShared = new(DateTime.Now.Millisecond);
/// <summary>
/// 测试数据
/// </summary>
private static readonly Organization TestData = new()
{
Id = 1,
Tags = Enumerable.Range(0, 5).Select(index => $"测试标签{index}").ToArray(),
Members = Enumerable.Range(0, DataCount).Select(index => new Member()
{
Id = index,
Name = $"测试名字{index}",
Description = $"测试描述{RandomShared.Next(1, int.MaxValue)}",
Address = $"测试地址{RandomShared.Next(1, int.MaxValue)}",
Value = RandomShared.Next(1, int.MaxValue),
UpdateTime = ((DateTimeOffset)DateTime.Now).ToUnixTimeMilliseconds()
}).ToList()
};
}
3. scheme comparison
首先创建序列化接口ISerializeHelper,各种序列化提供程序需要分别实现:
public interface ISerializeHelper
{
byte[] Serialize(Organization data);
Organization? Deserialize(byte[] buffer);
}
再创建BenchmarkTest类,添加RunSerialize方法用于执行序列化提供程序,在此方法中依次调用提供程序的序列化和反序列方法,统一对测试方法进行耗时和组包大小打印输出统计:
public class BenchmarkTest
{
// ...省略前面的代码
private static void RunSerialize(ISerializeHelper helper)
{
Stopwatch sw = Stopwatch.StartNew();
var buffer = helper.Serialize(TestData);
sw.Stop();
Log($"{helper.GetType().Name} Serialize {sw.ElapsedMilliseconds}ms {buffer.Length}byte");
sw.Restart();
var data = helper.Deserialize(buffer);
sw.Stop();
Log($"{helper.GetType().Name} Deserialize {sw.ElapsedMilliseconds}ms {data?.Members?.Count}项");
}
private static void Log(string log)
{
Console.WriteLine($"{DateTime.Now:yyyy-MM-dd HH:mm:ss fff}: {log}");
}
}
3.1. JSON serialization
首先测试Json序列化,安装System.Text.Json包:
<PackageReference Include="System.Text.Json" Version="8.0.0" />
创建JsonSerializeHelper提供程序,并实现ISerializeHelper接口:
using ByteTest.Core.Models;
using System.Text.Json;
namespace ByteTest.Core.Helpers;
public class JsonSerializeHelper : ISerializeHelper
{
public byte[] Serialize(Organization data)
{
return JsonSerializer.SerializeToUtf8Bytes(data);
}
public Organization? Deserialize(byte[] buffer)
{
var data = JsonSerializer.Deserialize<Organization>(buffer);
return data;
}
}
JsonSerializer.SerializeToUtf8Bytes 方法会直接生成字节数据,而不是先生成字符串然后再转换为字节,这对于一些需要将 JSON 数据写入文件或网络流的场景非常有用,因为这些场景通常需要字节数据而不是字符串。此外,由于避免了不必要的字符串分配,它还可以提高性能并减少内存压力。
再在BenchmarkTest类添加测试方法Test:
public static void Test()
{
RunSerialize(new JsonByteSerializeHelper());
}
在Program中调用Test()方法:
BenchmarkTest.Test();
The program output is as follows:
2023-12-10 22:28:24 880: JsonByteSerializeHelper Serialize 2813ms 196227181byte
2023-12-10 22:28:26 858: JsonByteSerializeHelper Deserialize 1964ms 1000000项
Json序列化100万条数据,需要2.8s,数据包187.14MB,真大、真慢。
3.2. Custom binary serialization
Next, I will test the custom binary serialization. This is the method I used to commonly use, but it is still tedious at present. First, I will define the packet field specification:
| data type | binary length | description |
|---|---|---|
| Number type (short\rush\int\uint\long\ulong\double, etc.) | 2\2\4\4\8\8\8 | The basic number type is fixed length |
| string | 4+n | Use 4 bytes of type int to represent the binary length of the string, and n to represent the actual length of the string binary array |
T[]`List |
4+n | An array or list is similar to a string. 4 bytes of type int are used to represent the binary length of the array or list, and n represents the actual length of the array or list binary array. |
添加CustomSerializeHelper实现接口ISerializeHelper:
using ByteTest.Core.Models;
namespace ByteTest.Core.Helpers;
public class CustomSerializeHelper : ISerializeHelper
{
public byte[] Serialize(Organization data)
{
// 1、计算Id
var idBuffer = BitConverter.GetBytes(data.Id);
// 2、计算Tag数组
var tagBuffer = GetBytes(data.Tags);
// 3、计算Members
var membersBuffer = GetBytes(data.Members);
return GetBytes(new[] { idBuffer, tagBuffer, membersBuffer });
}
public Organization? Deserialize(byte[] buffer)
{
var data = new Organization();
var index = 0;
data.Id = BitConverter.ToInt32(buffer, index);
index += sizeof(int);
data.Tags = GetTags(buffer, ref index);
data.Members = GetMembers(buffer, ref index);
return data;
}
/// <summary>
/// 获取字符串列表byte[]
/// </summary>
/// <param name="data"></param>
/// <returns></returns>
private byte[] GetBytes(string[]? data)
{
var dataCount = data?.Length ?? 0;
var dataCountBuffer = BitConverter.GetBytes(dataCount);
if (dataCount <= 0)
{
return dataCountBuffer;
}
var dataValueBuffers = data!.Select(item => ByteHelper.GetBytes(item)!).ToArray();
var dataValueBuffer = GetBytes(dataValueBuffers);
return GetBytes(new[] { dataCountBuffer, dataValueBuffer });
}
/// <summary>
/// 获取成功列表byte[]
/// </summary>
/// <param name="data"></param>
/// <returns></returns>
private byte[] GetBytes(List<Member>? data)
{
var dataCount = data?.Count ?? 0;
var dataCountBuffer = BitConverter.GetBytes(dataCount);
if (dataCount <= 0)
{
return dataCountBuffer;
}
var dataValueBuffers = data!.Select(item =>
{
var idBuffer = BitConverter.GetBytes(item.Id);
var nameBuffer = ByteHelper.GetBytes(item.Name);
var descriptionBuffer = ByteHelper.GetBytes(item.Description);
var addressBuffer = ByteHelper.GetBytes(item.Address);
var valueBuffer = BitConverter.GetBytes(item.Value);
var updateTimeBuffer = BitConverter.GetBytes(item.UpdateTime);
var buffer = GetBytes(new byte[][]
{ idBuffer, nameBuffer, descriptionBuffer, addressBuffer, valueBuffer, updateTimeBuffer });
return buffer;
}).ToArray();
var dataValueBuffer = GetBytes(dataValueBuffers);
return GetBytes(new[] { dataCountBuffer, dataValueBuffer });
}
private byte[] GetBytes(byte[][] data)
{
var dataBufferLen = data.Sum(itemBuffer => itemBuffer.Length);
var dataBuffer = new byte[dataBufferLen];
var dataIndex = 0;
for (var i = 0; i < data.Length; i++)
{
var itemBuffer = data[i];
Array.Copy(itemBuffer, 0, dataBuffer, dataIndex, itemBuffer.Length);
dataIndex += itemBuffer.Length;
}
return dataBuffer;
}
private string[]? GetTags(byte[] buffer, ref int index)
{
var count = BitConverter.ToInt32(buffer, index);
index += sizeof(int);
if (count <= 0)
{
return default;
}
var data = new string[count];
for (var i = 0; i < count; i++)
{
data[i] = GetString(buffer, ref index);
}
return data;
}
private List<Member>? GetMembers(byte[] buffer, ref int index)
{
var count = BitConverter.ToInt32(buffer, index);
index += sizeof(int);
if (count <= 0)
{
return default;
}
var data = new List<Member>();
for (var i = 0; i < count; i++)
{
var people = new Member();
people.Id = BitConverter.ToInt32(buffer, index);
index += sizeof(int);
people.Name = GetString(buffer, ref index);
people.Description = GetString(buffer, ref index);
people.Address = GetString(buffer, ref index);
people.Value = BitConverter.ToDouble(buffer, index);
index += sizeof(double);
people.UpdateTime = BitConverter.ToInt64(buffer, index);
index += sizeof(long);
data.Add(people);
}
return data;
}
private string GetString(byte[] buffer, ref int index)
{
var count = BitConverter.ToInt32(buffer, index);
index += sizeof(int);
if (count <= 0)
{
return string.Empty;
}
var data = ByteHelper.DefaultEncoding.GetString(buffer, index, count);
index += count;
return data;
}
}
代码不少,看个大概就好,主要用BitConverter.GetBytes和BitConverter.ToXXX获取或设置基本数据类型的byte[]数据,对于多byte[]的复制,如果数据较小,用Array.Copy,如果数据较大建议用Buffer.BlockCopy。
代码中使用到辅助类ByteHelper,用于计算字符串byte[]:
public static class ByteHelper
{
public static Encoding DefaultEncoding = Encoding.UTF8;
/// <summary>
/// 获取字符串二进制数据:字符串二进制数据=4个字节int表示字符串实际值长度+n个字节表示字符串实际值
/// </summary>
/// <param name="str"></param>
/// <returns></returns>
public static byte[] GetBytes(string? str)
{
if (string.IsNullOrEmpty(str))
{
return BitConverter.GetBytes(0);
}
var strValueBuffer = DefaultEncoding.GetBytes(str);
var strValueLen = strValueBuffer.Length;
var strValueLenBuffer = BitConverter.GetBytes(strValueLen);
var strBufferLen = sizeof(int) + strValueLen;
var strBuffer = new byte[strBufferLen];
var index = 0;
Array.Copy(strValueLenBuffer, 0, strBuffer, index, sizeof(int));
index += sizeof(int);
Array.Copy(strValueBuffer, 0, strBuffer, index, strValueLen);
return strBuffer;
}
}
我们修改Test方法,加入CustomSerializeHelper测试:
public static void Test()
{
var serializeHelpers = new List<ISerializeHelper>
{
new JsonSerializeHelper(),
new CustomSerializeHelper()
};
serializeHelpers.ForEach(RunSerialize);
}
The program output is as follows:
2023-12-10 22:45:14 701: JsonSerializeHelper Serialize 2774ms 196225588byte
2023-12-10 22:45:16 613: JsonSerializeHelper Deserialize 1898ms 1000000项
2023-12-10 22:45:17 414: CustomSerializeHelper Serialize 801ms 92854209byte
2023-12-10 22:45:18 072: CustomSerializeHelper Deserialize 657ms 1000000项
It can be seen that there have been a lot of optimizations. Custom binary serialization is reduced by 100 million bytes (about 100MB), serialization (packaging) is about 2 seconds faster, and deserialization (unpacking) is more than 1 second faster.
3.3. BinaryWriter\BinaryReader
BinaryWriter 和 BinaryReader 类是用于以二进制格式写入和读取数据的类。它们分别提供了一系列的方法来写入和读取各种基本数据类型(如int, float, double, string等)的二进制表示。这些类通常与文件流(FileStream)一起使用,但也可以与其他类型的流(如MemoryStream)配合使用。
自定义的方式全手工操作,需要自己进行字节数组的复制,各种转换,有点原始,使用BinaryWriter\BinaryReader进行序列化操作应该二进制序列化的标准用法了。
创建BinarySerializeHelper类,并实现ISerializeHelper接口:
public class BinarySerializeHelper : ISerializeHelper
{
public byte[] Serialize(Organization data)
{
using var stream = new MemoryStream();
using var writer = new BinaryWriter(stream, ByteHelper.DefaultEncoding);
writer.Write(data.Id);
Write(writer, data.Tags);
Write(writer, data.Members);
return stream.ToArray();
}
public Organization Deserialize(byte[] buffer)
{
var data = new Organization();
using var stream = new MemoryStream(buffer);
using var reader = new BinaryReader(stream, ByteHelper.DefaultEncoding);
data.Id = reader.ReadInt32();
data.Tags = ReadStringList(reader);
data.Members = ReadPeopleList(reader);
return data;
}
private static void Write(BinaryWriter writer, string[]? data)
{
var count = data?.Length ?? 0;
writer.Write(count);
if (count <= 0)
{
return;
}
foreach (var item in data!)
{
writer.Write(item);
}
}
private static void Write(BinaryWriter writer, List<Member>? data)
{
var count = data?.Count ?? 0;
writer.Write(count);
if (count > 0)
{
foreach (var item in data)
{
writer.Write(item.Id);
writer.Write(item.Name ?? string.Empty);
writer.Write(item.Description ?? string.Empty);
writer.Write(item.Address ?? string.Empty);
writer.Write(item.Value);
writer.Write(item.UpdateTime);
}
}
}
private static string[]? ReadStringList(BinaryReader reader)
{
var count = reader.ReadInt32();
if (count <= 0)
{
return default;
}
var values = new string[count];
for (int i = 0; i < count; i++)
{
values[i] = reader.ReadString();
}
return values;
}
private static List<Member>? ReadPeopleList(BinaryReader reader)
{
var count = reader.ReadInt32();
if (count <= 0)
{
return default;
}
var values = new List<Member>();
for (int i = 0; i < count; i++)
{
var item = new Member();
item.Id = reader.ReadInt32();
item.Name = reader.ReadString();
item.Description = reader.ReadString();
item.Address = reader.ReadString();
item.Value = reader.ReadDouble();
item.UpdateTime = reader.ReadInt64();
values.Add(item);
}
return values;
}
}
代码也不少,列表的序列化和反序列在上面自定义二进制序列化时就应该封装成方法,通过反射实现通用列表的序列化和反序列化,这一小节也是,不想再折腾了,我们在BenchmarkTest类的Test方法内加上BinarySerializeHelper,再运行程序:
2023-12-10 22:52:56 986: JsonSerializeHelper Serialize 2715ms 196225584byte
2023-12-10 22:52:58 910: JsonSerializeHelper Deserialize 1910ms 1000000项
2023-12-10 22:52:59 730: CustomSerializeHelper Serialize 819ms 92853722byte
2023-12-10 22:53:00 389: CustomSerializeHelper Deserialize 659ms 1000000项
2023-12-10 22:53:00 660: BinarySerializeHelper Serialize 269ms 83853707byte
2023-12-10 22:53:01 466: BinarySerializeHelper Deserialize 806ms 1000000项
BinaryWriter\BinaryReader序列化包又比自定义二进制序列化方式小8.5MB左右,序列化速度也快0.5s,反序列化稍微慢点,不错哟。
3.4. ProtoBuf
Have you heard of or used Google's Protocol Buffers?
本小节介绍使用protobuf-net库,这是一个在 .NET 环境中使用的库,它提供了对 Google 的 Protocol Buffers 数据序列化格式的支持。Protocol Buffers 是一种轻量级、高效的结构化数据序列化机制,通常用于跨服务或应用程序的通信,以及数据存储。
安装包protobuf-net:
<PackageReference Include="protobuf-net" Version="3.2.30" />
给测试的类添加类序列化特性[ProtoContract]和属性序列化特性[ProtoMember(序列化顺序)]:
[ProtoContract]
public class Organization
{
[ProtoMember(1)] public int Id { get; set; }
[ProtoMember(2)] public string[]? Tags { get; set; }
[ProtoMember(3)] public List<Member>? Members { get; set; }
}
[ProtoContract]
public class Member
{
[ProtoMember(1)] public int Id { get; set; }
[ProtoMember(2)] public string? Name { get; set; }
[ProtoMember(3)] public string? Description { get; set; }
[ProtoMember(4)] public string? Address { get; set; }
[ProtoMember(5)] public double Value { get; set; }
[ProtoMember(6)] public long UpdateTime { get; set; }
}
添加ProtoBufSerializeHelper类,并实现ISerializeHelper接口:
using ByteTest.Core.Models;
using ProtoBuf;
namespace ByteTest.Core.Helpers;
public class ProtoBufSerializeHelper : ISerializeHelper
{
public byte[] Serialize(Organization data)
{
using var stream = new MemoryStream();
Serializer.Serialize(stream, data);
return stream.ToArray();
}
public Organization? Deserialize(byte[] buffer)
{
using var stream = new MemoryStream(buffer);
return Serializer.Deserialize<Organization>(stream);
}
}
还有一步,给上面的Test方法加上ProtoBufSerializeHelper序列化方式,程序输出如下:
2023-12-10 23:01:17 478: JsonSerializeHelper Serialize 2767ms 196225803byte
2023-12-10 23:01:19 556: JsonSerializeHelper Deserialize 2064ms 1000000项
2023-12-10 23:01:20 350: CustomSerializeHelper Serialize 793ms 92853782byte
2023-12-10 23:01:21 012: CustomSerializeHelper Deserialize 662ms 1000000项
2023-12-10 23:01:21 271: BinarySerializeHelper Serialize 258ms 83853767byte
2023-12-10 23:01:22 086: BinarySerializeHelper Deserialize 815ms 1000000项
2023-12-10 23:01:22 629: ProtoBufSerializeHelper Serialize 542ms 88837248byte
2023-12-10 23:01:23 688: ProtoBufSerializeHelper Deserialize 1058ms 1000000项
Eh, ProtoBuf is bigger and slower than BinaryWriter's serialization package. Am I using it wrong? You may need to add a compression algorithm. We'll study it later. Let's continue to look at the last MessagePack. Please point out if you have any usage problems.
3.5. MessagePack
介绍最后一种序列化包MessagePack,这是一种高效的二进制序列化格式,它允许数据在不同的系统之间进行快速且紧凑的传输。它类似于JSON,但是更小、更快、更节省空间。
需要安装包MessagePack包:
<PackageReference Include="MessagePack" Version="2.6.100-alpha" />
添加类MessagePackSerializeHelper,并实现接口ISerializeHelper:
using ByteTest.Core.Models;
using MessagePack;
namespace ByteTest.Core.Helpers;
public class MessagePackSerializeHelper : ISerializeHelper
{
// 这种方式需要在类和字段上添加特性,稍显麻烦,但添加压缩选项后,组包体积、组包和解包速度更快
//readonly MessagePackSerializerOptions _options = MessagePackSerializerOptions.Standard.WithCompression(MessagePackCompression.Lz4BlockArray);
// 这种方式不需要给传输对象添加特性,也可添加压缩选项
readonly MessagePackSerializerOptions _options =
MessagePack.Resolvers.ContractlessStandardResolver.Options.WithCompression(MessagePackCompression
.Lz4BlockArray);
public byte[] Serialize(Organization data)
{
return MessagePackSerializer.Serialize(data, _options);
}
public Organization? Deserialize(byte[] buffer)
{
return MessagePackSerializer.Deserialize<Organization>(buffer, _options);
}
}
Looking at the comment code above, the options provided may be better and the compression volume will be smaller. Let's add tests later.
Finally, modify the Test method:
public static void Test(List<ISerializeHelper>? moreHelpers = null)
{
var serializeHelpers = new List<ISerializeHelper>
{
new JsonSerializeHelper(),
new CustomSerializeHelper(),
new BinarySerializeHelper(),
new ProtoBufSerializeHelper(),
new MessagePackSerializeHelper(),
};
if (moreHelpers?.Count() > 0)
{
serializeHelpers.AddRange(moreHelpers);
}
serializeHelpers.ForEach(RunSerialize);
}
Output from running program:
2023-12-11 21:34:47 782: JsonSerializeHelper Serialize 2456ms 196225500byte
2023-12-11 21:34:51 215: JsonSerializeHelper Deserialize 3430ms 1000000项
2023-12-11 21:34:52 186: CustomSerializeHelper Serialize 970ms 92853911byte
2023-12-11 21:34:52 711: CustomSerializeHelper Deserialize 526ms 1000000项
2023-12-11 21:34:53 734: BinarySerializeHelper Serialize 1022ms 83853896byte
2023-12-11 21:34:54 354: BinarySerializeHelper Deserialize 620ms 1000000项
2023-12-11 21:34:55 170: ProtoBufSerializeHelper Serialize 815ms 88837377byte
2023-12-11 21:34:56 205: ProtoBufSerializeHelper Deserialize 1035ms 1000000项
2023-12-11 21:34:57 123: MessagePackSerializeHelper Serialize 917ms 43583878byte
2023-12-11 21:34:58 527: MessagePackSerializeHelper Deserialize 1403ms 1000000项
Then switch the comment options above, and modify the code as follows:
using ByteTest.Core.Models;
using MessagePack;
namespace ByteTest.Core.Helpers;
public class MessagePackSerializeHelper : ISerializeHelper
{
// 这种方式需要在类和字段上添加特性,稍显麻烦,但添加压缩选项后,组包体积、组包和解包速度更快
readonly MessagePackSerializerOptions _options = MessagePackSerializerOptions.Standard.WithCompression(MessagePackCompression.Lz4BlockArray);
// 这种方式不需要给传输对象添加特性,也可添加压缩选项
//readonly MessagePackSerializerOptions _options =
// MessagePack.Resolvers.ContractlessStandardResolver.Options.WithCompression(MessagePackCompression
// .Lz4BlockArray);
public byte[] Serialize(Organization data)
{
return MessagePackSerializer.Serialize(data, _options);
}
public Organization? Deserialize(byte[] buffer)
{
return MessagePackSerializer.Deserialize<Organization>(buffer, _options);
}
}
并给传输类添加特性[MessagePackObject],需要序列化的属性添加特性[Key(序列化索引)]:
using MessagePack;
using ProtoBuf;
namespace ByteTest.Core.Models;
[ProtoContract]
[MessagePackObject]
public class Organization
{
[ProtoMember(1)] [Key(0)] public int Id { get; set; }
[ProtoMember(2)] [Key(1)] public string[]? Tags { get; set; }
[ProtoMember(3)] [Key(2)] public List<Member>? Members { get; set; }
}
[ProtoContract]
[MessagePackObject]
public class Member
{
[ProtoMember(1)] [Key(0)] public int Id { get; set; }
[ProtoMember(2)] [Key(1)] public string? Name { get; set; }
[ProtoMember(3)] [Key(2)] public string? Description { get; set; }
[ProtoMember(4)] [Key(3)] public string? Address { get; set; }
[ProtoMember(5)] [Key(4)] public double Value { get; set; }
[ProtoMember(6)] [Key(5)] public long UpdateTime { get; set; }
}
Program output:
2023-12-11 21:49:34 153: JsonSerializeHelper Serialize 2383ms 196226140byte
2023-12-11 21:49:37 736: JsonSerializeHelper Deserialize 3581ms 1000000项
2023-12-11 21:49:38 720: CustomSerializeHelper Serialize 983ms 92854251byte
2023-12-11 21:49:39 250: CustomSerializeHelper Deserialize 530ms 1000000项
2023-12-11 21:49:40 273: BinarySerializeHelper Serialize 1023ms 83854236byte
2023-12-11 21:49:40 907: BinarySerializeHelper Deserialize 632ms 1000000项
2023-12-11 21:49:41 660: ProtoBufSerializeHelper Serialize 754ms 88837717byte
2023-12-11 21:49:42 676: ProtoBufSerializeHelper Deserialize 1014ms 1000000项
2023-12-11 21:49:43 357: MessagePackSerializeHelper Serialize 681ms 38706475byte
2023-12-11 21:49:44 344: MessagePackSerializeHelper Deserialize 986ms 1000000项
Here is the output of the previous option, MessagePack:
2023-12-11 21:34:57 123: MessagePackSerializeHelper Serialize 917ms 43583878byte
2023-12-11 21:34:58 527: MessagePackSerializeHelper Deserialize 1403ms 1000000项
I can see that the first option is the serialization package for better size and speed.
scheme analysis
1 million pieces of test data, and the test statistical data of 5 serialization methods are listed in the following table:
| serialization method | Json | custom binary | Binary | ProtoBuf | MessagePack |
|---|---|---|---|---|---|
| Serialization packet size (MB) | 187.13 | 88.55 | 79.97 | 84.72 | 36.91 |
| Serialization time (ms) | 2383 | 983 | 1023 | 754 | 681 |
| deserialization time (ms) | 3581 | 530 | 632 | 1014 | 986 |
Through the analysis in the above table, after serialization, the minimum package of MessagePack is 36.91MB, the maximum of Json reaches 187.13MB, and the other three types are around 80MB; if we consider the serialization efficiency MessagePack is the best, but the deserialization efficiency is actually the fastest in my custom binary method? Let's remove the Json method and run it again:
2023-12-11 21:55:47 813: CustomSerializeHelper Serialize 1263ms 92854890byte
2023-12-11 21:55:48 804: CustomSerializeHelper Deserialize 989ms 1000000项
2023-12-11 21:55:49 215: BinarySerializeHelper Serialize 410ms 83854875byte
2023-12-11 21:55:50 081: BinarySerializeHelper Deserialize 866ms 1000000项
2023-12-11 21:55:50 726: ProtoBufSerializeHelper Serialize 644ms 88838356byte
2023-12-11 21:55:51 725: ProtoBufSerializeHelper Deserialize 999ms 1000000项
2023-12-11 21:55:52 426: MessagePackSerializeHelper Serialize 701ms 38701799byte
2023-12-11 21:55:53 427: MessagePackSerializeHelper Deserialize 999ms 1000000项
| serialization method | custom binary | Binary | ProtoBuf | MessagePack |
|---|---|---|---|---|
| Serialization packet size (MB) | 88.55 | 79.97 | 84.72 | 36.91 |
| Serialization time (ms) | 1263 | 410 | 644 | 701 |
| deserialization time (ms) | 989 | 866 | 999 | 999 |
The package size remains unchanged, serialization uses BinaryWriter the fastest, and inverse sequencing is also BinaryReader. The test data is unreliable, so we use benchmark testing.
benchmarking
Install the BenchmarkDotNet package for benchmark testing:
<PackageReference Include="BenchmarkDotNet" Version="0.13.11" />
Modify BenchmarkTest class:
using BenchmarkDotNet.Attributes;
using ByteTest.Core.Helpers;
using ByteTest.Core.Models;
using MessagePack;
using System.Diagnostics;
namespace ByteTest.Core.Test;
[MemoryDiagnoser, RankColumn]
public class BenchmarkTest
{
// 省略测试数据代码
//[Benchmark]
//public void JsonByteSerialize()
//{
// RunSerialize(new JsonSerializeHelper());
//}
[Benchmark]
public void CustomSerialize()
{
RunSerialize(new CustomSerializeHelper());
}
[Benchmark]
public void BinarySerialize()
{
RunSerialize(new BinarySerializeHelper());
}
[Benchmark]
public void ProtoBufPackSerialize()
{
RunSerialize(new ProtoBufSerializeHelper());
}
[Benchmark]
public void MessagePackSerialize()
{
RunSerialize(new MessagePackSerializeHelper());
}
// 省略统计相关代码
}
Modify Program.cs:
using BenchmarkDotNet.Running;
using ByteTest.Core.Test;
// 运行基准测试
BenchmarkRunner.Run<BenchmarkTest>();
// 普通测试
//BenchmarkTest.Test();
The test results are as follows:
| Method | Mean | Error | StdDev | Rank | Gen0 | Gen1 | Gen2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| CustomSerialize | 1.702 s | 0.0120 s | 0.0094 s | 4 | 156000.0000 | 45000.0000 | 2000.0000 | 1230.31 MB |
| BinarySerialize | 1.100 s | 0.0101 s | 0.0084 s | 1 | 38000.0000 | 14000.0000 | 2000.0000 | 566.16 MB |
| ProtoBufPackSerialize | 1.337 s | 0.0190 s | 0.0159 s | 2 | 38000.0000 | 14000.0000 | 2000.0000 | 581.66 MB |
| MessagePackSerialize | 1.544 s | 0.0222 s | 0.0197 s | 3 | 68000.0000 | 29000.0000 | 1000.0000 | 449.67 MB |
Roughly looking at it, MessagePack is the best in size, and there are fewer fragments in network transmission, which means that the network spends less round-trip time; the native BinaryWriter and BinaryReader are better for grouping (encapsulating) and unpacking (unpacking).
4. summary
In general, the packet size needs to be set reasonably based on the network environment and device capabilities to ensure efficient data transmission. At the same time, efficient packet formation and unpacking processing capabilities are also crucial to maintaining network transmission performance. The former can consider MessagePack for compression, while the latter can consider native BinaryWriter and BinaryReader.
你有更好的方式推荐吗?欢迎留言讨论,您也可以给本文测试代码提PR哦,链接地址是ByteTest
** Reference **