Author: InCerry
Source: www.cnblogs.com/InCerry/archive/2022/05/05/Dotnet-Opt-Perf-Use-Struct-Instead-Of-Class.html
Copyright: This work is licensed under the "Attribution-NonCommercial-ShareAlike 4.0 International" license agreement.
Statement: The copyright of this blog belongs to "InCerry".
1. Preface
One obvious difference between C# and Java is that C# lets you define your own value types, namely today's protagonist, struct. Since we already have the more convenient class, why did Microsoft add struct as well? That is the performance tip we will discuss today: use structs instead of classes.
So what are the benefits of using structs instead of classes? In what scenarios should a struct replace a class? This article answers these questions one by one.
**Note: All examples in this article assume a 64-bit (x64) platform.**
2. A real-world case
Take a real-world system as an example: buying a plane ticket. You first select the cities and airports (this is the route), then pick a flight and cabin class that suit your travel date and time, and finally pay.

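2.1 Memory footprint
There are roughly 49 airlines and more than 8,000 routes nationwide. Each route averages 20 flights, and each flight averages 10 cabin price groups (economy, first class, and various discount entitlements). An OTA (Online Travel Agency) typically allows tickets to be booked up to a year ahead. That means the platform may hold 8000*20*10*365 ≈ 580 million price records, which we round down to about 500 million (all figures above come from the internet; the real data volume cannot be disclosed).
So that you can find the flight you want faster, the OTA platform takes price data for popular routes out of the database and caches it in memory (memory is far faster than network or disk transfer; see the table below for details). Taking just 20% of the data puts about 100 million records in memory.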
| Operation | Latency |
|---|---|
| Execute one instruction | 1/1,000,000,000 second = 1 nanosecond |
| Read data from L1 cache | 0.5 nanoseconds |
| Branch misprediction | 5 nanoseconds |
| Read data from L2 cache | 7 nanoseconds |
| Lock and unlock a mutex | 25 nanoseconds |
| Read data from main memory (RAM) | 100 nanoseconds |
| Send 2 KB of data over a 1 Gbps network | 20,000 nanoseconds |
| Read 1 MB of data from memory | 250,000 nanoseconds |
| Disk seek: the head moves to a new position (mechanical hard disk) | 8,000,000 nanoseconds |
| Read 1 MB of data from disk | 20,000,000 nanoseconds |
| Send a packet from the United States to Europe and back | 150 milliseconds = 150,000,000 nanoseconds |
Suppose we have the following class with these properties (the real one is far more complex: data is stored along several dimensions such as route and date, and different flights have different selling rules, all of which we ignore for this demonstration). How much space does it take to cache 100 million of these records in memory?
public class FlightPriceClass
{
    /// <summary>
    /// Two-letter airline code, e.g. Air China Limited: CA
    /// </summary>
    public string Airline { get; set; }
    /// <summary>
    /// Three-letter departure airport code, e.g. Shanghai Hongqiao International Airport: SHA
    /// </summary>
    public string Start { get; set; }
    /// <summary>
    /// Three-letter arrival airport code, e.g. Beijing Capital International Airport: PEK
    /// </summary>
    public string End { get; set; }
    /// <summary>
    /// Flight number, e.g. CA0001
    /// </summary>
    public string FlightNo { get; set; }
    /// <summary>
    /// Cabin code, e.g. Y
    /// </summary>
    public string Cabin { get; set; }
    /// <summary>
    /// Price, in yuan
    /// </summary>
    public decimal Price { get; set; }
    /// <summary>
    /// Departure date, e.g. 2017-01-01
    /// </summary>
    public DateOnly DepDate { get; set; }
    /// <summary>
    /// Departure time, e.g. 08:00
    /// </summary>
    public TimeOnly DepTime { get; set; }
    /// <summary>
    /// Arrival date, e.g. 2017-01-01
    /// </summary>
    public DateOnly ArrDate { get; set; }
    /// <summary>
    /// Arrival time, e.g. 08:00
    /// </summary>
    public TimeOnly ArrTime { get; set; }
}
We can write a benchmark to see how much space 1 million records require, and then extrapolate to 100 million.
// Pre-generate 1 million records so that data generation does not skew the measurement
// (BaseTime is a DateTime defined elsewhere in the author's code and not shown in this excerpt)
public static readonly FlightPriceClass[] FlightPrices = Enumerable.Range(0, 1_000_000)
    .Select(index => new FlightPriceClass
    {
        Airline = $"C{(char)(index % 26 + 'A')}",
        Start = $"SH{(char)(index % 26 + 'A')}",
        End = $"PE{(char)(index % 26 + 'A')}",
        FlightNo = $"{index % 1000:0000}",
        Cabin = $"{(char)(index % 26 + 'A')}",
        Price = index % 1000,
        DepDate = DateOnly.FromDateTime(BaseTime.AddHours(index)),
        DepTime = TimeOnly.FromDateTime(BaseTime.AddHours(index)),
        ArrDate = DateOnly.FromDateTime(BaseTime.AddHours(3 + index)),
        ArrTime = TimeOnly.FromDateTime(BaseTime.AddHours(3 + index)),
    }).ToArray();
// Store the data using the class
[Benchmark]
public FlightPriceClass[] GetClassStore()
{
    var arrays = new FlightPriceClass[FlightPrices.Length];
    for (int i = 0; i < FlightPrices.Length; i++)
    {
        var item = FlightPrices[i];
        arrays[i] = new FlightPriceClass
        {
            Airline = item.Airline,
            Start = item.Start,
            End = item.End,
            FlightNo = item.FlightNo,
            Cabin = item.Cabin,
            Price = item.Price,
            DepDate = item.DepDate,
            DepTime = item.DepTime,
            ArrDate = item.ArrDate,
            ArrTime = item.ArrTime
        };
    }
    return arrays;
}
Let's take a look at the final result. The picture is shown below.

As the figure above shows, 1 million records require about 107MB of memory, so each record occupies about 112 bytes, and 100 million records would take about 10.4GB. That is already fairly large, so are there ways to reduce the memory usage? Some readers have suggested a few options:
- Replace strings with int codes
- Store timestamps as long values
- Compress the data with an algorithm such as zip
- And so on
We won't use these methods for now. Given the title of this article, you can probably guess the approach: use a struct instead of a class. We define a struct with the same properties, as shown below.
[StructLayout(LayoutKind.Auto)]
public struct FlightPriceStruct
{
    // Same properties as the class
    ......
}
We can use Unsafe.SizeOf to check how much memory a value type needs, for example as shown below.
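As a minimal sketch of what such a call looks like (this is not the author's original screenshot), the snippet below prints the sizes of the value types used as fields in the example. It assumes .NET 6 or later, where DateOnly and TimeOnly exist. The struct's total size also depends on how the runtime pads and orders fields, so it is not simply the sum of these numbers.

```csharp
using System;
using System.Runtime.CompilerServices;

// Sizes of the value types that appear as fields in the flight-price example.
int decimalSize = Unsafe.SizeOf<decimal>();   // 16 bytes
int dateOnlySize = Unsafe.SizeOf<DateOnly>(); // 4 bytes
int timeOnlySize = Unsafe.SizeOf<TimeOnly>(); // 8 bytes
// For a reference type, SizeOf reports the size of the reference itself,
// not of the object it points to (8 bytes on a 64-bit platform).
int stringRefSize = Unsafe.SizeOf<string>();

Console.WriteLine($"decimal: {decimalSize}, DateOnly: {dateOnlySize}, TimeOnly: {timeOnlySize}, string reference: {stringRefSize}");
```

The same call with the article's own struct as the type argument, `Unsafe.SizeOf<FlightPriceStruct>()`, reports the size of one record.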

You can see that this struct needs only 88 bytes, about 21% less than the 112 bytes the class version needs per record (put the other way round, the class needs about 27% more). Let's see how much memory this actually saves.

The results are great: memory drops from about 107MB to about 84MB, in line with our calculation, assignments are 57% faster, and more importantly, fewer GCs occur.
So why do structs save so much memory? Here we need to talk about the difference between structures and classes storing data. The following figure shows the storage format of class arrays.

We can see that an array of class instances stores only references (pointers) to its elements rather than the data itself, and every instance of a reference type carries the following:
- Object header: 8 bytes. According to the CoreCLR description, it stores "all additional information that needs to be attached to the object", such as the object's lock value or a cached hash code.
- Method table pointer: 8 bytes. It points to the type's description data, the frequently mentioned Method Table (MT), which stores GCInfo, fields, method definitions, and so on.
- Object placeholder: 8 bytes. The current GC requires every object to have at least one pointer-sized field. For an empty class this slot takes 8 bytes in addition to the object header and method table pointer; for a non-empty class it holds the first field.
In other words, an empty class that defines nothing still needs at least 24 bytes: 8 bytes of object header + 8 bytes of method table pointer + 8 bytes of placeholder.
Back to our example: since the class is not empty, each object needs 16 extra bytes for the object header and method table pointer on top of its data. The array also needs 8 bytes for the pointer to each object, so storing one object in the array costs 24 extra bytes. Now let's look at the value type (struct).

From the figure above we can see that an array of value types stores the data directly in the array, with no references involved. Storing the same data therefore saves 24 bytes per element (no object header, no method table pointer, and no pointer to the instance).
Note that the struct array itself is still a reference type, so it carries the same 24 bytes once; its placeholder slot holds the first field of the array type, the array length.
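The overhead described above can be checked with a little arithmetic. The sketch below only restates the article's numbers (88 bytes of data, 16 bytes for the object header and method table pointer, 8 bytes per array slot); the variable names are illustrative and not taken from the author's code.

```csharp
using System;

const int dataBytes = 88;           // field data per record
const int headerAndMethodTable = 16; // object header + method table pointer
const int arraySlot = 8;             // reference held in the class array
const long count = 1_000_000;

long perClassRecord = dataBytes + headerAndMethodTable + arraySlot; // 112 bytes
long perStructRecord = dataBytes;                                   // 88 bytes

static double ToMiB(long bytes) => bytes / (1024.0 * 1024.0);

double classMiB = ToMiB(perClassRecord * count);   // about 106.8 MiB per million records
double structMiB = ToMiB(perStructRecord * count); // about 83.9 MiB per million records
double saving = (double)(perClassRecord - perStructRecord) / perClassRecord; // about 0.214

Console.WriteLine($"class: {perClassRecord} B/record, {classMiB} MiB per million records");
Console.WriteLine($"struct: {perStructRecord} B/record, {structMiB} MiB per million records");
Console.WriteLine($"saving relative to class: {saving}");
```

These values line up with the roughly 107MB and 84MB reported by the benchmarks in this article.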
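We can use the ObjectLayoutInspector NuGet package to print an object's layout information. The layout of the class definition is shown below: besides the 88 bytes needed for data storage, there are 16 bytes of extra space.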

The layout of the struct definition is shown below. Every byte of the struct is actual data storage, with no extra overhead.

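Can we save even more memory? On a 64-bit platform a reference (pointer) takes 8 bytes, and C# strings use UTF-16 by default, so each character takes 2 bytes. Fields shorter than 4 characters, such as the two-letter airline code and the departure and arrival airport codes, can be stored in char arrays instead, which take even less space than a single pointer. Let's modify the code.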
// Skip local variable initialization
[SkipLocalsInit]
// Switch the layout to Explicit so that we control it ourselves
[StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode)]
public struct FlightPriceStructExplicit
{
    // Offsets have to be specified by hand
    [FieldOffset(0)]
    // The airline code is stored as two chars
    public unsafe fixed char Airline[2];
    // The airline code takes 4 bytes, so the departure airport starts at offset 4
    [FieldOffset(4)]
    public unsafe fixed char Start[3];
    // Likewise, the departure airport takes 6 bytes, so the arrival airport starts at offset 10
    [FieldOffset(10)]
    public unsafe fixed char End[3];
    [FieldOffset(16)]
    public unsafe fixed char FlightNo[4];
    [FieldOffset(24)]
    public unsafe fixed char Cabin[2];
    // decimal: 16 bytes
    [FieldOffset(28)]
    public decimal Price;
    // DateOnly: 4 bytes
    [FieldOffset(44)]
    public DateOnly DepDate;
    // TimeOnly: 8 bytes
    [FieldOffset(48)]
    public TimeOnly DepTime;
    [FieldOffset(56)]
    public DateOnly ArrDate;
    [FieldOffset(60)]
    public TimeOnly ArrTime;
}
Let's take a look at the layout information of this new structure object.

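You can see that it now needs only 68 bytes; the final 4 bytes are there for address alignment, because the CPU word size is 64 bits, and we can ignore them. By our calculation this saves about 23% compared with 88 bytes (20 of 88 bytes). Of course, once a field is declared as unsafe fixed char it can no longer be assigned directly; the data has to be copied in, as in the following code.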
// Extension method that copies a string's characters into a fixed buffer.
// The caller must make sure the destination has room for str.Length chars.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe void SetTo(this string str, char* dest)
{
    fixed (char* ptr = str)
    {
        Unsafe.CopyBlock(dest, ptr, (uint)(Unsafe.SizeOf<char>() * str.Length));
    }
}
// The method used by the benchmark
public static unsafe FlightPriceStructExplicit[] GetStructStoreStructExplicit()
{
    var arrays = new FlightPriceStructExplicit[FlightPrices.Length];
    for (int i = 0; i < FlightPrices.Length; i++)
    {
        ref var item = ref FlightPrices[i];
        arrays[i] = new FlightPriceStructExplicit
        {
            Price = item.Price,
            DepDate = item.DepDate,
            DepTime = item.DepTime,
            ArrDate = item.ArrDate,
            ArrTime = item.ArrTime
        };
        ref var val = ref arrays[i];
        // The buffers must be pinned with fixed before they can be written to
        fixed (char* airline = val.Airline)
        fixed (char* start = val.Start)
        fixed (char* end = val.End)
        fixed (char* flightNo = val.FlightNo)
        fixed (char* cabin = val.Cabin)
        {
            item.Airline.SetTo(airline);
            item.Start.SetTo(start);
            item.End.SetTo(end);
            item.FlightNo.SetTo(flightNo);
            item.Cabin.SetTo(cabin);
        }
    }
    return arrays;
}
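Now let's see whether this storage improvement really saves about 23% of the space.

It does: memory drops from 84MB to 65MB, a saving of about 23%. Not bad, and basically in line with expectations.
However, we notice that Gen0, Gen1, and Gen2 collections still happen many times. In practice, because all of this lives in managed memory, the GC scans these 65MB when it collects, which may lengthen its stop-the-world (STW) pauses. Since this is cached data that will not be reclaimed or changed for a while, can we keep the GC from scanning it? Yes: we can use unmanaged memory directly. The Marshal class can allocate and manage unmanaged memory, giving an effect similar to the malloc function in C.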
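// Allocate unmanaged memory
// The argument is the number of bytes to allocate
// The return value is a pointer to the allocated memory
IntPtr Marshal.AllocHGlobal(int cb);
// Free previously allocated unmanaged memory
// The argument is the pointer returned by Marshal when the memory was allocated
void Marshal.FreeHGlobal(IntPtr hglobal);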
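Before the full benchmark, here is a minimal sketch of the allocate, use, and free cycle on its own. It stores a few long values rather than the article's struct and uses Marshal.WriteInt64 and Marshal.ReadInt64 so that no unsafe code is needed; the values are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

const int count = 4;
// Allocate room for `count` 64-bit integers outside the GC heap.
IntPtr buffer = Marshal.AllocHGlobal(sizeof(long) * count);
long sum = 0;
try
{
    // Write 10, 20, 30, 40 at consecutive 8-byte offsets.
    for (int i = 0; i < count; i++)
        Marshal.WriteInt64(buffer, i * sizeof(long), (i + 1) * 10L);
    // Read the values back and add them up.
    for (int i = 0; i < count; i++)
        sum += Marshal.ReadInt64(buffer, i * sizeof(long));
}
finally
{
    // Unmanaged memory is never collected by the GC, so it must be freed explicitly.
    Marshal.FreeHGlobal(buffer);
}
Console.WriteLine(sum); // 10 + 20 + 30 + 40 = 100
```

Wrapping the work in try/finally is one way to make sure the memory is released even if an exception is thrown in between.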
Change Benchmark's code again to use unmanaged memory.
// The out ptr parameter is used to hand the pointer back to the caller
public static unsafe int GetStructStoreUnManageMemory(out IntPtr ptr)
{
    // Allocate with AllocHGlobal; the size is the struct size from SizeOf times the number of records
    var unManagerPtr = Marshal.AllocHGlobal(Unsafe.SizeOf<FlightPriceStructExplicit>() * FlightPrices.Length);
    ptr = unManagerPtr;
    // Expose the memory as a span of FlightPriceStructExplicit
    var arrays = new Span<FlightPriceStructExplicit>(unManagerPtr.ToPointer(), FlightPrices.Length);
    for (int i = 0; i < FlightPrices.Length; i++)
    {
        ref var item = ref FlightPrices[i];
        arrays[i] = new FlightPriceStructExplicit
        {
            Price = item.Price,
            DepDate = item.DepDate,
            DepTime = item.DepTime,
            ArrDate = item.ArrDate,
            ArrTime = item.ArrTime
        };
        ref var val = ref arrays[i];
        fixed (char* airline = val.Airline)
        fixed (char* start = val.Start)
        fixed (char* end = val.End)
        fixed (char* flightNo = val.FlightNo)
        fixed (char* cabin = val.Cabin)
        {
            item.Airline.SetTo(airline);
            item.Start.SetTo(start);
            item.End.SetTo(end);
            item.FlightNo.SetTo(flightNo);
            item.Cabin.SetTo(cabin);
        }
    }
    // Return the number of records
    return arrays.Length;
}
// Remember: unmanaged memory must be freed manually once it is no longer used
[Benchmark]
public void GetStructStoreUnManageMemory()
{
    _ = FlightPriceCreate.GetStructStoreUnManageMemory(out var ptr);
    // Free the unmanaged memory
    Marshal.FreeHGlobal(ptr);
}
Let's take a look at Benchmark's results.

The result is striking. Nothing is allocated in managed memory, and assignment is much faster than before. When a GC runs later, it does not need to scan this block of memory, which reduces GC pressure. This result is quite satisfactory.
So far, 100 million records take roughly 6.3GB. Applying the other improvements listed earlier should shrink that further. For example, the code below replaces strings with enumerated values, stores the price as an integer number of fen (1/100 yuan), and stores only timestamps.
[StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode)]
[SkipLocalsInit]
public struct FlightPriceStructExplicit
{
    // Identify the airline with a byte (range 0 to 255)
    [FieldOffset(0)]
    public byte Airline;
    // Use unsigned 16-bit integers (2^16 values) for the airports and the flight number
    [FieldOffset(1)]
    public UInt16 Start;
    [FieldOffset(3)]
    public UInt16 End;
    [FieldOffset(5)]
    public UInt16 FlightNo;
    [FieldOffset(7)]
    public byte Cabin;
    // No decimal: the price is stored in fen (1/100 yuan)
    [FieldOffset(8)]
    public long PriceFen;
    // Use timestamps instead of date and time types
    [FieldOffset(16)]
    public long DepTime;
    [FieldOffset(24)]
    public long ArrTime;
}
The end result is that each record needs only 32 bytes, so 100 million records take less than 3GB.
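How the string fields get turned into small integers is not shown in the article. One possible encoding, purely as an assumption for illustration, is to treat a three-letter airport code as a base-26 number, which always fits in a UInt16, and to convert the price to whole fen. The helper names EncodeAirport, DecodeAirport, and ToFen are hypothetical.

```csharp
using System;

// Hypothetical helper: pack a three-letter uppercase code into a base-26 number.
static ushort EncodeAirport(string code)
{
    if (code.Length != 3) throw new ArgumentException("expected 3 letters", nameof(code));
    int value = 0;
    foreach (char c in code)
    {
        if (c < 'A' || c > 'Z') throw new ArgumentException("expected A-Z", nameof(code));
        value = value * 26 + (c - 'A');
    }
    return (ushort)value; // at most 26^3 - 1 = 17575, well within 16 bits
}

// Hypothetical helper: reverse of EncodeAirport.
static string DecodeAirport(ushort value)
{
    Span<char> chars = stackalloc char[3];
    for (int i = 2; i >= 0; i--)
    {
        chars[i] = (char)('A' + value % 26);
        value /= 26;
    }
    return new string(chars);
}

// Hypothetical helper: convert a price in yuan to whole fen (1/100 yuan).
static long ToFen(decimal yuan) => (long)decimal.Round(yuan * 100m);

ushort sha = EncodeAirport("SHA");
Console.WriteLine($"{sha} -> {DecodeAirport(sha)}, 499.99 yuan = {ToFen(499.99m)} fen");
```

A lookup table or an enum, as the article suggests, would work just as well; the point is only that each field fits in one or two bytes instead of an 8-byte reference plus a heap-allocated string.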

This article will not continue to discuss these methods.
2.2 Computation speed
So is there any downside to using structs? Let's look at computation. The computation is simple: filter out the records that match some conditions. First, the following methods are defined on the class and on the struct. The Explicit struct is a special case, so it is compared using Span.
// Methods defined on the class and the struct; real-world filtering may of course be more complex
// Compare the airline
public bool EqualsAirline(string airline)
{
    return Airline == airline;
}
// Compare the departure airport
public bool EqualsStart(string start)
{
    return Start == start;
}
// Compare the arrival airport
public bool EqualsEnd(string end)
{
    return End == end;
}
// Compare the flight number
public bool EqualsFlightNo(string flightNo)
{
    return FlightNo == flightNo;
}
// Is the price below the given value?
public bool IsPriceLess(decimal min)
{
    return Price < min;
}
// For the Explicit struct, a SpanEquals extension method is defined
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe bool SpanEquals(this string str, char* dest, int length)
{
    // Compare the two character sequences as spans
    return new Span<char>(dest, length).SequenceEqual(str.AsSpan());
}
// The comparison methods are implemented as follows
public static unsafe bool EqualsAirline(FlightPriceStructExplicit item, string airline)
{
    // Pass the number of characters to compare
    return airline.SpanEquals(item.Airline, 2);
}
// The methods below follow the same pattern
public static unsafe bool EqualsStart(FlightPriceStructExplicit item, string start)
{
    return start.SpanEquals(item.Start, 3);
}
public static unsafe bool EqualsEnd(FlightPriceStructExplicit item, string end)
{
    return end.SpanEquals(item.End, 3);
}
public static unsafe bool EqualsFlightNo(FlightPriceStructExplicit item, string flightNo)
{
    return flightNo.SpanEquals(item.FlightNo, 4);
}
public static unsafe bool EqualsCabin(FlightPriceStructExplicit item, string cabin)
{
    return cabin.SpanEquals(item.Cabin, 2);
}
public static bool IsPriceLess(FlightPriceStructExplicit item, decimal min)
{
    return item.Price < min;
}
Finally, the benchmark code is as follows. The logic is identical for every storage approach. Because 1 million records finish too quickly, each storage approach is run with 1.5 million records.
// Initialize the required data up front so that it does not affect the measurement
private static readonly FlightPriceClass[] FlightPrices = FlightPriceCreate.GetClassStore();
private static readonly FlightPriceStruct[] FlightPricesStruct = FlightPriceCreate.GetStructStore();
private static readonly FlightPriceStructUninitialized[] FlightPricesStructUninitialized =
    FlightPriceCreate.GetStructStoreUninitializedArray();
private static readonly FlightPriceStructExplicit[] FlightPricesStructExplicit =
    FlightPriceCreate.GetStructStoreStructExplicit();
// Unmanaged memory is a special case: only the pointer address needs to be kept
private static IntPtr _unManagerPtr;
private static readonly int FlightPricesStructExplicitUnManageMemoryLength =
    FlightPriceCreate.GetStructStoreUnManageMemory(out _unManagerPtr);
[Benchmark(Baseline = true)]
public int GetClassStore()
{
    var caAirline = 0;
    var shaStart = 0;
    var peaStart = 0;
    var ca0001FlightNo = 0;
    var priceLess500 = 0;
    for (int i = 0; i < FlightPrices.Length; i++)
    {
        // Simple filtering of the data
        var item = FlightPrices[i];
        if (item.EqualsAirline("CA")) caAirline++;
        if (item.EqualsStart("SHA")) shaStart++;
        if (item.EqualsEnd("PEA")) peaStart++;
        if (item.EqualsFlightNo("0001")) ca0001FlightNo++;
        if (item.IsPriceLess(500)) priceLess500++;
    }
    Debug.WriteLine($"{caAirline},{shaStart},{peaStart},{ca0001FlightNo},{priceLess500}");
    return caAirline + shaStart + peaStart + ca0001FlightNo + priceLess500;
}
[Benchmark]
public int GetStructStore()
{
    var caAirline = 0;
    var shaStart = 0;
    var peaStart = 0;
    var ca0001FlightNo = 0;
    var priceLess500 = 0;
    for (int i = 0; i < FlightPricesStruct.Length; i++)
    {
        var item = FlightPricesStruct[i];
        if (item.EqualsAirline("CA")) caAirline++;
        if (item.EqualsStart("SHA")) shaStart++;
        if (item.EqualsEnd("PEA")) peaStart++;
        if (item.EqualsFlightNo("0001")) ca0001FlightNo++;
        if (item.IsPriceLess(500)) priceLess500++;
    }
    Debug.WriteLine($"{caAirline},{shaStart},{peaStart},{ca0001FlightNo},{priceLess500}");
    return caAirline + shaStart + peaStart + ca0001FlightNo + priceLess500;
}
[Benchmark]
public int GetFlightPricesStructExplicit()
{
    var caAirline = 0;
    var shaStart = 0;
    var peaStart = 0;
    var ca0001FlightNo = 0;
    var priceLess500 = 0;
    for (int i = 0; i < FlightPricesStructExplicit.Length; i++)
    {
        var item = FlightPricesStructExplicit[i];
        if (FlightPriceStructExplicit.EqualsAirline(item, "CA")) caAirline++;
        if (FlightPriceStructExplicit.EqualsStart(item, "SHA")) shaStart++;
        if (FlightPriceStructExplicit.EqualsEnd(item, "PEA")) peaStart++;
        if (FlightPriceStructExplicit.EqualsFlightNo(item, "0001")) ca0001FlightNo++;
        if (FlightPriceStructExplicit.IsPriceLess(item, 500)) priceLess500++;
    }
    Debug.WriteLine($"{caAirline},{shaStart},{peaStart},{ca0001FlightNo},{priceLess500}");
    return caAirline + shaStart + peaStart + ca0001FlightNo + priceLess500;
}
[Benchmark]
public unsafe int GetFlightPricesStructExplicitUnManageMemory()
{
    var caAirline = 0;
    var shaStart = 0;
    var peaStart = 0;
    var ca0001FlightNo = 0;
    var priceLess500 = 0;
    var arrays = new Span<FlightPriceStructExplicit>(_unManagerPtr.ToPointer(), FlightPricesStructExplicitUnManageMemoryLength);
    for (int i = 0; i < arrays.Length; i++)
    {
        var item = arrays[i];
        if (FlightPriceStructExplicit.EqualsAirline(item, "CA")) caAirline++;
        if (FlightPriceStructExplicit.EqualsStart(item, "SHA")) shaStart++;
        if (FlightPriceStructExplicit.EqualsEnd(item, "PEA")) peaStart++;
        if (FlightPriceStructExplicit.EqualsFlightNo(item, "0001")) ca0001FlightNo++;
        if (FlightPriceStructExplicit.IsPriceLess(item, 500)) priceLess500++;
    }
    Debug.WriteLine($"{caAirline},{shaStart},{peaStart},{ca0001FlightNo},{priceLess500}");
    return caAirline + shaStart + peaStart + ca0001FlightNo + priceLess500;
}
Benchmark's results are as follows.

We see that the plain struct is slightly slower than the class, while the versions using the Explicit layout and unmanaged memory are much slower, by more than a factor of two. Is it really impossible to have our cake and eat it too?
Let's analyze why the latter two approaches are slow. The cause is value copying: in C#, passing a reference type to a method copies only the reference, whereas passing a value type copies the whole value.
- Reference types: each call copies only a reference, whose length is the CPU word size: 4 bytes on a 32-bit system and 8 bytes on a 64-bit system.
- Value types: each call copies the value itself. If the value occupies 4 bytes, 4 bytes are copied. This is an advantage when the value is no larger than the CPU word size and a disadvantage when it is larger.
Our structs are far larger than the 8-byte word of a 64-bit CPU, and the code above copies them several times per iteration, which slows everything down.
So is there a way to avoid the value copies? Of course: value types can also be passed by reference in C#. We have the ref keyword, so we just need to add it wherever the value is copied. The code is as follows.
// Rework the comparison methods so that they take the struct by reference
// Add ref
public static unsafe bool EqualsAirlineRef(ref FlightPriceStructExplicit item, string airline)
{
    // A reference is passed in, so fixed is needed to obtain the pointer
    fixed (char* ptr = item.Airline)
    {
        return airline.SpanEquals(ptr, 2);
    }
}
// The benchmark body is changed to pass by reference as well
[Benchmark]
public unsafe int GetStructStoreUnManageMemoryRef()
{
    var caAirline = 0;
    var shaStart = 0;
    var peaStart = 0;
    var ca0001FlightNo = 0;
    var priceLess500 = 0;
    var arrays = new Span<FlightPriceStructExplicit>(_unManagerPtr.ToPointer(), FlightPricesStructExplicitUnManageMemoryLength);
    for (int i = 0; i < arrays.Length; i++)
    {
        // Take a direct reference to the array element
        ref var item = ref arrays[i];
        // Pass the reference straight through as the argument
        if (FlightPriceStructExplicit.EqualsAirlineRef(ref item, "CA")) caAirline++;
        if (FlightPriceStructExplicit.EqualsStartRef(ref item, "SHA")) shaStart++;
        if (FlightPriceStructExplicit.EqualsEndRef(ref item, "PEA")) peaStart++;
        if (FlightPriceStructExplicit.EqualsFlightNoRef(ref item, "0001")) ca0001FlightNo++;
        if (FlightPriceStructExplicit.IsPriceLessRef(ref item, 500)) priceLess500++;
    }
    Debug.WriteLine($"{caAirline},{shaStart},{peaStart},{ca0001FlightNo},{priceLess500}");
    return caAirline + shaStart + peaStart + ca0001FlightNo + priceLess500;
}
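The difference between passing by value and passing with ref can be seen in isolation with a small sketch. It uses a 32-byte value tuple as a stand-in for a large struct (an assumption made to keep the example short; it is not the article's type), and the helper names BumpCopy and BumpRef are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

// A 32-byte value type standing in for a large struct.
var record = (A: 1L, B: 2L, C: 3L, D: 4L);

// By value: the callee receives a full copy, so the caller's data is untouched.
static void BumpCopy((long A, long B, long C, long D) r) => r.A += 100;
// By reference: only an address is passed and the callee works on the original.
static void BumpRef(ref (long A, long B, long C, long D) r) => r.A += 100;

BumpCopy(record);
long afterCopy = record.A; // still 1: only the copy was modified
BumpRef(ref record);
long afterRef = record.A;  // 101: the original was modified

int size = Unsafe.SizeOf<(long, long, long, long)>(); // 32 bytes copied on every by-value call
Console.WriteLine($"size={size}, afterCopy={afterCopy}, afterRef={afterRef}");
```

When the callee only reads the data, the `in` modifier is another way to pass a struct by reference while preventing the callee from modifying it.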
Let's run the benchmark again. Our Explicit struct is now far ahead, 33% faster than the class version. The unmanaged-memory approach from the previous round also performs well, ranking second.

So why is the class version slower, even though classes are also passed by reference? This takes us down to lower-level CPU knowledge. Besides its basic computing units, a CPU also has data caches such as L1, L2, and L3, as shown in the following figure.


This ties directly into CPU performance. Remember the latency table at the beginning of the article? The caches inside the CPU are the fastest storage. So the first reason is that the data in a struct array sits in one contiguous address range, which is very cache-friendly, whereas class objects are reference types that must be reached through pointers, which is much less so.
The second reason is that accessing a reference type requires a dereference: the data has to be located in memory through a pointer. A struct needs no such step.
How can we verify this? BenchmarkDotNet can report exactly these metrics: reference the BenchmarkDotNet.Diagnostics.Windows NuGet package, then add the following attribute to the class being benchmarked.
[HardwareCounters(
    HardwareCounter.LlcMisses,     // last-level cache (LLC) misses
    HardwareCounter.LlcReference)] // last-level cache (LLC) references
public class SpeedBench : IDisposable
{
    ......
}
The results are as follows. Because additional statistics have to be collected through Windows ETW, the run is slightly slower.

As the figure above shows, the reference type version has the most cache misses and also a large number of cache references, which drags performance down.
As shown in the following figure, structs stored sequentially are accessed more efficiently than reference types, whose memory accesses jump from place to place. In addition, the smaller the object, the more cache-friendly it is.


3. Summary
In this article we discussed how replacing classes with structs can substantially reduce memory usage and improve computation performance by almost half. We also briefly covered using unmanaged memory in .NET. I like structs a lot: they have an efficient storage layout and very good performance. But you should not convert every class to a struct, because the two suit different scenarios.
So when should we use structs and when should we use classes? Microsoft's official guidance gives the answer.
Consider defining a struct instead of a class if instances of the type are small and commonly short-lived, or are commonly embedded in other objects.
Avoid defining a struct unless the type has all of the following characteristics:
- It logically represents a single value, similar to primitive types (int, double, etc.). For example, our cached data consists almost entirely of primitive types.
- It has an instance size under 16 bytes. Value copying is expensive, although ref now makes structs applicable in more scenarios.
- It is immutable. In today's example the cached data does not change, so it has this characteristic.
- It will not have to be boxed frequently. Frequent boxing and unboxing hurts performance considerably. In our scenario the functions were adapted to use ref, so this does not arise.
In all other cases, types should be defined as classes.
**In fact, these techniques also show that C# is a language that is easy to get started with but has a high ceiling. You can normally use C# language features to deliver requirements quickly, and when you hit a performance bottleneck you can write C# much like C++ to reach performance comparable to C++.**