** The code for this article is based on CsvHelper 15.0.5**
profile
CsvHelper is a. NET library for reading and writing CSV files. Extremely fast, flexible and easy to use.
CsvHelper is built on top of. NET Standard 2.0 and can run almost anywhere.
GitHub address: github.com/joshclose/csvhelper
module
| module | function |
|---|---|
| CsvHelper | The core class for reading and writing CSV data. |
| CsvHelper.Configuration | A class that configures CsvHelper's read-write behavior. |
| CsvHelper.Configuration.Attributes | Configure the features of CsvHelper. |
| CsvHelper.Expressions | The class that generates the LINQ expression. |
| CsvHelper.TypeConversion | A class that converts CSV fields to and from. NET types. |
read
** Testing **
public class Foo
{
public int ID { get; set; }
public string Name { get; set; }
}
**csv file data **
ID,Name
1,Tom
2,Jerry
Read all records
using (var reader = new StreamReader("foo.csv"))
{
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
var records = csv.GetRecords<Foo>();
}
}
When reading csv files, empty lines will be ignored, and if the empty lines contain spaces, an error will be reported. 如果是 Excel 编辑的 CSV 文件,空行将会变成仅包含分隔符
,的行,也会报错。
Read one by one
using (var reader = new StreamReader("foo.csv"))
{
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
while (csv.Read())
{
var record = csv.GetRecord<Foo>();
}
}
}
GetRecords<T>方法通过yield返回一个IEnumerable<T>,并不会将内容一次全部读进内存,除非调用了ToList或ToArray方法。所以这种逐条读取的写法没有太多必要。
Read individual fields
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
csv.Read();
csv.ReadHeader();
while (csv.Read())
{
var id = csv.GetField<int>(0);
var name = csv.GetField<string>("Name");
}
}
When reading line by line, you can ignore the title line, but not here.
csv.Read(); 这句是读取标题,如果没有的话,while 循环第一次取到的是标题,肯定会报错。
csv.ReadHeader(); 这句是给标题赋值,如果没有的话,csv.GetField<string>("Name") 会报找不到标题。
使用 TryGetField 可以防止意外的报错。
csv.TryGetField(0, out int id);
write
Write all records
var records = new List<Foo>
{
new Foo { ID = 1, Name = "Tom" },
new Foo { ID = 2, Name = "Jerry" },
};
using (var writer = new StreamWriter("foo.csv"))
{
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(records);
}
}
Write one by one
using (var writer = new StreamWriter("foo.csv"))
{
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
foreach (var record in records)
{
csv.WriteRecord(record);
}
}
}
write field by field
using (var writer = new StreamWriter("foo.csv"))
{
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteHeader<Foo>();
csv.NextRecord();
foreach (var record in records)
{
csv.WriteField(record.ID);
csv.WriteField(record.Name);
csv.NextRecord();
}
}
}
characteristics
Index
Index 特性用于标记字段顺序。
When reading a file, if there is no title, the fields can only be determined by order.
public class Foo
{
[Index(0)]
public int ID { get; set; }
[Index(1)]
public string Name { get; set; }
}
using (var reader = new StreamReader("foo.csv"))
{
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
csv.Configuration.HasHeaderRecord = false;
var records = csv.GetRecords<Foo>().ToList();
}
}
csv.Configuration.HasHeaderRecord = false配置告知CsvReader没有标题。必须要加这一行,否则会默认第一行为标题而跳过,导致最后的结果中少了一行。如果数据量比较多,会很难发现这个 bug。
在写入文件的时候,会按 Index 顺序写入。如果不想写入标题,也需要添加 csv.Configuration.HasHeaderRecord = false;
Name
如果字段名称和列名不一致,可以使用 Name 属性。
public class Foo
{
[Name("id")]
public int ID { get; set; }
[Name("name")]
public string Name { get; set; }
}
NameIndex
NameIndex 用于处理 CSV 文件中的同名列。
public class Foo
{
...
[Name("Name")]
[NameIndex(0)]
public string FirstName { get; set; }
[Name("Name")]
[NameIndex(1)]
public string LastName { get; set; }
}
Ignore
ignore field
Optional
If no matching field is found during reading, ignore it.
public class Foo
{
...
[Optional]
public string Remarks { get; set; }
}
Default
当读取的字段为空时 Default 特性可为其指定默认值。
Default 特性仅在读取时有效,写入时是不会将空值替换为默认值写入的。
NullValues
public class Foo
{
...
[NullValues("None", "none", "Null", "null")]
public string None { get; set; }
}
读取文件时,若 CSV 文件中某字段的值为空,那么读取后的值是 "",而非 null,标记 NullValues 特性后,若 CSV 文件中的某字段值为 NullValues 指定的值,则读取后为 null。
若同时标记了 Default 特性,则此特性不起作用。
The trick is that this feature does not work when writing files. Therefore, it can cause problems of inconsistency between reading and writing.
Constant
Constant 特性为字段指定一个常量值,读写时都使用此值,无论指定了什么其他映射或配置。
Format
Format 指定类型转换时使用的字符串格式。
For example, number and time types, we often specify their formats.
public class Foo
{
...
[Format("0.00")]
public decimal Amount { get; set; }
[Format("yyyy-MM-dd HH:mm:ss")]
public DateTime JoinTime { get; set; }
}
BooleanTrueValues and BooleanFalseValues
These two features are used to convert bool into a specified display format.
public class Foo
{
...
[BooleanTrueValues("yes")]
[BooleanFalseValues("no")]
public bool Vip { get; set; }
}
NumberStyles
public class Foo
{
...
[Format("X2")]
[NumberStyles(NumberStyles.HexNumber)]
public int Data { get; set; }
}
比较有用是 NumberStyles.HexNumber 和 NumberStyles.AllowHexSpecifier,这两个枚举的作用差不多。此特性仅在读取时有效,写入时并不会转成 16 进制写入。这会导致读写不一致,可以用 Format 特性指定写入格式。
mapping
如果无法给要映射的类添加特性,在这种情况下,可以使用 ClassMap 方式进行映射。
The effect of using mappings is the same as using features, and the cheating is also the same. The following example uses properties to implement the functionality of the above features.
public class Foo2
{
public int ID { get; set; }
public string Name { get; set; }
public decimal Amount { get; set; }
public DateTime JoinTime { get; set; }
public string Msg { get; set; }
public string Msg2 { get; set; }
public bool Vip { get; set; }
public string Remarks { get; set; }
public string None { get; set; }
public int Data { get; set; }
}
public class Foo2Map : ClassMap<Foo2>
{
public Foo2Map()
{
Map(m => m.ID).Index(0).Name("id");
Map(m => m.Name).Index(1).Name("name");
Map(m => m.Amount).TypeConverterOption.Format("0.00");
Map(m => m.JoinTime).TypeConverterOption.Format("yyyy-MM-dd HH:mm:ss");
Map(m => m.Msg).Default("Hello");
Map(m => m.Msg2).Ignore();
Map(m => m.Vip)
.TypeConverterOption.BooleanValues(true, true, new string[] { "yes" })
.TypeConverterOption.BooleanValues(false, true, new string[] { "no" });
Map(m => m.Remarks).Optional();
Map(m => m.None).TypeConverterOption.NullValues("None", "none", "Null", "null");
Map(m => m.Data)
.TypeConverterOption.NumberStyles(NumberStyles.HexNumber)
.TypeConverterOption.Format("X2");
}
}
You need to register before using a map
csv.Configuration.RegisterClassMap<Foo2Map>();
ConvertUsing
ConvertUsing 允许使用一个委托方法实现类型转换。
// 常数
Map(m => m.Constant).ConvertUsing(row => 3);
// 把两列聚合在一起
Map(m => m.Name).ConvertUsing(row => $"{row.GetField<string>("FirstName")} {row.GetField<string>("LastName")}");
Map(m => m.Names).ConvertUsing(row => new List<string> { row.GetField<string>("Name") } );
configured
Delimiter
delimiter
csv.Configuration.Delimiter = ",";
HasHeaderRecord
This configuration has been mentioned earlier. Do you use the first line as the title?
csv.Configuration.HasHeaderRecord = false;
IgnoreBlankLines
是否忽略空行,默认 true
csv.Configuration.IgnoreBlankLines = false;
无法忽略一个仅包含空格或 , 的行。
AllowComments
是否允许注释,注释以 # 开头。
csv.Configuration.AllowComments = true;
Comment
获取或设置用于表示注释掉的行的字符。默认是 #。
csv.Configuration.Comment = '/';
BadDataFound
Set a function that fires when the data is incorrect and can be used to log.
IgnoreQuotes
Gets or sets a value indicating whether quotes should be ignored and treated like any other character when parsing.
默认是 false,如果字符串中有引号,必须是 3 个 " 连在一起,读取到的字符串中才会有一个 ",如果是 1 个则忽略,2 个则报错。
如果为 true,则会将 " 当做字符串原样返回。
csv.Configuration.IgnoreQuotes = true;
CsvWriter 中是没有这个属性的,一旦字符串中包含 ",写出来就是 3 个 " 连在一起。
TrimOptions
Remove spaces at the beginning and end of fields
csv.Configuration.TrimOptions = TrimOptions.Trim;
PrepareHeaderForMatch
PrepareHeaderForMatch 定义了属性名称与标题进行匹配的函数。标题和属性名称均通过该函数运行。此功能可用于删除标题中的空格,或者当标题和属性名称大小写不一致时统一大小写后比较。
csv.Configuration.PrepareHeaderForMatch = (string header, int index) => header.ToLower();