【C#】CsvHelper User Manual

【C#】CsvHelper User Manual

CsvHelper is a .NET library for reading and writing CSV files. Extremely fast, flexible, and easy to use.

Last updated 1/19/2024 10:26 PM
丹枫无迹
10 min read
Category
.NET
Tags
.NET C# CSV

The code in this article is based on CsvHelper 15.0.5

Introduction

CsvHelper is a .NET library for reading and writing CSV files. It is extremely fast, flexible, and easy to use.

CsvHelper is built on .NET Standard 2.0 and works almost everywhere.

GitHub URL: https://github.com/joshclose/csvhelper

Modules

Module Function
CsvHelper Core classes for reading and writing CSV data.
CsvHelper.Configuration Classes for configuring CsvHelper read/write behavior.
CsvHelper.Configuration.Attributes Attributes for configuring CsvHelper.
CsvHelper.Expressions Classes for generating LINQ expressions.
CsvHelper.TypeConversion Classes for converting CSV fields to and from .NET types.

Reading

Test class

public class Foo
{
    public int ID { get; set; }

    public string Name { get; set; }
}

CSV file data

ID,Name
1,Tom
2,Jerry

Reading all records

using (var reader = new StreamReader("foo.csv"))
{
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        var records = csv.GetRecords<Foo>();
    }
}

When reading a CSV file, blank lines are ignored. If a blank line contains spaces, an error will occur. If the CSV file is edited by Excel, blank lines become lines containing only the delimiter ,, which will also cause an error.

Reading records one by one

using (var reader = new StreamReader("foo.csv"))
{
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        while (csv.Read())
        {
            var record = csv.GetRecord<Foo>();
        }
    }
}

The GetRecords<T> method returns an IEnumerable<T> via yield and does not load all content into memory at once, unless you call ToList or ToArray. So this one-by-one reading style is not necessary.

Reading individual fields

using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    csv.Read();
    csv.ReadHeader();

    while (csv.Read())
    {
        var id = csv.GetField<int>(0);
        var name = csv.GetField<string>("Name");
    }
}

When reading row by row, you can ignore the header row, but not here.

csv.Read(); reads the header line. Without it, the first iteration of the while loop would read the header, causing an error.

csv.ReadHeader(); assigns the header values. Without it, csv.GetField<string>("Name") will fail with "header not found".

Using TryGetField can prevent unexpected errors.

csv.TryGetField(0, out int id);

Writing

Writing all records

var records = new List<Foo>
{
    new Foo { ID = 1, Name = "Tom" },
    new Foo { ID = 2, Name = "Jerry" },
};

using (var writer = new StreamWriter("foo.csv"))
{
    using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
    {
        csv.WriteRecords(records);
    }
}

Writing records one by one

using (var writer = new StreamWriter("foo.csv"))
{
    using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
    {
        foreach (var record in records)
        {
            csv.WriteRecord(record);
        }
    }
}

Writing field by field

using (var writer = new StreamWriter("foo.csv"))
{
    using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
    {
        csv.WriteHeader<Foo>();
        csv.NextRecord();

        foreach (var record in records)
        {
            csv.WriteField(record.ID);
            csv.WriteField(record.Name);
            csv.NextRecord();
        }
    }
}

Attributes

Index

The Index attribute is used to specify field order.

When reading a file without a header, fields can only be determined by order.

public class Foo
{
    [Index(0)]
    public int ID { get; set; }

    [Index(1)]
    public string Name { get; set; }
}

using (var reader = new StreamReader("foo.csv"))
{
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        csv.Configuration.HasHeaderRecord = false;

        var records = csv.GetRecords<Foo>().ToList();
    }
}

The configuration csv.Configuration.HasHeaderRecord = false tells the CsvReader that there is no header. This line is mandatory; otherwise, the first row will be treated as a header and skipped, causing the last row to be missing. If there are many records, this bug can be hard to spot.

When writing to a file, fields are written in Index order. If you don't want to write a header, you also need to add csv.Configuration.HasHeaderRecord = false;

Name

If the field name does not match the column name, use the Name attribute.

public class Foo
{
    [Name("id")]
    public int ID { get; set; }

    [Name("name")]
    public string Name { get; set; }
}

NameIndex

NameIndex is used to handle duplicate column names in a CSV file.

public class Foo
{
    ...

    [Name("Name")]
    [NameIndex(0)]
    public string FirstName { get; set; }

    [Name("Name")]
    [NameIndex(1)]
    public string LastName { get; set; }
}

Ignore

Ignore the field.

Optional

If a matching field is not found during reading, it is ignored.

public class Foo
{
    ...

    [Optional]
    public string Remarks { get; set; }
}

Default

When a field read is empty, the Default attribute can specify a default value.

The Default attribute only works during reading; it does not replace empty values with the default during writing.

NullValues

public class Foo
{
    ...

    [NullValues("None", "none", "Null", "null")]
    public string None { get; set; }
}

When reading a file, if a CSV field's value is empty, the read value is "", not null. After marking with the NullValues attribute, if a CSV field's value matches one of the specified values, the result is null.

If Default attribute is also applied, this attribute has no effect.

Pitfall: This attribute does not work during writing. This can cause read/write inconsistency.

Constant

The Constant attribute specifies a constant value for a field. This value is used for both reading and writing, regardless of any other mapping or configuration.

Format

Format specifies the string format used for type conversion.

For example, numbers and date/time types often need a specific format.

public class Foo
{
    ...

    [Format("0.00")]
    public decimal Amount { get; set; }

    [Format("yyyy-MM-dd HH:mm:ss")]
    public DateTime JoinTime { get; set; }
}

BooleanTrueValues and BooleanFalseValues

These two attributes are used to convert bool values to a specified representation.

public class Foo
{
    ...

    [BooleanTrueValues("yes")]
    [BooleanFalseValues("no")]
    public bool Vip { get; set; }
}

NumberStyles

public class Foo
{
    ...

    [Format("X2")]
    [NumberStyles(NumberStyles.HexNumber)]
    public int Data { get; set; }
}

The most useful ones are NumberStyles.HexNumber and NumberStyles.AllowHexSpecifier, which have similar effects. This attribute only works during reading; it does not convert to hexadecimal during writing. This can cause read/write inconsistency, so use the Format attribute to specify the output format.

Mapping

If you cannot add attributes to the class to be mapped, you can use the ClassMap approach.

Using mapping is equivalent to using attributes, and the pitfalls are the same. The following example implements the functionality of the attributes above using properties.

public class Foo2
{
    public int ID { get; set; }

    public string Name { get; set; }

    public decimal Amount { get; set; }

    public DateTime JoinTime { get; set; }

    public string Msg { get; set; }

    public string Msg2 { get; set; }

    public bool Vip { get; set; }

    public string Remarks { get; set; }

    public string None { get; set; }

    public int Data { get; set; }
}

public class Foo2Map : ClassMap<Foo2>
{
    public Foo2Map()
    {
        Map(m => m.ID).Index(0).Name("id");
        Map(m => m.Name).Index(1).Name("name");
        Map(m => m.Amount).TypeConverterOption.Format("0.00");
        Map(m => m.JoinTime).TypeConverterOption.Format("yyyy-MM-dd HH:mm:ss");
        Map(m => m.Msg).Default("Hello");
        Map(m => m.Msg2).Ignore();
        Map(m => m.Vip)
            .TypeConverterOption.BooleanValues(true, true, new string[] { "yes" })
            .TypeConverterOption.BooleanValues(false, true, new string[] { "no" });
        Map(m => m.Remarks).Optional();
        Map(m => m.None).TypeConverterOption.NullValues("None", "none", "Null", "null");
        Map(m => m.Data)
            .TypeConverterOption.NumberStyles(NumberStyles.HexNumber)
            .TypeConverterOption.Format("X2");
    }
}

Before using the mapping, register it:

csv.Configuration.RegisterClassMap<Foo2Map>();

ConvertUsing

ConvertUsing allows you to implement type conversion using a delegate method.

// Constant
Map(m => m.Constant).ConvertUsing(row => 3);

// Combine two columns
Map(m => m.Name).ConvertUsing(row => $"{row.GetField<string>("FirstName")} {row.GetField<string>("LastName")}");

Map(m => m.Names).ConvertUsing(row => new List<string> { row.GetField<string>("Name") } );

Configuration

Delimiter

Separator/delimiter

csv.Configuration.Delimiter = ",";

HasHeaderRecord

This configuration was mentioned earlier; whether to treat the first row as a header.

csv.Configuration.HasHeaderRecord = false;

IgnoreBlankLines

Whether to ignore blank lines, default true.

csv.Configuration.IgnoreBlankLines = false;

It cannot ignore a line that contains only spaces or ,.

AllowComments

Whether to allow comments, which start with #.

csv.Configuration.AllowComments = true;

Comment

Gets or sets the character used to indicate commented-out lines. Default is #.

csv.Configuration.Comment = '/';

BadDataFound

Sets a function that triggers when data is malformed; can be used for logging.

IgnoreQuotes

Gets or sets a value indicating whether quotes should be ignored during parsing and treated like any other character.

Default is false. If a string contains quotes, three consecutive " are required for one " to appear in the read string; one is ignored, two cause an error.

If true, " will be returned literally as part of the string.

csv.Configuration.IgnoreQuotes = true;

CsvWriter does not have this property; once a string contains ", it will be written as three consecutive ".

TrimOptions

Trims leading/trailing spaces from fields.

csv.Configuration.TrimOptions = TrimOptions.Trim;

PrepareHeaderForMatch

PrepareHeaderForMatch defines a function that processes both header names and property names before matching. This can be used to remove spaces in headers or unify casing when header and property names differ in case.

csv.Configuration.PrepareHeaderForMatch = (string header, int index) => header.ToLower();
Keep Exploring

Related Reading

More Articles