Yesterday I wrote about a regex that matches comma-separated strings inside quotes. Today I was given the task of processing a CSV file and uploading it into a DB. The CSV file is structured like this:
Column1,Column2,Column3
"Value 1",1, "This value contains Newline"
"Value 2",2,"One line, and has comma"
"Value 3",3,"A short, but contains comma and new line &amp; HTML encoded "
I have written about parsing CSV in C# and in JavaScript before, but both approaches fail to handle the example above.
Since I am not required to convert the string values to their actual data types, I can parse the CSV with a simple character-by-character loop that tracks whether the current position is inside quotes.
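// Requires: using System.Collections.Generic; using System.IO; using System.Text;
public List<List<string>> ReadCSVFileToList(StreamReader sr, char SeparatorChar)
{
    var res = new List<List<string>>();       // all rows
    var curList = new List<string>();         // fields of the current row
    var curString = new StringBuilder();      // text of the current field
    var isInQuote = false;

    while (!sr.EndOfStream)
    {
        var chr = (char)sr.Read();
        switch (chr)
        {
            case '"':
                if (isInQuote && sr.Peek() == '"')
                {
                    // A doubled quote inside a quoted field is a literal quote.
                    sr.Read();
                    curString.Append('"');
                }
                else
                {
                    isInQuote = !isInQuote;
                }
                break;
            case '\r':
                // Keep carriage returns inside quotes; drop them from "\r\n" row endings.
                if (isInQuote) curString.Append(chr);
                break;
            case '\n':
                if (isInQuote)
                {
                    curString.Append('\n');
                }
                else
                {
                    // End of row: close the current field and start a new row.
                    curList.Add(curString.ToString());
                    curString.Clear();
                    res.Add(curList);
                    curList = new List<string>();
                }
                break;
            default:
                if (!isInQuote && chr == SeparatorChar)
                {
                    // End of field.
                    curList.Add(curString.ToString());
                    curString.Clear();
                }
                else
                {
                    curString.Append(chr);
                }
                break;
        }
    }

    // Flush the last row when the file does not end with a newline.
    if (curString.Length > 0 || curList.Count > 0)
    {
        curList.Add(curString.ToString());
        res.Add(curList);
    }
    return res;
}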
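To try the parser without a file on disk, here is a minimal usage sketch. It assumes the snippet runs inside the same class as ReadCSVFileToList, and the in-memory sample (including where the embedded newline sits) is my own shortened version of the file above, not the real data.

```csharp
// Requires: using System; using System.IO; using System.Linq; using System.Text;
// Assumption: this code lives in the same class as ReadCSVFileToList.

// Small in-memory sample: a header row, a quoted value with an embedded
// newline, and a quoted value with an embedded comma.
var csv = "Column1,Column2,Column3\n" +
          "\"Value 1\",1,\"This value contains\nNewline\"\n" +
          "\"Value 2\",2,\"One line, and has comma\"\n";

using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(csv)))
using (var sr = new StreamReader(ms))
{
    var rows = ReadCSVFileToList(sr, ',');

    foreach (var row in rows)
    {
        // Show embedded newlines as a visible "\n" so each row prints on one line.
        var fields = row.Select(f => f.Replace("\n", "\\n"));
        Console.WriteLine(string.Join(" | ", fields));
    }
}
```

This should print three rows of three fields each, with the embedded newline and the embedded comma kept inside their fields rather than splitting them.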
I hope it helps!