FYI, I am still slowly reading this book:

Today I reached a page that gives an example of how to use a regular expression to split the string shown below:

"Acme, Inc","Excalibur Pte Ltd","Blackrock, Pvt"

into the collection of substrings shown below:

Acme, Inc
Excalibur Pte Ltd
Blackrock, Pvt

The regex pattern that the book gives is:


var rgx = new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");
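Before dissecting the pattern, it helps to see it run. The following is a minimal sketch, not the book's own listing: it assumes the field text ends up in capture group 2 (group 1 only records whether an opening quote was seen), and the variable names `input` and `fields` are mine.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Same pattern as above, copied verbatim.
var rgx = new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");

// The sample input, with straight double quotes escaped for a C# string literal.
var input = "\"Acme, Inc\",\"Excalibur Pte Ltd\",\"Blackrock, Pvt\"";

// Each match is one field; group 2 is the field text without the surrounding quotes.
var fields = rgx.Matches(input)
                .Cast<Match>()
                .Select(m => m.Groups[2].Value)
                .ToList();

foreach (var field in fields)
{
    Console.WriteLine(field);
}
// Acme, Inc
// Excalibur Pte Ltd
// Blackrock, Pvt
```

Note that this uses Matches rather than Regex.Split: the pattern describes one field at a time, so each match yields one substring.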

When I saw this pattern, I was totally shell-shocked by my inability to comprehend it. So here’s my attempt to understand it. First, I am breaking the pattern into its parts to help me digest it.

(?:^|,)

According to the documentation, (?:subPattern) is a non-capturing group. It means this part will match the beginning of the string or a comma, but it will not create a capture group.
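A quick way to see the difference is to count the groups a regex defines. This is only a small sketch; the capturing variant `(^|,)` is my own comparison and does not appear in the book's pattern.

```csharp
using System;
using System.Text.RegularExpressions;

// Non-capturing: only group 0 (the whole match) exists.
var nonCapturing = new Regex("(?:^|,)");

// Capturing variant for comparison: group 0 plus group 1.
var capturing = new Regex("(^|,)");

int nonCapturingGroups = nonCapturing.GetGroupNumbers().Length;
int capturingGroups = capturing.GetGroupNumbers().Length;

Console.WriteLine(nonCapturingGroups); // 1
Console.WriteLine(capturingGroups);    // 2
```

Keeping this part non-capturing matters here, because the rest of the pattern refers to its groups by number.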

(?=[^\"]|(\")?)

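(?=subPattern) is a zero-width positive lookahead assertion. It checks what comes after the previous group without consuming it: either a character that is not a quote, or an optional quote, which is captured into group 1. Because the assertion is zero-width, the same characters are still available to the next part of the pattern.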
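To convince myself of this, the lookahead can be run on its own. The sketch below assumes two made-up inputs, one starting with a quote and one without; in both cases the match has length zero, and only the quoted input sets group 1.

```csharp
using System;
using System.Text.RegularExpressions;

// Just the lookahead part of the pattern.
var lookahead = new Regex("(?=[^\"]|(\")?)");

var quoted = lookahead.Match("\"Acme\""); // input starts with a quote
var plain = lookahead.Match("Acme");      // input starts with a letter

Console.WriteLine(quoted.Length);            // 0
Console.WriteLine(quoted.Groups[1].Success); // True
Console.WriteLine(plain.Length);             // 0
Console.WriteLine(plain.Groups[1].Success);  // False
```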

\"?((?(1)[^\"]*|[^,\"]*))\"?

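This part optionally consumes the opening quote, captures the field into group 2, and optionally consumes the closing quote. (?(1)yes|no) is a conditional: if group 1 matched, meaning the field started with a quote, group 2 takes everything up to the next quote, commas included. Otherwise it takes everything up to the next comma or quote.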
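The two branches are easier to see on an input that mixes an unquoted and a quoted field. The input below is my own example, not the book's, and the output format is just for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

var rgx = new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");

// One unquoted field followed by one quoted field that contains a comma.
var input = "Acme,\"Blackrock, Pvt\"";

var matches = rgx.Matches(input);

foreach (Match m in matches)
{
    // Group 1 tells us which branch of the conditional was used.
    Console.WriteLine($"quoted={m.Groups[1].Success} field={m.Groups[2].Value}");
}
// quoted=False field=Acme
// quoted=True field=Blackrock, Pvt
```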

(?=,|$)

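Another zero-width positive lookahead (not a non-capturing group). It requires a comma or the end of the string to follow, but it does not consume the comma, so the next match can start with it.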
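The effect of not consuming the comma shows up in the match positions. In this sketch (variable names are mine), the first match ends exactly where the second one begins, and the second match's full text starts with the comma.

```csharp
using System;
using System.Text.RegularExpressions;

var rgx = new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");
var input = "\"Acme, Inc\",\"Excalibur Pte Ltd\",\"Blackrock, Pvt\"";

var matches = rgx.Matches(input);
var first = matches[0];
var second = matches[1];

// The first match stops before the comma; the second match starts on it.
Console.WriteLine(first.Index + first.Length); // 11
Console.WriteLine(second.Index);               // 11
Console.WriteLine(second.Value);               // ,"Excalibur Pte Ltd"
```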

That’s all, I hope it helps!


About Hardono

Howdy! I'm Hardono. I work as a Software Developer, mostly on Windows, dealing with .NET and conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in the Logistics/Transport industry for more than 11 years.
