FYI, I am still slowly reading this book:
Today I reached a page that gives an example of how to use a regular expression to split the string shown below:
"Acme, Inc","Excalibur Pte Ltd","Blackrock, Pvt"
Into the collection of sub-strings shown below:
Acme, Inc
Excalibur Pte Ltd
Blackrock, Pvt
The regex pattern that the book gives is:
var rgx = new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");
When I saw this pattern, I was totally shell-shocked by my inability to comprehend it, so here’s my attempt to understand it. First, I am breaking the pattern into its groups to help me digest it.
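(?=subPattern) is a zero-width positive lookahead assertion. Here it expects either a character other than a double quote, or an optional double quote (captured as group 1), to appear right after the previous group. Because a lookahead consumes nothing, the text it matched can still be used by the next group.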
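To convince myself of the "zero-width" part, here is a minimal sketch that runs only the lookahead piece, anchored at the start of the input. The class and method names (LookaheadDemo, StartsQuoted, ConsumedLength) are my own and are not from the book.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Only the lookahead piece of the book's pattern, anchored at the start.
    private static readonly Regex Peek = new Regex("^(?=[^\"]|(\")?)");

    // True when the lookahead saw an opening quote, i.e. group 1 took part in the match.
    public static bool StartsQuoted(string s) => Peek.Match(s).Groups[1].Success;

    // Number of characters the match consumed; a lookahead should consume none.
    public static int ConsumedLength(string s) => Peek.Match(s).Length;

    public static void Main()
    {
        Console.WriteLine(StartsQuoted("\"Acme, Inc\""));   // True
        Console.WriteLine(StartsQuoted("Acme"));            // False
        Console.WriteLine(ConsumedLength("\"Acme, Inc\"")); // 0
    }
}
```

The match length stays at 0 even though group 1 remembers the quote, which is what lets the later conditional (?(1)...) decide how to read the field.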
[^,\"]* is the branch used when the field did not start with a quote. It will capture everything except a comma or a double quote.
(?=,|$) is another zero-width positive lookahead, which requires a comma or the end of the string to follow the field.
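Putting the pieces together, here is a small sketch of how I would run the pattern over the sample string. It assumes the field text ends up in the second capturing group, which is how I read the pattern. The wrapper names (CsvSplitDemo, SplitFields) are hypothetical and not taken from the book.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CsvSplitDemo
{
    // The pattern exactly as the book gives it.
    private static readonly Regex Rgx =
        new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)[^\"]*|[^,\"]*))\"?(?=,|$)");

    // Collects group 2 of every match, which holds the field without its quotes.
    public static List<string> SplitFields(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Rgx.Matches(line))
        {
            fields.Add(m.Groups[2].Value);
        }
        return fields;
    }

    public static void Main()
    {
        var input = "\"Acme, Inc\",\"Excalibur Pte Ltd\",\"Blackrock, Pvt\"";
        foreach (var field in SplitFields(input))
        {
            Console.WriteLine(field);
        }
    }
}
```

For the sample input this should print the three company names, one per line, with the commas inside the quotes left intact. I have only walked through simple cases like this one, so treat it as a reading aid and not a complete CSV parser.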
That’s all, I hope it helps!