
So, I was trying to pull values out of /proc/pid/stat, which gives a line that looks like this:
2321 (squid) S 1 2321 2321 0 -1 4202752 4841 0 0 0 530 577 0 0 20 0 1 0 24716 36311040 4417 18446744073709551615 1 1 0 0 0 0 0 4096 85571 18446744073709551615 0 0 17 0 0 0 1945 0 0
Here the 10th entry is the number of minor faults and the 12th entry is the number of major faults (as you can see, Squid has had 4841 minor faults and 0 major faults since it was restarted when I changed IP address).
So a RegEx seemed to be the way to go and my first attempt looked like this:
(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+\\s(\\S)+
But this did not work … well, it matched the line but it gave me bad results. [Note: \s matches whitespace, \S matches non-whitespace.]
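I am sure you are all cleverer than me and saw the flaw straight away, but it took me some time to figure it out: (\\S)+ repeats a one-character group, so the capture holds only a single character of 4841 (POSIX reports the last repetition of a repeated group, so that is the final digit rather than the first). What I needed to use was (\\S+), which puts the repetition inside the group so that it captures the whole field and not just one character.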
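The difference between the two groupings can be checked with a small C sketch. It assumes the POSIX regcomp/regexec interface with a library that accepts \S as an extension (glibc does; a portable spelling would be [^[:space:]]). The helper name capture_group1 is made up for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: apply `pattern` to `text` and copy capture group 1
 * into `out`. Returns 0 on success, -1 on any failure. */
int capture_group1(const char *pattern, const char *text,
                   char *out, size_t outlen)
{
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = group 1 */
    int rc = -1;

    assert(pattern && text && out && outlen > 0);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, text, 2, m, 0) == 0 && m[1].rm_so != -1) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < outlen) {
            memcpy(out, text + m[1].rm_so, len);
            out[len] = '\0';
            rc = 0;
        }
    }
    regfree(&re);
    return rc;
}
```

Called with "(\\S)+" on the text "4841", the helper should leave a single character in out; called with "(\\S+)", it should leave the full "4841".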
And, further to my querying of the poorly written GNU RegEx documentation: the nmatch parameter should be one bigger than the number of groups expected to be matched, because entry 0 of the match array holds the whole match. In the above case that means 13.
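Putting the corrected pattern and the nmatch value together, a minimal sketch might look like the following. The function name parse_faults and the NGROUPS/NMATCH macros are my own for illustration, and \S and \s are assumed to be supported by the regex library (as they are in glibc). One caveat: the second field is the command name in parentheses, and if it ever contains a space the whitespace-based field count will be thrown off.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

#define NGROUPS 12             /* fields 1..12 of the stat line */
#define NMATCH  (NGROUPS + 1)  /* plus slot 0 for the whole match */

/* Hypothetical helper: extract minor faults (field 10) and major faults
 * (field 12) from one /proc/pid/stat line. Returns 0 on success, -1 on
 * failure. */
int parse_faults(const char *line, unsigned long *minflt,
                 unsigned long *majflt)
{
    /* Twelve (\S+) groups separated by single whitespace characters. */
    static const char pattern[] =
        "(\\S+)\\s(\\S+)\\s(\\S+)\\s(\\S+)\\s(\\S+)\\s(\\S+)\\s"
        "(\\S+)\\s(\\S+)\\s(\\S+)\\s(\\S+)\\s(\\S+)\\s(\\S+)";
    regex_t re;
    regmatch_t m[NMATCH];
    int rc = -1;

    assert(line && minflt && majflt);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, NMATCH, m, 0) == 0) {
        /* strtoul stops at the whitespace that ends each field. */
        *minflt = strtoul(line + m[10].rm_so, NULL, 10);
        *majflt = strtoul(line + m[12].rm_so, NULL, 10);
        rc = 0;
    }
    regfree(&re);
    return rc;
}
```

Passing a smaller nmatch than 13 would leave the later groups unreported, which is why the array is sized as the group count plus one.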