Archive for July 28, 2011
Pattern Matching Hyphen-Minus Sign in Bash
I was trying to use the sed command to make some changes to a text file and ran into an interesting “problem”: pattern matching the hyphen-minus (-) symbol.
Assume we have the following text in a file named test, one word per line:
something
SoMeThiNg
some-thing
soMe_thing
and we want a single expression that matches every variant of the word in full (one by one).
My initial idea was to use this regular expression:
's/[a-zA-Z\-\_]*/matched/'
Naturally, I tried to escape the - sign. As you can see from the output, this doesn’t work:
$ sed 's/[a-zA-Z\-\_]*/matched/' test
matched
matched
matched-thing
matched
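The hyphen in some-thing is not matched. Inside a bracket expression the hyphen has a special meaning (it defines ranges), and the backslash does not escape it there: the backslash is treated as a literal character, so \-\ is read as a range from backslash to backslash. To make the expression work, you need to move the “-” to either the beginning or the end of the bracket expression: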
$ sed 's/[a-zA-Z\_-]*/matched/' test
matched
matched
matched
matched
$ sed 's/[-a-zA-Z\_]*/matched/' test
matched
matched
matched
matched
and leave it un-escaped!
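To try this end to end, here is a minimal sketch that recreates the sample file and runs the three expressions side by side. The temporary file and the variable names (input, escaped, hyphen_last, hyphen_first, backslash) are my own and not part of the original commands. I also dropped the backslash before the underscore in the two working expressions, on the assumption that sed follows the standard POSIX bracket-expression rules, where the underscore needs no escaping. The last command feeds in a word containing a literal backslash to show that the “escaped” class matches the backslash itself.

```shell
#!/bin/sh
# Recreate the sample input, one word per line, in a throwaway temp file.
input=$(mktemp)
printf '%s\n' something SoMeThiNg some-thing soMe_thing > "$input"

# Escaped hyphen in the middle: the backslash is literal inside [...],
# so "\-\" is a range from backslash to backslash and "-" is not in the set.
escaped=$(sed 's/[a-zA-Z\-\_]*/matched/' "$input")

# Unescaped hyphen placed last or first: it is taken literally.
hyphen_last=$(sed 's/[a-zA-Z_-]*/matched/' "$input")
hyphen_first=$(sed 's/[-a-zA-Z_]*/matched/' "$input")

# A word with a literal backslash is matched in full by the "escaped" class,
# which shows the backslash is an ordinary member of the set.
backslash=$(printf '%s\n' 'some\thing' | sed 's/[a-zA-Z\-\_]*/matched/')

printf '%s\n' "$escaped" "====" "$hyphen_last" "====" "$hyphen_first" "====" "$backslash"
rm -f "$input"
```

Only the first block of output should still contain matched-thing; the other two blocks and the final line should read matched throughout.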