Regular expressions are patterns built from ordinary characters and meta-characters that identify parts of text. Depending on the meta-characters used, a pattern can match very specifically or very generally. They are used to find text and then act on it. The following is not a comprehensive list, but it covers most of the operations used when parsing logs, text, and files.
Note: For quickly building and testing regular expressions, sites such as RegExr.com can really help.
These are used to specify a position within a string or line.
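^
Start of a string, or of each line in multi-line mode
$
End of a string, or of each line in multi-line mode
\A
Start of the whole string, regardless of multi-line mode
\Z
End of the whole string, regardless of multi-line mode (some flavors also match just before a final newline)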
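The difference between the line anchors and the string anchors can be sketched as follows. This is a minimal illustration that assumes Python's built-in re module and a made-up two-line string; other regex flavors may enable multi-line mode differently.

```python
import re

# Hypothetical sample text: two lines, no trailing newline.
text = "first line\nsecond line"

# With re.MULTILINE, ^ and $ match at every line boundary.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)  # ['first', 'second']
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)    # ['line', 'line']

# \A and \Z ignore re.MULTILINE and anchor to the whole string only.
string_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)  # ['first']
string_end = re.findall(r"\w+\Z", text, flags=re.MULTILINE)    # ['line']

print(line_starts, line_ends, string_start, string_end)
```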
These are used to specify a particular type of character.
.
Any character except a newline
\c
Control character, written as \cX in flavors that support it (e.g. \cM is Ctrl+M, a carriage return)
\d
Digit character (e.g. 0-9)
\D
Non-digit character
\n
New line character
\O
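Octal escape; most flavors write it as \0 followed by octal digits and match the character with that code (e.g. \011 is a tab)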
\r
Carriage return character
\s
Whitespace character (space, tab, newline, etc.)
\S
Non-whitespace character
\t
Tab character
\w
Word character (letter, digit, or underscore)
\W
Non-word character
\x
Hexadecimal escape: \xhh matches the character with hex code hh (e.g. \x41 is "A")
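As a rough illustration of the character types above, the sketch below pulls pieces out of a single log-style line. It assumes Python's re module, and the log line itself is invented for the example.

```python
import re

# Hypothetical log line: a date, a tab, a level, two spaces, then a message.
log_line = "2024-01-15\tERROR  disk_01 at 93%"

digits = re.findall(r"\d+", log_line)   # runs of digit characters
words = re.findall(r"\w+", log_line)    # runs of letters, digits, and underscores
fields = re.split(r"\s+", log_line)     # split on any run of whitespace
has_tab = re.search(r"\t", log_line) is not None

print(digits)   # ['2024', '01', '15', '01', '93']
print(words)    # ['2024', '01', '15', 'ERROR', 'disk_01', 'at', '93']
print(fields)   # ['2024-01-15', 'ERROR', 'disk_01', 'at', '93%']
print(has_tab)  # True
```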
These are alternative names for character types under the POSIX standard. They are written inside a bracket expression, e.g. [[:upper:]].
[:upper:]
Uppercase characters [A-Z]
[:lower:]
Lowercase characters [a-z]
[:digit:]
Any digit character [0-9]
[:space:]
Any whitespace character (space, tab, newline, etc.)
[:alpha:]
Any uppercase or lowercase alphabetical character [A-Za-z]
[:alnum:]
Any uppercase, lowercase, or digit character [A-Za-z0-9]
[:punct:]
Any punctuation character
[:xdigit:]
Any hexadecimal digit
[:cntrl:]
Any control character
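Not every engine understands the POSIX names. Python's built-in re module, for instance, does not implement them, so one workaround is to substitute the equivalent ranges listed above. The sketch below is only an illustration: the POSIX_CLASSES table and the posix_class helper are hypothetical names, and the ranges cover ASCII only.

```python
import re

# Hypothetical lookup table: POSIX class name -> equivalent ASCII bracket contents.
POSIX_CLASSES = {
    "upper": "A-Z",
    "lower": "a-z",
    "digit": "0-9",
    "alpha": "A-Za-z",
    "alnum": "A-Za-z0-9",
    "xdigit": "0-9A-Fa-f",
}

def posix_class(name):
    """Compile a pattern matching one or more characters of the named class."""
    return re.compile(f"[{POSIX_CLASSES[name]}]+")

upper_runs = posix_class("upper").findall("ERROR in Module A7")
is_hex = posix_class("xdigit").fullmatch("1F3a") is not None
is_not_hex = posix_class("xdigit").fullmatch("1G") is None

print(upper_runs)          # ['ERROR', 'M', 'A']
print(is_hex, is_not_hex)  # True True
```

Tools that follow POSIX directly, such as grep, accept the class names as written, so this translation is only needed in engines that lack them.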
These specify how many times the preceding pattern must match. For example:
\d{3}-?\d{2}-?\d{4}
Matches the Social Security Number (SSN) format with or without dashes. Each dash is optional on its own, so a mixed form such as 123-456789 also matches.
*
Zero or more instances of the previous pattern (or single character)
+
One or more instances of the previous pattern (or single character)
?
Zero or one instance of the previous pattern (or single character), which makes it optional
{NUMBER}
Exactly NUMBER instances
{NUMBER,}
NUMBER or more instances
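{NUMBER_A,NUMBER_B}
NUMBER_A to NUMBER_B instances, inclusive (write it without a space after the comma; many flavors otherwise treat the braces as literal text)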
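The quantifiers can be tried out with a short sketch like the one below. It assumes Python's re module; the sample numbers are made up and are not real SSNs.

```python
import re

# The SSN-style pattern from the text. fullmatch requires the whole string to match.
ssn = re.compile(r"\d{3}-?\d{2}-?\d{4}")

samples = ["123-45-6789", "123456789", "123-456789", "12-345-6789"]
results = {s: ssn.fullmatch(s) is not None for s in samples}
print(results)
# {'123-45-6789': True, '123456789': True, '123-456789': True, '12-345-6789': False}

# The basic quantifiers applied to the same input.
text = "a ab abb"
star = re.findall(r"ab*", text)         # ['a', 'ab', 'abb']  zero or more "b"
plus = re.findall(r"ab+", text)         # ['ab', 'abb']       one or more "b"
optional = re.findall(r"ab?", text)     # ['a', 'ab', 'ab']   zero or one "b"
at_least = re.findall(r"ab{2,}", text)  # ['abb']             two or more "b"
```

Note that the third sample matches because each dash is optional independently of the other.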
These specify how matching occurs and are used to build more complex patterns.
[ ]
Character set: matches a single character from the characters or ranges listed inside
[A-M]
Single character in the range inclusive between "A" and "M" (e.g. "A", "B", "C", "D", ... "K", "L", "M")
[1-4]
Single digit in the range inclusive between 1 and 4 (e.g. 1, 2, 3, 4)
(A|B)
Either "A" or "B"; each alternative can be a longer pattern, not just a single character
[ABC]
Single character that is either "A" or "B" or "C"
[^ABC]
Single character that is not "A" and not "B" and not "C"
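The sets, ranges, and alternation above can be combined, as in this minimal sketch. It assumes Python's re module; the code list and the message string are invented for illustration.

```python
import re

# Hypothetical two-character codes to filter.
codes = ["A1", "M4", "N2", "B5", "C3"]

# [A-M][1-4]: one letter from A to M, then one digit from 1 to 4.
in_range = [c for c in codes if re.fullmatch(r"[A-M][1-4]", c)]

# [^ABC]\d: first character is anything except A, B, or C, then a digit.
not_abc = [c for c in codes if re.fullmatch(r"[^ABC]\d", c)]

# Alternation works on whole sub-patterns, not only on single characters.
levels = re.findall(r"(ERROR|WARN)", "INFO ok; WARN slow; ERROR down")

print(in_range)  # ['A1', 'M4', 'C3']
print(not_abc)   # ['M4', 'N2']
print(levels)    # ['WARN', 'ERROR']
```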