Regex with Class

Ruby's regex engine defines a lot of shortcut character classes. Besides the common metacharacters (\w, etc.), there are also POSIX-style expressions and the Unicode property syntax. This is an overview of all character classes:

Meta Chars

| Char | Negation | ASCII | Unicode |
|------|----------|-------|---------|
| . | - | Any ¹ | Any ¹ |
| \X | - | Any | Grapheme clusters (\P{M}\p{M}*) |
| \d | \D | [0-9] | ASCII plus Decimal_Number (Nd) ² |
| \h | \H | [0-9a-fA-F] | Like ASCII |
| \w | \W | [0-9a-zA-Z_] | ASCII plus Letter (LC / Ll / Lm / Lo / Lt / Lu), Mark (Mc / Me / Mn), Number (Nd / Nl / No), Connector_Punctuation (Pc) ² |
| \s | \S | [ \t\r\v\n\f] | ASCII plus Separator (Zl / Zp / Zs) ² |
| \R | - | [\n\v\f\r], \r\n | ASCII plus …, Line_Separator (Zl), Paragraph_Separator (Zp) |

¹ Matches line breaks only with the /m flag
² Matches only ASCII by default; Unicode matching has to be turned on manually, for example with the inline (?u) option, as in /(?u)\w/
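
The two footnotes are easy to check in irb. The following is a minimal sketch, assuming UTF-8 strings and Ruby 2.4 or newer (for match?); the variable names are only for illustration, and the non-ASCII characters are written as \u escapes so the snippet does not depend on the source file's encoding:

```ruby
# Footnote 1: "." does not match a line break unless the /m flag is set.
dot_default   = "a\nb".match?(/a.b/)
dot_multiline = "a\nb".match?(/a.b/m)

# Footnote 2: \w and \d are ASCII-only by default.
# The inline (?u) option switches them to their Unicode definition.
e_acute      = "\u00E9"   # "é", a Lowercase_Letter (Ll)
arabic_three = "\u0663"   # ARABIC-INDIC DIGIT THREE, a Decimal_Number (Nd)

word_ascii    = e_acute.match?(/\w/)
word_unicode  = e_acute.match?(/(?u)\w/)
digit_ascii   = arabic_three.match?(/\d/)
digit_unicode = arabic_three.match?(/(?u)\d/)

# \R treats "\r\n" as one line break; \X matches a whole grapheme cluster,
# here "e" followed by a combining acute accent.
lines    = "one\r\ntwo\nthree".split(/\R/)
clusters = "e\u0301a".scan(/\X/)

p dot_default, dot_multiline
p word_ascii, word_unicode, digit_ascii, digit_unicode
p lines, clusters.map(&:length)
```
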

POSIX and Unicode Property Style

| POSIX | Negation | Property | Negation ³ | ASCII | Unicode |
|-------|----------|----------|------------|-------|---------|
| [:alnum:] | [:^alnum:] | \p{Alnum} | \p{^Alnum} | [0-9a-zA-Z] | Letter (LC / Ll / Lm / Lo / Lt / Lu), Mark (Mc / Me / Mn), Decimal_Number (Nd) |
| [:alpha:] | [:^alpha:] | \p{Alpha} | \p{^Alpha} | [a-zA-Z] | Letter (LC / Ll / Lm / Lo / Lt / Lu), Mark (Mc / Me / Mn) |
| [:ascii:] | [:^ascii:] | \p{ASCII} | \p{^ASCII} | [\x00-\x7F] | Like ASCII |
| [:blank:] | [:^blank:] | \p{Blank} | \p{^Blank} | [ \t] | \t, Space_Separator (Zs) |
| [:cntrl:] | [:^cntrl:] | \p{Cntrl} | \p{^Cntrl} | [\x00-\x1F], \x7F | Other (Cc / Cf / Cn / Co / Cs) |
| [:digit:] | [:^digit:] | \p{Digit} | \p{^Digit} | [0-9] | ASCII plus Decimal_Number (Nd) |
| [:graph:] | [:^graph:] | \p{Graph} | \p{^Graph} | [\x21-\x7E] | ALL, EXCEPT: Separator (Zl / Zp / Zs), Control (Cc), Unassigned (Cn), Surrogate (Cs) |
| [:lower:] | [:^lower:] | \p{Lower} | \p{^Lower} | [a-z] | Lowercase_Letter (Ll) |
| [:print:] | [:^print:] | \p{Print} | \p{^Print} | [\x20-\x7E] | ALL, EXCEPT: Line_Separator (Zl), Paragraph_Separator (Zp), Control (Cc), Unassigned (Cn), Surrogate (Cs) |
| [:punct:] | [:^punct:] | \p{Punct} | \p{^Punct} | [!-/:-@\[-`{-~] | Punctuation (Pc / Pd / Pe / Pf / Pi / Po / Ps) |
| [:space:] | [:^space:] | \p{Space} | \p{^Space} | [ \t\r\v\n\f] | ASCII plus Separator (Zl / Zp / Zs) |
| [:upper:] | [:^upper:] | \p{Upper} | \p{^Upper} | [A-Z] | Uppercase_Letter (Lu) |
| [:xdigit:] | [:^xdigit:] | \p{XDigit} | \p{^XDigit} | [0-9a-fA-F] | Like ASCII |
| [:word:] | [:^word:] | \p{Word} | \p{^Word} | [0-9a-zA-Z_] | ASCII plus Letter (LC / Ll / Lm / Lo / Lt / Lu), Mark (Mc / Me / Mn), Number (Nd / Nl / No), Connector_Punctuation (Pc) |

³ An alternative way of negating Unicode properties is \P{Property}
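
To see the two styles side by side, here is a small sketch, assuming a UTF-8 string (the sample text and variable names are made up for illustration). Note that a POSIX expression only works inside a bracket expression, so it is written [[:alpha:]], not [:alpha:] on its own:

```ruby
text = "Caf\u00E9 42"   # "Café 42"

# Both styles match the non-ASCII letter when the string is UTF-8.
posix_letters    = text.scan(/[[:alpha:]]+/)
property_letters = text.scan(/\p{Alpha}+/)

# Three equivalent ways to negate the class.
negated_posix = text.scan(/[[:^alpha:]]+/)
negated_caret = text.scan(/\p{^Alpha}+/)
negated_upper = text.scan(/\P{Alpha}+/)

p posix_letters, property_letters
p negated_posix, negated_caret, negated_upper
```
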

More Properties

The above groups are only the tip of the iceberg. Using the \p{} syntax, you can match many more Unicode properties. See Episode 41: Proper Unicoding for details!
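
As a small taste, \p{} also accepts script names and general category abbreviations. The sketch below assumes a UTF-8 string; the sample text and variable names are only illustrative:

```ruby
text = "Ruby \u03BB \u0416 x"   # contains Greek "λ" and Cyrillic "Ж"

greek     = text.scan(/\p{Greek}/)      # script property
cyrillic  = text.scan(/\p{Cyrillic}/)   # script property
uppercase = text.scan(/\p{Lu}/)         # general category Uppercase_Letter

p greek, cyrillic, uppercase
```
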
