Regular Expressions

Instructions

Legal Syntax

Repetition quantifiers:

Symbol	Function	Example	Explanation
*	Zero or more	ab*c	The letter "a" followed by zero or more "b"s followed by a "c"
+	One or more	ab+c	The letter "a" followed by one or more "b"s followed by a "c" (this is equivalent to abb*c)
?	Zero or one (i.e., optional)	abc?	The letters "ab" optionally followed by one "c"
{n}	Exactly n	(abc?){4}	Four repetitions of either "ab" or "abc" (can be mixed)
{n,}	n or more	a{4,}bc	At least four "a"s followed by "bc"
{n,m}	At least n and no greater than m	ab*c{3,152}	The letter "a" followed by 0 or more "b"s followed by anywhere from 3 to 152 "c"s

Alternation with "|"

(bcd|d*|f){3} means 3 instances of either "bcd", any number of "d"s, or an "f"

Character Classes

\s	Any whitespace (currently only includes the space character, since no allowed alphabets include any other whitespace characters
\w	Any letter, number, or underscore
\d	Any digit
\S \W \D	The complements of the above character classes
[a-h]	The square brackets enclose a list of character classes, which can include letter/number ranges
[^abc0-5]	Opening the square brackets with a "^" negates the class - so this class is all characters except for "a","b","c","0","1","2","3","4", and "5"
a[^b]+c	Bracket-classes (and backslash classes) can be treated as a single character - this regex matches the letter "a" followed by one or more non-"b" characters, followed by a "c"

Other Notes

Escaping

The following characters must be escaped with a backslash in any context: ()[]{}|\.?-*$^
with perhaps the only exception being that a dash can be used at the beginning of a character class with no special meaning.
A backslash may be used to escape any character whatsoever, if you feel so inclined.

Anchors and Backreferences

Anchors and backreferences are not allowed at all. Backreferences extend regular expressions out of the regular languages (and so cannot be represented with an NFA). Anchors are either also that way, or they're just difficult to implement and I don't care. You can, however, distinguish between matching the entire string or matching any part of the string (as you might normally do by wrapping the regex in ^$) using the provided checkbox below.

Lazy vs. Eager

Lazy/eager quantifiers are not used, since the automata only give boolean responses.

Whitespace

For the moment, at least, the only whitespace allowed is regular spaces. If you include any of \n \r \t in your regex, they will count as the letters n, r, or t. So \s is currently identical to a simple space.

Non-ASCII characters

Just don't do it. It's not available as an alphabet choice anyhow.

Regular Expressions