Regular Expressions
What are Regular Expressions?
In JavaScript, a regular expression object
- is an instance of the RegExp() constructor function, and
- represents a sequence of characters constituting a pattern that can be used to match one or more sequences of characters in strings.
RegExp Constructor
In JavaScript, a regular expression object can be initialized with either the literal syntax (/pattern/) or the constructor syntax (new RegExp()).
const foo = /oach/
'Roach'.match(foo) // => ['oach', index: 1, input: 'Roach', groups: undefined]
'Kelpie'.match(foo) // => null
const bar = new RegExp('ike')
'Mike'.match(bar) // => ['ike', index: 1, input: 'Mike', groups: undefined]
'Leo'.match(bar) // => null
foo.constructor // => ƒ RegExp() { [native code] }
bar.constructor // => ƒ RegExp() { [native code] }
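Regular expressions initialized with the literal syntax are compiled when the script is loaded, whereas regular expressions initialized with the constructor syntax are compiled at runtime. The constructor syntax is therefore the one to use when the pattern is not known in advance, e.g. when it is built from a variable.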
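As a minimal sketch of runtime compilation, the snippet below builds a pattern from a string that is only known when the code runs. The wholeWord helper is hypothetical (it is not part of the original notes) and assumes its argument contains only letters, so no escaping of special characters is needed. It uses \b, the word-boundary location marker.

```javascript
// Hypothetical helper: builds a regex matching `word` as a whole word.
// Assumption: `word` contains only letters, so nothing needs escaping.
function wholeWord(word) {
  // Inside a string the backslash itself must be escaped:
  // '\\b' in the string becomes \b (word boundary) in the pattern.
  return new RegExp('\\b' + word + '\\b')
}

const catRegex = wholeWord('cat')

const foundAsWord = catRegex.test('a cat sat')      // => true
const foundInside = catRegex.test('concatenate')    // => false

// The constructor form is equivalent to the literal /\bcat\b/
const sameSource = catRegex.source === /\bcat\b/.source // => true
```

The doubled backslash is the main thing to watch for with the constructor syntax: the pattern passes through string parsing before it reaches the regex engine.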
Regular Expression Pattern Syntax
Regular expression patterns include:
- simple alphanumeric characters, which are matched outright (e.g. michael), and
- special characters, which allow matching a class of characters (e.g. digits \d, word characters \w, whitespace \s), a combination of characters (e.g. a+ for one or more of a), etc.
Special characters can be escaped with \ to be matched literally.
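To illustrate the difference escaping makes, here is a small sketch comparing an unescaped and an escaped + character. The variable names are made up for this example.

```javascript
// Unescaped: + is a quantifier, so this means "one or more 1s followed by a 1"
const unescapedPlus = /1+1/
// Escaped: \+ matches a literal plus sign
const escapedPlus = /1\+1/

const unescapedOnSum = unescapedPlus.test('1+1')  // => false
const unescapedOnOnes = unescapedPlus.test('111') // => true
const escapedOnSum = escapedPlus.test('1+1')      // => true
const escapedOnOnes = escapedPlus.test('111')     // => false
```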
Special Characters
Special characters are used for:
- matching any character (except line terminators, by default) using the dot character (.),
- matching types of characters: digits (\d), non-digits (\D), word characters, i.e. alphanumerics and the underscore (\w), non-word characters (\W), whitespace (\s), non-whitespace (\S), etc.,
- indicating specific match locations, e.g. the beginning of a string (^), the end of a string ($), a position followed by a specific pattern (foo(?=bar) matches foo only when it is followed by bar), etc.,
- matching logical disjunctions: either foo or bar (foo|bar), any of a, b or c ([abc] or [a-c]),
- matching logical negation, e.g. any character but a ([^a]),
- matching groups of characters using (),
- quantifying the number of repetitions of the preceding item: zero or more (*), one or more (+), zero or one (?), exactly n times (a{n}), at least n times (a{n,}), between n and m times (a{n,m}). Note that the braces must not contain spaces, otherwise they are treated as literal characters.
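The following sketch runs one small example for most of the categories listed above. The sample strings and variable names are invented for illustration. It uses RegExp.prototype.test() and String.prototype.match(), both of which are described in the sections that follow.

```javascript
// Types of characters
const firstDigits = 'order 66 shipped'.match(/\d+/)[0]   // => '66'
const firstNonDigit = '42a'.match(/\D/)[0]               // => 'a'

// Match locations
const startsWithFoo = /^foo/.test('foobar')              // => true
const endsWithFoo = /foo$/.test('foobar')                // => false
// Lookahead: only the foo that is followed by bar is matched
const lookaheadIndex = 'foobaz foobar'.match(/foo(?=bar)/).index // => 7

// Disjunction, character sets and negation
const eitherWord = /^(foo|bar)$/.test('bar')             // => true
const inSet = /[a-c]/.test('xyz')                        // => false
const firstNotA = 'aab'.match(/[^a]/)[0]                 // => 'b'

// Groups: each pair of parentheses captures part of the match
const parts = '2024-05'.match(/(\d+)-(\d+)/)
const year = parts[1]                                    // => '2024'
const month = parts[2]                                   // => '05'

// Quantifiers
const optionalU = /^colou?r$/.test('color')              // => true
const exactlyTwo = /^a{2}$/.test('aa')                   // => true
const atLeastTwo = /^a{2,}$/.test('aaaa')                // => true
const oneOrTwo = /^a{1,2}$/.test('aaa')                  // => false
// A space inside the braces turns them into literal characters
const withSpace = /^a{2, }$/.test('aa')                  // => false
```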
RegExp() Methods
The most notable RegExp() instance methods are:
- RegExp.prototype.exec(), which returns an array of match data on a match or null otherwise, and
- RegExp.prototype.test(), which returns true on a match or false otherwise.
const myRegex = /abc/
myRegex.exec('abc') // => ['abc', index: 0, input: 'abc', groups: undefined]
myRegex.exec('123') // => null
myRegex.test('abc') // => true
myRegex.test('123') // => false
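The groups property in the exec() output above is undefined because the pattern has no named capturing groups. As a sketch of what the returned array holds when groups are present, the example below uses a made-up date pattern with two named groups.

```javascript
// Hypothetical pattern with two named capturing groups
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})/
const result = dateRegex.exec('Released 2024-05')

const wholeMatch = result[0]            // => '2024-05'
const firstGroup = result[1]            // => '2024'
const matchIndex = result.index         // => 9
const yearGroup = result.groups.year    // => '2024'
const monthGroup = result.groups.month  // => '05'
```

Element 0 is always the full match; numbered elements from 1 onwards hold the capturing groups in the order of their opening parentheses.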
String() Regex Methods
The most notable String() instance methods that accept a regular expression are:
- String.prototype.match(), which returns an array of match data with capturing groups on a match and null otherwise,
- String.prototype.search(), which returns the index of the first match or -1 otherwise,
- String.prototype.replace(), which returns a new string with the matched pattern replaced by a provided string,
- String.prototype.split(), which splits the string at each place where the pattern matches.
const myRegex = /abc/
const foo = 'xyz.abc.xyz.abc.xyz.abc'
foo.match(myRegex) // => ['abc', index: 4, input: 'xyz.abc.xyz.abc.xyz.abc', groups: undefined]
foo.search(myRegex) // => 4
foo.replace(myRegex, '123') // => 'xyz.123.xyz.abc.xyz.abc'
foo.split(myRegex) // => ['xyz.', '.xyz.', '.xyz.', '']
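These methods become more useful with patterns that go beyond a fixed string. The sketch below, with sample data invented for illustration, shows replace() referring to capturing groups via $1, $2, ... and accepting a function as the replacement, split() using a character set as the separator, and search() locating a match.

```javascript
const isoDate = '2024-05-17'

// $1, $2 and $3 in the replacement string refer to the capturing groups
const reordered = isoDate.replace(/(\d+)-(\d+)-(\d+)/, '$3/$2/$1') // => '17/05/2024'

// The replacement can also be a function that receives the matched text
const shouted = 'hello world'.replace(/o/, (match) => match.toUpperCase()) // => 'hellO world'

// Split on a comma or semicolon followed by optional whitespace
const fields = 'a, b;c'.split(/[,;]\s*/) // => ['a', 'b', 'c']

// Index of the first match
const position = 'hello world'.search(/wor/) // => 6
```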
TODO
Flags
Regexes can be given flags that change how their patterns are matched.
Flags are set when the regex is initialized: after the closing slash in the literal syntax, or as the second argument in the constructor syntax.
const myRegexA = /foo/gi // literal syntax: /pattern/flags
const myRegexB = new RegExp('bar', 'gi') // constructor syntax: new RegExp(pattern, flags)
Global Search
One of the most commonly used flags, often used with String.prototype.replace(), is g, which stands for global search: the regex matches every occurrence of the pattern instead of stopping at the first one.
const nonGRegex = /abc/
const gRegex = /abc/g
const foo = 'abc.abc.abc'
foo.replace(nonGRegex, '123') // => '123.abc.abc'
foo.replace(gRegex, '123') // => '123.123.123'
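The g flag also changes how other methods behave. As a sketch (variable names are made up here): match() returns all matches instead of the first one with its details, matchAll() iterates over the detailed matches, and a global regex keeps its position in the lastIndex property between calls to test() or exec(), which can be surprising when the same regex object is reused.

```javascript
const text = 'abc.abc.abc'

// With g, match() returns every matched substring
const allMatches = text.match(/abc/g) // => ['abc', 'abc', 'abc']

// matchAll() requires the g flag and yields full match data for each occurrence
const matchIndexes = [...text.matchAll(/abc/g)].map((m) => m.index) // => [0, 4, 8]

// A global regex is stateful: it resumes searching from lastIndex
const statefulRegex = /abc/g
const firstTest = statefulRegex.test('abc')  // => true
const indexAfterFirst = statefulRegex.lastIndex // => 3
const secondTest = statefulRegex.test('abc') // => false (search started at index 3)
const thirdTest = statefulRegex.test('abc')  // => true (lastIndex was reset to 0)
```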
Case Insensitive Search
Another commonly used flag is i, which stands for case-insensitive search.
const nonIRegex = /abc/
const iRegex = /abc/i
const foo = 'ABC'
foo.match(nonIRegex) // => null
foo.match(iRegex) // => ['ABC', index: 0, input: 'ABC', groups: undefined]
Multi-line Search
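To perform a multi-line search, use the m flag. It makes ^ and $ match at the beginning and end of every line, rather than only at the beginning and end of the whole string.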
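A minimal sketch of the difference, using an invented two-line string:

```javascript
const lines = 'first line\nsecond line'

// Without m, ^ only matches at the very start of the string
const withoutM = /^second/.test(lines)   // => false
// With m, ^ also matches right after a line break
const withM = /^second/m.test(lines)     // => true

// Combined with g, $ matches at the end of each line
const lineEnds = lines.match(/line$/gm)  // => ['line', 'line']
const stringEnd = lines.match(/line$/g)  // => ['line']
```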
Combining Flags
Multiple flags can be used simultaneously.
const multiRegex = /abc/gi
const foo = 'ABC.ABC.ABC'
foo.replace(multiRegex, '123') // => '123.123.123'
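The order in which flags are written does not matter. As a small sketch, the flags set on a regex can be inspected through its flags property and through per-flag boolean properties (variable names are made up for this example).

```javascript
const viaLiteral = /abc/gi
const viaConstructor = new RegExp('abc', 'ig') // same flags, different order

// The flags property reports the flags in a fixed order
const literalFlags = viaLiteral.flags          // => 'gi'
const constructorFlags = viaConstructor.flags  // => 'gi'

const isGlobal = viaLiteral.global             // => true
const isCaseInsensitive = viaLiteral.ignoreCase // => true
const isMultiline = viaLiteral.multiline       // => false
```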