Regular Expressions – Starting from basics

This entry is part 1 of 2 in the series Learning Regular Expressions

After a quick look at a few common examples of regular expressions, we now take a closer look at them, starting from the basics.

Regular expressions have their roots in formal language theory and became popular through Unix text tools such as grep, which searches text for lines that match a pattern. PHP has built-in support for regular expressions. Historically, PHP offered two regular expression engines: POSIX and PCRE (Perl Compatible Regular Expressions).

The two engines are easy to tell apart by function name: the POSIX functions start with ereg (ereg(), eregi(), ereg_replace()), while the PCRE functions begin with preg_. The POSIX functions have been deprecated since PHP 5.3 and were removed in PHP 7.0, so the PCRE engine is the one to use. The PCRE syntax is largely similar to the regular expression syntax of JavaScript and many other programming languages, although some details differ.

Why do we need Regular Expressions?

The most common use of regular expressions is string manipulation: searching for text patterns and replacing them. To put text into a required format, you can find a pattern and replace it.

Regular expressions can also check input against a fixed format, which is why they are used in form validation. The most common example is validating the format of an email address. Okay, let’s see what an email format validation expression looks like:
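As a small sketch of search and replace, the snippet below reorders a date with PHP's preg_replace() function. The sample date and the target format are assumptions chosen for illustration, and the pattern uses a few special characters (\d, braces and parentheses) that are not covered in this post.

```php
<?php
// Hypothetical input: a date written as YYYY-MM-DD.
$date = '2013-04-27';

// Capture year, month and day in groups 1, 2 and 3,
// then write them back in the order DD/MM/YYYY.
$formatted = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $date);

echo $formatted, PHP_EOL; // 27/04/2013
```

The replacement string is single-quoted so that PHP leaves $3, $2 and $1 alone and preg_replace() can read them as references to the captured groups.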

 /^[\w.-]+@([\w-]+\.)+[a-z]+$/i

This looks weird! But, once you master the language of regular expressions, you can create any pattern.
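To see the email pattern at work, here is a minimal sketch that wraps it in a function. It assumes the pattern is written with its backslashes (\w and \.), and the function name looksLikeEmail() and the sample inputs are made up for this example.

```php
<?php
// Hypothetical helper name, used here only for illustration.
function looksLikeEmail(string $candidate): bool
{
    // The email pattern from the text, with its backslashes written out.
    $pattern = '/^[\w.-]+@([\w-]+\.)+[a-z]+$/i';

    // preg_match() returns 1 on a match and 0 when nothing matches.
    return preg_match($pattern, $candidate) === 1;
}

var_dump(looksLikeEmail('someone@example.com')); // bool(true)
var_dump(looksLikeEmail('not-an-email'));        // bool(false)
```

This only checks the general shape of an address. For real form validation, PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL is usually a more thorough option.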

Constructing a Regular Expression

Traditionally, forward slashes (/) are used as the delimiters for a regular expression, and the pattern is enclosed within them. Alternatively, a hash (#) or any other character that is not a letter, a digit, a backslash (\) or whitespace can be used as the delimiter. So, if you want to search for the term “refulz” in a string, the regular expression will be

/refulz/

To use this pattern in PHP code, we will use the preg_match() function, which returns 1 if the pattern matches and 0 if it does not. The example below matches the above pattern against a string.

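$text = 'refulz has a developers blog';

// preg_match() returns 1 when the pattern is found and 0 when it is not.
// The single quotes keep "$text" as literal text in the message.
if (preg_match('/refulz/', $text))
{
    $output = '$text contains refulz';
}
else
{
    $output = '$text does not contain refulz';
}

echo $output;
// Output: $text contains refulz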
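The choice of delimiter mentioned earlier matters most when the pattern itself contains forward slashes. The sketch below compares the two styles on a made-up URL. The sample string and variable names are assumptions for illustration.

```php
<?php
// Hypothetical sample text that contains forward slashes.
$url = 'http://example.com/blog/regex';

// With "/" as the delimiter, every slash inside the pattern must be escaped.
$withSlashes = preg_match('/\/blog\//', $url);

// With "#" as the delimiter, the slashes can be written as they are.
$withHashes = preg_match('#/blog/#', $url);

var_dump($withSlashes === $withHashes); // bool(true)
```

Both calls look for the same text, "/blog/". The second form is simply easier to read.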

Regular expressions are case sensitive by default, so a lowercase character in the pattern matches only a lowercase character in the text. Therefore, in the above example, the pattern “/refulz/” matches “refulz” but not “Refulz” or “REFULZ”. However, you can use a pattern modifier to do a case-insensitive search.

Pattern modifiers are single-character flags that follow the ending delimiter of the expression. The pattern modifier for performing a case-insensitive search is “i”. So, our expression will become

"/refulz/i"

With this pattern modifier, the expression will also be able to match strings like “REFULZ”, “Refulz”, “ReFulz” and so on.
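The difference can be checked with a short sketch that runs both patterns over the same inputs. The array names are chosen only for this example.

```php
<?php
// Inputs that differ only in letter case.
$samples = ['refulz', 'REFULZ', 'Refulz', 'ReFulz'];

$sensitive = [];
$insensitive = [];

foreach ($samples as $sample) {
    // Without a modifier, only the all-lowercase input matches.
    $sensitive[$sample] = preg_match('/refulz/', $sample);

    // With the "i" modifier, every input matches.
    $insensitive[$sample] = preg_match('/refulz/i', $sample);

    echo $sample, ': ', $sensitive[$sample], ' / ', $insensitive[$sample], PHP_EOL;
}
```

Each output line shows the case-sensitive result first and the case-insensitive result second, where 1 means a match and 0 means no match.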

This post marks the start of the Learning Regular Expressions series. In the next post of the series, we will look at the meaning of some useful special characters.
