Regex, preg_match() introduction


    Regex, preg_match() introduction

    preg_match() searches the given string for a match to the regular expression in the pattern. A basic call looks like this:
    PHP Code:
    preg_match('/^pattern$/i', 'string'); 
    ... where / is the delimiter, ^ and $ are specifiers (meta-characters), and i is a pattern modifier.

    For example, to replace the outdated ereg() functions (which raise the error Deprecated: Function eregi() is deprecated... since PHP 5.3 and were removed in PHP 7.0) with preg functions, start from a call like this:
    PHP Code:
    ereg('pattern', 'string'); 
    ... then change the function name to preg_match() and add delimiters around the pattern:
    PHP Code:
    preg_match('/pattern/', 'string'); 
    If you have eregi(), which is case-insensitive, this is where modifiers matter. Add the i modifier after the closing delimiter:
    PHP Code:
    preg_match('/pattern/i', 'string'); 
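    Delimiters can be any character that is not alphanumeric, a backslash, or whitespace, such as / or #. If the delimiter character appears inside the pattern it must be escaped,
    either manually with a backslash \ or with addcslashes(). Note that quotemeta() escapes regex meta-characters but not the delimiter; preg_quote() with the delimiter as its second argument handles both.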
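
    A minimal sketch of delimiter escaping, assuming the text to search for comes from a variable. The names $needle and $pattern are made up for the illustration; preg_quote() is the standard PHP function.

    ```php
    <?php
    // Hypothetical search text containing the delimiter (/) and a meta-character (.).
    $needle = 'a/b.c';

    // preg_quote() escapes regex meta-characters; the second argument
    // makes it escape the chosen delimiter as well.
    $pattern = '/' . preg_quote($needle, '/') . '/';

    $hit  = preg_match($pattern, 'path a/b.c here'); // the literal text is found
    $miss = preg_match($pattern, 'path a/bxc here'); // "." no longer means "any character"

    // The same search with # as the delimiter, so / needs no escaping.
    $alt = preg_match('#a/b\.c#', 'path a/b.c here');

    var_dump($pattern, $hit, $miss, $alt);
    ```

    Picking a delimiter that does not occur in the pattern (such as # for paths and URLs) usually keeps the pattern easier to read than escaping every slash.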

    Specifiers, or meta-characters, act as commands inside the pattern. Outside square brackets they are:
    Code:
    \ 
    general escape character with several uses
    ^ 
    assert start of subject (or line, in multiline mode)
    $ 
    assert end of subject (or line, in multiline mode)
    . 
    match any character except newline (by default)
    [ 
    start character class definition
    ] 
    end character class definition
    | 
    start of alternative branch
    ( 
    start subpattern
    ) 
    end subpattern
    ? 
    extends the meaning of (, also 0 or 1 quantifier, also makes greedy quantifiers lazy
    * 
    0 or more quantifier
    + 
    1 or more quantifier
    { 
    start min/max quantifier
    } 
    end min/max quantifier
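
    A few of the meta-characters above in action, as a small sketch. The subject strings are invented for the example; it shows a greedy quantifier, the same quantifier made lazy with ?, and a subpattern with alternation and a min/max quantifier.

    ```php
    <?php
    $subject = '<b>one</b> and <b>two</b>';

    // Greedy: .* takes as much as it can, so the match runs to the last </b>.
    preg_match('/<b>.*<\/b>/', $subject, $greedy);

    // Lazy: the ? after * makes it stop at the first </b>.
    preg_match('/<b>.*?<\/b>/', $subject, $lazy);

    // Subpattern ( ), alternation | and a {min,max} quantifier together:
    // "ab" or "cd", repeated two or three times, and nothing else.
    $repeat = preg_match('/^(ab|cd){2,3}$/', 'abcdab');

    echo $greedy[0], "\n";
    echo $lazy[0], "\n";
    ```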
    Part of a pattern that is in square brackets is called a "character class". In a character class the only meta-characters are:
    Code:
    \ 
    general escape character
    ^ 
    negate the class, but only if the first character
    - 
    indicates character range
    ] 
    terminates the character class
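
    The character class rules above can be checked with a short sketch like this one (the test strings are only illustrations):

    ```php
    <?php
    // ^ negates the class only when it is the first character.
    $negated = preg_match('/^[^0-9]+$/', 'abc');  // no digits anywhere
    $literal = preg_match('/^[0-9^]+$/', '4^2');  // here ^ is an ordinary character

    // - marks a range between two characters; escaped, it is a literal dash.
    $range = preg_match('/^[a-f]+$/', 'cafe');    // a to f allowed
    $dash  = preg_match('/^[a\-f]+$/', 'cafe');   // only "a", "-" and "f" allowed

    // Other meta-characters lose their special meaning inside a class.
    $dot = preg_match('/^[.]$/', 'x');            // [.] matches a literal dot only

    var_dump($negated, $literal, $range, $dash, $dot);
    ```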
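    Modifiers change how the specifiers and the pattern itself are interpreted. The most commonly used are:

    Code:
    i Case-insensitive matching
    
    s Single-line (dot-all): . also matches a newline
    
    m Multiline: ^ and $ also match at the start and end of each line
    
    D Dollar end only: $ matches only at the very end of the string, not before a trailing newline \n (ignored when m is set)
    
    Note that Z is not a pattern modifier. \Z is an assertion written inside the pattern; it matches at the end of the string and also just before a trailing newline \n.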
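
    To see what the i, s, m and D modifiers change, here is a small sketch that runs the same patterns with and without each one. The sample text is invented for the example.

    ```php
    <?php
    $text = "First line\nsecond line\n";

    $i   = preg_match('/first/i', $text);        // case is ignored
    $noS = preg_match('/line.second/', $text);   // . does not match a newline
    $s   = preg_match('/line.second/s', $text);  // with s, . matches a newline too
    $noM = preg_match('/^second/', $text);       // ^ is only the start of the subject
    $m   = preg_match('/^second/m', $text);      // with m, ^ also matches after a newline

    // Without D, $ also matches just before a trailing newline.
    $loose  = preg_match('/^abc$/', "abc\n");
    $strict = preg_match('/^abc$/D', "abc\n");

    var_dump($i, $noS, $s, $noM, $m, $loose, $strict);
    ```

    The last pair is the reason D is worth adding when validating input: without it, a value with a trailing newline still passes a ^...$ check.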
    A few examples:
    Code:
    Regular Expression	Will match...
    foo	The string "foo"
    ^foo	"foo" at the start of a string
    foo$	"foo" at the end of a string
    ^foo$	"foo" when it is the entire string
    [abc]	a, b, or c
    [a-z]	Any lowercase letter
    [^A-Z]	Any character that is not an uppercase letter
    (gif|jpg)	Matches either "gif" or "jpg"
    [a-z]+	One or more lowercase letters
    [0-9\.\-]	Any digit, dot, or minus sign
    ^[a-zA-Z0-9_]{1,}$	Any word of at least one letter, number or _
    ([wx])([yz])	wy, wz, xy, or xz
    [^A-Za-z0-9]	Any character that is not a letter or a digit
    ([A-Z]{3}|[0-9]{4})	Matches three uppercase letters or four digits
    And,
    Code:
    Regular expression (pattern)	Match (subject)	 Not match (subject)	Comment
    world	Hello world	Hello Jim	Match if the pattern is present anywhere in the subject
    ^world	world class	Hello world	Match if the pattern is present at the beginning of the subject
    world$	Hello world	world class	Match if the pattern is present at the end of the subject
    world/i	This WoRLd	Hello Jim	Case-insensitive search (i modifier)
    ^world$	world	Hello world	The string is exactly "world"
    world*	worl, world, worlddd	wor	There is 0 or more "d" after "worl"
    world+	world, worlddd	worl	There is at least 1 "d" after "worl"
    world?	worl, world, worly	wor, wory	There is 0 or 1 "d" after "worl"
    world{1}	world	worly	There is 1 "d" after "worl"
    world{1,}	world, worlddd	worly	There is 1 or more "d" after "worl"
    world{2,3}	worldd, worlddd	world	There are 2 or 3 "d" after "worl"
    wo(rld)*	wo, world, worldrld	wa	There is 0 or more "rld" after "wo"
    earth|world	earth, world	sun	The string contains the "earth" or the "world"
    w.rld	world, wwrld	wrld	Any character in place of the dot.
    ^.{5}$	world, earth	sun	A string with exactly 5 characters
    [abc]	abc, bbaccc	sun	There is an "a" or "b" or "c" in the string
    [a-z]	world	WORLD	There is at least one lowercase letter in the string
    [a-zA-Z]	world, WORLD, Worl12	123	There is at least one lowercase or uppercase letter in the string
    [^wW]	earth	w, W	There is at least one character that is not "w" or "W"
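
    A few rows of the table can be checked mechanically. This is only a sketch: the $rows array is a hand-copied subset of the table with delimiters added, and the array destructuring in foreach assumes PHP 7.1 or newer.

    ```php
    <?php
    // pattern => [subject that should match, subject that should not match]
    $rows = [
        '/^world/'     => ['world class', 'Hello world'],
        '/world$/'     => ['Hello world', 'world class'],
        '/world/i'     => ['This WoRLd', 'Hello Jim'],
        '/world{2,3}/' => ['worlddd', 'world'],
        '/w.rld/'      => ['wwrld', 'wrld'],
        '/^.{5}$/'     => ['earth', 'sun'],
        '/[^wW]/'      => ['earth', 'W'],
    ];

    $failures = 0;
    foreach ($rows as $pattern => [$yes, $no]) {
        // preg_match() returns 1 on a match and 0 on no match.
        $ok = preg_match($pattern, $yes) === 1 && preg_match($pattern, $no) === 0;
        if (!$ok) {
            $failures++;
        }
        printf("%-14s %s\n", $pattern, $ok ? 'as expected' : 'UNEXPECTED');
    }
    ```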
    So, to check that a string contains only letters, in either case, use:
    PHP Code:
    preg_match('/^[a-z]+$/i', 'string'); 
    To pick out a number that sits between letters in a string, pass a third argument to receive the match:
    PHP Code:
    preg_match('/[0-9]+/', 'abc123def', $matched);
    echo $matched[0]; // 123
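
    A slightly fuller sketch of working with the matches array. The subject string is made up; preg_match_all() is the standard companion function for collecting every match instead of only the first.

    ```php
    <?php
    $subject = 'order42item7';

    // preg_match() stops at the first match. $m[0] holds the whole match,
    // $m[1] and $m[2] hold the first and second subpattern.
    if (preg_match('/([a-z]+)([0-9]+)/', $subject, $m) === 1) {
        echo $m[0], ' ', $m[1], ' ', $m[2], "\n";
    }

    // preg_match_all() returns the number of matches and fills $all[0] with them.
    $count = preg_match_all('/[0-9]+/', $subject, $all);
    echo $count, ': ', implode(', ', $all[0]), "\n";
    ```

    Comparing the return value with === 1 before reading the array avoids notices when nothing matched.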
    There are more shortcuts, such as \d (a digit), \w (a letter, digit or underscore) and \b (a word boundary).
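
    A quick sketch of these three shortcuts, with \b read as a word boundary. The sample strings are illustrations only.

    ```php
    <?php
    $digits = preg_match('/^\d+$/', '2012');           // only digits
    $word   = preg_match('/^\w+$/', 'user_name1');     // letters, digits, underscore
    $whole  = preg_match('/\bcat\b/', 'a cat sat');    // "cat" as a separate word
    $inside = preg_match('/\bcat\b/', 'concatenate');  // "cat" inside a longer word

    var_dump($digits, $word, $whole, $inside);
    ```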

    For example, \s is often assumed to match only a space, but it also matches tabs and newlines. To match just a space character use \p{Zs}, and to match any newline sequence use \R.
    The difference matters when you validate a single-line string, and when you match line by line in text read from or written to a file (where the m modifier should be set).
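
    The difference between \s, \p{Zs} and \R can be seen in a short sketch. The variables are invented for the example, and preg_split() is used here only as a convenient way to show what \R matches.

    ```php
    <?php
    $tab   = "a\tb";
    $space = "a b";

    $s1 = preg_match('/^a\sb$/', $tab);        // \s accepts a tab
    $z1 = preg_match('/^a\p{Zs}b$/', $tab);    // a tab is not a space separator
    $z2 = preg_match('/^a\p{Zs}b$/', $space);  // a plain space is

    // \R matches \n, \r\n and \r alike, so mixed line endings split cleanly.
    $lines = preg_split('/\R/', "one\r\ntwo\nthree\rfour");

    var_dump($s1, $z1, $z2, $lines);
    ```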

    Example of a string that consists only of letters (in either case), digits, underscores and spaces, with no newline allowed at the end:
    PHP Code:
    preg_match('/^[\w\p{Zs}]+$/iD', 'string'); 
    Repetition ranges (min/max quantifiers) are written inside {}. For example, 1 to 50 repetitions is {1,50}:
    PHP Code:
    preg_match('/^[\w\p{Zs}]{1,50}$/iD', 'string'); 
    To check that the first character is a letter and that it is followed by 1 to 50 characters that are \w or a space:
    PHP Code:
    preg_match('/^[a-z][\w\p{Zs}]{1,50}$/iD', 'string'); 
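
    Wrapped in a function, the last check could look like the sketch below. isValidName() is a hypothetical helper name, not part of PHP, and the rules are just the ones from the pattern above.

    ```php
    <?php
    // Hypothetical helper: the value must start with a letter, followed by
    // 1 to 50 word characters or spaces, with no trailing newline.
    function isValidName(string $name): bool
    {
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        return preg_match('/^[a-z][\w\p{Zs}]{1,50}$/iD', $name) === 1;
    }

    var_dump(isValidName('John Smith'));  // letter first, then word characters and a space
    var_dump(isValidName('1John'));       // starts with a digit
    var_dump(isValidName("John\n"));      // trailing newline is rejected because of D
    var_dump(isValidName('J'));           // too short: at least one more character is required
    ```

    Comparing against 1 rather than relying on truthiness keeps an error result (false) from being confused with a non-match.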
    Useful links:
    PHP: Delimiters - Manual
    PHP: Meta-characters - Manual
    PHP: Possible modifiers in regex patterns - Manual
    PHP: Escape sequences - Manual
    PHP: quotemeta - Manual
    Regular-Expressions.info - Regex Tutorial, Examples and Reference - Regexp Patterns
    PHP regular expression tutorial
    Using Regular Expressions with PHP
    Last edited by arnage; 16.04.12, 10:36.
    <!DOCTYPE html PUBLIC "-//WAPFORUM.RS

    #2
    I think that's it for starters. If I missed something or someone has a question, please post.


      #3
      Originally posted by arnage
      I think that's it for starters, if i missed something or someone has a question please post.
      I know some tricks with preg_match, preg_match_all, preg_replace, etc., but when it comes to the real sh*t I feel like a newbie. Thanks for this useful post!

      Others should say thanks too, because nowadays no one takes the time to write this kind of tutorial.

      Nice one. Keep it up!



        #4
        Nice. Thanks for the tut, buddy.
