Given an input string (s) and a pattern (p), implement regular expression matching with support for ‘.’ and ‘*’.

Note:

‘.’ Matches any single character.
‘*’ Matches zero or more of the preceding element.

Example 1:
Input:
Pattern: a.b
	Valid Strings:
		acb - any character can be present in place of "."
		aab - same reason as above
		adb - same reason as above

Invalid Strings:
	ab - Because in middle some element should be there to fill the place of "."
	acab - "." can hold only 1 element
	cb - because "a" is not present.

Example 2:
	Pattern: a*b
		Valid Strings:
			b - because * matches 0 or more character of the element behind it
			ab - same reson as above
			aab 
			aaab
	In-Valid Strings:
		a - because "b" should be present at the end.
		acb - because * accepts elements of preceding element. As * is preceding by "a". This pattern is invalid.

We solve this by using bottom up dynamic programming approach.

Example 3: 
	Pattern: a * b . * y

	Here 
		a* will accept 0 or more occurance of a

		b one "b" should be present

		. * 0 or more occurance of any character

		y one "y" at the end should be prensent.

	Valid Strings:
		by
		bly
		ably

	In-Valid Strings:
		ay
		ab

First we take a boolean 2D matrix “bArray[i] [j] ” to store the truth values. Then we iterate through all the elements and fill the boolean array.

Where “i” represents the index of string “s”.
and “j” represents the index of pattern “p”

For any value of bArray[i] [j] will be equall to below 4 condition:


Condition 1:
	The value of bArray[i] [j] will be equal 
        to the value of  bArray[i -1 ] [j -1] when 
        string [i] == pattern [j] || pattern [j] == '.'

Condition 2:
	The value of bArray[i] [j] will be equal 
        to the value of  bArray[i] [j -2] if the pattern [j] == "*". 

Condition 3:
	The value of bArray[i] [j] will be equal 
        to the value of  bArray[i - 1] [j] when 
        string [i] == pattern [j - 1] || pattern [j - 1] == '.'

Condition 4:
	The value of bArray[i] [j] will be false, when string [i] is not equal to pattern [j].

We shall consider an example:

Pattern = x a * b . c
String = x a a b y c

The above strings can be visualized as shown below:

There bArray[0] [0] is true. Because we have considered the string is empty and pattern is also empty. Hence it is True.

0th column is false. Because we have considered the string is empty but the pattern is not. Hence we have written it as false.

0th row is also false. Because our pattern doesn’t allow empty strings. If our pattern is of type ” a * b *” then it would accept empty strings.

Lets start by taking the elements from bArray[1] [1], string[1] = x and pattern [1] = x. Then we have to apply condition 1. hence the value of bArray[1] [1] will be bArray[i-1] [j-1] i.e bArray[0] [0] that is true.
Hence bArray[1] [1] = true.
bArray[1] [2], string[1] = x, pattern[2] = a. Condition 4 is applied. Hence the value of bArray[1] [2]  will be false.
bArray[1] [3], string [1] =x, pattern[3] = “*”. In this case, we can have 0 or more occurance of a. Hence condition 2 is applied, i.e. bArray[i] [j- 2] the value is true.
Here the pattern is “x a *” and the string is “x”, it can match 0 or more occurance of a. Hence it is true.
Hence bArray[1] [3] = true.
bArray[1] [4], string[1] = x, pattern[4] = b contidtion 4, false.
bArray[1] [5], string[1] = x, pattern[5] = “.”, Condition 1. The value of bArray[i – 1] [j – 1], bArray[0] [4] is false. Hence the result is false.
bArray[1] [6], string[1] = x, pattern[6] = c. Condition 4, false.
Final representation after pass 1 is below:
 

Final represenation of the matrix after all the passes:

The final answer is “True” hence the string matches the pattern.

Time complexity is O(m * n), because for every string in m = 1, 2, . . and for every element in pattern n = 1, 2, . . . we are calling the function once.

Space complexity is also O(m * n), because we save the boolean entries in temp 2d array.

Solution in C++

#include<iostream>
#include<string>
using namespace std;

bool checkRegrex (string str, string pattern)
{
    bool bArray[str.length()] [pattern.length()];

    memset(bArray, false, sizeof(bool)*(str.length()+1)*(pattern.length()+1));

    bArray[0] [0] = true;

// to handle sitations like  a* or a*b* or a*b*c*
    for(int j = 1; j<=pattern.length() ; j++)
    {
        bArray[0][j] = pattern[j-1] == '*' ? bArray[0][j-2] : false;
    }

// we execute codition 1, 2, 3 and 4 in this for loop
      for(int i = 1; i <= str.length(); i++)
    {
        for(int j = 1; j <= pattern.length(); j++)
        {

            if (pattern[j - 1] == '.' || pattern[j - 1] == str[i - 1]) 
            {
                    bArray[i][j] = bArray[i-1][j-1];
            } 
            else if (pattern[j - 1] == '*')  
            {
                    bArray[i][j] = bArray[i][j - 2];
                    if (pattern[j-2] == '.' || pattern[j - 2] == str[i - 1]) 
                    {
                        bArray[i][j] = bArray[i][j] | bArray[i - 1][j];
                    }
            } else {
                    bArray[i][j] = false;
                }

        }
    }


   return bArray[str.length()][pattern.length()];
}


int main()
{
      
    bool result = false;
       result = checkRegrex("abab", "a*");

       if(result)
       {
            cout<<"Pattern and string is a match"<<endl;
       }
       else
       {
            cout<<"Pattern and string is NOT a match"<<endl;
       }
}
Write a Comment

Leave a Comment

Your email address will not be published. Required fields are marked *