Although there exists the most efficient "Knuth-Morris-Pratt" algorithm of string pattern matching. I tried to achieve the same result with a different approach. But I am not sure about it's efficiency. I am a novice in programming. Can anyone show me how to analyze this algorithm and find it's efficiency?
Algorithm:
Step 1: Calculate the integer value of the pattern by adding the integer values of each of the characters in it. Let it be "m"
Step2: Calculate the integer values of parts of the string, with the same length as that of the pattern in the similar way as described in step 1.
Step3 : Store these values in an array.
Step 4 : Compare the values in array with the "m" value from step1.
Step 5: If the values match then check if the part of the string matches with the pattern. If it does then return true. If the pattern does not match at all then return false.
I have coded the algorithm in Java. Here is the code.
public class StringPatternMatchingAlgorithm {
public static boolean patternMatching(String str,String pattern){
int patternLength = pattern.length();
int patternSum = 0;
int[] patternSumArray = new int[(str.length() - patternLength)+1];
int patternSumArrayIndex = 0;
int patternSumToCheck = 0;
String toCompare = "";
for(int i=0; i<pattern.length();i++){
patternSumToCheck = patternSumToCheck + pattern.charAt(i);
}
for(int i=0; i<((str.length())-(pattern.length()-1));i++){
patternSum = 0;
for(int j=i; j<patternLength;j++){
patternSum = patternSum + str.charAt(j);
}
patternLength++;
patternSumArray[patternSumArrayIndex] = patternSum;
if(patternSumArrayIndex < patternSumArray.length-1 ){
patternSumArrayIndex++;
}
}
for(int i=0; i<patternSumArray.length;i++){
if(patternSumArray[i] == patternSumToCheck){
toCompare = "";
for(int j=i; j<(i+pattern.length());j++){
toCompare = toCompare + str.charAt(j);
}
if(toCompare.equals(pattern)){
return true;
}
}
}
return false;
}
public static void main(String[] args) {
String str = "cdxabie";
String pattern = "bie";
System.out.println(patternMatching(str, pattern));
}
}
The code can be run here
Edit: As suggested in the comments to the first version of algorithm, I have committed a modification. I used two variables "forthIndex" which is initialized to pattern's length and "backIndex" initialized to 0. Also, I calculated the integer sum of number of characters, equal to the length of pattern from index 0 and stored it in a variable named "patternSum" only once. Then onwards the algorithm adds the integer value of character at "forthIndex" to "patternSum" and substracts the integer value of character at "backIndex" from "patternSum" simultaneously, as the string is traversed. This eliminates the recalculation of integer sum of characters in string as it was in the former version of algorithm.
Here is the improved code:
public class StringPatternMatchingAlgoVariant1 {
public static boolean matchPattern(String str,String pattern){
int patternLength = pattern.length();
int patternSum = 0;
int[] patternSumArray = new int[(str.length() - patternLength)+1];
int patternSumArrayIndex = 0;
int patternSumToCheck = 0;
String toCompare = "";
int backIndex = 0;
int forthIndex = pattern.length();
for(int i=0; i<pattern.length();i++){
patternSumToCheck = patternSumToCheck + pattern.charAt(i);
patternSum = patternSum + str.charAt(i);
}
patternSumArray[patternSumArrayIndex] = patternSum;
patternSumArrayIndex++;
for(int i=0; forthIndex<str.length();i++){
patternSum = patternSum + str.charAt(forthIndex) - str.charAt(backIndex);
forthIndex++;
backIndex++;
patternSumArray[patternSumArrayIndex] = patternSum;
patternSumArrayIndex++;
}
for(int i=0; i<patternSumArray.length;i++){
if(patternSumArray[i] == patternSumToCheck){
toCompare = "";
for(int j=i; j<(i+pattern.length());j++){
toCompare = toCompare + str.charAt(j);
}
if(toCompare.equals(pattern)){
return true;
}
}
}
return false;
}
public static void main(String[] args) {
String str = "cdxabie";
String pattern = "bie";
System.out.println(matchPattern(str, pattern));
}
}
The new version of the code can be run here