I have a list of spam words stored in an array. When a user submits some text, I want to know whether it contains any of these words. How can I do this? I wrote this code:

$my_words = 'buy;sell;shop;purchase';
$array_words = explode(';' ,$my_words); //convert the list to an array
$input_text = 'we can sell your products'; //user will input this text

foreach($array_words as $word){
    if (strpos($input_text, $word) !== false) return "You can't use this text"; // double quotes so the apostrophe does not end the string
}

But if I have many spam words, the foreach loop slows execution down.

1
  • Try a regex: put your words into a single pattern and check them with an OR (alternation). Commented Nov 18, 2020 at 6:49

2 Answers

3

You may form a regex alternation based on your semicolon-delimited string, and then search for that:

$my_words = 'buy;sell;shop;purchase';
$regex = '/\b(?:' . str_replace(';', '|', $my_words) . ')\b/';
$input_text = 'we can sell your products';
if (preg_match($regex, $input_text)) {
    echo "MATCH";
}
else {
    echo "NO MATCH";
}

Here is an explanation of the (exact) regex built and used above:

/\b(?:buy|sell|shop|purchase)\b/

\b            match a word boundary (here, the start of a word)
(?:           open a non-capturing group, which matches either
    buy
    |         OR
    sell
    |         OR
    shop
    |         OR
    purchase
)             close the group
\b            match a word boundary (here, the end of a word)
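
If the word list comes from an array rather than a hand-written string, the same pattern can be built programmatically. The following is only a minimal sketch under a few assumptions: the helper name containsSpamWord is hypothetical, each word is passed through preg_quote() in case it contains regex metacharacters, and the i and u modifiers are added for case-insensitive matching of UTF-8 text (neither modifier is part of the answer above).

```php
<?php
// Hypothetical helper: true when $text contains any of $words as a whole word.
function containsSpamWord(array $words, string $text): bool
{
    if ($words === []) {
        return false; // an empty alternation would match any text
    }

    // Escape regex metacharacters and the '/' delimiter in every word.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);

    // i: case-insensitive; u: treat pattern and subject as UTF-8 (assumption).
    $regex = '/\b(?:' . implode('|', $escaped) . ')\b/iu';

    return preg_match($regex, $text) === 1;
}

$words = explode(';', 'buy;sell;shop;purchase');

var_dump(containsSpamWord($words, 'we can sell your products')); // bool(true)
var_dump(containsSpamWord($words, 'we can sellyour products'));  // bool(false)
```

The whole list is checked in one preg_match() call instead of one strpos() call per word.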
5
  • Can you tell me more about the regex? What do '/\b' and '?:' mean?
    – Learner231
    Commented Nov 18, 2020 at 6:50
  • Thank you. Will the regex also work with Arabic letters without problems?
    – Learner231
    Commented Nov 18, 2020 at 6:53
  • Yes, PHP regex can be made to work with any UTF-8 language (but that's not exactly what you asked above). Commented Nov 18, 2020 at 6:55
  • I checked this code, and it behaves a little differently from strpos: the regex does not match when the spaces around the word are missing. For example, with $input_text = 'we can sellyour products'; the regex shows NO MATCH but strpos shows MATCH.
    – Learner231
    Commented Nov 18, 2020 at 7:18
  • 1
    @Learner231 Well if you just want to match substrings, then remove the \b from the regex pattern. But, keep in mind that by doing this the word seller would then become a match for the keyword sell. Is that the logic you really want? Commented Nov 18, 2020 at 7:23
0

You could use | as the word separator and then check with a regular expression:

$my_words = 'buy|sell|shop|purchase';
$input_text = 'we can sell your products'; //user will input this text
return preg_match('#(?:'.$my_words.')#i', $input_text);

Use the i flag to make the match case-insensitive.
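
Note that this pattern matches substrings, like strpos, rather than whole words. As a rough illustration (the variable names $substring and $wholeWord are just for this sketch), here is the same word list with and without \b word boundaries:

```php
<?php
$my_words = 'buy|sell|shop|purchase';

// Substring match: fires anywhere inside the text, like strpos.
$substring = '#(?:' . $my_words . ')#i';

// Whole-word match: \b requires a word boundary on both sides.
$wholeWord = '#\b(?:' . $my_words . ')\b#i';

$input_text = 'we cansell your products';

var_dump(preg_match($substring, $input_text)); // int(1)
var_dump(preg_match($wholeWord, $input_text)); // int(0)
```

Which variant is right depends on whether a word such as seller should count as a hit for sell.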

1
  • The code changed after my comment, but it still checks for a substring, not a whole word. 'we cansell your products' <-- "cansell" will also return true, when it should return false. Commented Nov 18, 2020 at 8:40
