0

As the title says, what is the logic behind using noise words (stop words) in full-text search so that these words are excluded from searches? What if someone searches for "to be or not to be"? Would no results be shown? I'd highly appreciate it if someone could explain the reasoning, since I'm about to disable ft_stopword_file.

3 Answers

2

Stop words exist so that the full-text index doesn't become bloated, which helps both performance and storage. If you index every stop word (that is, disable the stop-word list), full-text search performance will degrade to some extent.

1
  • So I'd better not change the file... What about "to be or not to be"? How do I search for this?
    – Shaokan
    Commented Oct 21, 2011 at 0:05
1

If you disable the stop words, performance will decrease dramatically. The workaround is either to check in your PHP code whether the search query consists mainly of stop words and fall back to a 'LIKE' search for those queries, or simply to use Sphinx as the search engine. The logic behind stop words is to exclude words like 'is', 'are', 'be', 'there', 'not', etc. from searching.
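A rough sketch of the LIKE fallback, written in Python rather than PHP for brevity. The stop-word set is a small hypothetical subset (the real list is much longer), the `articles` table and `body` column are made-up names, and "consists of stop words" is assumed here to mean that every word in the query is a stop word:

```python
# Hypothetical subset of the stop-word list; the real one is much longer.
STOP_WORDS = {"to", "be", "or", "not", "is", "are", "there"}


def build_search_query(user_input, table="articles", column="body"):
    """Return (sql, params) for a parameterized query.

    Falls back to LIKE when every word in the input is a stop word,
    because a full-text search would ignore all of them.
    The table and column names are placeholders, not from the question.
    """
    words = user_input.lower().split()
    if words and all(word in STOP_WORDS for word in words):
        # Escape LIKE wildcards so the input is matched literally.
        escaped = (
            user_input.replace("\\", "\\\\")
            .replace("%", "\\%")
            .replace("_", "\\_")
        )
        sql = f"SELECT * FROM {table} WHERE {column} LIKE %s"
        return sql, (f"%{escaped}%",)

    # Normal case: let the full-text index do the work.
    sql = f"SELECT * FROM {table} WHERE MATCH({column}) AGAINST (%s)"
    return sql, (user_input,)


like_sql, like_params = build_search_query("to be or not to be")
fulltext_sql, fulltext_params = build_search_query("mysql performance")
```

The LIKE branch cannot use the full-text index, so it will be slow on large tables; it is only meant for the rare queries that the full-text search would otherwise answer with nothing.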

1

The logic is that these words are so common that they would create large index nodes and degrade the system. They are also of little use to users, since words like "to" and "be" are so common and carry no context.

A better indexing method would be n-grams, which can find quoted phrases like "to be", but this kind of indexing is pretty rare.
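To illustrate the idea, here is a minimal sketch of a word-level bigram index in Python. It is a toy in-memory structure, not how any particular database implements it, and all function names and sample documents are made up for the example:

```python
from collections import defaultdict


def word_ngrams(text, n=2):
    """Split text into lowercase word n-grams, e.g. 'to be', 'be or'."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_ngram_index(docs, n=2):
    """Map each word n-gram to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in word_ngrams(text, n):
            index[gram].add(doc_id)
    return index


def phrase_candidates(index, phrase, n=2):
    """Return ids of documents that contain every n-gram of the phrase.

    These are candidates only: for phrases longer than n words, the
    n-grams could occur in different places, so a final check against
    the document text is still needed.
    """
    grams = word_ngrams(phrase, n)
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


docs = {
    1: "To be or not to be",
    2: "I want to go home",
    3: "not to be missed",
}
index = build_ngram_index(docs)
matches = phrase_candidates(index, "to be")
```

Because "to be" is indexed as a single unit, the lookup touches only the documents containing that pair, instead of every document containing "to" or "be".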
