
Huge models are trained on data in multiple languages, so I would like to use such a model to detect the language of the input. I would extract paragraphs from a webpage, then have the AI analyze the text and spit out something like "the majority of the text is in English, small parts are in German and Swedish".

Is this a feasible application for an LLM? Or would a simple frequency analysis for language detection be more accurate and efficient?

  • An LLM probably can do this with reasonable accuracy, but it's like swatting flies with a sledgehammer. You'd get comparable if not better performance from purely statistical methods, while requiring orders of magnitude fewer computing resources.
    – Mark
    Commented Apr 18 at 1:56
  • 1
  • why the downvote?
    – Franck Dernoncourt
    Commented Apr 18 at 15:56
  • I don't personally find downvotes offensive, so feel free to cast any vote :)
    – tpimh
    Commented Apr 18 at 19:00
  • @FranckDernoncourt, I'd guess that it's because this is yet another case of "Hi, I found this cool-looking hammer. How do I use it to install screws?"
    – Mark
    Commented Apr 18 at 23:36
  • The question is not "how do I use it", but rather "is it a good tool for this purpose". And "no" is an acceptable answer if it's backed by something.
    – tpimh
    Commented Apr 22 at 6:56

1 Answer


Simple frequency analysis for language detection is orders of magnitude more efficient from a computational standpoint, and I'd guess at least as accurate as LLMs. You just need to look at character n-gram statistics (which typically yield >99% accuracy in most language detection scenarios), using sliding windows if you suspect code-mixing (code-mixing significantly degrades language detection accuracy).
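
For illustration, here is a minimal sketch of the sliding-window idea using the langdetect library (an n-gram-based detector). The window/step sizes and the `detect_languages` helper are illustrative choices, not a standard API:

```python
# Sliding-window language detection sketch, assuming the langdetect
# library is installed (pip install langdetect).
from collections import Counter

from langdetect import DetectorFactory, detect
from langdetect.lang_detect_exception import LangDetectException

DetectorFactory.seed = 0  # langdetect is randomized by default; pin the seed


def detect_languages(text: str, window_size: int = 200, step: int = 100) -> dict:
    """Classify overlapping character windows and report each language's share."""
    counts = Counter()
    for start in range(0, max(len(text) - window_size + 1, 1), step):
        window = text[start:start + window_size]
        try:
            counts[detect(window)] += 1
        except LangDetectException:
            pass  # window too short or contains no usable text
    total = sum(counts.values()) or 1
    return {lang: round(n / total, 2) for lang, n in counts.most_common()}


sample = (
    "The quick brown fox jumps over the lazy dog. " * 6
    + "Der schnelle braune Fuchs springt über den faulen Hund. " * 2
)
print(detect_languages(sample))
# e.g. {'en': 0.77, 'de': 0.23} -- majority English, smaller part German
```

Off-the-shelf detectors such as langid, CLD3, or fastText's language identification model implement exactly this kind of character n-gram profiling and run in a fraction of the time an LLM call would take.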
