From the question How long is a "token"? we learn that tokens are commonly around 4 characters. So it seems plausible that LLMs might prefer word boundaries that coincide with token boundaries. E.g. maybe ChatGPT has a bias towards (4n-1)-character words (the -1 accounting for the whitespace character that a word's token typically includes).
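For what it's worth, the hypothesis is easy to probe directly. Here's a minimal sketch (assuming OpenAI's `tiktoken` package and its `cl100k_base` encoding, the one used by GPT-3.5/GPT-4) that shows how many tokens a space-prefixed word occupies:

```python
import tiktoken

# BPE encoding used by GPT-3.5/GPT-4
enc = tiktoken.get_encoding("cl100k_base")

# Sample words of various lengths; the leading space mimics how a word
# appears mid-sentence, where the tokenizer folds the space into the token.
words = ["a", "the", "cat", "word", "house", "running", "boundaries"]
for w in words:
    ids = enc.encode(" " + w)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{w!r}: {len(w)} chars -> {len(ids)} token(s) {pieces}")
```

(The leading space matters here: BPE tokenizers typically merge the space before a word into that word's token, which is where the "-1" in (4n-1) comes from.)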
Question: Does the length of a token give LLMs a preference for words of certain lengths?
I couldn't find the answer via Google; I asked Koala.sh and it said "Language models do not have a preference for words of certain lengths", and Assistant said "Language models like GPT-3.5, which is based on transformer architecture, do not inherently have a preference for words of certain lengths". However, neither AI explained its reasoning; I wonder whether there's an inherent reason for this, or research into this topic.
(Note this question is not about Google, Koala.sh, or Assistant in particular; I'm just showing my attempts at finding an answer myself, as is generally expected when writing questions.)