I have a situation where a server may split transmitted UTF-8 string data at arbitrary byte boundaries, including in the middle of a multi-byte UTF-8 sequence. In the websocket proxy that receives this data before it goes to the client, I want to detect that case and have the proxy wait for the next packet from the server and concatenate it with the prior one before sending to the client.
Assuming I am seeing the data from the server as a simple array of bytes, what is the simplest logic I can use to reliably detect the case where those bytes end in the middle of a UTF-8 sequence?
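
To make the question concrete, here is a rough sketch of the kind of check I have in mind, written in Python purely for illustration. The function names `incomplete_tail_length` and `split_complete` are hypothetical, and the sketch assumes the stream is otherwise valid UTF-8, so it only looks for truncation and does not validate. Since a UTF-8 sequence is at most 4 bytes long, an incomplete one at the end of a buffer can be at most 3 bytes, so it should be enough to scan backwards over the last 3 bytes for a lead byte.

```python
def incomplete_tail_length(data: bytes) -> int:
    """Return how many trailing bytes belong to an unfinished UTF-8 sequence.

    Returns 0 when the buffer ends on a character boundary.
    Assumes the input is otherwise well-formed UTF-8 (no validation is done).
    """
    n = len(data)
    # An incomplete sequence holds at most 3 bytes, so look back no further.
    for back in range(1, min(3, n) + 1):
        b = data[n - back]
        if b & 0xC0 == 0x80:
            continue  # 10xxxxxx: continuation byte, keep looking for the lead
        if b & 0xE0 == 0xC0:
            need = 2  # 110xxxxx: lead byte of a 2-byte sequence
        elif b & 0xF0 == 0xE0:
            need = 3  # 1110xxxx: lead byte of a 3-byte sequence
        elif b & 0xF8 == 0xF0:
            need = 4  # 11110xxx: lead byte of a 4-byte sequence
        else:
            need = 1  # ASCII (or an invalid byte, treated here as complete)
        return back if back < need else 0
    # Only continuation bytes were seen: either the tail of a complete
    # 4-byte sequence or malformed input; nothing to hold back in either case.
    return 0


def split_complete(data: bytes) -> tuple[bytes, bytes]:
    """Split a buffer into (bytes safe to forward, bytes to hold back)."""
    cut = len(data) - incomplete_tail_length(data)
    return data[:cut], data[cut:]
```

With something like this, the proxy would forward the first part and prepend the held-back bytes to the next packet before running the same check again.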