I'm looking for the correct way to pre-process my subtitle files before hard-coding them into video clips.
Currently, ffmpeg does not render RTL (right-to-left) languages properly; I have detailed the problem here: https://superuser.com/questions/1679536/how-to-embed-rtl-subtitles-in-a-video-hebrew-arabic-with-the-correct-lan
However, I can see two possible programmatic solutions:
- Adding certain Unicode control characters fixes (or partially fixes) the text, which is then fed into ffmpeg and gives good results:
  - U+200F (RIGHT-TO-LEFT MARK) at the end of a Hebrew clause, after the punctuation
  - U+202B (RIGHT-TO-LEFT EMBEDDING); I haven't yet learned its usage.
- Editing the text itself so that it produces the correct result in ffmpeg. But that requires a smart BiDi algorithm.
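
To make the first option concrete, here is a minimal Python sketch of the kind of preprocessing I have in mind. It assumes an SRT-style file, treats any line containing Hebrew or Arabic letters as RTL, appends U+200F after the trailing punctuation, and wraps the line in U+202B ... U+202C (POP DIRECTIONAL FORMATTING, the character that closes an embedding). The function names are hypothetical, and I have not confirmed that this combination is what ffmpeg needs in every case.

```python
import re

RLM = "\u200F"  # RIGHT-TO-LEFT MARK
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING (closes the embedding)

# Rough RTL detection: the basic Hebrew and Arabic Unicode blocks only.
RTL_CHARS = re.compile(r"[\u0590-\u05FF\u0600-\u06FF]")


def mark_rtl_line(line):
    """Wrap a line in RLE ... PDF and add an RLM after its last character.

    Lines without Hebrew/Arabic letters (cue numbers, timings, pure
    LTR text) are returned unchanged.
    """
    if not RTL_CHARS.search(line):
        return line
    return RLE + line + RLM + PDF


def preprocess_srt(text):
    """Apply mark_rtl_line to every line of an SRT document."""
    return "\n".join(mark_rtl_line(line) for line in text.splitlines())
```

Usage would be something like reading the subtitle file as UTF-8, passing its content through `preprocess_srt`, writing the result to a new file, and giving that file to ffmpeg. This only adds direction hints per line; it does not reorder mixed Hebrew/English text by itself.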
Do you know how to preprocess such text?
(This is NOT an encoding question. It is about which RTL/LTR algorithm to use.)
Thank you