I have a PDF file that contains both English and Arabic text. I want to convert it to a .docx file. There are many ways to do this, but none of them gives excellent results.

  • If I open the PDF file in MS Office 2016, I get the right formatting and all Arabic letters are converted correctly, but I lose almost every shape and drawing in the document.
  • If I convert the PDF file to a .docx file, I keep all the shapes and drawings, but I get a bunch of page breaks, section breaks, column breaks, etc., and about 70% of the Arabic words are not converted correctly.
  • I can get rid of the breaks with VBA code and repair most of the badly converted Arabic words with another macro, but many words remain that I have to correct manually.
  • Google Docs produces a mess.
  • ABBYY FineReader also produces a mess of words.

Some useful information:

  • This file was created with MS Office Word 2013. I lost the original files because my hard drive broke, and I had only backed up the PDF files. Everything uses the same font, Times New Roman.

Edit: I used Adobe Acrobat Pro to convert the PDF to a .docx file.

If I just copy and paste the Arabic words from the PDF into a Word document using the "Keep Text Only" paste option, I get almost perfect results. But I have over 250 pages, and this would take time that I don't have.

  • Sorry to say, I don't think there is a better tool for the conversion. As you already described, every method comes at a price. You seem to be familiar with macros; maybe you can create one that loops through the version with the correct Arabic words and copies them into the version with the right formatting, replacing the wrongly converted words. Commented Jan 4, 2019 at 9:18
  • That is a very smart approach, but I don't know how to achieve it. I have only a basic understanding of macros. Using different programs to convert the file results in different layouts. I forgot to mention that I used the latest version of Adobe Acrobat Pro DC to convert the PDF to a .docx file. Commented Jan 4, 2019 at 10:35
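
The macro idea from the comment above can be sketched in a few lines. This is only an illustration in Python rather than VBA, and it assumes both conversions have already been exported as lists of words in reading order (that export step is not shown). The name `transplant_arabic` and both parameter names are hypothetical. The sketch aligns the two word lists with the standard-library `difflib.SequenceMatcher` and, where they disagree, takes the word from the version whose Arabic is correct.

```python
import difflib
import re

# Any character from the basic Arabic Unicode block (U+0600-U+06FF).
ARABIC = re.compile(r"[\u0600-\u06FF]")

def transplant_arabic(layout_words, text_words):
    """Return layout_words with mismatched Arabic words taken from text_words.

    layout_words: words from the conversion that kept the shapes/layout
                  but garbled some Arabic (hypothetical input).
    text_words:   words from the conversion whose Arabic is correct.
    """
    matcher = difflib.SequenceMatcher(a=layout_words, b=text_words, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old, new = layout_words[i1:i2], text_words[j1:j2]
        # Only swap runs of equal length, so every word keeps its position.
        if tag == "replace" and len(old) == len(new):
            # Non-Arabic words are left untouched even if they differ.
            result.extend(n if ARABIC.search(o) else o for o, n in zip(old, new))
        else:
            result.extend(old)
    return result
```

The result always has the same length as `layout_words`, so each word could be written back to its original place in the well-formatted document. Runs where the two versions differ in word count are left alone and would still need manual review.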

1 Answer


I tried this online converter: https://pdf2doc.com/it/

Converting the file from PDF to DOC and opening it with LibreOffice gives an acceptable result (it mostly seems that only the pagination needs adjusting).

Here is the result: https://1drv.ms/f/s!Aj15LBU4peCjmZZp1BZZ7l9hwC3cqg

Anyway, the conversion cannot be 100% accurate because of the proprietary format of the MS Office suite, so with a third-party converter you will eventually lose some formatting. The .doc I provided looks acceptable in LibreOffice, but opened in Word 2016 the result is not as good.

[Screenshot: the .doc file opened in LibreOffice and in Word 2016]

  • This is a nice approach. The only problem is that all the Arabic words are messed up: the first letter is now the last letter and vice versa. To illustrate with an English word, "letter" becomes "rettel". This is what happens in your file. This can be solved with a macro, right? Commented Jan 4, 2019 at 10:40
  • Edit: Not all the words are messed up. On first examination, I think about 30% of the words are affected. Commented Jan 4, 2019 at 10:46
  • It could be fixed by a macro. What I think happened is that when the converter detects Arabic, it sets the writing direction to right-to-left, but it is not as precise as a human. You could try different tools and get a slightly better result, but I don't think a perfect conversion out of the box is achievable.
    – AtomiX84
    Commented Jan 4, 2019 at 10:55
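
Since only some of the Arabic words come out mirrored, a repair script has to decide which words to flip. One possible approach, sketched here in Python rather than as a Word macro, is to collect the Arabic words from a conversion whose text is known to be correct (for example the copy opened directly in Word) and to reverse a word only when its mirror image appears in that list. The function names `build_vocabulary` and `fix_reversed_words` are hypothetical, and the sketch assumes the converter reversed whole words character by character, as described in the comments above.

```python
import re

# Runs of characters from the basic Arabic Unicode block (U+0600-U+06FF).
# The block also holds Arabic punctuation and digits; treating every run
# as a "word" is a simplification.
ARABIC_WORD = re.compile(r"[\u0600-\u06FF]+")

def build_vocabulary(good_text):
    """Collect every Arabic word found in the correctly converted text."""
    return set(ARABIC_WORD.findall(good_text))

def fix_reversed_words(bad_text, vocabulary):
    """Reverse an Arabic word only when its mirror image is a known word."""
    def repair(match):
        word = match.group(0)
        if word in vocabulary:
            return word  # already correct, leave it alone
        mirrored = word[::-1]
        # Unknown in both directions: keep the original for manual review.
        return mirrored if mirrored in vocabulary else word
    return ARABIC_WORD.sub(repair, bad_text)
```

English text and whitespace pass through unchanged because the pattern only matches Arabic runs. Words that are neither in the vocabulary nor the reverse of a vocabulary entry are left as they are, so they still need to be checked by hand.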
