
I have sample content at the link here. It is plain text. Is there any way to convert this text content back into the original PDF file? (I got this content from an MTOM service.)

What I have so far is this multipart/mixed source, which contains both JSON and binary content as text:

--uuid:dba94a0e-2d99-4675-9781-2a736995bdc8
Content-Type: application/json;charset=UTF-8
Content-Transfer-Encoding: binary
Content-ID: <jsonInfos>

{"messages":[{"id":"0","type":"INFOS","messageContent":"La requête a été traitée avec succès","replacementValues":[]}]}
--uuid:dba94a0e-2d99-4675-9781-2a736995bdc8
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary
Content-ID: <label>

%PDF-1.3
%����
12 0 obj
<<
/BitsPerComponent 8
/ColorSpace /DeviceRGB
/Filter [/FlateDecode /DCTDecode]
/Height 80
/Length 2486
/Name /Obj0
/Subtype /Image
/Type /XObject
/Width 119
>>
stream
x���{<��ǟ1f��$1rY�{�QY  �a�Les�-jܧ��Qm��R4!wi&׉�Y�$32��h�1�f�Sg�9�:����y^�?���|���|�5�lr���`0p@:�N)�@"d�H�Bʡ7����h���6��lݪ�������a5t���j��k`h��Hg�  �K}��S
....
....
....
startxref
101943
%%EOF

--uuid:dba94a0e-2d99-4675-9781-2a736995bdc8--

I tried in python:

with open('tmp.txt', 'r') as tmp:
    with open('sample.pdf', 'wb') as sample:
        sample.write(tmp.read().encode('utf-8'))
  • What have you tried so far? What you posted doesn't do any more than rename the file to .pdf.
    – Klaus D.
    Commented Feb 25, 2020 at 4:54
  • @KlausD. Because there is no original PDF file. The MTOM response has only the content as text.
    – Sen Sokha
    Commented Feb 25, 2020 at 4:57
  • I have the PDF content as binary text, but I want to get the original PDF from it. The problem is an incompatibility between the REST service and the SOAP Message Transmission Optimization Mechanism (MTOM) service.
    – Sen Sokha
    Commented Feb 25, 2020 at 5:02
  • "MTOM response has only content as text" - this is wrong, MTOM responses are not text, even if they look like text and parts of it later are extracted as text. You have to treat MTOM responses as binary data, from the very start, i.e. already when you receive them.
    – mkl
    Commented Feb 25, 2020 at 6:26
  • @mkl I have integrated an enterprise web service. I send requests to it in REST (JSON) API format, but the response I get from the service is multipart/mixed. Is there any package that can extract the data from an MTOM payload?
    – Sen Sokha
    Commented Feb 25, 2020 at 6:43
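
Following mkl's point that the response has to be handled as bytes from the moment it is received, here is a minimal sketch of pulling the PDF part out of such a body. It assumes the raw body is available as a bytes object (with the requests library that would be response.content rather than response.text) and that the boundary string is read from the boundary parameter of the response's Content-Type header. The helper names iter_parts and extract_part are hypothetical, and the CRLF line endings are an assumption based on how multipart bodies are normally sent over HTTP. This is not a full MIME parser.

```python
from typing import Dict, Iterator, Tuple


def iter_parts(body: bytes, boundary: str) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, payload) for each part of a multipart body.

    Assumes CRLF line endings and a boundary that never occurs inside a payload.
    """
    delimiter = b"--" + boundary.encode("ascii")
    # Anything before the first delimiter is preamble, so skip chunk 0.
    for chunk in body.split(delimiter)[1:]:
        if chunk[:2] == b"--":
            return  # closing delimiter "--boundary--" reached
        head, separator, payload = chunk.partition(b"\r\n\r\n")
        if not separator:
            continue  # no blank line, so no payload in this chunk
        headers = {}
        for line in head.split(b"\r\n"):
            name, colon, value = line.partition(b":")
            if colon:
                key = name.strip().lower().decode("latin-1")
                headers[key] = value.strip().decode("latin-1")
        # The CRLF right before the next delimiter belongs to the delimiter.
        if payload.endswith(b"\r\n"):
            payload = payload[:-2]
        yield headers, payload


def extract_part(body: bytes, boundary: str, content_id: str) -> bytes:
    """Return the payload of the part whose Content-ID equals content_id."""
    for headers, payload in iter_parts(body, boundary):
        if headers.get("content-id") == content_id:
            return payload
    raise KeyError(content_id)


# Usage sketch (raw_body is assumed to hold the untouched response bytes):
# pdf = extract_part(raw_body, "uuid:dba94a0e-2d99-4675-9781-2a736995bdc8", "<label>")
# with open("sample.pdf", "wb") as sample:
#     sample.write(pdf)
```

The boundary and the <label> Content-ID in the usage lines are taken from the sample in the question; the key point is that the file is opened with 'wb' and the bytes are never decoded to str on the way.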

3 Answers


You cannot write to PDF files the way you write to normal text files. There are Python libraries for writing PDF files; you can try pdfrw.

The data you write to a PDF can have attributes (beyond the text you would save in a text file). Follow the samples to do what you actually need:

from pdfrw import PdfWriter

# `data` must be a pdfrw page object, e.g. a page read with pdfrw.PdfReader
y = PdfWriter()
y.addpage(data)
y.write('result.pdf')
  • Thanks for the answer! This is how to produce a new PDF, but I want to save the binary content held in the text back to the original file.
    – Sen Sokha
    Commented Feb 25, 2020 at 5:09

You can consider using FPDF to generate the PDF file. Please find the sample code below.

from fpdf import FPDF

with open('tmp.txt', 'r') as tmp:
    wpdf = FPDF()
    wpdf.set_font('arial', '', 12)
    wpdf.add_page()
    wpdf.set_xy(10, 5)
    for line in tmp:
        wpdf.cell(50, 5, txt=line, ln=1, align="L")
    wpdf.output('sample.pdf', 'F')


Please refer to the link below for more information: https://pyfpdf.readthedocs.io/en/latest/Tutorial/index.html


You cannot get back your original PDF files from plain text files alone, because while exporting to txt the converter chops off a lot of information such as color encoding, structure, and font data. However, if you just want to create a PDF from a txt file, you can use wkhtmltopdf and pdfkit to achieve that.

  • Install wkhtmltopdf via apt-get install wkhtmltopdf

  • Install pdfkit via pip install pdfkit.

Now you can just do this:

import pdfkit

pdfkit.from_file("tmp.txt", "sample.pdf")

This will return:

libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
Loading page (1/2)
Printing pages (2/2)                                               
Done                                                           
True

The pdf file should look like this:

[Image: screenshot of the generated PDF]

  • If, as you say, exporting to txt chops off a lot of information such as color encoding, structure, and font data, then I won't be able to recover the original content from the binary string in the text file. But I don't understand how the MTOM service can send binary as a string in its response over the network. So what your output converts to PDF is the payload response from the web service, including the multipart/mixed headers, not the original downloadable file.
    – Sen Sokha
    Commented Feb 25, 2020 at 6:25
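
On why the saved text cannot simply be written back out as a PDF: the � characters in the sample suggest the bytes were decoded as text somewhere along the way, and that step is lossy. A small sketch of the effect follows; the helper name text_round_trip is hypothetical, and the four sample bytes are only an illustrative assumption about what a PDF's binary marker line might contain.

```python
def text_round_trip(data: bytes) -> bytes:
    """Decode bytes as UTF-8 text and encode them again, as a text-mode save would."""
    # Bytes that are not valid UTF-8 become U+FFFD, the replacement character.
    as_text = data.decode("utf-8", errors="replace")
    return as_text.encode("utf-8")


ascii_part = b"%PDF-1.3\n"            # pure ASCII, survives unchanged
binary_part = b"\xe2\xe3\xcf\xd3"     # illustrative non-UTF-8 bytes

print(text_round_trip(ascii_part) == ascii_part)    # True
print(text_round_trip(binary_part) == binary_part)  # False
print(text_round_trip(binary_part))                 # four copies of b'\xef\xbf\xbd'
```

Once a byte has been replaced by U+FFFD its original value is gone, so no later encoding step can restore the file; the data has to be kept as bytes from the point where the response is received.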
