I have a 100kb PDF file that we'll call Test.pdf. I'm using FTP to put Test.pdf on my website. However, the PDF is corrupted when it arrives on the website. So as a diagnostic test, I ran:

$ md5sum Test.pdf
[md5sum a]
$ [ftp upload Test.pdf]
$ [ftp download Test.pdf]
$ md5sum Test.pdf
[md5sum b]

So at some point in the uploading process, the file is being corrupted! This is baffling me. I've never had this problem with any other filetype. I also tried using my website provider's manual upload client, but ran into the same problem. What's going on here?

  • You are probably uploading in ASCII instead of binary mode.
    – McKracken
    Commented Dec 28, 2013 at 0:29
  • Ah! How do I upload in binary mode via the command-line?
    – Newb
    Commented Dec 28, 2013 at 0:31
  • Never mind, got it!
    – Newb
    Commented Dec 28, 2013 at 0:35
  • @ekaj I've answered my question below.
    – Newb
    Commented Dec 28, 2013 at 0:41

2 Answers

You already self-answered, but I think I can do better than "Apparently certain types of files need to be uploaded in binary."

First, some background information:

1: Computers, bits and bytes.

The smallest unit of information in a computer is a bit. A bit is either true or false, 0 or 1, high voltage or ground, ...

The bits are grouped into small sets; on almost all modern computers, these are groups of eight. We call such a group a byte.

A set of 8 bits (1 byte) can have 256 different values, starting at
00000000 meaning 0
00000001 meaning 1
00000010 meaning 2
00000011 meaning 3 (the bits for 2 and 1 are both set)
00000100 meaning 4
...
11111111 meaning 255
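
The table above can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `to_bits` is made up for this example.

```python
def to_bits(value: int) -> str:
    """Return the 8-bit pattern for a byte value in the range 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("a byte holds values 0..255")
    return format(value, "08b")


# Print the same rows as the table above.
for value in (0, 1, 2, 3, 4, 255):
    print(to_bits(value), "meaning", value)

# Eight bits, two states each: 2**8 distinct values.
print(2 ** 8)  # 256
```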

2: ASCII.

ASCII is a set of 128 characters, numbered 0 to 127. You only need 7 bits for this. In ye olde days this was all you needed for communication: just the regular 26 letters of the western alphabet, the digits 0 to 9, and some special codes such as 7: ring the bell or beep.

These days we define many more characters. We use Unicode (in encodings such as UTF-16), allowing Chinese, Japanese, right-to-left languages, and so on. Back in ye olde days there was no support for this in common places.

3: Lastly: Bandwidth is/was expensive.

Why send all 8 bits of a byte to a destination when you know that you only need 7 of them to represent the text? If you do things in a smart way you can save 1/8th of the bandwidth.

That might not sound like much to us today, but in the era when the Europe-to-US connection was a 1200 baud dial-in line (that is about 0.1 KB/sec!), every little bit helped.

So suppose I want to write "Hello".

I can look that up in the ASCII table, and I will discover that a computer would store it in five bytes containing this:

H        e        l        l        o
01001000 01100101 01101100 01101100 01101111

Note that the first bit of every letter is 0. I might just as well remember only this part:

H        e        l        l        o
 1001000  1100101  1101100  1101100  1101111

The first example has 40 bits (5 bytes, each 8 bits of information).
The second example only has 35 bits. It is more efficient.
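
To check the bit patterns and counts for "Hello", here is a small Python sketch. It only illustrates the arithmetic; the variable names are made up for this example.

```python
text = "Hello"
codes = text.encode("ascii")  # one byte per character

eight_bit = [format(b, "08b") for b in codes]  # full bytes
seven_bit = [format(b, "07b") for b in codes]  # leading 0 dropped

print(" ".join(eight_bit))
print(" ".join(seven_bit))

bits_full = 8 * len(codes)    # 5 characters * 8 bits
bits_packed = 7 * len(codes)  # 5 characters * 7 bits
print(bits_full, bits_packed)  # 40 35
```

Dropping one bit in eight is where the 1/8th saving comes from.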

This makes it the preferred method of transferring text. However, leaving out the first bit will break anything that is not text. Thus the FTP protocol was designed with two options: ASCII mode (efficient for text) and BINary mode (transfer the data as it is).


OK, with all that known:

You transferred a binary file (a PDF) in ASCII mode, which treats the data as text and does not transmit all of the information. Thus the resulting file arrived mangled at the destination.
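
As a toy model of this explanation (not a reproduction of what any particular FTP client or server does), the Python sketch below clears the top bit of every byte and compares checksums, much like the md5sum test in the question. The helper `strip_high_bit` and the sample bytes are made up for illustration. Real implementations may alter the data differently in ASCII mode, for example by translating line endings, but the effect on a binary file is the same: the bytes that arrive are not the bytes that were sent.

```python
import hashlib


def strip_high_bit(data: bytes) -> bytes:
    """Toy model: keep only the low 7 bits of every byte."""
    return bytes(b & 0x7F for b in data)


plain_text = b"Hello"
# Hypothetical sample: "%PDF-" followed by a few bytes with the top bit set.
binary_data = b"%PDF-" + bytes([0xE2, 0xE3, 0xCF, 0xD3])

# Pure ASCII text survives, because its top bit is always 0.
print(strip_high_bit(plain_text) == plain_text)  # True

# Binary data does not survive, so the checksums no longer match.
mangled = strip_high_bit(binary_data)
before = hashlib.md5(binary_data).hexdigest()
after = hashlib.md5(mangled).hexdigest()
print(before == after)  # False
```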

To transfer anything but plain old text, use the 'bin' command at the FTP prompt, or tick the 'bin' option if you use a GUI.

I hope that answers the "What's going on here?" :)

The problem was that I was uploading Test.pdf in ASCII mode, not binary mode. Apparently certain types of files (e.g. .pdf, .zip) need to be uploaded in binary rather than ASCII mode. (This presumably has something to do with the systems-level representation of the file.) This was easily fixed by changing the upload mode to binary in ftp with the binary command, as follows:

$ ftp [myserver]
ftp> binary
ftp> put Test.pdf
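
For scripted uploads, the same thing can be sketched with Python's standard ftplib module, whose `storbinary` method switches the connection to binary (image) type before sending. The function name `upload_binary`, the host name, and the credentials below are placeholders, not details from my setup.

```python
from ftplib import FTP


def upload_binary(ftp, local_path, remote_name):
    """Upload a local file in binary mode over an open FTP connection."""
    # Open in "rb" so Python itself does no text translation either.
    with open(local_path, "rb") as fh:
        return ftp.storbinary(f"STOR {remote_name}", fh)


# Usage (hypothetical host and credentials):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     upload_binary(ftp, "Test.pdf", "Test.pdf")
```

Downloading the file again and comparing md5sum output, as in the question, is a quick way to confirm the transfer was clean.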

Here is a helpful reference.
