14
\$\begingroup\$

Manchester coding is a telecom protocol used in radio communications that guarantees bit transitions at a regular interval so a receiver can recover the clock rate from the data itself. It doubles the bitrate, but is cheap and simple to implement. It is widely used by amateur radio operators.

The concept is very simple: at a hardware level, the clock and data lines are simply XORed together. In software, this is portrayed as converting an input stream of bits into a double-rate output stream, with each input '1' translated to a '01' and each input '0' translated to a '10'.

This is an easy problem, but open to a lot of implementations because of its bitstream nature. That is, the encoding is conceptually a bit-by-bit process instead of a byte-by-byte process. So we all agree on endianness, the least significant bits of the input become the least significant byte of the output.

Golfing time! Write a function that, given an arbitrary length array of bytes, returns an array of that data manchester encoded.

Input and output should be considered little-endian, least significant byte first, and least significant BIT first in the bit stream.

ASCII bitstream drawing:

bit #      5 4 3 2 1 0                                5  4  3  2  1  0
IN ------- 1 0 1 0 1 1 ---> [manchester encoder] ---  01 10 01 10 01 01 ----> OUT

Examples:

Example 1 (hex):
       LSB              MSB     <-- least sig BYTE first
IN : [0x10, 0x02]  
OUT: [0xAA, 0xA9, 0xA6, 0xAA]  

Example 1 (binary):
      msb  lsb                      msb  lsb  <-- translated hex, so msb first
BIN: [00010000, 00000010]                     <-- least sig NIBBLE...
BIN: [10101010, 10101001, 10100110, 10101010] <-- becomes least sig BYTE
         LSB                           MSB

Example 2
IN :  [0xFF, 0x00, 0xAA, 0x55]  
OUT: [0x55, 0x55, 0xAA, 0xAA, 0x66, 0x66, 0x99, 0x99]

Example 3
IN : [0x12, 0x34, 0x56, 0x78, 0x90]  
OUT: [0xA6, 0xA9, 0x9A, 0xA5, 0x96, 0x99, 0x6A, 0x95, 0xAA, 0x69] 

Example 4
IN : [0x01, 0x02, 0x03, 0xF1, 0xF2, 0xF3]  
OUT: [0xA9, 0xAA, 0xA6, 0xAA, 0xA5, 0xAA, 0xA9, 0x55, 0xA6, 0x55, 0xA5, 0x55]

Rules:

  • Solution only requires algorithm to convert input to output.
  • Acquiring input and printing output are NOT a required part of the solution, but may be included. You are encouraged to provide your test/print code if not included in your solution.
  • Input is an array of 8-bit bytes (whatever that may mean in your language of choice), NOT a text string. You can use strings as the storage format if convenient in your language, but non-printable characters (i.e. 0xFF) must be supported. Input can also take a length if necessary.
  • Memory for output must be allocated by your routine, not provided. edit: unnecessary requirement
  • Output is also an array of 8-bit bytes, and a length if necessary.
  • Must support at least 16KB input
  • Performance must not be too horrible: < 10s for 16KB
  • Least-significant byte first in memory.

Side-channel challenge:

  • Challenge another user's answer by proving your code is faster, more memory efficient, or produces a smaller binary!

Get golfing! Shortest code wins!

\$\endgroup\$
3
  • 2
    \$\begingroup\$ "Memory for output must be allocated by your routine, not provided." That seems like quite a strange requirement since many languages have completely automatic memory allocation. \$\endgroup\$ Commented Apr 13, 2011 at 16:45
  • \$\begingroup\$ What on Earth possessed you to use such a bizarre bit order? \$\endgroup\$ Commented Apr 13, 2011 at 17:09
  • \$\begingroup\$ The bit order makes sense when you consider the physical medium this is used for; this algorithm is for a stream of individual bits travelling through the air. The fact that we have to store it in memory, and that we write hex msb->lsb, makes it a little tricky to keep track of. \$\endgroup\$
    – mrmekon
    Commented Apr 13, 2011 at 19:05

9 Answers 9

6
\$\begingroup\$

GolfScript 28 characters

{2{base}:|~4|43691-~256|~\}%

Equivalent version without obfuscating optimization:

{2base 4base 43691-~256base~\}%

The code accept input as an array of integers, and return ditto.

For each number in the array the number is converted to base 2 array form, it is then converted back to a number as if it was base 4, this has the effect of spacing out the bits with a 0 in between each. 43691 is then subtracted from the number, and the result is binary inverted, this is equivalent to subtracting the number from 43690 (43690 = 0b1010101010101010). The number is then split into two parts by converting it to a base 256 array, the array is decomposed and the order of the two resulting numbers is inverted.

Example input:

[1 2 3 241 242 243]

Example output:

[169 170 166 170 165 170 169 85 166 85 165 85]
\$\endgroup\$
4
  • \$\begingroup\$ That's ridiculously short, and very clever! Though it doesn't seem to meet the 16KB in <10s performance suggestion, at least for me; yours takes 43s on my dual-core Mac to convert an array of 16384 1's. In comparison, my massive (2419 char) python implementation takes 0.06s for 16KB. \$\endgroup\$
    – mrmekon
    Commented Apr 13, 2011 at 20:11
  • \$\begingroup\$ It takes less than 5 seconds on my machine (Win 7), and most of that is converting the array to text output, which as far as I read your question isn't part of the requirement, but GolfScript does so automatically with anything left on the stack after execution. One could simply make the code drop the result rather than print it (add ; to the end of the code). If you want to see the output (though that is not part of the question specs.) I know two tricks for speeding it up, redirect it to a file, and print it explicitly in small chunks using print commands: {2{base}:|~4|43691-~256|~p p}% \$\endgroup\$ Commented Apr 13, 2011 at 20:54
  • \$\begingroup\$ In an ubuntu vm (on windows), I get 8s for 16kb. On a mac with a better cpu it took 1m18. I'd guess the ruby than ships with OSX is just terribly slow \$\endgroup\$
    – gnibbler
    Commented Apr 14, 2011 at 0:38
  • \$\begingroup\$ Looks like the ruby printing is disgustingly slow on my machine. Only 2s with printing turned off and Ruby 1.9 (and 5s with the native OSX version). That's much better! \$\endgroup\$
    – mrmekon
    Commented Apr 14, 2011 at 2:02
3
\$\begingroup\$

c -- 224 characters

I believe that this is functional, including the allocation of memory requirement since dropped.

#include <stdlib.h>
int B(char i){int16_t n,o=0xFFFF;for(n=0;n<8;++n)o^=((((i>>n)&1)+1))<<(2*n);
return o;}char* M(char*i,int n){char*o=calloc(n+1,2),*p=o;do{int r=B(*i++);
*p++=0xFF&r;*p++=(0xFF00&r)>>8;}while(--n);return o;}

The working part of the code is a loop over the bits of each character, noting that ((bit+1) exclusive-or 3) is the output bit pair, and applying lots of shifting and masking logic to get everything to line up.

As is c's wont, it works on the data as characters. The test scaffold won't accept 0 bytes (because c treats them as string ending), but the working code has no such limitation.

It might be golfed little more by copy the byte conversion work inline.

Test run (with improved test scaffold):

$ gcc -g manchester_golf.c
$ ./a.out AB xyz U
'AB':
[ 0x41, 0x42 ]
[ 0xa9, 0x9a, 0xa6, 0x9a ]
'xyz':
[ 0x78, 0x79, 0x7a ]
[ 0x6a, 0x95, 0x69, 0x95, 0x66, 0x95 ]
'U':
[ 0x55 ]
[ 0x99, 0x99 ]

Commented, less machine dependent, and with test scaffold

/* manchester.c
 *
 * Manchester code a bit stream least significant bit first
 *
 * Manchester coding means that bits are expanded as {0,1} --> {10, 01}
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Caller must insure that out points to a valid, writable two byte
   buffer filled with 0xFF */
int16_t manByte(char i){
  int16_t n,o=0xFFFF;
  printf("Manchester coding byte 0x%hx...\n",i);
  for(n=0; n<CHAR_BIT; ++n)
    o ^= (
      (
       (
        (i>>n)&1) /* nth bit of i*/
       +1) /* +1 */
      ) <<(2*n) /* shifted up 2*n bits */ 
      ;
  printf("\tas 0x%hx\n",o);
  return o;
}

char* manBuf(const char*i, int n){
  char*o=calloc(n+1,2),*p=o;
  do{
    int16_t r=manByte(*i++);
    *p++= 0xFF&r;
    *p++=(0xFF00&r)>>8;
  } while(--n);
  return o;
}

void pbuf(FILE* f, char *buf, int len){
  int i;
  fprintf(f,"[");
  for(i=0; i<len-1; i++)
    fprintf(f," 0x%hhx,",buf[i]);
  fprintf(f," 0x%hhx ]\n",buf[len-1]);
}

int main(int argc, char**argv){
  int i;
  for(i=1; i<argc; i++){
    int l=strlen(argv[i]);
    char *o=manBuf(argv[i],l);
    printf("'%s':\n",argv[i]);
    pbuf(stdout,argv[i],l);
    pbuf(stdout,o,l*2);
    free(o);
  }
  return 0;
}
\$\endgroup\$
3
\$\begingroup\$

J, 36

,@:(3 :'#.2 8$,(,.~-.)4|.y#:~8#2'"0)

Outline of explanation (See J Vocabulary for reference):

  • ,@:(3 :'...'"0) applies the ... to each input "byte" as y, resulting in two bytes (integers) each. The result is flattened by ,.
  • y#:~8#2 is equivalent to 2 2 2 2 2 2 2 2 #: y, or vector of the 8 least significant base-2 digits of y.
  • 4|. swaps the front and back 4 bits by rotating by 4 positions.
  • (,.~-.) is equivalent to 3 :'(-. y) ,. y', or not of the argument 'stitched' to the argument (taking on shape 8 2).
  • #.2 8$, flattens the result giving the bitstream, reshapes to 2 rows of 8, and converts from base 2.

Example usage (J, interactive):

    ,@:(3 :'#.2 8$,(,.~-.)4|.y#:~8#2'"0) 1 2 3 241 242 243
,@:(3 :'#.2 8$,(,.~-.)4|.y#:~8#2'"0) 1 2 3 241 242 243
169 170 166 170 165 170 169 85 166 85 165 85

Speed information (J, interactive):

   manchester =: ,@:(3 :'#.2 8$,(,.~-.)4|.y#:~8#2'"0)
manchester =: ,@:(3 :'#.2 8$,(,.~-.)4|.y#:~8#2'"0)
   data =: 256 | i. 16384
data =: 256 | i. 16384
   100 (6!:2) 'manchester data'
100 (6!:2) 'manchester data'
0.243138

Mean time for 16kb is just under .25s, Intel Core Duo 1.83Ghz or similar.

\$\endgroup\$
3
\$\begingroup\$

Haskell, 76 characters

import Bits
z a=170-sum[a.&.p*p|p<-[1,2,4,8]]
y a=[z a,z$a`div`16]
m=(>>=y)

Test runs:

> testAll 
input      [10, 02]
encoded    [AA, A9, A6, AA]
  pass
input      [FF, 00, AA, 55]
encoded    [55, 55, AA, AA, 66, 66, 99, 99]
  pass
input      [12, 34, 56, 78, 90]
encoded    [A6, A9, 9A, A5, 96, 99, 6A, 95, AA, 69]
  pass
input      [01, 02, 03, F1, F2, F3]
encoded    [A9, AA, A6, AA, A5, AA, A9, 55, A6, 55, A5, 55]
  pass

Performance is well within spec. at 1MB in ~1.2s on my oldish laptop. It suffers because the input is convert to and from a list, rather then processed as a ByteArray.

> dd bs=1m count=1 if=/dev/urandom | time ./2040-Manchester > /dev/null
1+0 records in
1+0 records out
1048576 bytes transferred in 1.339130 secs (783028 bytes/sec)
        1.20 real         1.18 user         0.01 sys

The source, 2040-Manchester.hs, includes the code, tests, and main function for a command line filter.

\$\endgroup\$
3
\$\begingroup\$

OCaml + Batteries, 138 117 characters

let m s=Char.(String.(of_enum[?chr(170-Enum.sum[?d land
p*p|p<-List:[1;2;4;8]?])|c<-enum s/@code;d<-List:[c;c/16]?]))

Tests:

With

let hex s = String.(enum s/@(Char.code|-Printf.sprintf "%02x")|>List.of_enum|>join" ")

The results are:

m "\x12\x34\x56\x78\x90" |> hex;;
- : string = "a6 a9 9a a5 96 99 6a 95 aa 69"
m "\x10\x02" |> hex;;
- : string = "aa a9 a6 aa"
m "\xFF\x00\xAA\x55" |> hex;;
- : string = "55 55 aa aa 66 66 99 99"
m "\x12\x34\x56\x78\x90" |> hex;;
- : string = "a6 a9 9a a5 96 99 6a 95 aa 69"
m "\x01\x02\x03\xF1\xF2\xF3" |> hex;;  
- : string = "a9 aa a6 aa a5 aa a9 55 a6 55 a5 55"

As a benchmark, with:

let benchmark n =
  let t = Unix.gettimeofday() in
  assert(2*n == String.(length (m (create n))));
  Unix.gettimeofday() -. t

I get:

# benchmark 16_384;;
- : float = 0.115520954132080078

on my MacBook.

\$\endgroup\$
1
\$\begingroup\$

Python, 87 chars

M is the function requested in the problem. It calls N for each nybble and splices everything back into a list.

N=lambda x:170-(x&1|x*2&4|x*4&16|x*8&64)
M=lambda A:sum([[N(a),N(a>>4)]for a in A],[])

print map(hex,M([0x10,0x02]))
print map(hex,M([0xff,0x00,0xaa,0x55]))
print map(hex,M([0x12, 0x34, 0x56, 0x78, 0x90]))
print map(hex,M([0x01, 0x02, 0x03, 0xF1, 0xF2, 0xF3]))

generates

['0xaa', '0xa9', '0xa6', '0xaa']
['0x55', '0x55', '0xaa', '0xaa', '0x66', '0x66', '0x99', '0x99']
['0xa6', '0xa9', '0x9a', '0xa5', '0x96', '0x99', '0x6a', '0x95', '0xaa', '0x69']
['0xa9', '0xaa', '0xa6', '0xaa', '0xa5', '0xaa', '0xa9', '0x55', '0xa6', '0x55', '0xa5', '0x55']
\$\endgroup\$
1
+200
\$\begingroup\$

APL (Dyalog Extended), 22 bytes

∊(⌽(2⍴256)⊤43690-4⊥⊤)¨

Try it online!

Port of the GolfScript answer.

∊(⌽(2⍴256)⊤43690-4⊥⊤)¨       Monadic train:
  ⌽(2⍴256)⊤43690-4⊥⊤         Define a helper function taking an integer n:
                    ⊤           Convert n to base 2. Monadic ⊤ is an Extended feature.
                  4⊥            Convert the result from base 4.
                                This puts the 1 digits of n 
                                in odd indices of the intermediate result.
            43960-              Subtract from 43690.
    (2⍴256)⊤                    Convert to 2 base-256 digits, corresponding to
                                nibbles of n.
  ⌽                            Reverse the order of these bytes.
 (                   )¨       Call the helper function for each element of the input
∊                             and flatten the results into a list.
\$\endgroup\$
0
\$\begingroup\$

C, 164 bytes

Takes in a series of hex bytes and converts to manchester binary stream.

#include <stdio.h>
main(int c,char **v){int i,b,x,j=0;while(++j<c{sscanf(v[j],"%x",&b);x=b^0xff;for(i=9;--i;){printf("%d%d",x&1,b&1);x=x>>1;b=b>>1;}printf("\n");}}

#include <stdio.h>
main(int c,char **v){
int i,b,x,j=0;
while(++j<c){
    sscanf(v[j],"%x",&b);
    x=b^0xff;
    for(i=9;--i;){
        printf("%d%d",x&1,b&1);
        x=x>>1;b=b>>1;}
    printf("\n");}}

Test:

./a.out 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 00 10 20 30 40 50 60 70 80 90 a0 b0 c0 d0 e0 f0

Output:

1010101010101010
0110101010101010
1001101010101010
0101101010101010
1010011010101010
0110011010101010
1001011010101010
0101011010101010
1010100110101010
0110100110101010
1001100110101010
0101100110101010
1010010110101010
0110010110101010
1001010110101010
0101010110101010
1010101010101010
1010101001101010
1010101010011010
1010101001011010
1010101010100110
1010101001100110
1010101010010110
1010101001010110
1010101010101001
1010101001101001
1010101010011001
1010101001011001
1010101010100101
1010101001100101
1010101010010101
1010101001010101

16kb test data set generator:

test_data.c:

#include <stdio.h>
void main()
{
int i=16*1024;
while(i--)
{
    printf("0x%02x ", i&0xFF);
}
printf("\n");
}

cc test_data.c -o test_data

1.6G i5dual core time trial:

time ./a.out `./test_data` > test.out 
real    0m0.096s
user    0m0.090s
sys 0m0.011s
\$\endgroup\$
1
  • \$\begingroup\$ Nice first post, but we generally don't try to obfuscate our code. Shorter yes, harder to read no. \$\endgroup\$
    – Riker
    Commented Apr 16, 2016 at 15:16
0
\$\begingroup\$

PHP, 156 bytes

function f($i){foreach($i as$n){$b=str_split(str_replace([0,1,2],[2,'01',10],
str_pad(decbin($n),8,0,0)),8);$o[]=bindec($b[1]);$o[]=bindec($b[0]);}return$o;}

Given the input [0, 1, 2, 3, 4, 5], it returns:

[170, 170, 169, 170, 166, 170, 165, 170, 154, 170, 153, 170]

It encodes 16 KiB of data in 0.015 seconds and 1 MiB of data in about 0.9 seconds.

The ungolfed code, another implementation (longer and about twice slower) and the test cases can be found on my code-golf solutions page on Github.

\$\endgroup\$

Not the answer you're looking for? Browse other questions tagged or ask your own question.