33
\$\begingroup\$

There seems to be this ongoing craze about people tediously learning new keyboard layouts like Dvorak or Neo because it supposedly makes them more productive. I argue that switching keyboard layouts is a bad idea, because it can take you months to get up to speed, and when you're finially 5% faster than the rest, you're screwed if you need to type on a computer that isn't your own.

Also, all these people forget where the real bottleneck in modern communication lies - the telephone keypad.

This is how your average telephone keypad looks like:

The telephone keypad

Letter 'r' is the third letter on button 7; so if you were to type the letter 'r' on a mobile phone, you would press button 7 three times, for 's' you'd press it 4 times, and for 'a' you'd press button 2 once.

Considering this, putting 'e' after 'd' was probably a bad decision - 'e' is the most commonly used letter in the english alphabet, so, if you were to label button 3 "EDF" instead of "DEF", you would save quite a lot of keystrokes.

Moreover, you've probably experienced yourselves that typing 2 letters that share the same button is a nuisance - if you want to write "TU", you can't just hit 8 three times, because that would result in 'V'. So usually you'd write 'T', then hit space, then hit backspace, and then write 'U', which equals 5 button presses instead of 3.


TL;DR

Given these two rules:

  • A letter is typed by pressing a button n times, where n is the position of where the letter is at on the button's label
  • Writing two letters that are typed using the same button requires an additional 2 button presses

What's the telephone keyboard layout that requires the least amount of button presses, given a specific text? You should only use the buttons 2-9, 1 and 0 are reserved for special symbols.

Input

The text which you should find the optimal layout for is supplied via stdin. You don't need to handle anything other than the lowercase alphabet and can assume that the input consists of only that. You can also assume that the input text is reasonably large and every letter is in there at least once, if that helps.

Output

I don't want to put too many constraints on the output, since that sometimes gives some languages advantages over others; so however your language shows arrays is fine, alternatively you can seperate each label with a newline.

There may be multiple possible optimal layouts, you can print any one of them. Here's a simple example:

>> echo "jackdawslovemybigsphinxofquartz" | foo.sh
ojpt
avhz
cen
skm
dyf
wbq
ixu
lgr

Bonus Points

-35 if your algorithm is not brute-forcing all possible layouts (I'm looking at Haskell's `permutations' here)

-3 if your code fits inside a text message (140 chars) and you post a pic of you sending your code to a friend.

This is my first challenge on StackExchange. I'd be happy to hear whether you like it, or have any other feedback about it!

\$\endgroup\$
7
  • 2
    \$\begingroup\$ Welcome on CodeGolf.SE! I don't see any problem with your question, but it's generally a good idea to post your challenge in the sandbox first to get some feedback and remove ambiguities before posting on the main site. \$\endgroup\$ Commented Mar 21, 2014 at 22:04
  • \$\begingroup\$ Ah that's cool, I definitely will in the future. \$\endgroup\$
    – Flonk
    Commented Mar 21, 2014 at 22:05
  • 1
    \$\begingroup\$ My telephone can send a single 60-character SMS, but does not support brackets properly, am I out of the bonus? \$\endgroup\$
    – nanofarad
    Commented Mar 22, 2014 at 17:41
  • 1
    \$\begingroup\$ Nice question! I don't think anybody will be able to avoid the -35 bonus. Even if we restrict ourselves to layouts having 4 characters on two of the keys and 3 on all remaining 6, there are 26! / (2! * 6!) = 280,063,514,671,253,913,600,000 > 2^77 unique permutations, counting simple rearrangements of the keys only once. \$\endgroup\$
    – Dennis
    Commented Mar 24, 2014 at 4:47
  • 2
    \$\begingroup\$ Also, I ask if you people could print the number of button-pressing of your solution. So we'll see who found the best one! \$\endgroup\$ Commented Mar 30, 2014 at 20:27

6 Answers 6

5
\$\begingroup\$

Perl, 333

$_=<>;$c{$&}++while/./g;@c=sort{$c{$b}<=>$c{$a}}keys%c;$d{$&.$1}++while/.(?=(.))/g;sub f{my$x=shift;if(my$c=pop@$x){for(grep!$_[$_],0..7){my@y = @_;$y[$_]=$c;f([@$x],@y)}}else{for(0..7){$z=$_[$_];$c+=$d{$z.$_}+$d{$_.$z}for@{$a[$_]}}$c<$m?($m=$c,@n=@_):1}}while(@c){$m= ~0;f[splice@c,0,8];push@{$a[$_]},$n[$_]for 0..7}print@$_,$/for@a

Here's an attempt to optimize for rule #2. After my comment, above, and in lieu of answers taking that rule into account (cf. high question rating), I thought that I owe some effort here...

Solutions that don't optimize for rule #2 can produce output very far from optimal. I checked long natural English text ("Alice in Wonderland", actually), pre-processed (lower case letters only), and e.g. Perl script from OJW's answer, result being

2: ermx
3: tdfz
4: alp
5: oub
6: ick
7: nwv
8: hgj
9: syq

er alone ruins it, plus some other pairs should never have ended on the same key...

Btw, zxqjvkbpfmygwculdrshnioate are letters sorted, frequency ascending, from that text.

If we try to solve it easy way (hoping for -35 bonus, maybe) and place letters one by one, choosing available key by minimal pair-wise count, we can end with e.g.:

slbx
hdmz
nrf
iuj
ogv
awk
tcp
eyq

I don't post code for this (wrong) solution here. E.g., note, c is more frequent than w and is placed first. tc (ct) pairs are obviously less frequent than ac (ca) - 43+235 against 202+355. But then w ends up with a -- 598+88. We end with pairs aw and tc (964 total), though it would be better ac and tw (635 total). Etc..

So, next algorithm tries to check each 8 remaining (or 2, if last) most frequent letters against letters already on the keypad, and to place them so that pair-wise count is minimal.

$_=<>;                          # Read STDIN.
$c{$&}++while/./g;              # Count letters (%c hash).
@c=sort{$c{$b}<=>$c{$a}}keys%c; # Sort them by frequency, ascending
$d{$&.$1}++while/.(?=(.))/g;    # (@c array), and count pairs (%d hash).

                                # Next is recursive sub that does the job.
                                # Some CPAN module for permutations
                                # would probably do better...
                                # Arguments are reference to array of what's 
                                # left un-placed of current 8-pack of letters,
sub f{                          # and 8 element list of placed letters
    my$x=shift;                 # (or undefs).
    if(my$c=pop@$x){            # Pop a letter from 8-pack (if anything left),
        for(grep!$_[$_],0..7){  # try placing it on each available key, and 
            my@y = @_;          # call sub again passing updated arguments.
            $y[$_]=$c;
            f([@$x],@y)
        }
    }else{                      # If, OTOH, 8-pack is exhausted, find sum of
        for(0..7){              # pairs count of current permutation (@_) and 
            $z=$_[$_];          # letters placed in previous rounds (8-packs).
                                # @a is "array of arrays" - note, we didn't 
                                # have to initialize it. First "8-pack" will
                                # be placed on empty keypad "automatically".
                                # We re-use undefined (i.e. 0) $c.

            $c+=$d{$z.$_}+$d{$_.$z}for@{$a[$_]}
        }
        $c<$m                   # Is sum for current placement minimal?
            ?($m=$c,@n=@_)      # Then remember this minimum and placement.
            :1
    }
}

while(@c){
    $m= ~0;                         # Initialize "minimum" with large enough 
    f[splice@c,0,8];                # number, then call sub with each 8-pack
                                    # (and empty list of placed letters 
                                    # from current round). On return,
                                    # @n will have optimal arrangement.
    push@{$a[$_]},$n[$_]for 0..7    # Then place it permanently on keypad.
}
print@$_,$/for@a                    # Show us what you've done.

Result is:

sdfz
hlmx
nrv
iyp
ogk
acq
twb
euj

I don't like the ac pair (The Cat being one of the characters, after all), but, still that's optimal letter placement for English, if my code is not wrong. Not exactly 'golfing' effort, just some working solution, ugly or not.

\$\endgroup\$
3
\$\begingroup\$

Python3, it's Montecarlo Time!

To solve this problem, I first count how many "clicks" you need with the default keyboard (intially: abc,def,ghi,jkl,mno,pqrs,tuv,wxyz). Then I modify this keyboard and see if it is cheaper (the text is written in less clicks). If this keyboard is cheaper, then it becames the default one. I iterate this process 1M times.

To change the keyboard I first decide how many changes to make (the maximum number of changes is the total number of letters n the keyboard). Then, for every switch, I choose two buttons and two positions and I transfer a character from the first position to the second one.

The maximum number of switches per time is the number of letters in the keyboard becouse it is the minimum number of changes you need to switch from two complete different keyboards. (I want that it is always possible to switch from one keyboard to any other one)

The output of echo "jackdawslovemybigsphinxofquartz" | python .\myscript.py is:

61 ['anb', 'sef', 'hjc', 'iykl', 'odm', 'qgr', 'tuxv', 'wpz']

Where 61 is the number of button pressed to compose a given message.

Characters (no spaces and no comments): 577

I know its long but I'm really new to this stuff.

from random import *
S=['abc','def','ghi','jkl','mno','pqrs','tuv','wxyz']
def P(L): # perform a switch of the keys of the keyboard:to switch from a given keyboard to another, the maximum number of exchanges is the number of the keys.
    R=randint
    N = len(''.join(L))
    W = randint(1,N)   # decide how many switches to perform
    EL = list(L)
    for i in range(0,W):
        B1=R(0,len(EL)-1)   # decide what buttons are considered in the switch
        B2=R(0,len(EL)-1)
        if len(EL[B1])==0: continue   
        P1=R(0,len(EL[B1])-1)       # decide what letter to switch and where
        if len(EL[B2])==0: P2=0
        else:   P2=R(0,len(EL[B2])-1)
        C1 = EL[B1][P1]     
        EL[B1]=EL[B1].replace(C1,'')
        EL[B2]=EL[B2][:P2]+C1+EL[B2][P2:]
    return EL
def U(L,X): # count how many clicks you need to compose the text X
    S=0
    Z=' '
    for A in X:
        for T in L:
            if A in T and Z not in T: S+=1+T.index(A)
            if A in T and Z in T: S+=3+T.index(A) # if the last character was in the same button..here the penality!
        Z=A
    return S
X=input()
n_iter=10**6
L = list(S)
cc=U(L,X)
print(cc,L)
for i in range(0,n_iter): #do some montecarlo stuff
    cc=U(L,X)
    pl=P(L)
    pc=U(pl,X)
    if(cc>pc):
        L=pl 
        print(pc,L)

I found it so funny that I decided to try this algorithm with LO HOBBIT (I have an original copy at home too!). It has 383964 letters and these are the couple clicks vs keypad that I'm finding:

909007 ['abc', 'def', 'ghi', 'jkl', 'mno', 'pqrs', 'tuv', 'wxyz']
879344 ['abkc', 'def', 'gqhi', 'jl', 'mno', 'rs', 'tupv', 'wxyz']
861867 ['abg', 'def', 'qhyi', 'jcl', 'mno', 'r', 'tupxv', 'swkz']
851364 ['abg', 'e', 'qchi', 'jyl', 'mn', 'dr', 'tupxv', 'sowkfz']
829451 ['ag', 'ef', 'qchi', 'jyl', 'mn', 'dbr', 'tupxv', 'sowkz']
815213 ['amg', 'ef', 'qch', 'ojyl', 'i', 'dbnr', 'tupxv', 'swkz']
805521 ['amg', 'ef', 'ch', 'ojyl', 'qi', 'dbnr', 'tupxv', 'swkz']
773046 ['amg', 'ef', 'ch', 'ojyl', 'qi', 'bnr', 'tupxv', 'dswkz']
759208 ['amg', 'eqf', 'ch', 'ojyl', 'i', 'bnr', 'tupxv', 'dswkz']
746711 ['ag', 'ekq', 'clh', 'sojy', 'bi', 'nmfr', 'tupxv', 'dwz']
743541 ['ag', 'ekq', 'clh', 'sojy', 'bi', 'nmfr', 'tpxv', 'dwuz']
743389 ['ag', 'ekq', 'clh', 'sojy', 'i', 'nmfr', 'tpxbv', 'dwuz']
734431 ['ag', 'ekq', 'lh', 'sjy', 'ci', 'nrf', 'tpxbv', 'dowumz']
705730 ['ag', 'oekq', 'lh', 'sjy', 'ci', 'nrf', 'tpxbv', 'dwumz']
691669 ['ag', 'oekq', 'lh', 'nsjy', 'ic', 'rf', 'tpxbv', 'dwumz']
665866 ['ag', 'hokq', 'el', 'nsjy', 'ic', 'rbf', 'tpxv', 'dwumz']
661610 ['agm', 'hokq', 'e', 'nsj', 'ilc', 'rbf', 'tpyxv', 'dwuz']

So I claim this last one is one of the most practical keypad (in terms of clicks).

\$\endgroup\$
7
  • \$\begingroup\$ How do you know if it is optimal? \$\endgroup\$ Commented Mar 30, 2014 at 20:23
  • \$\begingroup\$ Montecarlo does not work in this way. It only put you nearer and nearer to the optimal solution. But let say, if in 1 million tries your solution does not change..then you're probably using the optimal one. (or you're not moving enough from this "local minimum") \$\endgroup\$ Commented Mar 30, 2014 at 20:24
  • \$\begingroup\$ So for challenges, we only need it to work most of the time then? \$\endgroup\$ Commented Mar 30, 2014 at 20:33
  • 1
    \$\begingroup\$ @PyRulez I realized that this would not be an easy problem to find an optimal solution for (except if you try all possible solutions, which I was hoping to prevent with that -35 bonus), so I actually really dig this approach. \$\endgroup\$
    – Flonk
    Commented Mar 31, 2014 at 11:11
  • 1
    \$\begingroup\$ Interesting method, but maybe this task is not exactly its domain. I checked, and, for 'Alice', default keyboard requires 291613 clicks. Optimized with my program - 195597. With 'Monte Carlo' approach I didn't get less than 207000 clicks in more than 5 million iterations. And, it's better to swap letters, i.e. 2x4 + 6x3 layout remaining constant. \$\endgroup\$ Commented Mar 31, 2014 at 19:15
2
\$\begingroup\$

Well if you just want the most popular characters assigned to the bins 2-9, Perl can do that in 127 chars...

foreach(split /\s*/,<>){$x{$_}++}
foreach(sort{$x{$b}<=>$x{$a}}keys %x){$o{$n++%8}.=$_}
for(0..7){printf "%d: %s\n",$_+2,$o{$_}}

giving something like:

echo "jackdawslovemybigsphinxofquartz" | perl ./keypad.pl
2: ajeb
3: iynz
4: suv
5: ohm
6: wkl
7: rgp
8: xfc
9: dtq

Or print it all on one line, removing 12 characters:

foreach(split /\s*/,<>){$x{$_}++}
foreach(sort{$x{$b}<=>$x{$a}}keys %x){$o[$n++%8].=$_}
print join",",values@o,"\n";
\$\endgroup\$
1
  • 2
    \$\begingroup\$ you can easily trim this to 100 chars: $x{$_}++for split/\s*/,<>;map$o{$n++%8}.=$_,sort{$x{$b}<=>$x{$a}}keys%x;print map"$_:".$o{$_-2},2..9 \$\endgroup\$
    – ardnew
    Commented Mar 23, 2014 at 23:14
1
\$\begingroup\$

Haskell, 160 - 35 = 125

import Data.List
import GHC.Exts
main=interact f where f s=show$transpose$map($sortWith(\x->length$filter(/=x)s)['a'..'z'])[t,t.d,t.d.d,d.d.d];t=take 8;d=drop 8

Example:

$ runhaskell % <<< "jackdaws loves my big sphinx of quartz"
["afpy","sgqz","ihr","ojt","bku","clv","dmw","enx"]
$ </usr/share/dict/propernames tr A-Z a-z | runhaskell % 
["atjx","edgq","rhb","nmp","iyv","lcf","ouw","skz"]

One might argue that this does not optimize for rule 2, but it does put most frequent letters on different keys.

\$\endgroup\$
1
\$\begingroup\$

JavaScript, 192 - 35 = 157

Just noticed the repeating characters rule; this doesn't take that into account. But as @mniip noted in his answer:

One might argue that this does not optimize for rule 2, but it does put most frequent letters on different keys.

o={}
a=[]
b=['','','','','','','','']
i=-1
s.split('').forEach(function(x){o[x]=o[x]?o[x]+1:1})
for(x in o)a.push([o[x],x])
a.sort().reverse().forEach(function(x){b[i=(i+1)%8]+=x[1]})
alert(b)

This would have probably been in Ruby, but I'm not at home and am being forced to use Internet Explorer (eww). But hey, it's sometimes fun using languages terrible at golfing to golf! ;)

Sample output (for your input):

avlc,sukb,otj,irh,zqg,ypf,xne,wmd

Since JS has no STDIN, the program assumes that input is stored in variable s.

\$\endgroup\$
6
  • \$\begingroup\$ Are you optimizing with this in mind as well: "Writing two letters that are typed using the same button requires an additional 2 button presses" \$\endgroup\$ Commented Mar 21, 2014 at 22:31
  • \$\begingroup\$ Re: last edit. I think the output for 'abcdefghia' is not exactly optimal. \$\endgroup\$ Commented Mar 23, 2014 at 16:23
  • \$\begingroup\$ @VadimR "You can also assume that the input text is reasonably large and every letter is in there at least once" \$\endgroup\$
    – Doorknob
    Commented Mar 23, 2014 at 16:24
  • \$\begingroup\$ I know. 'azbcdefghizjklmnopqzrstuvwxyz' \$\endgroup\$ Commented Mar 23, 2014 at 16:28
  • 1
    \$\begingroup\$ You can optimise b=['','','','','','','',''] to b=[x='',x,x,x,x,x,x,x], s.split('') to s.split(x) and o[x]=o[x]?o[x]+1:1 to o[x]=-~o[x]. \$\endgroup\$
    – Toothbrush
    Commented Mar 25, 2014 at 19:42
0
\$\begingroup\$

Python (119-35=84):

Assuming the string is a variable a, and only contains lowercase letters:

for h in range(8): print h+2,zip(*sorted([(__import__("collections").Counter(a)[d],d) for d in set(a)])[::-1])[1][h::8]

ungolfed:

import collections

#a="jackdawslovemybigsphinxofquartz"
a=__import__("string").lowercase

b=collections.Counter(a)

c=set(a)

d=[(b[d],d) for d in c]

e=sorted(d)

f=e[::-1]

g=zip(*f)[1]

for h in range(8): print h+2,g[h::8]

PYG (76-35=41):

Aaah, we can drop the enourmous import. Again, this assumes the stripped string is in a.

for h in R(8): print h+2,Z(*S([(CC(a)[d],d) for d in Se(a)])[::-1])[1][h::8]
\$\endgroup\$

Not the answer you're looking for? Browse other questions tagged or ask your own question.