15
\$\begingroup\$

Write an algorithm to interpret a sequence of letters as a Roman numeral. (see roman numeral rules below)

Each distinct letter has a matching Arabic decimal value, no maximum. But you don't have the key beforehand, so {A=10, I=1, X=5, ... Z=1000000} is decided by your interpretation.

Challenge

  1. Read input via STDIN or equivalent and write output via STDOUT or equivalent
  2. Valid inputs are combinations of uppercase and lowercase letters i.e. matching \[a-zA-Z]+\
  3. Input should be validated to see if the letter sequence can be interpreted as valid Roman numeral
  4. If the input passes validation, valid output should be the lowest Arabic decimal interpretation and the key used i.e. Aa is interpreted as 4 {a=5, A=1} not 6 {A=5, a=1} or 9 {a=10, a=1}

Roman Numeral Rules

  1. Only letters representing powers of ten can be repeated, maximum of three times successively and four times in total e.g. II III XXXIX

  2. If one or more letters are placed after another letter of greater value, add that amount

    AAaa   => 22 {A=10, a=1}          (20 + 2 = 22)  
    bbAAaa => 222 {b=100, A=10, a=1}  (200 + 20 + 2 = 222)   
    
  3. If a letter is placed before another letter of greater value, subtract that amount

    Aa    => 4 {a=5, A=1}                 (5 – 1 = 4)  
    AaA   => 19 {A=10, a=1}               (10 + 10 – 1 = 19)  
    BbBaA => 194 {B=100, b=10, A=5, a=1}  (100 + 100 - 10 + 5 - 1 = 194)  
    

    Several rules apply for subtracting amounts from Roman numerals:

    • Only subtract powers of ten i.e. 1, 10, 100... not 5, 50, 500...
    • No double subtraction therefore 18 is written as XVIII not IIXX (10 + 10 - 1 - 1)
    • Do not subtract a number from one that is more than ten times greater.
      You can subtract 1 from 5 or 10 but not from 50, 100, 500...

Example

Input:

Aa  
BAa  
CCCXLVII   
MMMCDVII  
ABADDF  
XVVX  
FAASGSH  
DXCCDA  
AaBbcDEf   

Output:

4 {a=5, A=1}  
14 {B=10, a=5, A=1}  
347 {C=100, L=50, X=10, V=5, I=1}  
347 {M=100, D=50, C=10, V=5, I=1}  
1921 {A=1000, B=100, D=10, F=1}  
'XVVX' failed Roman numeral test  
7191 {F=5000, A=1000, S=100, G=10, H=1}  
'DXCCDA' failed Roman numeral test
4444 {a=5000, A=1000, b=500, B=100, D=50, c=10, f=5, E=1}  
\$\endgroup\$
5
  • 3
    \$\begingroup\$ @IamOgbz this has turned into a great question but attracted a lot of questions in comments along the way. Now that you have enough reputation, I recommend the sandbox. I find it very useful for getting questions just right before posting. \$\endgroup\$ Commented Oct 15, 2015 at 11:25
  • \$\begingroup\$ Wouldn't CCCLXVII be interpreted as CCCXLVII, giving 347? \$\endgroup\$
    – Skyler
    Commented Oct 15, 2015 at 19:52
  • \$\begingroup\$ @Skyler you're absolutely right, will update now! thanks. \$\endgroup\$
    – iamogbz
    Commented Oct 15, 2015 at 20:39
  • \$\begingroup\$ I don't see any restriction on what values the individual letters can have (and indeed you mention 20, which is not the value of a standard Roman numeral). Do you mean to say that any positive integer can be represented by a Roman numeral? In that case, Aa has a value of 1 (A=1, a=2). \$\endgroup\$
    – msh210
    Commented Oct 26, 2015 at 15:48
  • \$\begingroup\$ @msh210 as the letters can only be interpreted as Roman Numerals, it follows that individual letter values can only be powers of 10 or 5 times powers of 10. 20 was only mentioned in relation to combining two roman numerals (and to stress that IXX = 19 is not a valid subtraction). Hope that clears it up for you. \$\endgroup\$
    – iamogbz
    Commented Oct 27, 2015 at 4:33

1 Answer 1

1
\$\begingroup\$

Python 2, 415 444 440 419 416 bytes

There aren't all that many Roman numerals, after all. This script creates all of them and checks all permutations of the input, then returns the smallest match.

a=raw_input()
g=range
b=list(set(a))+[' ']*9
from itertools import*
c=[]
s={}
u=1000
for i in g(10*u):
 t,f=(10*u,9*u,5*u,4*u,u,900,500,400,100,90,50,40,10,9,5,4,1),i;r=""
 for j in g(17):k=i/t[j];r+=('W MW Q MQ M CM D CD C XC L XL X IX V IV I').split()[j]*k;i-=t[j]*k
 s[r]=f
for i in permutations(b[:9]):
 r=''
 for j in a:r+='IVXLCMQWE'[i.index(j)]
 if r in s:c+=[s[r]]
print c and min(c)or'%s failed Roman numeral test'%a
\$\endgroup\$
8
  • \$\begingroup\$ That's a good answer to the challenge as it is now. But in the comment conversation that was wiped out early, it was hinted that this (not real) system goes on after M=1000, having symbols for 5k, 10k and so on. Look at the first example at top: {A=10, I=1, X=5, ... Z=1000000} is decided by your interpretation \$\endgroup\$
    – edc65
    Commented Oct 15, 2015 at 20:53
  • \$\begingroup\$ .., and the last example, a=5000 ... \$\endgroup\$
    – edc65
    Commented Oct 15, 2015 at 20:54
  • \$\begingroup\$ I updated it to handle all of the given test cases. I doubt this approach can be extended past 10,000 as it takes O(n!) time on the number of Roman digits. \$\endgroup\$
    – Skyler
    Commented Oct 15, 2015 at 21:18
  • \$\begingroup\$ @Skyler the test cases aren't exhaustive. The program should handle all possible permutations of the valid inputs that can be interpreted as per the roman numeral rules, with preference given to lower numbers in ambiguous cases. Also your code couldn't handle the last test case link \$\endgroup\$
    – iamogbz
    Commented Oct 15, 2015 at 22:37
  • \$\begingroup\$ Isn't import itertools as i and then i.permutations shorter? \$\endgroup\$
    – cat
    Commented Oct 24, 2015 at 14:12

Not the answer you're looking for? Browse other questions tagged or ask your own question.