Parse a C++14 integer literal

Question

According to http://en.cppreference.com/w/cpp/language/integer_literal, integer literals consist of a decimal/hex/octal/binary literal and an optional integer suffix, which is obviously completely unnecessary, wastes precious bytes and is not used in this challenge.

A decimal literal is a non-zero decimal digit (1, 2, 3, 4, 5, 6, 7, 8, 9), followed by zero or more decimal digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9).

An octal literal is the digit zero (0) followed by zero or more octal digits (0, 1, 2, 3, 4, 5, 6, 7).

A hexadecimal literal is the character sequence 0x or the character sequence 0X followed by one or more hexadecimal digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, A, b, B, c, C, d, D, e, E, f, F) (note the case-insensitivity of abcdefx).

A binary literal is the character sequence 0b or the character sequence 0B followed by one or more binary digits (0, 1).

Additionally, the literal may contain some ' characters as digit separators. They have no meaning and can be ignored.
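The rules above can be read as a small reference parser. The following Python sketch is only an illustration of the grammar as stated, not a golfed entry; the function name `parse_cxx14_literal` is made up for this example, and it assumes valid input without a suffix.

```python
def parse_cxx14_literal(text: str) -> int:
    """Return the value of a valid, suffix-free C++14 integer literal."""
    # Digit separators carry no meaning, so drop them first.
    s = text.replace("'", "").lower()
    if s.startswith("0x"):
        base, digits = 16, s[2:]
    elif s.startswith("0b"):
        base, digits = 2, s[2:]
    elif s.startswith("0"):
        # A lone "0" is an octal literal with zero further digits.
        base, digits = 8, s[1:] or "0"
    else:
        base, digits = 10, s
    return int(digits, base)
```

The prefix checks are ordered so that 0x and 0b are tested before the bare leading zero, since all three forms start with the same character.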

Input

A string that represents a C++14 integer literal or an array of its charcodes.

Output

The number represented by the input string in base 10, with an optional trailing newline. The correct output will never exceed 2*10^9.

Winning criteria

The GCC contributors need over 500 lines of code to do this, therefore our code must be as short as possible!

Test cases:

0                       ->    0
1                       ->    1
12345                   ->    12345
12345'67890             ->    1234567890
0xFF                    ->    255
0XfF                    ->    255
0xAbCdEf                ->    11259375
0xa'bCd'eF              ->    11259375
0b1111'0000             ->    240
0b0                     ->    0
0B1'0                   ->    2
0b1                     ->    1
00                      ->    0
01                      ->    1
012345                  ->    5349
0'123'4'5               ->    5349

@LuisfelipeDejesusMunoz No; how did you expect that to be parsed? — the default., Commented May 16, 2019 at 13:19
I assume simply writing a function in C++14 would be cheating, right? Since the compiler already does it automatically (even if it is internally 500+ lines of code...) — Darrel Hoffman, Commented May 16, 2019 at 20:22
@DarrelHoffman You couldn't just do it with "a function in C++14" though, since that wouldn't take a string input. Maybe with some script that invokes a C++ compiler. — aschepler, Commented May 16, 2019 at 21:05
The string 0 might be a good test case to add (it revealed a bug in one of my recent revisions). — Daniel Schepler, Commented May 17, 2019 at 1:25

Daniel Schepler · Accepted Answer · 2019-05-17 16:04:54Z

x86 (32-bit) machine code, 59 57 bytes

This function takes esi as a pointer to a null-terminated string and returns the value in edx. (Listing below is GAS input in AT&T syntax.)

        .globl parse_cxx14_int
        .text
parse_cxx14_int:
        push $10
        pop %ecx                # store 10 as base
        xor %eax,%eax           # initialize high bits of digit reader
        cdq                     # also initialize result accumulator edx to 0
        lodsb                   # fetch first character
        cmp $'0', %al
        jne .Lparseloop2
        lodsb
        and $~32, %al           # uppercase letters (and as side effect,
                                # digits are translated to N+16)
        jz .Lend                # "0" string
        cmp $'B', %al           # after '0' have either digit, apostrophe,
                                # 'b'/'B' or 'x'/'X'
        je .Lbin
        jg .Lhex
        dec %ecx
        dec %ecx                # update base to 8
        jmp .Lprocessdigit      # process octal digit that we just read (or
                                # skip ' if that is what we just read)   
.Lbin:
        sub $14, %ecx           # with below will update base to 2
.Lhex:
        add $6, %ecx            # update base to 16
.Lparseloop:
        lodsb                   # fetch next character
.Lparseloop2:
        and $~32, %al           # uppercase letters (and as side effect,
                                # digits are translated to N+16)
        jz .Lend
.Lprocessdigit:
        cmp $7, %al             # skip ' (ASCII 39 which would have been
                                # translated to 7 above)
        je .Lparseloop
        test $64, %al           # distinguish letters and numbers
        jz .Lnum
        sub $39, %al            # with below will subtract 55 so e.g. 'A'==65
                                # will become 10
.Lnum:
        sub $16, %al            # translate digits to numerical value
        imul %ecx, %edx
#        movzbl %al, %eax
        add %eax, %edx          # accum = accum * base + newdigit
        jmp .Lparseloop
.Lend:
        ret
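For readers who do not read AT&T syntax, the control flow of the listing can be mirrored in Python. This is only a sketch for illustration, not part of the answer: the names `parse_like_asm` and `lodsb_masked` are invented here, and Python integers stand in for the 8-bit and 32-bit registers.

```python
def parse_like_asm(text: str) -> int:
    """Follow the listing's control flow; Python ints stand in for registers."""
    stream = iter(text.encode("ascii") + b"\0")   # NUL-terminated input

    def lodsb_masked() -> int:
        # lodsb, then `and $~32, %al`: letters are uppercased, '0'..'9'
        # become 16..25, "'" (39) becomes 7, and NUL or space become 0.
        return next(stream) & ~32

    base, acc = 10, 0
    first = next(stream)                  # first lodsb, compared unmasked
    al = first & ~32
    if first == ord("0"):
        al = lodsb_masked()
        if al == ord("B"):                # .Lbin: 10 - 14 + 6 = 2
            base, al = 2, lodsb_masked()
        elif al > ord("B"):               # .Lhex: 10 + 6 = 16
            base, al = 16, lodsb_masked()
        else:                             # digit, separator or terminator
            base = 8
    while al:                             # jz .Lend
        if al != 7:                       # skip the digit separator
            if al & 64:                   # letter: 'A' (65) -> 26
                al -= 39
            acc = acc * base + (al - 16)  # 16..31 -> 0..15
        al = lodsb_masked()
    return acc
```

Clearing bit 5 is what lets one instruction both uppercase the letters and detect the terminator, at the cost of shifting the digit codes down to 16..25.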

And a disassembly with byte counts - in Intel format this time, in case you prefer that one.

Disassembly of section .text:

00000000 <parse_cxx14_int>:
   0:   6a 0a                   push   0xa
   2:   59                      pop    ecx
   3:   31 c0                   xor    eax,eax
   5:   99                      cdq    
   6:   ac                      lods   al,BYTE PTR ds:[esi]
   7:   3c 30                   cmp    al,0x30
   9:   75 16                   jne    21 <parse_cxx14_int+0x21>
   b:   ac                      lods   al,BYTE PTR ds:[esi]
   c:   24 df                   and    al,0xdf
   e:   74 28                   je     38 <parse_cxx14_int+0x38>
  10:   3c 42                   cmp    al,0x42
  12:   74 06                   je     1a <parse_cxx14_int+0x1a>
  14:   7f 07                   jg     1d <parse_cxx14_int+0x1d>
  16:   49                      dec    ecx
  17:   49                      dec    ecx
  18:   eb 0b                   jmp    25 <parse_cxx14_int+0x25>
  1a:   83 e9 0e                sub    ecx,0xe
  1d:   83 c1 06                add    ecx,0x6
  20:   ac                      lods   al,BYTE PTR ds:[esi]
  21:   24 df                   and    al,0xdf
  23:   74 13                   je     38 <parse_cxx14_int+0x38>
  25:   3c 07                   cmp    al,0x7
  27:   74 f7                   je     20 <parse_cxx14_int+0x20>
  29:   a8 40                   test   al,0x40
  2b:   74 02                   je     2f <parse_cxx14_int+0x2f>
  2d:   2c 27                   sub    al,0x27
  2f:   2c 10                   sub    al,0x10
  31:   0f af d1                imul   edx,ecx
  34:   01 c2                   add    edx,eax
  36:   eb e8                   jmp    20 <parse_cxx14_int+0x20>
  38:   c3                      ret

And in case you want to try it, here is the C++ test driver code that I linked with it (including the calling convention specification in GCC asm syntax):

#include <cstdio>
#include <string>
#include <iostream>

inline int parse_cxx14_int_wrap(const char *s) {
    int result;
    const char* end;
    __asm__("call parse_cxx14_int" :
            "=d"(result), "=S"(end) :
            "1"(s) :
            "eax", "ecx", "cc");
    return result;
}

int main(int argc, char* argv[]) {
    std::string s;
    while (std::getline(std::cin, s))
        std::printf("%-16s -> %d\n", s.c_str(), parse_cxx14_int_wrap(s.c_str()));
    return 0;
}

-1 byte due to comment by Peter Cordes

-1 byte from updating to use two decrements to change 10 to 8

Only you're missing tests for overflows... Too large a number gets reported by compilers. — Alexis Wilke, Commented May 17, 2019 at 4:38
Can you swap your register usage for rdx and rbx? Then you can use 1-byte cdq to zero rdx from eax. — Peter Cordes, Commented May 17, 2019 at 10:48
This should either list the byte count of your assembly, or be labelled as 59 bytes of x86 machine code. — Potato44, Commented May 17, 2019 at 12:20
@PeterCordes Thanks, didn't know about that one. (Also, on looking at it again, I noticed that changing the base from 10 to 8 could be 2 bytes - from two decrement instructions - instead of 3 bytes.) — Daniel Schepler, Commented May 17, 2019 at 15:56
@AlexisWilke It also doesn't test for invalid format (e.g. digits out of range of the given base) which compilers would also do. But according to the problem statement, the input is guaranteed to be valid and not to overflow a 32-bit signed integer. — Daniel Schepler, Commented May 17, 2019 at 16:49

Luis felipe De jesus Munoz · Accepted Answer · 2019-05-16 13:18:11Z

JavaScript (Babel Node), 26 bytes

lol x2

_=>eval(_.split`'`.join``)

Try it online!

Luis felipe De jesus Munoz, answered May 16, 2019 at 13:18

This isn't BabelJS exclusive, it works from ES6 onwards – Bassdrop Cumberwubwubwub, Commented May 16, 2019 at 13:29
@BassdropCumberwubwubwub, the header was probably copied from TIO. – Shaggy, Commented May 16, 2019 at 17:41
Nice, I first tried to use Number because it handles binary and hex, but apparently not octal Number("010") === 10 – Carl Walsh, Commented May 16, 2019 at 21:48

Daniel Schepler · Accepted Answer · 2019-05-19 01:28:29Z

C++ (gcc), 141 138 134 120 bytes

This is a function that takes an array of characters (specified as a pair of pointers to the start and end - using the pair of iterators idiom) and returns the number. Note that the function mutates the input array.

(This does rely on the behavior of gcc/libstdc++ that #include<cstdlib> also places the functions in global scope. For strictly standard compliant code, replace with #include<stdlib.h> for a cost of one more character.)

Brief description: The code first uses std::remove to filter out ' characters (ASCII 39). Then, strtol with a base of 0 will already handle the decimal, octal, and hexadecimal cases, so the only other case to check for is a leading 0b or 0B and if so, set the base for strtol to 2 and start parsing after the leading 2 characters.

#import<algorithm>
#import<cstdlib>
int f(char*s,char*e){e=s[*std::remove(s,e,39)=1]&31^2?s:s+2;return strtol(e,0,e-s);}
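The same strategy reads more easily outside golfed form. Here is a rough Python sketch of it, under the assumption of valid input; `strtol_base0` is a hypothetical stand-in for C's strtol with base 0, not a real library function, and `parse_via_strtol` is likewise named only for this example.

```python
def strtol_base0(s: str) -> int:
    """Rough stand-in for C's strtol(s, 0, 0) on valid, unsigned input."""
    if s[:2].lower() == "0x":
        return int(s[2:], 16)
    if s.startswith("0"):
        return int(s, 8)          # leading zeros are fine with an explicit base
    return int(s, 10)


def parse_via_strtol(text: str) -> int:
    s = text.replace("'", "")     # the job std::remove(s, e, 39) does
    # ord('b') & 31 == ord('B') & 31 == 2, and no other character that can
    # legally appear in second position has that property.
    if len(s) > 1 and ord(s[1]) & 31 == 2:
        return int(s[2:], 2)      # like strtol(s + 2, 0, 2)
    return strtol_base0(s)        # like strtol(s, 0, 0)
```

In the golfed C++ the pointer difference e-s doubles as the base argument: it is 2 when the prefix was skipped and 0 otherwise.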

Try it online.

Saved 3 bytes due to suggestion by ceilingcat and some more golfing that followed.

Saved 4 bytes due to suggestions by gastropner.

-2 bytes by Lucas

-12 bytes by l4m2

Daniel Schepler, answered May 16, 2019 at 21:17, edited May 19, 2019 at 1:28

134 bytes – gastropner, Commented May 17, 2019 at 19:54
Incorporated, thanks. – Daniel Schepler, Commented May 17, 2019 at 20:52
132 bytes by using the deprecated #import instead of #include? – Lucas, Commented May 17, 2019 at 23:41
If invalid input is undefined behavior, no need to check if 1st char is 0 for base 2 – l4m2, Commented May 18, 2019 at 12:19
so 124 – l4m2, Commented May 18, 2019 at 12:23

Luis felipe De jesus Munoz · Accepted Answer · 2019-06-04 12:11:27Z

Japt, 6 bytes

OxUr"'

OxUr"'  Full Program. Implicit Input U
  Ur"'  Remove ' from U
Ox      Eval as javascript

Try it online!

Luis felipe De jesus Munoz, answered May 16, 2019 at 14:46, edited Jun 4, 2019 at 12:11

How does this work? – lirtosiast, Commented Jun 4, 2019 at 9:35
@lirtosiast Basically the same as my JS answer. I remove ' from the input and then evaluate it as JS – Luis felipe De jesus Munoz, Commented Jun 4, 2019 at 12:12

hyper-neutrino · Accepted Answer · 2019-05-16 13:28:27Z

Python 2, 32 bytes

lambda a:eval(a.replace("'",""))

Try it online!

lol

(needs Python 2 because Python 3 changed octal literals to 0o(...)).
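For comparison, a Python 3 version has to translate the octal spelling itself. The sketch below is one possible way to do that, not the answer's method; `parse_py3` is a name chosen for this example. It relies on `int(s, 0)`, which infers the base from a 0x, 0o or 0b prefix.

```python
import re


def parse_py3(text: str) -> int:
    s = text.replace("'", "")
    # Python 3 spells octal as 0o..., so rewrite a leading 0 that is followed
    # by another digit; 0x / 0b prefixes and a lone "0" are left untouched.
    s = re.sub(r"^0(?=\d)", "0o", s)
    return int(s, 0)   # base 0: infer the base from the prefix
```

The extra substitution is what the Python 2 answer avoids by letting the old literal syntax do the work.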

hyper-neutrino, answered May 16, 2019 at 13:15, edited May 16, 2019 at 13:28

we've truly gone full circle at this point – osuka_, Commented May 17, 2019 at 23:54

Community · Accepted Answer · 2020-06-17 09:04:33Z

Perl 5 (-p), 14 bytes

y/'/_/;$_=eval

TIO

Nahuel Fouilleul, answered May 16, 2019 at 13:38; edited Jun 17, 2020 at 9:04 by Community Bot

Giuseppe · Accepted Answer · 2019-05-16 15:40:45Z

R, 79 71 69 bytes

`+`=strtoi;s=gsub("'","",scan(,""));na.omit(c(+s,sub("..",0,s)+2))[1]

Try it online!

strtoi does everything except for the base 2 conversions and ignoring the ', so there's quite a lot of bytes just to fix those things.

Thanks to Aaron Hayman for -6 bytes, and inspiring -4 more bytes (and counting!)

Verify all test cases (old version)

Giuseppe, answered May 16, 2019 at 13:33, edited May 16, 2019 at 15:40

can save a byte replacing sub("0b|B" with sub("b|B", since the leading "0" will not affect the value. Can get another by renaming strtoi – Aaron Hayman, Commented May 16, 2019 at 14:29
74 bytes: Try it online! – Aaron Hayman, Commented May 16, 2019 at 14:52
@AaronHayman wow, I've never seen na.omit before. Super handy here, and I golfed a bit more off :-) – Giuseppe, Commented May 16, 2019 at 15:00
If we assume every fail of the first strtoi is a binary, you can use substring instead of sub to save another byte: Try it online! – Aaron Hayman, Commented May 16, 2019 at 15:18
@AaronHayman we can strip off the first 2 characters of s using sub instead with sub('..','',s) which is another byte shorter! – Giuseppe, Commented May 16, 2019 at 15:40

Emigna · Accepted Answer · 2019-05-17 13:25:55Z

05AB1E, 16 14 bytes

Saved 2 bytes thanks to Grimy

''KlÐïK>i8ö}.E

Try it online! or as a Test Suite

Explanation

''K                # remove "'" from input
   l               # and convert to lower-case
    Ð              # triplicate
     ï             # convert one copy to integer
      K            # and remove it from the second copy
       >i  }       # if the result is 0
         8ö        # convert from base-8 to base-10
            .E     # eval

Emigna, answered May 16, 2019 at 14:08, edited May 17, 2019 at 13:25

-2 bytes – Grimmy, Commented May 17, 2019 at 13:00
And here's a fake 13 (passes all the test cases, but fails on e.g. 0010). – Grimmy, Commented May 17, 2019 at 13:08
@Grimy: Thanks! Cool use of ï! – Emigna, Commented May 17, 2019 at 13:26

Community · Accepted Answer · 2020-06-17 09:04:33Z

Excel, 115 bytes

=DECIMAL(SUBSTITUTE(REPLACE(A1,2,1,IFERROR(VALUE(MID(A1,2,1)),)),"'",),VLOOKUP(A1,{"0",8;"0B",2;"0X",16;"1",10},2))

Input from A1, output to wherever you put this formula. Array formula, so use Ctrl+Shift+Enter to enter it.

I added a couple test cases you can see in the image - some early attempts handled all given test cases correctly but got rows 16 and/or 17 wrong.

Sophia Lechner, answered May 17, 2019 at 23:34; edited Jun 17, 2020 at 9:04 by Community Bot

Is it against the rules to omit the final two closing parentheses and take advantage of the fact that the “compiler” (pressing return or tab) will error-correct for you? – Lucas, Commented Jun 15, 2019 at 3:13
In my personal opinion, yes. I don't think there's a site consensus. Excel adding the parentheses feels like the equivalent of a code-completion feature in another language's IDE, which should be ignored for byte counting. (But, I think "?" should be counted as 1 byte in BASIC even though it will be silently expanded to "PRINT" so maybe I'm not entirely consistent here). – Sophia Lechner, Commented Jun 17, 2019 at 17:48

Peter Cordes · Accepted Answer · 2019-05-19 01:29:18Z

x86-64 machine code, 44 bytes

(The same machine code works in 32-bit mode as well.)

@Daniel Schepler's answer was a starting point for this, but this has at least one new algorithmic idea (not just better golfing of the same idea): The ASCII codes for 'B' (1000010) and 'X' (1011000) give 2 and 16 after masking with 0b0010010.

So after excluding decimal (non-zero leading digit) and octal (char after '0' is less than 'B'), we can just set base = c & 0b0010010 and jump into the digit loop.
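The masking trick is easy to check by hand or with a few lines of Python. This is just a verification sketch; `MASK` and `base_from_prefix_letter` are names invented for the example.

```python
MASK = 0b0010010


def base_from_prefix_letter(c: str) -> int:
    """Map the letter after a leading '0' (b, B, x or X) to its base."""
    return ord(c) & MASK


for letter in "bBxX":
    # Show the 7-bit ASCII pattern next to the base the mask extracts.
    print(letter, format(ord(letter), "07b"), base_from_prefix_letter(letter))
```

Both cases of each letter work because the mask ignores bit 5, the only bit in which upper and lower case differ.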

Callable with x86-64 System V as unsigned __int128 parse_cxx14_int(int dummy, const char*rsi); Extract the EDX return value from the high half of the unsigned __int128 result with tmp>>64.

        .globl parse_cxx14_int
## Input: pointer to 0-terminated string in RSI
## output: integer in EDX
## clobbers: RAX, RCX (base), RSI (points to terminator on return)
parse_cxx14_int:
        xor %eax,%eax           # initialize high bits of digit reader
        cdq                     # also initialize result accumulator edx to 0
        lea 10(%rax), %ecx      # base 10 default
        lodsb                   # fetch first character
        cmp $'0', %al
        jne .Lentry2
    # leading zero.  Legal 2nd characters are b/B (base 2), x/X (base 16)
    # Or NUL terminator = 0 in base 10
    # or any digit or ' separator (octal).  These have ASCII codes below the alphabetic ranges
    lodsb

    mov    $8, %cl              # after '0' have either digit, apostrophe, or terminator,
    cmp    $'B', %al            # or 'b'/'B' or 'x'/'X'  (set a new base)
    jb   .Lentry2               # enter the parse loop with base=8 and an already-loaded character
         # else hex or binary. The bit patterns for those letters are very convenient
    and    $0b0010010, %al      # b/B -> 2,   x/X -> 16
    xchg   %eax, %ecx
    jmp  .Lentry

.Lprocessdigit:
    sub  $'0' & (~32), %al
    jb   .Lentry                 # chars below '0' are treated as a separator, including '
    cmp  $10, %al
    jb  .Lnum
    add  $('0'&~32) - 'A' + 10, %al   # digit value = c-'A' + 10.  we have al = c - '0'&~32.
                                        # c = al + '0'&~32.  val = m+'0'&~32 - 'A' + 10
.Lnum:
        imul %ecx, %edx
        add %eax, %edx          # accum = accum * base + newdigit
.Lentry:
        lodsb                   # fetch next character
.Lentry2:
        and $~32, %al           # uppercase letters (and as side effect,
                                # digits are translated to N+16)
        jnz .Lprocessdigit      # space also counts as a terminator
.Lend:
        ret

The changed blocks vs. Daniel's version are (mostly) indented less than the other instructions. Also, the main loop has its conditional branch at the bottom. This turned out to be a neutral change because neither path could fall into the top of it, and the dec ecx / loop .Lentry idea for entering the loop turned out not to be a win after handling octal differently. But it has fewer instructions inside the loop, with the loop in the idiomatic do{}while form, so I kept it.

Daniel's C++ test harness works unchanged in 64-bit mode with this code, which uses the same calling convention as his 32-bit answer.

g++ -Og parse-cxx14.cpp parse-cxx14.s &&
./a.out < tests | diff -u -w - tests.good

Disassembly, including the machine code bytes that are the actual answer

0000000000000000 <parse_cxx14_int>:
   0:   31 c0                   xor    %eax,%eax
   2:   99                      cltd   
   3:   8d 48 0a                lea    0xa(%rax),%ecx
   6:   ac                      lods   %ds:(%rsi),%al
   7:   3c 30                   cmp    $0x30,%al
   9:   75 1c                   jne    27 <parse_cxx14_int+0x27>
   b:   ac                      lods   %ds:(%rsi),%al
   c:   b1 08                   mov    $0x8,%cl
   e:   3c 42                   cmp    $0x42,%al
  10:   72 15                   jb     27 <parse_cxx14_int+0x27>
  12:   24 12                   and    $0x12,%al
  14:   91                      xchg   %eax,%ecx
  15:   eb 0f                   jmp    26 <parse_cxx14_int+0x26>
  17:   2c 10                   sub    $0x10,%al
  19:   72 0b                   jb     26 <parse_cxx14_int+0x26>
  1b:   3c 0a                   cmp    $0xa,%al
  1d:   72 02                   jb     21 <parse_cxx14_int+0x21>
  1f:   04 d9                   add    $0xd9,%al
  21:   0f af d1                imul   %ecx,%edx
  24:   01 c2                   add    %eax,%edx
  26:   ac                      lods   %ds:(%rsi),%al
  27:   24 df                   and    $0xdf,%al
  29:   75 ec                   jne    17 <parse_cxx14_int+0x17>
  2b:   c3                      retq

Other changes from Daniel's version include saving the sub $16, %al from inside the digit-loop, by using more sub instead of test as part of detecting separators, and digits vs. alphabetic characters.

Unlike Daniel's, every character below '0' is treated as a separator, not just '\''. (Except ' ': and $~32, %al / jnz in both our loops treats space as a terminator, which is possibly convenient for testing with an integer at the start of a line.)

Every operation that modifies %al inside the loop has a branch consuming flags set by the result, and each branch goes (or falls through) to a different location.
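The digit loop described above can be modelled in Python as follows. This is a sketch under simplifying assumptions: the base is passed in already chosen, Python integers replace the 8-bit register (so a borrow shows up as a negative number), and the end of the Python string stands in for the NUL terminator. The name `digit_loop` is made up for this example.

```python
def digit_loop(chars: str, base: int) -> int:
    """Model of the .Lentry/.Lprocessdigit loop for an already-chosen base."""
    zero = ord("0") & ~32               # 16: what '0' becomes after masking
    acc = 0
    for ch in chars:
        al = ord(ch) & ~32              # and $~32, %al
        if al == 0:                     # NUL or space terminates the number
            break
        al -= zero                      # sub $16, %al
        if al < 0:                      # jb: the character was below '0'
            continue                    # so it is treated as a separator
        if al >= 10:                    # a letter rather than a decimal digit
            al += zero - ord("A") + 10  # add $0xd9, i.e. subtract 39
        acc = acc * base + al           # imul / add
    return acc
```

For example, `digit_loop("1,000", 10)` skips the comma just as it would skip an apostrophe, which is the behaviour difference from Daniel's version noted above.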

Do you even need the initialization of eax given that AIUI in 64-bit mode opcodes with small destination will reset the higher bits to 0? — Daniel Schepler, Commented May 20, 2019 at 23:40
@Daniel: writing a 32-bit register zero-extends to 64-bit. Writing an 8 or 16-bit register keeps the behaviour from other modes: merge into the existing value. AMD64 didn't fix the false dependency for 8 and 16-bit registers, and didn't change setcc r/m8 into setcc r/m32, so we still need a stupid 2-instruction xor-zero / set flags / setcc %al sequence to create a 32/64-bit 0 or 1 variable, and it needs the zeroed register before the flag-setting. (Or use mov $0, %eax instead, or use movzx on the critical path). — Peter Cordes, Commented May 21, 2019 at 23:45

Neil · Accepted Answer · 2019-05-17 00:37:51Z

Retina, 96 bytes

T`'L`_l
\B
:
^
a;
a;0:x:
g;
a;0:b:
2;
a;0:
8;
[a-g]
1$&
T`l`d
+`;(\d+):(\d+)
;$.($`*$1*_$2*
.+;

Try it online! Link includes test suite. Explanation:

T`'L`_l

Delete 's and convert everything to lower case.

\B
:

Separate the digits, as any hex digits need to be converted into decimal.

^
a;
a;0:x:
g;
a;0:b:
2;
a;0:
8;

Identify the base of the number.

[a-g]
1$&
T`l`d

Convert the characters a-g into numbers 10-16.

+`;(\d+):(\d+)
;$.($`*$1*_$2*

Perform base conversion on the list of digits. $.($`*$1*_$2* is short for $.($`*$1*_$2*_) which multiplies $` and $1 together and adds $2. ($` is the part of the string before the ; i.e. the base.)

.+;

Delete the base.
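The last two stages amount to mapping letters to numbers and then folding the digit list with "multiply by the base, add the next digit". A small Python sketch of both steps follows; the function names are invented here and this is not a translation of the Retina program itself.

```python
from functools import reduce


def letter_to_number(c: str) -> str:
    """Mimic `[a-g] -> 1$&` followed by `T`l`d`: 'a' -> "10", ..., 'g' -> "16"."""
    return "1" + str(ord(c) - ord("a"))


def from_digits(base: int, digits) -> int:
    """Fold a most-significant-first digit list: acc = acc * base + digit."""
    return reduce(lambda acc, d: acc * base + d, digits, 0)
```

With these, `from_digits(16, [15, 15])` corresponds to the string 16;15:15 that the Retina program would reduce step by step.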

Neil, answered May 17, 2019 at 0:37

I appreciate the literal programming approach you took to explain the code :-) – grooveplex, Commented May 17, 2019 at 23:12

FrownyFrog · Accepted Answer · 2019-05-17 09:09:46Z

J, 48 bytes

cut@'0x 16b +0b 2b +0 8b0 '''do@rplc~'+',tolower

Try it online!

Eval after string substitution.

0XfF -> +16bff -> 255
0xa'bCd'eF -> +16babcdef -> 11259375
0B1'0 -> +2b10 -> 2
0 -> 8b0 -> 0
01 -> 8b01 -> 1
0'123'4'5 -> 8b012345 -> 5349

FrownyFrog, answered May 16, 2019 at 16:16, edited May 17, 2019 at 9:09

It doesn't seem to work correctly with hexadecimals containing 0b: tio.run/##FcwxCsIwFAbg/… – Galen Ivanov, Commented May 16, 2019 at 19:21
@GalenIvanov nice find, fixed – FrownyFrog, Commented May 16, 2019 at 19:35

nwellnhof · Accepted Answer · 2019-05-17 10:30:49Z

Perl 6, 29 bytes

{+lc S/^0)>\d/0o/}o{S:g/\'//}

Try it online!

Perl 6 requires an explicit 0o prefix for octal and doesn't support uppercase prefixes like 0X.

Explanation

                   {S:g/\'//}  # remove apostrophes
{                }o  # combine with function
     S/^0)>\d/0o/    # 0o prefix for octal
  lc  # lowercase
 +    # convert to number

nwellnhof, answered May 17, 2019 at 10:20, edited May 17, 2019 at 10:30

Expired Data · Accepted Answer · 2019-05-17 14:20:08Z

Octave, 29 21 20 bytes

@(x)str2num(x(x>39))

Try it online!

-8 bytes thanks to @TomCarpenter

Expired Data, answered May 16, 2019 at 13:56, edited May 17, 2019 at 14:20

For 22 bytes: @(x)str2num(x(x~="'")) – Tom Carpenter, Commented May 17, 2019 at 8:48
Which becomes for 21 bytes: @(x)str2num(x(x~=39)) – Tom Carpenter, Commented May 17, 2019 at 8:50
Octal doesn't appear to be working (at least on TIO)... for example, f=("077") returns ans = 77 when it should be 63. Or, as in the test case in OP f=("012345") should return 5349 but instead ans = 12345 – brhfl, Commented Jun 3, 2019 at 15:41

Community · Accepted Answer · 2020-06-17 09:04:33Z

Bash, 33 bytes

x=${1//\'};echo $[${x/#0[Bb]/2#}]

TIO

Zsh, 29 27 bytes

-2 bytes thanks to @GammaFunction

<<<$[${${1//\'}/#0[Bb]/2#}]

TIO

Nahuel Fouilleul, answered May 16, 2019 at 14:30; edited Jun 17, 2020 at 9:04 by Community Bot

Clever! I would have thought setopt octalzeroes would be necessary for Zsh. – GammaFunction, Commented Jun 15, 2019 at 5:46
You can save 2 bytes in Zsh with <<<$[...] instead of echo $[...] – GammaFunction, Commented Jun 15, 2019 at 5:47
thanks, i didn't know that zsh empty command with redirection could display output, i don't know much about zsh, i know a lot better bash – Nahuel Fouilleul, Commented Jun 15, 2019 at 17:23
i knew that bash automatically interpret numbers with leading zero to octal, and must be removed for example in date / time – Nahuel Fouilleul, Commented Jun 15, 2019 at 17:29

vityavv · Accepted Answer · 2019-05-16 17:27:03Z

Go, 75 bytes

import "strconv"
func(i string)int64{n,_:=strconv.ParseInt(i,0,0);return n}

vityavv, answered May 16, 2019 at 17:27

This doesn't appear to work for binary literals, nor for single-quote digit delimiters. – Nick Matteo, Commented May 17, 2019 at 18:21
Oh crap. I'll fix it soon. Completely forgot about the delimiters – vityavv, Commented May 21, 2019 at 16:17

Naruyoko · Accepted Answer · 2019-05-16 18:12:16Z

JavaScript (ES6), 112 bytes

n=>+(n=n.toLowerCase().replace(/'/g,""))?n[1]=="b"?parseInt(n.substr(2),2):parseInt(n,+n[0]?10:n[1]=="x"?16:8):0

Naruyoko, answered May 16, 2019 at 18:12

Nick Kennedy · Accepted Answer · 2019-05-16 23:26:16Z

Jelly, 27 bytes

ØDiⱮḢ=1aƲȦ
ṣ”';/Ḋ⁾0o;Ɗ¹Ç?ŒV

Try it online!

Almost all of this is handling octal. Feels like it could be better golfed.

Nick Kennedy, answered May 16, 2019 at 23:26

Value Ink · Accepted Answer · 2019-05-16 23:52:47Z

Ruby with `-n`, 17 bytes

Just jumping on the eval train, really.

p eval gsub(?'){}

Try it online!

Value Ink, answered May 16, 2019 at 23:52

Shieru Asakoto · Accepted Answer · 2019-05-17 01:37:02Z

Java (JDK), 101 bytes

n->{n=n.replace("'","");return n.matches("0[bB].+")?Long.parseLong(n.substring(2),2):Long.decode(n);}

Try it online!

Long.decode deals with all kinds of literals except the binary ones.

Template borrowed from Benjamin's answer

Shieru Asakoto, answered May 17, 2019 at 1:37

Nice. I need to look more at the functions primitive wrappers have – Benjamin Urquhart, Commented May 18, 2019 at 18:00

gastropner · Accepted Answer · 2019-05-18 00:55:50Z

C (gcc), 120 118 bytes

-1 byte thanks to ceilingcat

f(char*s){int r=0,c=s[1]&31,b=10;for(s+=2*(*s<49&&(b=c^24?c^2?8:2:16)-8);c=*s++;)r=c^39?r*b+(c>57?c%32+9:c-48):r;c=r;}
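Unpacked, the one-liner picks the base from the low five bits of the second character and then runs an ordinary multiply-and-add loop. The Python below is a loose transliteration for readability, assuming valid input; `parse_like_golfed_c` is a name invented for this example.

```python
def parse_like_golfed_c(text: str) -> int:
    s = text.encode("ascii") + b"\0"     # C strings end with NUL
    r, b, i = 0, 10, 0
    c = s[1] & 31                        # low five bits of the second character
    if s[0] < 49:                        # first character is '0'
        # x/X give 24 and b/B give 2; anything else after '0' means octal.
        b = 16 if c == 24 else 2 if c == 2 else 8
        if b != 8:
            i = 2                        # skip the two-character prefix
    while s[i]:
        c = s[i]
        if c != 39:                      # ignore the ' separator
            # Letters: c % 32 is 1 for a/A, so +9 yields 10..15.
            r = r * b + (c % 32 + 9 if c > 57 else c - 48)
        i += 1
    return r
```

In the C original the prefix skip and the base assignment share one expression: (b=...)-8 is non-zero exactly when a two-character prefix has to be skipped.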

Try it online!

gastropner, answered May 17, 2019 at 23:55, edited May 18, 2019 at 0:55

Community · Accepted Answer · 2020-06-17 09:04:33Z

C (gcc), 101 97 83 bytes

*d;*s;n;f(int*i){for(s=d=i;*d=*s;d+=*s++>39);i=wcstol(i+n,0,n=!i[1]||i[1]&29?0:2);}

Try it online

jdt, answered Jun 3, 2019 at 15:12; edited Jun 17, 2020 at 9:04 by Community Bot

Martin Barker · Accepted Answer · 2019-06-04 16:39:49Z

PHP, 43 bytes

eval("return ".str_replace("'","",$a).";");

Same method as https://codegolf.stackexchange.com/a/185644/45489

Martin Barker, answered Jun 4, 2019 at 16:39

HatsuPointerKun · Accepted Answer · 2019-06-05 19:50:09Z

C++, G++, 189 bytes

#include<fstream>
#include<string>
void v(std::string s){{std::ofstream a("a.cpp");a<<"#include<iostream>\nint main(){std::cout<<"<<s<<";}";}system("g++ -std=c++14 a.cpp");system("a.exe");}

No need for tests

Requires installation of g++ with C++14 support

Explanation:

It writes a file called a.cpp, compiles it with GCC, and runs the resulting executable, which outputs the number.

HatsuPointerKun, answered Jun 5, 2019 at 19:50

175 bytes – ceilingcat, Commented Jun 5, 2019 at 20:13

trillian · Accepted Answer · 2019-06-11 13:12:30Z

Pyth, 27 bytes

Jscz\'?&qhJ\0}@J1r\0\7iJ8vJ

Try it online!

Unlike the previous (now deleted) Pyth answer, this one passes all test cases in the question, though it is 3 bytes longer.

trillian, answered Jun 11, 2019 at 13:12

Welcome to the site! – Wheat Wizard ♦, Commented Jun 11, 2019 at 16:36

jdt · Accepted Answer · 2019-06-14 19:57:56Z

C (gcc) / Bash / C++, 118 bytes

f(i){asprintf(&i,"echo \"#import<iostream>\nmain(){std::cout<<%s;}\">i.C;g++ i.C;./a.out",i);fgets(i,i,popen(i,"r"));}

Try it online!

jdt, answered Jun 4, 2019 at 11:12, edited Jun 14, 2019 at 19:57

I have golfed some code. Then I have realized there is no reason at all for it to work, but it seems to work; 158 bytes. – the default., Commented Jun 4, 2019 at 12:52
@someone, it's nasty, but I like it! – jdt, Commented Jun 4, 2019 at 13:00
148 bytes by merging popen and system. G++ has a flag, I think -x, to read from stdin. That might be shorter than fopen stuff, but I don't know how to invoke with stdin in C. – the default., Commented Jun 4, 2019 at 13:10
@someone, Everything is now merged into the popen command – jdt, Commented Jun 4, 2019 at 13:52
printf -> echo seems to work. You're going to be programming in bash soon. – the default., Commented Jun 4, 2019 at 13:59

Benjamin Urquhart · Accepted Answer · 2019-06-15 17:33:10Z

Java, 158 154 bytes

This is just waiting to be outgolfed. It tries regexes until one matches and defaults to hex.
-4 bytes thanks to @ValueInk

n->{n=n.replace("'","");var s=n.split("[bBxX]");return Long.parseLong(s[s.length-1],n.matches("0[bB].+")?2:n.matches("0\\d+")?8:n.matches("\\d+")?10:16);}

Try it online

Using ScriptEngine, 92 87 bytes

Eval train coming through. Technically this is passing the torch to JS, so it's not my main submission.

n->new javax.script.ScriptEngineManager().getEngineByName("js").eval(n.replace("'",""))


edited Jun 15, 2019 at 17:33

answered May 16, 2019 at 23:45


Use [bBxX] and 0[bB].+ for some quick regex optimizations.
– Value Ink
Commented May 17, 2019 at 0:09
That's not an Integer, it's a Long; the title clearly says Integer. A single- or double-precision IEEE 754 value could become incorrect because of how IEEE 754 stores and rounds numbers (en.wikipedia.org/wiki/IEEE_754#Roundings_to_nearest). It also supports a number higher than 2 trillion (0x9999999999).
– Martin Barker
Commented Jun 18, 2019 at 1:30
@MartinBarker it's allowed to use Long instead of Integer for golfing purposes. Also, if you are correct, Python can't compete because it has effectively arbitrary-precision integers. Also, a long in Java is an integer represented with 64 bits instead of 32. There are no decimal places.
– Benjamin Urquhart
Commented Jun 18, 2019 at 2:10
My point about Long was just that you're using a long, not an integer, and you're wrong about the golfing purposes. The challenge quite clearly states "The correct output never will exceed 2*10^9", meaning that long can't be used on its own, because I can give it 0x9999999999 and it will produce a number higher than 2*10^9, whereas in C++ it would create a memory overflow issue because you're using more than 32 bits of memory when you have allocated only 32 bits of memory to this number.
– Martin Barker
Commented Jun 18, 2019 at 9:55
@MartinBarker I 100% disagree. Just because my submission can exceed 2*10^9 doesn't mean it's invalid. If it was, all the "normal" int answers would be invalid as well since they go to (2^31)-1, which is larger.
– Benjamin Urquhart
Commented Jun 18, 2019 at 11:58
