13
\$\begingroup\$

Your goal is to write a program that takes no input and outputs the following text:

ca e na ŋa va o sa;
þa ša ra la ła.
ma a pa fa ga ta ča;
în ja i da ða.
ar ħo ên ôn ân uħo;
carþ taŋ neŋ es nem.
elo cenvos.

But there's a catch: for each letter (any character whose general category in Unicode starts with L) in your source, you get a penalty of 20 characters! (For reference, the text to be printed has 81 letters.)
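The scoring rule can be sketched in a few lines of Python. This is only an illustration, not an official validator: it assumes the source is measured in UTF-8 bytes (languages with a custom code page would need their own byte count), and the names `count_letters` and `score` are made up for this example.

```python
import unicodedata

# The text to be printed, copied from the challenge.
TARGET = """ca e na ŋa va o sa;
þa ša ra la ła.
ma a pa fa ga ta ča;
în ja i da ða.
ar ħo ên ôn ân uħo;
carþ taŋ neŋ es nem.
elo cenvos."""


def count_letters(source: str) -> int:
    """Count characters whose Unicode general category starts with 'L'."""
    return sum(1 for ch in source if unicodedata.category(ch).startswith("L"))


def score(source: str, penalty: int = 20) -> int:
    """Byte length (assumed UTF-8) plus a fixed penalty per letter."""
    return len(source.encode("utf-8")) + penalty * count_letters(source)


if __name__ == "__main__":
    print(count_letters(TARGET))  # 81, as stated in the challenge
```

Digits, punctuation and symbols fall outside the `L*` categories, which is why the answers below lean so heavily on them.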

The Perl 6 code below has 145 bytes and 84 letters, so it gets a score of 1,825:

say "ca e na ŋa va o sa;
þa ša ra la ła.
ma a pa fa ga ta ča;
în ja i da ða.
ar ħo ên ôn ân uħo;
carþ taŋ neŋ es nem.
elo cenvos."

The code below has 152 bytes and 70 letters, so it gets a score of 1,552:

$_="C e N ŋa V o S;
Þ Š R L Ł.
M a P F G T Č;
în J i D Ð.
ar ħo ên ôn ân uħo;
Crþ Tŋ neŋ es nem.
elo cenvos.";s:g/<:Lu>/{$/.lc~'a'}/;.say
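The trick in the second example is that every consonant-plus-`a` syllable is stored as a single capital letter, and a substitution expands each capital back to its lowercase form followed by `a`. A rough Python equivalent, assuming `str.isupper()` is a close enough stand-in for the `<:Lu>` character class (the name `expand` is made up for this sketch):

```python
# Compressed template from the second Perl 6 example.
TEMPLATE = """C e N ŋa V o S;
Þ Š R L Ł.
M a P F G T Č;
în J i D Ð.
ar ħo ên ôn ân uħo;
Crþ Tŋ neŋ es nem.
elo cenvos."""


def expand(template: str) -> str:
    """Replace each uppercase letter X with lowercase x followed by 'a'."""
    return "".join(
        ch.lower() + "a" if ch.isupper() else ch
        for ch in template
    )


if __name__ == "__main__":
    print(expand(TEMPLATE))
```

Each expanded syllable saves one letter (20 points) and one byte in the template, at the cost of the letters in the substitution code itself.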

Standard loopholes are forbidden.

Originally, I thought of forbidding letters altogether, but I don't think there are many languages that make this possible. You're more than welcome to try.

(ŋarâþ crîþ [ˈŋaɹa̰θ kɹḭθ] is one of my conlangs. I wanted to capitalise its name here, but I get the ugly big eng here. Oh well, the language doesn't use capital letters in its romanisation anyway.)

Edit: I realised that one of the lines is wrong, but I'll keep it since there are already answers. The correct version of the third line is ma a fa ga pa ta ča; you may produce the corrected text instead if you prefer.

\$\endgroup\$
2
  • 11
    \$\begingroup\$ kolmogorov-complexity, restricted-source, and special scoring are all sorts of things that benefit greatly from careful consideration in the sandbox. Currently, it seems like the best approach to this challenge would be to just write out all of the codepoints in decimal then turn them into text with a builtin, with some shortcut to encode all of the as--or not, depending on how many letters it would take, because 20 characters is a really big penalty (although when everything else is scored by bytes, it's not quite well defined...)! \$\endgroup\$ Commented Apr 15, 2019 at 20:09
  • 4
    \$\begingroup\$ And considering the invocation of Unicode, some explicit rules governing special codepages as used by most golflangs are probably called for (alongside maybe a link to a script to validate scoring). \$\endgroup\$ Commented Apr 15, 2019 at 21:00

13 Answers

22
\$\begingroup\$

7, 410 characters, 154 bytes in 7's encoding, 0 letters = score 154

55104010504200144434451510201304004220120504005434473340353241135014335450302052254241052253052244241052335452241114014241310052340435303052335442302052335500302052335430302052313340435303135014243241310335514052312241341351052302245341351525755102440304030434030421030442030424030455733413512410523142410523030523112411350143355142410523414252410523102410523002410523413342411145257551220304010420030455741403

Try it online!

In a challenge that dislikes using letters, what better language to use than one consisting only of digits?

This is a full program that exits via crashing, so there's extraneous output to stderr, but stdout is correct.

Explanation

A 7 program, on its first iteration, simply pushes a number of elements to the stack (because out of the 12 commands that exist in 7, only 8 of them can be represented in a source program, and those 8 are specialised for writing code to push particular data structures to the stack). This program does not use the 6 command (which is the simplest way to create nested structures, but otherwise tends not to appear literally in a source program), so it's only the 7 commands that determine the structure; 7 pushes a new empty element to the top of stack (whereas the 0-5 commands just append to the top of stack). We can thus add whitespace to the program to show its structure:

551040105042001444344515102013040042201205040054344 7

33403532411350143354503020522542410522530522442410523354522411140142413100523
40435303052335442302052335500302052335430302052313340435303135014243241310335
514052312241341351052302245341351525 7

55102440304030434030421030442030424030455 7

33413512410523142410523030523112411350143355142410523414252410523102410523002
41052341334241114525 7

551220304010420030455 7

41403

The elements near the end of the program are pushed last, so are on top of the stack at the start of the second iteration. On this iteration, and all future iterations, the 7 interpreter automatically makes a copy of the top of the stack and interprets it as a program. The literal 41403 pushes the (non-literal, live code) 47463 (7 has 12 commands but only 8 of them have names; as such, I use bold to show the code, and non-bold to show the literal that generates that code, meaning that, e.g. 4 is the command that appends 4 to the top stack element). So the program that runs on the second iteration is 47463. Here's what that does:

47463
4       Swap top two stack elements, add an empty element in between
 7      Add an empty stack element to the top of stack
  4     Swap top two stack elements, add an empty element in between
   6    Work out which commands would generate the top stack element;
        append that to the element below (and pop the old top of stack)
    3   Output the top stack element, pop the element below

This is easier to understand if we look at what happens to the stack:

  • d c b a 47463 (code to run: 47463)
  • d c b 47463 empty a (code to run: 7463)
  • d c b 47463 empty a empty (code to run: 463)
  • d c b 47463 empty empty empty a (code to run: 63)
  • d c b 47463 empty empty "a" (code to run: 3)
  • d c b 47463 empty (code to run: empty)

In other words, we take the top of stack a, work out what code is most likely to have produced it, and output that code. The 7 interpreter automatically pops empty elements from the top of stack at the end of an iteration, so we end up with the 47463 back on top of the stack, just as in the original program. It should be easy to see what happens next: we end up churning through every stack element one after another, outputting them all, until the stack underflows and the program crashes. So we've basically created a simple output loop that looks at the program's source code to determine what to output (we're not outputting the data structures that were pushed to the stack by our 0-5 commands, we're instead recreating what commands were used by looking at what structures were created, and outputting those). Thus, the first piece of data output is 551220304010420030455 (the source code that generates the second-from-top stack element), the second is 3341351…114525 (the source code that generates the third-from-top stack element), and so on.

Obviously, though, these pieces of source code aren't being output unencoded. 7 contains several different domain-specific languages for encoding output; once a domain-specific language is chosen, it remains in use until explicitly cleared, but if none of the languages have been chosen yet, the first digit of the code being output determines which of the languages to use. In this program, only two languages are used: 551 and 3.

551 is pretty simple: it's basically the old Baudot/teletype code used to transmit letters over teletypes, as a 5-bit character set, but modified to make all the letters lowercase. So the first chunk of code to be output decodes like this:

551  22 03 04 01 04 20 03 04  55
     c  a  SP e  SP n  a  SP  reset output format

As can be seen, we're fitting each character into two octal digits, which is a pretty decent compression ratio. Pairs of digits in the 0-5 range give us 36 possibilities, as opposed to the 32 possibilities that Baudot needs, so the remaining four are used for special commands; in this case, the 55 at the end clears the remembered output format, letting us use a different format for the next piece of output we produce.
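The pairwise decoding can be mimicked with a small lookup. The sketch below is not 7's implementation: the table `PAIRS` contains only the five pairs shown in the breakdown above (the rest of the character set is deliberately left out rather than guessed), and `decode_551` is a hypothetical helper name.

```python
# Only the digit pairs that appear in the worked example above.
PAIRS = {"22": "c", "03": "a", "04": " ", "01": "e", "20": "n"}
RESET = "55"  # clears the remembered output format


def decode_551(chunk: str) -> str:
    """Decode a chunk that selects format 551 and ends with the reset pair."""
    if not chunk.startswith("551"):
        raise ValueError("chunk does not select the 551 format")
    body = chunk[3:]
    out = []
    for i in range(0, len(body), 2):
        pair = body[i:i + 2]
        if pair == RESET:
            break  # stop at the format reset
        out.append(PAIRS[pair])  # KeyError for pairs outside the example
    return "".join(out)


if __name__ == "__main__":
    print(repr(decode_551("551220304010420030455")))  # 'ca e na '
```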

3 is conceptually even simpler, but with a twist. The basic idea is to take groups of three digits (again, in the 0-5 range, as those are the digits for which we can guarantee that we can recreate the original source code from its output), interpret them as a three-digit number in base 6, and just output it as a byte in binary (thus letting us output the multibyte characters in the desired output simply by outputting multiple bytes). The twist, though, comes from the fact that there are only 216 three-digit numbers (with possible leading zeroes) in base 6, but 256 possible bytes. 7 gets round this by linking numbers from 332₆ = 128₁₀ upwards to two different bytes; 332 can output either byte 128 or 192, 333 either byte 129 or 193, and so on, up to 515 which outputs either byte 191 or 255.
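The arithmetic is easy to check. The following sketch (with a made-up helper name, `byte_candidates`) reads a triplet as a base-6 number and lists the byte or bytes it may stand for; treating everything from 520 upwards as a control code follows the description in this answer.

```python
from typing import Tuple


def byte_candidates(triplet: str) -> Tuple[int, ...]:
    """Return the byte value(s) that a base-6 digit triplet can denote."""
    value = int(triplet, 6)  # e.g. "332" -> 3*36 + 3*6 + 2 = 128
    if value >= 192:         # "520" and above are control codes, not data
        raise ValueError("control triplet, not a data byte")
    if value < 128:
        return (value,)      # unambiguous
    return (value, value + 64)  # ambiguous: 128..191 or 192..255


if __name__ == "__main__":
    print(byte_candidates("241"))  # (97,)
    print(byte_candidates("332"))  # (128, 192)
    print(byte_candidates("515"))  # (191, 255)
```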

How does the program know which of the two possibilities to output? It's possible to use triplets of digits from 520 upwards to control this explicitly, but in this program we don't have to: 7's default is to pick all the ambiguous bytes in such a way that the output is valid UTF-8! It turns out that there's always at most one way to do this, so as long as it's UTF-8 we want (and we do in this case), we can just leave it ambiguous and the program works anyway.
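One way to see why the choice is unique: in UTF-8, bytes 128 to 191 are only ever continuation bytes and bytes 192 to 255 are only ever lead bytes, so the position within a multibyte sequence decides which of the two candidates fits. The sketch below applies that rule to the start of the program's first format-3 chunk (with the leading format-selecting 3 removed). It is my own reconstruction of the idea, not the interpreter's code, and `decode_format3` is a hypothetical name.

```python
def decode_format3(digits: str) -> str:
    """Decode base-6 triplets to UTF-8, resolving ambiguous bytes by position."""
    out = bytearray()
    pending = 0  # continuation bytes still expected in the current sequence
    for i in range(0, len(digits), 3):
        value = int(digits[i:i + 3], 6)
        if value >= 192:
            raise ValueError("control triplets are not handled in this sketch")
        if value < 128:
            byte = value                # plain ASCII byte
        elif pending > 0:
            byte = value                # 128..191: continuation byte
            pending -= 1
        else:
            byte = value + 64           # 192..255: lead byte
            pending = 1 if byte < 0xE0 else 2 if byte < 0xF0 else 3
        out.append(byte)
    return out.decode("utf-8")          # raises if the result is not valid UTF-8


if __name__ == "__main__":
    # 341 351 -> bytes 0xC5 0x8B -> "ŋ"; 241 -> "a"; 052 -> " "; ...
    print(repr(decode_format3("341351241052314241052303052311241135014")))
```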

The end of each of the 3… sections is 525, which resets the output format, letting us go back to 551 for the next section.

\$\endgroup\$
2
  • \$\begingroup\$ This is either 410 bytes + 0 letters in the unpacked representation, or 154 bytes + lots of letters in the packed representation. Counting the bytes in one and the letters in the other one seems cheaty. \$\endgroup\$
    – Grimmy
    Commented Sep 5, 2019 at 16:18
  • 2
    \$\begingroup\$ @Grimy: You're confusing bytes with characters. The packed representation consists of 154 bytes in 7's encoding that encode 410 octal digits, each of which is a digit not a letter. Your reasoning implies that, say, ɓ in Jelly is not a letter (because its encoding in Jelly's encoding corresponds to the control code "CSI" if interpreted in a typical 8-bit character set, rather than a letter). Just like Jelly, 7 also uses a custom encoding; but because 7 uses no letters, the encoding has no need to encode letters and thus can't. \$\endgroup\$
    – ais523
    Commented Sep 5, 2019 at 20:16
10
\$\begingroup\$

Haskell, 0 letters, 423 bytes = score 423

(['\10'..]!!)<$>[89,87,22,91,22,100,87,22,321,87,22,108,87,22,101,22,105,87,49,0,244,87,22,343,87,22,104,87,22,98,87,22,312,87,36,0,99,87,22,87,22,102,87,22,92,87,22,93,87,22,106,87,22,259,87,49,0,228,100,22,96,87,22,95,22,90,87,22,230,87,36,0,87,104,22,285,101,22,224,100,22,234,100,22,216,100,22,107,285,101,49,0,89,87,104,244,22,106,87,321,22,100,91,321,22,91,105,22,100,91,99,36,0,91,98,101,22,89,91,100,108,101,105,36]

Try it online!

\$\endgroup\$
6
\$\begingroup\$

Jelly, 212 bytes + 2 letters = 252 (previously 274 and 260 bytes, scoring 314 and 300)

-48 bytes thanks to Nick Kennedy

“19ב+49;7883,8220,8216,7884Ọ“19937801,1169680277365253“38“68112“;107¤+1+\“@“&%"("/%"@%"6%"0"3%$!<%" %"2%"-%"?%#!.%"%"1%")%"*%"4%"=%$!9/",%"+"'%":%#!%2">0"8/";/"7/"5>0$!&%2<"4%@"/(@"(3"/(.#!(-0"&(/603#“_32¤”;";/V

(Uses !"#$%&'()*+,-./0123456789:;<=>?@V\_¤×Ọ‘“” of which V and Ọ are Unicode letters and are used once each)

Try it online!

\$\endgroup\$
3
  • \$\begingroup\$ 212 bytes plus two letters \$\endgroup\$ Commented Apr 17, 2019 at 22:58
  • \$\begingroup\$ Verification \$\endgroup\$ Commented Apr 17, 2019 at 23:23
  • \$\begingroup\$ @NickKennedy I'd played around with golfing the number, but didn't step back and look to just offset the ordinals, good stuff - thanks! \$\endgroup\$ Commented Apr 18, 2019 at 7:17
3
\$\begingroup\$

PowerShell, score 546 (previously 601)

-join(67,65,0,69,0,78,65,0,299,65,0,86,65,0,79,0,83,65,27,-22,222,65,0,321,65,0,82,65,0,76,65,0,290,65,14,-22,77,65,0,65,0,80,65,0,70,65,0,71,65,0,84,65,0,237,65,27,-22,206,78,0,74,65,0,73,0,68,65,0,208,65,14,-22,65,82,0,263,79,0,202,78,0,212,78,0,194,78,0,85,263,79,27,-22,67,65,82,222,0,84,65,299,0,78,69,299,0,69,83,0,78,69,77,14,-22,69,76,79,0,67,69,78,86,79,83,14|%{[char]($_+32)})

Try it online!

Naive approach: I took the code points in decimal and subtracted 32 from each; the code adds 32 back, casts each value to a char, and -joins the results into a single string.
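For anyone wanting to regenerate or check such a list, here is a short Python sketch of the same offset scheme. The helper names `encode` and `decode` are mine and not part of the answer.

```python
OFFSET = 32  # the constant subtracted when building the list


def encode(text: str) -> list:
    """Turn text into the list of (code point - OFFSET) values."""
    return [ord(ch) - OFFSET for ch in text]


def decode(values: list) -> str:
    """Inverse of encode: add OFFSET back and convert to characters."""
    return "".join(chr(v + OFFSET) for v in values)


if __name__ == "__main__":
    print(encode("ca e na ŋa"))  # [67, 65, 0, 69, 0, 78, 65, 0, 299, 65]
```

Spaces become 0 and newlines become -22, which keeps the most frequent separators short in the source.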

\$\endgroup\$
2
3
\$\begingroup\$

Jelly, 321 bytes + 2 letters = score 361

3343781777797791350694255572961968519437585132057650209974147122192542459108221624793330048943528237823681411832154316740173721249435700067706302064570847610741421342406380917446310820012503592770000532190167243585300911078873144513786923305473352724133578818457026824110152529235136461572588027747840738399150398304b354Ọ

Try it online!

This is hideous and someone can definitely do better.

Verify score.

\$\endgroup\$
1
  • 1
    \$\begingroup\$ actually less bad than it seems \$\endgroup\$
    – ASCII-only
    Commented Apr 16, 2019 at 0:06
3
\$\begingroup\$

05AB1E, score 207 (previously 209; 187 bytes + 20 penalty for 1 letter)

•£?;\:'%¢;.'¡£/':¢?'¢°':¢°#@¢«>#%¡¤;®[¢:¥¢:©¢:¦¢;®¢>#¡£#¨¢#&¢+¢#,¢:§¡¤#¬¢#@¢#)¢#(¢#<¢#¢#/¡£#¯¢#.¢#>¢#±¢#«¡¤#?¢;¢#\¢#°¢#:¢'¢#%•[₅‰`©®_#∞158+9022014013016708040204090101502501027¾¡17∍.¥>:ç?

Try it online!

The only letter is ç. The currency symbols €£¢ are not considered letters in Unicode.

\$\endgroup\$
2
\$\begingroup\$

Python 3, 380 bytes + 5 letters = 480

print("""\143\141 \145 \156\141 \513\141 \166\141 \157 \163\141;
\376\141 \541\141 \162\141 \154\141 \502\141.
\155\141 \141 \160\141 \146\141 \147\141 \164\141 \415\141;
\356\156 \152\141 \151 \144\141 \360\141.
\141\162 \447\157 \352\156 \364\156 \342\156 \165\447\157;
\143\141\162\376 \164\141\513 \156\145\513 \145\163 \156\145\155.
\145\154\157 \143\145\156\166\157\163.""")

Try it online!

\$\endgroup\$
0
1
\$\begingroup\$

Retina, 140 characters, 159 bytes, 14 letters = score 439


%# ' 1# !# 9# 2 6#;¶þ# š# 5# /# ł#.¶0# # 3# (# )# 7# č#;¶î1 ,# + &# ð#.¶#5 ħ2 ê1 ô1 â1 8ħ2;¶%#5þ 7#! 1'! '6 1'0.¶'/2 %'1926.
T`!--/-9`ŋ\`-{

Try it online! Edit: Saved 1 letter by switching from K` to a newline. Now also works in Retina 0.8.2 (but the title would be too long).

\$\endgroup\$
1
\$\begingroup\$

Japt -S, 286 bytes + 1 letter = 306 (previously 304 bytes + 2 letters = 344)

Well, this is just god-awful!

"3 1
5
14 1
235 1
22 1
15
19 1 -37 -86 158 1
257 1
18 1
12 1
226 1 -50 -86 13 1
1
16 1
6 1
7 1
20 1
173 1 -37 -86 142 14
10 1
9
4 1
144 1 -50 -86 1 18
199 15
138 14
148 14
130 14
21 199 15 -37 -86 3 1 18 158
20 1 235
14 5 235
5 19
14 5 13 -50 -86 5 12 15
3 5 14 22 15 19 -50"·®¸®°d96} ¬

Try it

\$\endgroup\$
1
\$\begingroup\$

PHP -a, 402 bytes + 200 penalty = 602 score

foreach([67,65,0,69,0,78,65,0,299,65,0,86,65,0,79,0,83,65,27,-22,222,65,0,321,65,0,82,65,0,76,65,0,290,65,14,-22,77,65,0,65,0,80,65,0,70,65,0,71,65,0,84,65,0,237,65,27,-22,206,78,0,74,65,0,73,0,68,65,0,208,65,14,-22,65,82,0,263,79,0,202,78,0,212,78,0,194,78,0,85,263,79,27,-22,67,65,82,222,0,84,65,299,0,78,69,299,0,69,83,0,78,69,77,14,-22,69,76,79,0,67,69,78,86,79,83,14] as $i){echo ''.mb_chr($i+32);}

Port of Artemis Fowl's answer, and my first codegolf entry!

Leaves me wishing that chr() could just support UTF-8; those extra 3 bytes + 40 characters hurt!

\$\endgroup\$
1
  • \$\begingroup\$ Welcome to PPCG :) \$\endgroup\$
    – Shaggy
    Commented Apr 17, 2019 at 12:15
0
\$\begingroup\$

Python 3, 397 bytes + 19 letters = 777 score

print(''.join(chr(i+32)for i in[67,65,0,69,0,78,65,0,299,65,0,86,65,0,79,0,83,65,27,-22,222,65,0,321,65,0,82,65,0,76,65,0,290,65,14,-22,77,65,0,65,0,80,65,0,70,65,0,71,65,0,84,65,0,237,65,27,-22,206,78,0,74,65,0,73,0,68,65,0,208,65,14,-22,65,82,0,263,79,0,202,78,0,212,78,0,194,78,0,85,263,79,27,-22,67,65,82,222,0,84,65,299,0,78,69,299,0,69,83,0,78,69,77,14,-22,69,76,79,0,67,69,78,86,79,83,14]))

Try it online!

Port of AdmBorkBork's answer.

\$\endgroup\$
4
  • \$\begingroup\$ 1 less letter? \$\endgroup\$
    – ASCII-only
    Commented Apr 16, 2019 at 12:28
  • 1
    \$\begingroup\$ score 732 \$\endgroup\$
    – ASCII-only
    Commented Apr 16, 2019 at 12:29
  • \$\begingroup\$ 562, -2 if using python 2 \$\endgroup\$
    – ASCII-only
    Commented Apr 16, 2019 at 12:33
  • \$\begingroup\$ TIO doesn't work at my organization, so I'll have to wait to get home to add those. \$\endgroup\$
    – Miriam
    Commented Apr 16, 2019 at 14:17
0
\$\begingroup\$

R, 384 bytes + 12 letters * 20 points = 624 score

Not terribly original.

cat(intToUtf8(c(67,65,0,69,0,78,65,0,299,65,0,86,65,0,79,0,83,65,27,-22,222,65,0,321,65,0,82,65,0,76,65,0,290,65,14,-22,77,65,0,65,0,80,65,0,70,65,0,71,65,0,84,65,0,237,65,27,-22,206,78,0,74,65,0,73,0,68,65,0,208,65,14,-22,65,82,0,263,79,0,202,78,0,212,78,0,194,78,0,85,263,79,27,-22,67,65,82,222,0,84,65,299,0,78,69,299,0,69,83,0,78,69,77,14,-22,69,76,79,0,67,69,78,86,79,83,14)+32))

Try it online!

\$\endgroup\$
0
\$\begingroup\$

05AB1E, score 365 (previously 383; 325 bytes + 2 letters * 20 penalty)

3343781777797791350694255572961968519437585132057650209974147122192542459108221624793330048943528237823681411832154316740173721249435700067706302064570847610741421342406380917446310820012503592770000532190167243585300911078873144513786923305473352724133578818457026824110152529235136461572588027747840738399150398304 354вç.««

Port of @HyperNeutrino's Jelly answer.

I will try to improve from here. The number is divisible by a bunch of numbers, but unfortunately none of them would save any bytes, and the larger divisors, when compressed, contain at least 1 letter.
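As I understand the approach, the whole text is packed into one big integer whose base-354 digits are the code points (354 works because the largest code point in the text, š, is 353). A hedged Python sketch of that packing and unpacking, using a short sample instead of the full number above and with made-up helper names:

```python
BASE = 354  # one more than the largest code point in the text (š = 353)


def to_number(text: str, base: int = BASE) -> int:
    """Pack code points into a single integer, most significant first."""
    n = 0
    for ch in text:
        if ord(ch) >= base:
            raise ValueError("code point does not fit in this base")
        n = n * base + ord(ch)
    return n


def from_number(n: int, base: int = BASE) -> str:
    """Unpack the integer back into text (base conversion, then chr)."""
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.append(d)
    return "".join(chr(d) for d in reversed(digits))


if __name__ == "__main__":
    sample = "ša ra"
    packed = to_number(sample)
    print(packed, from_number(packed) == sample)
```

The decimal digits of the packed number cost no letter penalty, so only the handful of commands that perform the base conversion and character lookup are penalised.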

-18 (+2 bytes and -20 penalty) thanks to @Grimy, replacing the letter J (join) with .«« (reduce by concatenating).

Try it online.

\$\endgroup\$
2
  • 1
    \$\begingroup\$ J can be .«« for -18. Or for a completely different approach, see my answer. \$\endgroup\$
    – Grimmy
    Commented Sep 5, 2019 at 16:15
  • \$\begingroup\$ @Grimy Thanks! :) And nice answer! \$\endgroup\$ Commented Sep 5, 2019 at 16:52
