Skip to main content
deleted 6486 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82

The goal of this challenge is to fill a niche that is mostly lacking on this site. In my observations there most parsing verification challenges fall into two categories:

  1. Super easy parsing. This parsing can usually be done with a regex, and regex based answers usually do well.

  2. Super complex parsing. These are challenges with a lot of small parsing components that would on their own be type 1, but together make for long code with a lot of edge cases.

Neither of these categories are really bad, each can have fun challenges, but I want to shoot for something different. A challenge which is hard, hard enough that a regex will be unable to do the brunt of the work but with an easy to understand spec.


The goal of this challenge is to parse a lisp-like lambda function. Your code should accept a string as input and output of two possible consistent values. One which corresponds to an affirmative response (i.e. the input is a properly formatted lambda) and the other corresponds to a negative response (i.e. the input is not a properly formatted lambda). The string will only contain lowercase letters, spaces and parentheses (( or )).

I've put together 3 specifications of what constitutes a validly formatted lambda for this challenge. The first is a quick intuition base explanation. It explains the process and what things mean even if it is not perfectly rigorous. The second a formal grammar which expresses the idea succinctly and with no ambiguity. And the third is raw code, which is the hardest to read for humans but allows you to play around with the idea by trying different inputs. I recommend you read one of the first two and play around with the third.

Formatting

The challenge uses a lisp like format. This consists of "lists" which are just parentheses enclosing lowercase words, and other lists separated each by exactly one space. So

(foo (bar baz) foobar)

Would be a lisp-like "list". For our challenge each list will be identified by it's first element (which should by highlighted by the syntax highlighter), which must be either app, abs, or var, representing applications, abstractions and variables, respectively.

An application is the application of one lambda function to another. It is represented by a three element list starting with of course app followed by two other valid lambda expressions. So if S and K were valid lambdas then:

(app S K)

would be a valid lambda too.

The other two abstractions and variables work together, an abstraction introduces new names for variables and variables use those names. An abstraction encloses some lambda which must be valid with the introduced variable name, and a variable can only use names which are given by abstractions that surround it. So

(abs x (abs y (var x)))

Is valid because the x name was introduced and used in a nested context. But

(app (abs x (var x)) (var x))

Is not because even though there is an abstraction for x the second (var x) is not surrounded by any abstraction at all, let alone one that introduces x.

Additionally abstractions can only introduce new variables, so

(abs x (abs x (var x)))

is not valid because the name x is ambiguous, it could refer to either abstraction. Variable names must be a string of lowercase ascii letters (a-z).

Grammar

Now for the golfy definition. This parse cannot be defined in terms of a context free grammar, so this grammar uses context. The context is a set of strings of lowercase ascii letters (these are the valid variable names) and is represented here with \$C\$. To start \$C\$ is empty.

\$ S\left(C\right) := \left\{ \begin{array}[rr] \ \color{green}{\texttt{(var }}x\color{green}{\texttt{)}} & \textrm{ where } x\in C \\ \color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} & \textrm{ where } x\notin C\\ \color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} & \end{array} \right. \$

Any string spanned by the grammar from \$S(\{\})\$ is valid and any string not is invalid.

Code

lambda :: [String] -> Parser ()
lambda variables =
  do
    char '('
    choice
      [ do
        string "abs "
        name <- many $ charBy (`elem` ['a'..'z'])
        guard $ notElem name variables
        char ' '
        lambda (name : variables)
      , do
        string "var "
        choice (map string variables)
        return ()
      , do
        string "app "
        lambda variables
        char ' '
        lambda variables
      ]
    char ')'
    return ()

Try it online!

Scoring

This is so answers will be scored in bytes with the goal being a smaller score.

Test cases

Reject

(abs x (var x
uuuuuuuuu
)))(abs x (var x))
app (abs x (var x)) (abs x (var x))
(mmm hummus)
(abs x (var y))
(abs x (abs x (var x)))
(app (abs x (var x)) (var x))
(abs (var k) (var (vark k)))
(abs x (abs y (abs y (var x))))

Accept

(abs x (var x))
(abs x (abs y (var x)))
(abs xx (app (var xx) (var xx)))
(app (abs ab (app (var ab) (var ab))) (abs ab (app (var ab) (var ab))))

The goal of this challenge is to fill a niche that is mostly lacking on this site. In my observations there most parsing verification challenges fall into two categories:

  1. Super easy parsing. This parsing can usually be done with a regex, and regex based answers usually do well.

  2. Super complex parsing. These are challenges with a lot of small parsing components that would on their own be type 1, but together make for long code with a lot of edge cases.

Neither of these categories are really bad, each can have fun challenges, but I want to shoot for something different. A challenge which is hard, hard enough that a regex will be unable to do the brunt of the work but with an easy to understand spec.


The goal of this challenge is to parse a lisp-like lambda function. Your code should accept a string as input and output of two possible consistent values. One which corresponds to an affirmative response (i.e. the input is a properly formatted lambda) and the other corresponds to a negative response (i.e. the input is not a properly formatted lambda). The string will only contain lowercase letters, spaces and parentheses (( or )).

I've put together 3 specifications of what constitutes a validly formatted lambda for this challenge. The first is a quick intuition base explanation. It explains the process and what things mean even if it is not perfectly rigorous. The second a formal grammar which expresses the idea succinctly and with no ambiguity. And the third is raw code, which is the hardest to read for humans but allows you to play around with the idea by trying different inputs. I recommend you read one of the first two and play around with the third.

Formatting

The challenge uses a lisp like format. This consists of "lists" which are just parentheses enclosing lowercase words, and other lists separated each by exactly one space. So

(foo (bar baz) foobar)

Would be a lisp-like "list". For our challenge each list will be identified by it's first element (which should by highlighted by the syntax highlighter), which must be either app, abs, or var, representing applications, abstractions and variables, respectively.

An application is the application of one lambda function to another. It is represented by a three element list starting with of course app followed by two other valid lambda expressions. So if S and K were valid lambdas then:

(app S K)

would be a valid lambda too.

The other two abstractions and variables work together, an abstraction introduces new names for variables and variables use those names. An abstraction encloses some lambda which must be valid with the introduced variable name, and a variable can only use names which are given by abstractions that surround it. So

(abs x (abs y (var x)))

Is valid because the x name was introduced and used in a nested context. But

(app (abs x (var x)) (var x))

Is not because even though there is an abstraction for x the second (var x) is not surrounded by any abstraction at all, let alone one that introduces x.

Additionally abstractions can only introduce new variables, so

(abs x (abs x (var x)))

is not valid because the name x is ambiguous, it could refer to either abstraction. Variable names must be a string of lowercase ascii letters (a-z).

Grammar

Now for the golfy definition. This parse cannot be defined in terms of a context free grammar, so this grammar uses context. The context is a set of strings of lowercase ascii letters (these are the valid variable names) and is represented here with \$C\$. To start \$C\$ is empty.

\$ S\left(C\right) := \left\{ \begin{array}[rr] \ \color{green}{\texttt{(var }}x\color{green}{\texttt{)}} & \textrm{ where } x\in C \\ \color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} & \textrm{ where } x\notin C\\ \color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} & \end{array} \right. \$

Any string spanned by the grammar from \$S(\{\})\$ is valid and any string not is invalid.

Code

lambda :: [String] -> Parser ()
lambda variables =
  do
    char '('
    choice
      [ do
        string "abs "
        name <- many $ charBy (`elem` ['a'..'z'])
        guard $ notElem name variables
        char ' '
        lambda (name : variables)
      , do
        string "var "
        choice (map string variables)
        return ()
      , do
        string "app "
        lambda variables
        char ' '
        lambda variables
      ]
    char ')'
    return ()

Try it online!

Scoring

This is so answers will be scored in bytes with the goal being a smaller score.

Test cases

Reject

(abs x (var x
uuuuuuuuu
)))(abs x (var x))
app (abs x (var x)) (abs x (var x))
(mmm hummus)
(abs x (var y))
(abs x (abs x (var x)))
(app (abs x (var x)) (var x))
(abs (var k) (var (vark k)))
(abs x (abs y (abs y (var x))))

Accept

(abs x (var x))
(abs x (abs y (var x)))
(abs xx (app (var xx) (var xx)))
(app (abs ab (app (var ab) (var ab))) (abs ab (app (var ab) (var ab))))
added 11 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
(abs x (var x
uuuuuuuuu
)))(abs x (var x))
app (abs x (var x)) (abs x (var x))
(mmm hummus)
(abs x (var y))
(abs x (abs x (var x)))
(app (abs x (var x)) (var x))
(abs (var k) (var (vark k)))
(abs x (var x
uuuuuuuuu
)))(abs x (var x))
app (abs x (var x)) (abs x (var x))
(mmm hummus)
(abs x (var y))
(abs x (abs x (var x)))
(app (abs x (var x)) (var x))
(abs (var k) (var (vark k)))
(abs x (abs y (abs y (var x))))
(abs x (var x
uuuuuuuuu
)))(abs x (var x))
app (abs x (var x)) (abs x (var x))
(mmm hummus)
(abs x (var y))
(abs x (abs x (var x)))
(app (abs x (var x)) (var x))
(abs (var k) (var (vark k)))
(abs x (var x
uuuuuuuuu
)))(abs x (var x))
app (abs x (var x)) (abs x (var x))
(mmm hummus)
(abs x (var y))
(abs x (abs x (var x)))
(app (abs x (var x)) (var x))
(abs (var k) (var (vark k)))
(abs x (abs y (abs y (var x))))
deleted 180 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82

Now for the golfy definition. This parse cannot be defined in terms of a context free grammar, so this grammar uses context. The context is a set of strings of lowercase ascii letters (these are the valid variable names) and is represented here with \$C\$. To start \$C\$ is empty.

\$ \begin{array}[rr] \ S\left(x\in C\right) &:= \color{green}{\texttt{(var }}x\color{green}{\texttt{)}} \\ S\left(x\notin C\right) &:= \color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} \\ S\left(C\right) &:= \color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} \end{array} \$\$ S\left(C\right) := \left\{ \begin{array}[rr] \ \color{green}{\texttt{(var }}x\color{green}{\texttt{)}} & \textrm{ where } x\in C \\ \color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} & \textrm{ where } x\notin C\\ \color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} & \end{array} \right. \$

Any string spanned by the grammar from \$S(\{\})\$ is valid and any string not is invalid.

Now for the golfy definition. This parse cannot be defined in terms of a context free grammar, so this grammar uses context. The context is a set of strings of lowercase ascii letters (these are the valid variable names) and is represented here with \$C\$.

\$ \begin{array}[rr] \ S\left(x\in C\right) &:= \color{green}{\texttt{(var }}x\color{green}{\texttt{)}} \\ S\left(x\notin C\right) &:= \color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} \\ S\left(C\right) &:= \color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} \end{array} \$

Any string spanned by the grammar is valid and any string not is invalid.

Now for the golfy definition. This parse cannot be defined in terms of a context free grammar, so this grammar uses context. The context is a set of strings of lowercase ascii letters (these are the valid variable names) and is represented here with \$C\$. To start \$C\$ is empty.

\$ S\left(C\right) := \left\{ \begin{array}[rr] \ \color{green}{\texttt{(var }}x\color{green}{\texttt{)}} & \textrm{ where } x\in C \\ \color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} & \textrm{ where } x\notin C\\ \color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} & \end{array} \right. \$

Any string spanned by the grammar from \$S(\{\})\$ is valid and any string not is invalid.

deleted 180 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
Loading
Minor typo
Source Link
user
  • 357
  • 11
  • 22
Loading
added 36 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
Loading
added 86 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
Loading
added 3 characters in body
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
Loading
Added the code
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
Loading
Source Link
Wheat Wizard Mod
  • 99k
  • 1
  • 39
  • 82
Loading