15
$\begingroup$

Use case

Some languages offer techniques to enforce certain requirements at compile time. For example, Rust has the NonZeroU32 type, which ensures at compile time that you can't assign a zero value to it without checking first.

In more complex cases, it can be hard to see whether some set of constraints actually restricts the output type as expected. Consider this type in TypeScript:

type Obj = {... some fields};

const getObj = async <T extends keyof Obj>(id: number, fields: Array<T>): Promise<null | Pick<Obj, T>> =>
    GET(`/api/${id}?fields=${fields.join(',')}`);

// This should fail to compile: 'foo' was not requested
(await getObj(12, ["bla"]))?.foo

This is intended to return a type containing only the fields whose names were passed in (the rest won't even be fetched from the API). However, the way this works is a bit convoluted, and it would be easy to make a mistake that allows access to other fields, which would crash at runtime. I want to ensure that can't happen. I can't add test cases for this, though, because code that fails to compile would prevent the entire test suite from compiling.

This is just an example, I'm looking for language design features that would allow doing this naturally, not advice on this specific case.

Some other examples of failures I might want to test for:

let x: NonZeroU32 = 0;

or

println!("{}"); // bad macro parameters

Question

Do any languages have systems that allow testing code that does not compile? How might such a system be designed?

I'm not looking for advice on this case specifically, but more generally whether other languages have methods to deal with similar situations.

$\endgroup$
6
  • $\begingroup$ Maybe something like duck typing? I feel like it's a similar concept, where while it doesn't match a valid type/matches a disallowed type, the thing still quacks. $\endgroup$
    – Someone
    Commented Feb 5 at 13:20
  • 3
    $\begingroup$ @Someone Wouldn't that defeat the whole purpose? The whole reason for defining the types like this is to make sure that it fails at compile time if it's not the right type $\endgroup$
    – mousetail
    Commented Feb 5 at 13:25
  • $\begingroup$ Maybe a strongly-typed language could have a -B "broken" option that uses duck typing? I mean, it seems close enough. $\endgroup$
    – Someone
    Commented Feb 5 at 13:36
  • $\begingroup$ @Someone Feel free to add that as an answer. I don't know exactly what you mean but maybe it's what I'm looking for if expanded a bit more $\endgroup$
    – mousetail
    Commented Feb 5 at 13:40
  • 5
    $\begingroup$ The Scala community has long since raised the more general question: given that our modern type systems are of comparable power to our modern programming languages, what should the tooling look like? What is the type-level equivalent of an IDE, a semantic highlighter, a debugger, a profiler, a test framework, a coverage analyzer? Note that I carefully phrased this as "the Scala community has raised the question", as I believe they have not found a good answer yet. But at least they recognize the problem. $\endgroup$ Commented Feb 5 at 21:55

6 Answers

16
$\begingroup$

Write a test harness that invokes the compiler with the failing code (in a separate file), and inspects the output to make sure it fails. See trybuild for an example.

  • Upside: works in any language
  • Downside: getting your project's dependencies may be hard; worst case, you may need to create a library sub-project. Invoking the compiler may be slow, especially because you probably need to invoke it multiple times (at best, once for every phase you have errors in; if the compiler is dumb and stops on a single error, you have to invoke it once for every test).

If the metaprogramming consists of running code at compile-time, you may be able to test it like any other code with a test-suite which also runs at compile-time.

E.g. in Racket, a macro is simply a function which takes and returns a syntax object. If you extract this function out of define-syntax into define-for-syntax, in a begin-for-syntax block you can call it directly. Here's an example of how to test a macro (derived from the define-syntax documentation):

#lang racket/base
(define-for-syntax foo-impl
  (syntax-rules ()
    ((_ (a) ...) (printf "~a\n" (list a ...)))))
(define-syntax foo foo-impl)

; Usage
(foo (1) (2) (3) (4)) ; produces (printf "~a\n" (list 1 2 3 4))
; (foo 1 2 3 4) fails to compile

; How to test
(require (for-syntax racket/base rackunit))
(begin-for-syntax
  ; Test first usage
  (check-equal?
    (syntax->datum (foo-impl #'(foo (1) (2) (3) (4))))
    (syntax->datum #'(printf "~a\n" (list 1 2 3 4))))
  ; Test second usage (THIS IS WHERE WE TEST DOES NOT COMPILE)
  (check-exn exn:fail? (lambda () (foo-impl #'(foo 1 2 3 4)))))

Similarly in Zig, you can test any comptime function with an expectError which also runs at comptime (or at least I believe so; I'm not as familiar with Zig, and there may be limitations that prevent a comptime expectError).


AFAIK this feature is available natively in Rust (documentation tests with the compile_fail attribute) and via libraries in Scala (Shapeless illTyped) and Haskell (should-not-typecheck).

$\endgroup$
1
10
$\begingroup$

TypeScript 3.9+ has a feature for this: // @ts-expect-error comments. In an example like yours, it would be:

type Obj = {bla: string, foo: string};

const getObj = <T extends keyof Obj> (id: number, fields: Array<T>): Pick<Obj, T> => {
    throw 'empty body';
}

// @ts-expect-error Property 'foo' should not exist
getObj(12, ["bla"]).foo

These comments are checked by the compiler, so that a compilation error occurs if the line following the comment would not otherwise raise a compilation error. It's recommended to add a description to explain why an error is expected.

TypeScript's approach has some limitations:

  • // @ts-expect-error comments are checked by the compiler, not the test runner, so tests are effectively run at build-time rather than alongside your other tests. This might also mean not taking advantage of the nice UI in your IDE for test results.
  • It's not possible to make more granular assertions about what kind of error should occur. For example, you might want to assert that a call like foo(1, 2) fails because the arguments are numbers, but the test would also pass if the call fails from having the wrong number of arguments. This is a solvable problem, but it would require inventing a syntax for describing errors.
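
One partial workaround for the granularity problem is to assert on the type itself rather than on the error. The following is only a sketch: `HasKey`, `probe`, `Result`, `blaPresent` and `fooAbsent` are hypothetical names introduced here for illustration, not anything TypeScript provides. Each annotated `const` only type-checks if the conditional type evaluates to the literal being assigned, so the build breaks if `foo` ever becomes accessible on the result.

```typescript
type Obj = { bla: string; foo: string };

const getObj = <T extends keyof Obj>(id: number, fields: Array<T>): Pick<Obj, T> => {
    throw new Error('empty body');
};

// Hypothetical helper: `true` if K is a key of O, `false` otherwise.
type HasKey<O, K extends PropertyKey> = K extends keyof O ? true : false;

// Never called; it exists only so the inferred return type can be inspected.
const probe = () => getObj(12, ['bla']);
type Result = ReturnType<typeof probe>;

// These assignments compile only if the annotated type matches the literal.
const blaPresent: HasKey<Result, 'bla'> = true;
const fooAbsent: HasKey<Result, 'foo'> = false;
```

Unlike `// @ts-expect-error`, this fails for exactly one reason (the key set of the result type is wrong), but it is still checked at build time rather than by the test runner.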
$\endgroup$
3
  • 2
    $\begingroup$ It would also require a stable error detection implementation in the compiler which is a challenge in practice if the language is still evolving. $\endgroup$
    – feldentm
    Commented Feb 5 at 20:03
  • 1
    $\begingroup$ @feldentm That's true, but it's fundamental to the problem of testing for compiler errors, and can itself be a reason for wanting such tests. If the compiler changes such that something which used to be an error is no longer an error, but it is desired to be an error, then the test is useful for indicating that some of the types need to change in order to make it an error again. $\endgroup$
    – kaya3
    Commented Feb 5 at 22:02
    $\begingroup$ True, but if this is exposed as a language feature, then changes have a much larger blast radius. $\endgroup$
    – feldentm
    Commented Feb 6 at 17:38
4
$\begingroup$

Do any languages have systems that allow testing code that does not compile?

Yes, several languages make it possible to compile code during runtime.

  • Perl has the eval function, which takes a string and compiles it at runtime (implicitly, before running that code). It makes it easy for the caller to handle any kind of error condition (compilation error or execution error).
  • Ruby has the Kernel.eval method, which allows much the same. Like everything in Ruby, it relies on the usual exception handling, leaving it up to the caller to see whether there were compilation errors.
  • Java can do this too, albeit more involved and less integrated into the base language, by exposing the Java compiler as user-callable code.
  • For your own example, JavaScript has eval() too. (And for TypeScript there is a transpile() method, though it only translates the code and does not type-check it.)
  • Even low-level languages like C, which most definitely do not include a compiler in their executables, have long used the method of simply calling the C compiler as a sub-process on a piece of source code that the user program has created dynamically. This is heavily used in, for example, automake and similar build systems, to test the compilation environment of some Unix tool and figure out which of a myriad of options that environment offers. This method can arguably be used in every language that can run sub-processes, wherever the compilation environment is available while the tests are running (which may or may not be trivial if you are running your tests in an otherwise "lean" CI/CD runner environment, but is certainly possible to achieve).

Using these tools, it is straightforward to write tests that make sure compilation fails. In pseudocode:

test() {
  got_error = false;
  try {
    eval("this does not compile!");
  } catch (CompilationException) {    
    got_error = true;
  }
  assert(got_error);
}

Adjust to taste.
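
As a concrete, hedged version of that pseudocode for the JavaScript case: the sketch below uses the standard `Function` constructor, which parses its argument without running it and throws a `SyntaxError` on invalid source. The name `failsToParse` is made up for this example. Note that this only detects JavaScript syntax errors; it says nothing about TypeScript type errors.

```typescript
// Returns true when `source` is rejected by the JavaScript parser.
function failsToParse(source: string): boolean {
    try {
        // Parses `source` as a function body; the body is never executed.
        new Function(source);
        return false;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return true;
        }
        // Anything else is unexpected, so let it propagate.
        throw e;
    }
}
```

A test would then assert that `failsToParse("this does not compile!")` is true and that a valid body such as `"return 1 + 1;"` gives false.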

$\endgroup$
2
  • $\begingroup$ Nitpick: You can't eval typescript code. You'd need to run the typescript as a sub-process like your C answer $\endgroup$
    – mousetail
    Commented Feb 6 at 11:03
  • $\begingroup$ Nitpick on your nitpick: there seems to be a transpile() which seems to be doing exactly that. I have added/clarified it. It's not the most important detail of the answer as OP was adamant in it being a generic question, with TS just an example; anyways. ;) $\endgroup$
    – AnoE
    Commented Feb 6 at 15:06
3
$\begingroup$

In C++, and possibly other languages, we have the Detection Idiom. This relies on SFINAE or Concepts to determine whether template code can be instantiated, yielding a bool which you can static_assert() as a test.

I don't have any examples to show, though I do intend to experiment a bit with this and perhaps create some utility code to make it easier to use.

$\endgroup$
1
  • 1
    $\begingroup$ Yes, that's the Concepts version of the detection idiom that I'm describing. $\endgroup$ Commented Feb 10 at 15:50
1
$\begingroup$

Tyr had noCompile as a test category even in the first published version (example). The question can be split into two aspects.

The first is regular types that exclude certain values. Positive, nonzero and nonnull would be examples of such excluding subtypes. Implementation-wise, they behave mostly like their base types, except that they need to carry a proof that excluded values cannot reach values of the type. If this is not possible, the implementation can still resort to a runtime check or simply fail. While this category is mentioned explicitly in the question, it is not really about failing compilation. Personally, I would always have such checks as runtime checks, because type systems usually lack the expressiveness, i.e. analysis quality, to make such features useful in real-world code bases.

The second aspect is actual failing compilation. Here, one must be aware of what not compiling means for the compiler. When it comes to syntax errors, there might be no reasonable way to recover from the situation and return to the boundaries of the noCompile expression or block or whatever is offered. When it comes to type errors, the compiler needs thorough tracking of what made it reach that type. Also, some errors are there to protect the compiler from running into an infinite loop. Getting out of such a loop can mean aborting translation in a way that might not be suitable for recovery. Finally, and perhaps unexpectedly, evaluation of malformed code can result in the compiler's data structures, such as the type context, getting poisoned with nonsense. Depending on the error class, it can be hard or even impossible to allow a compiler to execute the next phase or revert all its changes inside the noCompile context so that it can produce a usable binary in the end. An example of such an issue is a malformed template type that results in an infinite materialization chain that gets aborted by a limit. This will itself cause an error in a phase requiring all partly materialized types to get fully materialized.

If you are thinking about adding such a feature to your language, have a look at the implementation. If your error category is implemented by just adding an error message to a list and continuing with compilation afterwards, you will likely be fine. If you throw an exception and hope for some surrounding context to abort the member or wherever you can continue, you may want to exclude that category from such a feature.

TBH, I do not see why one would use a feature like noCompile outside of tests. The primary reason to allow regular programmers to write such tests is to ensure that APIs do not have certain properties like revealing internal state.

$\endgroup$
1
0
$\begingroup$

Compile-time errors due to typing usually occur in strongly-typed languages. But if a type is too complicated for a compiler, perhaps a runtime type evaluation is needed.

I don't believe this exists in a language as of now (then again, there are thousands of esolangs), but perhaps some kind of compile/runtime flag?

proposal (probably terrible)

When compiling, there could be a -B (for "broken") flag that ignores type errors and switches to a duck-typing stance. While the types might technically not work, the untyped (or weakly typed) code still might, which is the whole idea behind duck typing.

$\endgroup$
2
  • $\begingroup$ How will this let me assert that some construct fails to compile? $\endgroup$
    – mousetail
    Commented Feb 5 at 14:45
  • 1
    $\begingroup$ I…think I answered a slightly different question. Testing whether something compiles or not is something different entirely. $\endgroup$
    – Someone
    Commented Feb 5 at 15:09
