
Background

Trilangle is an interpreted 2-D esolang that I made in February and March of this year. At some point, I had the idea to implement an ahead-of-time compiler for the language, and what resulted was a "disassembler" feature to convert it to an assembly-like syntax.

For example, when passing the cat program with the flags -Dn, the output is as follows:

4:      GTC
5:      BNG 12
7:      PTC
9:      POP
11:     JMP 2
12:     EXT

This syntax has been useful for writing programs, and the disassembler has been an incredibly helpful debugging tool. It could be translated into bytecode or machine code far more easily than the original source could, though I haven't implemented that yet.

I attempted to implement this as a depth-first traversal of possible instruction pointer states with loop detection. However, it's not the cleanest code; it traverses the program twice in a way that's probably not necessary.

When I added the threading operator in version 1.1, it was an entirely new kind of instruction I had to add support for in the disassembler. The code became a bit entangled, due to the way the threading operator works.

The threading operator represents one of three operations, depending on which direction the IP is moving; parallel "threads" execute round-robin style. The three operations are listed below, followed by a short hypothetical disassembly fragment:

  • "thread kill" TKL. This can be followed by anything or nothing, similar to EXT, as once the interpreter hits this it won't keep going.
  • "thread spawn" TSP. This takes a label as an argument, similar to BNG and JMP. When the interpreter reaches this, the current thread splits in two; one picks up on the instruction after the TSP, and the other starts at the given label.
  • "thread join" TJN. When one thread reaches this, it waits for a second thread to reach the same one, and then they merge their stacks and continue as a single thread.

The language lacks explicit looping/goto instructions; instead, loops are achieved by using the IP redirection instructions to get to somewhere you've already been. The threading operator is special, however: depending on which direction you're approaching it in, this may represent a "thread join" operation rather than a simple loop. Both situations mark one entry point as a jump to the other, but thread join also inserts a TJN instruction at the location being jumped to. Once I thought I had it working well enough, I dropped this comment at the top of the file:

// FIXME: This class is held together by an uncomfortable amount of spaghetti. It seems to work, for now, but it's prone
// to error if you look at it funny. It is nontrivial to add new kinds of operations to it. The entire class may have to
// be redesigned from the ground up.

In the wise words of Tom Scott, "That was a problem for future me, and now I am future me."

I discovered a bug involving the threading operator recently. While I fixed the particular example I cited in that issue, I did so with a seemingly unnecessary check, and I would feel much better about tearing that code out and starting over.

The Question

So, with all of that out of the way:

Was my original idea (a depth-first traversal of IP states) even the correct approach? If so, how do I implement loop and thread-join detection in an extensible way? If not, how should I approach it?

  • By my understanding, your question is about how to translate your language's unusual control flow into an assembly language which uses jumps. Is that correct, or does your question relate to other aspects of your language besides control flow?
    – kaya3
    Commented May 19, 2023 at 4:29
  • Yes, that sounds correct. I don’t think multithreading/parallelism tags apply here, despite the problematic operation being called “threadjoin”, as the compiler traverses the code serially.
    – Bbrk24
    Commented May 19, 2023 at 4:34
  • Does every "thread" execute one command each tick in round-robin fashion, or are the threads allowed to interleave arbitrarily (as in true threads)? You might want to clarify what the interpreter would do with { (in Trilangle) and TSP/TJN/TKL (in disassembly).
    – Bubbler
    Commented May 19, 2023 at 4:55
  • I guarantee round-robin execution, but not the order of threads within each tick. I’ll edit the post to clarify.
    – Bbrk24
    Commented May 19, 2023 at 11:19

1 Answer


A depth-first (or breadth-first, it doesn't matter) search seems to be the most reasonable approach.

Unlike most other 2-D languages, Trilangle does not seem to have any self-modification operators, so building a graph of all possible instruction pointer locations is quite doable.

Every instruction can be radically different depending on which direction you approach it from. I'd start by creating a list of program fragments, and a hash map from (source position, direction) pairs to locations in your assembly:

I'm using vaguely Rust-like pseudocode here, as I'm not super familiar with C++.

// All of the program fragments. We don't know how long each of them is, so we have a list of lists.
// If an item is None, it hasn't been compiled yet; those entries form our to-do list.
let program_fragments: List<Option<List<Instruction>>> = [None];
// A mapping from locations in the source code to locations in the assembly.
// Each value is a tuple of (fragment index, index within that fragment).
let program_location_map: HashMap<(SourceCodePosition, Direction), (integer, integer)> = HashMap::new();

program_location_map[(initial_position, initial_direction)] = (0, 0);

Now it's just a matter of iterating over the uncompiled but reachable locations until none remain. Something like this:

// This hypothetical function returns the (location, direction) at which the first uncompiled
// but reachable fragment starts, plus that fragment's index in the fragment list.
'outer: while let Some(((location, direction), fragment_index)) = get_first_uncompiled_reachable_area() {
    let mut instructions = [];

    while !instruction_at(location, direction).is_branching_instruction() {
        // Record where this source position ends up in the assembly, then emit the instruction.
        program_location_map[(location, direction)] = (fragment_index, len(instructions));
        instructions.add(instruction_at(location, direction));
        (location, direction) = next_instruction(location, direction);

        // Join check: if we reach a place we have been before (or that is already queued),
        // just jump to its pre-computed location and move on to the next fragment.
        if program_location_map.contains((location, direction)) {
            instructions.add(Jump(program_location_map[(location, direction)]));
            program_fragments[fragment_index] = Some(instructions);
            continue 'outer;
        }
    }

    // Handle the branching instruction.
    // Threading counts as branching; program termination is branching with 0 branches.
    let instruction = instruction_at(location, direction);
    program_location_map[(location, direction)] = (fragment_index, len(instructions));

    // Queue up the branch targets to be compiled later.
    let destinations = instruction.branch_targets().map(|target| {
        // If the destination has already been compiled (or queued), just branch to it...
        if program_location_map.contains((target.new_location, target.new_direction)) {
            return program_location_map[(target.new_location, target.new_direction)];
        } else {
            // ...otherwise reserve a new, not-yet-compiled fragment for it.
            program_location_map[(target.new_location, target.new_direction)] = (program_fragments.length, 0);
            program_fragments.add(None);
            return (program_fragments.length - 1, 0);
        }
    });

    // We add the branching instruction itself, but since we don't yet know where its targets
    // will end up, we give it pseudo-destinations that get resolved to actual locations after
    // all the fragments are concatenated.
    instructions.add(instruction.with_pseudo_destinations(destinations));

    program_fragments[fragment_index] = Some(instructions);
}

This gives a list of branchless bits of assembly code; any jumping instructions are placeholders. At this point, you can just concatenate all the fragments and resolve all the pseudo-jumps and branches to actual jumps and branches in the new instruction list.
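As a minimal sketch of that final pass, here it is in actual Rust rather than pseudocode (the Instruction type and the (fragment, offset) pseudo-destination encoding are assumptions invented for this example, not anything from Trilangle): first compute where each fragment will start in the concatenated list, then rewrite every pseudo-destination into an absolute address.

// Sketch only: these types are assumptions for this example, not Trilangle's real ones.
enum Instruction {
    Simple(String),            // e.g. "GTC", "PTC"
    PseudoJump(usize, usize),  // (fragment index, offset within that fragment)
}

fn link(fragments: Vec<Vec<Instruction>>) -> Vec<Instruction> {
    // First pass: compute where each fragment will start once everything is concatenated.
    let mut starts = Vec::with_capacity(fragments.len());
    let mut offset = 0;
    for fragment in &fragments {
        starts.push(offset);
        offset += fragment.len();
    }

    // Second pass: concatenate, rewriting every pseudo-destination to an absolute address.
    fragments
        .into_iter()
        .flatten()
        .map(|instruction| match instruction {
            Instruction::PseudoJump(fragment, index) => {
                Instruction::Simple(format!("JMP {}", starts[fragment] + index))
            }
            other => other,
        })
        .collect()
}

A real branching instruction would carry several pseudo-destinations rather than a single one, but the resolution step is the same; for a disassembler you would emit a label at starts[fragment] + index instead of rewriting the instruction.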

If you only want to create a disassembler, you can just label the branch destinations and you are done.

However, if you want to make an actual compiler to machine code, your fake assembly needs to be converted to actual assembly. Many of your instructions are a bit complex, so each will likely expand into a significant number of assembly instructions. For very complex operations like your threading, you might want to consider turning them into calls to functions that you distribute separately.

The rest of the compilation should proceed the same as for any other language.

On your threading operators

Threads can be implemented as just branching operators. Making multiple threads work in round-robin style in assembly is quite hard, as the assembly code is totally different depending on which combination of threads is running at the same time, and there could potentially be an infinite number of combinations.

Instead of inserting alternating instructions in the assembly, I'd suggest only ordering the I/O instructions. E.g. if thread one needs to output something after 7 instructions and thread two after 14, use a locking mechanism to ensure the first outputs first, but otherwise let the rest of the code execute simultaneously. In the case of an error this would affect the observable behavior, though.

You can keep a variable per thread that keeps track of how far along it is. That way, an I/O operation can wait until every other thread has provably advanced past the point where it occurs:

// Per-thread counters (pseudocode):
//   io_index       - how many instructions this thread is known to have executed
//   block_io_index - the instruction count at the end of the current fragment
let thread.io_index;
let thread.block_io_index;

// Called when a thread finishes a fragment that contains no I/O.
fn block_end(block) {
    thread.block_io_index += block.length;
    thread.io_index = thread.block_io_index;
}

// Called just before a thread performs an I/O instruction.
fn io_operation(operation) {
    // offset_in_block is this operation's position within the current fragment.
    thread.io_index = thread.block_io_index + operation.offset_in_block;
    // Wait until no other thread could still perform an earlier I/O operation.
    while other_threads.any(|t| t.io_index < thread.io_index) {
        wait();
    }

    perform_io_operation();
}

Possible extra optimizations

  • Re-arranging your program fragments in such an order as to minimize the branching needed
  • Removing sequences of stack operators from the assembly of each segment, and finding the most efficient set of operators that gives the same result
  • Automatically detecting any identical fragments and consolidating them (a rough sketch of this follows the list)
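For that last point, a rough sketch of one way to do it, again in actual Rust with names invented for this example (it assumes a fragment is just a list of hashable instructions): hash each fragment's contents and reuse an existing index whenever a duplicate shows up, keeping a remap table so branch destinations can be rewritten afterwards.

use std::collections::HashMap;

// Sketch only: deduplicate identical fragments by content. Returns the unique fragments
// plus a remap table (old fragment index -> new fragment index) for fixing up branch
// destinations afterwards.
fn deduplicate(fragments: Vec<Vec<String>>) -> (Vec<Vec<String>>, Vec<usize>) {
    let mut seen: HashMap<Vec<String>, usize> = HashMap::new();
    let mut unique: Vec<Vec<String>> = Vec::new();
    let mut remap = Vec::with_capacity(fragments.len());

    for fragment in fragments {
        let index = *seen.entry(fragment.clone()).or_insert_with(|| {
            unique.push(fragment);
            unique.len() - 1
        });
        remap.push(index);
    }
    (unique, remap)
}

Since a fragment's branch destinations are part of its contents, only fragments that also branch to the same places get merged; running the pass again after rewriting destinations can catch more.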
  • That get_first_uncompiled_reachable_area is doing a lot of heavy lifting in your code snippet, and off the top of my head I'm not sure how to implement that. I'll think about it some and see if this insight helps, but it seems like this just interleaves the double-traversal rather than removing it. Still, compiling each branch individually is good insight, so I'll see what I can do with that.
    – Bbrk24
    Commented May 19, 2023 at 11:35
  • @Bbrk24 that part is just a simple for loop. The information is in the array and the map
    – mousetail
    Commented May 19, 2023 at 11:36
  • Oh yeah, I see now that it can be implemented by just scanning instructions. That does make that part clearer. I'll still wait to accept this (or any answer) until I actually implement this in case there's some flaw here I haven't noticed yet.
    – Bbrk24
    Commented May 19, 2023 at 11:40
  • @Bbrk24 I added a part about how to handle your round-robin threading, since I assumed no synchronization earlier
    – mousetail
    Commented May 19, 2023 at 11:56
  • You can consider joins like other I/O instructions; they can use the same ordering mechanism
    – mousetail
    Commented May 19, 2023 at 12:00
