Is there a downside to using offsets instead of raw pointers in a virtual machine?

Asked 1 month ago

Modified 1 month ago

Viewed 203 times

Say I'm designing a virtual machine for a bytecode compiler/interpreter, using C as the implementation language. Some kind of “tagged” representation of values is simplest for this language, where every object carries information about what kind of object it is. Most objects are heap-allocated, so their usual representation at runtime is as a pointer.

A common way of doing tagged pointers is to allocate with a certain alignment so that some of the least significant bits don't matter; the tag goes in those bits. However, as I understand it, this is sketchy in C as far as portability is concerned (at least, that's why the Lua developers decided to use a normal union).
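To make the low-bit scheme concrete, here is a minimal sketch, assuming allocations are at least 8-byte aligned and a 3-bit tag. The names `TAG_BITS`, `tag_ptr`, `untag_ptr`, and `ptr_tag` are hypothetical and not taken from Lua or any other VM. The portability concern shows up in the casts: `uintptr_t` is an optional type, and the result of converting a pointer to an integer is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names and widths, chosen for illustration only. */
#define TAG_BITS 3u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u))

/* Pack a small tag into the low bits of an aligned pointer. */
static uintptr_t tag_ptr(void *p, unsigned tag) {
    uintptr_t bits = (uintptr_t)p;      /* implementation-defined conversion */
    assert((bits & TAG_MASK) == 0);     /* assumes 8-byte alignment */
    return bits | (tag & TAG_MASK);
}

/* Recover the original pointer by clearing the tag bits. */
static void *untag_ptr(uintptr_t v) {
    return (void *)(v & ~TAG_MASK);
}

/* Extract the tag stored in the low bits. */
static unsigned ptr_tag(uintptr_t v) {
    return (unsigned)(v & TAG_MASK);
}
```

With this layout, every dereference has to go through `untag_ptr` first, and the tag can never be wider than the alignment allows.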

Is there a reason not to use offsets from the start of my language's heap instead of “raw” pointers? Tagging offsets is easy, and I have better control over how the tagging works (e.g. it can be in the high bits and it can be more than a few bits wide).

I feel that heap[p] isn't too much worse than *p; in fact, the intent is clearer in my opinion. Also fewer casts would be necessary.
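A minimal sketch of the offset scheme, assuming a 64-bit value word with an 8-bit tag in the high bits and a 56-bit word offset in the low bits. The heap here is a fixed-size static array purely for illustration, and every name (`value_t`, `make_ref`, `ref_tag`, `ref_offset`, `load`, `store`) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: top 8 bits hold the tag, low 56 bits a word offset. */
typedef uint64_t value_t;

#define TAG_SHIFT   56
#define OFFSET_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

enum { HEAP_WORDS = 1024 };
static value_t heap[HEAP_WORDS];    /* the VM heap, addressed by offset */

/* Build a tagged reference from a tag and a heap offset. */
static value_t make_ref(unsigned tag, uint64_t offset) {
    assert(offset <= OFFSET_MASK);
    return ((value_t)(tag & 0xFFu) << TAG_SHIFT) | offset;
}

static unsigned ref_tag(value_t v)    { return (unsigned)(v >> TAG_SHIFT); }
static uint64_t ref_offset(value_t v) { return v & OFFSET_MASK; }

/* heap[p] takes the place of *p; no pointer/integer casts are involved. */
static value_t load(value_t ref) {
    uint64_t off = ref_offset(ref);
    assert(off < HEAP_WORDS);
    return heap[off];
}

static void store(value_t ref, value_t x) {
    uint64_t off = ref_offset(ref);
    assert(off < HEAP_WORDS);
    heap[off] = x;
}
```

Everything here is ordinary unsigned integer arithmetic, so the tag width and position are free choices rather than consequences of allocator alignment.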

One potential objection is that I'd have to do my own memory management. But my language already requires garbage collection, so that isn't too bad.

A recent question is related to this one, but it does not really help me: both the question and the existing answer take the use of an offset instead of a pointer for granted, and the answer addresses only that question's specific modification of the offset representation.

I'd like to clarify here that I am not interested in pointer compression or saving space. I really don't care about “wasted” bits in a 64-bit word; I don't plan on significantly limiting the heap size available to programs in my language, so the larger the address space the better. If it turns out to be a problem I can consider compression mechanisms. For now I find it simpler to reserve some bits for tagging and let the rest of the word be an address or offset.

edited Jun 14 at 18:00

asked Jun 13 at 17:12

texdr.aft


This is discussed at some length on the V8 dev blog: Pointer Compression in V8.
– kaya3
Commented Jun 13 at 17:42

Does this answer your question? Compressed pointers, why not "relative" rather than "base" encoding?
– Greg Hewgill
Commented Jun 13 at 20:08

@GregHewgill It's closely related, but I'm not asking about compressed pointers and it focuses on an alteration to the basic “offset instead of pointer” scheme. I'm not really interested in trying to save space. An answer to that question might be an answer to this one, but the existing answer addresses primarily the specific concerns with relative offsets. However, the viability of offsets as a strategy is implicitly confirmed by the cited real-world uses, which is good to know.
– texdr.aft
Commented Jun 13 at 20:21

Possibly a slight hit to performance. Every memory reference will need an extra add to compute heap+p (in addition to masking off the tag bits of p, but that would apply to tagged raw pointers too). But some machines have addressing modes where the add could come for free. Also, you'll need to load the heap pointer frequently, and probably spend a register in most functions to keep it around. The overall impact may or may not be significant; only profiling can tell you for sure.
– Nate Eldredge
Commented Jun 14 at 5:52

I think the "common" implementation is a legacy of implementations from decades ago, where the performance hit was more significant. The heap+offset mechanism is probably more acceptable today.
– Barmar
Commented Jun 15 at 21:54

Tagged: implementation, memory-management, pointers, virtual-machine
