
I've recently been looking at The Java Virtual Machine Specification (JVMS) to try to better understand what makes my programs work, but I've found a section that I'm not quite getting...

Section 4.7.4 describes the StackMapTable Attribute, and in that section the document goes into details about stack map frames. The issue is that it's a little wordy and I learn best by example; not by reading.

I understand that the first stack map frame is derived from the method descriptor, but I don't understand how (which is supposedly explained here.) Also, I don't entirely understand what the stack map frames do. I would assume they're similar to blocks in Java, but it appears as though you can't have stack map frames inside each other.

Anyway, I have two specific questions:

  • What do the stack map frames do?
  • How is the first stack map frame created?

and one general question:

  • Can someone provide an explanation less wordy and easier to understand than the one given in the JVMS?
  • @EJP It's something I'm working on. That's one of the main reasons I decided to read the JVMS in the first place.
    – Steven
    Commented Aug 3, 2014 at 23:48
  • @EJP I've also been reading the JVM spec, and believe me, it is not just a matter of reading the spec. For instance, to understand how the type verification works (which is what this question is about) you need a basic knowledge of Prolog programming... so I think a question/answer on this is worth having on Stack Overflow.
    – morgano
    Commented Aug 3, 2014 at 23:56
  • @morgano, I find that it's much more helpful to ignore the Prolog stuff and focus on the classic inference verifier. The new verifier is very similar; they just decided to specify it in 200 pages of Prolog instead of using a vague English description like the old one.
    – Antimony
    Commented Aug 4, 2014 at 1:09
  • @Antimony Exactly, that is what this question is about: translating the formal specification into plain English.
    – morgano
    Commented Aug 4, 2014 at 1:11
  • I have tried to explain them comprehensively here volatileinterface.com/…
    Commented Sep 12, 2015 at 18:16

1 Answer

Java requires all classes that are loaded to be verified, in order to maintain the security of the sandbox and ensure that the code is safe to optimize. Note that this is done at the bytecode level, so verification does not check invariants of the Java language; it merely checks that the bytecode makes sense according to the rules for bytecode.

Among other things, bytecode verification makes sure that instructions are well formed, that all the jumps are to valid instructions within the method, and that all instructions operate on values of the correct type. The last one is where the stack map comes in.

The thing is that bytecode by itself contains no explicit type information. Types are determined implicitly through dataflow analysis. For example, an iconst instruction creates an integer value. If you store it in slot 1, that slot now holds an int. If control flow merges with code that stores a float there instead, the slot is now considered to have an invalid type, meaning that you can't do anything more with that value until it is overwritten.
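To make the merge rule concrete, here is a minimal sketch in Java. It is not the verifier's actual code: the class name `FrameMerge` and the string type names are made up for illustration, and it assumes both incoming paths have the same number of slots. It only handles primitive types; the classic inference verifier also merges reference types to a common superclass, which is ignored here.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameMerge {
    // "top" is used here as the name of the invalid (unusable) type.
    static String merge(String a, String b) {
        return a.equals(b) ? a : "top";
    }

    // Merge the local variable types arriving from two control flow paths, slot by slot.
    // Assumption: both lists have the same length.
    static List<String> mergeLocals(List<String> pathA, List<String> pathB) {
        List<String> merged = new ArrayList<>();
        for (int i = 0; i < pathA.size(); i++) {
            merged.add(merge(pathA.get(i), pathB.get(i)));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Slot 0 is an int on both paths; slot 1 is an int on one path and a float on the other.
        System.out.println(mergeLocals(List.of("int", "int"), List.of("int", "float")));
        // prints [int, top]
    }
}
```

After the merge, slot 1 can still be written to, but reading it as either an int or a float would be rejected.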

Historically, the bytecode verifier inferred all the types using these dataflow rules. Unfortunately, it is impossible to infer all the types in a single linear pass through the bytecode because a backwards jump might invalidate already inferred types. The classic verifier solved this by iterating through the code until everything stopped changing, potentially requiring multiple passes.
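The iterate-until-nothing-changes idea can be sketched with a toy worklist algorithm in Java. Everything here is a simplification I made up for illustration: the class name `FixedPointInference`, the "program" consisting of basic blocks that optionally store a type into a single local slot, and the string type names. It is not how HotSpot implements the classic verifier.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class FixedPointInference {
    // Merge two types; null means "no information yet", "top" means invalid.
    static String merge(String a, String b) {
        if (a == null) return b;
        return a.equals(b) ? a : "top";
    }

    // stores[b]     : type that block b stores into the slot, or null if it stores nothing
    // successors[b] : indices of the blocks that block b can flow to
    // Returns the inferred type of the slot on entry to each block.
    static String[] infer(String[] stores, int[][] successors, String entryType) {
        String[] in = new String[stores.length];
        in[0] = entryType;
        Deque<Integer> worklist = new ArrayDeque<>();
        worklist.add(0);
        while (!worklist.isEmpty()) {
            int b = worklist.poll();
            String out = (stores[b] != null) ? stores[b] : in[b];
            for (int s : successors[b]) {
                String merged = merge(in[s], out);
                if (!merged.equals(in[s])) {
                    in[s] = merged;   // the type at s changed, so s must be revisited
                    worklist.add(s);
                }
            }
        }
        return in;
    }

    public static void main(String[] args) {
        // Block 0: entry. Block 1: loop head. Block 2: loop body, stores a float,
        // then jumps back to the loop head. Block 3: exit.
        String[] stores = {null, null, "float", null};
        int[][] successors = {{1}, {2, 3}, {1}, {}};
        System.out.println(Arrays.toString(infer(stores, successors, "int")));
        // prints [int, top, top, top]
    }
}
```

On the first visit the loop head (block 1) is inferred to hold an int. The backwards jump from block 2 then invalidates that result, so blocks 1, 2 and 3 all have to be processed a second time.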

However, verification makes class loading slow in Java. Oracle decided to solve this issue by adding a new, faster verifier that can verify bytecode in a single pass. To do this, they required all new classes starting in Java 7 (with Java 6 in a transitional state) to carry metadata about their types. Since the bytecode format itself can't be changed, this type information is stored separately in an attribute called StackMapTable.

Simply storing the type of every single value at every single point in the code would obviously take up a lot of space and be very wasteful. In order to make the metadata smaller and more efficient, they decided to have it only list the types at positions which are targets of jumps (exception handlers count as jump targets too). If you think about it, this is the only place you need the extra information to do single pass verification. In between jump targets, all control flow is linear, so you can infer the types at the positions in between using the old inference rules.

Each position where types are explicitly listed is known as a stack map frame. The StackMapTable attribute contains a list of frames in order, though they are usually expressed as a difference from the previous frame in order to reduce data size. If there are no frames in the method, which occurs when control flow never joins (i.e. the CFG is a tree), then the StackMapTable attribute can be omitted entirely.
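As a rough illustration of the "difference from the previous frame" encoding, the sketch below picks one of the frame kinds defined in JVMS 4.7.4 for a frame, given the locals of the previous frame. The frame kind names are the ones from the spec; the class name `FrameDelta`, the method, and the string type names are hypothetical. It ignores the bytecode offsets and the `_extended` variants, and treats each list entry as one verification type.

```java
import java.util.List;

public class FrameDelta {
    // Choose a compact frame kind for (locals, stack) relative to the previous frame's locals.
    static String classify(List<String> prevLocals, List<String> locals, List<String> stack) {
        if (locals.equals(prevLocals)) {
            if (stack.isEmpty()) return "same_frame";
            if (stack.size() == 1) return "same_locals_1_stack_item_frame";
        }
        if (stack.isEmpty()) {
            int diff = locals.size() - prevLocals.size();
            // One to three locals added at the end, everything before them unchanged.
            if (diff >= 1 && diff <= 3
                    && locals.subList(0, prevLocals.size()).equals(prevLocals)) {
                return "append_frame";
            }
            // The last one to three locals dropped, everything before them unchanged.
            if (diff <= -1 && diff >= -3
                    && prevLocals.subList(0, locals.size()).equals(locals)) {
                return "chop_frame";
            }
        }
        // Anything else has to spell out all locals and stack entries.
        return "full_frame";
    }

    public static void main(String[] args) {
        System.out.println(classify(List.of("int"), List.of("int", "float"), List.of()));
        // prints append_frame
    }
}
```

For the first explicit frame in the table, the "previous frame" is the implicit initial frame of the method.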

So this is the basic idea of how StackMapTable works and why it was added. The last question is how the implicit initial frame is created. The answer of course is that at the beginning of the method, the operand stack is empty and the local variable slots have the types given by the types of the method parameters, which are determined from the method descriptor.

If you're used to Java, there are a few minor differences in how method parameter types work at the bytecode level. First off, instance (non-static) methods have an implicit this as the first parameter. Second, boolean, byte, char, and short do not exist as local variable or stack types at the bytecode level. Instead, they are all implemented as ints behind the scenes.
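Putting those rules together, here is a small sketch in Java of how the initial locals could be derived from a method descriptor. The class name `InitialFrame` and the string type names are my own, and the sketch makes some simplifying assumptions: the descriptor is well formed, object and array types are left in descriptor form, and constructors (where this starts out uninitialized) are not treated specially. A long or double takes two local variable slots, shown here as the type followed by "top".

```java
import java.util.ArrayList;
import java.util.List;

public class InitialFrame {
    // owner      : descriptor of the class declaring the method, e.g. "LExample;"
    // descriptor : method descriptor, e.g. "(ZJLjava/lang/String;[D)V"
    static List<String> initialLocals(String owner, String descriptor, boolean isStatic) {
        List<String> locals = new ArrayList<>();
        if (!isStatic) {
            locals.add(owner); // implicit 'this' occupies slot 0
        }
        int i = 1; // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            int start = i;
            while (descriptor.charAt(i) == '[') i++;          // array dimensions
            if (descriptor.charAt(i) == 'L') {
                i = descriptor.indexOf(';', i);               // object type ends at ';'
            }
            i++;
            String type = descriptor.substring(start, i);
            switch (type) {
                case "Z": case "B": case "C": case "S": case "I":
                    locals.add("int");                        // small types are all ints
                    break;
                case "F":
                    locals.add("float");
                    break;
                case "J":
                    locals.add("long"); locals.add("top");    // two slots
                    break;
                case "D":
                    locals.add("double"); locals.add("top");  // two slots
                    break;
                default:
                    locals.add(type);                         // object or array type
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        // Instance method: void m(boolean a, long b, String c, double[] d)
        System.out.println(initialLocals("LExample;", "(ZJLjava/lang/String;[D)V", false));
        // prints [LExample;, int, long, top, Ljava/lang/String;, [D]
    }
}
```

The operand stack of the initial frame is always empty, so the locals are the only part that needs to be computed.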

  • As an amendment to your last paragraph, long and double parameters will, like for all local variables, consume two local variables in the stack frame.
    – Holger
    Commented Aug 4, 2014 at 11:01
  • I'm pretty new to this bytecode stuff, but if I'm writing an app which has a fixed list of variables, is defining frames for jumps strictly necessary? My ASM Eclipse plugin inserts frames, but it seems the code works just fine without them, and the program is using both if and do-while.
    – ThomasRS
    Commented Aug 28, 2018 at 22:41
  • New indeed, I was not aware that ASM can be configured to automatically insert the frames!
    – ThomasRS
    Commented Sep 12, 2018 at 12:40
  • @ThomasRS Yes, ASM can automatically compute frames (although it's twice as expensive as computing them manually in many cases, but from a friend's experience, calculating frames manually is a pain).
    – arviman
    Commented May 12, 2020 at 5:35
  • I just don't know why they named it stack map frame. It makes me think that it's related to frames in the JVM stack!
    – sify
    Commented Oct 24, 2022 at 9:57
