15

I've programmed in Java for about 8 years and I know the language quite well as a developer, but my goal is to deepen my knowledge of the internals. I've taken undergraduate courses in PL design, but they were very broad academic overviews (in Scheme, IIRC).

Can someone suggest a route to start delving into the details? Specifically, are there particular topics (say, garbage collection) that might be more approachable or make a good starting point? Is there a decent high-level book on the internals of the JVM and the design of the Java programming language? My current plan is to start with the JVM spec and research as needed.

1
  • 2
    My personal approach would be (and is) to find the real reason behind all those little "why exactly does it work like this" cases in Java. How is auto-boxing defined? How do generic types work? What about var-args? What does the SUPER flag in the class files actually do? Most of that is described in the JVM spec itself, but it requires some work to get it out of there and into your brain ;-) Commented Jun 1, 2012 at 12:05
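A few of those "why" cases can be poked at directly from plain Java before opening the spec. The sketch below is only an illustration: the class name `WhyCases` and its helper methods are made up for this example. It relies on three documented behaviours: boxing of small `int` values (-128 to 127) yields shared `Integer` objects, generic type arguments are erased at run time, and a var-args parameter is an ordinary array parameter with a flag set on the method.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are not from the thread.
public class WhyCases {

    // Var-args: the compiler turns "Object... xs" into an Object[] parameter.
    static int count(Object... xs) {
        return xs.length;
    }

    // Auto-boxing: values in -128..127 are boxed to shared Integer objects,
    // so a reference comparison happens to succeed for them.
    static boolean smallBoxesShared() {
        Integer a = 127, b = 127;
        return a == b;
    }

    // Generics: List<String> and List<Integer> share one runtime class (erasure).
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    // Reflection sees count(Object...) as count(Object[]) with the var-args flag.
    static boolean countIsVarArgs() {
        try {
            Method m = WhyCases.class.getDeclaredMethod("count", Object[].class);
            return m.isVarArgs();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("count(1, 2, 3)     = " + count(1, 2, 3));
        System.out.println("small boxes shared = " + smallBoxesShared());
        System.out.println("same runtime class = " + sameRuntimeClass());
        System.out.println("count is var-args  = " + countIsVarArgs());
    }
}
```

Running `javap -c` on the compiled class should then show where the compiler inserted the boxing calls and the array creation for the var-args call.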

4 Answers

13

I did a bit of this when I started with Java, years ago. My approach was to read the VM spec and to look at the output of javap -c, which displays the disassembled bytecode of a class. I also tried creating Java classes with particular bytecode, using a Java bytecode assembler. There is an assembler called Jasmin, if you want to try that.
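To make the javap -c step concrete, here is a minimal sketch. The class name `Adder` is just an example; any small class will do. Compile it with `javac Adder.java`, then run `javap -c Adder` and compare the listing with the instruction descriptions in the VM spec.

```java
// Hypothetical example class for inspecting with "javap -c Adder".
public class Adder {

    // For a static method taking two ints, the disassembly of this body
    // is four instructions: iload_0, iload_1, iadd, ireturn.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // In the listing for main, look for getstatic (System.out),
        // invokestatic (the call to add) and invokevirtual (println).
        System.out.println(add(2, 2));
    }
}
```

Small methods like this are a good place to start because each bytecode instruction maps almost one-to-one onto something visible in the source.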

You might also want to look at the Lambda Expression Translation document that Brian Goetz of Oracle has posted, which covers the strategy that will be used to translate the lambdas (closures, essentially) that are being added in Java 8.
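Once a JDK with lambda support is at hand, the translation strategy can be observed from the outside. This is a rough sketch, assuming javac's current behaviour of compiling a lambda body into a synthetic method whose name starts with `lambda$`; that naming is a compiler implementation detail, not something the language guarantees, and the class name `LambdaPeek` is made up for the example.

```java
import java.lang.reflect.Method;
import java.util.function.IntBinaryOperator;

// Hypothetical demo class; relies on javac's naming of desugared lambda bodies.
public class LambdaPeek {

    // A non-capturing lambda stored in a static field.
    static final IntBinaryOperator ADD = (a, b) -> a + b;

    // Counts compiler-generated methods in this class that look like lambda bodies.
    static long syntheticLambdaMethods() {
        long n = 0;
        for (Method m : LambdaPeek.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("2 + 2 = " + ADD.applyAsInt(2, 2));
        System.out.println("synthetic lambda methods: " + syntheticLambdaMethods());
        // The runtime class of the lambda object is generated by the JVM at link time.
        System.out.println("lambda class: " + ADD.getClass().getName());
    }
}
```

Running `javap -c -p LambdaPeek` on the result should show an invokedynamic instruction where the lambda is created, alongside the private synthetic method that holds its body.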

You can also get the source code for the HotSpot VM from OpenJDK, and the early access version of the javac compiler with lambda support (hg repository) for JDK 8, if you really feel like diving into the deep end of the pool.

Looking into garbage collection is probably a good idea. A quick search turned up this Dr. Dobb's article on Java's Garbage-First (G1) collector. I don't know if that's a good introduction. I assume you already know about mark-and-sweep and generational garbage collectors; if not, you will want to read up on those first.
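A low-effort way to see which collectors a given JVM is actually running is the standard `java.lang.management` API. The sketch below is illustrative only: the class name `GcPeek` is made up, and the allocation loop is just an attempt to produce garbage; whether it triggers a collection depends on heap size and JIT optimisations.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical demo class that lists the JVM's garbage collectors.
public class GcPeek {

    // One bean per collector, e.g. a young-generation and an old-generation one.
    static List<GarbageCollectorMXBean> collectors() {
        return ManagementFactory.getGarbageCollectorMXBeans();
    }

    public static void main(String[] args) {
        // Allocate short-lived arrays; this may or may not force a collection.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            checksum += junk.length;
        }
        System.out.println("allocated bytes: " + checksum);

        for (GarbageCollectorMXBean gc : collectors()) {
            System.out.println(gc.getName()
                    + ": " + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running it with different collector flags (for example `-XX:+UseG1GC` or `-XX:+UseSerialGC` on HotSpot) changes the names that are printed, which makes a handy bridge between the articles and the running VM.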

3

Some additional ideas:

  • Get involved in the OpenJDK project. Nothing beats understanding the internals of some software by hacking on it!
  • Look at how other JVM languages (e.g. Clojure or Scala) generate code for the JVM.
  • Do a mini-project that interests you and requires use of JVM internals, perhaps using something like ASM to manipulate bytecode.
2

If you are not yet familiar with the Java bytecode format, consider writing a tiny compiler that emits valid Java bytecode (or Jasmin assembler) and getting its output to run correctly.

Seeing "Hello World" or "4" (given 2+2) generated by your code is very satisfying.
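For a sense of scale, such a compiler can be very small. The following is a sketch only, not a reference implementation: the class `TinyCompiler`, its grammar (non-negative integers with `+` and `*`) and the emitted layout are all choices made for this example, and the output is intended to follow Jasmin's assembler syntax as I understand it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy compiler: arithmetic expression -> Jasmin assembler text.
public class TinyCompiler {
    private final String src;
    private int pos = 0;
    private final List<String> code = new ArrayList<>();
    private int depth = 0;     // current operand stack depth
    private int maxDepth = 0;  // high-water mark, used for ".limit stack"

    private TinyCompiler(String src) {
        this.src = src.replace(" ", "");
    }

    private void emit(String insn, int stackDelta) {
        code.add("    " + insn);
        depth += stackDelta;
        maxDepth = Math.max(maxDepth, depth);
    }

    // expr := term ('+' term)*
    private void expr() {
        term();
        while (pos < src.length() && src.charAt(pos) == '+') {
            pos++;
            term();
            emit("iadd", -1);
        }
    }

    // term := number ('*' number)*
    private void term() {
        number();
        while (pos < src.length() && src.charAt(pos) == '*') {
            pos++;
            number();
            emit("imul", -1);
        }
    }

    private void number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at " + start);
        emit("ldc " + src.substring(start, pos), +1);
    }

    // Returns Jasmin source for a class whose main() prints the expression's value.
    public static String compile(String className, String expression) {
        TinyCompiler c = new TinyCompiler(expression);
        c.emit("getstatic java/lang/System/out Ljava/io/PrintStream;", +1);
        c.expr();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + c.pos);
        }
        c.emit("invokevirtual java/io/PrintStream/println(I)V", -2);
        c.emit("return", 0);

        StringBuilder out = new StringBuilder();
        out.append(".class public ").append(className).append('\n');
        out.append(".super java/lang/Object\n\n");
        out.append(".method public static main([Ljava/lang/String;)V\n");
        out.append("    .limit stack ").append(c.maxDepth).append('\n');
        out.append("    .limit locals 1\n");
        for (String line : c.code) out.append(line).append('\n');
        out.append(".end method\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(compile("Four", "2+2"));
    }
}
```

Saving the output as Four.j, assembling it with Jasmin and running the resulting class should give you that "4". Replacing the text emitter with one that writes the class file bytes directly is a natural next step.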

1

In addition to what everyone else has said, check out the Java Performance Tuning page. It contains a lot of articles on garbage collection and performance tuning in general. It also has a list of performance tuning books that might interest you.
