Compilation and Virtual Machine Implementation

A compiler translates programs in one executable language to equivalent programs in another language. Some examples of executable languages are: Scheme, Java, C, C++, Sparc machine code, Pentium machine code, and Java Virtual Machine code. Java is unique among commercial and pedagogic languages because the language definition specifies the target language for Java compilers, namely code for the Java Virtual Machine (JVM), a machine architecture akin to that of the Sparc or Pentium processors. The Java Virtual Machine is not intended to specify the actual hardware on which Java programs will run. A Java Virtual Machine can be implemented on top of any hardware processor (e.g., Sparc or Pentium) through a combination of interpretation and compilation from JVM code to "native" machine code (e.g., Sparc or Pentium code). A Java Virtual Machine that interprets JVM code includes a program that simulates the JVM architecture (much like the simulator for the ?? that you studied in Comp 210).
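To make the idea of "a program that simulates a machine architecture" concrete, here is a minimal sketch of an interpreter for a toy stack machine. The class name TinyStackMachine and the four opcodes are hypothetical, invented only for illustration; they are not real JVM opcodes, and an actual JVM interpreter is far more elaborate. The sketch shows only the basic fetch-decode-execute loop that any such simulator is built around.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** A toy stack machine. The opcodes are hypothetical, NOT real JVM opcodes. */
public class TinyStackMachine {
    // Hypothetical instruction set, chosen only for illustration.
    static final int PUSH = 0;  // PUSH n: push the literal n that follows the opcode
    static final int ADD  = 1;  // pop two values, push their sum
    static final int MUL  = 2;  // pop two values, push their product
    static final int HALT = 3;  // stop and return the value on top of the stack

    /** Interprets the code array and returns the top of the stack at HALT. */
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;  // program counter: index of the next instruction
        while (true) {
            int op = code[pc++];  // fetch
            switch (op) {         // decode and execute
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes the expression (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program));  // prints 20
    }
}
```

Because the simulator is itself an ordinary program, it runs unchanged on any processor for which its implementation language is available; only the simulator, not the simulated code, has to be ported.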

The JDK 1.2 implementation of the JVM that we are using in this course compiles each method to Sparc machine code before executing it. This process is called "Just-In-Time" (JIT) compilation. Note that JIT compilation is a completely separate process from the compilation of Java source code to the JVM code stored in class files. The old JDK 1.1.4 JVM (invoked by the oldjavac and oldjava commands in the ~comp212/bin directory) interprets JVM code. The Java compiler in JDK 1.1.4 is identical to the Java compiler in JDK 1.2. The JDK 1.2 compiler runs faster only because the compiler is itself a Java program running on the underlying JVM, and the JDK 1.2 JVM is much faster than the JDK 1.1.4 JVM.

Virtual machines are commonly used in language implementations to achieve portability across processor architectures. For example, DrScheme translates Scheme to a byte code language, which it then interprets. As a result, the same DrScheme system (except for minor but tedious modifications to accommodate differences in operating systems and byte ordering in machine words) runs on a wide variety of processor architectures. If DrScheme relied on translation to native machine code to execute Scheme code, a completely separate translator would be required for each machine architecture.

Java has taken the virtual machine idea one step further. The translation of Java to JVM code is part of the language specification. Every Java source language compiler must produce its output in exactly the specified class file format, which contains the JVM code. Hence, every Java compiler is compatible with every JVM, regardless of the machine on which the compiler or the JVM runs.
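One visible consequence of a fixed class file format is that any program can recognize a class file by its first few bytes. A class file begins with the four-byte magic number 0xCAFEBABE, followed by a two-byte minor version and a two-byte major version, all stored most significant byte first. The following is a minimal sketch of a header check; the class name ClassFileHeader and the method readVersion are hypothetical names chosen for illustration, and the sample bytes in main are built by hand rather than taken from a real compiler.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that inspects the first eight bytes of a class file. */
public class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;  // first four bytes of every class file

    /**
     * Returns {minorVersion, majorVersion} read from the start of the array.
     * Throws IllegalArgumentException if the magic number is wrong.
     */
    static int[] readVersion(byte[] bytes) {
        // ByteBuffer reads big-endian by default, matching the class file format.
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;  // treat the two bytes as unsigned
        int major = buf.getShort() & 0xFFFF;
        return new int[] {minor, major};
    }

    public static void main(String[] args) {
        // Hand-built sample header: magic number, minor version 0, major version 46.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 46};
        int[] version = readVersion(header);
        System.out.println("major=" + version[1] + " minor=" + version[0]);
    }
}
```

The same check works no matter which compiler produced the file or which machine it was compiled on, which is exactly the compatibility guarantee described above.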