Raj Bandyopadhyay, Compiling Dynamic Languages

Dynamic scripting languages such as Python, Matlab and R are ubiquitous today due to their ease of use and flexibility. However, their use in scientific computing is limited by their poor performance compared to C and Fortran. This is because these languages are either interpreted or compiled to byte code, rather than compiled to native code. To combine the productivity of scripting languages with the performance of C and Fortran, we need methods and tools for rapidly developing good native-code compilers for these languages.

We propose an alternative approach to compiler development: translating the dynamically typed language to a strongly, statically typed functional language. This idea is related to typed intermediate languages, developed by the Fox project at CMU and the FLINT project at Yale, and currently used in the Singularity project at Microsoft. Targeting a functional language lets the compiler reuse existing infrastructure such as automatic memory management. It also provides a high-level platform for performing reusable compiler optimizations. In addition, we obtain a formal semantics for the source language and greater reliability due to strong typing.

We have implemented a compiler for Python, a dynamically typed scripting language, using this methodology, translating to OCaml, a strongly, statically typed functional language. Both the compiler and the runtime environment are written entirely in OCaml. In my talk, I will present this work, some encouraging preliminary performance results, and the optimizations I am currently working on.
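
As a rough illustration of the general idea of translating a dynamically typed program into a statically typed functional one, the sketch below embeds a few Python values in a single OCaml variant type and turns Python's `+` into a pattern match. It is not taken from the compiler described above: the names `pyvalue`, `py_add`, `Type_error` and `double` are hypothetical, and a real translation would have to cover far more of Python's semantics.

```ocaml
(* Hypothetical universal value type: every Python value is wrapped in one
   OCaml variant, so a dynamic type check becomes a pattern match.
   Simplification: OCaml's fixed-width int stands in for Python's
   arbitrary-precision integers. *)
type pyvalue =
  | PyInt of int
  | PyFloat of float
  | PyStr of string
  | PyNone

(* Raised where Python would raise TypeError. *)
exception Type_error of string

(* Run-time dispatch for Python's "+" operator on the cases modelled here. *)
let py_add a b =
  match a, b with
  | PyInt x, PyInt y -> PyInt (x + y)
  | PyInt x, PyFloat y -> PyFloat (float_of_int x +. y)
  | PyFloat x, PyInt y -> PyFloat (x +. float_of_int y)
  | PyFloat x, PyFloat y -> PyFloat (x +. y)
  | PyStr x, PyStr y -> PyStr (x ^ y)
  | _ -> raise (Type_error "unsupported operand type(s) for +")

(* Python source:   def double(x): return x + x
   One possible translation, assumed for illustration only: *)
let double x = py_add x x
```

The OCaml type checker verifies that `py_add` only ever combines well-formed `pyvalue`s, while errors that Python would report at run time surface as the `Type_error` exception. An optimizer working on this representation could, for example, replace `py_add` with a plain integer addition wherever it can show that both operands are `PyInt`.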