In 2020, I released Cubiml, showing how to combine full type inference with structural subtyping in an ML-like language, and earlier this year, I followed it up with PolySubML, extending it with higher-rank types and existential types, among other features. For my next language (which I’ll call X here, since I haven’t chosen a name yet), I set the ambitious goal of supporting all of OCaml’s most notable functionality on top of everything PolySubML already supports. In this post, I will talk about the biggest OCaml feature that needs to be added: modules.
OCaml modules are not like the modules you might be used to in other languages. The basic idea is to be able to bundle data and types together and pass them around as record-like objects. They’re a fairly unique feature, and there’s considerable debate about whether they are worth it compared to simpler and easier-to-use systems like Haskell typeclasses. But the goal of X is to emulate the major features of OCaml, so that’s what we’re going to do.
OCaml syntax is almost like two languages in one. There’s the ordinary language of types and values and functions, but then also the module system, which has a completely separate set of syntax and concepts - module types and module values and functors. People have often dreamt of unifying them, and the 1ML project tried to do this back in 2015.
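To make the "two languages" point concrete, here is a minimal sketch in plain OCaml. The names MONOID, IntSum, Fold and IntFold are hypothetical and chosen only for illustration. It shows a module type, a module value, and a functor, each written in module-level syntax that has no direct counterpart in the ordinary expression language.

```ocaml
(* A module type: plays the role of a "type" for modules. *)
module type MONOID = sig
  type t
  val zero : t
  val add : t * t -> t
end

(* A module value implementing that module type.
   "with type t = int" keeps t visible as int outside the module. *)
module IntSum : MONOID with type t = int = struct
  type t = int
  let zero = 0
  let add (x, y) = x + y
end

(* A functor: a function from modules to modules. *)
module Fold (M : MONOID) = struct
  (* Combine every element of the list using M.add, starting from M.zero. *)
  let sum (xs : M.t list) : M.t =
    List.fold_left (fun acc x -> M.add (acc, x)) M.zero xs
end

(* Applying the functor yields a new module. *)
module IntFold = Fold (IntSum)

let total = IntFold.sum [1; 2; 3]
```

Note that Fold cannot be written as an ordinary function taking a record argument, because its parameter carries a type member (M.t) that the result depends on. That dependency is the kind of thing a unified design has to account for.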
Naturally, I wanted to unify them in X as well, so in this post I’ll explain how to do this and what the difficulties are. As it turns out, they can mostly be unified, but a few minor aspects still require separate syntax. But first, some disclaimers:
Disclaimers
X is still in the planning stages and currently exists only in my head. What is described here is the planned design for X, but it’s always possible that unforeseen issues will come up that force design changes. It’s also possible that I will make changes as a result of feedback from this post.
Additionally, I am not an OCaml expert. I’ve done my best to research how everything works but it’s possible that I’ve got some details wrong. And whenever I talk about OCaml, I’ll be describing things from the perspective of how equivalent features would be implemented in X/PolySubML, even if OCaml itself sees things differently or uses different terminology (e.g. “type abbreviations” instead of “type aliases”).
I. Introduction to OCaml modules
Before we can worry about how to implement them, we first have to understand how OCaml modules actually work. Let’s begin with a simple example:
module M : sig
  type t
  val zero : t
  val add : (t * t) -> t
end = struct
  type t = int
  let zero = 0
  let add (x, y) = x + y
end

(* Example usage of module M *)
let foo = M.zero
let bar = M.add (foo, foo)
let baz : M.t = M.add (foo, bar)