Saturday, January 13, 2007

Some history

(Today's post is not technical; it is more a piece of personal history. I'm describing some of my background that led me to start the Cohatoe project.)

One of the key motivations for me to develop Cohatoe came from my experiences in the EclipseFP project. EclipseFP implements a set of plugins that support functional programming in the Eclipse IDE. Originally initiated by me, it has since gained a momentum of its own, and there are a number of people who contribute to it. There is some support for OCaml (currently unmaintained and on hold), but the bulk of the functionality is for supporting Haskell development. When I started the EclipseFP project, I was about to learn Haskell, and coming from a Java background, I wanted to have Eclipse support similar to what existed for Java. There was no Eclipse plugin for Haskell, so I decided to write my own. (That was not an entirely clever decision: I ended up investing more time in writing Eclipse code than in actually using it to learn to write Haskell code ;-)

Usually, if one starts to develop language support in Eclipse, the first thing is to create a source code editor with some syntax coloring; perhaps some minimal Code Assist (automatic code completion) for keywords; a project type and some wizards that support the creation of source files; and a compiler integration that drives a build when files are saved and reports compiler errors and warnings as 'markers' (which show up in the Problems View and as line annotations in the code).

All these things are pretty easily done, and they can be implemented in Java without a lot of knowledge about the target language specifics going into that code. For instance, for syntax coloring, a simple set of coloring rules (e.g.: apply the comment color from the occurrence of '--' until the end of the line) is enough to reasonably color Haskell source files without actually knowing the exact syntactical structure of Haskell source code. Eclipse's APIs provide good support for doing these things, and it takes neither much difficulty nor much effort to get an Eclipse-based IDE up and running using them.
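
To illustrate how little language knowledge such a coloring rule needs, here is a minimal sketch in plain Java. It deliberately does not use the Eclipse APIs, and it is not the code that EclipseFP uses; the class and method names (LineCommentRule, commentStart, commentRegions) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical, naive coloring rule: everything from the first "--" up to
 * the end of the line is treated as a comment. It knows nothing about
 * Haskell syntax, so a "--" inside a string literal or as part of an
 * operator would be colored as a comment, too.
 */
final class LineCommentRule {

    /** Returns the index of the first "--" in the line, or -1 if there is none. */
    static int commentStart(String line) {
        return line.indexOf("--");
    }

    /**
     * Returns the comment regions of a text as {start, end} offset pairs
     * (start inclusive, end exclusive), one pair per line that has a comment.
     */
    static List<int[]> commentRegions(String text) {
        List<int[]> regions = new ArrayList<>();
        int lineStart = 0;
        while (lineStart <= text.length()) {
            int lineEnd = text.indexOf('\n', lineStart);
            if (lineEnd < 0) {
                lineEnd = text.length(); // last line has no trailing newline
            }
            int relative = commentStart(text.substring(lineStart, lineEnd));
            if (relative >= 0) {
                regions.add(new int[] { lineStart + relative, lineEnd });
            }
            lineStart = lineEnd + 1; // continue after the newline
        }
        return regions;
    }
}
```

An editor would then apply the comment color to each returned region. A rule like this is good enough for reasonable coloring, but it already hints at where the trouble starts once real understanding of the language is required.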

However, Eclipse's own Java Development Tools (JDT) have a lot of functionality that cannot be done without building some deep understanding of the language into the code. Here are some examples:

  • Mark Occurrences: given a source file and the cursor positioned on some identifier (e.g. in a call to a function), highlight all other occurrences of that identifier in the editor - and of course do not count just any occurrence of the same string of characters, but only those which actually are the same identifier; occurrences in comments must not be highlighted, and neither must local variables with the same name in a different scope.
  • Source Outline and Code Folding: In the Outline View, there is a tree that represents the language elements (type declarations, field declarations, method declarations etc.) from the source code in the editor; that tree can be used to navigate the code quickly and even to change it (for example, by deleting or moving field declarations). Code folding allows you to collapse regions of the code (for instance a long comment at the top of the source file) into a single line to make the code easier to read.
  • Refactoring: there are a lot of automated refactorings. A simple one is Rename (of a method or type), which not only changes the name in the source code, but also adjusts all code that refers to it and renames source files accordingly. A more complicated one is Extract Method, which lets you isolate a block of code and put it into a newly created method.
Such things usually require the kind of source code analysis that is done by compilers and other tools.

(I have written a more extensive article on how much language-specific code is needed for which functionalities of an Eclipse-based IDE - it was written for a German journal, and is therefore in German, so you will unfortunately only be able to read it if you understand that language. My co-author, Markus Barchfeld, is a committer on the Ruby Development Tools, RDT, and he has had experiences there similar to the ones I'm describing here.)

There is plainly no chance of re-implementing all this in Java for a language such as Haskell. (The JDT team was in the enviable position of being able to write their own Java compiler and, in addition, to use their own tool for building the tool itself.) Therefore, the only sensible way of getting deeper integration would be to somehow re-use existing code that does the job. And happily, all that code exists, is free and could be used for an Eclipse-based IDE (think of the GHC API, of HaRe, the Haskell Refactorer, and of many other Haskell development tools). It is just necessary to find a way to use Haskell code in an Eclipse plugin without too much hassle and compromise.

This is what I hope to achieve with Cohatoe. On the way so far I have tried a number of different approaches, and I have discussed them with many people, notably the people on the EclipseFP mailing list. EclipseFP is now maintained by Thiago Arrais. Thiago has written a Haskell parser that already does a pretty cool job in supporting Code Assist and other helpers in the Haskell source editor. It has replaced an earlier approach of mine which used the Haskell parser from the haskell-src package in the standard libraries, and accessed it via FFI and JNI (Haskell's and Java's respective foreign and native interfaces). That approach had proved to be somewhat unstable (not the parser, but my code that drove it) and had the decisive disadvantage of working only on Windows (because it required the Haskell code to be run in a DLL). I also wrote a contribution to the Eclipse Languages Symposium about a year ago. All this, however, left me a bit frustrated. Somehow I wasn't getting the feeling that real progress could be made with any of the approaches that I had tried.

When I was at the Haskell in Leipzig meeting last year, however, there was a very engaged discussion, and I decided to have another stab at the problem; I'd like to thank the participants in that meeting, and in particular Johannes Waldmann, for this very encouraging debate. This time I think I have made some progress, and it seems that this is mostly thanks to a change in perspective. I'm focusing more on the programming model this time, instead of thinking in terms of choosing a technology that is able to make the connection between the Haskell code and the Java code that runs inside Eclipse. The goal is to design both the Haskell side and the Eclipse side of things in Cohatoe so that developers find themselves at home on either side, with no distortions, unwieldy constructs, or extra effort necessary. So far, this has worked out fine, so let's see what happens next :-)
