Sunday, January 28, 2007

History of Haskell

Simon Peyton Jones has announced the final version of the History of Haskell paper that he has written together with Paul Hudak, John Hughes and Philip Wadler. It is a longish text, but a great read, especially if you have (like me) only started with Haskell in the last few years.

(EclipseFP is also mentioned in the section about development tools :-)

Saturday, January 13, 2007

Some history

(This is not technical today but more a piece of personal history. I'm describing some of my background that led me to start the Cohatoe project.)

One of the key motivations for me to develop Cohatoe came from my experiences in the EclipseFP project. EclipseFP implements a set of plugins that support functional programming in the Eclipse IDE. Originally initiated by me, it has since gained a momentum of its own, and there are a number of people who contribute to it. There is some support for OCaml (currently unmaintained and on hold), but the bulk of the functionality is for supporting Haskell development. When I started the EclipseFP project, I was about to learn Haskell, and coming from a Java background, I wanted to have Eclipse support similar to what existed for Java. There was no Eclipse plugin for Haskell, so I decided to write my own. (That was not the cleverest decision: I ended up investing more time in writing Eclipse code than in actually using it to learn Haskell ;-)

Usually, if one starts to develop language support in Eclipse, the first thing is to create a source code editor with some syntax coloring; perhaps some minimal Code Assist (automatic code completion) for keywords; a project type and some wizards that support the creation of source files; and a compiler integration that drives a build when files are saved and reports compiler errors and warnings as 'markers' (which show up in the Problems View and as line annotations in the code).

All these things are pretty easily done, and they can be implemented in Java without a lot of knowledge about the target language specifics going into that code. For instance, for syntax coloring, a simple set of coloring rules (e.g.: apply the comment color from the occurrence of '--' until the end of the line) is enough to reasonably color Haskell source files without actually knowing the exact syntactical structure of Haskell source code. Eclipse's APIs provide good support for doing these things, and it is not too difficult or too much effort to get some Eclipse-based IDE up and running using them.
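To make the coloring rule mentioned above concrete, here is a minimal sketch in plain Java. It is not EclipseFP code and does not use the Eclipse text APIs; the class and method names are made up for illustration. It only finds the character ranges that such a rule would paint in the comment color, and it is deliberately naive: a '--' inside a string literal or an operator would be treated as a comment too, which is exactly the kind of imprecision a purely rule-based approach accepts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of an "end of line" coloring rule.
// Not EclipseFP code; names are invented for this sketch.
final class LineCommentRule {

    /**
     * Returns one {start, end} pair (end exclusive) per line that contains
     * "--", spanning from the marker to the end of that line.
     */
    static List<int[]> commentSpans(String source) {
        List<int[]> spans = new ArrayList<>();
        int lineStart = 0;
        while (lineStart <= source.length()) {
            int lineEnd = source.indexOf('\n', lineStart);
            if (lineEnd < 0) {
                lineEnd = source.length(); // last line has no newline
            }
            int marker = source.indexOf("--", lineStart);
            if (marker >= 0 && marker < lineEnd) {
                spans.add(new int[] { marker, lineEnd });
            }
            lineStart = lineEnd + 1;
        }
        return spans;
    }

    public static void main(String[] args) {
        String code = "main = putStrLn \"hi\" -- greet\nx = 1\n-- done";
        for (int[] span : commentSpans(code)) {
            System.out.println(code.substring(span[0], span[1]));
        }
    }
}
```

Note that nothing in this sketch knows what a Haskell declaration or expression is; that is why this level of support is cheap to build.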

However, Eclipse's own Java Development Tools (JDT) have a lot of functionality that cannot be done without building some deep understanding of the language into the code. Here are some examples:

  • Mark Occurrences: given a source file and the cursor positioned on some identifier (e.g. in a call to a function), highlight all other occurrences of that identifier in the editor - and of course do not count just any occurrences of the same string of characters, but only those which actually are the same identifier; don't highlight them in comments, or occurrences of local variables with the same name in a different scope.
  • Source Outline and Code Folding: In the Outline View, there is a tree that represents the language elements (type declarations, field declarations, method declarations etc.) from the source code in the editor; and that tree can be used to navigate the code quickly and even change it (for example, deleting or moving field declarations). Code folding lets you collapse regions of the code (for instance a long comment at the top of the source file) into a single line to make the code easier to read.
  • Refactoring: there are a lot of automated refactorings, like the simple Rename (of a method or type), which not only changes the name in the source code, but also adjusts all code that refers to it and renames source files accordingly, or more complicated ones such as Extract Method, which lets you isolate a block of code and put it into a newly created method.
Such things usually require analysis of the source code as it is done by compilers and other tools.

(I have written a more extensive article about how much language-specific code is needed for which functionality of an Eclipse-based IDE. It was written for a German journal and is therefore in German, so you will unfortunately only be able to read it if you understand that language. My co-author, Markus Barchfeld, is a committer on the Ruby Development Tools, RDT, and he has had experiences there similar to the ones I'm describing here.)

There is plainly no chance of re-implementing all this in Java for a language such as Haskell. (The JDT team was in the enviable position to be able to write their own Java compiler and in addition to use their own tool for building itself.) Therefore, the only sensible way of getting deeper integration would be to somehow re-use existing code that does the job. And happily, all that code exists, is free and could be used for an Eclipse-based IDE (think of the GHC API, HaRe, the Haskell Refactorer, and many other Haskell development tools). It is just necessary to find a way to use Haskell code in an Eclipse plugin without too much hassle and compromise.

This is what I hope to achieve with Cohatoe. On the way so far I have tried a number of different approaches, and I have discussed it with many people, notably the people on the EclipseFP mailing list. EclipseFP is now maintained by Thiago Arrais. Thiago has written a Haskell parser that already does a pretty cool job in supporting Code Assist and other helpers in the Haskell source editor. It has replaced an earlier approach of mine which used the Haskell parser from the haskell-src package in the standard libraries, and accessed it via FFI and JNI (Haskell's and Java's respective foreign and native interfaces). That approach had proved to be somewhat unstable (not the parser, but my code that drove it) and had the decisive disadvantage of working only on Windows (because it required the Haskell code to be run in a dll). I also wrote a contribution to the Eclipse Languages Symposium about a year ago. All this, however, left me a bit frustrated. Somehow I wasn't getting the feeling that real progress could be made with any of the approaches that I had tried.

When I was at the Haskell in Leipzig meeting last year, however, there was a very engaged discussion, and I decided to have another stab at the problem; I'd like to thank the participants in that meeting, and in particular Johannes Waldmann, for this very encouraging debate. This time I think I have made some progress, and it seems that this is mostly thanks to a change in perspective. I'm focusing more on the programming model this time, instead of thinking in terms of choosing a technology that is able to make the connection between the Haskell code and the Java code that runs inside Eclipse. The goal is to design both the Haskell side and the Eclipse side of things in Cohatoe so that developers find themselves at home on either side, with no distortion, unwieldy constructs, or extra effort necessary. So far, this has worked out fine, so let's see what happens next :-)

Sunday, January 07, 2007

Cohatoe 0.2 preview

I have just released another preview version of Cohatoe. It contains a number of improvements and a cool new example.

As before, please keep in mind that this is still experimental and the APIs are bound to change a few times more. Actually, a rather severe change has been made already in 0.2 to the format of the plugin.xml when declaring a Haskell function (see below). (I promise I will stop doing such things of course once the APIs are more mature ;-).

There is now a changelog file in the SDK feature, where you can check what is new in which versions. The changelog is cumulative and contains all changes since the very beginning. In what follows, I'm giving a commented version of the changelog contents.

Declaring Haskell functions

As I mentioned, there have been changes to the schema of the haskellFunctions extension point (in the plugin de.leiffrenzel.cohatoe.server.core). The attribute objectCode has been renamed to objectFile. This means that all plugin.xml declarations that use this extension point must be updated.

The old attribute is left in and tagged as deprecated: Eclipse will give a warning for each occurrence, so uses of the old format can be detected more easily. It will be removed in 0.3. That the attribute is still there doesn't mean that it still works, however. The Cohatoe code doesn't read it anymore.

For situations where the Haskell code that you want to call is distributed over multiple modules, and you would consequently have more than one object file to declare, I have added some support. In the case where your modules are all in the same folder as the declared .o file, Cohatoe makes sure that they all are passed to hs-plugins, which will be able to load them. In the case where your object files are in a different folder (inside the plugin's directory structure, of course), you can specify that folder in the plugin.xml. Here is an example (taken from my unit test cases):

<haskellFunction
    implementation="de.leiffrenzel.cohatoe.server.core.test.functions.MultiFolderModule"
    interface="de.leiffrenzel.cohatoe.server.core.test.functions.IMultiFolderModule"
    name="Multi-modules in different folders"
    objectFile="$os$/obj/MultiFolderModule1.o">
  <objectCodeFolder>$os$/obj2/</objectCodeFolder>
</haskellFunction>

There may be zero or more objectCodeFolder elements inside a haskellFunction element in your declarations now, each of which can declare a plugin-relative path, and all these paths are resolved and passed to the Haskell side where they are used for tracking module dependencies. (As usual, this works also when plugins are deployed and live in a .jar file.)

I'm planning to add further support for packages of object files (as created by the archiver tool). You will be able to declare these packages in a similar manner, but that support is not yet included as of version 0.2.

Next, I have added some helpful functions that can be used for marshalling data to the Haskell side and unmarshalling the results back. They are located in the class CohatoeUtils. Actually, they were already available in version 0.1, but there they were not well organized or well named, and not even completely implemented. I've done a bit in that area, too.

Error handling

I have also started to improve Cohatoe in the area of error handling. This belongs to the large topic of robustness. Since Cohatoe will execute any Haskell code, it is important that its functioning is not impaired by code that crashes or perhaps is even malicious. Currently, a problem inside contributed code will cause the Haskell server executable to exit, and a restart of Eclipse is needed to get it working again. That is of course not acceptable, and so I have started to tackle the possibly problematic situations one by one.

The first case, the one that is now handled by Cohatoe, is when your contributed Haskell code calls the Prelude.error function. This will now no longer break the server, but is instead caught at the Haskell side and reported to the Eclipse side, where the exception is re-thrown as a CohatoeException.

The exception type (CohatoeException) is a subtype of Eclipse's CoreException, and it carries some useful information in addition to the usual Java exception information. It contains the error message that was used on the Haskell side, and it also knows the plugin identifier of the plugin that contributed the code which has caused the exception.

Since it is a CoreException, it also contains an Eclipse IStatus object that you can conveniently write to the workspace log, like so:
try {
  // ...
} catch( final CohatoeException cohex ) {
  // we don't handle this here, just
  // write it to the workspace log
  CohatoeExamplesPlugin plugin = CohatoeExamplesPlugin.getDefault();
  plugin.getLog().log( cohex.getStatus() );
}
The exception also carries an identifier for the type of exception that was raised. I want to use that for distinguishing between calls to the error function and other types of errors, such as IO errors, or problems that occurred when hs-plugins tried to load code it couldn't. But that is not yet in: currently only Prelude.error is handled by Cohatoe.

New example - Sudoku solver

A new example has been introduced: a Sudoku solver View. The View can be found via the menu Window > Show View > Cohatoe Examples. The Haskell code that does the actual work is a Sudoku solver written by Graham Hutton which can be found at http://www.cs.nott.ac.uk/~gmh/sudoku.lhs (thanks to Graham for giving permission to include this code :-).

This example demonstrates how Haskell code that spans more than one object file is handled (more background discussion can be found in my earlier post about module dependencies in multiple modules), and it also gives an example of accessing multiple Haskell functions through a single interface on the Java side.

I will comment on this example in more detail in one of my next posts, but I want to give you at least a screenshot here:

That's it from me for now - have fun! Any feedback is very welcome.

Saturday, January 06, 2007

Building a server executable

Part of the Cohatoe architecture is the Cohatoe server. On the Eclipse side it is represented by a singleton object (of class de.leiffrenzel.cohatoe.server.core.CohatoeServer), where clients can access it. The server runs in a separate process, and the singleton on the Java side communicates with it via a socket connection. It is implemented in Haskell, and it is able to locate object code that has been contributed via Cohatoe, load and execute it (using hs-plugins).

That the server is implemented in Haskell means also that it has to be compiled into a native executable, which is of course a different one for each supported OS/WS platform. Currently, that is only Windows (strictly speaking: only 32 bit Windows). But the mechanisms for supporting other platforms are there (mostly provided by Eclipse already), and the Haskell code for the server does not make use of platform specifics (as far as I can see). Cohatoe will automatically locate the correct executable for the platform it runs on (if one is there). Therefore, the only thing that is needed is to build the executable for all the other platforms. In what follows, I'll describe how to do this and explain a few background details on the way.

The Haskell source code for the Cohatoe server is located in the plugin de.leiffrenzel.cohatoe.server.core, in the directory INTERNAL/hs/src/. The native executables for the various platforms (i.e. Windows, Linux/GTK, MacOSX etc.) do not reside in the same plugin. Instead, they are contained in fragments. Fragments are a special concept in Eclipse's plugin architecture. (They have also recently become part of the OSGi specification, which is implemented by the Eclipse runtime.)

A fragment is, like a plugin, an OSGi bundle. In the workspace, it lives as a fragment project (instead of as a plugin project), and it contains a fragment.xml instead of a plugin.xml. But they have a similar directory structure, and they have the usual META-INF/MANIFEST.MF file where the meta information is located. When deployed, a fragment looks exactly like a plugin. It (usually) lives in a jar file in the plugins/ folder.

The specific thing about fragments is that they do not exist on their own: they are dependent on a plugin, which is called their host plugin. They have to declare which plugin is their host by giving the host's plugin id and version. As an example, here is what this looks like for the fragment with the Cohatoe server executable for Windows:


When Eclipse starts, it collects all fragments for a given host and adds their contents to the host plugin's. The contents are overlayed only virtually, they are not physically copied to the host plugin's folder. This means that all classes from .jar files in the fragment appear on the classpath of the host plugin, all files (such as icons, or internationalization properties files) appear on the host plugin's resource path, and so on.

Take an example: Suppose there is a plugin p, and a fragment f that declares p as its host plugin. Suppose further that f contains an icon file icons/eview16/myview.gif (p does not contain a file with that path and name). Now any code in a plugin that depends on p will get that icon file from p, if f is present, even though p does not contain it (and in fact doesn't even know about it).
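The effect of the overlay in this example can be pictured with a small toy model in plain Java. This is only a sketch of the idea, not how the Eclipse runtime is implemented: the Bundle class, the find method and the "host first, then fragments" lookup order are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model only: the real lookup is done by the Eclipse/OSGi runtime.
// All names here are hypothetical.
final class OverlayLookup {

    /** A bundle reduced to its name and a path -> content map. */
    static final class Bundle {
        final String name;
        final Map<String, String> files = new LinkedHashMap<>();

        Bundle(String name) {
            this.name = name;
        }

        Bundle with(String path, String content) {
            files.put(path, content);
            return this;
        }
    }

    /** Looks in the host first, then in each attached fragment in order. */
    static Optional<String> find(Bundle host, List<Bundle> fragments, String path) {
        if (host.files.containsKey(path)) {
            return Optional.of(host.files.get(path));
        }
        for (Bundle fragment : fragments) {
            if (fragment.files.containsKey(path)) {
                return Optional.of(fragment.files.get(path));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Bundle p = new Bundle("p");
        Bundle f = new Bundle("f").with("icons/eview16/myview.gif", "<icon data>");
        // Clients ask p; the file is found although only f contains it.
        System.out.println(find(p, Arrays.asList(f), "icons/eview16/myview.gif").isPresent());
    }
}
```

Nothing is copied into p in this model either: the fragment's contents are only consulted when a lookup on the host is made.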

This mechanism is used by the Eclipse project itself for providing internationalization texts. The plugins of the Eclipse IDE themselves contain only English UI texts. But there is a set of fragments (collectively called the 'language packs') that provide translations of these texts into a number of other languages. These fragments consist of just a lot of .properties files. When they are present, these files are used by Eclipse to display UI texts in their localized versions.

Fragments of the Cohatoe server plugin (de.leiffrenzel.cohatoe.server.core) which contain the server executables are platform-specific. That means that there is one fragment for Windows (which you can recognize from the win32.win32.x86 in the name), one for Linux (linux.gtk.x86) and so on. The three elements in the fragment name stand for Operating System, os (e.g. linux), Window System, ws (e.g. gtk) and OS Architecture, arch (x86).
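As a small illustration of the os.ws.arch convention, the following sketch splits a dotted fragment name into its three platform parts. It assumes that the triple forms the last three segments of the name; the full fragment name used in main is made up for the example, and PlatformTriple is a hypothetical helper, not part of Cohatoe. (Inside a running Eclipse, the current values are available from Platform.getOS(), Platform.getWS() and Platform.getOSArch().)

```java
// Hypothetical helper; assumes the platform triple is the last three
// dot-separated segments of the fragment name.
final class PlatformTriple {
    final String os;
    final String ws;
    final String arch;

    private PlatformTriple(String os, String ws, String arch) {
        this.os = os;
        this.ws = ws;
        this.arch = arch;
    }

    /** Extracts the trailing os.ws.arch segments from a dotted name. */
    static PlatformTriple fromFragmentName(String fragmentName) {
        String[] parts = fragmentName.split("\\.");
        if (parts.length < 3) {
            throw new IllegalArgumentException("no os.ws.arch suffix in: " + fragmentName);
        }
        int n = parts.length;
        return new PlatformTriple(parts[n - 3], parts[n - 2], parts[n - 1]);
    }

    @Override
    public String toString() {
        return "os=" + os + ", ws=" + ws + ", arch=" + arch;
    }

    public static void main(String[] args) {
        // Made-up fragment name, for illustration only.
        System.out.println(fromFragmentName("com.example.server.linux.gtk.x86"));
    }
}
```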

The same naming convention is used in Eclipse for other things that are also related to platform-specific code. Another example (in addition to the platform fragments for the server executable which are my topic here) is the location of the Haskell object code which I have mentioned in an earlier post (the simple example walkthrough). Remember that the folder names there also had these 'os' and 'win32' bits? I'll explain the particulars of that in more details in an upcoming post.

For providing a server executable for a platform, the only thing that is needed is a fragment for that platform, and that fragment should contain a folder server, in which the executable is placed. The name of the executable must start with haskellserver. (On Windows, it is called haskellserver.exe, on other platforms it will be very likely just called haskellserver.)
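The convention described above (a server folder, and an executable whose name starts with haskellserver) can be expressed as a simple check over the paths inside a fragment. This is a sketch under assumptions: the class name is hypothetical, the paths are given as plain fragment-relative strings, and it is not the lookup code that Cohatoe actually uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the naming convention, not Cohatoe's own code.
final class ServerExecutableFinder {
    static final String FOLDER = "server/";
    static final String PREFIX = "haskellserver";

    /** Returns the first path of the form server/haskellserver*, if any. */
    static Optional<String> find(List<String> fragmentPaths) {
        for (String path : fragmentPaths) {
            if (!path.startsWith(FOLDER)) {
                continue;
            }
            String name = path.substring(FOLDER.length());
            // Only direct children of server/ count as candidates.
            if (!name.contains("/") && name.startsWith(PREFIX)) {
                return Optional.of(path);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            "META-INF/MANIFEST.MF", "server/readme.txt", "server/haskellserver.exe");
        System.out.println(find(paths).orElse("<none>"));
    }
}
```

Matching on a name prefix rather than an exact name is what lets the same rule cover haskellserver.exe on Windows and a plain haskellserver elsewhere.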

I have organized the Windows fragment so that the build scripts for building the Windows executable are located with the fragment, not with the plugin (where the source code is). This is because I think that there will be platform-specific build scripts in all the other fragments too. The build scripts are in the folder INTERNAL/integration/ in the Windows fragment project.

As you can see, I'm using Cabal for building the executable. That makes it convenient to declare additional package dependencies. The packages that are currently needed are base, network and parsec, of course hs-plugins and the Cohatoe API package (cohatoe-api), and HaXml. Further dependencies may be added in the future, depending on how much more functionality will go into the server.

I have wrapped the several Cabal build steps (configure, build, clean) in a Windows batch file. There I'm also doing a pragmatic step to copy the created executable to the correct destination folder. I'm sure that there is a way to extend Cabal so that this can be done in a Cabal build hook, so if I've got a bit of time, I'll probably tidy that up :-).

Finally, to execute that build script (i.e. the Windows batch file), I have declared an External Tool in Eclipse. This functionality in Eclipse is very useful, but seemingly not so well-known. You can create a launch configuration for an external tool ('external' means usually a separate process, but the file may be located inside the workspace, as it is in our case) and then have the launch configuration settings written into a workspace file. Mine is in the Haskell server exectuable - win32.launch file that you can see in the build folder in the screenshot above. You can configure it on the dialog that you get from the menu Run > External Tools > External Tools ....

You can see here that I have just declared the batch file as the executable and the folder which contains it as the working directory. In addition, it is very helpful to specify, on the Refresh tab, that the entire project should be refreshed automatically after the tool has run.


Finally, on the Common tab you can specify a location where the launch configuration settings should be stored.

Now that we have the launch configuration we can simply run it from the menu or from the global Eclipse toolbar (look out for the 'External Tools' button). What's even better, we can also register the External Tool launch configuration as a builder on the Eclipse project that contains the sources, so that it is automatically triggered when these sources change. (But that is left as an exercise to the reader ;-).

Wednesday, January 03, 2007

Indirect programming

Yesterday during a debugging session a colleague and I stumbled over a terrific example of what I have called 'indirect programming' elsewhere:

  private void start(String organizerId) {
    IBreakpointOrganizer organizer = getOrganizer(organizerId);
    IPropertyChangeListener listener = new IPropertyChangeListener() {
      public void propertyChange(PropertyChangeEvent event) {
      }
    };
    organizer.addPropertyChangeListener(listener);
    organizer.removePropertyChangeListener(listener);
  }
Why in the world would someone first add and then immediately remove a listener, and above all a listener with an empty implementation?? Here's why (taken from what happens to be one internal implementation class of IBreakpointOrganizer):
  public void addPropertyChangeListener(IPropertyChangeListener listener) {
    getOrganizer().addPropertyChangeListener(listener);
  }

  [...]
  protected IBreakpointOrganizerDelegate getOrganizer() {
    if (fDelegate == null) {
      try {
        fDelegate = (IBreakpointOrganizerDelegate)
            fElement.createExecutableExtension(ATTR_CLASS);
      } catch (CoreException e) {
        DebugUIPlugin.log(e);
      }
    }
    return fDelegate;
  }
Adding a listener has the (undocumented) side-effect of initializing the internals of the organizer. The start method above thus relies on a) an implementation detail it shouldn't know about; and b) a behaviour that is clearly a side-effect of an entirely different piece of functionality. One can only hope that some future source code cleanup will not simply remove that 'useless' add/remove no-op ...
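For contrast, here is a minimal sketch of how the same intent could be stated directly. It is not the Eclipse debug code and not a proposed patch to it: LazyOrganizer and ensureInitialized are hypothetical names, and a plain list of Runnables stands in for the lazily created delegate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an organizer with a lazily created delegate.
final class LazyOrganizer {
    // Stays null until first use, like fDelegate in the code above.
    private List<Runnable> delegateListeners;

    private List<Runnable> delegate() {
        if (delegateListeners == null) {
            delegateListeners = new ArrayList<>();
        }
        return delegateListeners;
    }

    boolean isInitialized() {
        return delegateListeners != null;
    }

    /** States the intent directly, so callers need no add/remove trick. */
    void ensureInitialized() {
        delegate();
    }

    void addListener(Runnable listener) {
        delegate().add(listener);
    }

    void removeListener(Runnable listener) {
        delegate().remove(listener);
    }

    public static void main(String[] args) {
        LazyOrganizer organizer = new LazyOrganizer();
        organizer.ensureInitialized();
        System.out.println(organizer.isInitialized());
    }
}
```

A caller that needs the organizer started would then call ensureInitialized(), and a later cleanup has no reason to mistake that call for dead code.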

Tuesday, January 02, 2007

What's in which package?

There are three features in Cohatoe. ('Feature' is Eclipse's term for an installable package: a feature usually consists of one or more plugins and can be installed or uninstalled via Eclipse's built-in Update Manager.) Here is a short explanation of what each of them is needed for:

  • Runtime: this contains only those plugins that are needed at runtime.
  • Examples: an example plugin that demonstrates various aspects of the use of Cohatoe.
  • SDK: contains the sources and developer documentation for all of Cohatoe.
(The Examples feature depends on the Runtime feature, and the SDK depends on both others. The Runtime feature itself has no dependencies apart from that to the very basic Eclipse runtime plugins. You can even use it in UI-less mode.)

When you develop an Eclipse plugin that has some functionality implemented in Haskell, you would therefore depend on the Runtime feature. This means that the plugins from the runtime feature must be present on the system of the user of your plugins. The best way to achieve this is to have the end user install the Cohatoe runtime feature and then your feature (which contains your plugins).

For development of such plugins, on the other hand, you will likely want to have the SDK feature around. It contains the API documentation, the extension point documentation, and the sources of Cohatoe itself (useful for debugging). Since the SDK feature depends on the Runtime feature, this means that you also must have the Runtime feature installed, which is obvious anyway ;-).

If you want to hack Cohatoe itself, you could theoretically also use the sources from the SDK feature. However, it is recommended that you get the latest sources from the Darcs repository instead. This means not only that you get the latest changes (which may not yet have been included in the released versions), but also that you will be able to send your changes as patches. Any contribution of patches is very welcome!

If you don't know Darcs yet, please consider having a look at it. It is a really nice revision control system (i.e. an alternative to more widely used ones like CVS or Subversion), very easy to learn and somewhat addictive once one has used it for a while. It differs from the more mainstream alternatives in that it is not only distributed, but also change-oriented (as opposed to version-oriented). You can get a Darcs client and find more information about Darcs at the main Darcs website at darcs.net.

If you want to access the Cohatoe repository, here's what you need:
> darcs get --partial http://leiffrenzel.eu/repos/cohatoe
Have fun!

Monday, January 01, 2007

Tracing deployed plugins

In a previous post I have explained how to use Cohatoe's debugging options (called Tracing in Eclipse) for getting more info about what goes on inside Cohatoe when Haskell functions are registered and called. This is useful when you are contributing an Eclipse extension and something goes wrong, so that your function call doesn't actually happen.

You can also use the same tracing options in an Eclipse installation where your plugins are deployed. This is in some respects different from the debugging situation I have described. Your plugins may now reside in .jar archives, for instance, where before they lived in project folders in the workspace, and that may have some unforeseen effects.

And there is another typical problem with deployed Eclipse installations. When you export a plugin, you have to declare, in the build.properties, which files should be exported and which not. You can find the build.properties in the Plugin Manifest editor, from the Build tab.


In the screenshot, you can see how this looks in my workspace for the Cohatoe examples. You see that the files needed only at development time (such as Eclipse's .project file or the .checkstyle configuration) are not checked, but the files that we need to deploy, such as the plugin.xml or the object files, are checked. A typical case is where a developer first creates a project, later adds some files, such as icons or object files, forgets to check them in the build.properties, and then wonders why something doesn't work in the deployed version. To me at least, this happens all the time :-).

One way to detect such a problem, at least with object files contributed via Cohatoe, is to use tracing. In the post I mentioned above I have explained what the tracing output looks like. If an object file was missing, this would be reported in a line in the tracing output.

Now suppose you have some Eclipse installation, where Cohatoe is installed.


This is what a typical Eclipse installation looks like. The only thing that I have added is the .options file. Its content is quite simple:

# option for tracing calls to the Cohatoe server,
# including inits and function calls
de.leiffrenzel.cohatoe.server.core/logs = true
I have just copied it from the Cohatoe server core plugin. (You can add options for other plugins in here too, and get their tracing output too.) Note that I have switched the value to true; the original file has false in this place.
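Since the .options file uses the familiar key = value layout with # comments, its content can be read with java.util.Properties. The sketch below does just that, to show how the option key from the file maps to a boolean. It is an illustration only: TraceOptions is a hypothetical class, and inside a running Eclipse a plugin would normally ask the platform instead (for example via Platform.getDebugOption) rather than parse the file itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper that reads .options content as Java properties.
final class TraceOptions {

    /** Returns true only if the given key is present and set to "true". */
    static boolean isEnabled(String optionsFileContent, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(optionsFileContent));
        } catch (IOException e) {
            // Cannot really happen for an in-memory reader.
            throw new UncheckedIOException(e);
        }
        return Boolean.parseBoolean(props.getProperty(key, "false").trim());
    }

    public static void main(String[] args) {
        String content =
            "# option for tracing calls to the Cohatoe server,\n"
            + "# including inits and function calls\n"
            + "de.leiffrenzel.cohatoe.server.core/logs = true\n";
        System.out.println(isEnabled(content, "de.leiffrenzel.cohatoe.server.core/logs"));
    }
}
```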

Now run Eclipse from the command line with the following options:
> cd C:\cohatoetest\eclipse
> eclipse -debug -consolelog
You get a console window where you will see the tracing output (and as a nice extra, you will also get the contents of the workspace log appended there, i.e. the output that usually goes to yourWorkspaceLocation/.metadata/.log).