York Haskell Compiler

Sunday, April 03, 2011

Yhc is dead

Someone recently asked on the Yhc mailing list if Yhc was dead. The answer is yes, noone has been working on the compiler for several years. The nhc98 compiler, from which Yhc originally forked, is still maintained. Since this is the end of Yhc, I thought I'd share a few random thoughts about Yhc:

Writing a Haskell compiler is a big undertaking, and the work required to compile a moderate number of programs from Hackage is immense. Too many libraries rely on something GHC specific - either language or runtime features. The GHC team have gone out of their way to ensure that their compiler is by default a standard Haskell compiler, which has kept Haskell as a viable language separate from GHC. But without widely-used competing implementations, programs tend to end up depending on GHC.

There are still other non-GHC Haskell compilers, and I wish them all well. Many offer features missing in GHC (compile to Javascript, tiny runtime system, extreme portability, code mobility etc.) - there are lots of interesting ideas floating around.

Yhc started in a fairly haphazard way, and became a fork of nhc long before anyone had noticed. Had Yhc's contributions been rolled back into nhc they'd probably have had more lasting impact.

One of the big draws for Yhc was it's ability to take a Haskell program, and produce a Core complete program. Despite all of Yhc's other weaknesses, Yhc Core drew several people to the project - I hope something equivalent appears for other Haskell compilers.

If you don't understand monads, you aren't yet ready to write a Haskell compiler.

If you have a group of Haskell programmers in the same place, you should try a group project. It's fun.

The biggest challenge for Yhc was the build system - we ended up with 10,000 lines of Python Scons scripts. Without a robust build system nothing else matters. When our sole Python hacker left the team that was the beginning of the end.

Working on Yhc was fun, and I learned a lot. Thanks to everyone who was involved.

Sunday, May 18, 2008

Haskell on BEAMs

Recent developments showed that Haskell, via conversion from Yhc Core, can run on:

Python runtime
Javascript in a web browser
Reduceron, which is more hardware than software

Ability to run Haskell on some runtime gives Haskell programmers access to some or all features provided by that runtime, while keeping the features provided by the language unchanged. So it may be worth trying more runtimes to run Haskell programs on.

BEAM is a virtual machine Erlang runs on. Erlang provides a distributed fault-tolerant concurrency-oriented computing platform, and its use keeps widening for various kinds of applications. BEAM is in plural in this blog post's title because a distributed Erlang program is running on multiple instances of BEAM, and so would a Haskell program in the same environment.

There are projects aimed at compilation of some programming languages to Erlang or Core Erlang:

So, why not add a compiler from Yhc Core to both collections?

For about a month I have been experimenting with a converter of Yhc Core to Core Erlang. The converter was in part based on the Javascript backend converter, and in part on Haskerl. The goal was set to run a Haskell program that utilizes some commonly used subset of Haskell Prelude, and is also able to deal with creation of Erlang processes, sending, and receiving messages. This program served as an example.

The goal was reached, so I would like to publish the results for everybody to see and criticize. Any feedback is appreciated.

Read on.

Labels: Erlang

Friday, March 28, 2008

Yhc/Javascript backend: personal/historical notes

It looks like this blog has been silent for more than a year ;) In the light of the first milestone reached recently by one of Yhc subprojects, the Javascript backend, I am happy to break the silence and put some personal historical notes as well as notes about the current status of the project I have been working on for more than 1.5 years.

1. The "crazy idea"

I was not definitely the first to have a thought to convert Haskell to Javascript. One that I can now recall was a piece of text on this page (by Jared Updike, dated March 2006):


------------------
...In addition I was thinking it would be cool to compile Haskell to
JavaScript so then you could actually write arbitrary things in
Haskell and run them in browsers...
------------------

There was a number of others that I lost track of. Anyway, the idea was in the air, and seeing other people expressing it prompted me that it was worth trying.

2. First experiments

I started my experiments with Nhc98 in June 2006. The intermediate form of Haskell program before it was turned to bytecodes was called PosLambda. It was not very easy to understand it, but some primitive things worked out. Later I noticed that the
Hajax page appeared on Haskellwiki (the link points to its earliest version) which referred to some of my developments. I started my own Wiki page, STG in Javascript, sort of response to the Hajax page. The "STG in Javascript" page is still there, showing the progress during that time (Summer 2006). Later, another developer (nickname Vir) picked up the topic, however his work is completely independent from mine.

3. Yhc core

Some time in September 2006 I learned (thanks Neil) about Yhc Core: an intermediate format that Yhc produces during compilation, similar by its idea to GHC Core, but much simpler and with more stable format. Plus, it has an externalized form, so in order to process it there was no need to alter the code of the Haskell compiler itself. These benefits were very attractive, and Yhc Core had much in common with PosLambda, so transition of my converter was relatively painless. In October 2006, the sources of the Javascript converter were checked into the Yhc darcs repo, so the Javasscript backend "officially" became a subproject of Yhc.

4. The Roman echo

On November 16, 2006, the first runnable example was published. It was a web page with an input field. When a decimal or Roman number was typed in, it responded with the same number converted to Roman or decimal (conversion code by Malcolm Wallace), plus approximate time that the conversion took. First results were perhaps terribly slow: time shown might be seconds, yet it worked.

5. DOM interface

It was absolutely necessary to find a way how to call DOM functions from Haskell. In the first Roman echo example, these calls were manually coded, but for more than one hundred of them defined by the Web Consortium, that did not seem acceptable. In addition, accurate representation of DOM methods' argument type information was critical, as one of Haskell's benefits in Web application design is strong static typing (as opposed to Javascript with no static typing at all).

Happily, the Web consortium provides DOM interface specifications in OMG IDL syntax. And there is an utility that is capable of understanding DOM IDL: H/Direct (by Sigbjorn Finne). The way H/Direct reflected IDL interface inheritance was simple, but not very open-ended: all sub-interfaces of an interface were presented as constructors of an algebraic data type, which meant that in order to add an instance of an interface, the whole thing needed to be recompiled as Haskell does not have extensible algebraic types.

So, the the Haskell code generator was stripped off from original H/Direct code while the IDL parser remained in place. The new Haskell code generator transformed the hierarchy of IDL interfaces into hierarchy of Haskell type classes. This is a more open-ended solution, as instances to classes may be defined elsewhere without changes to class definition itself.

Finally, DOM IDL interface definitions downloaded from the Web Consortium website were passed through the modified H/Direct, yielding tens of modules (each DOM interface was placed in a separate module), preserving the type information as much as it was possible (indeed, were DOM interfaces designed with much regard to strong static typing anyway?).

Of course, some manual coding still was necessary (mainly around the type class defaulting problem), but its amount is incomparably smaller than the amount of Haskell code generated from the DOM IDL definitions.

6. Unicode

Surprisingly, Javascript code running in Web browser has almost no access to Unicode character properties. Case (upper to lower, and the opposite) conversions are of course available, plus something can be done using regular expressions, but that's mainly it. No direct access to Unicode character category, no titlecase conversion, etc. Some logic was borrowed from Hugs and GHC where such information was provided as a lookup table (by character code ranges). This resulted in about ~70 kbytes added to the Javascript runtime included in every Web page generated, but at least essentially the same Unicode character properties API that is available in Hugs and GHC became available to Haskell programs running in Web browser.

7. Application coding paradigm

At the moment DOM bindings were complete, it became possible to call DOM functions directly (in continuation-passing style). So Haskell program looked similarly to Javascript, and this way of coding seemed to require too much effort. It was the same old sequential programming. CPS might be replaced by monadic interface, but this did not change the matter. Some functional-reactive paradigm was necessary.

The Fudgets library inspired some experiments. At the first attempt, the lowest-level layer of Stream Processors was successfully ported (tested on Firefox). Unfortunately, the Fudgets library itself showed some execution overhead, and finally when tested on MSIE, significant memory leaks.

Later, CPS Monad was tried out (instead of plain CPS). Same situation: works on Firefox, leaks and freezes on MSIE.

Plain CPS, though, never had such problems. So, after several months of experiments, it was finally decided to stay with plain CPS, and try to find a suitable solution to describe static Web page layout in mostly declarative way while leaving some degree of freedom for developers as to how to define the interactive (dynamic) part of an application.

In addition, pseudo-threads (in fact, coroutines) and message boxes for inter-thread communication were included in the runtime making it easier to develop interactive, event-driven applications.

8. Haskell web toolkit

It turned out that a relatively thin layer of combinators on top of DOM may solve the problem of defining the static layout of Web page. Most of ideas were borrowed from the HTML and its successor XHTML libraries (by Andy Gill, Bjorn Bringert).

Static structure of a Web page is described as a hierarchy of widgets, some (containers, that correspond to HTML elements with opening and closing tags) may nest other widgets, and some are terminals (leaves, like text labels or images). So, developing such static layout does not differ much from writing HTML, only these constructs are not indeed static: powerful functions like map and fold may be used to operate over multiple widgets at once.

Dynamics of the user interface can be added later, and it is almost completely orthogonal to static design. Special elements called activators expose the same interface as other widgets (that is, may be nested in containers), but their semantics includes runtime operations on their "parent" widgets such as handling user-induced events, on-the-fly style and content manipulation, or data exchange over XMLHTTP.

The name picked, "Haskell Web Toolkit" unintentionally resembles "Google Web Toolkit", but indeed, they both serve the same purpose: bringing in some language not supported natively by Web browsers as a language to program for Web browsers. Weights and sizes of the two toolkits are of course incomparable ... yet ;)

9. The first milestone

As the Javascript backend's code matured, the idea appeared to set up a web service for everybody to submit their Haskell source code for compilation and instant loading into their Web browsers. Such service was implemented, based on CouchDB as intermediate storage. The service was quietly opened for public testing on March 4, 2008. Having such service available was considered reaching the first milestone in the Javascript backend project.

So, what has been achieved since the first "Roman echo" demo was published:

speed of generated Javascript was significantly increased, partly due to improvements in the code generator, partly thanks to the Yhc Core optimizer which was developed in the same time when Javascript backend

a tool to automatically generate Haskell bindings from OMG IDL interface definitions was created. It is currently investigated whether Javascript documentation tools such as JsDoc could be used to generate OMG IDL definitions for existing Javascript libraries: this would make integration of them with Haskell much easier

programming paradigm for Haskell Web applications was defined. The Haskell Web Toolkit library is not large at the moment, but it will definitely be extended with new pieces as usage of Haskell in Web programming gets wider

a proof-of-concept live Yhc Web service was launched, so everybody interested may try Haskell in their browsers without installation of any software

10. Acknowledgements

Thanks to the Yhc Team for helping understand some internals of the compiler. Special thanks to Neil Mitchell who developed the Yhc Core format and the Core tools library. This helped a lot to improve the quality of Javascript generated by the backend. Some people contacted me privately and suggested some ideas or identified bugs. I am not putting names here, but feel free to identify yourselves in the comments.

And most of all: interested developers are welcome! The more people contribute in this project, the more chance Haskell has to become wider used in Web applications whose importance grows over time.

Monday, February 05, 2007

100 bugs!

Three days ago Thorkil Naur filed bug #100 for Yhc. Using a bug tracker was a good step for Yhc - and has helped us remember, track down and fix loads of bugs. We currently have 101 bugs, 62 of which remain open. Some bug statistics:

7 beginner bugs - these are bugs which don't require much Haskell knowledge, or in depth compiler knowledge. If you are just getting started, all of these should be within reach.
15 defects which are either High or Medium importance - at least one of which got fixed this morning!
30 enhancement requests - often relating to cleaning up the internals of Yhc.
17 bugs owned by me (2 by Andrew, 4 by Dimitry, 2 by Tom, 2 by Mike) - 25 waiting for you to own them :)

Well done to everyone who has contributed to Yhc - and hopefully we'll be able to reduce the bug count to 0 at some point!

Thursday, December 21, 2006

1000 patches!

The Yhc project hit 1000 patches sometime today. Congratulations on Dimitry for being the 1000th patcher, and for everyone who's made a commit.

This does of course mean that doing a non --partial darcs get is going to take forever...

Thursday, December 14, 2006

Yhc.Core.Html

The Yhc Core stuff has been coming on loads recently, being used in more and more projects, and gaining lots of cool features - even a strictness analyser! Check out the Yhc documentation here: http://www.cs.york.ac.uk/fp/yhc/snapshot/docs/ . A lot of this work is focused towards performance enhancements, so I'll give more details on that once we have a headline number to brag about :)

One cool thing I've been working is Yhc.Core.Html (not yet in the documentation index, but it should be tomorrow). If you load up Hugs and take your Yhc.Core file:

> loadCore "Roman.ycr" >>= writeFile "Roman.html" . coreHtml

That will generate a file called Roman.html, which is a hyperlinked, colour-coded viewer for your code. A sample is here. The links which are internal to this file are hyperlinked, and it also hilights the uses as well. Click on an identifier to see what it does.

Hopefully this should be of use to people who spend most of their day pouring over the details in a Yhc.Core document - i.e. me!

Monday, October 16, 2006

Yhc.Core documentation!

Since a few people now have an interest in Yhc.Core I thought I'd actually document it with a little tutorial style introduction.

There are also some little examples, including one to count the number of times you use the literal 42 in your code.

http://haskell.org/haskellwiki/Yhc/API/Core#Some_little_samples

We've also been busy getting installing set up, you can now do "scons install" on Linux, download nice snapshots on Windows, and there is progress towards Gentoo eBuilds and Debian .deb's.