Mind the gap please

Or, how all programming language evolution charts are incomplete.

Or even, how so many of you are looking the wrong way!

Just a quick moan this time. No doubt you’ve all seen charts which show how ideas from one programming language have flowed into others, in a kind of tracking-the-evolution sense.

Ever noticed what’s wrong with these diagrams? I think they are missing a crucial detail, and by missing this, they are missing a very important aspect of programming language work: the wider picture.

The same applies during many ‘discussions’ about programming languages etc. The discussion tends to focus on existing languages, as if we had to pick something that already exists. Looking backwards, in a sense.

What I think gets lost is the context. We’re programming to solve problems, and most of the fun parts of problem solving are done in our heads and then transferred to some executable format. There’s a gap between “brain” and “machine” that we need to bridge somehow, with a mix of languages, processes, techniques, and I think you’ll all agree the gap is pretty big at present.

My second favourite joke is relevant here. An eminent Comp Sci professor was asked what was the best programming language. Prof. paused, then answered “graduate student”.

So if we could, wouldn’t it be fun to explain to a graduate student what needed doing and then get a working program back? We’ve made the bridge shorter by using a fairly high-level entity to encode our solution. (YMMV though.) Sadly, there aren’t so many graduate students around and we sometimes have to do our own work. How do we reduce how much needs to be bridged?

Notice that I’m not recommending we work at the machine level. We should aim to do better than this. For example, I don’t program in Haskell: I write programs in my head and then write them down as Haskell programs. It’s the difference between programming with versus programming in. I would like my future languages to make it easier to write down what is in my head, not only because I’m lazy, but because it’s probably a bit safer that way.

One way to view programming language development is as an attempt to narrow the brain-machine gap. We try to provide features which make it easier to encode and think about concepts from the problem domain, and to say how they relate. Quite rightly, there’s a growing interest in techniques like DSLs (domain-specific languages) as a way to encode important ideas more directly, without the noise of the host programming language. We can go further though.

I like the flexibility and terseness given by Haskell-style languages – great for saying what you mean without too much ceremony. But quite often I know more about the problem domain and how things work than it is possible to write in the code, and this is annoying. I don’t want to write such things down as comments, or write test cases to try to document and confirm additional properties. I think we can do better, and one way is with the new generation of dependently typed languages. (You can find an intro in recent PragPub magazines, in particular starting with the April 2013 issue.) Dependent types aren’t perfect, but they are a good step forward and they introduce many new ideas to explore.
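To give a flavour of what “writing down more of what I know” can look like, here is a minimal sketch. It is only an approximation in GHC Haskell using the DataKinds and GADTs extensions, not a real dependently typed language, and the names Vec and vhead are made up for illustration. The length of a list is recorded in its type, so “this list is not empty” is checked by the compiler rather than stated in a comment or a test.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Natural numbers, used only at the type level to count elements.
data Nat = Z | S Nat

-- A list whose length is part of its type (hypothetical example type).
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Taking the head is only allowed on vectors with at least one element,
-- so there is no "empty list" case to forget.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x
```

With these definitions, vhead (VCons 1 VNil) is accepted, whereas vhead VNil is rejected at compile time.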

To sum up, I believe we need to be aware of the brain-machine gap, and of the need to develop tools (languages and otherwise) plus techniques to get more control over it. Don’t turn your back on it!

My PhD thesis is 15 years old

Since I left the organised education industry, my PhD thesis has not been readily available on the web. It’s about time I gave it a proper home.

So: my thesis can now be downloaded from here. (This is a postscript file converted to PDF, and might look a bit ‘scratchy’ in places. IIRC back then we only had 10 dpi fonts.)

I just found another version, if you prefer, which is a scanned image from the middle-class finishing school down the road. It was added late 2012 it seems.

I also realised that it was almost 15 years ago to the day that I passed my PhD viva (with minor changes). That seems a long time ago. What do I think about the whole thing from the distance of almost a sixth of a century? Basically, I’m still quite pleased with it. Quite a lot of what I said seems still relevant nowadays, particularly questioning the goals of wider NLP and trying to understand the value we’re creating (or not). I’ll go into more detail in another post.

But for now, here’s the abstract.

This research addresses the question, "how do we evaluate systems like LOLITA?" LOLITA is the Natural Language Processing (NLP) system under development at the University of Durham. It is intended as a platform for building NL applications. We are therefore interested in questions of evaluation for such general NLP systems. The thesis has two parts. The first, and main, part concerns the participation of LOLITA in the Sixth Message Understanding Conference (MUC-6). The MUC-relevant portion of LOLITA is described in detail. The adaptation of LOLITA for MUC-6 is discussed, including work undertaken by the author. Performance on a specimen article is analysed qualitatively, and in detail, with anonymous comparisons to competitors' output. We also examine current LOLITA performance. A template comparison tool was implemented to aid these analyses. The overall scores are then considered. A methodology for analysis is discussed, and a comparison made with current scores. The comparison tool is used to analyse how systems performed relative to each-other. One method, Correctness Analysis, was particularly interesting. It provides a characterisation of task difficulty, and indicates how systems approached a task. Finally, MUC-6 is analysed. In particular, we consider the methodology and ways of interpreting the results. Several criticisms of MUC-6 are made, along with suggestions for future MUC-style events. The second part considers evaluation from the point of view of general systems. A literature review shows a lack of serious work on this aspect of evaluation. A first principles discussion of evaluation, starting from a view of NL systems as a particular kind of software, raises several interesting points for single task evaluation. No evaluations could be suggested for general systems; their value was seen as primarily economic. That is, we are unable to analyse their linguistic capability directly.

I’ve been distracted…

I’ve not posted for a while, at least not here, but I have kept up the habit elsewhere.

If you didn’t know, I’ve been writing a few articles on Haskell and related topics for the Pragmatic Bookshelf’s in-house magazine. My general aim is to talk about higher-level issues of programming and what we can get from a functional approach. The key ideas are about putting data first and about getting the programming language to fit the problem. So conceptually I try to start by asking what data structures we need and what kind of transformations, then consider what we’d like from the programming language to make it easier or simpler to write the programs.

The current articles are:

One day I’d like to do a retrospective here, to restate the main points and think about what isn’t being explained clearly enough.

But soon, I need to get cracking on the next article to have it submitted before Christmas! It will probably feature Yesod, a full-scale web framework for Haskell. Or I may take it easy and just do some more Fay!

 

I remember the time before monads

I’ve been fortunate enough to get some of my ‘essays’ into PragPub magazine, starting with an overview of how functional programmers think and how they use their languages. See Issue 38 for the first installment. Subsequent months should see articles on types and testing, practical programming, refactoring, and dependent types. And yes, eventually something about monads.

When planning these articles, I’ve been reading around some of the recent books and tutorials in Haskell and Ruby – mainly to get a feel for what people might understand and for places where they might run into stumbling blocks. I’m still not entirely satisfied with current material on Haskell, which is one reason I started to write these pieces. The ‘expert’ writing on Haskell is fairly dry, a bit too academic, and I rarely see much of the enjoyment and wonder that keeps me programming in Haskell even after 20+ years. (I love coding, it must be said, and I like to use tools that let me have fun and do great things.)

The other main camp is the people learning Haskell and trying to explain their experiences to others. Though there are some very useful and interesting accounts out there, they generally also miss some important ideas or put too much emphasis on certain details (and I suspect that the expert camp is partly to blame, for not explaining some ideas as well as they could).

There are two main culprits: (a) a sense of denial and (b) monads.

Denial!

By denial, I mean the suggestion that Haskell etc denies itself the use of things like mutable state that everyone else takes for granted, and that a lot of what follows is an attempt to cope with our asceticism. My perspective is different: I see Haskell as starting from a different set of assumptions and arriving at a different place to the mainstream. We don’t need higher order functions to cope with not having mutable state; instead, I see HOFs as a useful tool for manipulating the data, and the flexibility of HOFs means that we don’t need to rely on things like mutable state so much. Hell, I’ll stick my neck out and propose a new law:

Callaghan’s 1st Law: the need for mutable state in a language is inversely proportional to its flexibility in manipulating data

Put another way, mutable state is a lower-level idea and becomes less important when your language supports higher-level ways of working.
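As a small illustration of the point about HOFs, here is a minimal sketch. The Order type, the data and the name totalOver are all made up for the example. An imperative version would loop over the orders and keep updating a running total; here filter, map and a fold say the same thing directly in terms of the data.

```haskell
import Data.List (foldl')

-- Hypothetical example data: (item name, price in pence).
type Order = (String, Int)

-- Total price of all orders costing more than the threshold.
-- No variable is ever updated: the prices are extracted, the cheap
-- ones are filtered out, and the rest are summed with a fold.
totalOver :: Int -> [Order] -> Int
totalOver threshold = foldl' (+) 0 . filter (> threshold) . map snd
```

For example, totalOver 100 [("tea", 80), ("coffee", 250), ("cake", 120)] evaluates to 370.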

A similar comment applies to our use of a type system. We’re not doing it to put ourselves in a strait-jacket for any silly or conservative reasons. Experienced Haskell programmers know that the type system is a tool for getting work done, and a great language for playing with designs, and they exploit these aspects to help them get their work done. Plus, if things get in the way, we often find ways to remove the obstacles.

So basically, I think FP and Haskell are more about opportunity than denial. We’re super-liberal!

Monads!

There’s a view that Haskell is 99% monads (or thereabouts), and that monads are some arcane mystical concept which only a few can master. Bullshit to both!

I remember the time before monads. It _was not_ a barren wasteland, where all we could do is write programs to build trees and not communicate with the outside world. We really could do real world stuff, including file operations, console IO, IPC, though it was a bit clumsy in places. At that time, I was doing PhD work on a large Natural Language Processing system, around 60k lines of Haskell and so one of the largest programs of its time. The program could process and analyze Wall Street Journal articles in a few seconds and build a complex semantic representation of the story, and didn’t use a single monad.

It was however a time of exploration, when researchers explored various ideas to find a good way of both having our cake and eating it. Monads are one of the solutions they found, and essentially gave us a small but flexible API for working with “computations” (like IO operations or state modifications, or various combinations thereof) as opposed to simple data values, and did so elegantly _within_ the standard language (i.e. no ad hoc extensions needed). It got even better when syntactic sugar was added.

This simple idea provided an excellent structuring pattern to tame a lot of clumsy code, and even more useful, gave us a solid framework for exploring more powerful ideas.
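To make the “small API plus syntactic sugar” point concrete, here is a minimal sketch using the Maybe monad from the standard Prelude, where the “computation” is a lookup that may fail. The tables and the function names are made up for the example. The first version chains the steps with the bind operator (>>=); the second is the same thing written with do-notation.

```haskell
-- Hypothetical lookup tables, stored as simple association lists.
userDept :: [(String, String)]
userDept = [("alice", "research"), ("bob", "sales")]

deptFloor :: [(String, Int)]
deptFloor = [("research", 3)]

-- Chaining with bind: if either lookup fails, the whole result is Nothing.
floorOf :: String -> Maybe Int
floorOf user = lookup user userDept >>= \dept -> lookup dept deptFloor

-- The same computation using do-notation (the syntactic sugar).
floorOf' :: String -> Maybe Int
floorOf' user = do
  dept <- lookup user userDept
  lookup dept deptFloor
```

Here floorOf "alice" gives Just 3, while floorOf "bob" gives Nothing because the sales department has no floor recorded. Nowhere do we write explicit checks for failure; the monad takes care of that plumbing.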

So monads are highly useful for some aspects of programming work, but they are certainly not an essential or core part. I estimate that 50-80% of most large Haskell programs do not involve monads at all – they are just pure data manipulation. Of the remainder, the monad use is mostly straightforward and follows certain common idioms. Real scary stuff is pretty rare.

Last words

As a new explorer (very warm welcome, by the way!), when you look at Haskell material you may see some very unusual or scary-looking stuff. But do bear in mind that a lot of it is just playing around with abstractions on top of the core language, and probably does translate to something more intelligible. Try to work out what is being said about the data being manipulated, and then it might not look so bad.

Another thing that works is to avoid writing code for things you know how to do, and instead try to write code for other things. For example, try thinking and playing with various tree operations, like leaf counting, traversals, map/fold/filter… Then, you’ll be less tempted to slip into imperative mode.
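If you want a starting point for the tree exercises, here is one possible sketch. The Tree type and the function names are just one choice among many, and it is well worth trying your own versions first.

```haskell
-- A simple binary tree with values stored only at the leaves.
data Tree a = Leaf a | Node (Tree a) (Tree a)
  deriving (Show, Eq)

-- Leaf counting by direct recursion on the shape of the tree.
countLeaves :: Tree a -> Int
countLeaves (Leaf _)   = 1
countLeaves (Node l r) = countLeaves l + countLeaves r

-- Apply a function to every leaf value, keeping the shape.
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree f (Leaf x)   = Leaf (f x)
mapTree f (Node l r) = Node (mapTree f l) (mapTree f r)

-- A general fold: say what to do at a leaf and how to combine two subtrees.
foldTree :: (a -> b) -> (b -> b -> b) -> Tree a -> b
foldTree leaf _    (Leaf x)   = leaf x
foldTree leaf node (Node l r) = node (foldTree leaf node l) (foldTree leaf node r)

-- Left-to-right traversal, written as a fold.
toList :: Tree a -> [a]
toList = foldTree (\x -> [x]) (++)

-- Keep only the leaf values satisfying a predicate.
filterLeaves :: (a -> Bool) -> Tree a -> [a]
filterLeaves p = filter p . toList
```

Once foldTree exists, it is a nice exercise to rewrite countLeaves in terms of it, for example as foldTree (const 1) (+).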

I also recommend Graham Hutton’s “Programming in Haskell” as the best book introduction to the language. It is a brisk but surprisingly complete introduction to Haskell and related techniques, and parts of it remind me of K&R. However, it is still an academic textbook for an introductory course, and can be thin on the pragmatics and wider picture.

There’s also a thread on the PragPub forums where I’m collecting ideas/requests for future articles. Please feel free to add comments there - http://forums.pragprog.com/forums/109/topics/10889

Making coffee the Chinese way

Today, I share with you a great insight.

Do you like coffee? Do you get fed up with the paraphernalia needed to make it, and washing everything up? I do. Why can’t it be as simple as tea? The fact is, you don’t need any hardware except for a cup and a teaspoon. And, you will probably use less ground coffee in the process.

How does it work? Take a look at one of the traditional methods the Chinese use for tea. They just put a few leaves in hot water, and when the leaves sink, it’s ready to drink. (And you can top up with water again to make the leaves go further – most good teas are fine for this.)

Well, the same works for coffee! So try this: 2 teaspoons of coffee grounds in a normal-size mug, with milk (optional) and then top up with near-boiling water, and leave for a few minutes. Stir occasionally. After about five minutes, most of the grounds will have sunk and you can drink the coffee.

It’s not bad, is it?

Worth mentioning: the ‘Byzantine’ method (Greek coffee etc) is kind of similar – heating the coffee and water mix to boiling and then not bothering to separate them.

Why free variable?

Someone asked. Well, I wanted something just a little bit geeky, but the delightful strictly positive has already gone, and terms like ‘iota reduction’ just don’t have the same ring. Plus, ‘free variable’ seems quite apt now that I’m away from the binder of organised education! – and having much more fun too.

Welcome to free-variable.org!


This site has been set up by Paul Callaghan. I’ll probably use it to discuss various aspects of programming language technology. I like programming, and like anything that helps me do complex things more elegantly and precisely. So expect to see something about Haskell, Ruby, and dependent types in the coming weeks. You’ll also see something about interesting algorithms and how to express them in a flexible language. Stay tuned!