Thursday, March 12, 2009

The danger of "language" in programming

I recently stumbled upon this blog entry from Steve Riley. There are a number of things in it that can be agreed with or argued about, but one in particular struck me as especially dangerous:

"[C++] has a million nuances (like the English language) that somehow make expressing exactly what you want to say seem easier than other languages."

I can interpret this sentence in three ways. The first is meaningless: just some pretty wording to impress people, only to say "C++ is good". I doubt Steve was that sloppy. The second is literal: C++ shares a trait with natural languages, the "million nuances". I don't think that was the intended meaning, though it was how I first understood it. The last (and most likely) is the analogy: C++ and English have more nuances than other languages (programming and spoken, respectively). I call it biased towards his native tongue, but his point remains.

Anyway, comparing natural and programming languages disturbs me, because drawing the line between similarities and differences is hard. That makes the jump from correct premises to wrong conclusions way too easy. Analogies are a great tool in many cases, but when talking about computers, they just don't work.

There is a fundamental difference between the two kinds of languages which is often overlooked: their usage. Natural languages are used to talk to people, while programming languages are used to talk to machines (and to people too, if not primarily). The difference between machines and people is this: a machine does as it's told, a person does what's needed.

In virtually every use of a computer, we have some specific need, which we describe to the computer using one or more programming languages. (This need may evolve, but that's not the point.) The resulting description of this need is the program. What often happens at this point is a disagreement between the computer and the human about the meaning of this program, which is a bug.

There are two ways to deal with bugs: correcting them and avoiding them. The production of a program of any kind always combines both approaches. Most of the time, the (explicit) focus is on correction. Sometimes, it is on avoidance.

Of these two, the only one that is significantly influenced by the structure of programming languages is avoidance. It is therefore crucial to insist on it when choosing (or designing) one. Several characteristics of programming languages can help or impede bug avoidance. Conciseness can help, since the number of defects per line of code is reported to be fairly constant across languages. A paranoid type system can help, if it is still convenient. But I think what helps most is obviousness. The more obvious, the better. Ideally no special case, no subtlety, no nuance. By itself, the language should be boring.

That leads us to why Steve's above sentence is so dangerous. Not because it's wrong (although it is), but because it's right. C++ does have many nuances. It is a very interesting and very subtle language, to the point that even machines (namely compilers) disagree about its meaning. That is not good. That is a recipe for disaster.