Friday, January 12, 2007

Does OOP really suck?

Foreword: Smalltalk zealots, don't be upset; I am mainly bashing C++ and Java.

Object Oriented Programming was quite a revolution. It even became mandatory in any project large enough: if you're not using OOP for your current project, then it is doomed. Or is it? Why?

Sure, OOP has advantages. Two major ones I have heard of are code reuse and encapsulation. That sounds great. Code reuse means I will have less code to write, less code to read, and less code to debug. Encapsulation means I will not know irrelevant implementation details about what I use, which is quite useful for teamwork.

Just a few words about intuitiveness. Some people say OOP is meant to be intuitive, but I think such a concern is irrelevant. For example, representing numbers with a sequence of digits is most counterintuitive (yes it is, for children). A much more intuitive way would be to represent them with a set of small balls. Or with a set of different things (the Egyptian numeration system was based on that: one symbol for the units, another for the tens, and so on for each power of ten). However, the positional system we use today is far more concise and easier to use, once we have learnt it. What I think is important is ease of use and, to some extent, ease of learning. Not intuitiveness.

Before OOP, we were doing structured programming. Remember the old days, when we had the data on one hand and the code on the other. A program was roughly a set of modules. A module is just some "place" (typically a couple of files) in which we store the definition of a data structure and the code which processes it (a set of functions). The nominal way to use a module is to call its functions on variables of its data type.

/* How to use a module */
/* 1: create some instance of the type shmilblick */
shmilblick data;
/* 2: Use it */
process(data);
process2(data, stuff);
/* ... */


Then, OOP provided us a new way to divide our programs. A program is roughly a set of classes. A class is some place (typically a couple of files) in which we store the definition of some data (the member data) and the code which processes it (a set of methods). We can now turn the code above into:

// How to use a class
// 1: create some instance of the class shmilblick
class shmilblick data;

// 2: Use it
data.process();
data.process2(stuff);
// ...


I have deliberately chosen an example where the procedural code and the OO one are isomorphic. This can look unfair, but reality is often like this: OO syntax with procedural semantics. So let's move on to more serious stuff.

When doing OOP, directly accessing the data is considered evil. So OO languages provided us a way to forbid programmers from doing so: "private" members. I have an interface (the "public" methods), so I have nothing to do with the implementation. Actually, it's fine. It is now more difficult for me to damage the program. Or is it? Encapsulated data is data accessible from a given place (a class, a module, or a file) and no other. We can achieve that even with languages which don't enforce such discipline. Data encapsulation may have been introduced by OOP, but it is not specifically OO. Sure, providing support for encapsulation is more convenient than not providing it, but a language does not have to be OO to do this.
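
Here is a minimal sketch of what I mean (the shmilblick name is reused from above; everything else, including the function names, is made up for this post): the header exposes only an opaque type and free functions, and the implementation file is the only place that can touch the data. No class, no "private", same encapsulation.

/* shmilblick.h -- the interface: an opaque type and free functions.
   Client code sees only these declarations. */
struct shmilblick;                        /* incomplete type: the layout is hidden */
shmilblick *shmilblick_create(int size);
void        shmilblick_process(shmilblick *s);
void        shmilblick_destroy(shmilblick *s);

/* shmilblick.cpp -- the implementation: the only place that touches the data. */
struct shmilblick {
    int     size;
    double *values;
};

shmilblick *shmilblick_create(int size) {
    return new shmilblick{size, new double[size]()};
}

void shmilblick_process(shmilblick *s) {
    /* work on s->values here ... */
    (void)s;
}

void shmilblick_destroy(shmilblick *s) {
    delete[] s->values;
    delete s;
}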

Besides, Object Oriented Programming is meant to promote code reuse. That is The Big Thing. The more I reuse my code (or others'), the less code I write. Less code has fewer bugs, is produced faster, and is likely to be more maintainable. Code reuse can mean two things: the code I write will be easier to reuse, or it will be easier for me to write code that can be reused. All other things being equal, I prefer the former over the latter, since the latter requires a conscious effort when writing my code, or at least a fair bit of refactoring. It is also possible to have a mix of the two, depending on what mechanisms are available. When we talk about OO, we generally talk about these: inheritance, design patterns, subtype polymorphism, and genericity.

If not carefully used, inheritance is dangerous: some implementation changes in the base class can compromise the correctness of the derived ones. We can lose the benefit of encapsulation. (I don't remember the toy example I was told about. It had something to do with list processing.) Code sharing through inheritance is then quite difficult, unless a certain level of stability can be assumed from the base class. However, the combination of inheritance and late binding leads to subtype polymorphism, which is quite useful when one wants a set of objects of the same static type that behave differently. This feature is powerful, but not exclusively OO: the same behaviour can be achieved with dynamic typing. Just imagine a print function which behaves appropriately (and differently) with integers and strings. The real advantage of subtype polymorphism through inheritance, then, is binding the code to the behaviour (and the class) rather than to the functionality. Apart from that, it looks like just a way of getting around a static type system.
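
As an illustration (the Printable hierarchy below is invented for this post, not taken from any real code), this is roughly what polymorphism through inheritance and late binding looks like in C++: the behaviour is bound to each class, and the right one is picked at run time.

#include <iostream>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A made-up base class: the behaviour travels with the class itself.
struct Printable {
    virtual void print() const = 0;
    virtual ~Printable() = default;
};

struct Integer : Printable {
    int value;
    explicit Integer(int v) : value(v) {}
    void print() const override { std::cout << value << '\n'; }
};

struct Text : Printable {
    std::string value;
    explicit Text(std::string v) : value(std::move(v)) {}
    void print() const override { std::cout << '"' << value << "\"\n"; }
};

int main() {
    // Objects handled through the same base type, each behaving differently.
    std::vector<std::unique_ptr<Printable>> items;
    items.push_back(std::make_unique<Integer>(42));
    items.push_back(std::make_unique<Text>("hello"));
    for (const auto &item : items)
        item->print();   // late binding picks the right print()
}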

Design patterns are more a way of programming than a language feature. They just fit well with the current usage of the most popular OO languages these days. A design pattern is roughly an incomplete snippet of code that solves some class of problems. Each time one has a similar problem, she copies (and completes) the appropriate snippet. At the expense of writing the same kind of code a few times (if not many), we are supposed to get more maintainable and less buggy code. My concern here is: can't we factor out these patterns? I am personally more and more annoyed when I see similar pieces of code again and again (a good example being "for" loops, especially the verbose ones from C). Worse: some design patterns could be factored out entirely. Those which mimic functional programming exist only because the language lacks closures and anonymous functions.
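
To make the point concrete, here is a hypothetical before/after sketch (all names invented): the Strategy pattern written out by hand, next to the same idea once the language has closures, where the whole pattern collapses into a function argument.

#include <algorithm>
#include <iostream>
#include <vector>

// The hand-written Strategy pattern: an interface plus one class per
// behaviour, all to carry a single comparison into the sorting code.
struct CompareStrategy {
    virtual bool before(int a, int b) const = 0;
    virtual ~CompareStrategy() = default;
};

struct Descending : CompareStrategy {
    bool before(int a, int b) const override { return a > b; }
};

void sort_with(std::vector<int> &v, const CompareStrategy &s) {
    std::sort(v.begin(), v.end(),
              [&s](int a, int b) { return s.before(a, b); });
}

int main() {
    std::vector<int> v{3, 1, 2};

    // Via the pattern: a whole class hierarchy for one comparison.
    sort_with(v, Descending{});

    // Via a closure: the "pattern" disappears into a function argument.
    std::sort(v.begin(), v.end(), [](int a, int b) { return a > b; });

    for (int x : v) std::cout << x << ' ';
    std::cout << '\n';
}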

Genericity is probably the most important: we often need to apply some behaviour to data of several (if not all) datatypes. Writing the same code for each datatype is almost nonsense, if we have a way around it. Genericity is such a way. Great, but definitely not an OO concept. Only true object languages, in which everything is an object, can naturally achieve this through mere inheritance. Otherwise, we have the ugly C++ templates, or parametric polymorphism with type inference as in ML and Haskell, or just a dynamic type system.
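
A tiny sketch of what genericity buys (the maximum function is a made-up example): one definition that works for every type with an ordering, written once with one of those (admittedly ugly) C++ templates.

#include <iostream>
#include <string>

// One definition for every type that supports operator<,
// instead of one copy of the code per datatype.
template <typename T>
const T &maximum(const T &a, const T &b) {
    return (a < b) ? b : a;
}

int main() {
    std::cout << maximum(3, 7) << '\n';                            // int
    std::cout << maximum(2.5, 1.0) << '\n';                        // double
    std::cout << maximum(std::string("abc"), std::string("abd"))   // std::string
              << '\n';
}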

Conclusion: I don't know if OO really sucks. I don't know OO well enough. However, I am under the impression that OO designs are often a way to empower an otherwise weak language, without giving it full power. For example, I am currently coding a neural network, for an application in the Real World. A neural network takes a sequence of real numbers as input, and returns a real number as output. This is a function. More accurately, this is a parametrized function. The parameters (a number of thresholds) define the exact behaviour of the network. And I want to work with several networks at once. So the function we need happens to be a closure. I use a language which doesn't support closures, so I am forced to wrap each network into an object, using a quite verbose pattern. That sucks.
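
To show what I mean, here is a toy sketch (the names and the trivial "network" are invented; the real code is more involved): the verbose function-object wrapper I am forced to write, next to the closure I would write in a language that has them.

#include <numeric>
#include <utility>
#include <vector>

// The verbose wrapper: a class whose only job is to carry the parameters
// of a function. The "network" here is a toy placeholder, not the real one.
class Network {
public:
    explicit Network(std::vector<double> thresholds)
        : thresholds_(std::move(thresholds)) {}

    double evaluate(const std::vector<double> &input) const {
        double sum = std::accumulate(input.begin(), input.end(), 0.0);
        for (double t : thresholds_)
            sum = (sum > t) ? sum : 0.0;   // stand-in for the real computation
        return sum;
    }

private:
    std::vector<double> thresholds_;
};

// With closures, the parameters are simply captured: no class, no ceremony.
auto make_network(std::vector<double> thresholds) {
    return [thresholds](const std::vector<double> &input) {
        double sum = std::accumulate(input.begin(), input.end(), 0.0);
        for (double t : thresholds)
            sum = (sum > t) ? sum : 0.0;
        return sum;
    };
}

int main() {
    Network net({0.5, 1.5});
    auto same_net = make_network({0.5, 1.5});
    double a = net.evaluate({0.2, 0.9});
    double b = same_net({0.2, 0.9});
    return (a == b) ? 0 : 1;   // both forms compute the same thing
}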

2 comments:

Anonymous said...

says Mr "28 pins" :)

Anonymous said...

Well said.