
September 11, 2004

History Lesson: When Did We Start Missing the Point?

Now that I’ve selected a truly ancient language, I’ve also selected a truly ancient database system to go with it: PROLOG. I think I’m due to give a short history lesson on the whys involved.

Back in the early days of computer science, there were three models of computation: Turing’s (the Turing Machine), Von Neumann’s (the Von Neumann Architecture) and Church’s (the Lambda Calculus). Big proofs were made demonstrating that these various models were in fact equivalent. Most of these proofs boiled down to, for example, implementing a limited Turing Machine in the Von Neumann architecture (or demonstrating that it could be done), and vice versa. Two of these three models are widely referenced: the Turing Machine and the Von Neumann Architecture. People liked the Von Neumann Architecture the best, either because it was a good picture of the hardware or because the hardware could be made to look like it without much difficulty (I’m not actually sure which is the case and I’m too lazy to look it up). People liked (and still like) the Turing Machine because it’s easy to explain and looks good on a whiteboard, especially when you’re taking your Automata class and it’s sitting next to your DFA for matching regular languages or whatever.

Now, back in the early days, they actually made a handful of languages before Backus et al. came along and figured out what a language “is”; they actually implemented compilers of a sort without any of the real technology we think of today as being necessary for the job. As a result, FORTRAN is eminently difficult to parse, far more so than any other language we use today.

FORTRAN is interesting today primarily because of all the mistakes that were made in making it. Students every day take Principles of Programming Languages at their universities and learn all about FORTRAN, and while some of the notions in FORTRAN have survived, most of those notions were present beforehand. I’m talking about arrays, functions, and algebraic notation.

At the time, users loved FORTRAN because it was so much more readable than assembly code, and because it was designed more-or-less to map onto the hardware IBM was making at the time, you didn’t lose much by using FORTRAN over assembly. In the years that followed, most of the compiler optimization literature would come to be written about optimizing FORTRAN, so that today it’s not uncommon for a good FORTRAN compiler to have as many as 17 passes, each one optimizing the code. This is also why FORTRAN is still the most-preferred language on supercomputers, though some of that is also because all of the historical code for supercomputers is written in FORTRAN, and nobody wants to have to rewrite any of their code for some kind of short-lived fad like structured programming.

LISP was invented in 1958, just 4 short years after FORTRAN. FORTRAN’s main idea was basically this: abstract the hardware to the procedure level, and stay close to the hardware for everything else. Thus, it was born out of necessity; programmers needed some kind of abstraction in order to be productive. LISP, on the other hand, was created for several purposes: to be an implementation of Church’s Lambda Calculus (or something reasonably close to it), to be a high-level abstraction from the hardware in the mathematical sense, and for list processing. :) LISP is dynamic in truly amazing ways: take this “reader macro” from Successful Lisp:


(set-macro-character #\$
  #'(lambda (stream char)
      (declare (ignore char))
      (round (* 100 (read stream)))))

Now, whenever you type in, for example, “$14.99,” it will get translated to “1499” by the reader. If you use this convention throughout the code, you can avoid strange rounding errors with money. Quick: name another programming language that lets you modify the syntax on-the-fly.

LISP, as you can guess, ran abysmally slowly. This was for many reasons, the most frequently mentioned of which is, of course, automatic memory management. In the mathematical land of free-floating functions there is no such thing as memory, numbers do not occupy physical space, and so forth, so LISP couldn’t be bogged down in ridiculous implementation details like that. Working around those ridiculous implementation details back then meant that the computer spent mind-numbingly long periods of time figuring out which memory was in use and which wasn’t, and freeing the latter. LISP also let you define anonymous functions and pass them around. It has a bevy of fantastically outer-spacey features, because it is meant to provide you with the simplest abstraction that encompasses all other abstractions, by allowing you to create abstraction-creation machines. What I mean by this is that LISP is intended to allow you to factor anything you see in your code in more than one place into a single place. Even if that “thing” is recursion, for example, using the commonly-cited-and-highly-dreaded applicative-order Y combinator, which I include below, in Scheme, mostly to fuck with people’s heads:

(define Y
  (lambda (X)
    ((lambda (proc)
       (X (lambda (arg)
            ((proc proc) arg))))
     (lambda (proc)
       (X (lambda (arg)
            ((proc proc) arg)))))))

Is LISP the perfect language? I can hardly answer that question. It does seem to win based on some common criteria: it’s small, it’s extremely powerful, it’s fast (by today’s standards), etc. It also loses on some important points: string mangling is sort of shoddy compared to languages with built-in regular expression facilities, the syntax is “weird,” and the lack of infix arithmetic is especially strange. (+ (* 3 (* x x)) (- (* 2 x)) 5) doesn’t look much like 3*x^2 - 2*x + 5, which is much closer to what I really want to say: 3x² - 2x + 5.

But let’s look forward a little bit, past the presentation of LISP. About a decade goes by, with lots of *OL languages being invented (in other words, ugly languages designed with some perverted purpose to them). Then, suddenly in the early 70’s we see a very interesting language appear: PROLOG. Right around the same time, the curse of my existence is invented, C. I’ll discuss both of these in turn.

PROLOG, like LISP, was intended to bring humanity to a higher level of abstraction. Instead of saying “do this,” we say “these are the qualities an answer would have,” and PROLOG goes and finds your answer for you. It sparked a whole family of languages (as LISP did), called the declarative or logic-based family. Unlike in LISP’s family, the functional languages, PROLOG to this day remains the most successful member of its family, though some might caution that SQL comes pretty close to doing the same sorts of things. I would say the same thing about LISP, but for a conundrum I’ll address in a few minutes, which makes it very difficult to say what a functional language really is.

PROLOG is almost universally hated. I can think of many reasons for this: it has almost none of the traditional control structures, it has poor data types, backtracking is weird (though exception handling isn’t?), predicates don’t return values in the traditional sense, and often you want to do something simple and wind up writing something pretty unreadable to achieve it. For example, take the meat of this program I wrote or stole a while back, which finds words that overlap:


overlap(NameL, NameR, Overlapped) :-
        name(NameL),
        name(NameR),
        NameL \== NameR,
        atom_chars(NameL, LNameL),
        atom_chars(NameR, LNameR),
        append(_, Seg, LNameL),
        append(Seg, RightSeg, LNameR),
        length(Seg, N), N > 1,
        append(LNameL, RightSeg, Appended),
        atom_chars(Overlapped, Appended).

This code works by describing the solution. It says basically this: I have a predicate overlap/3, which is true when the first argument overlaps with the second argument by more than one character, producing the third argument. In point of fact, it’s doubtful that this predicate would work if it were used in any way other than the normal way, which is to find two names that overlap and the result of that overlap. The first three goals are pretty simple: give me a NameL and a NameR that aren’t the same (otherwise, you’ll get NameL = NameR = Overlapped). The following two goals deal with the fact that PROLOG’s types suck (it should probably be rewritten to work with strings instead). Then, we say something along the lines of: suppose there is a suffix of LNameL named Seg; that is, some prefix we don’t give a damn about, appended to Seg, produces LNameL. Further suppose that Seg appended to some RightSeg produces LNameR. If Seg is longer than 1 character in length, append the left name (LNameL) to the right segment of the right word (RightSeg), giving Appended. Convert it back to an atom and that’s Overlapped, the third argument.

Comparable code, even in a nice language like Ruby, is going to be huge. The reason, of course, is all the helpful searching that PROLOG does for you, which would have to be coded by hand in Ruby.
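To make that concrete, here is a rough sketch of the same search hand-rolled in Ruby. The name list and the method name are hypothetical stand-ins for the name/1 database above; all the backtracking PROLOG gives us for free becomes explicit nested loops:

```ruby
# Hypothetical stand-in for the PROLOG name/1 database.
NAMES = %w[roberto tomas maserati]

# Find every pair of distinct names where a suffix of the first
# (longer than one character) is also a prefix of the second, and
# splice them together on that shared segment.
def overlaps(names)
  results = []
  names.each do |left|
    names.each do |right|
      next if left == right
      (0...left.length).each do |i|
        seg = left[i..-1]                # a suffix of left
        next unless seg.length > 1       # PROLOG's length(Seg, N), N > 1
        next unless right.start_with?(seg)
        results << left + right[seg.length..-1]
      end
    end
  end
  results
end

overlaps(NAMES)  # => ["robertomas", "tomaserati"]
```

Even this is only the “normal” mode of use; the PROLOG version can, at least in principle, be run in other directions for free.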

I hope this shows the power of PROLOG. Show me how to do that in SQL!

In the late 70’s and early 80’s, PROLOG compilation became possible, and what with all the AI research going around, PROLOG was also a fairly hot topic. Most of the information about PROLOG today is based on information that was produced in the 80’s, and most of the texts we use today in this field were written in the 80’s. Since then, it’s basically deteriorated into a community that makes the LISP community look vast. But there is PROLOG for every platform, so I guess I’m satisfied.

The bane of my existence, C, was conceived of in the early 70’s, right on the heels of PROLOG. But C, unlike LISP and PROLOG, became very successful. Linguistically speaking, C is pretty dull. According to the tree available at the Computer Programming Languages History, we see that C comes from BCPL, which comes from CPL, which comes from Algol 60, which comes from FORTRAN. Thus, C came from a long line of hardware-oriented programming languages. The innovation of C seems basically to be the concept of portability: C is intended to be fairly easy to implement and abstract from a given platform, but still tightly coupled to the hardware. For example, you have to manage your own memory, you have to worry about the sizes of your integers, you have to worry about whether you’re about to make an invalid memory access, etc. Following along the lines of Worse is Better, C isn’t a massive improvement over BCPL or any other language; it’s just better enough to spread like wildfire. C++ also sucks, but is everywhere because it rides on the back of C. In a recent interview, Bjarne Stroustrup himself basically admits that C++ is built on C because it’s C:

“I think that ‘Algol68 with Classes’ would have been a better language than ‘C with Classes.’ However, it would have been stillborn.”

Such an admission is probably a clue. Look to the current family of worthwhile scripting languages: Ruby, Python, and Io. Each one supports functional programming, in the sense that the Y combinator can be written in it and anonymous functions can be made. What does this mean? It means that programmers find these concepts interesting, and perhaps even useful. In Ruby, for example, it can be difficult to get anything done without a block, and the block is creating (essentially) a lexical closure, or an anonymous function. In Python, if you’re using a generator, you’re basically making a closure and resuming it each time you ask for a value; very cool, but not quite as awesome as what’s going on with Ruby:


100.times { |x| puts x }

And here in Python:


def getN(N):
    i = 0
    while i < N:
        yield i
        i += 1

for i in getN(100):
    print i

The for loop is basically a block, but you can’t make a block in Python and just pass it around to places. The generator/iterator facility is great, but not quite as good as the Ruby method, which lets you do things like this:


myDatabase.execute("SELECT name, age FROM members") do |name, age| 
    puts "#{name} is #{age} today!"
end
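Incidentally, the claim above that the Y combinator can be written in these languages is easy to check. Here is a sketch of the applicative-order Y combinator from earlier, transliterated into Ruby; the factorial step is just an illustration of my own, not anything canonical:

```ruby
# Applicative-order Y combinator, transliterated from the Scheme version:
# Y = λf. (λx. f(λargs. x(x)(args))) (λx. f(λargs. x(x)(args)))
Y = lambda do |f|
  half = lambda do |x|
    f.call(lambda { |*args| x.call(x).call(*args) })
  end
  half.call(half)
end

# An anonymous "factorial step": given a function standing in for the
# recursive call, it produces one layer of factorial. Y ties the knot.
fact = Y.call(lambda do |recurse|
  lambda { |n| n.zero? ? 1 : n * recurse.call(n - 1) }
end)

fact.call(5)  # => 120
```

No names are needed anywhere in the recursion itself, which is the whole point: recursion has been factored out into a library function.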

Ruby is really getting there in terms of functionality. Can I create the reader macro $ in Ruby? Nope. Why are we so limited? Well, if you read Worse is Better up there, the assertion the author makes is that incremental improvement is favored by people for some reason. People don’t want blam, here’s LISP, learn it, love it, it does everything. People want Ruby “because it’s better than Python!” People wanted Python “because it’s better than Perl!” People wanted Perl “because it’s better than Sed & Awk!” People wanted Sed & Awk over C, people wanted C over BCPL, people wanted BCPL over CPL, over Algol, over FORTRAN. Add to this the whole problem of magnetic personalities, and programming language choice being (I hate to say it) almost a religious war, and things are even worse. We’ve been trying to escape from the wrong paradigm for so long, but all we can do is inch ever closer to where we want to be, adding syntax, built-in abstractions, new concepts, and so forth.

Am I bitter? I guess I am, to some extent, that I’ve invested all this time and energy into things which just aren’t as good, to the point where I actually feel fear about what it’s going to be like to do Project SOULTRAIN in these other technologies. The accepted Right Answer these days would definitely be Python or Ruby plus an SQL-based database. But then I would be facing the deliciously maddening inability to marry the relational and object-oriented paradigms. I lose sleep over that problem almost every night. Going this route, I just don’t have to think about it. I hope it will be liberating, but I know that the liberation just lets me get back to the problem at hand, which I don’t know how to go about fixing. People don’t want to be liberated to work on real problems; they want to be liberated to worry about how to format this line of code, or how to get the database to give us the data we want in a timely manner. People aren’t interested in actually doing anything. We want to make it easier to do things, but we don’t really want to do things. Above all, I guess I feel bitter because I’m scared again, like I felt when I was trying to learn Python for the first time and I didn’t know anything else. Scared of LISP… damn, I hope Allan isn’t reading this.

I guess I’m proposing an extension to Worse is Better: Inefficient is Better. Mac owners trying to convince people to buy Apple, Linux users trying to convince Windows users to convert, widespread use of C++ and Java all point to a fundamental desire to not get things done. If we really wanted to get things done, we’d all be sitting in the console cranking out LISP. But we’re not. The Linux people are sitting in Fluxbox (whose “killer” features can be implemented from within FVWM2, Sawfish, and SCWM) staring at each other’s blogs, wondering how much better things would be if we had XML-RPC on our servers. Did we start missing the point at FORTRAN? If LISP came out today, would everyone start using it? I bet we probably would, and it would probably have a better regex library. Now, moving to LISP is an evolutionary step: get rid of the syntax. Back in the FORTRAN days, LISP was a revolution: functions as first-class citizens, garbage collection, recursion, machine-independence. These are all commonplace now, except for the syntax. Maybe Arc will make it happen, Paul Graham seems to have a lot of interesting wacky ideas.

Posted by FusionGyro at September 11, 2004 06:24 AM
