| |
February 2008
Arc's been out for a few days now. Everyone seems to have an opinion
about it. The low end are much the same as the low end of opinions
about anything online. There's no high end yet, because no one has
had enough time yet to be able to speak from experience about what
it's like to program in Arc. Those are the responses I'm really
interested in. But in the meantime, we have the medium-level
responses: the opinions of people who have some understanding of
the issues involved, but who are writing based on first impressions.
I've noticed a strange pattern in these. The main fault they find
with Arc is that I don't seem to have had to work hard enough writing
it. Ron Garret
writes:
And that is my main gripe about Arc: it has been so long in the
making and set such lofty goals and then it seems to pretty much
punt on all the hard problems of language design.
I don't usually refute criticisms directly. Refutations tend to
be more gratifying to write than to read. But in this case I'm
going to, because in this case Ron & Co are mistaken in an illuminating
way. Explaining why will clarify some important issues about
language design.
Ron's right. I didn't decide what problems to work on based on how
hard they were. Instead I used what might seem a rather mundane
test: I worked on things that would make programs shorter.
Why would I do that? Because making programs short is what high
level languages are for. It may not be 100% accurate to say the
power of a programming language is in inverse proportion to the
length of programs written in it, but it's damned close. Imagine
how preposterous it would sound if someone said "The program is 10
lines of code in your language and 50 in my language, but my language
is more powerful." You'd be thinking: what does he mean by power,
then?
I'm not claiming that power is the only criterion by which to judge
programming languages. It would also be a legitimate goal, for
example, to design a language to be easy for kids to learn, or to compile
efficiently (though this is less common than it used to be), or to
limit the damage that can be done by individual bad programmers
within a group. But power is the test of languages not designed
for such special purposes.
So working on what makes programs short rather than what's hard
to implement translates to:
I chose what to work on based on the value to the user,
rather than the cost to me.
Surely this is the right order of priorities to have in designing
not just programming languages, but anything meant to be used
by other people.
This is not to say none of the stuff I did was hard. Some of it
seemed hard to me. But in language design, solving problems, whether
hard or easy, is not the goal. Making a good language is. The
real test of Arc—and any other general-purpose high level language—is not
whether it contains feature x or solves problem y, but how long
programs are in it.
The programs that get shorter should be the ones users actually
need to write. So my m.o. while working on Arc was to write
applications in it, then comb through them line by line trying to
imagine language features that would make them shorter. Then I'd
implement those features, rewrite the program to use them, and
start over. Here's a comment from the source of
Hacker News where I
occasionally kept track:
; results of (codetree "news.arc"):
; 8787, 8760, 8738, 8726, 8823, 8755, 8703, 8628, 8587, 8565
; 8633, 8573, 8552, 8520, 8510, 8498, 8549, 8515, 8684, 9025
; 9573, 12196, 12469, 12648, 12373
(The numbers go up sometimes because I'd added features to News.)
This is one reason the source code of Arc itself is so short,
incidentally. I did the same thing to it. But my first priority
was making applications shorter, not the language. There are
features, most notably Prolog-style pattern-matching, that seem to
promise great savings in length, but turn out only to be useful for
writing a few basic sequence operations like append, remove, and
so on. Prolog is a great language for writing append; after that
it's all downhill.
Another goal I had while writing Arc was to continue as long as
possible in the mode in which McCarthy began. In his original 1960
paper he built Lisp up from "axioms" like car, cdr, and cons, through
"theorems" like assoc and mapcar.
[1]
There must be some optimal path
all the way up to a complete language. What is it? McCarthy didn't
get very far along it in his paper. And after that the language
passed into the hands of his grad students, who at the time were
more worried about the exigencies of making an interpreter run on
the IBM 704 than continuing McCarthy's axiomatic approach. We've been
living with their hacks ever since. Steele and Sussman tried to
start over when they first began working on Scheme, but they seem
to have been practically the only ones. And they made, at least
from the point of view of brevity/power, some serious mistakes early
on.
This seemed a territory worth exploring. And I hoped that with the
axioms pushing from below and the demands of brevity in real
applications pushing from above, I'd be able to grow an optimal
core of operators. I'm not claiming I have yet, just that that's
the goal: to find an optimal path from a small number of axioms
up to a complete language for everyday use. I've made compromises
for efficiency. I'm not using Church numerals.
[2]
But I've tried to preserve
as much of the spirit of the original 1960 paper as I could.
Building up the language from axioms is not an end in itself either.
I'm only doing it because I suspect that's the way to get
maximum expressive power.
How well does Arc deliver so far? Does it make programs shorter
than they'd be in other languages? Let's try measuring.
I'm going to propose a simple problem as a challenge.
We'll collect solutions in each of the popular languages, and
compare their lengths. Here it is:
Write a program that causes the url said
(e.g. http://localhost:port/said) to produce a page
with an input field and a submit button. When the submit
button is pressed, that should produce a second page with
a single link saying "click here." When that is clicked
it should lead to a third page that says "you said: ..."
where ... is whatever the user typed in the original
input field. The third page must only show what the
user actually typed. I.e. the value entered in the
input field must not be passed in the url, or it would
be possible to change the behavior of the final page
by editing the url.
Though simple, as such tests have to be, this is not a contrived
example. Web apps have to do this sort of thing all the time. Nor
does it depend on some sort of esoteric libraries that Arc has and
other languages don't; this is all stuff that any language used to
write Web apps would have to have already.
Here's the answer in Arc:
(defop said req
(aform [w/link (pr "you said: " (arg _ "foo"))
(pr "click here")]
(input "foo")
(submit)))
If you're not used to Arc you'll have to take my word for it, but
this is not code that has been compressed using coding tricks. This
would be the standard way to write it in Arc.
The most meaningful test of the length of a program is not lines
or characters but the size of the codetree-- the tree you'd need
to represent the source. The Arc example has a codetree of 23
nodes: 15 leaves/tokens + 8 interior nodes. How long is it in your
favorite language?
(Code to import standard libraries doesn't count, of course; you
can assume those are already loaded.)
I've posted this problem on the Arc Forum at http://arclanguage.org/item?id=722.
If you have a solution in a language that's not yet represented,
or a shorter or more correct solution for a language that is, please
add it. It will be interesting to compare languages not just to
Arc but to one another.
Update: May 2009
The Arc version would now be slightly shorter:
(defop said req
(aform [onlink "click here" (pr "you said: " (arg _ "foo"))]
(input "foo")
(submit)))
This is 21 nodes: 14 leaves + 7 interior.
Notes
[1]
"Recursive Functions of Symbolic Expressions and Their Computation
by Machine, Part I," CACM, April 1960.
http://www-formal.stanford.edu/jmc/recursive/recursive.html
[2]
I did once try representing the integer n as a list of
length n, with horrifying results.
|
|