The genius of Stallman


In my attempt to write an RTL-based binary retranslator (essentially an
asm->RTL decompiler) I've had to read and understand some of the GCC
generator programs in some detail. This is a project for the "Design and
implementation of the GNU Compiler collection" course, a course
specifically meant for understanding some of the complexities of GCC.

The pièce de résistance of GCC is of course the code-generator programs.
These programs are invoked during the compiler build process and they output
C programs. The input to the generator programs is the set of "machine
description" files, an elegant declarative way of specifying the instruction
patterns of a machine. The generated programs are thus the 'machine
dependent' part of the code-generation phase.

The whole set-up is extraordinarily convoluted, complicated, and elegant at
the same time.

I've been mainly focused on the instruction-pattern recognizer: genrecog.

Genrecog is the generator program that produces the RTL->asm_insn
recognizer. It takes the declarative patterns and predicates present in the
MD files and converts them to C tests; the generated recognizer runs those
tests on an insn and returns the code of the instruction to be emitted.
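
To make that concrete, here is a rough sketch in C of the shape such a
recognizer takes. This is not GCC's generated code, and nothing here is a
real GCC API: the toy_rtx type, the toy_* predicates and the TOY_INSN_*
codes are all hypothetical names I made up for illustration. The only idea
it borrows is the one described above: pattern shape and operand predicates
become plain C tests, and the function returns an instruction code (or -1
when nothing matches).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy expression codes, loosely modelled on RTL. */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_SET };

/* A toy expression node: a code plus up to two operands. */
struct toy_rtx {
    enum toy_code code;
    const struct toy_rtx *op[2];
    long value;                 /* register number or integer constant */
};

/* Hypothetical instruction codes the recognizer can return. */
enum toy_insn {
    TOY_INSN_NONE = -1,         /* no pattern matched */
    TOY_INSN_ADD_REG = 0,       /* reg = reg + reg */
    TOY_INSN_ADD_IMM = 1        /* reg = reg + small constant */
};

/* Operand predicates: the kind of test a machine description names. */
static int toy_register_operand(const struct toy_rtx *x)
{
    return x != NULL && x->code == TOY_REG;
}

static int toy_small_immediate(const struct toy_rtx *x)
{
    return x != NULL && x->code == TOY_CONST_INT
           && x->value >= -128 && x->value <= 127;
}

/* The recognizer: check the shape (set reg (plus reg X)) with C tests,
   then use the predicate on X to pick the instruction code. */
static int toy_recog(const struct toy_rtx *insn)
{
    assert(insn != NULL);

    if (insn->code != TOY_SET || !toy_register_operand(insn->op[0]))
        return TOY_INSN_NONE;

    const struct toy_rtx *src = insn->op[1];
    if (src == NULL || src->code != TOY_PLUS
        || !toy_register_operand(src->op[0]))
        return TOY_INSN_NONE;

    if (toy_register_operand(src->op[1]))
        return TOY_INSN_ADD_REG;
    if (toy_small_immediate(src->op[1]))
        return TOY_INSN_ADD_IMM;
    return TOY_INSN_NONE;
}
```

Given a node shaped like (set reg (plus reg 5)), toy_recog returns
TOY_INSN_ADD_IMM; with a constant outside the -128..127 range it returns
TOY_INSN_NONE. In this sketch the tests are written by hand; the whole point
of genrecog is that nobody writes them by hand.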

It was by chance that I stumbled onto the GCC-1.27 source on my hard disk.
"This will be fun", I said to myself, "It'll probably be just a simple C->asm
conventional compiler, easy to understand". In utter disbelief I stared at
the screen as it showed me the same source file (genrecog.c) I had been
reading for the past two weeks.
At the top it says "Copyright (C) 1987,1988 Free Software Foundation, Inc."

So yes, the code is 23 years old, and not much has changed! Even way back
then, back before I was even born, GCC had machine descriptions, RTL,
generator programs, and all the infamous passes:
reload, combine, final, recog, expand.

So the code itself has certainly survived the test of time, and of open
source. Let's not forget that GCC was among the first Free Software projects
undertaken, before the days of the internet (well, almost), when there was
no Slashdot or Reddit to tell the world about your Great New Compiler.

Look at a file in the Linux kernel again after two years and you won't have
a clue what happened. Entire subsystems get eliminated, refactored,
butchered, combined, merged, renamed... you name it.

Developers devour your code like piranhas, constantly making changes. But
what happened to GCC? It is indeed the Cathedral, standing tall after all
these years.

Coming back to the machine descriptions. What was most surprising after
browsing the 1.21 source (the earliest version of GCC available,
unfortunately) was that such an amazingly advanced concept was used even at
that time, back when the compiler was in its infancy. Even in those early
stages, GCC had its own (and quite powerful) extensions to the C language
(variable-length arrays, etc.), supported a bunch of architectures, and had
superb documentation. In fact, most of the current documentation of GCC (the
GCC internals document) is exactly the same as the one present in 1.21, in
internals.dvi (this is back in the days when PDF was yet to be invented).

And here's the really good part: all those 125,000 lines of magnificence
were written by one man. Read the changelog to find out who.

What amazing foresight and incomprehensible genius RMS must have had to
write a compiler framework using machine descriptions and generator
programs! Even today, 25 years after he wrote the first version, and after
hundreds of the most talented hackers have worked on the compiler, his
design, his code, his documentation, and his intuition behind the
retargetability mechanism all stand intact.

GCC, then, is the oldest "open source" software still in active use. Sure, a
lot of Linux code can be traced back to early BSD/UNIX/Sun, but only in
spirit.

But no, there is something older than GCC. It's the editor I am writing this
on.
Emacs.
For more than 35 years now it has been the most advanced text editor on the
planet, and that is again due to the genius of Stallman's design and
programming skills. Sure, vi users might argue, but the only reason vi has
survived so long is that lots of users have managed to contribute extensions
to it. And it has more users than Emacs because it appears to be easier to
use. Emacs, on the other hand, is still growing in abilities at a blinding
pace - mainly due to its brilliant design.


So RMS, it seems, was not only the last of the true hackers, he was also the
greatest.