lilypond-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Uncommented code in LilyPond


From: David Kastrup
Subject: Re: Uncommented code in LilyPond
Date: Mon, 03 Sep 2012 11:45:50 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2.50 (gnu/linux)

"address@hidden" <address@hidden> writes:

> As a frequent producer of uncommented code that pushes into a code
> base with uncommented code, I am more than willing to put comments.
> However, I'll say that a lack of comments has never stopped me from
> understanding how something works in LilyPond.

With all due respect, you don't even understand how the performance
regression in your latest uncommented own submission comes about.

If you don't understand your own code right after writing it, how do you
expect anybody else to understand it?

In an ideal world, people start writing the comments.  First about what
they are doing, then it becomes more fine-grained, namely _how_ they are
going to do it.  If this is done properly, the code itself can then be
written by any code monkey.

And can subsequently be improved in passing by another code monkey who
has a better idea how to cast the described problem into code.

With LilyPond, this does not work.  When looking at any code, you first
need to analyze its _low-level_ operation.  Then you need to figure out
all the steps it does and hope that they form a coherent picture
_solving_ a problem.  Then you need to develop the problem description,
and _then_ you can think about a reimplementation.

Take a look at <URL:http://codereview.appspot.com/6498070/>.  The code
does not contain a _single_ comment.  You remove some things, juggle
some things around.  _What_ is this supposed to do?  _Why_ is something
happening that results in better output?  _What_ is happening?  Where is
the explanation about why the code you removed could be removed?

Did you understand why the original code was in place?  Probably not,
since the original code was not documented either.

The current LilyPond code base is a piece of unmaintainable crap.
High-level crap, but crap nevertheless since it is, due to lack of
comments, tied very hard to its original authors.

You are not responsible for the original heap of crap (which is rather
high-quality because much of it is well-discussed two-person crap
between two highly capable programmers who just did not bother writing
down their discussions).

But by heaping new layers of undocumented undesigned unmaintainable crap
on, LilyPond becomes manageable worse.  And more and more people think
this is the way to do it.

> It is a single threaded program, so if you are not getting a result
> you expect, it is easy with valgrind or gdb to find where things are
> being set and why.

Very funny.  There is a simple rule: debugging code is twice as hard as
writing it.  If you want to be able to debug your code, don't write at
the limit of your capacities.  Coding is not composing.  The work is
never finished, and other people need to be able to continue what you
started.  And understanding foreign code is twice as hard as
understanding your own.  If you are writing at the limit of your
capacity, it will take somebody twice as smart as you are to work with
your code.

Deal with it.  Good well-maintainable code is boring.  Its _design_ may
be clever, and that design needs to be documented.

> I don't get what the underlying design is of which you speak - what of
> it is undocumented?

You don't even write what your code is supposed to do.  _Nothing_ is
documented.

Take <URL:http://codereview.appspot.com/6493073>.  In contrast to most
of your code, there even is a comment.  Hooray.  Too bad it is
completely unusable for figuring out the situation triggering the
problem you try tackling.

How is _anybody_ supposed to maintain this kind of code when you are no
longer around?  How are even _you_ supposed to maintain this kind of
code in a year from now?  Or if you suffer a light stroke?

It is not like you are alone with that kind of attitude.  It is by far
the most prevalent cause of projects going down along with their
programmers.  Good code does not depend on the continued well-being of
its programmer.

LilyPond is too large by now to sacrifice it to the egos of its current
developers.  We can't afford that attitude.

We need to get into a state where LilyPond code, freshly submitted or
historic, is reviewable in any meaning of the word that makes sense.

At the current point of time, we are on a fast road to
unmaintainability, and we have gotten rather far already.  Graham's
answer to that is to make it easier to get patches approved since rather
few people have the confidence of attempting to work with the mess
LilyPond is.

We can't go down that road forever.

> Rarely are tricky things like "set_property" used and when they are,
> it is usually clear how and why (for example, the set_property calls
> in Stem::set_stem_positions are used to set stem positions).

Sure.  But why do you set the positions?  What cases are you dealing
with?  What changes affect the pure calculations, which ones the unpure
ones?  What assumptions do you make that are provided by different code?
All this is necessary to figure out how to make working changes.

> As I have relative little coding experience, I have no clue what is or
> isn't good commenting practice.  I would be fine putting a comment
> above Stem::calc_control_points saying "This function calculates the
> control points for a stem" but I think all the information you need is
> in the function name and in the function's logic.

Good comments don't repeat what the code already says.  They tell the
story _behind_ the code.  _Why_ the code does something.  _How_ it fits
together with other stuff.  _What_ the underlying problem is that the
code tackles.  If you find somebody on your premises building a wall and
ask him what he is supposed to be doing, you don't want him to explain
how he mixes mortar, and how to put one brick on the next brick, and how
to interleave bricks.

> What did take me time to learn is how everything fits together (what a
> callback is, where things are triggered when, what happens before
> what).  Some sorta automatically generated flow chart may help this -
> I'm sure there are tools for that.

Get off it.  There are no tools for reflecting the high-level design of
a program, unless that design is formalized and tools _particular_ to
that design are written.  We do that for grob properties and interfaces
and a few other things.  But the rest of the story is in comments, if
anywhere.

> I've read through libraries that put paragraphs of comments above
> single lines.  I spend so much time reading the prose that I can't
> focus on the logic.  If variable names were called x0f00___EE7 in
> LilyPond, I wouldn't be able to grock the logic, but I feel stuff is
> pretty clear.

Yes, to the degree where you ridicule people telling you otherwise
<URL:http://codereview.appspot.com/6498077/diff/21/lily/phrasing-slur-engraver.cc#newcode119>
until proven wrong. <URL:http://codereview.appspot.com/6498077#msg7>.

And if you then reply with something like

    The time hike seems awfully expensive - there must be a way to
    optimize it.  I'll post something that works when I have it and then
    we can figure out how to optimize it.

it gives me the creeps.  You still think you are the only person in the
world who should ever working in that area, so you still are not worried
about actually documenting what you want to be doing.  Instead we'll get
the next undocumented iteration which will at some point of time pass
our review and submission processes because people just give up on
trying to understand what you are doing.

>> It is also a timebomb for maintenance since changes might violate
>> underlying assumptions.
>
> The regtests catch this, no?

No.  Regtests are not designed for figuring out the weaknesses of your
code.  Humans are, if you give them something to work with.

Our regtests are nice for pointing out huge blunders affecting unrelated
areas.  But the _targeted_ regtests don't _construct_ corner cases like
human reviewers can.  They test what the creator of them did think of at
the time of creation.

>> And that is just talking about code that actually works for good
>> reasons.  Quite a bit of code works since it has been prodded into
>> not failing under those circumstances that tend to be tested.
>
> I agree about the faulty code thing but not about the prodded thing -
> I think once the right test case comes up one can almost always find
> what is failing and why.

Reality check: take a look at our open issues.

>> It is easy to make it easier to meddle with LilyPond code.  The low
>> number of contributors is not due to our toolchains.  It is because
>> few people are comfortable poking around in the dark.  And for good
>> reason.
>
> I agree that LilyPond is gargantuan, but I think the main impediment
> to understanding how things work is a global lack of documentation
> about how things fit together.

And you are improving this just how with your "not even the commit
message tells what I am doing here, let alone any comment" kind of
patches?

> This could be put in the CG.  If we can decide that certain basic
> elements of LilyPond's design will be frozen, this is worth doing.

Horse radish.  What is worth implementing, is worth documenting.

> I am completely on the same page with you about wanting to make
> LilyPond easier to contribute to.  Statements like "There is a lot of
> code that X" or "Quite a bit of code Y" may be true, but it is very
> tough to tackle the problem w/o concrete examples.

What's wrong with pointing out the complete absence of any commented
rationale in your submissions?

When I point out where particular code passages of yours are missing a
rationale, how about putting it _in_ the next iteration instead of doing
something else and giving me the "silly old fool" treatment?

Yes, I _am_ a silly old fool, but I have written a lot of code in my
lifetime.  Yes, this includes complex stuff written in assembly language
where the only comments were accumulated cycle counts (in order to keep
game timing, partly even including sound frequencies, accurate and
deterministic without relying on timer circuits).  And some of this
stuff remained astonishingly clear and well-designed when looking at it
decades after writing it.  But nobody except myself is going to maintain
it, and what you don't see are wastepaper baskets full of drafts and
diagrams.  They are inherent in the code, and that _does_ help in
understanding the code because it is _not_ just written from scratch,
but an actual implementation of a long-planned coherent design.

And the source code had to be stored on magnetic tape at 2400 bps, and
it had to fit in something like 50kB at the most.

We don't have those kind of restrictions any more.  The design, or at
least its constraints and how and why the current code meets them,
belong in the source code.  Not implicitly, but explicitly.  Code is
written once, and in any healthy project, read multiple times.  So it is
the task of the writer to make efficient use of the readers' resources.
When reading is faster than figuring things out, and this is pretty much
_always_ the case where the relations with other code pieces and the
choice of particular non-trivial data structures are concerned, a
comment saves even a single reader more time than it costs the writer.

You probably know the joke about the experimental physicist approaching
the university dean for more money.  And the dean says something like
"Money, money, money.  Why can't you take a leaf from the
mathematicians?  We just provide them with pencil, paper, and a
wastebasket, and that's all they need.  And the philosophers don't even
need a wastebasket..."

Regarding the expensive storage area for comments, some of the old code
might have been written by mathematicians.  Which is bad.  What is worse
if you figure "oh, this makes all sense, so I'll write without comments
as well and make equal sense" because you never got to see the
wastebasket.  Han-Wen and Jan explained the code to each other until it
made sense.  Too bad they did not write the explanations down.

By writing comments, you can at least explain the code to yourself.  If
you find that hampers your flow significantly, it is likely that you are
not writing code easily explained to anybody.  Including the computer.

I may be a silly old fool, but I've grown old enough to gain the
arrogance to consider code for which I don't understand its purpose
after reasonable consideration is bad, because I am far enough to the
right of the bell curve on understanding computers and programs that
code that is beyond me is for all reasonable purposes beyond meaningful
maintenance, possibly even by its own author but most certainly by an
average programmer.

-- 
David Kastrup



reply via email to

[Prev in Thread] Current Thread [Next in Thread]