Re: [Gnu-arch-users] programming in the large (Re: On configs and huge s
From: Thomas Lord
Subject: Re: [Gnu-arch-users] programming in the large (Re: On configs and huge source trees)
Date: Tue, 18 Oct 2005 18:29:05 -0700
Alfred: you impressed me by digging fairly deep (e.g., coming up with
the unexec bogosity in package-framework) so I have a fairly long and
detailed reply for you.
And yes, dear trolls, this does remain Arch relevant.
Tom> Alfred argues that Autoconf plugs into the role of a configure
Tom> tool for "programming in the large" because it handles
Tom> sub-projects.
Tom> Eh.
Alfred> I guess you haven't ever used it for something big (tla is
Alfred> quite small in my opinion)
My first full-time job, back before I was fully into free software,
was writing a configure/build system and applying it to a reasonably
large system (the Andrew toolkit and the suite of applications built
on that).
Tom> Autoconf has climbed too far up the dependency stack (meaning it
Tom> relies on too much other software).
Alfred> Tom, you're smart, and I like you.
Thanks. You seem smart, too.
Alfred> But stop being silly, autoconf relies on less software than
Alfred> your package-framework (it relies on awk and printf in
Alfred> addition to the tools that autoconf relies on).
There are two phases in autoconf (and this is part of its problem).
One phase is translation of `configure.in' and the other phase is
execution of `../configure'.
The first phase of autoconf relies on GNU m4 and perl. Increasingly,
in practice, it relies on automake and libtool. While jocularly
dismissed in the autoconf documentation, the circular dependency
between m4 and autoconf is a clear bootstrapping bug.
The awk dependencies in package-framework are extremely slight --
easily done away with. I don't claim that package-framework is the
right code-base to start-with, only that it's a good demonstration of
what is achievable.
One thing package-framework, in combination with a portability library
like libhackerlab, demonstrates (to, let's say, a solid
working-prototype level) is that applications don't need the
two-phase hair of auto*. It's simpler, at least as effective,
and certainly more easily maintained to have applications above
the bootstrapping level depend on just a minimal GNU development
environment.
The portability gymnastics of that minimal development environment
are worth it, sure -- but auto* discourages leveraging the benefits
of going through that effort.
Add to that the number of packages that make poor or outright
incorrect use of auto* -- and the size of the documentation and
obscurity of the codebase for auto* -- and you should start to
wonder who is benefiting from its widespread use and why.
Tom> Autoconf, at least as commonly used, is lousy at dependency
Tom> discovery and awkward to control to override its defaults.
Alfred> I disagree, --with-FOO=/dir/to/foo is quite flexible. Far
Alfred> more flexible than hard coding crti.o as you have done for
Alfred> unexec on GNU/Linux platforms (did you know that the
Alfred> standard location for C run time init object files is
Alfred> actually /lib on GNU/Linux?)
Yeesh, you dig deep. I like that.
The code to which you refer is vestigial -- left over from an
experiment I did quite a few years ago to provide an emacs like
`unexec' for systas scheme.
Again, I don't claim that package-framework is polished and ready to
go, only that it's a good demonstration. So far as I know, it is safe
(and appropriate) to simply delete the `src/build-tools/configs'
subdir and the small amount of script code and Makefile scraps that
depend on it.
So, sure -- that code's a bug but one that's of no consequence.
Alfred> In fact, I think that normal users should simply use binary
Alfred> packages. If you are a developer and wish to hack on
Alfred> something, it is trivial to configure a program.
Actually, I strongly agree but with a qualification.
"Normal" users should be getting binaries, yes.
Those binaries should come from a competent supplier, preferably
geographically or at least logically close, and I personally have
in mind a ratio of engineers to users. I don't have my figures handy
but I recall working it out to something like 30 per 10,000.
That prices out, to consumers, like a fairly inexpensive premium cable
channel per personal computer, with money left over for R&D and,
across a nation or a globe, lots of paid labor left over for free
software R&D.
And I mean that 30:10,000 ratio to be specific and real. You've got
those 30 running a 10k-seat distribution business *and* a feedback
community. That gives lots of redundancy and grounds R&D in what
"normal" users are thinking and doing.
Tom> In other words, it does sorta ok at looking in "standard
Tom> locations" to find a dependency but that facility doesn't seem
Tom> to well-handle the case when you have sibling source components
Tom> in a tree being installed in a non-standard place.
Alfred> And package-framework does not fix any of that.
Weakly agree.
The REQS and OPTS dependency sorting stuff is part of a solution.
The emphasis on code-layout within a tree as a guide to simplified
construction is part of a solution. The unfinished
package-dependency stuff in there would round it out.
All of the above could deal with another pass now that experience
has been gained but there it is.
So, you're right, but still package-framework has good stuff to
say about the topic.
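
A minimal sketch of the kind of dependency sorting meant here, for illustration only. It is not package-framework's actual code or file format. It assumes each sub-package declares hard requirements (REQS) and optional ones (OPTS), that an optional dependency counts only when it is actually present in the tree, and that `build_order` and the sample package name `docs-tools` are hypothetical:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_order(reqs, opts, available):
    """Order sub-packages so that every dependency is built first.

    reqs: maps a package name to the packages it requires
    opts: maps a package name to packages it can use if they are present
    available: set of package names actually found in the source tree
    """
    graph = {}
    for pkg in sorted(available):
        # A missing hard requirement is an error.
        missing = [d for d in reqs.get(pkg, []) if d not in available]
        if missing:
            raise ValueError(f"{pkg} requires missing package(s): {missing}")
        # An optional dependency only constrains the order when it exists.
        present_opts = [d for d in opts.get(pkg, []) if d in available]
        graph[pkg] = set(reqs.get(pkg, [])) | set(present_opts)
    # static_order() yields dependencies before the packages that use them
    # and raises graphlib.CycleError if the declarations are circular.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(build_order(
        reqs={"tla": ["hackerlab"]},
        opts={"tla": ["docs-tools"]},
        available={"tla", "hackerlab"},
    ))
```

The point of the split is that an absent optional package silently drops out of the ordering, while an absent required package stops the build before anything is compiled.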
Alfred> The way you
Alfred> solve it with package-framework (from the looks, I only took
Alfred> a brief look at it right now so I might be completely off
Alfred> base) is that you include each library that is needed. Say
Alfred> you have this little GNOME program that needs some parts of
Alfred> GNOME, would you distribute the whole GNOME suite just so
Alfred> that you compile the program?
Well, yes partly. I'd certainly like to be able to instantiate such
an environment without standing on my head. It should suffice, for
that purpose, for the maintainer of the little GNOME project to
publish an Arch-type `config' file.
And, no -- I'd also want more standardized and better designed install
conventions so that I can mix a bunch of packages in one tree, give
one parameter, and have the `--with' stuff filled in for all the
sub-packages from that (so to speak -- the literal mechanism might
be different).
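
To make the "one parameter" idea concrete, here is a rough sketch under an invented convention: every sub-package in the tree installs under STAGE_ROOT/NAME, so a single root directory is enough to derive each package's `--prefix' and the `--with' flags for its dependencies. The function name `configure_args` and the layout are assumptions for illustration, not an existing tool's behaviour:

```python
import posixpath


def configure_args(stage_root, package, deps):
    """Derive configure arguments for one sub-package from a single root.

    Assumed (hypothetical) convention: each package is installed in
    stage_root/<package name>, so dependency locations are predictable.
    """
    args = [f"--prefix={posixpath.join(stage_root, package)}"]
    args += [f"--with-{dep}={posixpath.join(stage_root, dep)}" for dep in deps]
    return args


if __name__ == "__main__":
    # One parameter (/opt/stage) determines every path below.
    print(configure_args("/opt/stage", "tla", ["hackerlab"]))
    # ['--prefix=/opt/stage/tla', '--with-hackerlab=/opt/stage/hackerlab']
```

With a fixed layout like this, nobody has to type a `--with' option per dependency; the person building the tree supplies the root once.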
Alfred> Then there is the major deficiency of tla using static
Alfred> libraries for hackerlab. Assume that you have a dozen
Alfred> programs using hackerlab, and you find some security issue
Alfred> or what not in some function, you will end up recompiling
Alfred> everything. Simply out of the question when you have a few
Alfred> hundred programs.
I think dynamic libraries are overrated and widely abused but, yes,
they are also sometimes very valuable.
The package-framework demonstration *would* have support for them
if libtool authors had bothered to float their collected knowledge
of how they work on various platforms in some form other than
their source code. There's a few man-months project there to tease
that information out into a more useful form (ideally making
libtool itself more data-driven from that database of wisdom).
Tom> Autoconf has also become notoriously bloated, etc. It's never
Tom> quite stabilized, even after all these years, which should at
Tom> least make one suspicious.
Alfred> Once again, I ask you to stop being silly, autoconf has been
Alfred> stable since 2.50 when it got a huge overhaul. GCC is in
Alfred> more flux. It also has less bloat than tla, which
Alfred> implements its own C library just cause you happen to
Alfred> dislike libc for whatever silly reasons, while still needing
Alfred> to link against libc!
Between 2001 and 2004 I made various attempts to download packages in
source form to a FreeBSD system and build them. When packages had
lots of prereqs, config/build/install bugs in auto*-using packages
were most often the show-stopper.
Tom> One thing I wanted to show with package-framework and hackerlib
Tom> is that you can standardize a package-combining system and use
Tom> portability libraries and then you don't need autoconf's hair.
Alfred> Once again, you do not standardise something by inventing
Alfred> something new. I also fail to see what the exact hair in
Alfred> autoconf is, and I'm far too familiar with autoconf.
The two phase thing and its consequences. Lots of packages wind up
with "3rd party macros" that may work on Linux but sure didn't on
FreeBSD.
I admit that my experience is anecdotal and my conviction is based on
a priori consideration of the approach and actual code of auto*. It
could be refuted or made more rigorous by looking more carefully at
what labor goes into, for example, FreeBSD ports or Debian packages.
Tom> Alfred cites unoptimized strcmp as source of tla performance
Tom> issues.
Alfred> No I didn't, I said that it _might_be_ a source for some of
Alfred> tla's performance issues. If I had cited it as a source I'd
Alfred> provide hard numbers.
Sorry to have mischaracterized you.
My understanding of the numbers is seat-of-your-pants engineering
rather than a patient careful study.
It's definitely true that naked benchmarks of the `str_' functions
can't compete with good native `libc' semi-replacements. It's
consistently looked to me like this was never a big deal in tla
performance and the trade-offs (e.g., code so simple it serves as
documentation) have, so far, been more worth it than not.
Tom> by letting go of leadership on, for example GCC.
Alfred> The FSF never let go of the leadership on GCC, they still
Alfred> are, and always have been, the leadership. They just did
Alfred> some changes in how it was exactly managed (i.e. one person
Alfred> maintaining the whole thing and getting loads and loads of
Alfred> bad patches, sound familiar?).
The FSF doesn't have any serious leadership of GCC. It has some
loyalty over narrow issues. E.g.: GCC development involves paper
assignment forms. E.g.: GCC developers are not eager to see the
compilation phases split in certain ways (librifying parts of GCC)
since that would undermine the GPL (as opposed to LGPL) licensing
of it.
But, quick thought experiment: suppose RMS decided tomorrow that
it was desirable to float a simplified bootstrapping compiler or
conduct a particular major code cleanup in a systematic way. Suppose
we wanted 10% of the effort to go in that direction. What do you
think would happen?
Tom> I rely on plenty of tools that already exist and replace a
Tom> relatively small subset with tools that have some advantages.
Alfred> Instead of replacing (you're quite fond of that it seems)
Alfred> why not fix them and add more advantages instead of doing
Alfred> complete rewrites? It will save both you (no need to
Alfred> rewrite the whole thing), and others (no need to try and
Alfred> understand how your rewrite differs) time.
Details matter. Note that hackerlib, string functions
notwithstanding, doesn't actually (despite all claims to the
contrary) replace much of libc at all. The `vu_' subsystem,
with the sole exception of `printfmt', doesn't reimplement
squat of libc (and, indeed, relies on libc). Similarly for
many other subsystems. The `rx' subsystem does reimplement
some standard functionality but with radically different (and,
I think, better) performance characteristics and an expanded
API. Out of the bulk of `hackerlib' there are a few 10 string
functions that people complain about and out of *that* alone
people construct arguments such as yours.
Alfred> My major grief with hackerlab is the rewrite of standard C
Alfred> functions, strcmp, printf, ... There are in fact many nice
Alfred> things in hackerlab, but the majority is just a silly
Alfred> rewrite of the C library for no apparent reason. Seriously,
Alfred> I really cannot understand how you can justify a rewrite of
Alfred> something as silly as strlen! It just makes it a hell for
Alfred> anyone who knows C to figure out how exactly each new
Alfred> little function behaves.
Hardly the majority.
I took my necessity for replacing `printf' from two things: (1) not
wanting to have to depend on (the very fine but excessive for
this purpose) GMP; (2) certainly not wanting stdio-style buffering.
I totally swiped my better approach to buffering from Andrew Hume
and improved his approach by combining it with `vu's system-call
virtualization.
The `str_' functions have a more regular interface than libc.
There would certainly be no harm in porting, linking with, or
otherwise inheriting the work (where it overlaps) on
platform-optimized libc work-similars but it has never, in fact,
been worth the time.
-t