GNU Summer of Code application


From: Walter Mundt
Subject: GNU Summer of Code application
Date: Wed, 03 May 2006 05:11:44 -0400
User-agent: Thunderbird 1.5.0.2 (Windows/20060308)

Earlier I submitted a SoC application to work on the suggested findutils improvements.

Leslie Polzer (who I presume is reviewing many of the GNU apps since she doesn't appear in the bug-findutils archives) submitted a question on the app:
> What are your plans when you have finished this project?
> Will you take a maintenance role and/or developer role?

How much of a role I take afterwards will depend largely on how much I enjoy working with the findutils code over the summer. However, I'm certainly willing to commit to maintaining the new updatedb and database-reading code for at least 6-8 months after the SoC, or until any issues seem to be ironed out, whichever is longer.

I've decided to put the app-as-submitted up on the mailing list for discussion/critique, so that I can make improvements before I resubmit with that answer (unless I get some feedback indicating that this won't be necessary).

Here is the application as I submitted it, except for some line-wrapping fixes:

Project Name: GNU findutils - slocate compatibility and other enhancements

Summary
  Enhance locate
    1. Enhance locate to understand the database format used by slocate.
    2. Implement a replacement for the current updatedb shell script
       which does pretty much the same thing but is less ugly. Don't
       introduce a dependency on anything not in the base system
       install (i.e. /bin/sh and C are OK, but Perl probably isn't).
    3. Add updatedb functionality to traverse the filesystem as root,
       preserving enough permissions information to allow us to
       provide the same functionality as slocate. Use the same
       database format as slocate unless there is a reason not to.
  Enhance find
    1. Add tests which allow [acm]time to be compared against a specified
       timestamp, as opposed to the timestamp of a file (-newer) or an
       age (-mtime). Add relevant tests to the test suite and document
       the changes.
    2. Instrument find to allow us to improve the guesses that parser.c
       makes for struct predicate.est_success_rate. Measure the (lack
       of) performance increase in find 4.3.x with optimisation turned
       on.
  Enhance xargs
    1. Implement an optional feature in which xargs figures out how long
       a command line it can pass to exec() without necessarily believing
       ARG_MAX (because for example with the Linux kernel this can be an
       underestimate).
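
To make the xargs idea concrete, here is a rough sketch of how the limit
could be probed instead of trusted. It is in Python purely for brevity (the
real patch would be C inside xargs), and the names env_size, exec_accepts
and largest_accepted are hypothetical, not existing findutils functions. It
assumes a "true" program is on the PATH and that exec() fails with E2BIG
once the argument area is too large.

```python
import errno
import os
import subprocess


def env_size(environ=None):
    """Bytes the environment takes up in the exec() argument area."""
    environ = os.environ if environ is None else environ
    # Each entry is stored as "NAME=value" plus a terminating NUL byte.
    return sum(len(os.fsencode(k)) + len(os.fsencode(v)) + 2
               for k, v in environ.items())


def exec_accepts(nbytes, chunk=1024):
    """Return True if exec() accepts roughly nbytes of arguments.

    Many small arguments are used because some kernels also cap the
    length of a single argument string.
    """
    args = ["x" * (chunk - 1)] * (nbytes // chunk)
    try:
        subprocess.run(["true", *args], check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return True
    except OSError as exc:
        if exc.errno == errno.E2BIG:
            return False
        raise


def largest_accepted(accepts, low, high):
    """Binary-search the largest n in [low, high] for which accepts(n) holds.

    Assumes accepts(low) is true and that accepts is monotonic.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if accepts(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

Calling largest_accepted(exec_accepts, 4096, 8 * 1024 * 1024) and comparing
the result with os.sysconf("SC_ARG_MAX") would show whether the advertised
value is an underestimate on a given system; the 8 MiB ceiling is an
arbitrary choice for the sketch. A real implementation would also subtract
env_size() and leave some headroom.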

Benefits to the Community

Each of these enhancements has its own benefits:
slocate compatibility: slocate compatibility will reduce user
  confusion and add important new capabilities to a tool that
  is installed by default on a vast number of Linux distributions.
updatedb replacement: a new updatedb will be easier to maintain,
  and may be faster as well.  Adding new capabilities or locate
  database enhancements will be less of a chore once the tool
  that produces the database is better-designed.
xargs enhancement: This appears to be primarily a performance
  improvement for large-scale xargs usage.  It is also the easiest
  enhancement to implement, so a smaller benefit is acceptable.

Deliverables
  - NOTE: all patches are to include updates to all relevant
    documentation.
  - Patch to xargs to add optional automated ARG_MAX recalculation.
  - Patch for find to add new options for checks of [acm]time vs.
    a particular time/date.  Names and syntax to be discussed with
    project mentor(s).
  - Patch to find to add est_success_rate instrumentation/improvements.
  - New updatedb, either a C program or a clean shell script.  The new
    version will be capable of generating both current locate and
    slocate-style databases.
  - Patch for locate to add slocate compatibility and (in the presence
    of a slocate-style database) functionality.
  - Extra: if all of the above get done with time to spare, work on an
    additional patch to locate/updatedb to add ACL support to the
    security-checking mechanism.  Alternately/additionally, add an
    option to locate to attempt to stat database hits to check for
    “real” access if all the database-supported permission checks pass.
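
As an illustration of that last "stat the hits" option, a minimal sketch
(Python for brevity; accessible_hits is a hypothetical name, and the real
change would live in locate's C code) could look like this. It relies on
the fact that lstat() only succeeds when the calling user can search every
directory leading to the path and the entry still exists.

```python
import os


def accessible_hits(paths):
    """Yield only those database hits the calling user can really see."""
    for path in paths:
        try:
            # lstat() fails if the entry has vanished or if a parent
            # directory is not searchable by the current user.
            os.lstat(path)
        except OSError:
            continue
        yield path
```

The cost is one system call per hit, which is why I would make it an
option rather than the default.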

Plan

Start with the xargs patch, which should be relatively quick.  Discuss
and prototype the new find predicates next; if there are issues, work
on this in parallel with the next item.  Continue working with find,
on the est_success_rate improvements.  After that, compare the locate
and slocate updatedb implementations and decide exactly how to
approach writing the new updatedb.  Finally, write the new version of
updatedb and the supporting changes to locate in parallel.
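
For the find predicates, the semantics I have in mind can be sketched as
follows. This is only a Python illustration of the comparison, not a
proposed implementation; stamp_newer and its single-letter selector are
hypothetical, since the real names and syntax are still to be discussed.

```python
import os
from datetime import datetime, timezone

# Map the [acm] selector onto the matching stat field.
_FIELDS = {"a": "st_atime", "c": "st_ctime", "m": "st_mtime"}


def stamp_newer(path, which, when):
    """True if the chosen timestamp of path is strictly later than when.

    which is "a", "c" or "m"; when is a datetime (a naive value is
    interpreted as local time by datetime.timestamp()).
    """
    st = os.lstat(path)
    return getattr(st, _FIELDS[which]) > when.timestamp()
```

For example, stamp_newer(".", "m", datetime(2006, 5, 1, tzinfo=timezone.utc))
asks whether the current directory was modified after 1 May 2006 UTC, with
the same strictly-newer comparison that -newer applies against a reference
file.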

Qualifications

I find this project appealing because, as a regular user of all of
these tools, I can really see myself making use of the enhancements.
They're also in a domain of which I have a very good understanding.

I'm suited to work on this project because I have a thorough
understanding of C and shell scripting, as well as experience
in using these tools.  I'm also a competent generalist programmer:
I competed in this year's International Collegiate Programming
Contest finals.  Finally, I do have some experience with Free
Software: I worked on the TWiki collaboration tool.  (see twiki.org)




