guile-devel

Re: Fix 'dirname' and 'basename' on MS-Windows


From: Nelson H. F. Beebe
Subject: Re: Fix 'dirname' and 'basename' on MS-Windows
Date: Wed, 9 Jul 2014 09:16:35 -0600 (MDT)

Eli Zaretskii <address@hidden> comments on misbehavior (or unexpected
behavior) of guile's (basename ...) function:

>    (basename ".foo" ".foo")  => "."
>    (basename "_foo" "_foo")  => "."
>
> Also, isn't the following result wrong as well?
>
>    (basename "/")  => "/"

According to built-in documentation:

        guile> (help basename)
        `basename' is a primitive procedure in the (guile) module.

         -- Scheme Procedure: basename filename [suffix]
             Return the base name of the file name FILENAME. The base name is
             the file name without any directory components.  If SUFFIX is
             provided, and is equal to the end of BASENAME, it is removed also.

So, let us see what these produce:

        guile> (basename ".foo" ".foo")
        "."

        guile> (basename "_foo" "_foo")
        "."

The documentation clearly indicates that a matching suffix is
removed, in which case the result should be an empty string.  The
function therefore does not follow its documentation, and one or the
other is wrong.

However, the Unix (and POSIX) basename and dirname commands have been
around since at least 1979 (I found them in my Unix 7th edition
manuals from that year), and I think it would be wise to follow the
POSIX standard for their implementation:

        % basename /tmp/x/y/z/foo.bar
        foo.bar

        % basename /tmp/x/y/z/foo.bar .bar
        foo

        % basename /tmp/x/y/z/foo.bar bar
        foo.

        % basename foo.bar .bar
        foo

        % basename .bar .bar
        .bar

The possibly-surprising behaviour of that last example is due to the
wording in POSIX (IEEE Std 1003.1-2001):

>> ...
>> 6. If the suffix operand is present, is not identical to the
>>    characters remaining in string, and is identical to a suffix of the
>>    characters remaining in string, the suffix suffix shall be removed
>>    from string. Otherwise, string is not modified by this step. It
>>    shall not be considered an error if suffix is not found in string.
>> ...

The phrase `is not identical to the characters remaining in string'
means that ".bar" is the result, rather than "".
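
As a rough illustration of how that rule plays out, here is a minimal
Python sketch of the string steps of the POSIX basename command.  The
function name posix_basename is hypothetical, the empty-string result
is an arbitrary pick (POSIX leaves that case unspecified), and "//" is
treated like any other all-slash string, which POSIX permits but does
not require.  It is not guile's implementation.

```python
def posix_basename(string, suffix=None):
    """Sketch of the POSIX basename utility's string processing."""
    if string == "":
        # POSIX leaves this unspecified ("." or ""); "" is an assumption here.
        return ""
    if string.strip("/") == "":
        # Only slashes: the result is a single slash and the suffix is ignored.
        return "/"
    # Drop trailing slashes, then everything up to and including the last slash.
    string = string.rstrip("/")
    string = string.rsplit("/", 1)[-1]
    # Step 6: remove the suffix only if it is a proper suffix of what remains,
    # that is, it matches the end of the string but is not the whole string.
    if suffix and suffix != string and string.endswith(suffix):
        string = string[: -len(suffix)]
    return string


print(posix_basename("/tmp/x/y/z/foo.bar", ".bar"))  # foo
print(posix_basename(".bar", ".bar"))                # .bar
```

Under this sketch the five shell examples above come out the same, and
a string consisting only of slashes yields "/".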

Also notice that POSIX defines a basename() library function, but it
takes only one argument, and thus does not have the same behavior as
the basename command when the latter has two arguments.  Because guile
offers a choice of 1 or 2 arguments, its basename function was
presumably modeled on the POSIX command, rather than the POSIX library
function.

Also, in guile documentation, would it not be better to replace "file
name", "base name", FILENAME, and BASENAME with the standard POSIX
terminology "pathname" and "filename"?

        /tmp/x/y/z/foo.bar      # a pathname
        /tmp/x/y/z              # the path to (or directory of) that pathname
        foo.bar                 # the filename of that pathname
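
For comparison, the same three-way naming can be seen with Python's
standard posixpath module.  This is only an illustration of the
terminology; the variable names are mine, and posixpath.split() is a
plain split at the last slash, not an implementation of the POSIX
basename command.

```python
import posixpath

pathname = "/tmp/x/y/z/foo.bar"

# split() cuts at the last slash: everything before it is the directory
# part, everything after it is the final filename.
directory, filename = posixpath.split(pathname)

print(directory)  # /tmp/x/y/z
print(filename)   # foo.bar
```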

POSIX says this about those names:
         
>> ...
>> 3.2     Absolute Pathname
>> 
>>         A pathname beginning with a single or more than two
>>         slashes; see also Section 3.266
>> ...
        
>> ...
>> 3.40    Basename
>> 
>>         The final, or only, filename in a pathname.
>> ...

>> ...
>> 3.169      Filename
>> 
>>         A name consisting of 1 to {NAME_MAX} bytes used to name a
>>         file. The characters composing the name may be selected
>>         from the set of all character values excluding the slash
>>         character and the null byte. The filenames dot and dot-dot
>>         have special meaning. A filename is sometimes referred to
>>         as a ``pathname component''.
>> ...
>>

>> ...
>> 3.266    Pathname
>> 
>>          A character string that is used to identify a file. In the
>>          context of IEEE Std 1003.1-2001, a pathname consists of, at
>>          most, {PATH_MAX} bytes, including the terminating null
>>          byte. It has an optional beginning slash, followed by zero or
>>          more filenames separated by slashes. A pathname may
>>          optionally contain one or more trailing slashes. Multiple
>>          successive slashes are considered to be the same as one
>>          slash.
>> ...

>> ...
>> 3.319    Relative Pathname
>> 
>>          A pathname not beginning with a slash.
>> ...
>>         

>> ...
>> 4.11      Pathname Resolution
>> 
>> ... long complex text omitted ...
>> 
>> A pathname consisting of a single slash shall resolve to the root
>> directory of the process. A null pathname shall not be successfully
>> resolved. A pathname that begins with two successive slashes may be
>> interpreted in an implementation-defined manner, although more than
>> two leading slashes shall be treated as a single slash.
>> ...
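
The leading-slash rules quoted above (3.2, 3.319, and 4.11) can be
sketched in a few lines of Python.  The function name
leading_slash_class and its return labels are hypothetical, chosen
only for this illustration.  Python's posixpath.normpath() is shown
alongside because it keeps exactly two leading slashes and collapses
three or more to one.

```python
import posixpath


def leading_slash_class(pathname):
    """Classify a pathname by its count of leading slashes."""
    if pathname == "":
        # A null pathname shall not be successfully resolved.
        return "null"
    leading = len(pathname) - len(pathname.lstrip("/"))
    if leading == 0:
        return "relative"
    if leading == 2:
        # Exactly two leading slashes may be interpreted in an
        # implementation-defined manner.
        return "implementation-defined"
    # One slash, or more than two (treated as a single slash).
    return "absolute"


print(leading_slash_class("/tmp/x"))      # absolute
print(leading_slash_class("//tmp/x"))     # implementation-defined
print(leading_slash_class("///tmp/x"))    # absolute
print(posixpath.normpath("///tmp//x"))    # /tmp/x
print(posixpath.normpath("//tmp/x"))      # //tmp/x
```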

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: address@hidden  -
- 155 S 1400 E RM 233                       address@hidden  address@hidden -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------


