coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: nl: -d option with a single character


From: KOBAYASHI Takashi
Subject: Re: nl: -d option with a single character
Date: Mon, 14 Dec 2020 14:37:09 -0800

Sorry for my rush job, thanks for your refactoring!
I agree with not breaking compat with POSIX.

However, with this change, the behavior of the -d blank character is
different than before.
I think the previous behavior was important to ensure that all input is
treated as a single page.
This case is not specifically mentioned in POSIX, so I propose it is kept
as the GNU extension.
The fix patch is modified.

Takashi

2020年12月14日(月) 8:53 Pádraig Brady <P@draigbrady.com>:

> On 13/12/2020 23:23, KOBAYASHI Takashi wrote:
> > Hello,
> >
> > I found a bug of -d in nl when a single character is specified. The
> patches
> > are also attached. I think there are two ways to fix this.
> >
> > The current behavior:
> > $ echo -e 'a\n@:@:\nb\n@@\nc' | nl -d '@'
> >       1 a
> >       2 @:@:
> >       3 b
> >       4 @@
> >       5 c
> >
> > POSIX expectations(Attach: nl-d-posix.patch):
> > $ echo -e 'a\n@:@:\nb\n@@\nc' | nl -d '@'
> >       1 a
> >
> >       1 b
> >       2 @@
> >       3 c
> >
> > POSIX(1003.1-2017)
> >> -d  delim
> >> Specify the delimiter characters that indicate the start of a logical
> > page section. These can be changed from the default characters "\:" to
> two
> > user-specified characters. If only one character is entered, the second
> > character shall remain the default character ':'.
> >
> > Also, Texinfo is written in the same behavior as POSIX.
> >> ‘-d CD’
> >> ‘--section-delimiter=CD’
> >>      Set the section delimiter characters to CD; default is ‘\:’.  If
> >>      only C is given, the second remains ‘:’.  (Remember to protect ‘\’
> >>      or other metacharacters from shell expansion with quotes or extra
> >>      backslashes.)
> >
> >
> > I don't like this complex POSIX specification because it doesn't allow us
> > to specify a single character delimiter.
> > Does anyone know why POSIX was specified the way it was? I would like to
> > know the background.
> > I believe that the section delimiter should be generated from the string
> > given to the option, regardless of the length of the characters. And by
> the
> > string given two characters, it is possible to make the second character
> > ":".
> >
> > The following is more clear(Attach: nl-d-incompatible.patch):
> > $ echo -e 'a\n@:@:\nb\n@@\nc' | nl -d '@'
> >       1 a
> >       2 @:@:
> >       3 b
> >
> >       1 c
> > $ echo -e 'a\n@:@:\nb\n@@\nc' | nl -d '@:'
> >       1 a
> >
> >       1 b
> >       2 @@
> >       3 c
>
> Nice find!
>
> I agree that supporting matching a single char would be preferable
> to the POSIX mandated behavior of assuming a second ':' character.
> However no-one has specifically asked for the single char support,
> and I don't think it's worth breaking compat with POSIX and other
> implementations. So we should not apply your second "incompatible"
> patch I think.
>
> BTW, we also have an undocumented behavior that strings longer
> than two characters can be provided.
> We should probably document that GNU extension.
>
> As for the implementation, this code is already a bit "copy and pastey",
> so I'd prefer not to copy the logic again.
>
> I've attached two patches to address this.
> The first changes nothing, but refactors things to simplify the fix.
> The second (in your name) applies the specific fix and adds a test.
>
> I'll apply these later if you're Ok with it.
>
> cheers,
> Pádraig
>

Attachment: nl-delim-fix2.patch
Description: Source code patch


reply via email to

[Prev in Thread] Current Thread [Next in Thread]