bug#24615: [PATCH] sed: handle the patterns which consist of ^ or $ manually
From: Norihiro Tanaka
Subject: bug#24615: [PATCH] sed: handle the patterns which consist of ^ or $ manually
Date: Fri, 07 Oct 2016 00:06:23 +0900
Hi Assaf,
Thanks for reviewing.
On Wed, 5 Oct 2016 00:46:44 -0400
Assaf Gordon <address@hidden> wrote:
>
> 1.
> In the patch, I'd recommend using the global/extern variable
> 'buffer_delimiter' instead of hard-coded '\n' - to seamlessly handle "sed -z"
> for NUL-terminated lines.
> 2.
> While trying your patch, I think I uncovered a sed bug (not in your code):
> It seems 's///m' do not work with "-z".
>
> Compare, correct behavior, anchors match before/after every newline (in a
> pattern with multiple newlines):
>
> $ printf "a\nb\nc\n" | sed 'N;N;s/^/X/mg;s/$/Y/mg'
> XaY
> XbY
> XcY
>
> versus failure to detect NUL as line terminators:
>
> $ printf "a\0b\0c\0" | sed -z 'N;N;s/^/X/mg;s/$/Y/mg' | od -An -a
> X a nul b nul c Y nul
>
> Again, this is not a bug in your code, it was in sed before
> (even before the new DFA regex engine).
> I haven't pinpointed yet where does it originate from.
I have looked at your first point, and I can also confirm your second.
REG_NEWLINE can only match '\n'; it cannot match '\0'. The bug is also
present in sed 4.2.2 and earlier.
To fix it, we will have to process the input line by line when running sed -z.
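As a rough illustration of the idea only (this is not code from the attached
patches), the sketch below finds every position where '^' should match in
multi-line mode when the line terminator is a parameter rather than a
hard-coded '\n'. The function name line_starts and its signature are
hypothetical; in sed itself the delimiter would come from buffer_delimiter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patches: store in OUT the offset
   of every position in BUF[0..LEN) where "^" would match in multi-line
   mode, treating DELIM as the line terminator.  Returns the number of
   positions found; at most MAX of them are written to OUT.  */
static size_t
line_starts (const char *buf, size_t len, char delim,
             size_t *out, size_t max)
{
  size_t n = 0;

  assert (buf != NULL || len == 0);

  /* The beginning of the buffer is always a line start.  */
  if (n < max)
    out[n] = 0;
  n++;

  /* Every delimiter is followed by another line start.  */
  for (size_t i = 0; i < len; i++)
    if (buf[i] == delim)
      {
        if (n < max)
          out[n] = i + 1;
        n++;
      }

  return n;
}
```

Called on the five bytes a, NUL, b, NUL, c with '\0' as the delimiter, this
reports offsets 0, 2 and 4. With '\n' as the delimiter it reports only
offset 0, which corresponds to the single X in the od output quoted above.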
> 3.
> From cursory testing, I suspect the following code causes infinite loop with
> your patch:
>
> printf "a\nb\nc\n" | ./sed/sed 'N;N;s/^/X/mg;s/$/Y/mg'
>
>
> As the patch has few nested conditionals in a critical code path,
> I think some tests would be beneficial to ensure full coverage.
> I'll try to write them up in the coming days.
Sorry, that is a bug in the patch.
I am attaching patches that fix these issues.
Thanks,
Norihiro
0001-sed-handle-the-patterns-which-consist-of-or-manually.patch
Description: Text document
0002-sed-fix-matching-with-multi-line-option.patch
Description: Text document
0003-sed-fix-dfa-caller-in-sed-z.patch
Description: Text document