coreutils

From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
Subject: RE: some concern about the fix of " tail: consistently output all data for truncated files"
Date: Wed, 9 Nov 2016 07:18:39 +0000

Hi,

Let's not mix two problems here.

1. The GlusterFS problem => we'll continue the investigation.

2. The tail problem. Let's discuss it separately from the GlusterFS bug, purely
on the merits of its own design.
        New version: when the file size is found to have shrunk, print the
content from offset 0 up to the reduced size.
        Old version: when the file size is found to have shrunk, stay at the
end of the reduced file and wait for new content.
Both approaches have their limitations; neither is perfect or precise.
My point is simply that, as I understand it, the old behaviour is better than
the new one.
According to the man page, the '-f' option is designed for files that are being
appended to, not for files that may be truncated.
"tail" should focus on what is newly added, not on data from the part of the
file that was already printed.
=============================
# man tail
TAIL(1)                          User Commands                         TAIL(1)


NAME
       tail - output the last part of files
...
       -f, --follow[={name|descriptor}]
              output appended data as the file grows;
...
=============================
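
To make the two behaviours in point 2 concrete, here is a minimal sketch in
Python. It is not tail's actual implementation: poll_once, demo and the policy
names "reprint" and "skip" are hypothetical names chosen for this illustration,
and unlike tail the sketch starts reading at offset 0 rather than at the last
few lines.

```python
import os
import tempfile


def poll_once(path, offset, policy):
    """One polling step of a simplified follower.

    Returns (data, new_offset). `policy` selects what happens when the
    file is found to be shorter than the offset already consumed:
      "reprint": restart from offset 0 (the "new version" above)
      "skip":    jump to the new end and wait (the "old version" above)
    """
    size = os.stat(path).st_size
    if size < offset:
        if policy == "reprint":
            offset = 0
        elif policy == "skip":
            offset = size
        else:
            raise ValueError(policy)
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(size - offset)
    return data, offset + len(data)


def demo(policy):
    """Write three lines, shrink the file to one line, then append a line."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "log")
        with open(path, "wb") as f:
            f.write(b"line1\nline2\nline3\n")
        out = []
        data, off = poll_once(path, 0, policy)
        out.append(data)
        os.truncate(path, 6)  # keep only b"line1\n"
        data, off = poll_once(path, off, policy)
        out.append(data)
        with open(path, "ab") as f:
            f.write(b"line4\n")
        data, off = poll_once(path, off, policy)
        out.append(data)
        return out
```

With "reprint" the second poll returns b"line1\n" a second time; with "skip" it
returns nothing. Both policies return b"line4\n" on the third poll.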

Br, Jimmy

-----Original Message-----
From: Zizka, Jan (Nokia - CZ/Prague) 
Sent: Wednesday, November 09, 2016 3:08 PM
To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>; Lian, George (Nokia 
- CN/Hangzhou) <address@hidden>; Pádraig Brady <address@hidden>; address@hidden
Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui (Nokia - 
CN/Hangzhou) <address@hidden>
Subject: RE: some concern about the fix of " tail: consistently output all data 
for truncated files"

> -----Original Message-----
> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
> Sent: Wednesday, November 09, 2016 6:36 AM
> To: Lian, George (Nokia - CN/Hangzhou) <address@hidden>; Pádraig
> Brady <address@hidden>; address@hidden
> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
> (Nokia - CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia -
> CN/Hangzhou) <address@hidden>
> Subject: RE: some concern about the fix of " tail: consistently output all
> data for truncated files"
> 
> Hi,
> 
> I wonder about the original requirement of "tail": what is the purpose of
> this tool?
> Referring to:
>       tail - output the last part of files
> 
> When "tail" finds that a file's length has become smaller, does it really
> need to print old content?

But tail cannot know whether that is old content. The truncate detection was
added to handle the case where someone overwrites the file being tailed, in
which case it should indeed start dumping the file from the beginning.

> My opinion is that ignoring the old content is the better alternative.

OK, but how would you do that, given that tail doesn't know it is old content?

> 
> It is possible that the "old content" was newly written (e.g. truncate to 0,
> then write a small amount of content).
> It is also possible that the "old content" is really old (e.g. truncate to a
> smaller size).
> 
> So "tail" cannot have a perfect design here that traces every piece of data
> written to the file.
> But it should focus only on the data at the current end of the file.
> 
> So my opinion is that reverting to the previous design is a better choice
> than the current one.
> What do you think?

If the change is reverted, you will get regressions in the cases for which it
was added, so that is definitely not an option.

What should be fixed is GlusterFS instead of trying to make workarounds for its
misbehaviour. As Pádraig also noted:

> This stale st_size behavior, giving a smaller value _after_ a read,
> seems quite problematic to lots of apps though, not just tail(1).

This will affect other applications and tools, not only tail. If you add some
kind of workaround to tail for this and GlusterFS is not fixed, the problem
will stay hidden and will hit some other application sooner or later.
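
The sequence under discussion (read to end of file, then look at st_size) can
be sketched as follows. This is only an illustration in Python, not code from
tail; read_then_stat, looks_truncated and demo are hypothetical helper names.
On a coherent local file system with no concurrent truncation, the reported
size is not expected to be smaller than the offset just reached by reading,
whereas this thread reports GlusterFS returning a smaller value.

```python
import os
import tempfile


def read_then_stat(fd):
    """Read from the current position to EOF, then query the size.

    Returns (offset, st_size), where offset is the file position reached
    by reading and st_size is what fstat reports afterwards.
    """
    while os.read(fd, 65536):
        pass
    offset = os.lseek(fd, 0, os.SEEK_CUR)
    size = os.fstat(fd).st_size
    return offset, size


def looks_truncated(offset, size):
    """A reader that trusts st_size would treat size < offset as truncation."""
    return size < offset


def demo():
    """Run the read-then-stat sequence on a 1000-byte temporary file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "syslog")
        with open(path, "wb") as f:
            f.write(b"x" * 1000)
        fd = os.open(path, os.O_RDONLY)
        try:
            return read_then_stat(fd)
        finally:
            os.close(fd)
```

Any application that compares its read offset with st_size in this way would
misinterpret a stale, smaller st_size as a truncation, which is why the
problem is not specific to tail.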

Jan


> 
> 
> Br, Jimmy
> 
> -----Original Message-----
> From: Lian, George (Nokia - CN/Hangzhou)
> Sent: Wednesday, November 09, 2016 9:36 AM
> To: Pádraig Brady <address@hidden>; address@hidden
> Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
> Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan (Nokia -
> CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia - CN/Hangzhou)
> <address@hidden>
> Subject: RE: some concern about the fix of " tail: consistently output all
> data for truncated files"
> 
> Hi,
> >What network file system type is this?
> 
> The file system is GlusterFS from Red Hat.
> 
> >This stale st_size behavior, giving a smaller value _after_ a read, seems
> quite problematic to lots of apps though, not just tail(1).
> I agree, but I still suppose that most applications get st_size first and
> then seek and read, which will not go past the size of the file.
> 
> We have also submitted the issue to the GlusterFS community, but so far they
> have not been able to find the root cause in GlusterFS.
> 
> I still have a complaint about the "tail" application: even if there is an
> issue in GlusterFS, "tail" eats all the space on the disk (through repeated
> pseudo-truncation of a large syslog file), so I suggest that "tail" be
> changed to prevent this.
> 
> Thanks & Best Regards,
> George
> 
> -----Original Message-----
> From: Pádraig Brady [mailto:address@hidden]
> Sent: Tuesday, November 08, 2016 7:29 PM
> To: Lian, George (Nokia - CN/Hangzhou) <address@hidden>;
> address@hidden
> Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
> Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan (Nokia -
> CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia - CN/Hangzhou)
> <address@hidden>
> Subject: Re: some concern about the fix of " tail: consistently output all
> data for truncated files"
> 
> On 08/11/16 02:50, Lian, George (Nokia - CN/Hangzhou) wrote:
> > Hi,
> >>> One more suggestion: if we do not have a perfect solution that covers
> >>> all truncation cases, could we add an option to tail, such as
> >>> tail -no-truncate?
> >>> If tail is run with this option, then the application would not consider
> >>> any truncation case.
> >>>
> >>> For example, I suppose the syslog output file will never be truncated
> >>> in our environment, so tail could use the option to avoid the
> >>> mis-truncated case.
> >
> >> Note for case 2) above, we only update fspec->size _after_ the read,
> >> so I'm not sure how practical the race with reading a _smaller_ st_size
> >> after that is?
> >> I.E. the heuristic is fairly good I think,
> >> so an option may be overkill.
> >> We'd have to see a demonstratable issue to consider such an option.
> >
> > We now have an issue when tailing a syslog file stored on a network-based
> > file system. An automated case needs to tail the syslog for about one hour
> > to capture the syslog of that period.
> > During that hour, the unexpected file truncation issue happened 6 times,
> > so the output of tail contained the full syslog file 6 times; the output
> > file became so huge that it ate all of the disk space.
> > The network-based file system may not be easy to change to suit the
> > current implementation of the "tail" application.
> > So I need your help :)
> >
> > And what do you mean by demonstratable? The issue we encountered is easy
> > to reproduce. Maybe the file system is not as strict as the ext4 file
> > system, but I still suggest that the "tail" application be changed to
> > accommodate this kind of network-based file system.
> 
> It's important info that you have seen the issue.
> What network file system type is this?
> We might just revert this change if the issue is widespread enough.
> 
> This stale st_size behavior, giving a smaller value _after_ a read,
> seems quite problematic to lots of apps though, not just tail(1).
> 
> thanks,
> Pádraig.


