coreutils

From: Evan Rempel
Subject: Re: date enhancement - filter to preface each line of input with timestamp
Date: Mon, 2 Nov 2015 10:22:01 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

On 11/02/2015 09:42 AM, Pádraig Brady wrote:
On 02/11/15 16:47, Evan Rempel wrote:
On 11/02/2015 02:25 AM, Pádraig Brady wrote:
On 01/11/15 22:57, Evan Rempel wrote:
There is a lot of discussion on the net about prefacing a timestamp to
every line of a text stream. There are solutions with awk, perl, python
and bash. coreutils even provides one using perl. This seems to be a
very heavy tool to do something that date already does all of the hard
work for.

Does it make sense to add an option -p --prefix to make date prefix
every line of stdin with a date/time stamp?
It's useful functionality, though already available using other tools.
http://stackoverflow.com/q/21564/4421
The general rule we use is to not couple logic
unless it provides a functional benefit to do so.
Personally I like the awk solution presented above.

cheers,
Pádraig.
While I agree in principle with the general rule, I do not agree in this case.
All of the solutions I can find on the web, including the one you cite,
involve using a different programming language to write a program that solves
the problem. In some languages, such as awk, the program is very simple, but it
still requires invoking another interpreter to execute a program. By that
logic, there isn't any need for most of the coreutils programs: just use awk to
replace cut, or perl to replace tr.
The coreutils make this easier to use and easier to read/understand.

Also in this case, the coreutils package includes an application "ts" that is a 
perl script that addresses this exact problem. This implies that at some time in the 
past, coreutils accepted a solution to this problem.
Note `ts` is in the _moreutils_ package.

Sorry, I overlooked the change in packages.

I am proposing that this functionality be moved to the date program to make it
less dependent on other system tools. The usual use case for this functionality
is log file generation/piping, and this invocation may happen before these
other tools are available. It may also be on systems that don't have the
required tools, or at least where these other tools are not consistently
available. The coreutils are, as the name implies, core, and as such are more
consistently available than the other more general scripting tools (awk, perl,
python).
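
For reference, timestamp prefixing is already possible with nothing beyond a
POSIX shell and date itself, which is roughly the situation described above
where awk, perl, or python may be missing. The following is only a sketch: the
function name prefix_ts is made up for illustration, and it starts one date
process per input line, so it would likely be far slower than either command
discussed in this thread.

```shell
#!/bin/sh
# prefix_ts: hypothetical helper, not part of coreutils.
# Reads stdin line by line and prints each line behind a timestamp.
# Cost: one date(1) process per input line.
prefix_ts() {
    # IFS= and -r keep leading whitespace and backslashes intact;
    # the || test handles a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Example: stamp two lines of input.
printf 'first\nsecond\n' | prefix_ts
```

The per-line fork is the design tradeoff here: it avoids any interpreter other
than the shell, at the price of throughput.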

This feature would arguably make scripts more readable. I would prefer the
latter of these two:

<command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }'

<command> | date --prefix "+%Y-%m-%d %H:%M:%S"

To understand what the awk solution does, you have to understand awk.
Yes it's always a tradeoff. Thanks for detailing the options.
Note the fflush() awk bits are a separate part of the problem really,
and coreutils already gives control over that using stdbuf.
So the comparison really is:

<command> | stdbuf -oL awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0 }'

<command> | stdbuf -oL date --prefix '+%Y-%m-%d %H:%M:%S'

Quite close I think.

Just for efficiency's sake, timing these two solutions on a 1 million line
input:

awk
1.03user 1.06system 0:02.11elapsed 99%CPU
1.11user 1.04system 0:02.17elapsed 99%CPU
0.97user 1.17system 0:02.17elapsed 99%CPU

date --prefix
0.63user 0.81system 0:01.48elapsed 97%CPU
0.66user 0.80system 0:01.50elapsed 97%CPU
0.62user 0.84system 0:01.50elapsed 97%CPU

Almost 63% more user CPU time when using awk!
What buffering did you use with the date --prefix test?
How does it compare using: stdbuf -oL date --prefix
I suspect awk might be better than date(1) then
since it's tuned for I/O processing.

Adding the "stdbuf -oL" to both, the timing for awk does not change, but the
timing for date goes up. Obviously I was using buffering for output.

time stdbuf -oL awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0 }'
1.05user 1.10system 0:02.17elapsed 99%CPU (0avgtext+0avgdata 2740maxresident)k
1.01user 1.13system 0:02.16elapsed 99%CPU (0avgtext+0avgdata 2768maxresident)k
1.01user 1.13system 0:02.17elapsed 99%CPU (0avgtext+0avgdata 2768maxresident)k

time stdbuf -oL tmp/date --prefix "+%Y-%m-%d %H:%M:%S"
0.78user 1.08system 0:01.89elapsed 98%CPU (0avgtext+0avgdata 1932maxresident)k
0.76user 1.09system 0:01.89elapsed 98%CPU (0avgtext+0avgdata 1908maxresident)k
0.82user 1.05system 0:01.89elapsed 98%CPU (0avgtext+0avgdata 1932maxresident)k


time stdbuf -o32K awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0 }'
0.90user 0.84system 0:01.78elapsed 98%CPU (0avgtext+0avgdata 2792maxresident)k
0.85user 0.87system 0:05.43elapsed 31%CPU (0avgtext+0avgdata 2740maxresident)k
0.86user 0.87system 0:01.77elapsed 98%CPU (0avgtext+0avgdata 2764maxresident)k

Still slower with awk, even with buffering enabled.
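
The thread does not say how the 1 million line input was produced or exactly
how each run was wrapped, so the following is only a guess at a comparable
setup. It assumes seq as a stand-in input generator and sends the filter's
output to /dev/null; gen_input and run_filter are names invented for this
sketch.

```shell
#!/bin/sh
# gen_input N: emit N numbered lines. Assumption: the real test data
# is unknown, so seq is used as a stand-in.
gen_input() {
    seq 1 "$1"
}

# run_filter N CMD [ARGS...]: pipe N generated lines through CMD and
# discard the result, so that terminal output does not skew the timing.
run_filter() {
    n=$1
    shift
    gen_input "$n" | "$@" > /dev/null
}

# Example with a trivial filter; substitute one of the command lines
# from this thread for cat in order to time it.
run_filter 1000 cat
```

Running such a script under GNU time should give figures in the
user/system/elapsed style quoted above. The cost of seq is included in those
figures, so the generator should be kept identical across runs.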

I just wanted to get a feeling for how this feature would be received by the 
coreutils maintainers before I submitted the actual feature request and patch.
I'm still 60:40 against, given the above options.

If the perl dependency was an issue I'd be more inclined to reimplement ts(1)
in C, calling into the date libs and presenting some of the date(1) options,
because that provides additional functionality like relative timestamps.
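
As an illustration of what relative timestamps could mean here, the sketch
below prefixes each line with the time elapsed since the filter started, using
only the shell and date. This is one assumed interpretation, not a description
of how ts(1) behaves; prefix_elapsed is a made-up name, and the %s format is
assumed to be available (it is in GNU date).

```shell
#!/bin/sh
# prefix_elapsed: hypothetical helper, not part of coreutils or moreutils.
# Prints each input line behind HH:MM:SS elapsed since the function started.
prefix_elapsed() {
    start=$(date +%s)    # seconds since the epoch at start-up
    while IFS= read -r line || [ -n "$line" ]; do
        now=$(date +%s)
        elapsed=$((now - start))
        printf '%02d:%02d:%02d %s\n' \
            $((elapsed / 3600)) $((elapsed % 3600 / 60)) $((elapsed % 60)) \
            "$line"
    done
}

# Example: the second line arrives about two seconds after the first.
{ echo start; sleep 2; echo done; } | prefix_elapsed
```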

thanks,
Pádraig.

I would be happy with ts reimplemented in C. I just want to have a lightweight
date prefix tool that is easy to read on the command line when used inside a
bash script. I thought that since date already had options like --file, an
additional option would be fine. When I come across the need for adding date
stamps to lines, I always think of date first because it already prints dates.
After reading the man page for date, I find it cannot do this, then I do
something like:

man -k date | fgrep '(1)'

to look for another tool, and I don't find anything. ts only shows up if you
look for time :-(

Then I find that ts is a perl script, and I'm back to searching the net.

So I should stop developing the patch for date and wait for your new ts?

--
Evan Rempel



