coreutils

Re: date enhancement - filter to preface each line of input with timestamp


From: Evan Rempel
Subject: Re: date enhancement - filter to preface each line of input with timestamp
Date: Mon, 2 Nov 2015 12:52:32 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

The discussion is getting a little off track.

I don't think that there is any disagreement that there is a need for prefixing 
lines with some kind of date/time stamp. We are only discussing how it should 
be accomplished.

When administering a heterogeneous environment that spans platforms such as AIX, Solaris, HP-UX, Linux or others, or even different rollouts of a single platform, such as a fully provisioned server versus a lightweight HPC compute node, there is still a need to provide some level of consistency. The GNU utils have done a fabulous job over the years of normalizing these many platforms. Some of these environments have Python, others have Perl, and some have neither. The need to work with log streams that fail to place a timestamp on each line is present on all of them, but this is one task that has not been standardized without requiring an external interpreter. It is the presence, absence, or even version differences of that interpreter that prevent the current ts tool from being a realistic, normalized solution to this task.

Without trivializing the complexities of dealing with dates and times, the task 
of placing a timestamp onto each line is trivial. It was for this reason that I 
thought extending the date command would be a good fit: it already does 
the hard work of producing many formats of dates and times. I don't really care 
whether a new command is created or date is extended. I only want something that 
does not depend on some other programming language being present at runtime.

The task is easy to solve in nearly any language, but I cannot depend on awk, 
perl, python, tcl, java or any other interpreter being available in all of the 
environments where the problem of missing timestamps on log lines occurs. IMHO 
one of the greatest strengths of the core (or more) utils is that they provide 
this consistency and standardization when nothing else does.
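
For illustration only, here is a minimal sketch of the filter being discussed,
written with nothing but a POSIX shell and date itself. The function name
prefix_ts is hypothetical and is not an existing coreutils or moreutils
command. It starts one date process per input line, so it is likely to be far
slower than awk, ts, or a built-in option would be. It is meant to show the
behaviour being asked for, not to serve as a practical substitute.

```shell
#!/bin/sh
# prefix_ts: hypothetical helper name, used here for illustration only.
# Reads standard input line by line and writes each line prefixed with
# the current time. The optional first argument is a date(1) format
# string without the leading '+'.
prefix_ts() {
    fmt="${1:-%Y-%m-%d %H:%M:%S}"
    # IFS= and -r keep leading whitespace and backslashes intact.
    # The '|| [ -n "$line" ]' clause handles a last line that has no
    # trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # One date process is started per line (simple, but slow).
        printf '%s %s\n' "$(date "+$fmt")" "$line"
    done
}
```

Usage would look like "<command> | prefix_ts" or
"<command> | prefix_ts '%H:%M:%S'".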

Evan.


On 11/02/2015 12:06 PM, Bob Proulx wrote:
Evan Rempel wrote:
While I agree in principle with the general rule, I do not agree in this
case. All of the solutions I can find on the web, including the one you
cite, involve using a different programming language to write a program that
solves the problem.
I read this as saying you want the tool written in C for portability
and size.  Nothing prevents creating a tool written in C for this
purpose.  However others will suggest writing the tool in Python for
maximum understanding and flexibility.  Everyone won't have the same
set of priorities.

In some languages, such as awk, the program is very
simple, but it still requires invoking another interpreter to execute a
program. Using that logic, there isn't any need for most of the coreutils
programs. Just use awk to replace cut or perl to replace tr. The coreutils
makes this easier to use and easier to read/understand.
I have at times in the past seen people suggest exactly the above.
Most often I have seen this in regards to Python.  It has been
suggested that all commands be rewritten in Python including the
system boot up process so that all of them would be consistently
inter-operable!

Also in this case, the coreutils package includes an application
"ts" that is a perl script that addresses this exact problem. This
implies that at some time in the past, coreutils accepted a solution
to this problem.
As Pádraig noted, that is a moreutils program and not a coreutils one.
However it is a good example.  Is the functionality of it not
sufficient?  Since it does exactly what you are asking.  And a two
letter 'ts' is hard to beat for small typing.

I am proposing that this functionality be moved to the date program to make
it less dependent on other system tools.
I strongly oppose adding any feature to 'date' that adds creeping
features other than reading and writing the system clock.

The purpose of 'date' is to get and set the system clock.  As such it
is poorly named.  I wish it had been named something different.
Perhaps 'sysclock' or some such similar to 'hwclock'.  Then people
wouldn't try to push other features into it.  I haven't seen a long
list of people trying to push creeping features into 'hwclock' for
example.  Oh well.  It is what it is.

At some point the human date parsing routines à la news were added to
date.  I think that was a bad idea.  That should have gone into a
different program.  Again, oh well.  It is what it is.  I wish it were
"sysclock $(dateformat --date="Mon, 02 Nov 2015 12:36:13 -0700" +SYSCLOCKFMT)"
instead in order to keep the functionality cleanly compartmentalized.

Instead we have a 'date' program that does 75% of what people want
date manipulations to do, and 25% of what it does, it does in ways that
people don't like.  Or some fraction.  They complain about it.  I am
sure many people suffer in silence.  The human date parsing routines
are seductive because they are a quick hack that covers a majority of
cases and stops there.  People are always having problems with the
simplistic parsing model used by date.  And yet there it has been for
years and now backward compatibility would be broken if it were
changed.  One problem is bad and the other problem is bad.  Which is
worse?

Adding text formatting capabilities to 'date' just continues down that
road of creeping features.  Would there soon be people asking for a
text templating engine to be added too?  I think it is not unlikely.

The usual use case for this
functionality is log file generation/piping, and this invocation may happen
before these other tools are available. It may also be on systems
that don't have the required tools, or at least where these other tools are
not consistently available. The coreutils are, as the name implies, core, and
as such are more consistently available than the other more general
scripting tools (awk, perl, python).
Everyone wants their favorite feature in the core set installed
everywhere.  Everyone.

Bootstrapping a system has unique challenges.

This feature would arguably make scripts more readable. I would
prefer the latter of these two

<command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }'

<command> | date --prefix "+%Y-%m-%d %H:%M:%S"

To understand what the awk solution does, you have to understand awk.
What is wrong with the 'ts' solution?

   <command> | ts
   <command> | ts "%Y-%m-%d %H:%M:%S"
   <command> | ts "%F %T"

It is an exact match for your request.

Just for efficiency's sake, timing these two solutions on a 1 million line 
input

awk
1.03user 1.06system 0:02.11elapsed 99%CPU
1.11user 1.04system 0:02.17elapsed 99%CPU
0.97user 1.17system 0:02.17elapsed 99%CPU

date --prefix
0.63user 0.81system 0:01.48elapsed 97%CPU
0.66user 0.80system 0:01.50elapsed 97%CPU
0.62user 0.84system 0:01.50elapsed 97%CPU

Almost 63% more user CPU time when using awk!
We could get even more efficiency if we wrote the code in hand tuned
machine code!  That would speed it up even more.  Should we?  No.
This is optimizing a very small thing and making this very small thing
slightly faster at the expense of creeping features making the code
larger and more complex.

I just wanted to get a feeling for how this feature would be
received by the coreutils maintainers before I submitted the actual
feature request and patch.

Further comments?
This clearly should be a different command other than 'date'.  Such as
'ts' from moreutils for example.

Bob



--
Evan Rempel



