discuss-gnuradio
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Discuss-gnuradio] comments on stream tags and metadata storage


From: Peter A. Bigot
Subject: Re: [Discuss-gnuradio] comments on stream tags and metadata storage
Date: Sat, 26 Jul 2014 06:25:12 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0

I'm confused: are we to discuss these issues on the mailing list, or as comments on the wiki issues I created? I thought the latter was the right location. Putting detailed discussion in two places that do not link to each other is not a good approach.

My comments below address interpretation and philosophy, not technical issues.

On 07/26/2014 05:36 AM, Marcus Müller wrote:
Hi Peter,
I'm on very limited time and internet, so let me just answer things
in-text, shortening where contributing to readability.
On 25.07.2014 23:37, Peter A. Bigot wrote:
I've added five issues to cover the topics from my original email and
followups.

http://gnuradio.org/redmine/issues/698 proposes that the key of a
stream tag be a namespaced identifier to avoid conflicts between
individually-developed components.
That's a good idea, and actually, standarizing the way stream tags and
messages should be named and structured was one of the points of the
last dev call, and is thus work to be done.

I've updated the issue with my proposed approach. If this is already being worked and people have an alternative architecture in mind, then please comment on that issue so I stop thinking about it.


The other four are listed inline below with additional responses to
Marcus:

On 07/25/2014 08:46 AM, Marcus Müller wrote:
< issue at hand: inserting tags only seems save from within work, but
this is documented nowhere>

sooo let me just whip up a few comments:
(1) That's a documentation issue, isn't it?
I don't believe so.

http://gnuradio.org/redmine/issues/699 expands on point (1), that
add_item_tag() can only be safely called from within gr::block::work().
And I have to contradict Sean and you :)
Inserting tags *is* safe at any time. You get logical errors when you
don't take care that you don't attach tags to your stream that belong in
ranges that might already have been processed. This is logical; I do
think it would be a good idea to explicitely state it in the documentation.
However, there's no reason not to insert tags anywhere else, just make
sure that nitems_written() is bigger than your inserted tags' offsets.

I do not see how this allows me to guarantee association of a tag with a specific absolute sample. I can guess why you want to insert from outside work(), and could propose compromises, but I'd prefer to do that at http://gnuradio.org/redmine/issues/699



   Anyway, I'm not quite sure
you're right; the insertion of tags is mutexed IIRC, and the
get_tags_in_range() functionality, too, so once the user got his vector
of tags, that won't change anymore. There's the possibility that he
*misses* some of the tags for the range that get inserted after he
got_tags_in_range(), but that's only fair -- it's quite intuitive not to
insert tags after you've handed off the samples to which they would
belong to downstream tags.
I wasn't concerned about tags across multiple calls to
get_tags_in_range(), but for tags that are added to a stream after one
call to work() and before the end of the next call to work().
I'm not sure I understand you correctly.
If (general_)work gets called, the samples that can be processed within
that call to work have left the upstream block's work function already.
If that upstream block decides that he wants to insert tags to samples
he produced in a prior call already, my comment to (1) and basic "how
the hell is this going to look causal to the block downstream" applies!

I don't think we disagree on this behavior, only on how to ensure that the original insertion doesn't get lost or misordered (and whether that's a problem for which the infrastructure should take some responsibility).


(3) I don't see the relation to the discussion, as you said ;) but that
sounds like a bug, so if you opened up a new thread or filed a bug at
gnuradio.org, that would be awesome :)
http://gnuradio.org/redmine/issues/700 records point (3), that GRC
parameter callbacks can be invoked multiple times as a result of a
single user action due to the architecture of GUI and other components.
Thanks! Sounds hard to fix, though, if you say it's due to architecture.
(4) I'm fairly certain the buffers use a deque to store tags, not a map
of any kind. So maybe I'm misunderstanding you, or you misread code
somewhere?
I think what you're describing might be a bug in metadata_filesink, so
that might need some attention! see (3).
Yes, the lead-in example was specific to the behavior of
file_metadata_sink, but the fundamental requirement goes across the
entire infrastructure.  I'm hoping to see support for making that
requirement stick.
It is *NOT* a requirement (so far). So far it is the rule that everyone
that expects multiple tags will do the sorting. And this makes sense --
computationally. I don't think we should "break" correctness of existing
blocks by taking away that requirement from the tag readers just to
shift it to the tag producers. So I strongly disagree with you here, sorry.

If there is information that is conveyed by receiving tags in a non-monotonic offset order, I'd like to have more details.

If the concern is that having the infrastructure maintain the tag order I propose at http://gnuradio.org/redmine/issues/701 will introduce a performance bottleneck, I'd like to see an argument. Preferably as a comment on that issue.

Otherwise, since I feel that a deterministic non-decreasing order of arrival of tags simplifies implementing blocks that process those tags, if you feel that every block that might take advantage of ordered arrival should instead enforce it in their work() functions then we do indeed disagree.


http://gnuradio.org/redmine/issues/701 expands on points (2) and (4),
that the infrastructure should promise to maintain all tags inserted
by blocks in their original order (for any sample offset) and should
document the situations where tags may be discarded by the
infrastructure.


Also, I don't understand the bullet points at the end of your ticket #701.
* preserve order of insertion: GR does this, so how is this an issue?
You missed the "within the same offset". Yes, I really do want downstream blocks to be guaranteed that the tags they receive are sorted by non-decreasing offset. No, I don't want to make every block that inserts or receives tags have to do the sorting.

* preserve incoming tag order: I'm not quite sure I get your point, but
if I do: GR does this, so how is this an issue?
* never discard a tag at the block that inserted it: Why would someone
discard a tag he inserted it? GR doesn't discard tags, so how is this an
issue?

The implementation of file_metadata_sink discards tags, as described. This suggests that GNU Radio as a holistic system does not have an architectural requirement that metadata be preserved. I'd like to see that resolution adopted.

From the rest of your mail I get the impression that our expectations of what the framework should provide versus what individual component developers are responsible for handling are at odds. If the consensus of the GNU Radio development team is that the solution to all this is to clarify and document the existing behavior with no framework changes at all, then most of my issues can be closed once that's done. Ten years experience with systems named X-Midas and Midas 2k, which do much of what GNU Radio does, leads me to believe that would have significant long term costs in development effort and system reliability.

Peter

* clearly document duplicate elimination process: duplicates are *not*
detected, so how is this an issue?

Seriously, tag insertion is *this* 2 lines of code
(gnuradio-runtime/lib/buffer_detail.cc):
   void
   buffer::add_item_tag(const tag_t &tag)
   {
     gr::thread::scoped_lock guard(*mutex());
     d_item_tags.push_back(tag);
   }

So how did you come to the conclusion that there might be race
conditions, necessity to call it only from work, elimination and
reordering problems?
All of the problems you raise are logical problems that can -- in my
private opinion -- best be solved by what should be called a contract,
ie. "do not shoot yourself in the knee by inserting tags to item numbers
that you already processed, and do not read tags from samples you don't
have yet. If you need your tags sorted, sort 'em", which basically boils
down to "um, you now, don't try to circumvent causality by using tags,
and by the way, don't assume they're sorted", which, for the signal
processing guys can further be reduced to "tags aren't per se sorted.
Usual considerations apply."

As a serious attempt to get something like a private conclusion to this
discussion, I'd like to point out that so far tag propagation works ok;
the performance of get_tags_in_range leaves a lot of room for concepts
that perform better, but that should be applied under the hood; if there
are two tags at the same item: someone put them there, let's keep it
like that. If tags are in a certain order: we should keep it like that
(which metadata filesink seems to break, so thanks for pointing that
out!). If you insert tags to items that have been processed in the past:
don't assume someone else will ever look at them. If you're expecting
tags to be in a specific order and don't know who inserted them: sort
em. In most cases, this is a non-issue because tags will usually be
inserted to mark something in a stream, and therefore get the
item-chronological order automatically.

(5)>(5) All stream tags are placed in the extras block,
sorry, can't follow you there. Extras block?
The extras block is described in the section "Extras Information" at:
http://gnuradio.org/doc/doxygen/page_metadata.html

This is all about the on-disk storage format of the metadata file sink,
so it's not really related to the way tags are handled. Maybe we're
misunderstanding each other, but there is a section in the file format
for extras.

http://gnuradio.org/redmine/issues/702 records points (2) and (5),
that the existing file_meta_source/sink corrupt (IMO) the metadata.

Thanks for that bug report!

Again I may not have made it clear that I'm not intending to raise
these simply as complaints.  I have the necessary software skills to
fix them, and they generally include an outline of the approach I'd
propose.  At this time I won't make any promises, but I'd consider
contributing the necessary effort if I can get enough
feedback/sanity-checks/interface validation/similar assistance from
you folks to be confident that it'd be seen as a worthwhile enhancement.
:)
Greetings,
Marcus




reply via email to

[Prev in Thread] Current Thread [Next in Thread]