Re: [Discuss-gnuradio] Proposed enhancements for data networking


From: Eric Blossom
Subject: Re: [Discuss-gnuradio] Proposed enhancements for data networking
Date: Wed, 7 Jun 2006 03:32:33 -0700
User-agent: Mutt/1.5.9i

On Tue, Jun 06, 2006 at 10:33:29PM -0400, Michael Dickens wrote:
> Note to the discuss-gnuradio list: This is an ongoing discussion  
> about BBN's proposed enhancements for dealing with packets.  While  
> not all the previous discussion is here, the most relevant parts are.  
> - MLD
> 
> A brief background:
> 
> I view data as coming in 2 types: "explicit" and "implicit" packets.
> 
> The former are the actual structured packets (e.g. a TCP/IP packet).   
> The BBN proposal is to create a hierarchical scheduler which processes
> these packets, with a means for incorporating the current gr-graph  
> inside a proposed m-block.
> 
> The latter are data streams, segmented arbitrarily (hence "implicit")  
> into chunks (hence "packets") of data.

Questions re "implicit" packets:

  How would you segment them?
  How is this expressed / represented?
  How would you work this into the GR framework?
  What if you don't know how long the packet will be until it's done?
  How would you associate "metadata" with the implicit packet?
  Would any of this be visible in the "work" or "general_work" method?


> From this viewpoint, there  
> are similarities between the data types, and it would not be  
> difficult to create a "queue" which handles both types.  By handling  
> both data types, I come up with a hybrid scheduler, one which has a  
> primary routine to handle prioritization, pausing / stopping /
> (re)starting, and calling the different data types' "sub"-schedulers -
> which handle just that one packet and then return.


> >We may not want to modify data in place, because other pieces of code
> >may hold references to it.  Also, the input and output sizes may be
> >different.  That said, there's nothing in the plan that would prohibit
> >you from modifying data in place.
> 
> Am I understanding correctly that the m-scheduler will basically do  
> the same thing that the gr-scheduler does: it'll provide both an  
> input and output packet reference to the called m-block, which will  
> do processing, filling up the output packet as it goes?

Not exactly.  The handle_message callback of an m-block
is invoked by the m-block scheduler passing a message.  There is no
"output packet reference".  The callback is free to do whatever it
likes.  It may ignore the incoming message.  It may create and send
any number of messages to any port that is visible to it.  Some of the
messages may be very small -- just the "signal" -- e.g., "Reset".
Others may have substantial payloads -- a variable length number of
bytes that's to be turned into samples.  Recall also that the message
ports are bidirectional.
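
To make the calling convention concrete, here is a rough sketch in plain
Python.  It is not the m-block API (none exists yet); Message, Scheduler,
Framer, Collector and send are names made up for illustration, and only
handle_message comes from the proposal.  It also assumes that a lower
priority number runs first.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass
class Message:
    signal: str                 # e.g. "Reset" or "Data"
    payload: bytes = b""        # empty for pure signals
    priority: int = 1           # assumption: lower value runs first


class Scheduler:
    """Toy priority scheduler: pops one message, runs one callback."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()   # tie-breaker keeps FIFO order

    def send(self, target, msg):
        heapq.heappush(self._queue, (msg.priority, next(self._order), target, msg))

    def run(self):
        while self._queue:
            _, _, target, msg = heapq.heappop(self._queue)
            target.handle_message(self, msg)   # must return without blocking


class Framer:
    """Turns one "Data" message into several; ignores anything else."""

    def __init__(self, downstream):
        self.downstream = downstream

    def handle_message(self, sched, msg):
        if msg.signal != "Data":
            return                              # free to ignore a message
        sched.send(self.downstream, Message("Reset", priority=0))
        for i in range(0, len(msg.payload), 4):
            sched.send(self.downstream, Message("Chunk", msg.payload[i:i + 4]))


class Collector:
    def __init__(self):
        self.seen = []

    def handle_message(self, sched, msg):
        self.seen.append((msg.signal, msg.payload))


sink = Collector()
framer = Framer(sink)
sched = Scheduler()
sched.send(framer, Message("Noise"))
sched.send(framer, Message("Data", b"abcdefgh"))
sched.run()
print(sink.seen)
```

Note there is no output buffer handed to the callback: one incoming
message produced three outgoing ones, and another produced none.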

> Or is there something else which will be done?

The callback can do anything but block.

> >>>>Is there some reason why to create the whole new codes and not try
> >>>>to integrate into what's already there?
> >>>
> >>>First off, we *really* want hierarchical scheduling.  That is, an
> >>>m-block can fire off and run a flowgraph from beginning to end.
> >>>This gives us the ability to "wrap" for example a classic GR
> >>>modulator, while controlling the input and harvesting the output.
> >>>This allows the m-block to impose the "packetization" in such a way
> >>>that the existing GR code doesn't notice.
> >>>
> >>>Second, the two abstractions really have different scheduling needs.
> >>>The m-block is priority based (based on message priority), and
> >>>there's no need to check for anything like "Do I have enough input" /
> >>>"do I have room for the expected output".  I've considered this
> >>>option, but didn't see a way to combine them that wasn't an extreme
> >>>kludge.  Also, keeping them separate will allow m-blocks to be
> >>>reused outside of gnuradio.
> >>
> >>I can accept the argument that you want to keep the programming as
> >>separate as possible so-as to be able to re-use m-blocks outside of
> >>GR.  I think it is wise to keep the current stream-scheduler code
> >>away from the new packet-scheduler code as much as possible since
> >>they're different beasts.  But ...
> >>
> >>Why hierarchical and not "hybrid"?  I assume by the former you mean
> >>that "doing packets" is the default, and "doing streams" is a sub-
> >>method - which might involve a slight penalty w/r.t. how many
> >>function calls it takes to execute a gr_block graph.
> >
> >I seriously doubt that it will be visible above the noise.

I'm not opposed to "hybrid", I just don't get the semantics.  I have
the feeling that you're not understanding our desire to "wrap" the
invocation of a flowgraph.  It would be very similar to how the
existing qa_*.py QA code works.  It sets up one or more sources, some
transformations and a sink or two, then uses fg.run() to run the
flowgraph until completion.
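
For anyone who has not opened the QA code, the pattern is roughly the
following.  This is a stand-in written in plain Python so that it runs
without GNU Radio; VectorSource, MultiplyConst, VectorSink and FlowGraph
here are hypothetical toy classes, not the real gnuradio ones, and the
toy only handles a linear chain.

```python
class VectorSource:
    def __init__(self, data):
        self.data = list(data)

    def produce(self):
        return iter(self.data)


class MultiplyConst:
    def __init__(self, k):
        self.k = k

    def work(self, items):
        return (self.k * x for x in items)


class VectorSink:
    def __init__(self):
        self.data = []

    def consume(self, items):
        self.data.extend(items)


class FlowGraph:
    """Linear chain only: source -> transforms... -> sink."""

    def __init__(self, source, transforms, sink):
        self.source, self.transforms, self.sink = source, transforms, sink

    def run(self):
        # Runs until the source is exhausted, then returns to the caller.
        stream = self.source.produce()
        for block in self.transforms:
            stream = block.work(stream)
        self.sink.consume(stream)


src = VectorSource([1, 2, 3])
dst = VectorSink()
fg = FlowGraph(src, [MultiplyConst(2), MultiplyConst(5)], dst)
fg.run()
print(dst.data)
```

The caller controls the input (the source contents) and harvests the
output (the sink contents) once run() returns.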


Do you envision that it would be possible for a block to have both
ports and streams?

> >>In a hierarchical scheduler, I'm
> >>guessing that "p2p" would be the default method, "s2s" would be the
> >>current scheduler operating as a subroutine of "p2p" which could be
> >>#define'd in or out (or some similar way that might look nicer), and
> >>the others would be glue to "make things work" between packets and
> >>streams - maybe some very specific sinks or sources which "know" how
> >>to deal with the various I/O necessities of "p2s" and "s2p".
> >
> >In the case of an m-block which contains an embedded flow graph (I
> >suspect that this will be a small minority of all m-blocks), the
> >handle_message callback of the m-block will pass any relevant data
> >to the source that talks to the flow graph and then will invoke
> >gr_single_threaded_scheduler (or something that calls it).
> >When gr_single_threaded_scheduler returns, the callback will finish up
> >whatever processing is required (if any) and will return (to the
> >m-block scheduler which invoked the callback.)
> 
> Ah ... so the m-block itself invokes a gr-scheduler.

yes.

> My initial reaction is to question why this needs to be the case: why not
> keep data processing in the blocks and leave scheduling to the scheduler?

Because the effect we are striving for is the ability to "delegate"
signal processing to the flowgraph.  We want to give it "this data",
then have it process it, then harvest the result.  Then we want to do
something else with the resulting data.  Again, quite similar to how
the QA code works.  Among other actions, fg.run() calls the scheduler
and waits until everything is done.
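
A minimal sketch of that delegation, assuming a flow graph whose source
and sink the m-block can reach directly.  TinyFlowGraph and
ModulatorMBlock are hypothetical names, and the two per-sample stages are
arbitrary placeholders rather than a real modulator.

```python
class TinyFlowGraph:
    """Hypothetical stand-in for a GR flow graph with one source and one sink."""

    def __init__(self, stages):
        self.stages = stages      # list of per-sample functions
        self.source = []          # filled by the caller before run()
        self.sink = []            # read by the caller after run()

    def run(self):
        # Returns once the finite input has been consumed.
        self.sink = []
        for sample in self.source:
            for stage in self.stages:
                sample = stage(sample)
            self.sink.append(sample)


class ModulatorMBlock:
    """Wraps the flow graph; the flow graph never sees message boundaries."""

    def __init__(self, flowgraph):
        self.fg = flowgraph
        self.outbox = []

    def handle_message(self, payload):
        self.fg.source = list(payload)     # give it "this data"
        self.fg.run()                      # delegate the signal processing
        samples = self.fg.sink             # harvest the result
        # "Something else": record the output as one unit per message.
        self.outbox.append({"n_samples": len(samples), "samples": samples})


# Placeholder stages: scale each byte, then offset it.
fg = TinyFlowGraph([lambda b: b * 2, lambda s: s - 1])
mb = ModulatorMBlock(fg)
mb.handle_message(b"\x01\x02")
mb.handle_message(b"\x05")
print(mb.outbox)
```

The packetization lives entirely in the m-block: each call to
handle_message yields exactly one entry in the outbox.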

> True, part of handling a message might be to invoke some other  
> routine to work on the data .. and that routine could be in FFTW or  
> the gr-scheduler.  Either way, it's interesting.  I'll think about  
> this one a bit, as it wasn't obvious up front (to me).

OK.  FWIW, we expect that many m-blocks will implement a state machine
or state chart in their callback.  This makes it possible to handle
protocols, timeouts, etc, in a very straightforward way.  Pretty much
all of this "control plane" stuff is a pain to do using the data flow
abstraction. 
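
As an illustration of the kind of state machine meant here, the sketch
below is a toy stop-and-wait sender.  Everything in it is an assumption
made for the example: the signal names SEND, ACK and TIMEOUT, the retry
limit, and the idea that a timer delivers TIMEOUT as an ordinary message
so the callback never has to block.

```python
class StopAndWaitSender:
    """Toy ARQ sender driven entirely by incoming messages."""

    MAX_RETRIES = 2

    def __init__(self):
        self.state = "IDLE"
        self.pending = None
        self.retries = 0
        self.sent = []       # frames handed to the (imaginary) PHY
        self.events = []     # outcomes reported to the layer above

    def handle_message(self, signal, payload=None):
        # Dispatch on the current state; each handler returns immediately.
        handler = getattr(self, "_in_" + self.state.lower())
        handler(signal, payload)

    def _in_idle(self, signal, payload):
        if signal == "SEND":
            self.pending, self.retries = payload, 0
            self.sent.append(payload)
            self.state = "WAIT_ACK"

    def _in_wait_ack(self, signal, payload):
        if signal == "ACK":
            self.events.append("delivered")
            self.state = "IDLE"
        elif signal == "TIMEOUT":
            if self.retries < self.MAX_RETRIES:
                self.retries += 1
                self.sent.append(self.pending)     # retransmit
            else:
                self.events.append("failed")
                self.state = "IDLE"


tx = StopAndWaitSender()
for sig, data in [("SEND", b"A"), ("TIMEOUT", None), ("ACK", None),
                  ("SEND", b"B"), ("TIMEOUT", None), ("TIMEOUT", None),
                  ("TIMEOUT", None)]:
    tx.handle_message(sig, data)
print(tx.sent, tx.events, tx.state)
```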

> As a related comment: I presume I could run a gr-graph alone inside  
> the m-block ... there would be no need to "zero-copy" data from the
> m-block to the gr-graph since the sink and source would be in the
> gr-graph.  Am I correct here, or will some other method be used to run a
> gr-graph alone (without the need for the m-stuff)?

Yes, it could run without data from the m-block.

> >>Why not have a main scheduler loop which checks the next block in the
> >>queue for its type, then calls a subroutine to handle the execution
> >>(memory management and such) for that block -this 1 time-.  Upon
> >>return (with whatever information is appropriate to return from that
> >>block's execution), the super-scheduler would do whatever it needs to
> >>do re: prioritization and timing, then call the sub-scheduler for the
> >>highest priority block in the queue.  In this manner, all blocks in a
> >>graph would be executed over, no matter their type or priority.
> >
> >I'm not following this.
> 
> Think of it (roughly) as moving the gr-scheduler execution from the
> m-block - as in BBN's proposal - to a sub-scheduler of the
> super-scheduler ("super" in the sense of a scheduler that calls the other
> ones) ... processing one "data block" at a time (whichever is the
> highest priority data block - either an explicit packet or an
> "implicit packet" stream).

Still not clear on the "implicit packet" and how the code knows about
boundaries, etc. ;)

> If the highest priority data in the queue is an explicit packet
> (which requires packet-only processing), then the super-scheduler
> calls the "p2p" scheduler to deal with this packet.  If the highest
> priority data in the queue is an implicit packet (which requires
> stream-only processing) then the super-scheduler calls the "s2s"
> scheduler to deal with this stream data.  Upon return (with whatever
> appropriate arguments are returned), then the super-scheduler
> re-prioritizes the data to be processed.  Repeat until either all
> data is processed or the process is killed.
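
One possible reading of the loop described above, as a sketch only.  The
function names, the dictionary layout, the "chunks_left" counter and the
convention that a lower number means higher priority are all assumptions
made for the example; nothing here is existing GNU Radio code.

```python
import heapq
import itertools

_seq = itertools.count()              # tie-breaker for equal priorities


def run_p2p(item, log):
    log.append(("p2p", item["name"]))
    return None                       # an explicit packet is done in one call


def run_s2s(item, log):
    log.append(("s2s", item["name"]))
    item["chunks_left"] -= 1
    return item if item["chunks_left"] > 0 else None   # more stream to do?


SUB_SCHEDULERS = {"explicit": run_p2p, "implicit": run_s2s}


def super_scheduler(items):
    log, queue = [], []
    for it in items:
        heapq.heappush(queue, (it["priority"], next(_seq), it))
    while queue:
        _, _, item = heapq.heappop(queue)
        leftover = SUB_SCHEDULERS[item["kind"]](item, log)
        if leftover is not None:
            # Re-insert at the position its priority dictates.
            heapq.heappush(queue, (leftover["priority"], next(_seq), leftover))
    return log


work = [
    {"name": "audio", "kind": "implicit", "priority": 5, "chunks_left": 2},
    {"name": "beacon", "kind": "explicit", "priority": 1},
    {"name": "ack", "kind": "explicit", "priority": 1},
]
print(super_scheduler(work))
```

The sketch leaves open the question of where the "implicit packet"
boundaries come from; here they are simply counted.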

> >If I've got high priority m-blocks, I want to
> >run *only* them before taking a look at anything else.
> 
> This could be done by setting the priority correctly.  The
> super-scheduler takes care of that.  As -any- type of sub-scheduler returns
> after processing a given block type, the super-scheduler reorders the  
> current data "to process" list such that the highest priority data is  
> next in the queue ... the actual implementation would probably be  
> inserting the returned block into the queue at the correct priority  
> location.  Either way, very-high-priority data (of any type) could be  
> run through processing quickly, while ignoring all other data  
> temporarily.
> 
> >>There isn't a need to "wrap" m-blocks around gr_blocks ... data in
> >>the form of packets or streams would be moved appropriately by the
> >>sub-scheduler for its given block type.  m-blocks and gr_blocks could
> >>live "side-by-side" in the same graph.  The code could be easily
> >>tweaked (#def'd or whatever) to be m-block specific, and thus could
> >>be made to work outside GR.
> >
> >[#ifdef considered harmful...]
> 
> No smiley ... hmmm.  So how else would m-block and gr-stuff be kept  
> separate if an m-block method is calling gr-scheduler?  I'm going  
> back to the desire to keep the programming separate so that m-blocks  
> could be used independent of GR.  Ah ... a special block type which  
> "knows" how to execute the gr-scheduler, and does nothing else, yes?   

That's the basic idea.  In any sizable system it's a good idea to
reduce the amount of coupling between parts of the system.  This leads
to a much more maintainable and understandable system.

In this case you could imagine 3 modules:

  gnuradio-core (knows nothing of mblocks)
  mblock        (knows nothing of gnuradio-core)
  gr-mblock     (knows a little bit about both)

gr-mblock would probably represent less than 5% of the code in
gnuradio-core + mblock, and is the only code that depends on both
gnuradio-core and mblock.

> If not using GR, then that file just isn't compiled.  That would be  
> clever, and indeed much better separation of programming than the  
> hybrid model I'm proposing.

Yes.

> >The whole point of this exercise is to allow m-blocks to execute GR
> >flow graphs as "subroutines".
> 
> Why require m-blocks be the place where gr-graphs are called from?   

First off, nobody is forced to use the m-block stuff.  All the old
stuff will continue to work just as it does now.  That's part of the
reason for the strict separation.

> Why not incorporate both types of graphs together into one graph?   
> It's easier to check on "connect()" (if you choose to use a static  
> graph, or even if dynamic changes are allowed ... even allows for  
> dynamic changes to the gr-graph blocks) ...

Note that in the BBN proposal there are no "graphs".  
There are only mblocks, some of which contain other mblocks.

> ...speaking of which, how  
> are the gr-graphs checked?  I suppose they're connected first (which  
> does the checking), then the source and sink are passed into the
> m-block in which they will be processed?

They'll be checked exactly as they are today.  When you call
fg.start() or fg.run(), many steps take place: the contracts specified
by the i/o signatures are checked, buffers are allocated and linked
into the behind-the-scenes parts of the gr_block_details, and the
graph is partitioned and topologically sorted to give us a traversal
order that is likely to keep the cache hot.
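
Two of those steps, the signature check and the topological sort, can be
sketched in a few lines.  This is an illustration using Kahn's algorithm,
not the code in gnuradio-core; check_and_order is a made-up name, item
sizes stand in for full i/o signatures, and buffer allocation and
partitioning are left out.

```python
from collections import deque


def check_and_order(blocks, edges):
    """blocks: name -> (input item size or None, output item size or None)
    edges:  list of (upstream name, downstream name)"""
    # 1. I/O signature check: connected ports must agree on item size.
    for up, down in edges:
        out_size, in_size = blocks[up][1], blocks[down][0]
        if out_size is None or out_size != in_size:
            raise ValueError(f"{up} -> {down}: item size mismatch")

    # 2. Kahn's algorithm: repeatedly emit blocks with no unvisited inputs.
    indegree = {name: 0 for name in blocks}
    for _, down in edges:
        indegree[down] += 1
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for up, down in edges:
            if up == name:
                indegree[down] -= 1
                if indegree[down] == 0:
                    ready.append(down)
    if len(order) != len(blocks):
        raise ValueError("flow graph contains a cycle")
    return order


blocks = {
    "src": (None, 8),      # 8-byte items out
    "filter": (8, 8),
    "demod": (8, 4),       # 4-byte items out
    "sink": (4, None),
}
edges = [("src", "filter"), ("filter", "demod"), ("demod", "sink")]
print(check_and_order(blocks, edges))
```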

> Thus a special type of m-block which knows how to handle gr-graphs?

Could be, but more likely just an exported interface quite similar to
fg.run(), except that it may be invoked more than once without going
through all the overhead of allocating buffers, etc.
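
Something along these lines, as a sketch under assumptions: ReusableRunner
is a hypothetical name, a Python list stands in for the preallocated
buffers, and _prepare stands in for the validation, allocation and sorting
that would only need to happen once.

```python
class ReusableRunner:
    """Hypothetical wrapper: pay the setup cost once, then run many times."""

    def __init__(self, stages, buffer_len):
        self.stages = stages
        self.buffer_len = buffer_len
        self.buffer = None
        self.setups = 0

    def _prepare(self):
        # Stand-in for validation, buffer allocation, and sorting.
        self.buffer = [0] * self.buffer_len
        self.setups += 1

    def run(self, samples):
        if self.buffer is None:
            self._prepare()
        if len(samples) > self.buffer_len:
            raise ValueError("input larger than preallocated buffer")
        n = len(samples)
        self.buffer[:n] = samples
        for stage in self.stages:
            for i in range(n):
                self.buffer[i] = stage(self.buffer[i])
        return self.buffer[:n]          # copy out, so the buffer can be reused


runner = ReusableRunner([lambda x: x + 1, lambda x: x * 10], buffer_len=4)
first = runner.run([1, 2])
second = runner.run([7])
print(first, second, runner.setups)
```

The second call reuses the buffer from the first, so the setup counter
stays at one.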

> [See above for the "lightbulb" ;-] And a special type of gr-block
> "source" and "sink" would need to be made to handle "p2s" and "s2p"
> zero-copy, yes?

Exactly!

> >This behavior allows us to associate
> >metadata with incoming and outgoing messages.
> 
> There is no reason meta-data couldn't be passed to a stream block ...  
> it would by default just be ignored (not written to or read from,  
> except for timing of block execution and anything else a generic  
> block type would want).  Of course, new gr-blocks could be written to  
> use the meta-data if they wished.

Still not clear on how a gr-block would see and/or generate an
"implicit packet" boundary.  Would it be possible for a gr_block to
reorder "implicit packets"?  How would you attach the metadata?


> A difference between the hierarchical and hybrid designs is that the
> latter augments current gr-blocks with the basic properties of
> m-blocks ... latency timing, meta-data availability, whatever would be
> common between both block types.  While I like the current gr-blocks,
> I would like them even more if they "knew" something about time and
> the data they were processing.  I see no reason not to allow
> meta-data for streams ... could be useful (e.g. "this is MP3 data, VBR
> 192 kbps, joint stereo" ... and the receiver could dynamically
> [re]configure a graph to deal with that particular set of meta-data).

OK.  I'm still not sure we're on the same page with regard to
metadata.  I'm not thinking only of metadata associated with a given
stream, but rather metadata associated with a particular
packet/message.
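
To show the distinction, here is a small sketch in which the metadata
rides along with each individual message rather than describing the
stream as a whole.  The Message layout and the particular fields
(timestamp, rssi_db) are assumptions picked for the example, not part of
the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    signal: str
    payload: bytes = b""
    metadata: dict = field(default_factory=dict)   # travels with this message only


def receive_path(frames):
    """Attach per-frame measurements at the point where they are known."""
    for timestamp, rssi_db, payload in frames:
        yield Message("RxFrame", payload,
                      {"timestamp": timestamp, "rssi_db": rssi_db})


def pick_strongest(messages):
    # A consumer can make decisions per message, based on its own metadata.
    return max(messages, key=lambda m: m.metadata["rssi_db"])


frames = [(0.010, -71.5, b"hello"), (0.025, -48.0, b"world")]
msgs = list(receive_path(frames))
best = pick_strongest(msgs)
print(best.payload, best.metadata)
```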


> >If they weren't nested,
> >how would the m-block know when the graph had completed generating the
> >data?
> 
> In the hierarchical design as proposed, there would be no way to know
> when the graph had completed.  Hence that design requires that the
> m-block call the gr-scheduler.

Yes.

> In the hybrid design, the m-block would have no knowledge of  
> completion ... it wouldn't need that knowledge, just as gr-blocks
> don't.

Agreed, but that doesn't solve the problem. An mblock may want to
"harvest" the output of executing the flow graph.  Just like the QA
code does.  [If you haven't already, now's a good time to take a look
at some of the qa_*.py files.]

> That knowledge would reside in the scheduler, as it currently  
> does - just an upgraded scheduler to handle all block types ... a  
> more integrated scheduler than that in the current proposal, which  
> has both benefits and drawbacks.

...

> >Also note that the m-block handle_message callback must "run to
> >completion".  That is, it is not allowed to block in the callback
> >waiting for something in the outside world.
> >
> >>I haven't actually implemented such a hybrid scheduler, but I've
> >>thought about it enough to believe it can be done without too much
> >>to-do.  While it does involve re-writing a bit of GR's scheduler code, I
> >>think the add-on benefits would be worth the effort.  For example, in
> >>what I envision, the current gr_block's would be augmented with the
> >>basics needed to deal with time-based processing: min/max/avg latency
> >>computation, min/max/avg throughput computations, and/or anything
> >>else which would be generically required for any type of block no
> >>matter the data I and O types.
> >
> >>I can certainly write more about this, and might even possibly do
> >>some quick programming to see if/how well it would work.  Thoughts?
> >>Are these really the same thing, or am I just dreaming about mine?  
> >>- MLD
> >
> >I'm not sure that I'm following what you are saying...
> >But I'm interested in hearing more ;)
> 
> Ah, the smiley!  I like this discussion - though it does take time to  
> read and digest - since I think it's the future of GR in many ways,  
> no matter which scheduler type gets implemented.  I definitely am  
> getting a better understanding of the proposed enhancements than from  
> just reading the chapter.  Hence I will re-read it & see if there are  
> points which could be clarified ... tomorrow.

Sounds good.

> >How do you propose that m-blocks and gr-blocks interact?
> 
> They interact in the same way gr-blocks do in a gr-graph right now.
> One could create a graph which links p_source -> p2p -> p2s -> s2s ->  
> s_sink.  There could be multiple p2p's linked together, or multiple  
> s2s's ... just one p2s between them.  It's possible to use the
> "zero-copy" as mentioned in the proposal for the p2s part (and s2p part)
> instead of an actual block, which might be the default, but it could  
> also be possible to create a specific block type which handles the  
> initial and eventual conversion to/from packets and streams.

Yes.

Remember too that there is "no flow" in the mblock world.  There are
just messages being sent and received, and they may pass
bidirectionally through ports.
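
A sketch of what "bidirectional" means in practice: the same connected
port pair carries a request one way and a reply the other.  Port, connect,
Mac, Phy and the signal names are hypothetical, chosen only for the
example, and a plain FIFO stands in for the m-block scheduler.

```python
from collections import deque


class Port:
    def __init__(self, owner):
        self.owner = owner
        self.peer = None

    def send(self, queue, signal):
        # Delivery is deferred through a queue so no callback calls another.
        queue.append((self.peer.owner, self.peer, signal))


def connect(a, b):
    a.peer, b.peer = b, a


class Mac:
    def __init__(self):
        self.phy_port = Port(self)
        self.log = []

    def handle_message(self, queue, port, signal):
        self.log.append(signal)


class Phy:
    def __init__(self):
        self.mac_port = Port(self)
        self.log = []

    def handle_message(self, queue, port, signal):
        self.log.append(signal)
        if signal == "Transmit":
            port.send(queue, "TransmitDone")   # reply back through the same port


mac, phy = Mac(), Phy()
connect(mac.phy_port, phy.mac_port)
queue = deque()
mac.phy_port.send(queue, "Transmit")
while queue:
    target, port, signal = queue.popleft()
    target.handle_message(queue, port, signal)
print(mac.log, phy.log)
```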

Good questions and comments!

Eric



