From: ziyang
Subject: Re: [Discuss-gnuradio] Try to improve E100's performance at high sample rate
Date: Tue, 17 Jan 2012 20:16:23 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110419 Thunderbird/3.1.9

On 01/17/2012 07:36 PM, Josh Blum wrote:
> On 01/16/2012 09:51 AM, ziyang wrote:
>> On 01/13/2012 09:30 PM, Josh Blum wrote:
>>>> To reduce the computation load of the processor, I tried two methods:
>>>> 1) modify the gr.quadrature_demod_cf block, replace some multiplication operations with volk-based operations (gr.multiply and gr.multiply_const modules in gr_blocks);
>>> I like it. Make sure to contribute patches like that back. :-)
>> Actually, what I did was write a new quadrature_demod block without the multiplication and delay operations, and connect extra gr.multiply and gr.delay blocks in the flow graph instead. My understanding is that the volk functions take a vector (multiple values) as input, and I didn't figure out a way to do the single-item operation in the volk style.
> I don't recommend using the extra blocks; that would probably cause more overhead. Looking at gr_quadrature_demod_cf::work, it looks like you can vectorize the conjugate multiply, then the atan, then the gain scaler. So, that would be one for loop that operates on 4 samples at a time, and calls 3 volk functions.

Josh, thank you for your advice! Before I tried using gr.multiply outside the block, I actually implemented a demodulation block in a way similar to your suggestion, but the loop operated on 100 samples at a time. I don't know whether the 100-sample vectorization was what caused the poor performance. I will try processing 4 samples at a time.
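
For reference, the loop structure Josh describes (conjugate multiply, then atan, then gain, applied to one small chunk at a time) can be sketched in plain Python. This is only a structural illustration under stated assumptions: the three helper functions are hypothetical stand-ins for volk kernels and are not volk calls, the chunk size is a free parameter, and pure-Python lists say nothing about real performance.

```python
import cmath
import math


def conj_multiply(cur, prev):
    """Stage 1 (stand-in for a vector kernel): cur[i] * conj(prev[i])."""
    return [x * y.conjugate() for x, y in zip(cur, prev)]


def phase(products):
    """Stage 2 (stand-in for a vector kernel): angle of each product."""
    return [math.atan2(p.imag, p.real) for p in products]


def scale(values, gain):
    """Stage 3 (stand-in for a vector kernel): multiply by a constant."""
    return [gain * v for v in values]


def quadrature_demod(samples, gain, chunk=4):
    """Demodulate in chunks; samples[0] acts as the one-item history.

    Produces len(samples) - 1 outputs, each equal to
    gain * arg(samples[n] * conj(samples[n - 1])).
    """
    out = []
    n_out = len(samples) - 1
    for start in range(0, n_out, chunk):
        stop = min(start + chunk, n_out)
        cur = samples[start + 1:stop + 1]   # x[n]
        prev = samples[start:stop]          # x[n - 1]
        out.extend(scale(phase(conj_multiply(cur, prev)), gain))
    return out


# Usage: a complex tone whose phase advances 0.1 rad per sample.
tone = [cmath.exp(1j * 0.1 * n) for n in range(11)]
demodulated = quadrature_demod(tone, gain=2.0, chunk=4)
```

The result does not depend on the chunk size, so the same function can be called with chunk=4 and chunk=100 to compare the two variants discussed above.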

>>> Also, you may consider timing a particular operation as a performance metric, rather than counting the number of demodulated packets.
>> I was wondering if there are examples from which I can learn how to do this?
> Sorry, I guess there isn't much in the way of examples. You can time individual work functions by adding some code before and after. We have some high-resolution timers in gruel/include/gruel/high_res_timers.h

So I call the timer functions of high_res_timers.h before and after the operation in the work function, is that right?
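
The before-and-after pattern in question can be sketched as follows. This is a minimal illustration, assuming Python's time.perf_counter_ns as a stand-in for the C++ timers in high_res_timers.h (whose API is not shown here); WorkTimer and its methods are hypothetical names introduced for the example.

```python
import time


class WorkTimer:
    """Accumulates the time spent inside a work-like function across calls."""

    def __init__(self):
        self.total_ns = 0   # total time spent inside the timed calls
        self.calls = 0      # number of timed calls
        self.items = 0      # number of input items processed

    def timed(self, work, items):
        """Read the clock before and after work(items) and keep the difference."""
        t0 = time.perf_counter_ns()
        result = work(items)
        self.total_ns += time.perf_counter_ns() - t0
        self.calls += 1
        self.items += len(items)
        return result

    def ns_per_item(self):
        """Average cost per input item, or 0.0 if nothing was processed."""
        return self.total_ns / self.items if self.items else 0.0


# Usage: wrap each call of the operation under test, then read the average.
timer = WorkTimer()
doubled = timer.timed(lambda xs: [2 * x for x in xs], [1, 2, 3])
average_ns = timer.ns_per_item()
```

Accumulating over many calls and reporting time per item tends to be more stable than timing a single call, since one work invocation may process only a few items.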

> I have also seen people time the block in a simple flow graph with a null source, head, your_block, null_sink. You can time tb.run() and compare run duration vs the non-vectorized code.
> -Josh

I have two questions about this:
1) Is the "head" block for generating data for the processing block?
2) The UHD initialization happens only after tb.run() is called, so how can I isolate the processing time from the total time between tb.run() and tb.stop()?
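
A sketch of the benchmark graph Josh suggests may help frame both questions. It assumes the GNU Radio 3.5-era Python names (gr.null_source, gr.head, gr.quadrature_demod_cf, gr.null_sink); time_run and make_demod_benchmark are hypothetical helper names, and the item count is arbitrary. In such a graph the null source supplies the items, while head only passes a fixed number of them and then ends the stream so that tb.run() returns. Since the graph contains no UHD block, device initialization is not part of the measured time.

```python
import time


def time_run(make_top_block, repeats=3):
    """Build a fresh flow graph per trial and time only its run() call."""
    durations = []
    for _ in range(repeats):
        tb = make_top_block()            # construction cost is excluded
        t0 = time.perf_counter()
        tb.run()                         # returns once the flow graph finishes
        durations.append(time.perf_counter() - t0)
    return min(durations)                # best of several trials


def make_demod_benchmark(n_items=10000000, gain=1.0):
    """Hypothetical factory: null source -> head -> block under test -> null sink."""
    from gnuradio import gr              # imported here so the sketch loads without GNU Radio

    tb = gr.top_block()
    src = gr.null_source(gr.sizeof_gr_complex)
    head = gr.head(gr.sizeof_gr_complex, n_items)   # stops the graph after n_items
    demod = gr.quadrature_demod_cf(gain)            # swap in the modified block to compare
    sink = gr.null_sink(gr.sizeof_float)
    tb.connect(src, head, demod, sink)
    return tb
```

Calling time_run(make_demod_benchmark) once with the stock block and once with the modified block gives two durations to compare. On the Python 2 installations of that period, time.time() would take the place of time.perf_counter().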
Thanks. Best Regards, Terry