
Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?


From: Tom Rondeau
Subject: Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?
Date: Sun, 29 Mar 2015 17:20:22 -0700

On Sat, Mar 28, 2015 at 5:32 PM, Andy Walls <address@hidden> wrote:
When testing, I used 5 float streams running at over 150 Msps each, with 15 microsecond bursts of 50 MHz at about 10 microseconds apart. I used enough x points to see two bursts on the GUI. Normal trigger. (Free or auto trigger might be too taxing.)

-Regards
Andy

Andy, if you have a chance, can you check out this new branch:

https://github.com/trondeau/gnuradio/tree/qtgui/controlpanel

It adds the fixes that we talked about. I just want to verify that things are still looking and behaving well for you.

The other feature of this branch is a control panel: go into the QT GUI Time Sink properties and set "Control Panel" to Yes. I wouldn't mind a quick bit of feedback there, either.

Tom


 
On March 28, 2015 8:06:08 PM EDT, Tom Rondeau <address@hidden> wrote:
On Sat, Mar 28, 2015 at 12:50 PM, Andy Walls <address@hidden> wrote:
On Sat, 2015-03-28 at 14:45 -0400, Andy Walls wrote:
> Hi Tom:
>
>
> On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
> > On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
> > <address@hidden> wrote:
>
> >         Can this memmove() be safely skipped
> >
> >         https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
> [snip]
> >         The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt
> >         wants
> >         doubles for plotting and not floats. But it might also be able
> >         to be
> >         deferred to the very end when the decision to plot is known
> >         for sure.
> >         (But that's more surgery than I care to take on at the
> >         moment.)
>

>
> >  But thinking about the volk convert function, that's both copying the
> > data from the input buffer into the internal buffer as well as
> > performing the conversion. We can't just hold data in the input since
> > we don't want to back up the data until we're ready to plot both with
> > timing and with a full enough buffer -- it's just sampling a section
> > at a time and drops everything in between.
>
> Right.
>
> >  That part could be converted into a memcpy instead of the volk
> > convert. Then, when we're ready to plot, we call the volk convert that
> > also does the move from d_start to 0, so it combines those two
> > elements.
>
> Yeah, that's the surgery part. :)  It would require adding a new set of
> buffers to hold floats objects, and then convert them when a
> determination to plot was made.
>
> This also affects the memmove() of the tail for the trigger delay.  It
> would operate on the new set of float buffers (vs the buffers holding
> doubles).
>
> > Thoughts on those proposals?

Your proposal for implementing memcpy() and deferring volk_*() to do the
conversion and "memmove" in one step is great!  :)

I just implemented it, and the time_sink_f thread has gone from 41.5%
CPU down to 29.1% CPU in my tests. :)  memcpy() now dominates the
thread, but that's to be expected.
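
For readers following the thread, here is a minimal sketch of the idea: memcpy() raw floats on every call, and defer the float-to-double conversion until the decision to plot is made, letting the conversion itself do the shift from the start index down to 0. This is not the actual time_sink_f_impl code. The class deferred_plot_buffer, its members, and the helper convert_float_to_double() are hypothetical names invented for illustration, and the helper is a plain loop standing in for the VOLK call (volk_32f_convert_64f) so the example builds without VOLK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the VOLK float-to-double conversion (assumption: the real
// sink would call volk_32f_convert_64f here instead of this loop).
static void convert_float_to_double(double* out, const float* in, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<double>(in[i]);
}

// Hypothetical, simplified single-stream buffer; not the GNU Radio class.
class deferred_plot_buffer
{
public:
    explicit deferred_plot_buffer(std::size_t size)
        : d_fbuffer(2 * size, 0.0f), // raw float staging area
          d_buffer(size, 0.0),       // doubles handed to the plot
          d_size(size),
          d_index(0)
    {
    }

    // Hot path, run on every work() call: copy floats, no conversion.
    // Returns the number of samples actually stored.
    std::size_t push(const float* in, std::size_t n)
    {
        const std::size_t room = d_fbuffer.size() - d_index;
        const std::size_t ncopy = (n < room) ? n : room;
        std::memcpy(d_fbuffer.data() + d_index, in, ncopy * sizeof(float));
        d_index += ncopy;
        return ncopy;
    }

    // True once d_size samples are available starting at 'start'.
    bool ready(std::size_t start) const { return start + d_size <= d_index; }

    // Cold path, run only when a plot is due: the conversion reads from
    // 'start' and writes at offset 0, so no separate memmove() is needed.
    const std::vector<double>& finalize(std::size_t start)
    {
        assert(ready(start));
        convert_float_to_double(d_buffer.data(), d_fbuffer.data() + start, d_size);
        d_index = 0; // begin collecting the next frame
        return d_buffer;
    }

private:
    std::vector<float> d_fbuffer;
    std::vector<double> d_buffer;
    std::size_t d_size;
    std::size_t d_index;
};
```

In this sketch the conversion cost is paid once per plotted frame rather than once per incoming buffer, which matches the shift seen in the profiles below, where memcpy() replaces the VOLK convert as the dominant symbol.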



With my initial hack:

> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
> samples  %        image name               symbol name
> 78158    39.0737  libvolk.so.0.0.0         volk_32f_convert_64f_u_avx
> 22777    11.3870  no-vmlinux               /no-vmlinux
> 13972     6.9851  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
> 7781      3.8900  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
> 7236      3.6175  libpthread-2.18.so       pthread_mutex_lock
> 6163      3.0811  libgnuradio-runtime-3.7.7git.so.0.0.0 boost::detail::sp_counted_base::release()
> 5942      2.9706  libpthread-2.18.so       pthread_mutex_unlock
> 4947      2.4732  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_executor::run_one_iteration()
> 3826      1.9127  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_detail::input(unsigned int)
> 3555      1.7773  libstdc++.so.6.0.19      /usr/lib64/libstdc++.so.6.0.19
> 3206      1.6028  libc-2.18.so             __memmove_ssse3_back
> [...]

With my implementation of your suggestion:

CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 90000
samples  %        image name               symbol name
27595    35.6051  libc-2.18.so             __memcpy_sse2_unaligned
12225    15.7736  no-vmlinux               /no-vmlinux
4051      5.2269  libpthread-2.18.so       pthread_mutex_lock
3739      4.8243  libgnuradio-runtime-3.7.7git.so.0.0.0 boost::detail::sp_counted_base::release()
3362      4.3379  libpthread-2.18.so       pthread_mutex_unlock
2876      3.7108  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_executor::run_one_iteration()
2364      3.0502  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_detail::input(unsigned int)
2091      2.6980  libstdc++.so.6.0.19      /usr/lib64/libstdc++.so.6.0.19
1388      1.7909  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::tpb_detail::notify_upstream(gr::block_detail*)
1138      1.4683  libc-2.18.so             __memmove_ssse3_back
[...]
2         0.0026  libvolk.so.0.0.0         __volk_32f_convert_64f_d
[...]
1         0.0013  libvolk.so.0.0.0         volk_32f_convert_64f_a_avx


Regards,
Andy


Andy, 

Excellent!

I've got a few other minor patches for some things; I'll put this in there too and test on my end as well.

Tom
 


