Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?
From: Andy Walls
Subject: Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?
Date: Sat, 28 Mar 2015 15:50:20 -0400
On Sat, 2015-03-28 at 14:45 -0400, Andy Walls wrote:
> Hi Tom:
>
>
> On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
> > On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
> > <address@hidden> wrote:
>
> > Can this memmove() be safely skipped
> >
> >
> > https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
> [snip]
> > The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt
> > wants
> > doubles for plotting and not floats. But it might also be able
> > to be
> > deferred to the very end when the decision to plot is known
> > for sure.
> > (But that's more surgery than I care to take on at the
> > moment.)
>
>
> > But thinking about the volk convert function, that's both copying the
> > data from the input buffer into the internal buffer as well as
> > performing the conversion. We can't just hold data in the input since
> > we don't want to back up the data until we're ready to plot both with
> > timing and with a full enough buffer -- it's just sampling a section
> > at a time and drops everything in between.
>
> Right.
>
> > That part could be converted into a memcpy instead of the volk
> > convert. Then, when we're ready to plot, we call the volk convert that
> > also does the move from d_start to 0, so it combines those two
> > elements.
>
> Yeah, that's the surgery part. :) It would require adding a new set of
> buffers to hold float objects, and then convert them when a
> determination to plot was made.
>
> This also affects the memmove() of the tail for the trigger delay. It
> would operate on the new set of float buffers (vs the buffers holding
> doubles).
>
> > Thoughts on those proposals?
Your proposal for implementing memcpy() and deferring volk_*() to do the
conversion and "memmove" in one step is great! :)

I just implemented it, and the time_sink_f thread has gone from 41.5%
CPU down to 29.1% CPU in my tests. :) memcpy() now dominates the
thread, but that's to be expected.
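
For readers following along, here is a minimal sketch of the idea, not the actual patch: incoming floats are staged with a plain memcpy(), and the float-to-double conversion runs only once a plot is decided on, reading from the start offset so that it also does the "move to index 0". The struct, its members, and the method names are hypothetical and are not the real time_sink_f_impl code; the conversion loop is a stand-in for what would be a single volk_32f_convert_64f() call in the real block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical per-channel buffering; illustrative names only, not the
// actual time_sink_f_impl members.
struct deferred_plot_buffer {
    std::vector<float> raw;   // float staging area, filled on every work() call
    std::vector<double> plot; // doubles for Qwt, filled only when plotting
    std::size_t start = 0;    // first sample to plot (plays the role of d_start)
    std::size_t filled = 0;   // number of valid floats in raw

    deferred_plot_buffer(std::size_t capacity, std::size_t plot_size)
        : raw(capacity), plot(plot_size) {}

    // Hot path: stage incoming samples with memcpy(), no conversion.
    // Returns how many samples were actually stored.
    std::size_t append(const float* in, std::size_t n)
    {
        assert(in != nullptr || n == 0);
        const std::size_t room = raw.size() - filled;
        if (n > room)
            n = room;
        if (n > 0)
            std::memcpy(raw.data() + filled, in, n * sizeof(float));
        filled += n;
        return n;
    }

    // Cold path: run only once the decision to plot has been made.
    // Converting from raw[start] into plot[0] performs the conversion and
    // the shift to index 0 in a single pass.
    bool convert_for_plot()
    {
        if (start + plot.size() > filled)
            return false; // not enough samples staged yet
        const float* src = raw.data() + start;
        // Stand-in for one volk_32f_convert_64f(plot.data(), src, plot.size()).
        for (std::size_t i = 0; i < plot.size(); ++i)
            plot[i] = static_cast<double>(src[i]);
        start = 0;
        filled = 0;
        return true;
    }
};
```

The trade-off is the extra set of float buffers mentioned earlier in the thread: the hot path gets cheaper, at the cost of more memory and one more place where the trigger-delay tail has to be handled.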
With my initial hack:
> CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (No unit mask) count 100000
> samples  %        image name                             symbol name
> 78158    39.0737  libvolk.so.0.0.0                       volk_32f_convert_64f_u_avx
> 22777    11.3870  no-vmlinux                             /no-vmlinux
> 13972    6.9851   libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
> 7781     3.8900   libgnuradio-qtgui-3.7.7git.so.0.0.0    gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
> 7236     3.6175   libpthread-2.18.so                     pthread_mutex_lock
> 6163     3.0811   libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
> 5942     2.9706   libpthread-2.18.so                     pthread_mutex_unlock
> 4947     2.4732   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
> 3826     1.9127   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
> 3555     1.7773   libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
> 3206     1.6028   libc-2.18.so                           __memmove_ssse3_back
> [...]
With my implementation of your suggestion:
CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask
of 0x00 (No unit mask) count 90000
samples  %        image name                             symbol name
27595    35.6051  libc-2.18.so                           __memcpy_sse2_unaligned
12225    15.7736  no-vmlinux                             /no-vmlinux
4051     5.2269   libpthread-2.18.so                     pthread_mutex_lock
3739     4.8243   libgnuradio-runtime-3.7.7git.so.0.0.0  boost::detail::sp_counted_base::release()
3362     4.3379   libpthread-2.18.so                     pthread_mutex_unlock
2876     3.7108   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_executor::run_one_iteration()
2364     3.0502   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::block_detail::input(unsigned int)
2091     2.6980   libstdc++.so.6.0.19                    /usr/lib64/libstdc++.so.6.0.19
1388     1.7909   libgnuradio-runtime-3.7.7git.so.0.0.0  gr::tpb_detail::notify_upstream(gr::block_detail*)
1138     1.4683   libc-2.18.so                           __memmove_ssse3_back
[...]
2        0.0026   libvolk.so.0.0.0                       __volk_32f_convert_64f_d
[...]
1        0.0013   libvolk.so.0.0.0                       volk_32f_convert_64f_a_avx
Regards,
Andy