discuss-gnuradio

Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?


From: Andy Walls
Subject: Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?
Date: Sat, 28 Mar 2015 14:45:24 -0400

Hi Tom:


On Sat, 2015-03-28 at 11:12 -0700, Tom Rondeau wrote:
> On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls
> <address@hidden> wrote:

>         Can this memmove() be safely skipped
>         
>         
> https://github.com/gnuradio/gnuradio/blob/master/gr-qtgui/lib/time_sink_f_impl.cc#L627
[snip]
>         The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt wants
>         doubles for plotting and not floats. But it might also be able to be
>         deferred to the very end when the decision to plot is known for sure.
>         (But that's more surgery than I care to take on at the moment.)


> The for loop there is in case we're triggering with a delay set, so
> that sets d_start into the buffers. But we pass the vector of buffers
> to the plotting widget, which will start at index 0. There are a
> couple of things that could work here. We add an argument to
> TimeUpdateEvent that adds the index value. We could also only do the
> memcpy if d_start > 0.

Modifying TimeUpdateEvent() would work.  I don't know how much ripple
that would cause in forcing other code to change.
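
To make the TimeUpdateEvent idea concrete, here is a minimal sketch of
an update that carries the start index along with the buffers.  The
PlotUpdate struct and plot_begin() are hypothetical stand-ins I made up
for illustration; they are not the real TimeUpdateEvent or the plotting
widget API, and the field names only assume d_start and d_size keep
their current meaning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the payload of a plot update event.
// This is NOT the real TimeUpdateEvent from gr-qtgui.
struct PlotUpdate {
  std::vector<const double*> buffers; // one buffer per connection
  std::size_t start;                  // first sample to plot (d_start)
  std::size_t size;                   // number of samples to plot (d_size)
};

// Hypothetical consumer side: instead of assuming the data begins at
// index 0, the plotter begins reading at the start offset, so the
// producer never has to memmove() the data down to the front.
const double* plot_begin(const PlotUpdate& u, std::size_t conn)
{
  assert(conn < u.buffers.size());
  return u.buffers[conn] + u.start;
}
```

The cost is the ripple mentioned above: every consumer of the event has
to honor the offset instead of assuming index 0.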

BTW, I do use delayed trigger.
I did a hack to do the memmove() only if d_start != 0 *and* we were
going to plot in this call to work().  See below for results.



>  But thinking about the volk convert function, that's both copying the
> data from the input buffer into the internal buffer as well as
> performing the conversion. We can't just hold data in the input since
> we don't want to back up the data until we're ready to plot both with
> timing and with a full enough buffer -- it's just sampling a section
> at a time and drops everything in between.

Right.

>  That part could be converted into a memcpy instead of the volk
> convert. Then, when we're ready to plot, we call the volk convert that
> also does the move from d_start to 0, so it combines those two
> elements.

Yeah, that's the surgery part. :)  It would require adding a new set of
buffers to hold the floats, and then converting them when a
determination to plot was made.

This also affects the memmove() of the tail for the trigger delay.  It
would operate on the new set of float buffers (vs the buffers holding
doubles).
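
A rough sketch of that deferred conversion, under some loud
assumptions: DeferredConvertBuffer and its method names are
hypothetical, it handles a single connection, and the plain loop stands
in for the volk convert call so the example compiles without VOLK.  It
is not the time_sink_f_impl code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of gr-qtgui): keeps incoming samples
// as floats and converts to doubles only when a plot will happen.
class DeferredConvertBuffer
{
public:
  explicit DeferredConvertBuffer(std::size_t capacity)
    : d_fbuf(capacity), d_dbuf(capacity) {}

  // Cheap path, run on every work() call: raw float copy, no conversion.
  void store(const float* in, std::size_t index, std::size_t n)
  {
    assert(index + n <= d_fbuf.size());
    std::memcpy(d_fbuf.data() + index, in, n * sizeof(float));
  }

  // Expensive path, run only once the decision to plot has been made.
  // Converts `size` floats starting at `start` into doubles starting at
  // index 0, so the conversion also does the job of the memmove().
  // The loop is a stand-in for the volk float-to-double convert.
  const double* convert_for_plot(std::size_t start, std::size_t size)
  {
    assert(start + size <= d_fbuf.size());
    for (std::size_t i = 0; i < size; ++i) {
      d_dbuf[i] = static_cast<double>(d_fbuf[start + i]);
    }
    return d_dbuf.data();
  }

private:
  std::vector<float> d_fbuf;  // samples as received
  std::vector<double> d_dbuf; // what the plotter sees, data at index 0
};
```

The trade-off is memory for CPU: one extra float buffer per
connection, in exchange for skipping the conversion on every work()
call that does not end in a plot.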

> Thoughts on those proposals?


> 
> Tom


I just did this quick hack and tested it:

        --- a/gr-qtgui/lib/time_sink_f_impl.cc
        +++ b/gr-qtgui/lib/time_sink_f_impl.cc
        @@ -622,13 +622,15 @@ namespace gr {
         
               // If we've have a trigger and a full d_size of items in the buffers, plo
               if((d_triggered) && (d_index == d_end)) {
        -        // Copy data to be plotted to start of buffers.
        -        for(n = 0; n < d_nconnections; n++) {
        -          memmove(d_buffers[n], &d_buffers[n][d_start], d_size*sizeof(double));
        -        }
        -
                 // Plot if we are able to update
                 if(gr::high_res_timer_now() - d_last_time > d_update_time) {
        +          if(d_start != 0) {
        +            // Copy data to be plotted to start of buffers.
        +            for(n = 0; n < d_nconnections; n++) {
        +              memmove(d_buffers[n],
        +                      &d_buffers[n][d_start], d_size*sizeof(double));
        +            }
        +          }
                   d_last_time = gr::high_res_timer_now();


        
I got about 2-5% of my CPU back and memmove() fell into the "I don't
care" part of the oprofile stats. :)
Everything seemed ok in the GUI, but my input causes triggers to happen
quite often, so I haven't tested infrequent trigger cases.


CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
78158    39.0737  libvolk.so.0.0.0         volk_32f_convert_64f_u_avx
22777    11.3870  no-vmlinux               /no-vmlinux
13972     6.9851  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
7781      3.8900  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_norm(int, std::vector<void const*, std::allocator<void const*> >)
7236      3.6175  libpthread-2.18.so       pthread_mutex_lock
6163      3.0811  libgnuradio-runtime-3.7.7git.so.0.0.0 boost::detail::sp_counted_base::release()
5942      2.9706  libpthread-2.18.so       pthread_mutex_unlock
4947      2.4732  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_executor::run_one_iteration()
3826      1.9127  libgnuradio-runtime-3.7.7git.so.0.0.0 gr::block_detail::input(unsigned int)
3555      1.7773  libstdc++.so.6.0.19      /usr/lib64/libstdc++.so.6.0.19
3206      1.6028  libc-2.18.so             __memmove_ssse3_back



Regards,
Andy



