From: West, Nathan
Subject: Re: [Discuss-gnuradio] Google Summer of Code 2014 applicant : Optimization with VOLK
Date: Tue, 25 Feb 2014 16:37:35 -0600

> > On Sun, 2/23/14, Abhishek Bhowmick <address@hidden> wrote:
> >
> >  Subject: [Discuss-gnuradio] Google Summer of Code 2014 applicant :
> >  Optimization with VOLK
> >  To: address@hidden
> >  Date: Sunday, February 23, 2014, 8:52 AM
> >
> >  Hello,
> >  I have completed a Bachelor's degree in Electrical Engineering from
> >  IIT Bombay, India and will be joining a masters program in Computer
> >  Science in August. For the summer, I am interested in participating
> >  GSoC 2014 and GNU Radio is an organization where my background fits
> >  nicely.
> >
> >  I went through the ideas page and was particularly interested in
> >  doing performance optimization with VOLK. After going through some
> >  online documentation about the library and the SDR'12 paper, I
> >  realised that following areas need work :
> >
> >  1. Profiling GNU radio code to identify new kernels and implement
> >  them for existing Intel SIMD extensions, also porting kernels to
> >  other ISA extensions.
> >
> >  2. Better testing of the effects of more complex scheduler logic on
> >  larger environments (beyond simple kernels)
> >
> >  3. Exploring extension of Volk to GPU ISAs, to leverage chips such
> >  as AMD Fusion (However, this seems to more research than software
> >  development)
> >
> >  According to the GSoC proposal, point (1) seems to be the
> >  expectation. Given this, I would like some advice on how to go ahead
> >  looking for potential ideas (and some feedback on feasibility of the
> >  other ideas as well)
> >
> >  My background : C++, Python, Signal Processing, Computer
> >  Architecture
> >
> >  Thanks,
> >  Abhishek Bhowmick


This is a great conversation, and I'll take the opportunity to plug
the upcoming VOLK working group call
(https://plus.google.com/u/1/events/ch3jrjcvp7mdiqelpismfieg3n0).
Bogdan, your results aren't particularly surprising, but the feedback
is really good to hear.

Back to GSoC:

Abhishek,

>Thanks for the pointers to gr-atsc and gr-80211. I have started
>looking there as a starting point. Are there similar modules which are
>undergoing volk speedup fixes? I am also trying to meet up with other
>people who have been using GNU radio to identify potential modules for
>acceleration. As you are now a mentor organization, I feel it's a good
>time for us to get into detailed discussions.

From the previous discussion it should be apparent that how algorithms
are implemented will make the biggest difference, and that the new
acceleration is primarily going to come from larger, more complex
kernels. At the end of the day it's going to be your proposal. So far
on the list of places to look we have:

* in-tree OFDM (contact Martin)
* gr-atsc (use Andrew Davis' fork)
* gr-dvbt
* gr-fecapi

For your proposal I would recommend looking at their code, then
getting in contact with the author(s) of those modules to ask about
their thoughts on accelerating blocks they have written. The reality
of this project is that we are accelerating some signal processing
algorithm and knowledge of that algorithm is useful for acceleration.
Whatever application you have interest in and/or knowledge of (fresh
out of a BS it's more likely to be interest) should guide your
proposal. If you know anything about error correcting codes then the
latter two would be good fits. OFDM frame detection probably has a
gentler learning curve since at the basic level you're looking at
convolution, and there are papers you can look for on more involved
algorithms. Other algorithms to look at might include AGC or
equalizers.
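
To make the "basic level" of frame detection concrete, here is a
minimal scalar sketch of a sliding correlation against a known
preamble. It is only an illustration, not code from the in-tree OFDM
blocks: the function name find_preamble and the interleaved I/Q float
layout are assumptions made for this example. A loop nest like this is
the kind of thing a larger SIMD kernel would replace.

```c
#include <stddef.h>

/* Hypothetical generic (scalar) kernel, for illustration only.
 * Slides a known preamble over the received samples and returns the lag
 * with the largest squared correlation magnitude.
 * Samples are interleaved I/Q floats: re0, im0, re1, im1, ...
 * rx_len and pre_len are counts of complex samples. */
size_t find_preamble(const float *rx, size_t rx_len,
                     const float *pre, size_t pre_len)
{
    size_t best_lag = 0;
    float best_mag2 = -1.0f;

    if (pre_len == 0 || rx_len < pre_len)
        return 0;

    for (size_t lag = 0; lag + pre_len <= rx_len; ++lag) {
        float acc_re = 0.0f, acc_im = 0.0f;

        for (size_t k = 0; k < pre_len; ++k) {
            float a = rx[2 * (lag + k)], b = rx[2 * (lag + k) + 1];
            float c = pre[2 * k],        d = pre[2 * k + 1];
            /* rx * conj(pre) = (a + jb)(c - jd) */
            acc_re += a * c + b * d;
            acc_im += b * c - a * d;
        }

        float mag2 = acc_re * acc_re + acc_im * acc_im;
        if (mag2 > best_mag2) {
            best_mag2 = mag2;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

The inner loop is a complex multiply-accumulate, which is why knowing
the algorithm matters: most of the run time sits in a few lines that
vectorize well.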

If you're interested in GPU programming don't forget to check out gr-gpu.

>
>>
>> At the moment the only mainstream ISA not being targeted is probably
>> AVX2, which has some nice features for the type of kernels we're
>> doing.  If you went that route you would likely need to add
>> protokernels to a pretty large number of kernels.
>>
>> Nathan
>
>This also seems to be promising, though I guess it would require me to
>come up to speed with AVX2 (which I would love to do). Could you
>please elaborate a little on the kind of beneficial features you have
>in mind? I am concerned that the job of adding proto-kernels might
>turn out to be mundane/tedious? Is that a valid concern?

Right, so as Martin mentioned, the answer is sort of relative. I
wouldn't go so far as to say it's mundane, especially if you have
little experience with using intrinsics and SIMD instructions. One
reason AVX isn't so prominently featured (I suspect) is that the
instructions are almost the same as the SSE instructions, just with
vectors twice as long, so that port actually is mundane. The AVX2/FMA
extensions introduce some new features to the amd64 instruction set.
The most obvious is that Intel and AMD look to have finally settled on
the same fused multiply-add implementation (there's also a
multiply-subtract that's good for complex numbers). That will likely
speed things up a bit, but I'm also looking forward to seeing gains
from the various gather loads that have been introduced. They let a
single load instruction gather vector elements that span pretty large
address ranges. VOLK won't be so interested in the large ranges
(except maybe in decimators), but gathers could be useful for loading
complex vectors. There are some other math functions we may be able to
leverage, but those are the two features that I think would be widely
applicable.
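
As a rough sketch of those two features, the C below shows a complex
multiply built on a fused multiply-add/subtract and a gather-based
deinterleave of complex samples. The function names cmul_32fc and
deinterleave_32fc are made up for this example and are not VOLK
kernels. The vector paths are only compiled when the compiler enables
the extensions (for instance with -mavx2 -mfma); otherwise the scalar
loops do all the work. Whether the gather actually beats ordinary
loads plus shuffles is something that would need measuring.

```c
#include <stddef.h>

#if defined(__AVX2__) && defined(__FMA__)
#include <immintrin.h>
#define SKETCH_HAVE_AVX2_FMA 1
#endif

/* Hypothetical kernel: out[i] = x[i] * y[i] for n complex floats stored
 * interleaved as re0, im0, re1, im1, ... */
void cmul_32fc(float *out, const float *x, const float *y, size_t n)
{
    size_t i = 0;

#ifdef SKETCH_HAVE_AVX2_FMA
    /* Vector path: 4 complex products (8 floats) per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m256 xv  = _mm256_loadu_ps(x + 2 * i);   /* a0 b0 a1 b1 ... */
        __m256 yv  = _mm256_loadu_ps(y + 2 * i);   /* c0 d0 c1 d1 ... */
        __m256 yre = _mm256_moveldup_ps(yv);       /* c0 c0 c1 c1 ... */
        __m256 yim = _mm256_movehdup_ps(yv);       /* d0 d0 d1 d1 ... */
        __m256 xsw = _mm256_permute_ps(xv, 0xB1);  /* b0 a0 b1 a1 ... */
        __m256 t   = _mm256_mul_ps(xsw, yim);      /* b0*d0 a0*d0 ... */
        /* Even lanes: a*c - b*d, odd lanes: b*c + a*d, in one fused op. */
        _mm256_storeu_ps(out + 2 * i, _mm256_fmaddsub_ps(xv, yre, t));
    }
#endif

    /* Scalar path; also handles the tail left by the vector path. */
    for (; i < n; ++i) {
        float a = x[2 * i], b = x[2 * i + 1];
        float c = y[2 * i], d = y[2 * i + 1];
        out[2 * i]     = a * c - b * d;  /* real part: multiply-subtract */
        out[2 * i + 1] = a * d + b * c;  /* imaginary part: multiply-add */
    }
}

/* Hypothetical kernel: split n interleaved complex floats into separate
 * real and imaginary arrays. */
void deinterleave_32fc(float *re, float *im, const float *in, size_t n)
{
    size_t i = 0;

#ifdef SKETCH_HAVE_AVX2_FMA
    /* Element offsets of 8 consecutive real parts. */
    const __m256i idx = _mm256_setr_epi32(0, 2, 4, 6, 8, 10, 12, 14);
    for (; i + 8 <= n; i += 8) {
        /* One gather pulls every other float; scale 4 is sizeof(float). */
        _mm256_storeu_ps(re + i, _mm256_i32gather_ps(in + 2 * i, idx, 4));
        _mm256_storeu_ps(im + i, _mm256_i32gather_ps(in + 2 * i + 1, idx, 4));
    }
#endif

    for (; i < n; ++i) {
        re[i] = in[2 * i];
        im[i] = in[2 * i + 1];
    }
}
```

Keeping a scalar loop next to each vector loop mirrors the idea of
having a generic implementation alongside ISA-specific ones, and it
gives you a reference to test the intrinsics against.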

In your proposal you should definitely include what ISAs you intend to
use, and if there are features specific to an instruction set then
point out why it's a good choice. This is mostly important for
choosing between SSE and friends, AVX, and AVX2/FMA. It would be good
to see plans that include NEON support for anything you'd add to amd64
platforms, but that's not a requirement.


Nathan


