Re: [Discuss-gnuradio] fast parallel filtering


From:	Andy Walls
Subject:	Re: [Discuss-gnuradio] fast parallel filtering
Date:	Mon, 20 Mar 2017 21:30:29 -0400
Hi Dirk:

On Mon, Mar 20, 2017 at 1:04 PM, Dirk Gorissen <address@hidden> wrote:
> Hi Andy,
>
> I have been experimenting with the flowgraphs, tried with some live
> data and it all works as expected. Just had to switch the threshold to
> 2 orders of magnitude smaller when using an airspy.

Yeah, that's expected really.
You might want to write a block that implements a (F)CME algorithm in the
time domain, to periodically determine an adapted threshold between correlator
noise output vs correlator signal output.  I mention FCME and CME because
they have a well defined Pfa (probability of false alarm) input parameter.

Or you could just grab 3 seconds of correlator output every 1.5 seconds
and use CME, or your favorite statistical measure or distance metric for
outlier detection, to find correlator output signal peaks.
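
As a rough illustration of the thresholding idea, here is a minimal sketch of a forward consecutive mean excision (FCME) pass over a buffer of correlator output power samples. It assumes the noise-only power samples are exponentially distributed (magnitude squared of complex Gaussian noise), in which case the scaling factor is -ln(Pfa). The function name, the `initial_fraction` parameter, and the sample data are hypothetical, not taken from the attached flowgraph.

```python
import math

def fcme_threshold(power_samples, pfa=1e-3, initial_fraction=0.1):
    """Estimate a detection threshold from power samples using FCME.

    Assumes noise-only samples are exponentially distributed, so the
    threshold factor is -ln(pfa). initial_fraction is the share of the
    smallest samples assumed to be noise at the start.
    """
    ordered = sorted(power_samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    factor = -math.log(pfa)
    k = max(1, int(n * initial_fraction))
    noise_sum = sum(ordered[:k])
    # Grow the "noise" set one sample at a time until the next sample
    # exceeds the threshold implied by the current noise mean.
    while k < n and ordered[k] <= factor * noise_sum / k:
        noise_sum += ordered[k]
        k += 1
    return factor * noise_sum / k

# Hypothetical buffer: flat noise floor with three strong correlator peaks.
buffer_power = [1.0] * 100 + [100.0] * 3
threshold = fcme_threshold(buffer_power, pfa=1e-3)
peaks = [p for p in buffer_power if p > threshold]
```

Because the threshold is derived from the data in each buffer, it adapts when the front end changes (for example, when switching to an airspy), instead of needing a hand-tuned constant.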

> So thank you again for your time on this, I will ensure to give credit
> when the final thing is tested & operational :)

Thank you.

> Some comments/questions inline below:
>
>>To get faster lock in, you may want your frequency range of interest to
>>be somewhere in between Fs/4 and Fs/2; and not near DC.
>
> Actually my channel is 2khz wide, not 10khz so that does make life a
> bit easier. So in this case I should be decimating Fs down to
> something like 4k say. Correct?

Yes.  See the attached flowgraph, version 3, where I knock it down to
4ksps and have the filter starting to roll off at +/- 1kHz


> As you say this should also let me lower the pll bandwidth. Though I
> haven't quite grasped the intuition behind this. Experimenting a bit
> it seems a bit smaller than you set is better, definitely not larger.

Well yes, but too small is bad as well.


> How did you pick your ranges, they don't align with what the docs are
> saying ( around pi/200 - 2pi/100).

Empirically.  :)  Mostly...

So the theory behind what the parameters mean is a lecture on second order
control systems, which I will spare you.  (Also GNU Radio is funny in that
the user specifies 2nd order *analog* control system parameters for
poles in the complex s-plane [Laplace transform plane] and then
approximately warps them over to the z-plane [Z transform plane]
into poles for a discrete 2nd order control system - it's kind of silly really.)

The short story is GNU Radio picks the damping factor of 1.0/sqrt(2.0)
for a maximally flat second order PLL filter response, and then the user
sets loop_bw (aka. omega_n_T) to set how wide the PLL loop filter response
is.

The largest that loop_bw could really be is pi/(1.0/sqrt(2)), and that would
be a wide filter spanning +/- the Nyquist rate of the phase detector error
signal, which is useless.  So you need a smaller number.

Smaller loop_bw values result in a more sluggish PLL.
Larger loop_bw values result in a more reactive PLL.

For rapid locking on to an intermittent pulse that pops up with random
initial phase (compared to the PLL's currently unlocked wandering phase),
we want a more reactive PLL so a larger loop bandwidth.

If the PLL is too reactive it won't lock.
If the PLL is too slow, the pulse will be gone before it gets into lock.
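
The trade-off between a sluggish and a reactive loop can be tried out with a toy second-order PLL in plain Python. This is only a sketch: the gain formulas follow what I understand GNU Radio's control loop to use for mapping loop_bw and damping to its proportional and integral gains, and the test tone, sample counts, and helper names are hypothetical.

```python
import cmath
import math

def loop_gains(loop_bw, damping=1.0 / math.sqrt(2.0)):
    """Map loop_bw and damping to proportional (alpha) and integral (beta) gains."""
    denom = 1.0 + 2.0 * damping * loop_bw + loop_bw * loop_bw
    alpha = 4.0 * damping * loop_bw / denom
    beta = 4.0 * loop_bw * loop_bw / denom
    return alpha, beta

def track_tone(samples, loop_bw):
    """Run a toy PLL; return the frequency estimate (rad/sample) after each sample."""
    alpha, beta = loop_gains(loop_bw)
    phase, freq = 0.0, 0.0
    history = []
    for s in samples:
        ref = cmath.exp(1j * phase)                # PLL reference output
        error = cmath.phase(s * ref.conjugate())   # phase detector
        freq += beta * error                       # integral path
        phase = (phase + freq + alpha * error) % (2.0 * math.pi)
        history.append(freq)
    return history

def samples_to_lock(history, true_freq, tol=0.01):
    """First index after which the estimate stays within tol, or None."""
    lock_index = None
    for i, f in enumerate(history):
        if abs(f - true_freq) > tol:
            lock_index = None
        elif lock_index is None:
            lock_index = i
    return lock_index

tone_freq = 0.2  # rad/sample, hypothetical offset of a short pulse
pulse = [cmath.exp(1j * tone_freq * n) for n in range(400)]
fast = track_tone(pulse, loop_bw=0.1)    # reactive loop
slow = track_tone(pulse, loop_bw=0.01)   # sluggish loop
fast_lock = samples_to_lock(fast, tone_freq)
slow_lock = samples_to_lock(slow, tone_freq)
```

Comparing `fast_lock` and `slow_lock` for a pulse of a given length gives a feel for how small loop_bw can go before the pulse ends ahead of lock. This noiseless sketch does not show the opposite failure, where a too-wide loop chases noise.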


>
>>b. Once you find some pulses, extract what frequency they were at from
>>the PLL's loop filter state at the time it was locked on the pulse.  You
>>can then use this exact frequency information for a dedicated
>>correlation filter to pull even weaker pulses out of the noise, and
>>maintain track.
>
> Yes this is what I had in mind as well. I looked around the carrier
> block docs, source code, and wider gr wiki but couldn't find how I can
> get my hands on that internal state?

It's not easy.  From python, you can call the get_frequency() method of
the block and get it asynchronously, but the units are funny (and it might
even be the period vs the frequency, I have to take a harder look).

An easier thing to do is to take the properly scaled d(phi)/dt of the
PLL's locked output and use that as an instantaneous frequency
value that is sample synchronous with the main sample stream.
That number is clean for strong pulses, and a meandering mess for
weak pulses.
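
A minimal sketch of that d(phi)/dt estimate, using the phase difference between consecutive complex samples scaled to Hz. The function name and the 4 ksps / 500 Hz test values are illustrative assumptions.

```python
import cmath
import math

def instantaneous_freq_hz(samples, sample_rate):
    """Per-sample frequency estimate from consecutive phase differences."""
    out = []
    for prev, cur in zip(samples, samples[1:]):
        dphi = cmath.phase(cur * prev.conjugate())  # radians per sample
        out.append(dphi * sample_rate / (2.0 * math.pi))
    return out

fs = 4000.0   # samples per second (assumed post-decimation rate)
f0 = 500.0    # hypothetical locked tone frequency in Hz
ref_out = [cmath.exp(2j * math.pi * f0 * n / fs) for n in range(64)]
est = instantaneous_freq_hz(ref_out, fs)
```

The output has one fewer element than the input and stays aligned with the sample stream, so it can be gated by the pulse detector to read the frequency only while the PLL is locked.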

See the lowest, disabled branch of the attached flowgraph.
I converted the Carrier Tracking PLL to a Reference Out PLL
block followed by a Multiply Conjugate.  It is functionally
identical, but now we have the PLL's locked output available.

>>I wasn't intending to tweak the flowgraph, but today I decided to try
>>with a correlation filter with -5 kHz to +5 kHz chirp taps.  That
>>method turned out to be inferior to the PLL, so I didn't leave it in
>>the flowgraph.
>
> That's good to know, that's what I was doing originally.
>
>>My gut feeling is that to detect these weak pulses
>>in the original file, you're going to have to tolerate a fairly high
>>false alarm rate, if you want to detect the pulses.
>
> Agreed. I would probably be somewhat permissive initially and then do
> some further filtering in a post processing step.
>
>>Doing things to not allow noise into the system in the first place,
>>such as a VHF bandpass filter before the SDR unit,
>
> Indeed. Actually for the second dataset there was a preamp & filter in place
>
>>and proper adjustment of the SDR's LNA and IF gains for best noise performance
>>would help.
>
> Yes, something I could optimise a bit more. My understanding here
> though is that there isn't much of a science, just manual fiddling of
> the lna/if/mixer gains in sdr# while listening to the signal of
> interest (?)

No, it's actually fairly systematic.
The theory behind reducing Noise Figure is that your receiver
system's noise figure is dominated by the losses before the first gain stage:
https://en.wikipedia.org/wiki/Friis_formulas_for_noise
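
To make the Friis formula concrete, here is a small sketch that computes the cascaded noise figure of a receive chain. The stage values (a 1 dB NF / 20 dB gain LNA, a 10 dB NF back end, a 3 dB cable loss) are made-up numbers for illustration, not measurements of any particular SDR.

```python
import math

def db_to_linear(db):
    return 10.0 ** (db / 10.0)

def cascade_noise_figure_db(stages):
    """Friis formula. stages: list of (noise_figure_db, gain_db) in signal order."""
    total_factor = 1.0
    gain_so_far = 1.0
    for nf_db, gain_db in stages:
        # Each stage's excess noise is divided by the gain ahead of it.
        total_factor += (db_to_linear(nf_db) - 1.0) / gain_so_far
        gain_so_far *= db_to_linear(gain_db)
    return 10.0 * math.log10(total_factor)

lna = (1.0, 20.0)         # hypothetical low-noise amplifier
back_end = (10.0, 10.0)   # hypothetical mixer/IF chain
cable = (3.0, -3.0)       # passive loss: NF equals the loss

nf_lna_first = cascade_noise_figure_db([lna, back_end])
nf_loss_first = cascade_noise_figure_db([cable, lna, back_end])
```

With these numbers the loss ahead of the LNA adds its full 3 dB to the system noise figure, while the noisy back end contributes only a fraction of a dB once it sits behind 20 dB of gain.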

So you want as much gain as possible up front, but you want
to stay just below the level of clipping.  With your situation,
with no information on the pulse tones, clipping is actually OK.

Manually you would initially set IF gain very low, and crank up
LNA gain, so that you don't clip when you receive your signal of interest.
Then you would turn up the IF gain just enough to use the full dynamic
range of the ADC that follows it.

The Rafael R820T2 of the airspy helps you out quite a bit.  The LNA and
Mixer output both have power detection circuits that are used in an
automatic gain control loop in the chip (if enabled).

So it's up to you to set the IF gain though.  You may just want to
try cranking it up, as I don't think clipping will matter for you and you
need the sensitivity.



> Final question is a more general one. I have a fairly tight
> computational budget and need to drop down the sample rate quite a
> bit. I'm assuming the most efficient way to do this is decimate &
> filter over 3 stages. In each stage the decimation factor reduces and
> the filter order increases.

Well, I usually use 3 or 4 stages.  And I use the smaller decimation
factors first, with filters with very lax transition bands.  (See the attached
flowgraph, where I've mucked with filters more.)

FIR filtering costs O(n) multiply-accumulates per output sample, where n
is the number of filter taps.  FIR filters that have extremely sharp
transition bands or higher rejection have a larger number of coefficients,
and for a fixed transition width in Hz, n grows in proportion to the
sample rate, so the cost per second of a sharp filter run at the full
input rate grows roughly as the square of that rate.  There is a savings
with decimation, as only 1 out of M filtering operations needs to be
performed.  So usually it's best to try and reduce filter tap count instead
of upping decimation at first.  That quadratic growth is why FIR filtering
in stages when decimating is a computational win: reducing filter tap count
matters a lot.
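
A back-of-the-envelope comparison of single-stage versus staged decimation might look like the sketch below. It assumes the common rule of thumb that a FIR low-pass needs roughly (attenuation_dB / 22) * Fs / transition_width taps. The 2.4 Msps input rate, the 4-5-5-6 factor split, and the 60 dB rejection are hypothetical values chosen only to reach 4 ksps with a 1 kHz passband edge.

```python
import math

def estimate_taps(fs, transition_hz, atten_db=60.0):
    """Rule-of-thumb FIR length: (atten_dB / 22) * fs / transition width."""
    return math.ceil(atten_db / 22.0 * fs / transition_hz)

def cascade_macs_per_second(fs_in, decimations, passband_hz, final_stop_hz,
                            atten_db=60.0):
    """Total multiply-accumulates per second for a chain of decimating FIRs."""
    total = 0.0
    fs = fs_in
    for i, m in enumerate(decimations):
        fs_out = fs / m
        is_last = i == len(decimations) - 1
        # Early stages only need to protect the passband from aliasing,
        # so their stopband can start at fs_out - passband (lax transition).
        stop_hz = final_stop_hz if is_last else fs_out - passband_hz
        taps = estimate_taps(fs, stop_hz - passband_hz, atten_db)
        total += taps * fs_out  # a decimating FIR computes only kept outputs
        fs = fs_out
    return total

single = cascade_macs_per_second(2.4e6, [600], 1000.0, 2000.0)
staged = cascade_macs_per_second(2.4e6, [4, 5, 5, 6], 1000.0, 2000.0)
```

In this sketch the only sharp filter is the last one, and it runs at a low input rate, which is where the saving over the single-stage design comes from.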

Running the attached flowgraph at RT priority I have this kind of CPU usage:

address@hidden grcs]$ ps -eLo pcpu,pid,tid,cls,rtprio,pcpu,comm | grep 4845 | sort -n
 0.0  4845  4848  TS      -  0.0 python2
 0.0  4845  4849  TS      -  0.0 python2
 0.0  4845  4850  TS      -  0.0 python2
 0.0  4845  4851  TS      -  0.0 python2
 0.0  4845  4852  TS      -  0.0 python2
 0.0  4845  4853  TS      -  0.0 python2
 0.0  4845  4854  TS      -  0.0 python2
 0.0  4845  4855  TS      -  0.0 python2
 0.0  4845  4856  TS      -  0.0 python2
 0.0  4845  4857  TS      -  0.0 python2
 0.4  4845  4871  RR     29  0.4 float_to_compl1
 0.5  4845  4865  RR     29  0.5 multiply_const_
 0.5  4845  4868  RR     29  0.5 delay13
 0.5  4845  4873  RR     29  0.5 number_sink2
 0.6  4845  4869  RR     29  0.6 moving_average_
 0.6  4845  4870  RR     29  0.6 complex_to_mag1
 0.7  4845  4864  RR     29  0.7 delay12
 0.9  4845  4866  RR     29  0.9 delay14
 1.0  4845  4862  RR     29  1.0 fir_filter_ccf4
 1.2  4845  4858  RR     29  1.2 file_source11
 1.2  4845  4863  RR     29  1.2 pll_refout_cc16
 1.5  4845  4859  RR     29  1.5 throttle6
 1.5  4845  4872  RR     29  1.5 time_sink_c1
 1.6  4845  4861  RR     29  1.6 fir_filter_ccf5
 1.6  4845  4867  RR     29  1.6 multiply_conjug
 3.2  4845  4860  RR     29  3.2 freq_xlating_fi
19.8  4845  4845  RR     29 19.8 python2

Most of those blocks (delay, number_sink, time_sink, float_to_complex)
and the final python2 thread are chewing up CPU for the sake of GUI displays.
So about 14% CPU total.  This laptop is a 4 core
"Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz", so kind of wimpy.

Regards,
Andy


> Cheers
> Dirk
>
>
implant_pulse_detect_3.grc
Description: Binary data