After make test failed for this module, I decided to poke
around to see if there was an easy fix. I wrote a script that simply
runs the test over and over until it segfaults, then exits once the
core file has been created.
address@hidden:~/src/gnuradio/build/gr-digital/python/digital$ ./runtests.sh
Using Volk machine: avx_64_mmx
Segmentation fault (core dumped)
address@hidden:~/src/gnuradio/build/gr-digital/python/digital$ gdb /usr/bin/python2.7 core
(gdb) bt
#0 0x00007fe8f627fb17 in volk_32fc_32f_dot_prod_32fc_a_avx ()
from /home/kelly/src/gnuradio/build/volk/lib/libvolk.so.0.0.0
#1 0x00007fe8f52dd25f in gr::filter::kernel::fir_filter_ccf::filter(std::complex<float> const*) ()
from /home/kelly/src/gnuradio/build/gr-filter/lib/libgnuradio-filter-3.8git.so.0.0.0
#2 0x00007fe8f143c45b in gr::digital::pfb_clock_sync_ccf_impl::general_work(int,
std::vector<int, std::allocator<int> >&,
std::vector<void const*, std::allocator<void const*> >&,
std::vector<void*, std::allocator<void*> >&) ()
from /home/kelly/src/gnuradio/build/gr-digital/lib/libgnuradio-digital-3.8git.so.0.0.0
#3 0x00007fe8f653809e in gr::block_executor::run_one_iteration() ()
from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#4 0x00007fe8f6573622 in gr::tpb_thread_body::tpb_thread_body(boost::shared_ptr<gr::block>, int) ()
from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#5 0x00007fe8f6565ea1 in boost::detail::function::void_function_obj_invoker0<gr::thread::thread_body_wrapper<gr::tpb_container>, void>::invoke(boost::detail::function::function_buffer&) ()
from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#6 0x00007fe8f6526610 in boost::detail::thread_data<boost::function0<void> >::run() ()
from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#7 0x00007fe8f9adc94a in ?? ()
from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.53.0
#8 0x00007fe8fc8a3f6e in start_thread (arg=0x7fe8e2ffd700)
at pthread_create.c:311
#9 0x00007fe8fc5ce9cd in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Of course, I had to recompile it with debugging info to glean anything
useful from the stack trace. So I did that, and I traced the crash to
this line:
c0Val = _mm256_mul_ps(a0Val, b0Val);
I can't dump the values in a0Val or b0Val, though, because they're
intermediate values that the compiler optimizes away in the optimized
kernel code. I tried stepping through the assembly instructions, but I'm
not familiar with the SSE and AVX extensions. Heck, I'm not even familiar
with the x86_64 instruction set, so I have a steep learning curve ahead
of me there.
Is it possible to just dump the values in these __m256 data types to a
file so I can debug it that way? If that's not easy to do, then I'm
willing to learn what I have to about the instruction set so I can debug
this thing. But I would sure appreciate some help if anyone has some
advice to offer.
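To make the question concrete, here is a minimal sketch of the kind of dump I mean. The helper names (format_m256, dump_m256, format_floats8) are hypothetical and not part of VOLK. It assumes GCC or Clang on x86_64 and a CPU with AVX; the target("avx") attribute is there only so the snippet builds without -mavx on the command line.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdio.h>

/* Hypothetical debug helper (not part of VOLK): format the eight float
 * lanes of an __m256 as one line of text. Returns what snprintf returns. */
__attribute__((target("avx")))
static inline int format_m256(char *buf, size_t len, const char *label, __m256 v)
{
    float lane[8];

    assert(buf != NULL && label != NULL);
    /* Unaligned store, so lane[] needs no special alignment. */
    _mm256_storeu_ps(lane, v);
    return snprintf(buf, len, "%s: %g %g %g %g %g %g %g %g\n", label,
                    lane[0], lane[1], lane[2], lane[3],
                    lane[4], lane[5], lane[6], lane[7]);
}

/* Write the formatted line to an already-open file (stderr, a log, ...).
 * Returns the number of characters formatted, or -1 on error. */
__attribute__((target("avx")))
static inline int dump_m256(FILE *out, const char *label, __m256 v)
{
    char line[256];
    int n = format_m256(line, sizeof line, label, v);

    if (n < 0 || fputs(line, out) == EOF)
        return -1;
    return n;
}

/* Same idea starting from memory: load eight floats with an unaligned
 * load, then format them. */
__attribute__((target("avx")))
static inline int format_floats8(char *buf, size_t len, const char *label,
                                 const float *src)
{
    assert(src != NULL);
    return format_m256(buf, len, label, _mm256_loadu_ps(src));
}
```

In the kernel this would presumably be called as dump_m256(stderr, "a0Val", a0Val); just before the multiply. I would expect the extra use of the value to force the compiler to materialize it, but adding the call may also change the code that gets generated around the crash.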
Software version:
I rebased to the latest version of the next branch last night before I went to bed at around 1:30 am CDT.
Operating System:
address@hidden:~/src/gnuradio/volk/kernels/volk$ uname -a
Linux octs2 3.11.0-17-generic #31-Ubuntu SMP Mon Feb 3 21:52:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
It's Ubuntu 13.10
Hardware: ASUS X750J
Intel Quad Core i7 4700HQ 2.4GHz
cpuinfo:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz
stepping : 3
microcode : 0x8
cpu MHz : 2401.000
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse
tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts
acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2
ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase
tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips : 4789.27
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management: