What's up, Jorge!!
No sooner do I talk about interrupts than somebody runs into problems
with them!! Maybe :)
This seems to be a hardware communication problem. Did you read my last
two mails? They may have some information related to this problem.
Did you also see the DMAR mapping warning during boot? It may have
something to do with this. The fact that a reboot usually solves it makes
me think it could be an interrupt conflict.
Your logs say: "the module/driver is sending commands to the hardware, and
it didn't respond as we expected"
What can cause this? DMAR mess!!! And also hardware problems, like loss
of power, hardware changes that lead to interrupt conflicts, like
plugging in an e-SATA device, or a faulty Atheros chip in the worst case.
You can _force_ a module unload (and you can also
_unload_dependent_modules_ first). Of course, you have to stop any software
using this hardware too. Maybe something like this can make your day:
$ sudo service network-manager stop (stop software)
$ sudo ifconfig whatever down (bring the interface down)
$ lsmod | grep ath (see dependent modules in the "Used by" column)
$ sudo modprobe -r whatever (unload dependent modules first)
$ sudo modprobe -r ath (unload module; "rmmod -f" to force it)
$ sudo modprobe whatever (reload the driver; ath is loaded with it)
and test!
Give the commands some time to complete, and keep an eye
on syslog/dmesg to see the results.
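If it helps, here is a small sketch of a filter for watching the log. It
only assumes the messages look like the ones in your mail; the function
name ath_errors is something I made up, not an existing tool:

```shell
# Hypothetical helper: read kernel log text on stdin and keep only the
# ath failure lines and "nobody cared" IRQ complaints like the ones quoted.
ath_errors() {
    grep -Ei 'ath: phy[0-9]+: .*(fail|unable|could not)|irq [0-9]+: nobody cared'
}

# Usage (not run here):
#   dmesg | ath_errors
#   sudo journalctl -k -f | ath_errors    # follow new kernel messages
```

Leave the second form running in a terminal while you reload the modules,
so you can see right away whether the errors come back.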
Given that the problem flaps (comes and goes), I would also check power and
heat (maybe replace the charger with a travel one if you have one, and put
the laptop in a cool place), and see whether the time to failure changes.
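To compare the heat against the time of the failures, something like this
could log the temperature with a timestamp. It is only a sketch: it assumes
your kernel exposes thermal zones under /sys/class/thermal (values there
are in millidegrees Celsius), and log_temps is a name I made up:

```shell
# Hypothetical helper: print one timestamped line per thermal zone.
# $1 is the directory holding thermal_zone*/temp (default: /sys/class/thermal).
log_temps() {
    base=${1:-/sys/class/thermal}
    for f in "$base"/thermal_zone*/temp; do
        [ -r "$f" ] || continue              # skip if no zone is exposed
        zone=$(basename "$(dirname "$f")")
        # sysfs reports millidegrees Celsius, so divide by 1000
        awk -v z="$zone" -v t="$(date '+%F %T')" \
            '{ printf "%s %s %.1f C\n", t, z, $1 / 1000 }' "$f"
    done
}

# Usage (not run here): one sample per minute appended to a file
#   while true; do log_temps >> temps.log; sleep 60; done
```

Then you can compare the timestamps in temps.log with the ones in the
kernel log when the wifi drops.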
There is also an earlier warning about your sound card that may be
related:
snd_hda_intel 0000:00:1b.0: IRQ timing workaround is activated for card #0
Is your sound card working when this error happens? If not, we may have
found the hardware interrupt conflict. They may be using the same
interrupt, and when sound gets "tweaked" the wifi goes crazy because of the
delay in communications.
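To see which devices really share an interrupt line you can look at
/proc/interrupts, where shared handlers appear as a comma-separated list at
the end of the line. Your trace already lists usb_hcd_irq and ath_isr as
handlers on IRQ 17, so that line is the one to look at. A minimal sketch
(shared_irqs is a made-up name):

```shell
# Hypothetical helper: read /proc/interrupts-style text on stdin and keep
# only the numbered IRQ lines that list more than one handler.
shared_irqs() {
    awk '$1 ~ /^[0-9]+:$/ && /, /'
}

# Usage (not run here):
#   shared_irqs < /proc/interrupts
```

If the sound card and the wifi show up on the same line, that would support
the conflict idea; if not, we can rule it out.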
Good luck!! I'm looking forward to hearing how it goes! :)
Regards,
D
On Thu, 12-11-2015 at 14:44 -0600, Jorge Araya Navarro wrote:
Hello!
I bought my Libreboot T60 from Gluglug in December of last year, and I'm very
happy with a machine that works with 100% Free Software!
For a couple of months now something strange has been happening to my wifi
card. I first thought the issue was caused by a kernel update, but I was
wrong. What happens is that at a random moment, every several weeks or so,
the wifi drops the connection and never re-establishes it until I reboot,
and after that the issue sometimes continues, with the wifi card dropping
the connection once every 60 seconds.
Yesterday this happened again, so I decided to fire up Emacs and take some
notes and output with org-mode. The first interesting thing is this from
`dmesg`:
--8<---------------cut here---------------start------------->8---
nov 12 12:43:43 abril.charola kernel: snd_hda_intel 0000:00:1b.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
nov 12 12:43:44 abril.charola kernel: ath: phy0: Failed to stop TX DMA, queues=0x00a!
nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset failed
nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset channel, reset status -22
nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset failed
nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset channel, reset status -22
nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset failed
nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset channel, reset status -22
nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset failed
nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset channel, reset status -22
nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:45 abril.charola kernel: ath: phy0: Chip reset failed
nov 12 12:43:45 abril.charola kernel: ath: phy0: Unable to reset channel, reset status -22
nov 12 12:43:45 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:45 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:45 abril.charola kernel: ath: phy0: Chip reset failed
nov 12 12:43:45 abril.charola kernel: ath: phy0: Unable to reset channel, reset status -22
nov 12 12:43:45 abril.charola NetworkManager[445]: <warn> Connection disconnected (reason -4)
nov 12 12:43:45 abril.charola NetworkManager[445]: <info> (wlp2s0): supplicant interface state: completed -> disconnected
nov 12 12:43:45 abril.charola kernel: cfg80211: Exceeded CRDA call max attempts. Not calling CRDA
nov 12 12:43:45 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
nov 12 12:43:45 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
nov 12 12:43:45 abril.charola NetworkManager[445]: <info> (wlp2s0): supplicant interface state: disconnected -> scanning
--8<---------------cut here---------------end--------------->8---
As I don't understand anything in such error messages, my guess is that it
is something serious. After trying to unload the modules related to my wifi
driver (ath, which is impossible because other modules that require it are
in use), typing `ifconfig wlp2s0 down` and whatnot, I just gave up and
restarted my laptop. At some point journald registered something
interesting:
--8<---------------cut here---------------start------------->8---
nov 12 12:44:00 abril.charola kernel: irq 17: nobody cared (try booting with the "irqpoll" option)
nov 12 12:44:00 abril.charola kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.11-gnu-1-lts #1
nov 12 12:44:00 abril.charola kernel: Hardware name: LENOVO 1951F8G/1951F8G, BIOS CBET4000 79ETE7WW (2.27 ) 05/18/2015
nov 12 12:44:00 abril.charola kernel: c1609907 4a9301f9 00000000 f5035f54 c14a49ec f53d0e9c f5035f74 c10abbac
nov 12 12:44:00 abril.charola kernel: c1575cc0 00000011 f5035f70 f85611db f53d0e40 00000000 f5035f98 c10abf22
nov 12 12:44:00 abril.charola kernel: c1329d4a 0003ab5e 00000000 4a9301f9 f53d0e40 c1676e00 00000000 f5035fd4
nov 12 12:44:00 abril.charola kernel: Call Trace:
nov 12 12:44:00 abril.charola kernel: [<c14a49ec>] dump_stack+0x41/0x52
nov 12 12:44:00 abril.charola kernel: [<c10abbac>] __report_bad_irq+0x2c/0xd0
nov 12 12:44:00 abril.charola kernel: [<f85611db>] ? ath9k_hw_intrpend+0x5b/0x70 [ath9k_hw]
nov 12 12:44:00 abril.charola kernel: [<c10abf22>] note_interrupt+0x212/0x250
nov 12 12:44:00 abril.charola kernel: [<c1329d4a>] ? add_interrupt_randomness+0x16a/0x1a0
nov 12 12:44:00 abril.charola kernel: [<c10a99a2>] handle_irq_event_percpu+0x122/0x190
nov 12 12:44:00 abril.charola kernel: [<c10a99a2>] ? handle_irq_event_percpu+0x122/0x190
nov 12 12:44:00 abril.charola kernel: [<c10a9a3a>] handle_irq_event+0x2a/0x50
nov 12 12:44:00 abril.charola kernel: [<c10ac520>] ? handle_edge_irq+0xe0/0xe0
nov 12 12:44:00 abril.charola kernel: [<c10ac589>] handle_fasteoi_irq+0x69/0x100
nov 12 12:44:00 abril.charola kernel: [<c1004906>] handle_irq+0x56/0x90
nov 12 12:44:00 abril.charola kernel: <IRQ> [<c14aa60c>] do_IRQ+0x3c/0xd0
nov 12 12:44:00 abril.charola kernel: [<c14a9c33>] common_interrupt+0x33/0x38
nov 12 12:44:00 abril.charola kernel: [<c138a553>] ? cpuidle_enter_state+0x83/0x240
nov 12 12:44:00 abril.charola kernel: [<c138a744>] cpuidle_enter+0x14/0x20
nov 12 12:44:00 abril.charola kernel: [<c108fe89>] cpu_startup_entry+0x299/0x3a0
nov 12 12:44:00 abril.charola kernel: [<c14a1f67>] rest_init+0x67/0x70
nov 12 12:44:00 abril.charola kernel: [<c167eb51>] start_kernel+0x3c9/0x3e2
nov 12 12:44:00 abril.charola kernel: [<c167e2e3>] i386_start_kernel+0x91/0x95
nov 12 12:44:00 abril.charola kernel: handlers:
nov 12 12:44:00 abril.charola kernel: [<f81083c0>] usb_hcd_irq [usbcore]
nov 12 12:44:00 abril.charola kernel: [<f860a890>] ath_isr [ath9k]
nov 12 12:44:00 abril.charola kernel: Disabling IRQ #17
--8<---------------cut here---------------end--------------->8---
Again, I don't know what it says, but it seems very serious. I'll attach the
full logs in case what I provide is not enough. I hope someone can help me
with this.
P.S.: I haven't cleaned the dust out of my laptop since I bought it, and it
seems to have some inside; this sporadic issue could be caused by the dust,
too.