Re: [Libreboot] [Libreboot T60 from Gluglug] Sometimes my Qualcomm Ather
From: Jorge Araya Navarro
Subject: Re: [Libreboot] [Libreboot T60 from Gluglug] Sometimes my Qualcomm Atheros AR9285 Wireless Network Adapter drops the connection to never reestablish it again (until reboot)
Date: Mon, 14 Dec 2015 20:50:05 -0600
User-agent: mu4e 0.9.15; emacs 24.5.1

This was indeed a heat issue. I bought a can of compressed air, opened the
machine, and cleaned it out; it was full of dust:
- https://www.instagram.com/p/_FmJajny4a/?taken-by=jorgejavieran
- https://www.instagram.com/p/_Fkln5ny2A/?taken-by=jorgejavieran
And now my beloved laptop is working like new! :D Thank you very much for the
support, Daniel!
On Monday, 14 December 2015, at 03:57, Daniel Tarrero wrote:
> Good morning out there!
>
> Hmm, so in the end this seems to be a heat problem!
> Overheating can cause a lot of different failures, but wifi/radio and
> CPU-related ones are very common.
>
> !!! In this scenario, you should use the ath9k module parameter
> "nohwcrypt=0" (I think it's the default, but you can check the parameter
> in the output of "$ modinfo ath9k").
> This makes the wifi chip handle the encryption, which should
> cause less heat overall than using the CPU.
>
> Well, overheating is both good and bad =) It takes a few hours to replace a
> fan, but it won't cost you more than $20.
>
> Can you check that the fans are working, and in what condition?
> Take a look at the sensor packages, like "lm-sensors"; on my
> system it reports some component temperatures and fan speeds in
> revolutions per minute.
>
> You can always disassemble the laptop and watch how the fans are spinning,
> but the RPM and °C readings will be more accurate :)
>
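For anyone without lm-sensors at hand, here is a rough sketch of reading temperatures straight from sysfs. It assumes the kernel exposes thermal zones under /sys/class/thermal with values in millidegrees Celsius; the function names are made up for illustration, and fan speeds are not covered.

```shell
#!/bin/sh
# Convert a sysfs reading in millidegrees Celsius (e.g. 54000) to degrees,
# keeping one decimal digit. An empty argument is treated as 0.
millideg_to_c() {
    m="${1:-0}"
    echo "$(( m / 1000 )).$(( (m % 1000) / 100 ))"
}

# Print every thermal zone the kernel exposes; prints nothing if there are none.
print_thermal_zones() {
    for zone in /sys/class/thermal/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        printf '%s (%s): %s C\n' "${zone##*/}" \
            "$(cat "$zone/type" 2>/dev/null)" \
            "$(millideg_to_c "$(cat "$zone/temp" 2>/dev/null)")"
    done
}

print_thermal_zones
```

Running it once when the machine is idle and again when the wifi drops gives two numbers to compare.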
> Also, you can do a "field test" by moving the laptop to some cooler
> place (maybe the bathroom, basement or kitchen).
>
>
> good luck!
> D
>
>
>
> On Tue, 08-12-2015 at 22:32 -0600, Jorge Araya Navarro wrote:
>> Hello, again!
>>
>> Bad news: after making the changes you suggested (kernel options, module
>> options), the issue keeps coming back :(. It happens every time the machine
>> gets hot, and I don't have any way to lower the temperature of my
>> environment :-/
>>
>> Maybe the fan needs some tweaking? I'm using laptop-tools btw.
>>
>> Reloading the modules fixes 50% of the issue, but I still don't have my wifi
>> back... I expected that unloading and reloading the modules would give me
>> back the wifi so I wouldn't need to reboot the laptop. Here is some output
>> after unloading and reloading the modules:
>>
>> --8<---------------cut here---------------start------------->8---
>> [ +0,070512] ath9k: ath9k: Driver unloaded
>> [ +0,446462] cfg80211: Calling CRDA to update world regulatory domain
>> [ +0,035610] ath9k 0000:02:00.0: enabling device (0000 -> 0002)
>> [ +0,000206] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>> [ +0,000005] ath: phy0: Unable to initialize hardware; initialization status: -95
>> [ +0,000004] ath9k 0000:02:00.0: Failed to initialize device
>> [ +0,000077] ath9k: probe of 0000:02:00.0 failed with error -95
>> [ +3,117031] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,150005] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,160005] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,149996] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,150000] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,150009] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,159986] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,150013] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,159995] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,149998] cfg80211: Calling CRDA to update world regulatory domain
>> [ +3,159987] cfg80211: Exceeded CRDA call max attempts. Not calling CRDA
>> [dic 8 21:48] e1000e: enp1s0 NIC Link is Down
>> [ +30,020843] e1000e: enp1s0 NIC Link is Down
>> [dic 8 21:49] e1000e: enp1s0 NIC Link is Down
>> --8<---------------cut here---------------end--------------->8---
>>
>> Since I'm documenting this issue in an org-mode file, this is how I unload
>> and reload the modules, with an org-mode source block:
>>
>> --8<---------------cut here---------------start------------->8---
>> #+BEGIN_SRC sh :results silent :exports both :dir /sudo::
>> modprobe -rf led_class
>> modprobe -rf cfg80211
>> modprobe -rf mac80211
>> modprobe -rf ath9k_hw
>> modprobe -rf ath9k_common
>> modprobe -rf ath9k
>> modprobe ath9k debug=1
>> modprobe ath9k_common
>> modprobe ath9k_hw
>> modprobe mac80211
>> modprobe cfg80211
>> modprobe led_class
>> #+END_SRC
>> --8<---------------cut here---------------end--------------->8---
>>
>> Talking about heat, I found this at the end of `dmesg`:
>>
>> --8<---------------cut here---------------start------------->8---
>> [dic 8 22:16] e1000e: enp1s0 NIC Link is Down
>> [ +4,030516] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
>> [ +0,000779] CPU1: Core temperature/speed normal
>> [ +34,513236] mce: [Hardware Error]: Machine check events logged
>> [dic 8 22:18] e1000e: enp1s0 NIC Link is Down
>> [dic 8 22:20] e1000e: enp1s0 NIC Link is Down
>> --8<---------------cut here---------------end--------------->8---
>>
>> Seems like my laptop is indeed overheating a little.
>>
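To keep an eye on how often this happens, a small sketch like the following could count the throttle messages in the kernel log. It only matches the message text quoted above; the function name is hypothetical, and the exact wording may differ on other kernel versions.

```shell
#!/bin/sh
# Count "temperature above threshold" events in a kernel log read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the function
# usable in scripts that stop on errors; the count is still printed.
count_throttle_events() {
    grep -c 'Core temperature above threshold' || true
}
```

Typical use would be `dmesg | count_throttle_events`, run before and after a cleaning or a fan change.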
>> On Thursday, 26 November 2015, at 04:05, Daniel Tarrero wrote:
>>
>> > Good morning dudes!
>> >
>> > On Thu, 26-11-2015 at 00:41 -0600, Jorge Araya Navarro wrote:
>> >> Hope we can squash it!
>> >> --8<---------------cut here---------------start------------->8---
>> >> $ lsmod | grep ath
>> >> ath9k 122880 0
>> >> ath9k_common 28672 1 ath9k
>> >> ath9k_hw 438272 2 ath9k_common,ath9k
>> >> ath 24576 3 ath9k_common,ath9k,ath9k_hw
>> >> mac80211 565248 1 ath9k
>> >> cfg80211 409600 4 ath,ath9k_common,ath9k,mac80211
>> >> led_class 16384 2 ath9k,thinkpad_acpi
>> >> --8<---------------cut here---------------end--------------->8---
>> >
>> > What I see here is that you use the ath9k kernel module/driver.
>> > We also see that it's a rather complex module; it depends on several
>> > other modules: ath, ath9k_hw, ath9k_common, mac80211, cfg80211 and
>> > led_class.
>> >
>> > So... reloading it is a pain in the ass ^^
>> >
>> > $ modprobe -r ath9k
>> > $ modprobe -r ath9k_common
>> > $ modprobe -r ath9k_hw
>> > $ modprobe -r ath
>> > $ modprobe -r mac80211
>> > $ modprobe -r cfg80211
>> > $ modprobe ath9k
>> > ... which loads ath9k_common, ath9k_hw, ath, mac80211 and cfg80211 again
>> > as dependencies (led_class stays loaded: thinkpad_acpi also uses it)
>> >
>> > If a module is busy, or something like that, you can usually force the
>> > module/driver to unload with "-f", for example:
>> >
>> > $ modprobe -rf ath9k
>> >
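A minimal sketch of the reload as a script, assuming ath9k sits at the top of the dependency chain as the lsmod output suggests. The `run` and `reload_ath9k` names and the DRY_RUN switch are made up for illustration; for real use it has to run as root.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload and reload ath9k. Extra arguments are passed to modprobe as
# module parameters, e.g. "reload_ath9k nohwcrypt=1".
reload_ath9k() {
    # modprobe -r removes ath9k and then tries to remove the modules it
    # depends on if nothing else is using them.
    run modprobe -r ath9k || return 1
    run modprobe ath9k "$@"
}
```

With `DRY_RUN=1 reload_ath9k debug=1` it only prints the two modprobe commands, which is a cheap way to check what would be run.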
>> > It's important that you get the module unloaded and loaded again.
>> > Besides the benefit of not having to reboot when it crashes, you will
>> > also be able to pass "module parameters" to it, on the fly, when you
>> > reload.
>> >
>> > Something like this:
>> >
>> > $ modprobe ath9k debug=1
>> >
>> > We will find it useful later, so keep it in mind:
>> >
>> >> --8<---------------cut here---------------start------------->8---
>> >> $ dmesg | grep firmware
>> >> [ +0,424592] psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e,
>> >> buttons: 3/3
>> >> --8<---------------cut here---------------end--------------->8---
>> >
>> > No weird/proprietary/bogus firmware being loaded for your Atheros, good
>> > news :)
>> >
>> > Let's see the available module parameters:
>> >
>> >> --8<---------------cut here---------------start------------->8---
>> >> $ modinfo ath9k
>> >> filename:
>> >> /lib/modules/4.1.13-gnu-1-lts/kernel/drivers/net/wireless/ath/ath9k/ath9k.ko.gz
>> >> license: Dual BSD/GPL
>> >> description: Support for Atheros 802.11n wireless LAN cards.
>> >> author: Atheros Communications
>> >> alias: (...)
>> >> depends: ath9k_hw,mac80211,ath9k_common,led-class,cfg80211,ath
>> >> intree: Y
>> >> vermagic: 4.1.13-gnu-1-lts SMP mod_unload modversions 686
>> >> parm: debug:Debugging mask (uint)
>> >> parm: nohwcrypt:Disable hardware encryption (int)
>> >> parm: blink:Enable LED blink on activity (int)
>> >> parm: btcoex_enable:Enable wifi-BT coexistence (int)
>> >> parm: bt_ant_diversity:Enable WLAN/BT RX antenna diversity (int)
>> >> parm: ps_enable:Enable WLAN PowerSave (int)
>> >> --8<---------------cut here---------------end--------------->8---
>> >
>> > Here we see these parameters:
>> >
>> > - "debug"
>> > overkill; it probably dumps a lot of information to /var/log/syslog, most
>> > of which we won't understand. You can give it a try, but don't
>> > leave it enabled, as it will consume a lot of resources. It is intended
>> > for debugging the module.
>> >
>> > - "nohwcrypt"
>> > disables hardware encryption, so it will be performed by the CPU. If the
>> > encryption part of the chip is the buggy one, that can solve our problem
>> > at a small CPU cost.
>> >
>> > - "blink"
>> > enables LED blinking on wifi activity; set it to 0 to stop the blinking,
>> > yujuu!
>> >
>> > - "btcoex_enable"
>> > enables Bluetooth coexistence. Disabled by default, so nothing to
>> > scratch here.
>> >
>> > - "bt_ant_diversity"
>> > this one is fun to try. Wifi/Bluetooth cards ship with 1, 2 or 3
>> > antennas, so they can range from sharing one antenna for everything (bad
>> > idea) to using several antennas for one service (diversity, which sounds
>> > good for improving signal and general performance, if we don't use
>> > Bluetooth).
>> >
>> > - "ps_enable"
>> > enables WLAN power save (the card's own power saving). If your computer
>> > is set up to hibernate / suspend, this can be a parameter to test.
>> >
>> >
>> >
>> > There are two cases here: either you can reload the module (so you can
>> > test parameters on the fly), or you can't (so you have to reboot each time
>> > you want to change a parameter).
>> >
>> > If you manage to reload the module, pass the module parameters on the
>> > command line, after the module name, with modprobe:
>> >
>> > $ modprobe ath9k nohwcrypt=1
>> >
>> > If you can't reload it, the place to set parameters so they are picked up
>> > at boot is the "/etc/modprobe.d" directory. Just create a
>> > file there with content similar to this:
>> >
>> > options ath9k nohwcrypt=1
>> >
>> > You can do it with one command line, like this:
>> >
>> > $ echo "options ath9k nohwcrypt=1" | sudo tee /etc/modprobe.d/ath9k.conf
>> >
>> > ... and reboot in order to apply the changes.
>> >
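When several parameters are being tested, writing the options line by hand gets tedious. Here is a small sketch that builds it from its arguments; the `write_ath9k_options` name is hypothetical, and writing under /etc/modprobe.d needs root.

```shell
#!/bin/sh
# Write a single "options ath9k ..." line to the file given as first argument.
# The remaining arguments are the module parameters, e.g. nohwcrypt=1.
# The file is overwritten, not appended to.
write_ath9k_options() {
    conf="$1"
    shift
    printf 'options ath9k %s\n' "$*" > "$conf"
}
```

For example, `write_ath9k_options /etc/modprobe.d/ath9k.conf nohwcrypt=1 ps_enable=0` leaves one line in the file; pointing it at a scratch file first shows what it would write.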
>> >
>> > I recommend you try the "most conservative" parameters we've talked
>> > about:
>> >
>> > module parameters:
>> > nohwcrypt=1
>> > btcoex_enable=0
>> > bt_ant_diversity=1
>> > ps_enable=0
>> >
>> > With that, the load on the wifi hardware will be minimal: encryption
>> > will be performed by the CPU, Bluetooth coexistence will be disabled, and
>> > the Bluetooth antenna (if it has one) will be used for wifi.
>> >
>> >> --8<---------------cut here---------------start------------->8---
>> >> $ egrep '(vmx|svm)' /proc/cpuinfo
>> >> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>> >> mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc
>> >> bts aperfmperf pni monitor vmx est tm2 xtpr pdcm dtherm
>> >> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>> >> mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc
>> >> bts aperfmperf pni monitor vmx est tm2 xtpr pdcm dtherm
>> >> --8<---------------cut here---------------end--------------->8---
>> >
>> > Your CPU has hardware virtualization support (VMX). That has caused
>> > problems in the past with Atheros modules.
>> >
>> > Try this kernel parameter on boot:
>> >
>> > intel_iommu=off
>> >
>> >
>> >
>> > With all that applied, I've run out of ideas! Give it a try and let us
>> > know if that improves your system stability :)
>> >
>> > i go for a coffee truck =)
>> > regards,
>> > Dani
>> >
>> >>
>> >>
>> >> On Wednesday, 25 November 2015, at 08:23, Daniel Tarrero wrote:
>> >>
>> >> > Hi again!
>> >> >
>> >> > Sorry to hear that :( We have to keep poking the stick in the hole until
>> >> > the bug comes out :)
>> >> >
>> >> > I have a couple of questions and a tweak worth trying:
>> >> > - Which module/firmware do you use? The kernel's ath9k module?
>> >> > $ lsmod | grep ath
>> >> > $ dmesg | grep firmware
>> >> >
>> >> > - Which options does this module support?
>> >> > $ modinfo ath9k
>> >> >
>> >> > - Which processor do you have? Does it have virtualization support?
>> >> > $ egrep '(vmx|svm)' /proc/cpuinfo
>> >> >
>> >> > ----------
>> >> > - Virtualization technologies have caused this kind of conflict in the
>> >> > past, so try this if you see output from the previous command:
>> >> > kernel boot parameter "intel_iommu=off"
>> >> >
>> >> > - Atheros driver tweaks: we will see which module options we can adjust
>> >> > from the output of the modinfo command :)
>> >> >
>> >> >
>> >> > luck and regards!
>> >> > D
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Sat, 21-11-2015 at 15:19 -0600, Jorge Araya Navarro wrote:
>> >> >> Well, today the issue showed its face again! :( You were right, the
>> >> >> kernel flag doesn't solve this problem. However, after rebooting my
>> >> >> laptop, the connection is stable, and I no longer experience the
>> >> >> reconnection-every-60-seconds phase.
>> >> >>
>> >> >> I don't remember pasting the exact error message I get when the issue
>> >> >> appears; in any case, here it is:
>> >> >>
>> >> >> --8<---------------cut here---------------start------------->8---
>> >> >> [ +0,116708] ath: phy0: Chip reset failed
>> >> >> [ +0,000007] ath: phy0: Unable to reset channel, reset status -22
>> >> >> [ +0,080357] ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
>> >> >> [ +0,000016] ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
>> >> >> --8<---------------cut here---------------end--------------->8---
>> >> >>
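One detail in these messages is that every register reads back as 0xffffffff. On PCI hardware, reads from a device that has stopped responding commonly come back as all ones, which would fit a chip that has dropped off the bus; treat that reading as an assumption, not a diagnosis. A small sketch to count such lines in a log (the function name is made up):

```shell
#!/bin/sh
# Count ath log lines (read from stdin) that report a register value of
# 0xffffffff. "|| true" keeps the exit status at 0 when nothing matches;
# the count is printed either way.
count_all_ones_lines() {
    grep -c 'ath: .*=0xffffffff' || true
}
```

Something like `journalctl -k -b -1 | count_all_ones_lines` would show how many of these the previous boot logged.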
>> >> >> I was unable to reload the `ath` module: something starts
>> >> >> NetworkManager's service again when I stop it with
>> >> >> `systemctl stop NetworkManager`. `systemctl list-dependencies
>> >> >> NetworkManager` shows many services, and I don't believe all of them
>> >> >> depend on NetworkManager's service.
>> >> >>
>> >> >> Typing `sudo iwconfig wlp2s0 power off` doesn't work because that
>> >> >> feature isn't supported by my wifi card. The sound works well except
>> >> >> for some glitches, but those happen because systemd-journal uses a
>> >> >> lot of CPU recording the never-ending error message (the one above).
>> >> >>
>> >> >> Here is the information you requested; I hope it sheds some light on
>> >> >> this problem:
>> >> >>
>> >> >> --8<---------------cut here---------------start------------->8---
>> >> >> $ sudo journalctl -b -1 | grep DMA
>> >> >> nov 21 12:26:57 abril.charola kernel: DMA [mem 0x0000000000001000-0x0000000000ffffff]
>> >> >> nov 21 12:26:57 abril.charola kernel: DMA zone: 40 pages used for memmap
>> >> >> nov 21 12:26:57 abril.charola kernel: DMA zone: 0 pages reserved
>> >> >> nov 21 12:26:57 abril.charola kernel: DMA zone: 3999 pages, LIFO batch:0
>> >> >> # [...]
>> >> >> nov 21 14:29:06 abril.charola kernel: ath: phy0: Failed to stop TX DMA, queues=0x008!
>> >> >> nov 21 14:29:06 abril.charola kernel: ath: phy0: DMA failed to stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff DMADBG_7=0xffffffff
>> >> >> nov 21 14:29:06 abril.charola kernel: ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
>> >> >> --8<---------------cut here---------------end--------------->8---
>> >> >>
>> >> >> --8<---------------cut here---------------start------------->8---
>> >> >> $ lspci | grep -e Ethernet -e Network
>> >> >> 01:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet
>> >> >> Controller
>> >> >> 02:00.0 Network controller: Qualcomm Atheros AR9285 Wireless Network
>> >> >> Adapter (PCI-Express) (rev 01)
>> >> >> --8<---------------cut here---------------end--------------->8---
>> >> >>
>> >> >> --8<---------------cut here---------------start------------->8---
>> >> >> $ uname -a
>> >> >> Linux abril.charola 4.1.13-gnu-1-lts #1 SMP Sat Nov 14 09:15:27 UYT
>> >> >> 2015 i686 GNU/Linux
>> >> >> --8<---------------cut here---------------end--------------->8---
>> >> >>
>> >> >> On Monday, 16 November 2015, at 03:40, Daniel Tarrero wrote:
>> >> >>
>> >> >> > Hi!
>> >> >> >
>> >> >> > These logs look to me like an interrupt conflict, hardware failure, or
>> >> >> > unrecoverable state.
>> >> >> >
>> >> >> > I think that the kernel boot option "intremap" won't help you.
>> >> >> >
>> >> >> > Usually, removing a module and loading it again restores its
>> >> >> > functionality (when successfully performed). Of course, modules and
>> >> >> > the kernel have a tree-like structure, so you have to unload the
>> >> >> > modules that depend on a module before unloading it.
>> >> >> >
>> >> >> > -----
>> >> >> > Things you can give a try:
>> >> >> >
>> >> >> > * Look for any other interesting messages during boot:
>> >> >> >
>> >> >> > $ dmesg | more
>> >> >> >
>> >> >> > ... and more concretely, boot messages about DMA:
>> >> >> >
>> >> >> > $ dmesg | grep DMA | more
>> >> >> >
>> >> >> > * Disable the "suspend" mode of the card (maybe it enters suspension
>> >> >> > mode and never comes back: not all cards support suspension):
>> >> >> >
>> >> >> > $ sudo iwconfig wlan0 power off
>> >> >> >
>> >> >> > * I would also try to _disable_sound_card_ in the BIOS, and see if that
>> >> >> > makes a difference to your wifi crashes.
>> >> >> >
>> >> >> >
>> >> >> > ----------
>> >> >> > For more info:
>> >> >> >
>> >> >> > Which wifi card do you have?
>> >> >> >
>> >> >> > $ lspci
>> >> >> > $ lsusb
>> >> >> >
>> >> >> > Which kernel do you have?
>> >> >> >
>> >> >> > $ uname -a
>> >> >> >
>> >> >> > Is this the proper list for that?
>> >> >> >
>> >> >> > Probably not ^^
>> >> >> >
>> >> >> >
>> >> >> > good morning dudes!
>> >> >> > Dani
>> >> >> >
>> >> >> >
>> >> >> > On Fri, 13-11-2015 at 12:47 -0600, Jorge Araya Navarro wrote:
>> >> >> >> Yo! lol.
>> >> >> >>
>> >> >> >> When this thing happens, I don't have anything playing sound, so I'm
>> >> >> >> not sure if the sound card gets affected. I wonder if setting that
>> >> >> >> kernel flag will prevent this issue from happening. I also wonder if
>> >> >> >> unloading and reloading the drivers will do something useful regarding
>> >> >> >> my issue.
>> >> >> >>
>> >> >> >> I'm going to set the flag and come back here if something happens.
>> >> >> >>
>> >> >> >> On Friday, 13 November 2015, at 05:34, Daniel Tarrero wrote:
>> >> >> >>
>> >> >> >> > What's up, Jorge!!
>> >> >> >> >
>> >> >> >> > As soon as I talk about interrupts, somebody runs into problems
>> >> >> >> > using them!! Maybe :)
>> >> >> >> >
>> >> >> >> > This seems to be a hardware communication problem. Did you read my
>> >> >> >> > last two mails? They may contain some information related to this
>> >> >> >> > problem.
>> >> >> >> >
>> >> >> >> > Did you also see the DMAR mapping warning during boot? That may have
>> >> >> >> > something to say here. The fact that a reboot usually solves it makes
>> >> >> >> > me think it can be an interrupt conflict.
>> >> >> >> >
>> >> >> >> > Your logs say: "the module/driver is sending commands to the
>> >> >> >> > hardware, and it didn't respond as we expected"
>> >> >> >> >
>> >> >> >> > What can cause this? A DMAR mess!!! And also hardware problems, like
>> >> >> >> > loss of power, hardware changes that lead to interrupt conflicts
>> >> >> >> > (like plugging in an eSATA device), or a faulty Atheros chip in the
>> >> >> >> > worst case.
>> >> >> >> >
>> >> >> >> > You can _force_ module unload (and also, you can
>> >> >> >> > _unload_dependent_modules_ first). Of course, you have to stop the
>> >> >> >> > software using this hardware too. Maybe something like this can make
>> >> >> >> > your day:
>> >> >> >> >
>> >> >> >> > $ sudo service network-manager stop (stop software)
>> >> >> >> > $ sudo ifconfig whatever down (bring the interface down)
>> >> >> >> > $ lsmod | grep ath (see dependent modules)
>> >> >> >> > $ sudo modprobe -rf whatever (unload dependent modules first)
>> >> >> >> > $ sudo modprobe -rf ath (unload module)
>> >> >> >> > $ sudo modprobe ath (reload module)
>> >> >> >> >
>> >> >> >> > and test!
>> >> >> >> > You should give the commands some time to complete, and keep an eye
>> >> >> >> > on syslog/dmesg to see the results.
>> >> >> >> >
>> >> >> >> > Given that the problem flaps (comes and goes), I would also check
>> >> >> >> > power and heat (maybe replace the charger with a travel one if you
>> >> >> >> > have one, and place the laptop in a cold environment), and see if the
>> >> >> >> > time to failure changes.
>> >> >> >> >
>> >> >> >> > Also, there is an earlier warning about your sound card that may be
>> >> >> >> > related:
>> >> >> >> > snd_hda_intel 0000:00:1b.0: IRQ timing workaround is activated for card #0
>> >> >> >> >
>> >> >> >> > Is your sound card working when this error happens? If not, we may
>> >> >> >> > have found the hardware interrupt conflict. They may be using the same
>> >> >> >> > interrupt, and when sound gets "tweaked" the wifi goes crazy about
>> >> >> >> > that delay in communications.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Good luck!! I'm waiting to hear about your experiences! :)
>> >> >> >> >
>> >> >> >> > Regards,
>> >> >> >> > D
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Thu, 12-11-2015 at 14:44 -0600, Jorge Araya Navarro wrote:
>> >> >> >> >> Hello!
>> >> >> >> >>
>> >> >> >> >> I bought my Libreboot T60 from Gluglug in December of last year,
>> >> >> >> >> and I'm very happy with a machine that works with 100% Free Software!
>> >> >> >> >>
>> >> >> >> >> For a couple of months now something strange has been happening to
>> >> >> >> >> my wifi card. I first thought the issue was caused by a kernel
>> >> >> >> >> update, but I was wrong. What happens is that at a random moment,
>> >> >> >> >> every several weeks or so, the wifi drops the connection and never
>> >> >> >> >> re-establishes it until reboot, and after that the issue sometimes
>> >> >> >> >> continues with the wifi card dropping the connection once every 60
>> >> >> >> >> seconds.
>> >> >> >> >>
>> >> >> >> >> Yesterday this happened again, so I decided to fire up Emacs
>> >> >> >> >> and take some notes and output with org-mode. The first interesting
>> >> >> >> >> thing is this from `dmesg`:
>> >> >> >> >>
>> >> >> >> >> --8<---------------cut here---------------start------------->8---
>> >> >> >> >> nov 12 12:43:43 abril.charola kernel: snd_hda_intel
>> >> >> >> >> 0000:00:1b.0: IRQ timing workaround is activated for card #0.
>> >> >> >> >> Suggest a bigger bdl_pos_adj.
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Failed to stop
>> >> >> >> >> TX DMA, queues=0x00a!
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset
>> >> >> >> >> failed
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset
>> >> >> >> >> channel, reset status -22
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset
>> >> >> >> >> failed
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset
>> >> >> >> >> channel, reset status -22
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset
>> >> >> >> >> failed
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset
>> >> >> >> >> channel, reset status -22
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Chip reset
>> >> >> >> >> failed
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Unable to reset
>> >> >> >> >> channel, reset status -22
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:44 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: Chip reset
>> >> >> >> >> failed
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: Unable to reset
>> >> >> >> >> channel, reset status -22
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: Chip reset
>> >> >> >> >> failed
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: Unable to reset
>> >> >> >> >> channel, reset status -22
>> >> >> >> >> nov 12 12:43:45 abril.charola NetworkManager[445]: <warn>
>> >> >> >> >> Connection disconnected (reason -4)
>> >> >> >> >> nov 12 12:43:45 abril.charola NetworkManager[445]: <info>
>> >> >> >> >> (wlp2s0): supplicant interface state: completed -> disconnected
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: cfg80211: Exceeded CRDA
>> >> >> >> >> call max attempts. Not calling CRDA
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: DMA failed to
>> >> >> >> >> stop in 10 ms AR_CR=0xffffffff AR_DIAG_SW=0xffffffff
>> >> >> >> >> DMADBG_7=0xffffffff
>> >> >> >> >> nov 12 12:43:45 abril.charola kernel: ath: phy0: Could not stop
>> >> >> >> >> RX, we could be confusing the DMA engine when we start RX up
>> >> >> >> >> nov 12 12:43:45 abril.charola NetworkManager[445]: <info>
>> >> >> >> >> (wlp2s0): supplicant interface state: disconnected -> scanning
>> >> >> >> >> --8<---------------cut here---------------end--------------->8---
>> >> >> >> >>
>> >> >> >> >> As I don't understand anything in these error messages, my
>> >> >> >> >> guess is that it is something serious. After trying to unload the
>> >> >> >> >> modules related to my wifi driver (ath, which is impossible because
>> >> >> >> >> other modules that require it are in use), typing
>> >> >> >> >> `ifconfig wlp2s0 down` and whatnot, I just gave up and restarted my
>> >> >> >> >> laptop. At some point journald registered something interesting:
>> >> >> >> >>
>> >> >> >> >> --8<---------------cut here---------------start------------->8---
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: irq 17: nobody cared (try
>> >> >> >> >> booting with the "irqpoll" option)
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: CPU: 0 PID: 0 Comm:
>> >> >> >> >> swapper/0 Not tainted 4.1.11-gnu-1-lts #1
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: Hardware name: LENOVO
>> >> >> >> >> 1951F8G/1951F8G, BIOS CBET4000 79ETE7WW (2.27 ) 05/18/2015
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: c1609907 4a9301f9
>> >> >> >> >> 00000000 f5035f54 c14a49ec f53d0e9c f5035f74 c10abbac
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: c1575cc0 00000011
>> >> >> >> >> f5035f70 f85611db f53d0e40 00000000 f5035f98 c10abf22
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: c1329d4a 0003ab5e
>> >> >> >> >> 00000000 4a9301f9 f53d0e40 c1676e00 00000000 f5035fd4
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: Call Trace:
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c14a49ec>]
>> >> >> >> >> dump_stack+0x41/0x52
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10abbac>]
>> >> >> >> >> __report_bad_irq+0x2c/0xd0
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<f85611db>] ?
>> >> >> >> >> ath9k_hw_intrpend+0x5b/0x70 [ath9k_hw]
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10abf22>]
>> >> >> >> >> note_interrupt+0x212/0x250
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c1329d4a>] ?
>> >> >> >> >> add_interrupt_randomness+0x16a/0x1a0
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10a99a2>]
>> >> >> >> >> handle_irq_event_percpu+0x122/0x190
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10a99a2>] ?
>> >> >> >> >> handle_irq_event_percpu+0x122/0x190
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10a9a3a>]
>> >> >> >> >> handle_irq_event+0x2a/0x50
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10ac520>] ?
>> >> >> >> >> handle_edge_irq+0xe0/0xe0
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c10ac589>]
>> >> >> >> >> handle_fasteoi_irq+0x69/0x100
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c1004906>]
>> >> >> >> >> handle_irq+0x56/0x90
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: <IRQ> [<c14aa60c>]
>> >> >> >> >> do_IRQ+0x3c/0xd0
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c14a9c33>]
>> >> >> >> >> common_interrupt+0x33/0x38
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c138a553>] ?
>> >> >> >> >> cpuidle_enter_state+0x83/0x240
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c138a744>]
>> >> >> >> >> cpuidle_enter+0x14/0x20
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c108fe89>]
>> >> >> >> >> cpu_startup_entry+0x299/0x3a0
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c14a1f67>]
>> >> >> >> >> rest_init+0x67/0x70
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c167eb51>]
>> >> >> >> >> start_kernel+0x3c9/0x3e2
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<c167e2e3>]
>> >> >> >> >> i386_start_kernel+0x91/0x95
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: handlers:
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<f81083c0>] usb_hcd_irq
>> >> >> >> >> [usbcore]
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: [<f860a890>] ath_isr
>> >> >> >> >> [ath9k]
>> >> >> >> >> nov 12 12:44:00 abril.charola kernel: Disabling IRQ #17
>> >> >> >> >> --8<---------------cut here---------------end--------------->8---
>> >> >> >> >>
>> >> >> >> >> Again, I don't know what it says, but it seems very serious. I'll
>> >> >> >> >> attach the full logs in case what I provide is not enough. I hope
>> >> >> >> >> someone can help me with this.
>> >> >> >> >>
>> >> >> >> >> P.S.: I haven't cleaned the dust out of my laptop since I bought it,
>> >> >> >> >> and it seems to have some inside; this sporadic issue could be caused
>> >> >> >> >> by the dust, too.
>> >> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>>
--
👋 Pax et bonum.
Jorge Araya Navarro
https://es.gravatar.com/shackra