[Qemu-devel] [PATCH] [SPARC] SlavIO interrupt controller fix


From: Aurelien Jarno
Subject: [Qemu-devel] [PATCH] [SPARC] SlavIO interrupt controller fix
Date: Thu, 15 Mar 2007 02:05:01 +0100
User-agent: Mutt/1.5.13 (2006-08-11)

Hi all,

I have finally been able to fix the Linux kernel crash that occurs on
the Sparc target (sun4m) when doing intensive disk I/O (see the dmesg 
log below).

slavio_pic_set_irq() in slavio_intctl.c calls slavio_check_interrupts()
when an interrupt is activated, but also when an interrupt is deactivated.
In very rare conditions this can cause a spurious interrupt that
disrupts the ESP driver and leads to a kernel crash.

From what I have been able to trace, it occurs when an interrupt is
being serviced and an interrupt with a lower level is cleared
before the interrupt routine in the target disables the first interrupt.
To have a bad effect on the ESP driver, it must also occur while a
DMA transfer is scheduled. That explains why this bug is not so easy
to reproduce (it usually occurs between half an hour and two hours under
intensive disk I/O, and up to 24 hours with very little disk I/O), though
it is very annoying.
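
The sequence described above can be illustrated with a toy model. This is
only a sketch, not the QEMU code: every name in it (toy_intctl,
toy_check_interrupts, toy_set_irq, toy_cpu_take, toy_run) is hypothetical,
and it assumes that re-evaluating the controller signals the highest
pending level again whenever the CPU has nothing latched.

```c
#include <assert.h>

/* Toy model only; names are hypothetical, not from hw/slavio_intctl.c. */
struct toy_intctl {
    unsigned pending;   /* bit n set: the source at level n is asserted */
    int latched;        /* level currently signalled to the CPU, 0 if none */
    int delivered[16];  /* how many times each level was signalled */
};

/* Signal the highest pending level if the CPU has nothing latched. */
static void toy_check_interrupts(struct toy_intctl *s)
{
    int level, max = 0;

    for (level = 1; level < 16; level++)
        if (s->pending & (1u << level))
            max = level;
    if (max && s->latched == 0) {
        s->latched = max;
        s->delivered[max]++;
    }
}

/* check_on_clear != 0 models the old behaviour, 0 the patched one. */
static void toy_set_irq(struct toy_intctl *s, int level, int on,
                        int check_on_clear)
{
    assert(level > 0 && level < 16);
    if (on) {
        s->pending |= 1u << level;
        toy_check_interrupts(s);
    } else {
        s->pending &= ~(1u << level);
        if (check_on_clear)
            toy_check_interrupts(s);
    }
}

/* The CPU accepts the signalled interrupt and starts its handler. */
static void toy_cpu_take(struct toy_intctl *s)
{
    s->latched = 0;
}

/* Returns how many times the HIGH level was signalled to the CPU. */
static int toy_run(int check_on_clear)
{
    struct toy_intctl s = {0};
    enum { LOW = 2, HIGH = 4 };

    toy_set_irq(&s, LOW, 1, check_on_clear);   /* low-level source asserts */
    toy_cpu_take(&s);
    toy_set_irq(&s, HIGH, 1, check_on_clear);  /* higher level arrives */
    toy_cpu_take(&s);                          /* handler has not acked yet */
    toy_set_irq(&s, LOW, 0, check_on_clear);   /* low-level source clears */
    return s.delivered[HIGH];
}
```

In this model toy_run(1) returns 2, because clearing the lower level
signals the still-pending higher level a second time, while toy_run(0)
returns 1.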

Note that all other functions in this file that activate and
deactivate interrupts only call slavio_check_interrupts() in the
interrupt activation case, so they are already correct.

The patch below fixes the problem. With this patch I have been running
a Sparc target under intensive disk I/O for 24 hours without a crash.

Cheers,
Aurelien


esp0: !BSERV after data, probably to msgout
esp0: Aborting command
esp0: dumping state
esp0: dma -- cond_reg<a4000211> addr<f0251000>
esp0: SW [sreg<00> sstep<04> ireg<18>]
esp0: HW reread [sreg<83> sstep<00> ireg<10>]
esp0: current command [tgt<00> lun<00> pphase<DATAOUT> cphase<DATAOUT>]
esp0: disconnected
esp0: Aborting command
esp0: dumping state
esp0: dma -- cond_reg<a4000210> addr<f0251000>
esp0: SW [sreg<00> sstep<04> ireg<18>]
esp0: HW reread [sreg<03> sstep<00> ireg<10>]
esp0: current command [tgt<00> lun<00> pphase<UNISSUED> cphase<UNISSUED>]
esp0: disconnected
esp0: Resetting scsi bus
esp0: SCSI bus reset interrupt
Unable to handle kernel NULL pointer dereference
tsk->{mm,active_mm}->context = 0000000d
tsk->{mm,active_mm}->pgd = fc048800
              \|/ ____ \|/
              "@'/ ,. \`@"
              /_| \__/ |_\
                 \__U_/
apt-get(4250): Oops [#1]
PSR: 04400fc6 PC: fe61f128 NPC: fe61f12c Y: 00000000    Not tainted
PC: <esp_do_data_finale+0x3b4/0x3f8 [esp]>
%G: f2cb4000 ffffffff  00000014 fd0da000  00000000 00000020  f2cb4000 00000001
%O: fe620800 f79d8800  00000010 00000008  f00d8eac f0234000  f2cb5b18 fe61edd0
RPC: <esp_do_data_finale+0x5c/0x3f8 [esp]>
%L: f79f3600 00000000  00000000 f7956500  00000000 ea7afb00  f3004000 00989680
%I: f021529c 00000000  00000000 00000000  00000000 fff00000  f2cb5b80 fe61de10
Caller[fe61de10]: esp_work_bus+0x64/0x6c [esp]
Caller[fe61f7e8]: esp_intr+0x1e0/0x310 [esp]
Caller[f0013160]: handler_irq+0x94/0xd4
Caller[f0010bd8]: patch_handler_irq+0x8/0x24
Caller[f019b744]: here+0x18/0x90
Caller[f019c538]: do_nanosleep+0x44/0x88
Caller[f0046af8]: hrtimer_nanosleep+0x30/0x130
Caller[f0046c74]: sys_nanosleep+0x7c/0x94
Caller[f0011634]: syscall_is_too_hard+0x3c/0x40
Caller[5035a36c]: 0x5035a374
Instruction DUMP: c22420ec  8400a014  c42420e8 <c200a010> c22420e4  c200a00c  
c22420e0  c20e203a  82086007
Kernel panic - not syncing: Aiee, killing interrupt handler!
 <0>Press Stop-A (L1-A) to return to the boot prom



--- hw/slavio_intctl.c  2007-02-06 00:01:54.000000000 +0100
+++ hw/slavio_intctl.c  2007-03-14 13:50:18.000000000 +0100
@@ -293,6 +293,7 @@
            if (level) {
                s->intregm_pending |= mask;
                s->intreg_pending[s->target_cpu] |= 1 << pil;
+               slavio_check_interrupts(s);
            }
            else {
                s->intregm_pending &= ~mask;
@@ -300,7 +301,6 @@
            }
        }
     }
-    slavio_check_interrupts(s);
 }
 
 void slavio_pic_set_irq_cpu(void *opaque, int irq, int level, unsigned int cpu)

-- 
  .''`.  Aurelien Jarno             | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   address@hidden         | address@hidden
   `-    people.debian.org/~aurel32 | www.aurel32.net



