qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] hw/ide: Remove status register read side effect


From: jasper.lowell
Subject: Re: [PATCH] hw/ide: Remove status register read side effect
Date: Wed, 26 Feb 2020 05:22:55 +0000

> Problem with that patch is that it removes this clearing from the
> func 
> that's also used to emulate ISA IDE ioports which according to their
> spec 
> should clear irq on read so that function should be OK but maybe
> should 
> not be called by PCI IDE code?

This might be it.

The patch I provided is definitely incorrect and deviates from the
specification as Mark mentioned earlier. I misunderstood what
ide_ioport_read/write were for and haven't been thinking about legacy
mode. 

The bug that I believe exists is present when the CMD646 is operating
in PCI native mode. Yeah, I think a possible solution might be to avoid
using the ioport_read/write functions from the PCI code if they have
side effects that assume the device is in legacy mode. I'll have to
spend more time reading through the code and documentation.

> Except the legacy IDE spec that does say reading status is clearing
> IRQ 
> but not sure PCI native mode should do the same but it seems to use
> the 
> same function in QEMU so it will clear IRQ as in legacy IDE mode.

According to the CMD646U2 specification:
"When an IDE port is in PCI IDE Legacy Mode, the PCI646U2 is compatible
with standard ISA IDE. The IDE task file registers are mapped to the
standard ISA port addresses, and IDE drive interrupts occur at IRQ14
(primary) or IRQ15 (secondary)."

In legacy mode, IRQ14 and IRQ15 mirror the state of INTRQ on each of
the selected IDE devices. QEMU appears to emulate this correctly.

In PCI native mode, INTRQ is not mirrored or given a single IRQ.
Interrupts are provided by the PCI IDE controller depending on the
controller's logic. For instance, an IDE device can raise an interrupt
but the CMD646 may not propagate that interrupt if MRDMODE has certain
bits set. I'm thinking that maybe the controller does not have logic to
unset the interrupt bits in CFR and ARTTIM23 when the IDE device lowers
INTRQ. This might mean that the controller will continue to assert an
interrupt while bits in CFR and ARTTIM23 remain set, even if the IDE
device lowers INTRQ. This would explain why the CMD646 documentation
instructs developers to lower them explicitly.

> Except the legacy IDE spec that does say reading status is clearing
> IRQ 
> but not sure PCI native mode should do the same but it seems to use
> the 
> same function in QEMU so it will clear IRQ as in legacy IDE mode. But
> this 
> Linux driver says IRQ is cleared on read for PCI as well:
> 
> 
https://github.com/torvalds/linux/blob/master/drivers/ata/libata-sff.c
> 
> as does the CMD646 driver:
> 
> 
https://github.com/torvalds/linux/blob/master/drivers/ata/pata_cmd64x.c
> 
> in cmd64x_sff_irq_check() although for different chip revisions it
> uses 
> cmd648_sff_irq_* functions which does this differently and avoids
> reading 
> status reg and clears irq explicitely. It also has a warning at the 
> beginning that UDMA mode is broken on most of these chips so it won't
> try 
> to use it on anything below CMD646U2 so this suggests maybe there's
> a 
> problem with clearing IRQs on at least some CMD646 chip revisions. I
> think 
> the Sun Ultra 10 used CMD646U but not sure what the Solaris driver
> expects 
> and if it can work with later chip revisions. Maybe we should either 
> emulate the chip bugs or change something to identify as CMD646U2
> which 
> should behave more like stadard PCI IDE controllers? Although if I
> got 
> that correctly Linux thinks revisions over 5 are OK and QEMU has 7.

I'm not sure what it expects. If the Sun Ultra 10 shipped with the
CMD646U, I reason that Solaris 10 either expects it or has support for
it.

The Linux driver code appears to be consistent with the behaviour that
I'm seeing from Solaris 10.

The following appears to be used to initialise the CMD646U.

{       /* CMD 646U with broken UDMA */
        .flags = ATA_FLAG_SLAVE_POSS,
        .pio_mask = ATA_PIO4,
        .mwdma_mask = ATA_MWDMA2,
        .port_ops = &cmd646r3_port_ops
},

The port operations it uses are defined as so:

static struct ata_port_operations cmd646r3_port_ops = {
        .inherits       = &cmd64x_base_ops,
        .sff_irq_check  = cmd648_sff_irq_check,
        .sff_irq_clear  = cmd648_sff_irq_clear,
        .cable_detect   = ata_cable_40wire,
}

As you mention, cmd648_sff_irq_clear clears interrupts explicitly by
setting bits in MRDMODE - consistent with the CMD646U2 documentation.
This behaviour is very similar to Solaris 10.

> Although if I got 
> that correctly Linux thinks revisions over 5 are OK and QEMU has 7.

I'm not sure how revision numbers work with these chips. Do CMD646 and
CMD646U2 refer to different revisions of the CMD646 chip?

Thanks,
Jasper Lowell.


On Tue, 2020-02-25 at 16:08 +0100, BALATON Zoltan wrote:
> On Tue, 25 Feb 2020, address@hidden wrote:
I don't believe the
> quick interrupt here is the problem. Solaris 10
will spin for a short
> time while waiting for the interrupt bit to be
set before continuing
> with its routine. If it doesn't see the interrupt
bit is set before
> some timeout, it will print an error about the
missing interrupt and
> give up loading the driver.

I don't think missing delay should cause any problem either just
pointed 
this out as a difference from real controller which may have an
effect but 
I agree this is probably not a problem.

pci_cfg_read 53.231 pid=147030 dev=b'cmd646-ide' devid=0x3
fnid=0x0
offs=0x50 val=0x4
ide_ioport_read 35.577 pid=147030 addr=0x7 reg=b'Status'
val=0x50
bus=0x55b77f922d10 s=0x55b77f922d98
ide_ioport_read 29.095 pid=147030 addr=0x7 reg=b'Status'
val=0x50
bus=0x55b77f922d10 s=0x55b77f922d98

So these ide_ioport_read calls clear the irq bit...

That's right. The line that I proposed removing in the patch clears
CFG_INTR_CH0 on ide_ioport_read.

Problem with that patch is that it removes this clearing from the
func 
that's also used to emulate ISA IDE ioports which according to their
spec 
should clear irq on read so that function should be OK but maybe
should 
not be called by PCI IDE code?

ide_ioport_write 19.146 pid=147030 addr=0x6 reg=b'Device/Head'
val=0xe0 bus=0x55b77f922d10 s=0x55b77f922d98
pci_cfg_read 9.468 pid=147030 dev=b'cmd646-ide' devid=0x3
fnid=0x0
offs=0x50 val=0x0
pci_cfg_read 127.712 pid=147030 dev=b'cmd646-ide' devid=0x3
fnid=0x0 offs=0x50 val=0x0
pci_cfg_read 101.942 pid=147030 dev=b'cmd646-ide' devid=0x3
fnid=0x0 offs=0x50 val=0x0

...that would be checked here?

That's right.

Solaris is performing pci_cfg_read on offs=0x50 until it either
sees
the interrupt bit set or times out. If it times out, you get a
fatal
error for the driver. The behaviour is not expected and
aggressively
checked against by the Solaris kernel. From what I can tell, Linux
and
OpenBSD don't check if the bit is set before clearing it.

What I don't get is why ide_ioport_read is called at all and from
where 
if that's meant to emulate legacy ide ISA ioport reads and we
have a 
PCI device accessed via PCI regs?

What I meant was where is ide_ioport_read() is called in this case
when we 
have a PCI IDE controller? Searching for it I think it may come from 
pci_ide_data_read() in hw/ide/pci.c. This document:

http://www.bswd.com/pciide.pdf

has some info on this and there are mentions of status using
Alternate 
Status (which does not clear interrupt bit) but I think that
corresponds 
to pci_ide_cmd_read() which already uses ide_status_read() so that
seems 
correct. I did not find anything about contents of the Command Block
in 
this doc which the function with ide_ioport_read call implements so
not 
sure if that's expected to clear interrupt in PCI native mode or
should we 
change pci_ide_data_read() to avoid that.

mention irq in a lot of regs (all say write to clear) but I don't
understand their relation to each other and irq raised by the
drive.

I agree and I think that's part of the problem. The documentation
does
not explicitly mention their relation to each other. I can't see
anything that suggests that reading the status register on the
drive
will unset bits in the pci configuration space of the controller.
They
are seperate devices.

Except the legacy IDE spec that does say reading status is clearing
IRQ 
but not sure PCI native mode should do the same but it seems to use
the 
same function in QEMU so it will clear IRQ as in legacy IDE mode. But
this 
Linux driver says IRQ is cleared on read for PCI as well:

https://github.com/torvalds/linux/blob/master/drivers/ata/libata-sff.c

as does the CMD646 driver:

https://github.com/torvalds/linux/blob/master/drivers/ata/pata_cmd64x.c

in cmd64x_sff_irq_check() although for different chip revisions it
uses 
cmd648_sff_irq_* functions which does this differently and avoids
reading 
status reg and clears irq explicitely. It also has a warning at the 
beginning that UDMA mode is broken on most of these chips so it won't
try 
to use it on anything below CMD646U2 so this suggests maybe there's
a 
problem with clearing IRQs on at least some CMD646 chip revisions. I
think 
the Sun Ultra 10 used CMD646U but not sure what the Solaris driver
expects 
and if it can work with later chip revisions. Maybe we should either 
emulate the chip bugs or change something to identify as CMD646U2
which 
should behave more like stadard PCI IDE controllers? Although if I
got 
that correctly Linux thinks revisions over 5 are OK and QEMU has 7.

I think we need advice from someone more knowledgeable about real
hardware 
on this.

Regards,
BALATON Zoltan

reply via email to

[Prev in Thread] Current Thread [Next in Thread]