qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] hw/ide: Remove status register read side effect


From: jasper.lowell
Subject: Re: [PATCH] hw/ide: Remove status register read side effect
Date: Thu, 27 Feb 2020 05:56:56 +0000

> I'll submit a RFC V2 patch with a proposed fix.

This will have to wait.

Recent commits have caused Solaris 10 to error out of booting much
earlier than previously.

Jasper Lowell.

On Thu, 2020-02-27 at 05:10 +0000, address@hidden wrote:
> I've since looked at a Ultra 5 board and can confirm that it shipped
> with a CMD646U chip like the Ultra 10.
> 
> > Since both the generic PCI IDE and CMD646 Linux drivers mention irq
> > is 
> > cleared on reading status I think using ide_ioport_read in
> > hw/ide/pci.c 
> > may be correct for the generic case.
> 
> For clarity, the Linux drivers mention that for some chips reading
> CFR
> or ARTTIM23 will clear interrupts. Here in
> /linux/drivers/ata/pata_cmd64x.c, for instance:
> 
> static bool cmd64x_sff_irq_check(struct ata_port *ap)
> {
>       struct pci_dev *pdev = to_pci_dev(ap->host->dev);
>       int irq_mask = ap->port_no ? ARTTIM23_INTR_CH1 : CFR_INTR_CH0;
>       int irq_reg  = ap->port_no ? ARTTIM23 : CFR;
>       u8 irq_stat;
> 
>       /* NOTE: reading the register should clear the interrupt */
>       pci_read_config_byte(pdev, irq_reg, &irq_stat);
> 
>       return irq_stat & irq_mask;
> }
> 
> I might be misunderstanding but isn't this a different side effect
> than
> clearing interrupts from the IDE device when the IDE device status
> register is read? This is saying that reading ARTTIM23 or CFR will
> clear INTA#.
> 
> This code is also only used for the CMD643, CMD646, and CMD646 rev 1
> -
> none of which I believe QEMU is attempting to emulate. This doesn't
> look relevant to us.
> 
> > I'm not sure either but from what I've seen so far I think CMD646
> > either 
> > refers to the whole family (i.e. all versions) or early versions
> > depending 
> > on context while there are at least two more newer versions
> > referred
> > to as 
> > CMD646U and CMD646U2 but probably there are more revisions within
> > these as 
> > U2 seems to be rev5. QEMU sets the revision field to 7 but I'm not
> > sure 
> > that's the same Linux checks for. There's some more info on this
> > here:
> > 
> > https://ata.wiki.kernel.org/index.php/Pata_cmd64x
> 
> I've seen CMD646 used to refer to the family as well.
> 
> > QEMU sets the revision field to 7
> 
> In /linux/drivers/ide/cmd64x.c I found the following comment:
> 
> * UltraDMA only supported on PCI646U and PCI646U2, which
> * correspond to revisions 0x03, 0x05 and 0x07 respectively.
> 
> I guess the PCI646U2 is both revisions 0x05 and 0x07.
> 
> According to /linux/drivers/ata/pata_cmd64x.c, interrupt behaviour
> doesn't change across the CMD646U, CMD646U2, CMD648, and the CMD649.
> These chips set the interrupt bit explicitly and do not have the
> comment you highlighted earlier about clearing interrupts when
> reading
> ARTTIM23 or CFR.
> 
> > This may explain why
> > Linux 
> > checks alt status and clears interrupt instead of reading status
> > register.
> 
> I think Linux does this when there is no PCI IDE controller. The code
> in /linux/drivers/ide/ide-io.c acts directly on IDE devices.
> 
> > I don't know but sounds plausible that reading the status reg
> > clears
> > irq 
> > but reading the pci config words that mirrors it won't clear it.
> > But
> > the 
> > traces you had show that ide_ioport_read was called so driver was
> > likely 
> > reading status and not the config regs?
> 
> When in PCI native mode, ide_ioport_read is called because of the
> following code in /qemu/hw/ide/pci.c.
> 
> static uint64_t pci_ide_data_read(void *opaque, hwaddr addr, unsigned
> size)
> {
>     IDEBus *bus = opaque;
> 
>     if (size == 1) {
>         return ide_ioport_read(bus, addr);
>     } else if (addr == 0) {
>         if (size == 2) {
>             return ide_data_readw(bus, addr);
>         } else {
>             return ide_data_readl(bus, addr);
>         }
>     }
>     return ((uint64_t)1 << (size * 8)) - 1;
> }
> 
> ide_ioport_read calls qemu_irq_lower when the IDE device status
> register is read. This will propagate all the way to the root bus. I
> think that when there is a PCI IDE controller inbetween and in PCI
> native mode, this is not always propagated past the PCI IDE
> controller.
> In the case of the CMD646U2, I think the lowering of interrupts is
> only
> propagated when the controller is in legacy/compatibility mode.
> 
> From what I can tell cmd646_set_irq is only ever called from IDE
> device
> code, ie. ide_ioport_read. If the controller is not supposed to
> propagate the lowering of interrupts, I think a possible fix would be
> changing cmd646_set_irq to do nothing when the provided level is 0.
> Interrupts can still be cleared by writing to CFR, ARTTIM23, and
> MRDMODE which leverage cmd646_update_irq instead. This fix is not
> suitable if the emulated CMD646 can be used in compatibility mode but
> I
> don't think we have support for that anyways. 
> 
> I'll submit a RFC V2 patch with a proposed fix.
> 
> Thanks,
> Jasper Lowell.
> 
> On Wed, 2020-02-26 at 12:07 +0100, BALATON Zoltan wrote:
> > On Wed, 26 Feb 2020, address@hidden wrote:
> > > > Problem with that patch is that it removes this clearing from
> > > > the
> > > > func 
> > > > that's also used to emulate ISA IDE ioports which according to
> > > > their 
> > > > spec should clear irq on read so that function should be OK but
> > > > maybe 
> > > > should not be called by PCI IDE code?
> > > 
> > > This might be it.
> > > 
> > > The patch I provided is definitely incorrect and deviates from
> > > the
> > > specification as Mark mentioned earlier. I misunderstood what
> > > ide_ioport_read/write were for and haven't been thinking about
> > > legacy
> > > mode. 
> > > 
> > > The bug that I believe exists is present when the CMD646 is
> > > operating
> > > in PCI native mode. Yeah, I think a possible solution might be to
> > > avoid
> > > using the ioport_read/write functions from the PCI code if they
> > > have
> > > side effects that assume the device is in legacy mode. I'll have
> > > to
> > > spend more time reading through the code and documentation.
> > 
> > Since both the generic PCI IDE and CMD646 Linux drivers mention irq
> > is 
> > cleared on reading status I think using ide_ioport_read in
> > hw/ide/pci.c 
> > may be correct for the generic case. Not sure if the CMD646 has
> > some 
> > pecularity but maybe the difference in drivers is to avoid bugs
> > not 
> > because of CMD646 not clearing irq. The wikipedia page of CMD640:
> > 
> > https://en.wikipedia.org/wiki/CMD640
> > 
> > mentions some versions of it has a bug similar to RZ-1000 for
> > which 
> > there's a doc referenced that says the problem is that it forgets
> > last 
> > data word after raising (or clearing?) IRQ and a workaround is to
> > avoid 
> > checking status until all data transferred. This may explain why
> > Linux 
> > checks alt status and clears interrupt instead of reading status
> > register.
> > 
> > > According to the CMD646U2 specification:
> > > "When an IDE port is in PCI IDE Legacy Mode, the PCI646U2 is
> > > compatible
> > > with standard ISA IDE. The IDE task file registers are mapped to
> > > the
> > > standard ISA port addresses, and IDE drive interrupts occur at
> > > IRQ14
> > > (primary) or IRQ15 (secondary)."
> > > 
> > > In legacy mode, IRQ14 and IRQ15 mirror the state of INTRQ on each
> > > of
> > > the selected IDE devices. QEMU appears to emulate this correctly.
> > > 
> > > In PCI native mode, INTRQ is not mirrored or given a single IRQ.
> > > Interrupts are provided by the PCI IDE controller depending on
> > > the
> > > controller's logic. For instance, an IDE device can raise an
> > > interrupt
> > > but the CMD646 may not propagate that interrupt if MRDMODE has
> > > certain
> > > bits set. I'm thinking that maybe the controller does not have
> > > logic to
> > > unset the interrupt bits in CFR and ARTTIM23 when the IDE device
> > > lowers
> > > INTRQ. This might mean that the controller will continue to
> > > assert
> > > an
> > > interrupt while bits in CFR and ARTTIM23 remain set, even if the
> > > IDE
> > > device lowers INTRQ. This would explain why the CMD646
> > > documentation
> > > instructs developers to lower them explicitly.
> > 
> > I don't know but sounds plausible that reading the status reg
> > clears
> > irq 
> > but reading the pci config words that mirrors it won't clear it.
> > But
> > the 
> > traces you had show that ide_ioport_read was called so driver was
> > likely 
> > reading status and not the config regs?
> > 
> > I've found some further logs:
> > 
> > https://forums.gentoo.org/viewtopic-t-270357.html
> > https://www.redhat.com/archives/axp-list/2000-October/msg00070.html
> > https://www.linuxtopia.org/online_books/linux_beginner_books/debian_linux_desktop_survival_guide/Docking_Station.shtml
> > 
> > These show Linux messages for early CMD646 revisions that had bugs
> > but 
> > what I've noticed is that they say something about not 100% native
> > mode 
> > which seems to be similar to what I had with via-ide when it uses
> > IRQ14-15 
> > even in native mode. Could your problem be similar? Maybe you could
> > search 
> > for more such logs for Linux booting on Sun Ultra machines and see
> > what 
> > those say and check how it determines which IRQ number it should
> > use.
> > This 
> > may depend on some setting that's not emulated correctly.
> > 
> > > The Linux driver code appears to be consistent with the behaviour
> > > that
> > > I'm seeing from Solaris 10.
> > > 
> > > The following appears to be used to initialise the CMD646U.
> > > 
> > > { /* CMD 646U with broken UDMA */
> > >   .flags = ATA_FLAG_SLAVE_POSS,
> > >   .pio_mask = ATA_PIO4,
> > >   .mwdma_mask = ATA_MWDMA2,
> > >   .port_ops = &cmd646r3_port_ops
> > > },
> > > 
> > > The port operations it uses are defined as so:
> > > 
> > > static struct ata_port_operations cmd646r3_port_ops = {
> > >   .inherits       = &cmd64x_base_ops,
> > >   .sff_irq_check  = cmd648_sff_irq_check,
> > >   .sff_irq_clear  = cmd648_sff_irq_clear,
> > >   .cable_detect   = ata_cable_40wire,
> > > }
> > > 
> > > As you mention, cmd648_sff_irq_clear clears interrupts explicitly
> > > by
> > > setting bits in MRDMODE - consistent with the CMD646U2
> > > documentation.
> > > This behaviour is very similar to Solaris 10.
> > 
> > I think this may be to avoid bug with CMD646U.
> > 
> > > > Although if I got 
> > > > that correctly Linux thinks revisions over 5 are OK and QEMU
> > > > has
> > > > 7.
> > > 
> > > I'm not sure how revision numbers work with these chips. Do
> > > CMD646
> > > and
> > > CMD646U2 refer to different revisions of the CMD646 chip?
> > 
> > I'm not sure either but from what I've seen so far I think CMD646
> > either 
> > refers to the whole family (i.e. all versions) or early versions
> > depending 
> > on context while there are at least two more newer versions
> > referred
> > to as 
> > CMD646U and CMD646U2 but probably there are more revisions within
> > these as 
> > U2 seems to be rev5. QEMU sets the revision field to 7 but I'm not
> > sure 
> > that's the same Linux checks for. There's some more info on this
> > here:
> > 
> > https://ata.wiki.kernel.org/index.php/Pata_cmd64x
> > 
> > Regards,
> > BALATON Zoltan
> > 

reply via email to

[Prev in Thread] Current Thread [Next in Thread]