qemu-devel

Re: [Qemu-devel] [PATCH] Megasas HBA emulation and SCSI update v.2


From: Nicholas A. Bellinger
Subject: Re: [Qemu-devel] [PATCH] Megasas HBA emulation and SCSI update v.2
Date: Thu, 02 Dec 2010 01:16:25 -0800

On Thu, 2010-12-02 at 08:51 +0100, Hannes Reinecke wrote:
> On 12/02/2010 01:14 AM, Nicholas A. Bellinger wrote:
> > On Wed, 2010-12-01 at 16:46 +0100, Hannes Reinecke wrote:
> >> On 12/01/2010 03:18 PM, Hannes Reinecke wrote:

<SNIP>

> >> Hmpf. Using a new vista x86 image (build 6002) with SP2 preloaded
> >> megasas works, too.
> >> Dodgy build I had, apparently.
> >>
> > 
> > Thanks for the update..  After testing the latest megasas.v3 HEAD
> > at commit:
> > 
> > * megasas.v3 978e61e megasas: Fixup PD query return value
> > 
> > it appears that the same Win7 64-bit Build 7600 that is functioning with
> > v0.12.5 windows7-megasas-working will now BSOD the guest.  After further
> > checking it appears that this is not megasas HBA specific, and is due to
> > your tree being slightly more out of date than mine.  ;)
> > 
> Yes, this is totally weird. AFAICS the MMIO register data is
> _exactly_ identical for both, the old working one and the new
> implementation. Yet Win7 is behaving differently in both cases.
> So it must be indeed the qemu base which is doing odd things here.
> 

Ok, after spending the better part of the evening trying to identify
differences between the two revs, I am inclined to agree with you here.

After merging megasas.v3 into megasas-upstream-v1 and pushing into
qemu-kvm.git, I did finally run into a semi meaningful BSOD with the
64-bit guest here:

http://linux-iscsi.org/builds/megasas-emulation-logs/win7-64bit-megasas-BSOD-12022010-1.png

which is happening after the initial run of DCMDs completes successfully,
on the first 16-byte INQUIRY frame into megasas_handle_scsi().

Here is a snippet from the log:

<SNIP past initial DCMDs completed successfully>

megasas: Enqueue frame 1 count 0 context 3e6 tail 0 busy 1
megasas: frame 1: MFI DCMD opcode 1040500
megasas: DCMD controller event wait
megasas: MFI DCMD wrote 0 bytes
megasas: Complete frame context 3e6

<Last DCMD before first MFI_CMD_PD_SCSI_IO frame>

megasas: writel mmio 0xa0: ffffffff
megasas: Update reply queue head 0 busy 0
megasas: writel mmio 0x34: 7ffffffb
megasas: writel mmio 0x40: 1fd76381
megasas: Received frame addr 1fd76380 count 0
megasas: MFI cmd 4 context 0 count 0
megasas: Return new frame 2 cmd 0x7fb7cebd53e0
megasas: Enqueue frame 2 count 0 context 0 tail 0 busy 1
megasas: PD SCSI physical dev 0 lun 0 sdev 0x139b9f0 xfer 16
megasas: 16 bytes of data available for reading
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 command completed, arg 16
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 read finished, len 16
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 command completed, arg 0
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 finished with status 0 len 16
megasas: Complete frame context 0 tail 0 busy 0 doorbell 0
megasas: readl mmio 0x30: 80000001
megasas: writel mmio 0xa0: 80000001
megasas: Update reply queue head 1 busy 0

....  and at this point a BSOD is triggered in the 64-bit Win7 guest with
DRIVER_IRQL_NOT_LESS_OR_EQUAL Stop Code = 0xD1, which seems from a quick
google search to indicate a problem with paging and DMA transfers.

So, after that I started to compare both versions w/ all megasas
debugging enabled and had another look at the code.  I did notice a few
subtle differences in the megasas_mmio_writel:MFI_IQP* handling that you
changed recently in megasas.v3.  Your recent change here reverts back to
the v0.12.5 logic, but the frame_count assignment is still different:

    case MFI_IQP:
        /* Received MFI frame address */
        frame_addr = (val & ~0xF);
        /* Add possible 64 bit offset */
        frame_addr |= (uint64_t)s->frame_hi;
        s->frame_hi = 0;
        frame_count = (val & 0xF) >> 1;
        <SNIP>

and v0.12.5 windows7-megasas-working:

    case MFI_IQP:
        /* Received MFI frames; up to 8 contiguous frames */
        frame_addr = (val & ~0xF);
        /* Add possible 64 bit offset */
        frame_addr |= (uint64_t)s->frame_hi;
        s->frame_hi = 0;
        frame_count = (val >> 1) & 0x7;
        <SNIP>

Unfortunately this does not appear to make a difference when changing
megasas.v3 to follow the existing windows7-megasas-working code, and the
frame_addr assignment recently changed back in megasas.v3 now matches
the v0.12.5 code.
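
A possible explanation for why swapping the frame_count expression changes
nothing: both forms extract bits 1..3 of the value written to MFI_IQP, so
they should yield the same count for any input.  The sketch below is a
standalone check, not code from either tree; the function names are
hypothetical, the sample values are the two MFI_IQP writes seen in the logs,
and the frame_hi handling is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* frame_count as written in the megasas.v3 snippet */
unsigned frame_count_v3(uint32_t val)
{
    return (val & 0xF) >> 1;
}

/* frame_count as written in the v0.12.5 windows7-megasas-working snippet */
unsigned frame_count_v0125(uint32_t val)
{
    return (val >> 1) & 0x7;
}

/* Frame address with the low four bits masked off, as in both snippets
 * (the OR with s->frame_hi is omitted in this sketch) */
uint64_t frame_addr_of(uint32_t val)
{
    return (uint64_t)(val & ~0xFu);
}

/* Walk every possible low nibble and confirm both expressions agree */
void check_frame_count_equivalence(void)
{
    for (uint32_t low = 0; low <= 0xF; low++) {
        uint32_t val = 0x1fd76380u | low;
        assert(frame_count_v3(val) == frame_count_v0125(val));
    }
}
```

For the logged write of 1fd76381, both helpers give a count of 0 and
frame_addr_of() gives 1fd76380, which agrees with the "Received frame addr
1fd76380 count 0" line in the log.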

Both logs are attached for reference, and aside from the frame_count,
the only other thing that I am noticing is that the struct megasas_cmd_t
%p pointers in the working v0.12.5 are showing low memory addresses,
for example: 

megasas: writel mmio 40: 1f15c381
megasas: Received frame addr 1f15c380 count 0
megasas: MFI cmd 4 context 0 count 0
megasas: Return new frame 2 cmd 0xf077a8
megasas: Enqueue frame context 0 tail 0 busy 1
megasas: PD SCSI dev 0 lun 0 sdev 0xf1e5a0 xfer 16
megasas: PD SCSI req 0xf38120 cmd 0xf077a8 lun 0xf1e5a0 finished with status 0 len 16
megasas: Complete frame context 0 tail 0 busy 0 doorbell 0

and the latest code is showing the corresponding *cmd pointers at high addresses:

megasas: writel mmio 0x40: 1fd76381
megasas: Received frame addr 1fd76380 count 0
megasas: MFI cmd 4 context 0 count 0
megasas: Return new frame 2 cmd 0x7fb7cebd53e0
megasas: Enqueue frame 2 count 0 context 0 tail 0 busy 1
megasas: PD SCSI physical dev 0 lun 0 sdev 0x139b9f0 xfer 16
megasas: 16 bytes of data available for reading
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 command completed, arg 16
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 read finished, len 16
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 command completed, arg 0
megasas: PD SCSI req 0x13bb1c0 cmd 0x7fb7cebd53e0 lun 0x139b9f0 finished with status 0 len 16
megasas: Complete frame context 0 tail 0 busy 0 doorbell 0

I am not sure if this is related, but this seems like it could be something
worth investigating.

I also forced fw_sge=8 and fw_cmds=1000 in megasas_scsi_init() to follow the
defaults of the working v0.12.5, but then again, the MMIO writes and reads
up until the first 16-byte INQUIRY do appear to be identical AFAICT.

Here is the full log for the megasas.v3 -> megasas-upstream-v1 code:

http://linux-iscsi.org/builds/megasas-emulation-logs/win764-bit-megasas-v3.txt

and the working v0.12.5 boot:

http://linux-iscsi.org/builds/megasas-emulation-logs/win764-bit-megasas-v1.txt


> But that's a good hint, I'll be updating my tree and see how far
> I'll progress.
> 
> > But the good news is that WinXP SP2 is now working via scsi-generic ->
> > TCM_Loop in megasas.v3, and even w/o the original sync ioctl patch we
> > required  in v0.12.5 megasas code.  Very excellent work Hannes!
> > 
> > So, I will be merging the latest changes from megasas.v3 ->
> > megasas-upstream-v1 shortly and retesting with 64-bit Build 7600.
> > 
> Cool. Thanks. I'll be rebasing my patches, too. I guess it's time
> for megasas.v4.
> 

Sounds good, and please let me know if you have any other ideas or would
like me to test something else.

Thanks Hannes!

--nab



