qemu-devel

Re: [Qemu-devel] [PATCH 2/9] s390x: fix invalid use of cc 1 for SSCH


From: Halil Pasic
Subject: Re: [Qemu-devel] [PATCH 2/9] s390x: fix invalid use of cc 1 for SSCH
Date: Wed, 13 Sep 2017 16:05:34 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0


On 09/13/2017 12:08 PM, Cornelia Huck wrote:
> On Thu, 7 Sep 2017 13:01:34 +0200
> Halil Pasic <address@hidden> wrote:
> 
>> On 09/07/2017 10:02 AM, Dong Jia Shi wrote:
>>> * Cornelia Huck <address@hidden> [2017-09-06 13:25:38 +0200]:
>>>   
>>>> On Wed, 6 Sep 2017 16:27:20 +0800
>>>> Dong Jia Shi <address@hidden> wrote:
>>>>  
>>>>> * Halil Pasic <address@hidden> [2017-09-05 19:20:43 +0200]:
>>>>>  
>>>>>>
>>>>>>
>>>>>> On 09/05/2017 05:46 PM, Cornelia Huck wrote:    
>>>>>>> On Tue, 5 Sep 2017 17:24:19 +0200
>>>>>>> Halil Pasic <address@hidden> wrote:
>>>>>>>     
>>>>>>>> My problem with a program check (indicated by SCSW word 2 bit 10) is
>>>>>>>> that, in my reading of the architecture, the semantics behind it are:
>>>>>>>> the channel subsystem (not the cu or device) has detected that the
>>>>>>>> channel program (previously submitted as an ORB) is erroneous. Which
>>>>>>>> programs are erroneous is specified by the architecture. What we have
>>>>>>>> here does not qualify.
>>>>>>>>
>>>>>>>> My idea was to rather blame the virtual hardware (device) and put the
>>>>>>>> blame on neither the program nor the channel subsystem. This could be
>>>>>>>> done using device status (unit check with command reject, maybe unit
>>>>>>>> exception) or interface check. My train of thought was: the problem
>>>>>>>> is not consistent across a device type, so it has to be device
>>>>>>>> specific.
>>>>>>>
>>>>>>> Unit exception might be a better way to express what is happening here.
>>>>>>> At least, it moves us away from cc 1 and not towards cc 3 :)
>>>>>>>     
>>>>>>
>>>>>> I will do a follow-up patch pursuing unit exception.
>>>>>>     
>>>>>>>>
>>>>>>>> Of course, blaming the device could mislead the person encountering
>>>>>>>> the problem and make them believe it's a non-virtual hardware problem.
>>>>>>>>
>>>>>>>> As for the misleading part, I think the best we can do is log a
>>>>>>>> message indicating what really happened.
>>>>>>>
>>>>>>> Just document it in the code? If it doesn't happen with Linux as a
>>>>>>> guest, it is highly unlikely to be seen in the wild.
>>>>>>>     
>>>>>>
>>>>>>
>>>>>> Well we have two problems here:
>>>>>> 1) Unit exception can be already defined by the device type for the
>>>>>> command (reference: 
>>>>>> http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar110/2.6.10?DT=19920904110920).
>>>>>> I think this one is what you mean. And I agree that's best handled
>>>>>> with a comment in the code.
>>>>> Using unit check, with bit 3 of byte 0 of the sense data set to 1, to
>>>>> indicate an 'Equipment check', sounds a bit more proper than unit
>>>>> exception.  
>>>>
>>>> I don't agree: Equipment check sounds a lot more dire (and seems to
>>>> imply a malfunction). I like unit exception better.  
>>> Got the point. Fair enough!
>>>   
>>
>> I do see some benefit in doing unit check over unit exception. I just
>> kept quiet to see the discussion unfold. As already said, unit exception
>> seems to be something reserved for the device type to define in a more
>> or less arbitrary but unambiguous way. I agreed to use this, because
>> I trust Connie's assessment about it not really being used by the
>> devices in the wild (obviously nothing changed here).
>>
>> If we consider the semantics of unit check with command reject, it's
>> a surprisingly good match: basically the device detected a programming
>> error (which cannot be detected by the channel subsystem because it
>> is device (type) specific). For reference see:
>> http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar110/2.7.2.1?DT=19920904110920
>>
>> IMHO that's almost exactly what we have here: the channel program
>> is good from the perspective of the channel subsystem, but the device
>> can't deal with it. So we would not lie that the device is at fault
>> (which was Connie's concern initially), and we would not lie about
>> having a generally invalid channel program either (which was my concern).
>>
>> So how about a unit check with a command reject? (The only problem
>> I see is on the device vs device type plane -- but that ain't better
>> for unit exception.)
> 
> I don't know, it feels a bit weird if I look at the cases where I saw
> command reject in the wild before, even if it seems to agree with the
> architecture... but just a gut feeling.
> 

Then let's settle for unit exception for now. I will let this topic
(series) rest for a couple of days in favor of things like virtio-crypto
spec review, maybe IDA, and some other stuff. But I definitely intend
to pick this series up again.
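
For reference, here is a minimal sketch of where the options discussed in
this thread sit in the status information. It is only an illustration
under stated assumptions: the identifiers (DSTAT_*, CSTAT_*, SENSE0_*,
struct io_result and the report_* helpers) are hypothetical and not taken
from the QEMU sources, and the bit values reflect my reading of the
architecture (SCSW word 2 carries the device status in bits 0-7 and the
subchannel status in bits 8-15, with bit 0 being the most significant bit).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not QEMU's identifiers. */

/* Device status byte (SCSW word 2, bits 0-7). */
#define DSTAT_UNIT_CHECK       0x02  /* bit 6 */
#define DSTAT_UNIT_EXCEPTION   0x01  /* bit 7 */

/* Subchannel status byte (SCSW word 2, bits 8-15). */
#define CSTAT_PROGRAM_CHECK    0x20  /* word 2 bit 10 */

/* Sense byte 0, only meaningful together with unit check. */
#define SENSE0_COMMAND_REJECT  0x80  /* bit 0 */
#define SENSE0_EQUIPMENT_CHECK 0x10  /* bit 3 */

/* Mask for one bit of a 32-bit word, numbering bit 0 as the MSB. */
static uint32_t be_bit32(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(1) << (31 - bit);
}

/* The pieces of status a guest would look at in this discussion. */
struct io_result {
    uint8_t dstat;   /* device status */
    uint8_t cstat;   /* subchannel status */
    uint8_t sense0;  /* first sense byte, 0 if no unit check */
};

/* Option 1: blame the channel program (what I argued against). */
static struct io_result report_program_check(void)
{
    struct io_result r = { 0, CSTAT_PROGRAM_CHECK, 0 };
    return r;
}

/* Option 2: unit exception, meaning left to the device type. */
static struct io_result report_unit_exception(void)
{
    struct io_result r = { DSTAT_UNIT_EXCEPTION, 0, 0 };
    return r;
}

/* Option 3: unit check, qualified by sense data (e.g. command reject). */
static struct io_result report_unit_check(uint8_t sense0)
{
    struct io_result r = { DSTAT_UNIT_CHECK, 0, sense0 };
    return r;
}

/* Place both status bytes into the upper half of SCSW word 2. */
static uint32_t scsw_word2_status(struct io_result r)
{
    return ((uint32_t)r.dstat << 24) | ((uint32_t)r.cstat << 16);
}
```

With this layout, report_program_check() sets exactly word 2 bit 10, while
the two device-status options leave the subchannel status untouched; the
difference between command reject and equipment check is then only a
matter of which bit of sense byte 0 accompanies the unit check.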

Halil



