Re: [PATCH v5 8/8] hw/mem/cxl_type3: Add CXL RAS Error Injection Support
From: Markus Armbruster
Subject: Re: [PATCH v5 8/8] hw/mem/cxl_type3: Add CXL RAS Error Injection Support
Date: Fri, 27 Oct 2023 06:54:39 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

I'm trying to fill in QMP documentation holes, and found one in commit
415442a1b4a (this patch). Details inline.

Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:

> CXL uses PCI AER Internal errors to signal to the host that an error has
> occurred. The host can then read more detailed status from the CXL RAS
> capability.
>
> For uncorrectable errors: support multiple injection in one operation
> as this is needed to reliably test multiple header logging support in an
> OS. The equivalent feature doesn't exist for correctable errors, so only
> one error need be injected at a time.
>
> Note:
> - Header content needs to be manually specified in a fashion that
> matches the specification for what can be in the header for each
> error type.
>
> Injection via QMP:
> { "execute": "qmp_capabilities" }
> ...
> { "execute": "cxl-inject-uncorrectable-errors",
> "arguments": {
> "path": "/machine/peripheral/cxl-pmem0",
> "errors": [
> {
> "type": "cache-address-parity",
> "header": [ 3, 4]
> },
> {
> "type": "cache-data-parity",
> "header":
> [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]
> },
> {
> "type": "internal",
> "header": [ 1, 2, 4]
> }
> ]
> }}
> ...
> { "execute": "cxl-inject-correctable-error",
> "arguments": {
> "path": "/machine/peripheral/cxl-pmem0",
> "type": "physical"
> } }
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
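
For anyone scripting the two commands above, here is a minimal sketch that
builds the QMP payloads and checks them against the enum members quoted
further down.  The helper names (uncorrectable_errors_command,
correctable_error_command) are hypothetical and not part of QEMU; the sketch
only produces JSON and does not talk to a QMP socket.

```python
import json

# Enum members copied from the qapi/cxl.json hunk quoted in this mail.
UNCOR_ERROR_TYPES = frozenset({
    "cache-data-parity", "cache-address-parity", "cache-be-parity",
    "cache-data-ecc", "mem-data-parity", "mem-address-parity",
    "mem-be-parity", "mem-data-ecc", "reinit-threshold", "rsvd-encoding",
    "poison-received", "receiver-overflow", "internal",
    "cxl-ide-tx", "cxl-ide-rx",
})
COR_ERROR_TYPES = frozenset({
    "cache-data-ecc", "mem-data-ecc", "crc-threshold", "retry-threshold",
    "cache-poison-received", "mem-poison-received", "physical",
})

UINT32_MAX = 0xFFFFFFFF


def uncorrectable_errors_command(path, records):
    """Build a cxl-inject-uncorrectable-errors command (hypothetical helper).

    records is an iterable of (type, header) pairs, where header is a
    sequence of integers that must each fit in a uint32.
    """
    errors = []
    for error_type, header in records:
        if error_type not in UNCOR_ERROR_TYPES:
            raise ValueError(f"unknown CxlUncorErrorType: {error_type!r}")
        if any(not 0 <= dword <= UINT32_MAX for dword in header):
            raise ValueError(f"header value out of uint32 range: {header!r}")
        errors.append({"type": error_type, "header": list(header)})
    return {
        "execute": "cxl-inject-uncorrectable-errors",
        "arguments": {"path": path, "errors": errors},
    }


def correctable_error_command(path, error_type):
    """Build a cxl-inject-correctable-error command (hypothetical helper)."""
    if error_type not in COR_ERROR_TYPES:
        raise ValueError(f"unknown CxlCorErrorType: {error_type!r}")
    return {
        "execute": "cxl-inject-correctable-error",
        "arguments": {"path": path, "type": error_type},
    }


if __name__ == "__main__":
    dev = "/machine/peripheral/cxl-pmem0"
    print(json.dumps(uncorrectable_errors_command(
        dev, [("cache-address-parity", [3, 4]), ("internal", [1, 2, 4])])))
    print(json.dumps(correctable_error_command(dev, "physical")))
```

The printed lines can be written to a QMP monitor after the
qmp_capabilities handshake.
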
[...]
> diff --git a/qapi/cxl.json b/qapi/cxl.json
> new file mode 100644
> index 0000000000..ac7e167fa2
> --- /dev/null
> +++ b/qapi/cxl.json
> @@ -0,0 +1,118 @@
> +# -*- Mode: Python -*-
> +# vim: filetype=python
> +
> +##
> +# = CXL devices
> +##
> +
> +##
> +# @CxlUncorErrorType:
> +#
> +# Type of uncorrectable CXL error to inject. These errors are reported via
> +# an AER uncorrectable internal error with additional information logged at
> +# the CXL device.
> +#
> +# @cache-data-parity: Data error such as data parity or data ECC error CXL.cache
> +# @cache-address-parity: Address parity or other errors associated with the
> +# address field on CXL.cache
> +# @cache-be-parity: Byte enable parity or other byte enable errors on CXL.cache
> +# @cache-data-ecc: ECC error on CXL.cache
> +# @mem-data-parity: Data error such as data parity or data ECC error on CXL.mem
> +# @mem-address-parity: Address parity or other errors associated with the
> +# address field on CXL.mem
> +# @mem-be-parity: Byte enable parity or other byte enable errors on CXL.mem.
> +# @mem-data-ecc: Data ECC error on CXL.mem.
> +# @reinit-threshold: REINIT threshold hit.
> +# @rsvd-encoding: Received unrecognized encoding.
> +# @poison-received: Received poison from the peer.
> +# @receiver-overflow: Buffer overflows (first 3 bits of header log indicate which)
> +# @internal: Component specific error
> +# @cxl-ide-tx: Integrity and data encryption tx error.
> +# @cxl-ide-rx: Integrity and data encryption rx error.
> +##
> +
> +{ 'enum': 'CxlUncorErrorType',
> + 'data': ['cache-data-parity',
> + 'cache-address-parity',
> + 'cache-be-parity',
> + 'cache-data-ecc',
> + 'mem-data-parity',
> + 'mem-address-parity',
> + 'mem-be-parity',
> + 'mem-data-ecc',
> + 'reinit-threshold',
> + 'rsvd-encoding',
> + 'poison-received',
> + 'receiver-overflow',
> + 'internal',
> + 'cxl-ide-tx',
> + 'cxl-ide-rx'
> + ]
> + }
> +
> +##
> +# @CXLUncorErrorRecord:
> +#
> +# Record of a single error including header log.
> +#
> +# @type: Type of error
> +# @header: 16 DWORD of header.
> +##
> +{ 'struct': 'CXLUncorErrorRecord',
> + 'data': {
> + 'type': 'CxlUncorErrorType',
> + 'header': [ 'uint32' ]
> + }
> +}
> +
> +##
> +# @cxl-inject-uncorrectable-errors:
> +#
> +# Command to allow injection of multiple errors in one go. This allows testing
> +# of multiple header log handling in the OS.
> +#
> +# @path: CXL Type 3 device canonical QOM path
> +# @errors: Errors to inject
> +##
> +{ 'command': 'cxl-inject-uncorrectable-errors',
> + 'data': { 'path': 'str',
> + 'errors': [ 'CXLUncorErrorRecord' ] }}
> +
> +##
> +# @CxlCorErrorType:
> +#
> +# Type of CXL correctable error to inject
> +#
> +# @cache-data-ecc: Data ECC error on CXL.cache
> +# @mem-data-ecc: Data ECC error on CXL.mem

Missing:

    # @retry-threshold: ...

I need suitable description text.  Can you help me?

> +# @crc-threshold: Component specific and applicable to 68 byte Flit mode only.
> +# @cache-poison-received: Received poison from a peer on CXL.cache.
> +# @mem-poison-received: Received poison from a peer on CXL.mem
> +# @physical: Received error indication from the physical layer.
> +##
> +{ 'enum': 'CxlCorErrorType',
> + 'data': ['cache-data-ecc',
> + 'mem-data-ecc',
> + 'crc-threshold',
> + 'retry-threshold',
> + 'cache-poison-received',
> + 'mem-poison-received',
> + 'physical']
> +}
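
The hole is easy to spot mechanically.  As a rough sketch (this is not how
QEMU's QAPI generator parses doc comments; undocumented_members is a
hypothetical helper), one can compare the enum's members with the '@name:'
lines of its doc comment:

```python
import re


def undocumented_members(doc_comment, members):
    """Return the members that have no '# @name:' line in doc_comment."""
    documented = set(
        re.findall(r"^#\s*@([\w-]+):", doc_comment, flags=re.MULTILINE))
    return [m for m in members if m not in documented]


# Doc comment and member list as quoted in this mail.
DOC = """\
##
# @CxlCorErrorType:
#
# Type of CXL correctable error to inject
#
# @cache-data-ecc: Data ECC error on CXL.cache
# @mem-data-ecc: Data ECC error on CXL.mem
# @crc-threshold: Component specific and applicable to 68 byte Flit mode only.
# @cache-poison-received: Received poison from a peer on CXL.cache.
# @mem-poison-received: Received poison from a peer on CXL.mem
# @physical: Received error indication from the physical layer.
##
"""

MEMBERS = [
    "cache-data-ecc",
    "mem-data-ecc",
    "crc-threshold",
    "retry-threshold",
    "cache-poison-received",
    "mem-poison-received",
    "physical",
]

if __name__ == "__main__":
    print(undocumented_members(DOC, MEMBERS))  # prints ['retry-threshold']
```
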
> +
> +##
> +# @cxl-inject-correctable-error:
> +#
> +# Command to inject a single correctable error. Multiple error injection
> +# of this error type is not interesting as there is no associated header log.
> +# These errors are reported via AER as a correctable internal error, with
> +# additional detail available from the CXL device.
> +#
> +# @path: CXL Type 3 device canonical QOM path
> +# @type: Type of error.
> +##
> +{ 'command': 'cxl-inject-correctable-error',
> + 'data': { 'path': 'str',
> + 'type': 'CxlCorErrorType'
> + }
> +}
[...]