qemu-devel
From: Stephen Checkoway
Subject: Re: [Qemu-devel] [PATCH v4 04/10] block/pflash_cfi02: Implement interleaved flash devices
Date: Mon, 24 Jun 2019 12:36:05 -0700


> On Jun 24, 2019, at 12:05, Philippe Mathieu-Daudé <address@hidden> wrote:
> 
>> On 6/22/19 2:25 PM, Philippe Mathieu-Daudé wrote:
>> Hi Stephen,
>> 
>> This series hasn't fallen through the cracks; however, it is taking me
>> longer than expected to review it.
>> 
>>> On 4/26/19 6:26 PM, Stephen Checkoway wrote:
>>> It's common for multiple narrow flash chips to be hooked up in parallel
>>> to support wider buses. For example, four 8-bit wide flash chips (x8)
>>> may be combined in parallel to produce a 32-bit wide device. Similarly,
>>> two 16-bit wide chips (x16) may be combined.
>>> 
>>> This commit introduces `device-width` and `max-device-width` properties,
>>> similar to pflash_cfi01, with the following meanings:
>>> - `width`: The width of the logical, qemu device (same as before);
>>> - `device-width`: The width of an individual flash chip, defaulting to
>>>  `width`; and
>>> - `max-device-width`: The maximum width of an individual flash chip,
>>>  defaulting to `device-width`.
>>> 
>>> Nothing needs to change to support reading such interleaved devices but
>>> commands (e.g., erase and programming) must be sent to all devices at
>>> the same time or else the various chips will be in different states.
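>>> 
>>> For example (an illustrative sketch, not code from this patch; the
>>> helper name is made up), firmware replicates a command byte across
>>> every lane of the bus so that all chips see it at once:
>>> 
>>>     /* Hypothetical helper: replicate an 8-bit command across x8 lanes. */
>>>     static uint64_t replicate_cmd(uint8_t cmd, unsigned num_lanes)
>>>     {
>>>         uint64_t value = 0;
>>> 
>>>         for (unsigned i = 0; i < num_lanes; i++) {
>>>             value |= (uint64_t)cmd << (8 * i);
>>>         }
>>>         /* e.g. cmd=0xAA with num_lanes=4 puts 0xAAAAAAAA on the bus */
>>>         return value;
>>>     }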
>> 
>> After giving this some thought, I'd rather we model how hardware manages
>> interleaved devices: do it at the bus level, and instantiate N devices
>> in an interleaved configuration.
>> I believe that would drastically reduce this device's complexity, and we
>> would match the real internal state machine.
>> This could also be reused by other parallel devices in such a configuration.
>> 
>>> For example, a 4-byte wide logical device can be composed of four x8/x16
>>> devices in x8 mode. That is, each device supports both x8 or x16 and
>>> they're being used in the byte, rather than word, mode. This
>>> configuration would have `width=4`, `device-width=1`, and
>>> `max-device-width=2`.
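>>> 
>>> For illustration only (a hypothetical helper, not part of the patch),
>>> the mapping from a bus byte offset to a chip and an offset within that
>>> chip works out as:
>>> 
>>>     static void lane_for_offset(hwaddr addr, unsigned width,
>>>                                 unsigned device_width,
>>>                                 unsigned *lane, hwaddr *chip_offset)
>>>     {
>>>         unsigned num_lanes = width / device_width;     /* 4 in this example */
>>> 
>>>         *lane = (addr / device_width) % num_lanes;     /* which chip */
>>>         *chip_offset = (addr / width) * device_width   /* word within chip */
>>>                        + (addr % device_width);        /* byte within word */
>>>     }
>>> 
>>> With `width=4` and `device-width=1`, bus bytes 0-3 hit chips 0-3 at chip
>>> offset 0, bytes 4-7 hit them at offset 1, and so on.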
>> 
>> 
>> I'm thinking of this draft:
>> 
>> FlashDevice # x8
>>  MemoryRegionOps
>>    .valid.max_access_size = 1
>> 
>> FlashDevice # x16
>>  MemoryRegionOps
>>    .valid.min_access_size = 2
>>    .valid.max_access_size = 2
>> 
>> FlashDevice # x8/x16
>>  MemoryRegionOps
>>    .valid.min_access_size = 1
>>    .valid.max_access_size = 2
>> 
>> We might use .impl.min_access_size = 2 and consider all NOR flash using
>> 16-bit words internally.
>>    .impl.max_access_size = 2 is implicit.
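>> 
>> Spelled out as a full MemoryRegionOps initializer, the x8/x16 case would
>> be something like this (pflash_read/pflash_write are placeholder callback
>> names, not functions from the patch):
>> 
>> static const MemoryRegionOps pflash_x8_x16_ops = {
>>     .read = pflash_read,
>>     .write = pflash_write,
>>     .endianness = DEVICE_NATIVE_ENDIAN,
>>     .valid = {
>>         .min_access_size = 1,   /* guest may issue byte accesses...   */
>>         .max_access_size = 2,   /* ...up to 16-bit accesses           */
>>     },
>>     .impl = {
>>         .min_access_size = 2,   /* device model works on 16-bit words */
>>         .max_access_size = 2,
>>     },
>> };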
>> 
>> So for your example we'd instantiate one:
>> 
>> InterleaverDevice
>>  Property
>>    .bus_width = 4 # 4-byte wide logical device, `width=4`
>>    .device_width = 1 # `device-width=1`
>>  MemoryRegionOps
>>    .valid.max_access_size = .bus_width # 4, set at realize()
>>    .impl.max_access_size = .device_width # 1, set at realize()
>> 
>> Then instantiate 4 pflash devices, and link them to the interleaver
>> using object_property_set_link().
>> 
>> typedef struct {
>>    SysBusDevice parent_obj;
>>    MemoryRegion iomem;
>>    char *name;
>>    /*
>>     * On a 64-bit wide bus we can have at most
>>     * 8 devices in 8-bit access mode.
>>     */
>>    MemoryRegion *device[8];
>>    unsigned device_count;
>>    unsigned device_index_mask;
>>    /* Properties */
>>    unsigned bus_width;
>>    unsigned device_width;
>> } InterleaverDeviceState;
>> 
>> static Property interleaver_properties[] = {
>>    DEFINE_PROP_LINK("device[0]", InterleaverDeviceState,
>>                     device[0],
>>                     TYPE_MEMORY_REGION, MemoryRegion *),
>>    ...
>>    DEFINE_PROP_LINK("device[7]", InterleaverDeviceState,
>>                     device[7],
>>                     TYPE_MEMORY_REGION, MemoryRegion *),
>>    DEFINE_PROP_END_OF_LIST(),
>> };
>> 
>> Then, before InterleaverDevice.realize() is called:
>> 
>> In the board's realize():
>> 
>> 
>>    for (i = 0; i < interleaved_devices; i++) {
>>        pflash[i] = create_pflash(...);
>>        ...
>>    }
>> 
>>    ild = ... create InterleaverDevice ...
>>    for (i = 0; i < interleaved_devices; i++) {
>>        char *propname = g_strdup_printf("device[%u]", i);
>> 
>> 
>>        /* The link expects the flash chip's MemoryRegion as its value. */
>>        object_property_set_link(OBJECT(ild),
>>                                 OBJECT(sysbus_mmio_get_region(
>>                                            SYS_BUS_DEVICE(pflash[i]), 0)),
>>                                 propname, &err);
>>        g_free(propname);
>>        ...
>>    }
>> 
>> Finally,
>> 
>> static void interleaved_realize(DeviceState *dev, Error **errp)
>> {
>>    InterleaverDeviceState *s = INTERLEAVER_DEVICE(dev);
>> 
>>    s->device_count = s->bus_width / s->device_width;
>>    s->device_index_mask = s->device_count - 1;
>>    ...
>> }
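>> 
>> (Worked example: with bus_width=4 and device_width=1 this gives
>> device_count=4 and device_index_mask=0x3, so the low two bits of the
>> lane index select one of the four chips.)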
>> 
>> static void interleaved_write(void *opaque, hwaddr offset,
>>                              uint64_t value, unsigned size)
>> {
>>    InterleaverDeviceState *s = INTERLEAVER_DEVICE(opaque);
>>    MemoryRegion *mr;
>> 
>>    /*
>>     * Since we set .impl.max_access_size = device_width,
>>     * access_with_adjusted_size() always calls this with
>>     * size = device_width.
>>     */
>>    /* Select the interleaved device (lane) this access belongs to... */
>>    mr = s->device[(offset / s->device_width) & s->device_index_mask];
>>    /* ...and rescale the offset to that device's address space. */
>>    offset = offset / s->bus_width * s->device_width;
>>    memory_region_dispatch_write(mr, offset, value, size,
>>                                 MEMTXATTRS_UNSPECIFIED);
>> }
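>> 
>> The read side would be symmetric (again just a sketch, untested):
>> 
>> static uint64_t interleaved_read(void *opaque, hwaddr offset,
>>                                  unsigned size)
>> {
>>     InterleaverDeviceState *s = INTERLEAVER_DEVICE(opaque);
>>     MemoryRegion *mr;
>>     uint64_t value = 0;
>> 
>>     /* Select the interleaved device (lane) this access belongs to... */
>>     mr = s->device[(offset / s->device_width) & s->device_index_mask];
>>     /* ...and rescale the offset to that device's address space. */
>>     offset = offset / s->bus_width * s->device_width;
>>     memory_region_dispatch_read(mr, offset, &value, size,
>>                                 MEMTXATTRS_UNSPECIFIED);
>>     return value;
>> }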
>> 
>> I'll try a PoC.
> 
> So I have a PoC, but then realized I cannot use the same flash dump...
> 
> I need to deinterleave it. This is easily fixed with a few lines of
> Python, but then if I want to store/share the dump (aka the 'backend
> storage') I have to re-interleave it.
> 
> I wonder if it would be possible/easy to add an 'interleave' option to
> blockdev to be able to have 2 pflash devices sharing the same backend...
> Is it worthwhile? Kevin/Markus/Max any thought on this?

Hi Phil,

Sorry for the delay, I’ve been traveling.

I considered something like this approach and I think it could work. 
Ultimately, I opted not to go that route for a few reasons:
- duplicated CFI tables and other state waste (a small amount of) memory when 
the flash chips are the same (the usual case in my limited experience)
- it adds an extra layer of read/write calls plus recombining from/splitting 
into the component parts
- duplicated timers firing to walk the programming/erasing state machine 
forward for each chip
- the firmware or data stored in the chips is likely already interleaved, 
necessitating either splitting it up before use or adding functionality to a 
lower layer to split it (as you’ve suggested here)

None of the above seems like a big deal, separately or together, but I didn’t 
find the main advantage of this approach, namely that it allows using a 
heterogeneous set of flash chips to implement a single logical device, 
sufficiently compelling to justify it.

Nevertheless, if that’s the route you think is best, I have no objections.

Cheers,

Steve

> 
> Thanks,
> 
> Phil.
> 
>>> In addition to commands being sent to all devices, guest firmware
>>> expects the status and CFI queries to be replicated for each device.
>>> (The one exception to the response replication is that each device gets
>>> to report its own status bit DQ7 while programming because its value
>>> depends on the value being programmed, which will usually differ for each
>>> device.)
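>>> 
>>> (Sketch of that idea, not code from the patch, assuming x8 lanes and a
>>> made-up helper name: while programming, DQ7 reads back as the complement
>>> of bit 7 of the data being programmed, so each lane contributes its own
>>> bit to the replicated status.)
>>> 
>>>     static uint64_t programming_status(const uint8_t *lane_data,
>>>                                        unsigned num_lanes)
>>>     {
>>>         uint64_t status = 0;
>>> 
>>>         for (unsigned i = 0; i < num_lanes; i++) {
>>>             /* DQ7 is the complement of the bit being programmed. */
>>>             uint8_t dq7 = (uint8_t)~lane_data[i] & 0x80;
>>>             status |= (uint64_t)dq7 << (8 * i);
>>>         }
>>>         return status;
>>>     }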
>>> 
>>> Testing is limited to 16-bit wide devices due to the current inability
>>> to override the properties set by `pflash_cfi02_register`, but multiple
>>> configurations are tested.
>>> 
>>> Stop using global_qtest. Instead, package the qtest variable inside the
>>> FlashConfig structure.
>>> 
>>> Signed-off-by: Stephen Checkoway <address@hidden>
>>> Acked-by: Thomas Huth <address@hidden>
>>> ---
>>> hw/block/pflash_cfi02.c   | 270 +++++++++++++++------
>>> tests/pflash-cfi02-test.c | 476 ++++++++++++++++++++++++++++++--------
>>> 2 files changed, 576 insertions(+), 170 deletions(-)


