Re: [qemu-web PATCH] Add a blog post about zoned storage emulation
From: Sam Li
Date: Fri, 18 Nov 2022 08:33:10 +0800

On Fri, 18 Nov 2022 at 03:12, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> Hi Sam,
> Please send a git repo URL so Thomas can fetch the commit without
> email/file size limitations.
I'll push it to the zbd branch after fixing the issues below.
https://github.com/sgzerolc/qemu-web/zbd
>
> > diff --git a/_posts/2022-11-17-zoned-emulation.md
> > b/_posts/2022-11-17-zoned-emulation.md
> > new file mode 100644
> > index 0000000..69ce4d7
> > --- /dev/null
> > +++ b/_posts/2022-11-17-zoned-emulation.md
> > @@ -0,0 +1,45 @@
> > +---
> > +layout: post
> > +title: "Introduction to Zoned Storage Emulation"
> > +date: 2022-11-17
> > +author: Sam Li
> > +categories: [storage, gsoc, outreachy, internships]
> > +---
> > +
> > +## Zoned block devices
> > +
> > +Aimed for at-scale data infrastructures,
>
> I don't know what at-scale data infrastructure is. Is it something
> readers can relate to? Otherwise there's a risk that readers will
> decide this doesn't apply to them and stop reading.
Yes, I'll remove it.
>
> > zoned block devices (ZBDs) divide the LBA space into block regions called
> > zones that are larger than the LBA size.
>
> LBA is not defined and also not used again after this sentence.
> Readers will be familiar with disks but may not know what an LBA is.
> Since the concept isn't used again I suggest dropping it:
>
> zoned block devices (ZBDs) are divided into regions called zones
> that can only be written sequentially.
>
> > By only allowing sequential writes, it can reduce write amplification in
> > SSDs,
>
> This sounds more natural:
>
> By only allowing sequential writes, SSD write amplification can be reduced
>
> It might also be nice to provide a little bit of extra context:
>
> ... reduced by eliminating the need for a <a
> href="https://en.wikipedia.org/wiki/Flash_translation_layer">Flash
> Translation Layer</a>
>
> > and potentially lead to higher throughput and increased capacity. Providing
> > new storage software stack,
>
> s/Providing new/Providing a new/
>
> > zoned storage concept is standardized as ZBC(SCSI standard), ZAC(ATA
> > standard), ZNS(NVMe).
>
> Small tweaks:
>
> zoned storage concepts are standardized in ZBC (SCSI standard), ZAC
> (ATA standard), ZNS (NVMe).
>
> There is a space before opening parentheses: hello (world) instead of
> hello(world). Please check the rest of the article for more instances
> of this.
>
> It would be nice to include links but I didn't find good pages for
> ZBC/ZAC/ZNS aside from the full standards that they are part of.
>
> This intro section would be a good place to link to https://zonedstorage.io/!
Good idea! The Zoned Storage site also has a brief introduction to those standards:
https://zonedstorage.io/docs/introduction/smr#governing-standards
https://zonedstorage.io/docs/introduction/zns
>
> > Meanwhile, the virtio protocol for block devices(virtio-blk) should also be
> > aware of ZBDs instead of taking them as regular block devices. It should be
> > able to pass such devices through to the guest. An overview of necessary
> > work is as follows:
> > +
> > +1. Virtio protocol: [extend virtio-blk protocol with main zoned storage
> > concept](https://lwn.net/Articles/914377/), Dmitry Fomichev
> > +2. Linux: [implement the virtio specification
> > extensions](https://www.spinics.net/lists/linux-block/msg91944.html),
> > Dmitry Fomichev
> > +3. QEMU: add zoned emulation support to virtio-blk, Sam Li, [Outreachy
> > 2022
> > project](https://wiki.qemu.org/Internships/ProjectIdeas/VirtIOBlkZonedBlockDevices)
>
> You could split the QEMU work into 2 points if you like:
> 3. QEMU: add zoned storage APIs to the block layer, Sam Li
> 4. QEMU: implement zoned storage support in virtio-blk emulation, Sam Li
>
> > +
> > +<img src="/screenshots/zbd.png" alt="zbd" style="zoom:50%;" />
> > +
> > +## Zoned emulation
> > +
> > +Currently, QEMU can support zoned devices by virtio-scsi or PCI device
> > passthrough. It needs to specify the device type it is talking to. While
> > storage controller emulation uses block layer APIs instead of directly
> > accessing disk images. Extending virtio-blk emulation avoids code
> > duplication and simplify the support by hiding the device types under a
> > unified zoned storage interface, simplifying VM deployment for different
> > type of zoned devices.
>
> Other advantages that come to mind:
> 1. virtio-blk can be implemented in hardware. If those devices wish to
> follow the zoned storage model then the virtio-blk specification needs
> to natively support zoned storage.
> 2. Individual NVMe namespaces or anything that is a zoned Linux block
> device can be exposed to the guest without passing through a full
> device.
Thanks!
>
> > +
> > +For zoned storage emulation, zoned storage APIs support three zoned
> > models(conventional, host-managed, host-aware) , four zone management
> > commands(Report Zone, Open Zone, Close Zone, Finish Zone), and Append Zone.
> > QEMU block storage
>
> Maybe:
> s/QEMU block storage/The QEMU block layer/
>
> > has a BlockDriverState graph that propagates device information inside
> > block layer. A root pointer at BlockBackend points to the graph. There are
> > three type of block driver nodes: filter node, format node, protocol node.
> > File-posix driver is the lowest level within the graph where zoned storage
> > APIs reside.
>
> Is it possible to remove "A root pointer at BlockBackend points to the
> graph. There are three type of block driver nodes: filter node, format
> node, protocol node." so there are fewer new concepts? I didn't see
> further use of BlockBackend or filter/format nodes in the text.
Yes, it can be removed.
>
> > +
> > +<img src="/screenshots/storage_overview.png" alt="storage_overview"
> > style="zoom: 50%;" />
> > +
> > +After receiving the block driver states, Virtio-blk emulation recognizes
> > zoned devices and sends the zoned feature bit to guest. Then the guest can
> > see the zoned device in the host. When the guest executes zoned operations,
> > virtio-blk driver issues corresponding requests that will be captured by
> > virito-blk
>
> s/virito/virtio/
>
> > device inside QEMU. Afterwards, virtio-blk device sends the requests to
> > file-posix driver which will perform zoned operations.
> > +
> > +Unlike zone management operations, Linux doesn't have a user API
>
> The Linux userspace API (<linux/blkzoned.h>) hasn't been mentioned
> before. Maybe the previous paragraph should explain that file-posix
> performs zoned operations using <linux/blkzoned.h> ioctls? Then this
> sentence will be easier to understand.
>
> > to issue zone append requests to zoned devices from user space. With the
> > help of write pointer emulation tracking locations of write pointer of each
> > zone, QEMU block layer performs append writes by modifying regular writes.
> > Write pointer locks guarantee the execution of requests. Upon failure it
> > must not update the write pointer location which is only got updated when
> > the request is successfully finished.
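To make the write pointer emulation in the paragraph above easier to picture, here is a minimal Python sketch of the idea. It is not QEMU code: the class `EmulatedZone`, the exception `ZoneFullError`, and the `do_write` callback are hypothetical names chosen for illustration. It assumes one lock per zone and a write pointer that advances only after the underlying write succeeds.

```python
import threading


class ZoneFullError(Exception):
    """Raised when an append does not fit in the remaining zone space."""


class EmulatedZone:
    """One sequential-write zone with an emulated write pointer (hypothetical model)."""

    def __init__(self, start, capacity):
        self.start = start            # first sector of the zone
        self.capacity = capacity      # writable sectors in the zone
        self.wp = start               # emulated write pointer
        self.lock = threading.Lock()  # serializes appends to this zone

    def append(self, nr_sectors, do_write):
        """Turn a zone append into a regular write at the write pointer.

        do_write(sector, nr_sectors) stands in for the real write and is
        expected to raise OSError on failure. Returns the sector at which
        the data was written.
        """
        with self.lock:
            if self.wp + nr_sectors > self.start + self.capacity:
                raise ZoneFullError("append exceeds zone capacity")
            sector = self.wp
            do_write(sector, nr_sectors)  # may raise; wp is then left untouched
            self.wp += nr_sectors         # advance only after success
            return sector
```

A caller passes only the zone and the length and learns the written location from the return value, which is the main difference from a regular write. If `do_write` raises, `wp` keeps its old value, matching the rule that the write pointer is updated only when the request finishes successfully.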
> > +
> > +Problems can always be sovled
>
> s/sovled/solved/
Thanks for your comments,
Sam