qemu-devel

Re: [Qemu-devel] [PATCH v8 1/2] virtio-crypto: Add virtio crypto device specification


From: Alexander Graf
Subject: Re: [Qemu-devel] [PATCH v8 1/2] virtio-crypto: Add virtio crypto device specification
Date: Fri, 2 Sep 2016 10:06:33 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0


On 02.09.16 05:08, Gonglei (Arei) wrote:
> Hi Alex,
> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:address@hidden
>> Sent: Thursday, September 01, 2016 9:37 PM
>> Subject: Re: [PATCH v8 1/2] virtio-crypto: Add virtio crypto device 
>> specification
>>
>> On 08/30/2016 02:12 PM, Gonglei wrote:
>>> The virtio crypto device is a virtual crypto device (i.e. a hardware
>>> crypto accelerator card). The virtio crypto device can provide
>>> seven crypto services: CIPHER, MAC, HASH, AEAD, KDF, ASYM, PRIMITIVE.
>>>
>>> In this patch, CIPHER, MAC, HASH, AEAD services are introduced.
>>
>> I have mostly a few high level comments.
>>
>> For starters, a lot of the structs rely on the compiler to pad them to
>> natural alignment. That may get us into trouble when trying to emulate a
>> virtio device on a host with different guest architecture (like arm on
>> x86). It'd be a good idea to manually pad everything to be 64bit aligned
>> - then all fields are always at the same spot.
>>
> Good point! I'll do this in the next version. Thanks!
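
To illustrate the padding point, here is a minimal sketch. The struct and
its field names are made up for this example and are not taken from the
patch. Without the explicit pad fields, a 32-bit x86 compiler would
typically place session_id at offset 4 while x86_64 places it at offset 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout, NOT from the patch. Every 64-bit field
 * sits on an 8-byte boundary and the total size is a multiple of 8,
 * so no compiler has any implicit padding left to insert. */
struct ex_padded_req {
    uint32_t opcode;      /* offset 0 */
    uint32_t padding0;    /* offset 4: explicit pad before the 64-bit field */
    uint64_t session_id;  /* offset 8 */
    uint32_t flags;       /* offset 16 */
    uint32_t padding1;    /* offset 20: explicit tail pad, size becomes 24 */
};

/* Compile-time checks: the build fails if a compiler lays this out differently. */
static_assert(offsetof(struct ex_padded_req, session_id) == 8,
              "session_id must be 8-byte aligned");
static_assert(sizeof(struct ex_padded_req) == 24,
              "struct size must be a multiple of 8");
```

The static asserts make the layout part of the contract, so a layout
mismatch shows up at build time rather than as a guest/host bug.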
> 
>> I also have a hard time getting my head around the final flow of
>> everything. Do I always have to create a session to be able to emit a
>> command? In that case, doesn't that slow down everything, since a
>> request would then need to wait for the host reply to receive its
>> session id? There should be some way to fire off a simple non-iv
>> operation without any session set up imho.
>>
> For symmetric algorithms, we'd better create a session before executing
> encryption or decryption, because the session is usually kept for a
> specific algorithm with a specific key in a production environment. And
> if we only change the iv, we don't need to re-create the session.

I think we have a slight misunderstanding here :)

The current workflow is

  -> create session
  <- session id
  -> data in
  <- data out
  ...
  -> close session
  <- ack

That means that for the first packet you pay at least one full round trip,
from guest to host to guest, before you can send any data.

That sounds pretty expensive to me on the latency side. There are
multiple ways to mitigate that. One idea was to have a separate path in
parallel to the create session + data + close session dance that would
combine them all into a single command. You would still have the session
based version, but accelerate the one-blob case.
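
A rough sketch of what such a combined request could look like, purely as
an assumption for discussion. The struct, its fields and the helper are
hypothetical; nothing like this is in the patch. Fields are shown in host
byte order for brevity, on the wire they would be le32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-shot request header. The key and the data travel in
 * the same request, so no session id has to come back from the host
 * before the guest can submit data. */
struct ex_oneshot_hdr {
    uint32_t opcode;    /* a made-up "one-shot encrypt" style opcode */
    uint32_t algo;      /* algorithm selector */
    uint32_t key_len;   /* bytes of key following the header */
    uint32_t data_len;  /* bytes of data following the key */
};

/* Total request size: header, then key, then data. The sum of a small
 * constant and two 32-bit values always fits in 64 bits. */
static uint64_t ex_oneshot_total_len(const struct ex_oneshot_hdr *h)
{
    return (uint64_t)sizeof(*h) + h->key_len + h->data_len;
}
```

The session based requests would stay as they are; this would only be a
shortcut for the one-blob case.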

Another idea would be to make the guest be the session id janitor. Then
you could just do

  -> create session with key X
  -> data in
  <- data out
  ...

so you save the round trip, if you combine command and data queues, as
then the create and data bits are serialized by their position in the queue.
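
As a toy model of that ordering argument, assuming a single combined
queue that the host drains in order: all names below are hypothetical,
and the XOR is only a stand-in for a real cipher.

```c
#include <stddef.h>
#include <stdint.h>

enum { EX_OP_CREATE = 1, EX_OP_DATA = 2 };
enum { EX_MAX_SESSIONS = 16 };

/* One queue entry. The session id is picked by the guest. */
struct ex_req {
    uint32_t opcode;
    uint32_t session_id;
    uint8_t  byte;        /* CREATE: the key, DATA: the payload */
};

/* Host-side session table for one queue. */
struct ex_host {
    uint8_t key[EX_MAX_SESSIONS];
    uint8_t in_use[EX_MAX_SESSIONS];
};

/* Handle one request in queue order. Returns 0 on success, -1 on error.
 * For DATA requests the result is written to *out. */
static int ex_host_handle(struct ex_host *h, const struct ex_req *r,
                          uint8_t *out)
{
    if (r->session_id >= EX_MAX_SESSIONS) {
        return -1;
    }
    switch (r->opcode) {
    case EX_OP_CREATE:
        if (h->in_use[r->session_id]) {
            return -1;                    /* id already taken */
        }
        h->key[r->session_id] = r->byte;
        h->in_use[r->session_id] = 1;
        return 0;
    case EX_OP_DATA:
        if (!h->in_use[r->session_id]) {
            return -1;                    /* no CREATE seen earlier in the queue */
        }
        *out = r->byte ^ h->key[r->session_id];  /* placeholder, not a real cipher */
        return 0;
    default:
        return -1;
    }
}
```

Because the guest picks the id, it can queue the CREATE and the first DATA
request back to back without waiting for a reply in between.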

> 
> For the asymmetric algorithms, we don't need to create a session IIRC.
> 
> So, I don't think this is a performance degradation, but a performance
> enhancement.
> 
>> Also, I don't fully understand the split between control and data
>> queues. As far as I read things, the control queue is used to create
>> session ids and the data queues can then be used to push data. Is there
>> any particular reason for the split? One thing that seems natural to me
>> would be to have sessions be per-queue, so you would create a session on
>> a particular queue and only have it ever be available there. That way
>> you get rid of any locking for sessions.
>>
> We want to keep a unified request type (structure) for the data queue, so
> we can keep the session operations in the control queue. Of course the
> control queue is only used to create sessions currently, but we can extend
> its functions if needed in the future.

I still don't understand. With separate control+data queues you just get
yourself into synchronization troubles. Both struct
virtio_crypto_ctrl_header and struct virtio_crypto_op_header already
have an opcode as first le32 field. You can easily use that to determine
both length of the payload as well as command (control vs data).
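
Something along these lines, as a sketch only. The opcode values and all
names are invented for the example; the real ones would come from the
spec. The only thing taken from the patch is that both headers start
with an le32 opcode.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented opcode values for illustration. */
enum {
    EX_CTRL_CREATE_SESSION  = 0x001,
    EX_CTRL_DESTROY_SESSION = 0x002,
    EX_DATA_ENCRYPT         = 0x100,
    EX_DATA_DECRYPT         = 0x101
};

enum ex_kind { EX_KIND_INVALID, EX_KIND_CTRL, EX_KIND_DATA };

/* Read a little-endian 32-bit value independent of host byte order. */
static uint32_t ex_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Classify a request taken from a single shared queue by looking only
 * at its leading le32 opcode. */
static enum ex_kind ex_classify(const uint8_t *buf, size_t len)
{
    if (len < 4) {
        return EX_KIND_INVALID;   /* too short to even hold an opcode */
    }
    switch (ex_read_le32(buf)) {
    case EX_CTRL_CREATE_SESSION:
    case EX_CTRL_DESTROY_SESSION:
        return EX_KIND_CTRL;
    case EX_DATA_ENCRYPT:
    case EX_DATA_DECRYPT:
        return EX_KIND_DATA;
    default:
        return EX_KIND_INVALID;
    }
}
```

The expected header length could be looked up from the opcode in the same
switch.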

You could then also completely get rid of the "queue_id" fields, as any
operation would only ever operate on the queue it's running on.


Alex


