
Re: [Qemu-devel] [PATCH 0/3] exec: further refine address_space_get_iotl


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH 0/3] exec: further refine address_space_get_iotlb_entry()
Date: Mon, 5 Jun 2017 11:20:13 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Jun 02, 2017 at 05:51:07PM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 02, 2017 at 07:50:51PM +0800, Peter Xu wrote:
> > With the patch applied:
> > 
> >   [PATCH v3] exec: fix address_space_get_iotlb_entry page mask
> >   (already in Paolo's pull request but not yet merged)
> > 
> > Now we can have valid address masks. However it is still not ideal,
> > considering that the mask may not be aligned to guest page sizes. One
> > example would be when huge page is used in guest (please see commit
> > message in patch 1 for details). It applies to normal pages too. So we
> > not only need a valid address mask, we should make sure it is page
> > mask (for x86, it should be either 4K/2M/1G pages).
> 
> Why should we? To get better performance, right?

IMHO one point is performance; the other is how we should define the
IOTLB interface. My opinion is that it is better to have valid
masks.

> 
> > Patch 1+2 fixes the problem. Tested with both kernel net driver or
> > testpmd, on either 4K/2M pages, to make sure the page mask is correct.
> > 
> > Patch 3 is cherry picked from PT series, after fixing from 1+2, we'll
> > definitely want patch 3 now. Here's the simplest TCP streaming test
> > using vhost dmar and iommu=pt in guest:
> > 
> >   without patch 3:    12.0Gbps
> 
> And what happens without patches 1-2?

Without 1-2, performance is good. But I think it is hacky to have such
a good result (I explained why the performance is good in the VT-d PT
support thread with some logs)...

> 
> >   with patch 3:       33.5Gbps
> 
> This is the part I don't get. Patches 1-2 will return a bigger region to
> callers. The result should be better performance - instead it seems to
> slow down vhost for some reason and we need tricks to get
> performance back. What's going on?

Yes. The problem is that without patches 1-2 I think the code lacks
correctness. With correctness we lose performance, so I picked up
patch 3 as well.

Again, I think the first thing we need to settle is what the best
definition for an IOTLB entry should be (addr_mask or arbitrary length).
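
To make the two options concrete, here is a minimal sketch. It is not
the QEMU code and not what the patches do; is_pow2_mask() and
largest_page_mask() are hypothetical names used only for illustration.
It shows how an arbitrary-length range could be narrowed to a
naturally aligned, power-of-two mask around the translated address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if mask has the form 2^n - 1 (all low bits set), e.g. 0xfff. */
static bool is_pow2_mask(uint64_t mask)
{
    return (mask & (mask + 1)) == 0;
}

/*
 * Hypothetical helper: given an address inside the valid range
 * [start, start + len), return the largest mask of the form 2^n - 1
 * such that the naturally aligned block containing addr lies entirely
 * inside that range. Assumes start + len - 1 does not overflow.
 */
static uint64_t largest_page_mask(uint64_t addr, uint64_t start, uint64_t len)
{
    uint64_t mask = 0;
    uint64_t end;

    assert(len > 0 && addr >= start && addr - start < len);
    end = start + (len - 1);            /* last valid byte of the range */

    for (;;) {
        uint64_t next = (mask << 1) | 1;    /* try a block twice as big */
        uint64_t base = addr & ~next;       /* aligned start of that block */

        /* Stop once the aligned block would leave the valid range. */
        if (base < start || next > end - base) {
            break;
        }
        mask = next;
        if (mask == UINT64_MAX) {
            break;
        }
    }
    return mask;
}
```

With a 2M range at 0x200000 this yields 0x1fffff for any address
inside it, while a 12K range starting at 0x1000 (length - 1 = 0x2fff,
which is not a valid mask) is narrowed to an aligned block. Note the
result is only guaranteed to be an aligned power of two; restricting
it to the guest's real page sizes (4K/2M/1G on x86) would need an
extra step.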

Thanks,

-- 
Peter Xu


