Re: [PATCH v4 09/19] hw/arm/smmu-common: Rework TLB lookup for nesting
From: Jean-Philippe Brucker
Subject: Re: [PATCH v4 09/19] hw/arm/smmu-common: Rework TLB lookup for nesting
Date: Tue, 9 Jul 2024 18:13:45 +0100
On Tue, Jul 09, 2024 at 07:14:19AM +0000, Mostafa Saleh wrote:
> Hi Jean,
>
> On Thu, Jul 04, 2024 at 07:12:35PM +0100, Jean-Philippe Brucker wrote:
> > On Mon, Jul 01, 2024 at 11:02:31AM +0000, Mostafa Saleh wrote:
> > > In the next patch, combine_tlb() will be added which combines 2 TLB
> > > entries into one for nested translations, which chooses the granule
> > > and level from the smallest entry.
> > >
> > > This means that with nested translation, an entry can be cached with
> > > the granule of stage-2 and not stage-1.
> > >
> > > However, currently, the lookup for an IOVA is done with input stage
> > > granule, which is stage-1 for nested configuration, which will not
> > > work with the above logic.
> > > This patch reworks lookup in that case, so it falls back to stage-2
> > > granule if no entry is found using stage-1 granule.
> >
> > Why not initialize tt_combined to the minimum granule of stages 1 and 2?
> > It looks like you introduced it for this. I'm wondering if we lookup the
> > wrong IOVA if changing the granule size after the address is masked in
> > smmu_translate()
>
> I am not sure I fully understand, but I don’t think that would work as it is
> not guaranteed that the minimum granule is the one that would be cached,
> as we might hit block mappings.
>
> The IOVA at first is masked with the first stage mask for the expected page
> address, and the lookup logic would mask the address for each level look up,
> so It should match the alignment of the cached page of that granule and level,
> and as the combine logic is done with the aligned_addr it is guaranteed by
> construction that it has to be aligned with stage-1.
I missed something, this is what I had in mind initially:
* s1 granule is 64k, s2 granule is 4k
* the tlb already contains a translation for IOVA 0x30000, tg=4k
* now we look up IOVA 0x31000. Masked with the s1 granule, aligned_addr is
  0x30000. Not found at first because lookup is with tg=64k, but then we
  call smmu_iotlb_lookup_all_levels() again with the s2 granule and the
  same IOVA, which returns the wrong translation
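To make the arithmetic in that scenario concrete, here is a minimal sketch. The helper names are hypothetical and are not QEMU functions; granule sizes are given as log2 values, as granule_sz is in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: align addr down to a granule of (1 << granule_sz) bytes. */
static uint64_t align_down(uint64_t addr, unsigned granule_sz)
{
    assert(granule_sz < 64);
    return addr & ~((UINT64_C(1) << granule_sz) - 1);
}

/*
 * Key used when the IOVA is first masked with the s1 granule and the
 * lookup is then retried with a different (here smaller) granule.
 */
static uint64_t key_with_premask(uint64_t iova, unsigned s1_granule_sz,
                                 unsigned lookup_granule_sz)
{
    uint64_t aligned_addr = align_down(iova, s1_granule_sz);
    return align_down(aligned_addr, lookup_granule_sz);
}

/* Key used when only the lookup itself masks the IOVA. */
static uint64_t key_without_premask(uint64_t iova, unsigned lookup_granule_sz)
{
    return align_down(iova, lookup_granule_sz);
}
```

With s1 = 64k (16) and s2 = 4k (12), key_with_premask(0x31000, 16, 12) gives 0x30000, which aliases the entry cached for IOVA 0x30000, while key_without_premask(0x31000, 12) gives 0x31000.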
But it's not actually possible, because if cfg->stage == SMMU_NESTED, then
in smmu_translate() we end up with
    } else {
        /* Stage2. */
        tt_combined.granule_sz = cfg->s2cfg.granule_sz;
So I think the condition
    (cfg->stage == SMMU_NESTED) && (cfg->s2cfg.granule_sz != tt->granule_sz)
in this patch is never true?
Then the following scenario:
* s1 granule is 4k, s2 granule is 64k
* we look up IOVA A, miss. The translation gets cached with granule 4k
* we look up IOVA A again, but with tt->granule_sz = 64k so we'll
  never find the entry?
I guess we want to start the lookup with the smallest granule, and then if
the s1 and s2 granules differ, retry with the other one. Or with
SMMU_NESTED, start with the s1 granule and keep this patch to fall back to
s2 granule, but without masking the IOVA in smmu_translate() (it will be
masked correctly by smmu_iotlb_lookup_all_levels()).
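A rough sketch of the first option (smallest granule first, then the other one), under heavy assumptions: ToyEntry and the toy_* functions are made up for illustration, the TLB is a flat array, and the per-level walk done by smmu_iotlb_lookup_all_levels() is left out entirely. The IOVA is passed unmasked and masked only inside the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy TLB entry: one entry per (aligned IOVA, granule). */
typedef struct {
    uint64_t iova;       /* aligned to (1 << granule_sz) */
    unsigned granule_sz; /* log2 of the granule the entry was cached with */
    int valid;
} ToyEntry;

/* Look up one granule; the IOVA is masked here, not by the caller. */
static const ToyEntry *toy_lookup_one(const ToyEntry *tlb, size_t n,
                                      uint64_t iova, unsigned granule_sz)
{
    assert(granule_sz < 64);
    uint64_t key = iova & ~((UINT64_C(1) << granule_sz) - 1);

    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && tlb[i].granule_sz == granule_sz &&
            tlb[i].iova == key) {
            return &tlb[i];
        }
    }
    return NULL;
}

/* Try the smaller granule first, then the other one if they differ. */
static const ToyEntry *toy_lookup_nested(const ToyEntry *tlb, size_t n,
                                         uint64_t iova,
                                         unsigned s1_granule_sz,
                                         unsigned s2_granule_sz)
{
    unsigned first = s1_granule_sz < s2_granule_sz ? s1_granule_sz
                                                   : s2_granule_sz;
    unsigned second = s1_granule_sz < s2_granule_sz ? s2_granule_sz
                                                    : s1_granule_sz;
    const ToyEntry *entry = toy_lookup_one(tlb, n, iova, first);

    if (!entry && first != second) {
        entry = toy_lookup_one(tlb, n, iova, second);
    }
    return entry;
}
```

In this toy model an entry cached at 0x30000 with tg=4k is found for IOVA 0x30abc but not for 0x31000, and an entry cached with either granule is found regardless of which stage that granule belongs to.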
Thanks,
Jean
>
> Thanks,
> Mostafa
>
> >
> > Thanks,
> > Jean
> >
> > >
> > > Signed-off-by: Mostafa Saleh <smostafa@google.com>
> > > ---
> > > hw/arm/smmu-common.c | 36 ++++++++++++++++++++++++++++++++++--
> > > 1 file changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> > > index 21982621c0..0840b5cffd 100644
> > > --- a/hw/arm/smmu-common.c
> > > +++ b/hw/arm/smmu-common.c
> > > @@ -66,8 +66,10 @@ SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid,
> > > uint64_t iova,
> > > return key;
> > > }
> > >
> > > -SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg,
> > > - SMMUTransTableInfo *tt, hwaddr iova)
> > > +static SMMUTLBEntry *smmu_iotlb_lookup_all_levels(SMMUState *bs,
> > > + SMMUTransCfg *cfg,
> > > + SMMUTransTableInfo *tt,
> > > + hwaddr iova)
> > > {
> > > uint8_t tg = (tt->granule_sz - 10) / 2;
> > > uint8_t inputsize = 64 - tt->tsz;
> > > @@ -88,6 +90,36 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs,
> > > SMMUTransCfg *cfg,
> > > }
> > > level++;
> > > }
> > > + return entry;
> > > +}
> > > +
> > > +/**
> > > + * smmu_iotlb_lookup - Look up for a TLB entry.
> > > + * @bs: SMMU state which includes the TLB instance
> > > + * @cfg: Configuration of the translation
> > > + * @tt: Translation table info (granule and tsz)
> > > + * @iova: IOVA address to lookup
> > > + *
> > > + * returns a valid entry on success, otherwise NULL.
> > > + * In case of nested translation, tt can be updated to include
> > > + * the granule of the found entry as it might different from
> > > + * the IOVA granule.
> > > + */
> > > +SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg,
> > > + SMMUTransTableInfo *tt, hwaddr iova)
> > > +{
> > > + SMMUTLBEntry *entry = NULL;
> > > +
> > > + entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova);
> > > + /*
> > > > + * For nested translation also try the s2 granule, as the TLB will insert
> > > + * it if the size of s2 tlb entry was smaller.
> > > + */
> > > + if (!entry && (cfg->stage == SMMU_NESTED) &&
> > > + (cfg->s2cfg.granule_sz != tt->granule_sz)) {
> > > + tt->granule_sz = cfg->s2cfg.granule_sz;
> > > + entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova);
> > > + }
> > >
> > > if (entry) {
> > > cfg->iotlb_hits++;
> > > --
> > > 2.45.2.803.g4e1b14247a-goog
> > >