From 5669d26ec614b3f4c56cf1489b9095ed327938b1 Mon Sep 17 00:00:00 2001 From: Peter Maydell Date: Tue, 16 Jul 2024 11:30:32 +0100 Subject: [PATCH 01/26] target/arm: Fix handling of LDAPR/STLR with negative offset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When we converted the LDAPR/STLR instructions to decodetree we accidentally introduced a regression where the offset is negative. The 9-bit immediate field is signed, and the old hand decoder correctly used sextract32() to get it out of the insn word, but the ldapr_stlr_i pattern in the decode file used "imm:9" instead of "imm:s9", so it treated the field as unsigned. Fix the pattern to treat the field as a signed immediate. Cc: qemu-stable@nongnu.org Fixes: 2521b6073b7 ("target/arm: Convert LDAPR/STLR (imm) to decodetree") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2419 Signed-off-by: Peter Maydell Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Message-id: 20240709134504.3500007-2-peter.maydell@linaro.org --- target/arm/tcg/a64.decode | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 2922de700c..62df4c4ceb 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -520,7 +520,7 @@ LDAPR sz:2 111 0 00 1 0 1 11111 1100 00 rn:5 rt:5 LDRA 11 111 0 00 m:1 . 1 ......... w:1 1 rn:5 rt:5 imm=%ldra_imm &ldapr_stlr_i rn rt imm sz sign ext -@ldapr_stlr_i .. ...... .. . imm:9 .. rn:5 rt:5 &ldapr_stlr_i +@ldapr_stlr_i .. ...... .. . imm:s9 .. rn:5 rt:5 &ldapr_stlr_i STLR_i sz:2 011001 00 0 ......... 00 ..... ..... @ldapr_stlr_i sign=0 ext=0 LDAPR_i sz:2 011001 01 0 ......... 00 ..... ..... @ldapr_stlr_i sign=0 ext=0 LDAPR_i 00 011001 10 0 ......... 00 ..... ..... @ldapr_stlr_i sign=1 ext=0 sz=0 From 25489b521b61b874c4c6583956db0012a3674e3a Mon Sep 17 00:00:00 2001 From: Peter Maydell Date: Tue, 16 Jul 2024 11:30:32 +0100 Subject: [PATCH 02/26] target/arm: LDAPR should honour SCTLR_ELx.nAA In commit c1a1f80518d360b when we added the FEAT_LSE2 relaxations to the alignment requirements for atomic and ordered loads and stores, we didn't quite get it right for LDAPR/LDAPRH/LDAPRB with no immediate offset. These instructions were handled in the old decoder as part of disas_ldst_atomic(), but unlike all the other insns that function decoded (LDADD, LDCLR, etc) these insns are "ordered", not "atomic", so they should be using check_ordered_align() rather than check_atomic_align(). Commit c1a1f80518d360b used check_atomic_align() regardless for everything in disas_ldst_atomic(). We then carried that incorrect check over in the decodetree conversion, where LDAPR/LDAPRH/LDAPRB are now handled by trans_LDAPR(). The effect is that when FEAT_LSE2 is implemented, these instructions don't honour the SCTLR_ELx.nAA bit and will generate alignment faults when they should not. (The LDAPR insns with an immediate offset were in disas_ldst_ldapr_stlr() and then in trans_LDAPR_i() and trans_STLR_i(), and have always used the correct check_ordered_align().) Use check_ordered_align() in trans_LDAPR(). Cc: qemu-stable@nongnu.org Fixes: c1a1f80518d360b ("target/arm: Relax ordered/atomic alignment checks for LSE2") Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson Message-id: 20240709134504.3500007-3-peter.maydell@linaro.org --- target/arm/tcg/translate-a64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 559a6cd799..148be2826e 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -3543,7 +3543,7 @@ static bool trans_LDAPR(DisasContext *s, arg_LDAPR *a) if (a->rn == 31) { gen_check_sp_alignment(s); } - mop = check_atomic_align(s, a->rn, a->sz); + mop = check_ordered_align(s, a->rn, 0, false, a->sz); clean_addr = gen_mte_check1(s, cpu_reg_sp(s, a->rn), false, a->rn != 31, mop); /* From 345acc443905eda8008a1d328dd89b73c4a3f89e Mon Sep 17 00:00:00 2001 From: SamJakob Date: Sat, 13 Jul 2024 17:03:53 +0100 Subject: [PATCH 03/26] hw/display/bcm2835_fb: fix fb_use_offsets condition MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It is common practice when implementing double-buffering on VideoCore to do so by multiplying the height of the virtual buffer by the number of virtual screens desired (i.e., two - in the case of double-bufferring). At present, this won't work in QEMU because the logic in fb_use_offsets require that both the virtual width and height exceed their physical counterparts. This appears to be unintentional/a typo and indeed the comment states; "Experimentally, the hardware seems to do this only if the viewport size is larger than the physical screen". The viewport/virtual size would be larger than the physical size if either virtual dimension were larger than their physical counterparts and not necessarily both. Signed-off-by: SamJakob Message-id: 20240713160353.62410-1-me@samjakob.com Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Peter Maydell --- hw/display/bcm2835_fb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/display/bcm2835_fb.c b/hw/display/bcm2835_fb.c index e40ed2d2e1..650db3da82 100644 --- a/hw/display/bcm2835_fb.c +++ b/hw/display/bcm2835_fb.c @@ -145,7 +145,7 @@ static bool fb_use_offsets(BCM2835FBConfig *config) * viewport size is larger than the physical screen. (It doesn't * prevent the guest setting this silly viewport setting, though...) */ - return config->xres_virtual > config->xres && + return config->xres_virtual > config->xres || config->yres_virtual > config->yres; } From bde809f05f66b4be4475ffa9819d82a01686d1c7 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:01 +0000 Subject: [PATCH 04/26] hw/arm/smmu-common: Add missing size check for stage-1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit According to the SMMU architecture specification (ARM IHI 0070 F.b), in “3.4 Address sizes” The address output from the translation causes a stage 1 Address Size fault if it exceeds the range of the effective IPA size for the given CD. However, this check was missing. There is already a similar check for stage-2 against effective PA. Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Message-id: 20240715084519.1189624-2-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index b6601cc102..e81b684d06 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -381,6 +381,16 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, goto error; } + /* + * The address output from the translation causes a stage 1 Address + * Size fault if it exceeds the range of the effective IPA size for + * the given CD. + */ + if (gpa >= (1ULL << cfg->oas)) { + info->type = SMMU_PTW_ERR_ADDR_SIZE; + goto error; + } + tlbe->entry.translated_addr = gpa; tlbe->entry.iova = iova & ~mask; tlbe->entry.addr_mask = mask; From 48f9e9eb2914cf1ccd67bf7a011d2706490d81f0 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:02 +0000 Subject: [PATCH 05/26] hw/arm/smmu: Fix IPA for stage-2 events MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit For the following events (ARM IHI 0070 F.b - 7.3 Event records): - F_TRANSLATION - F_ACCESS - F_PERMISSION - F_ADDR_SIZE If fault occurs at stage 2, S2 == 1 and: - If translating an IPA for a transaction (whether by input to stage 2-only configuration, or after successful stage 1 translation), CLASS == IN, and IPA is provided. At the moment only CLASS == IN is used which indicates input translation. However, this was not implemented correctly, as for stage 2, the code only sets the S2 bit but not the IPA. This field has the same bits as FetchAddr in F_WALK_EABT which is populated correctly, so we don’t change that. The setting of this field should be done from the walker as the IPA address wouldn't be known in case of nesting. For stage 1, the spec says: If fault occurs at stage 1, S2 == 0 and: CLASS == IN, IPA is UNKNOWN. So, no need to set it to for stage 1, as ptw_info is initialised by zero in smmuv3_translate(). Fixes: e703f7076a “hw/arm/smmuv3: Add page table walk for stage-2” Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Message-id: 20240715084519.1189624-3-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 10 ++++++---- hw/arm/smmuv3.c | 4 ++++ 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index e81b684d06..e8cdbcd8ae 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -448,7 +448,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, */ if (ipa >= (1ULL << inputsize)) { info->type = SMMU_PTW_ERR_TRANSLATION; - goto error; + goto error_ipa; } while (level < VMSA_LEVELS) { @@ -494,13 +494,13 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, */ if (!PTE_AF(pte) && !cfg->s2cfg.affd) { info->type = SMMU_PTW_ERR_ACCESS; - goto error; + goto error_ipa; } s2ap = PTE_AP(pte); if (is_permission_fault_s2(s2ap, perm)) { info->type = SMMU_PTW_ERR_PERMISSION; - goto error; + goto error_ipa; } /* @@ -509,7 +509,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, */ if (gpa >= (1ULL << cfg->s2cfg.eff_ps)) { info->type = SMMU_PTW_ERR_ADDR_SIZE; - goto error; + goto error_ipa; } tlbe->entry.translated_addr = gpa; @@ -522,6 +522,8 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, } info->type = SMMU_PTW_ERR_TRANSLATION; +error_ipa: + info->addr = ipa; error: info->stage = 2; tlbe->entry.perm = IOMMU_NONE; diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 445e04ddf7..cab545a0b4 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -949,6 +949,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, if (PTW_RECORD_FAULT(cfg)) { event.type = SMMU_EVT_F_TRANSLATION; event.u.f_translation.addr = addr; + event.u.f_translation.addr2 = ptw_info.addr; event.u.f_translation.rnw = flag & 0x1; } break; @@ -956,6 +957,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, if (PTW_RECORD_FAULT(cfg)) { event.type = SMMU_EVT_F_ADDR_SIZE; event.u.f_addr_size.addr = addr; + event.u.f_addr_size.addr2 = ptw_info.addr; event.u.f_addr_size.rnw = flag & 0x1; } break; @@ -963,6 +965,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, if (PTW_RECORD_FAULT(cfg)) { event.type = SMMU_EVT_F_ACCESS; event.u.f_access.addr = addr; + event.u.f_access.addr2 = ptw_info.addr; event.u.f_access.rnw = flag & 0x1; } break; @@ -970,6 +973,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, if (PTW_RECORD_FAULT(cfg)) { event.type = SMMU_EVT_F_PERMISSION; event.u.f_permission.addr = addr; + event.u.f_permission.addr2 = ptw_info.addr; event.u.f_permission.rnw = flag & 0x1; } break; From 2731ea049d13dfed57f4b0b8bf38725321f752f9 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:03 +0000 Subject: [PATCH 06/26] hw/arm/smmuv3: Fix encoding of CLASS in events MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The SMMUv3 spec (ARM IHI 0070 F.b - 7.3 Event records) defines the class of events faults as: CLASS: The class of the operation that caused the fault: - 0b00: CD, CD fetch. - 0b01: TTD, Stage 1 translation table fetch. - 0b10: IN, Input address However, this value was not set and left as 0 which means CD and not IN (0b10). Another problem was that stage-2 class is considered IN not TT for EABT, according to the spec: Translation of an IPA after successful stage 1 translation (or, in stage 2-only configuration, an input IPA) - S2 == 1 (stage 2), CLASS == IN (Input to stage) This would change soon when nested translations are supported. While at it, add an enum for class as it would be used for nesting. However, at the moment stage-1 and stage-2 use the same class values, except for EABT. Fixes: 9bde7f0674 “hw/arm/smmuv3: Implement translate callback” Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Message-id: 20240715084519.1189624-4-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmuv3-internal.h | 6 ++++++ hw/arm/smmuv3.c | 8 +++++++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h index e4dd11e1e6..0f3ecec804 100644 --- a/hw/arm/smmuv3-internal.h +++ b/hw/arm/smmuv3-internal.h @@ -32,6 +32,12 @@ typedef enum SMMUTranslationStatus { SMMU_TRANS_SUCCESS, } SMMUTranslationStatus; +typedef enum SMMUTranslationClass { + SMMU_CLASS_CD, + SMMU_CLASS_TT, + SMMU_CLASS_IN, +} SMMUTranslationClass; + /* MMIO Registers */ REG32(IDR0, 0x0) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index cab545a0b4..472fdf2e5f 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -942,7 +942,9 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, event.type = SMMU_EVT_F_WALK_EABT; event.u.f_walk_eabt.addr = addr; event.u.f_walk_eabt.rnw = flag & 0x1; - event.u.f_walk_eabt.class = 0x1; + /* Stage-2 (only) is class IN while stage-1 is class TT */ + event.u.f_walk_eabt.class = (ptw_info.stage == 2) ? + SMMU_CLASS_IN : SMMU_CLASS_TT; event.u.f_walk_eabt.addr2 = ptw_info.addr; break; case SMMU_PTW_ERR_TRANSLATION: @@ -950,6 +952,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, event.type = SMMU_EVT_F_TRANSLATION; event.u.f_translation.addr = addr; event.u.f_translation.addr2 = ptw_info.addr; + event.u.f_translation.class = SMMU_CLASS_IN; event.u.f_translation.rnw = flag & 0x1; } break; @@ -958,6 +961,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, event.type = SMMU_EVT_F_ADDR_SIZE; event.u.f_addr_size.addr = addr; event.u.f_addr_size.addr2 = ptw_info.addr; + event.u.f_translation.class = SMMU_CLASS_IN; event.u.f_addr_size.rnw = flag & 0x1; } break; @@ -966,6 +970,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, event.type = SMMU_EVT_F_ACCESS; event.u.f_access.addr = addr; event.u.f_access.addr2 = ptw_info.addr; + event.u.f_translation.class = SMMU_CLASS_IN; event.u.f_access.rnw = flag & 0x1; } break; @@ -974,6 +979,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, event.type = SMMU_EVT_F_PERMISSION; event.u.f_permission.addr = addr; event.u.f_permission.addr2 = ptw_info.addr; + event.u.f_translation.class = SMMU_CLASS_IN; event.u.f_permission.rnw = flag & 0x1; } break; From f6cc198050faf055b17105fdf4da63cf4d295765 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:04 +0000 Subject: [PATCH 07/26] hw/arm/smmu: Use enum for SMMU stage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently, translation stage is represented as an int, where 1 is stage-1 and 2 is stage-2, when nested is added, 3 would be confusing to represent nesting, so we use an enum instead. While keeping the same values, this is useful for: - Doing tricks with bit masks, where BIT(0) is stage-1 and BIT(1) is stage-2 and both is nested. - Tracing, as stage is printed as int. Reviewed-by: Eric Auger Reviewed-by: Alex Bennée Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Message-id: 20240715084519.1189624-5-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 14 +++++++------- hw/arm/smmuv3.c | 17 +++++++++-------- include/hw/arm/smmu-common.h | 11 +++++++++-- 3 files changed, 25 insertions(+), 17 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index e8cdbcd8ae..408fc4efca 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -304,7 +304,7 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) { dma_addr_t baseaddr, indexmask; - int stage = cfg->stage; + SMMUStage stage = cfg->stage; SMMUTransTableInfo *tt = select_tt(cfg, iova); uint8_t level, granule_sz, inputsize, stride; @@ -402,7 +402,7 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, info->type = SMMU_PTW_ERR_TRANSLATION; error: - info->stage = 1; + info->stage = SMMU_STAGE_1; tlbe->entry.perm = IOMMU_NONE; return -EINVAL; } @@ -425,7 +425,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, dma_addr_t ipa, IOMMUAccessFlags perm, SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) { - const int stage = 2; + const SMMUStage stage = SMMU_STAGE_2; int granule_sz = cfg->s2cfg.granule_sz; /* ARM DDI0487I.a: Table D8-7. */ int inputsize = 64 - cfg->s2cfg.tsz; @@ -525,7 +525,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, error_ipa: info->addr = ipa; error: - info->stage = 2; + info->stage = SMMU_STAGE_2; tlbe->entry.perm = IOMMU_NONE; return -EINVAL; } @@ -544,9 +544,9 @@ error: int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) { - if (cfg->stage == 1) { + if (cfg->stage == SMMU_STAGE_1) { return smmu_ptw_64_s1(cfg, iova, perm, tlbe, info); - } else if (cfg->stage == 2) { + } else if (cfg->stage == SMMU_STAGE_2) { /* * If bypassing stage 1(or unimplemented), the input address is passed * directly to stage 2 as IPA. If the input address of a transaction @@ -555,7 +555,7 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, */ if (iova >= (1ULL << cfg->oas)) { info->type = SMMU_PTW_ERR_ADDR_SIZE; - info->stage = 1; + info->stage = SMMU_STAGE_1; tlbe->entry.perm = IOMMU_NONE; return -EINVAL; } diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 472fdf2e5f..aab09df23e 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -34,7 +34,8 @@ #include "smmuv3-internal.h" #include "smmu-internal.h" -#define PTW_RECORD_FAULT(cfg) (((cfg)->stage == 1) ? (cfg)->record_faults : \ +#define PTW_RECORD_FAULT(cfg) (((cfg)->stage == SMMU_STAGE_1) ? \ + (cfg)->record_faults : \ (cfg)->s2cfg.record_faults) /** @@ -402,7 +403,7 @@ static bool s2_pgtable_config_valid(uint8_t sl0, uint8_t t0sz, uint8_t gran) static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste) { - cfg->stage = 2; + cfg->stage = SMMU_STAGE_2; if (STE_S2AA64(ste) == 0x0) { qemu_log_mask(LOG_UNIMP, @@ -678,7 +679,7 @@ static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event) /* we support only those at the moment */ cfg->aa64 = true; - cfg->stage = 1; + cfg->stage = SMMU_STAGE_1; cfg->oas = oas2bits(CD_IPS(cd)); cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas); @@ -762,7 +763,7 @@ static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg, return ret; } - if (cfg->aborted || cfg->bypassed || (cfg->stage == 2)) { + if (cfg->aborted || cfg->bypassed || (cfg->stage == SMMU_STAGE_2)) { return 0; } @@ -882,7 +883,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, goto epilogue; } - if (cfg->stage == 1) { + if (cfg->stage == SMMU_STAGE_1) { /* Select stage1 translation table. */ tt = select_tt(cfg, addr); if (!tt) { @@ -919,7 +920,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, * nesting is not supported. So it is sufficient to check the * translation stage to know the TLB stage for now. */ - event.u.f_walk_eabt.s2 = (cfg->stage == 2); + event.u.f_walk_eabt.s2 = (cfg->stage == SMMU_STAGE_2); if (PTW_RECORD_FAULT(cfg)) { event.type = SMMU_EVT_F_PERMISSION; event.u.f_permission.addr = addr; @@ -935,7 +936,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, if (smmu_ptw(cfg, aligned_addr, flag, cached_entry, &ptw_info)) { /* All faults from PTW has S2 field. */ - event.u.f_walk_eabt.s2 = (ptw_info.stage == 2); + event.u.f_walk_eabt.s2 = (ptw_info.stage == SMMU_STAGE_2); g_free(cached_entry); switch (ptw_info.type) { case SMMU_PTW_ERR_WALK_EABT: @@ -943,7 +944,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, event.u.f_walk_eabt.addr = addr; event.u.f_walk_eabt.rnw = flag & 0x1; /* Stage-2 (only) is class IN while stage-1 is class TT */ - event.u.f_walk_eabt.class = (ptw_info.stage == 2) ? + event.u.f_walk_eabt.class = (ptw_info.stage == SMMU_STAGE_2) ? SMMU_CLASS_IN : SMMU_CLASS_TT; event.u.f_walk_eabt.addr2 = ptw_info.addr; break; diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index 687b7ca132..260582089e 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -49,8 +49,15 @@ typedef enum { SMMU_PTW_ERR_PERMISSION, /* Permission fault */ } SMMUPTWEventType; +/* SMMU Stage */ +typedef enum { + SMMU_STAGE_1 = 1, + SMMU_STAGE_2, + SMMU_NESTED, +} SMMUStage; + typedef struct SMMUPTWEventInfo { - int stage; + SMMUStage stage; SMMUPTWEventType type; dma_addr_t addr; /* fetched address that induced an abort, if any */ } SMMUPTWEventInfo; @@ -88,7 +95,7 @@ typedef struct SMMUS2Cfg { */ typedef struct SMMUTransCfg { /* Shared fields between stage-1 and stage-2. */ - int stage; /* translation stage */ + SMMUStage stage; /* translation stage */ bool disabled; /* smmu is disabled */ bool bypassed; /* translation is bypassed */ bool aborted; /* translation is aborted */ From a9e3f4c1ebeead57008ebe1c0e9f4e50d5020105 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:05 +0000 Subject: [PATCH 08/26] hw/arm/smmu: Split smmuv3_translate() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit smmuv3_translate() does everything from STE/CD parsing to TLB lookup and PTW. Soon, when nesting is supported, stage-1 data (tt, CD) needs to be translated using stage-2. Split smmuv3_translate() to 3 functions: - smmu_translate(): in smmu-common.c, which does the TLB lookup, PTW, TLB insertion, all the functions are already there, this just puts them together. This also simplifies the code as it consolidates event generation in case of TLB lookup permission failure or in TT selection. - smmuv3_do_translate(): in smmuv3.c, Calls smmu_translate() and does the event population in case of errors. - smmuv3_translate(), now calls smmuv3_do_translate() for translation while the rest is the same. Also, add stage in trace_smmuv3_translate_success() Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-6-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 59 +++++++++++ hw/arm/smmuv3.c | 194 +++++++++++++---------------------- hw/arm/trace-events | 2 +- include/hw/arm/smmu-common.h | 8 ++ 4 files changed, 142 insertions(+), 121 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 408fc4efca..8d52472863 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -566,6 +566,65 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, g_assert_not_reached(); } +SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, + IOMMUAccessFlags flag, SMMUPTWEventInfo *info) +{ + uint64_t page_mask, aligned_addr; + SMMUTLBEntry *cached_entry = NULL; + SMMUTransTableInfo *tt; + int status; + + /* + * Combined attributes used for TLB lookup, as only one stage is supported, + * it will hold attributes based on the enabled stage. + */ + SMMUTransTableInfo tt_combined; + + if (cfg->stage == SMMU_STAGE_1) { + /* Select stage1 translation table. */ + tt = select_tt(cfg, addr); + if (!tt) { + info->type = SMMU_PTW_ERR_TRANSLATION; + info->stage = SMMU_STAGE_1; + return NULL; + } + tt_combined.granule_sz = tt->granule_sz; + tt_combined.tsz = tt->tsz; + + } else { + /* Stage2. */ + tt_combined.granule_sz = cfg->s2cfg.granule_sz; + tt_combined.tsz = cfg->s2cfg.tsz; + } + + /* + * TLB lookup looks for granule and input size for a translation stage, + * as only one stage is supported right now, choose the right values + * from the configuration. + */ + page_mask = (1ULL << tt_combined.granule_sz) - 1; + aligned_addr = addr & ~page_mask; + + cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, aligned_addr); + if (cached_entry) { + if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) { + info->type = SMMU_PTW_ERR_PERMISSION; + info->stage = cfg->stage; + return NULL; + } + return cached_entry; + } + + cached_entry = g_new0(SMMUTLBEntry, 1); + status = smmu_ptw(cfg, aligned_addr, flag, cached_entry, info); + if (status) { + g_free(cached_entry); + return NULL; + } + smmu_iotlb_insert(bs, cfg, cached_entry); + return cached_entry; +} + /** * The bus number is used for lookup when SID based invalidation occurs. * In that case we lazily populate the SMMUPciBus array from the bus hash diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index aab09df23e..76d9969c93 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -827,6 +827,76 @@ static void smmuv3_flush_config(SMMUDevice *sdev) g_hash_table_remove(bc->configs, sdev); } +/* Do translation with TLB lookup. */ +static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, + SMMUTransCfg *cfg, + SMMUEventInfo *event, + IOMMUAccessFlags flag, + SMMUTLBEntry **out_entry) +{ + SMMUPTWEventInfo ptw_info = {}; + SMMUState *bs = ARM_SMMU(s); + SMMUTLBEntry *cached_entry = NULL; + + cached_entry = smmu_translate(bs, cfg, addr, flag, &ptw_info); + if (!cached_entry) { + /* All faults from PTW has S2 field. */ + event->u.f_walk_eabt.s2 = (ptw_info.stage == SMMU_STAGE_2); + switch (ptw_info.type) { + case SMMU_PTW_ERR_WALK_EABT: + event->type = SMMU_EVT_F_WALK_EABT; + event->u.f_walk_eabt.addr = addr; + event->u.f_walk_eabt.rnw = flag & 0x1; + event->u.f_walk_eabt.class = (ptw_info.stage == SMMU_STAGE_2) ? + SMMU_CLASS_IN : SMMU_CLASS_TT; + event->u.f_walk_eabt.addr2 = ptw_info.addr; + break; + case SMMU_PTW_ERR_TRANSLATION: + if (PTW_RECORD_FAULT(cfg)) { + event->type = SMMU_EVT_F_TRANSLATION; + event->u.f_translation.addr = addr; + event->u.f_translation.addr2 = ptw_info.addr; + event->u.f_translation.class = SMMU_CLASS_IN; + event->u.f_translation.rnw = flag & 0x1; + } + break; + case SMMU_PTW_ERR_ADDR_SIZE: + if (PTW_RECORD_FAULT(cfg)) { + event->type = SMMU_EVT_F_ADDR_SIZE; + event->u.f_addr_size.addr = addr; + event->u.f_addr_size.addr2 = ptw_info.addr; + event->u.f_addr_size.class = SMMU_CLASS_IN; + event->u.f_addr_size.rnw = flag & 0x1; + } + break; + case SMMU_PTW_ERR_ACCESS: + if (PTW_RECORD_FAULT(cfg)) { + event->type = SMMU_EVT_F_ACCESS; + event->u.f_access.addr = addr; + event->u.f_access.addr2 = ptw_info.addr; + event->u.f_access.class = SMMU_CLASS_IN; + event->u.f_access.rnw = flag & 0x1; + } + break; + case SMMU_PTW_ERR_PERMISSION: + if (PTW_RECORD_FAULT(cfg)) { + event->type = SMMU_EVT_F_PERMISSION; + event->u.f_permission.addr = addr; + event->u.f_permission.addr2 = ptw_info.addr; + event->u.f_permission.class = SMMU_CLASS_IN; + event->u.f_permission.rnw = flag & 0x1; + } + break; + default: + g_assert_not_reached(); + } + return SMMU_TRANS_ERROR; + } + *out_entry = cached_entry; + return SMMU_TRANS_SUCCESS; +} + +/* Entry point to SMMU, does everything. */ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, IOMMUAccessFlags flag, int iommu_idx) { @@ -836,12 +906,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid, .inval_ste_allowed = false}; - SMMUPTWEventInfo ptw_info = {}; SMMUTranslationStatus status; - SMMUState *bs = ARM_SMMU(s); - uint64_t page_mask, aligned_addr; - SMMUTLBEntry *cached_entry = NULL; - SMMUTransTableInfo *tt; SMMUTransCfg *cfg = NULL; IOMMUTLBEntry entry = { .target_as = &address_space_memory, @@ -850,11 +915,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, .addr_mask = ~(hwaddr)0, .perm = IOMMU_NONE, }; - /* - * Combined attributes used for TLB lookup, as only one stage is supported, - * it will hold attributes based on the enabled stage. - */ - SMMUTransTableInfo tt_combined; + SMMUTLBEntry *cached_entry = NULL; qemu_mutex_lock(&s->mutex); @@ -883,115 +944,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, goto epilogue; } - if (cfg->stage == SMMU_STAGE_1) { - /* Select stage1 translation table. */ - tt = select_tt(cfg, addr); - if (!tt) { - if (cfg->record_faults) { - event.type = SMMU_EVT_F_TRANSLATION; - event.u.f_translation.addr = addr; - event.u.f_translation.rnw = flag & 0x1; - } - status = SMMU_TRANS_ERROR; - goto epilogue; - } - tt_combined.granule_sz = tt->granule_sz; - tt_combined.tsz = tt->tsz; - - } else { - /* Stage2. */ - tt_combined.granule_sz = cfg->s2cfg.granule_sz; - tt_combined.tsz = cfg->s2cfg.tsz; - } - /* - * TLB lookup looks for granule and input size for a translation stage, - * as only one stage is supported right now, choose the right values - * from the configuration. - */ - page_mask = (1ULL << tt_combined.granule_sz) - 1; - aligned_addr = addr & ~page_mask; - - cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, aligned_addr); - if (cached_entry) { - if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) { - status = SMMU_TRANS_ERROR; - /* - * We know that the TLB only contains either stage-1 or stage-2 as - * nesting is not supported. So it is sufficient to check the - * translation stage to know the TLB stage for now. - */ - event.u.f_walk_eabt.s2 = (cfg->stage == SMMU_STAGE_2); - if (PTW_RECORD_FAULT(cfg)) { - event.type = SMMU_EVT_F_PERMISSION; - event.u.f_permission.addr = addr; - event.u.f_permission.rnw = flag & 0x1; - } - } else { - status = SMMU_TRANS_SUCCESS; - } - goto epilogue; - } - - cached_entry = g_new0(SMMUTLBEntry, 1); - - if (smmu_ptw(cfg, aligned_addr, flag, cached_entry, &ptw_info)) { - /* All faults from PTW has S2 field. */ - event.u.f_walk_eabt.s2 = (ptw_info.stage == SMMU_STAGE_2); - g_free(cached_entry); - switch (ptw_info.type) { - case SMMU_PTW_ERR_WALK_EABT: - event.type = SMMU_EVT_F_WALK_EABT; - event.u.f_walk_eabt.addr = addr; - event.u.f_walk_eabt.rnw = flag & 0x1; - /* Stage-2 (only) is class IN while stage-1 is class TT */ - event.u.f_walk_eabt.class = (ptw_info.stage == SMMU_STAGE_2) ? - SMMU_CLASS_IN : SMMU_CLASS_TT; - event.u.f_walk_eabt.addr2 = ptw_info.addr; - break; - case SMMU_PTW_ERR_TRANSLATION: - if (PTW_RECORD_FAULT(cfg)) { - event.type = SMMU_EVT_F_TRANSLATION; - event.u.f_translation.addr = addr; - event.u.f_translation.addr2 = ptw_info.addr; - event.u.f_translation.class = SMMU_CLASS_IN; - event.u.f_translation.rnw = flag & 0x1; - } - break; - case SMMU_PTW_ERR_ADDR_SIZE: - if (PTW_RECORD_FAULT(cfg)) { - event.type = SMMU_EVT_F_ADDR_SIZE; - event.u.f_addr_size.addr = addr; - event.u.f_addr_size.addr2 = ptw_info.addr; - event.u.f_translation.class = SMMU_CLASS_IN; - event.u.f_addr_size.rnw = flag & 0x1; - } - break; - case SMMU_PTW_ERR_ACCESS: - if (PTW_RECORD_FAULT(cfg)) { - event.type = SMMU_EVT_F_ACCESS; - event.u.f_access.addr = addr; - event.u.f_access.addr2 = ptw_info.addr; - event.u.f_translation.class = SMMU_CLASS_IN; - event.u.f_access.rnw = flag & 0x1; - } - break; - case SMMU_PTW_ERR_PERMISSION: - if (PTW_RECORD_FAULT(cfg)) { - event.type = SMMU_EVT_F_PERMISSION; - event.u.f_permission.addr = addr; - event.u.f_permission.addr2 = ptw_info.addr; - event.u.f_translation.class = SMMU_CLASS_IN; - event.u.f_permission.rnw = flag & 0x1; - } - break; - default: - g_assert_not_reached(); - } - status = SMMU_TRANS_ERROR; - } else { - smmu_iotlb_insert(bs, cfg, cached_entry); - status = SMMU_TRANS_SUCCESS; - } + status = smmuv3_do_translate(s, addr, cfg, &event, flag, &cached_entry); epilogue: qemu_mutex_unlock(&s->mutex); @@ -1002,7 +955,8 @@ epilogue: (addr & cached_entry->entry.addr_mask); entry.addr_mask = cached_entry->entry.addr_mask; trace_smmuv3_translate_success(mr->parent_obj.name, sid, addr, - entry.translated_addr, entry.perm); + entry.translated_addr, entry.perm, + cfg->stage); break; case SMMU_TRANS_DISABLE: entry.perm = flag; diff --git a/hw/arm/trace-events b/hw/arm/trace-events index f1a54a02df..cc12924a84 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -37,7 +37,7 @@ smmuv3_get_ste(uint64_t addr) "STE addr: 0x%"PRIx64 smmuv3_translate_disable(const char *n, uint16_t sid, uint64_t addr, bool is_write) "%s sid=0x%x bypass (smmu disabled) iova:0x%"PRIx64" is_write=%d" smmuv3_translate_bypass(const char *n, uint16_t sid, uint64_t addr, bool is_write) "%s sid=0x%x STE bypass iova:0x%"PRIx64" is_write=%d" smmuv3_translate_abort(const char *n, uint16_t sid, uint64_t addr, bool is_write) "%s sid=0x%x abort on iova:0x%"PRIx64" is_write=%d" -smmuv3_translate_success(const char *n, uint16_t sid, uint64_t iova, uint64_t translated, int perm) "%s sid=0x%x iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x" +smmuv3_translate_success(const char *n, uint16_t sid, uint64_t iova, uint64_t translated, int perm, int stage) "%s sid=0x%x iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x stage=%d" smmuv3_get_cd(uint64_t addr) "CD addr: 0x%"PRIx64 smmuv3_decode_cd(uint32_t oas) "oas=%d" smmuv3_decode_cd_tt(int i, uint32_t tsz, uint64_t ttb, uint32_t granule_sz, bool had) "TT[%d]:tsz:%d ttb:0x%"PRIx64" granule_sz:%d had:%d" diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index 260582089e..366b53be77 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -183,6 +183,14 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev) int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info); + +/* + * smmu_translate - Look for a translation in TLB, if not, do a PTW. + * Returns NULL on PTW error or incase of TLB permission errors. + */ +SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, + IOMMUAccessFlags flag, SMMUPTWEventInfo *info); + /** * select_tt - compute which translation table shall be used according to * the input iova and translation config and return the TT specific info From d883822641748e2d3629fdea722192986238d2ff Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:06 +0000 Subject: [PATCH 09/26] hw/arm/smmu: Consolidate ASID and VMID types MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ASID and VMID used to be uint16_t in the translation config, however, in other contexts they can be int as -1 in case of TLB invalidation, to represent all (don’t care). When stage-2 was added asid was set to -1 in stage-2 and vmid to -1 in stage-1 configs. However, that meant they were set as (65536), this was not an issue as nesting was not supported and no commands/lookup uses both. With nesting, it’s critical to get this right as translation must be tagged correctly with ASID/VMID, and with ASID=-1 meaning stage-2. Represent ASID/VMID everywhere as int. Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-7-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 10 +++++----- hw/arm/smmuv3.c | 4 ++-- hw/arm/trace-events | 18 +++++++++--------- include/hw/arm/smmu-common.h | 14 +++++++------- 4 files changed, 23 insertions(+), 23 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 8d52472863..7fbf8e22fe 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -57,7 +57,7 @@ static gboolean smmu_iotlb_key_equal(gconstpointer v1, gconstpointer v2) (k1->vmid == k2->vmid); } -SMMUIOTLBKey smmu_get_iotlb_key(uint16_t asid, uint16_t vmid, uint64_t iova, +SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, uint64_t iova, uint8_t tg, uint8_t level) { SMMUIOTLBKey key = {.asid = asid, .vmid = vmid, .iova = iova, @@ -130,7 +130,7 @@ void smmu_iotlb_inv_all(SMMUState *s) static gboolean smmu_hash_remove_by_asid(gpointer key, gpointer value, gpointer user_data) { - uint16_t asid = *(uint16_t *)user_data; + int asid = *(int *)user_data; SMMUIOTLBKey *iotlb_key = (SMMUIOTLBKey *)key; return SMMU_IOTLB_ASID(*iotlb_key) == asid; @@ -139,7 +139,7 @@ static gboolean smmu_hash_remove_by_asid(gpointer key, gpointer value, static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value, gpointer user_data) { - uint16_t vmid = *(uint16_t *)user_data; + int vmid = *(int *)user_data; SMMUIOTLBKey *iotlb_key = (SMMUIOTLBKey *)key; return SMMU_IOTLB_VMID(*iotlb_key) == vmid; @@ -191,13 +191,13 @@ void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, &info); } -void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid) +void smmu_iotlb_inv_asid(SMMUState *s, int asid) { trace_smmu_iotlb_inv_asid(asid); g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid); } -void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid) +void smmu_iotlb_inv_vmid(SMMUState *s, int vmid) { trace_smmu_iotlb_inv_vmid(vmid); g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid); diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 76d9969c93..e71b842162 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1240,7 +1240,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) } case SMMU_CMD_TLBI_NH_ASID: { - uint16_t asid = CMD_ASID(&cmd); + int asid = CMD_ASID(&cmd); if (!STAGE1_SUPPORTED(s)) { cmd_error = SMMU_CERROR_ILL; @@ -1273,7 +1273,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) break; case SMMU_CMD_TLBI_S12_VMALL: { - uint16_t vmid = CMD_VMID(&cmd); + int vmid = CMD_VMID(&cmd); if (!STAGE2_SUPPORTED(s)) { cmd_error = SMMU_CERROR_ILL; diff --git a/hw/arm/trace-events b/hw/arm/trace-events index cc12924a84..09ccd39548 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -11,13 +11,13 @@ smmu_ptw_page_pte(int stage, int level, uint64_t iova, uint64_t baseaddr, uint6 smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB" smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64 smmu_iotlb_inv_all(void) "IOTLB invalidate all" -smmu_iotlb_inv_asid(uint16_t asid) "IOTLB invalidate asid=%d" -smmu_iotlb_inv_vmid(uint16_t vmid) "IOTLB invalidate vmid=%d" -smmu_iotlb_inv_iova(uint16_t asid, uint64_t addr) "IOTLB invalidate asid=%d addr=0x%"PRIx64 +smmu_iotlb_inv_asid(int asid) "IOTLB invalidate asid=%d" +smmu_iotlb_inv_vmid(int vmid) "IOTLB invalidate vmid=%d" +smmu_iotlb_inv_iova(int asid, uint64_t addr) "IOTLB invalidate asid=%d addr=0x%"PRIx64 smmu_inv_notifiers_mr(const char *name) "iommu mr=%s" -smmu_iotlb_lookup_hit(uint16_t asid, uint16_t vmid, uint64_t addr, uint32_t hit, uint32_t miss, uint32_t p) "IOTLB cache HIT asid=%d vmid=%d addr=0x%"PRIx64" hit=%d miss=%d hit rate=%d" -smmu_iotlb_lookup_miss(uint16_t asid, uint16_t vmid, uint64_t addr, uint32_t hit, uint32_t miss, uint32_t p) "IOTLB cache MISS asid=%d vmid=%d addr=0x%"PRIx64" hit=%d miss=%d hit rate=%d" -smmu_iotlb_insert(uint16_t asid, uint16_t vmid, uint64_t addr, uint8_t tg, uint8_t level) "IOTLB ++ asid=%d vmid=%d addr=0x%"PRIx64" tg=%d level=%d" +smmu_iotlb_lookup_hit(int asid, int vmid, uint64_t addr, uint32_t hit, uint32_t miss, uint32_t p) "IOTLB cache HIT asid=%d vmid=%d addr=0x%"PRIx64" hit=%d miss=%d hit rate=%d" +smmu_iotlb_lookup_miss(int asid, int vmid, uint64_t addr, uint32_t hit, uint32_t miss, uint32_t p) "IOTLB cache MISS asid=%d vmid=%d addr=0x%"PRIx64" hit=%d miss=%d hit rate=%d" +smmu_iotlb_insert(int asid, int vmid, uint64_t addr, uint8_t tg, uint8_t level) "IOTLB ++ asid=%d vmid=%d addr=0x%"PRIx64" tg=%d level=%d" # smmuv3.c smmuv3_read_mmio(uint64_t addr, uint64_t val, unsigned size, uint32_t r) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x(%d)" @@ -48,12 +48,12 @@ smmuv3_config_cache_hit(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t p smmuv3_config_cache_miss(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache MISS for sid=0x%x (hits=%d, misses=%d, hit rate=%d)" smmuv3_range_inval(int vmid, int asid, uint64_t addr, uint8_t tg, uint64_t num_pages, uint8_t ttl, bool leaf) "vmid=%d asid=%d addr=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" ttl=%d leaf=%d" smmuv3_cmdq_tlbi_nh(void) "" -smmuv3_cmdq_tlbi_nh_asid(uint16_t asid) "asid=%d" -smmuv3_cmdq_tlbi_s12_vmid(uint16_t vmid) "vmid=%d" +smmuv3_cmdq_tlbi_nh_asid(int asid) "asid=%d" +smmuv3_cmdq_tlbi_s12_vmid(int vmid) "vmid=%d" smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x" smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s" smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s" -smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint16_t vmid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64 +smmuv3_inv_notifiers_iova(const char *name, int asid, int vmid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64 # strongarm.c strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d" diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index 366b53be77..5cb3024464 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -84,7 +84,7 @@ typedef struct SMMUS2Cfg { bool record_faults; /* Record fault events (S2R) */ uint8_t granule_sz; /* Granule page shift (based on S2TG) */ uint8_t eff_ps; /* Effective PA output range (based on S2PS) */ - uint16_t vmid; /* Virtual Machine ID (S2VMID) */ + int vmid; /* Virtual Machine ID (S2VMID) */ uint64_t vttb; /* Address of translation table base (S2TTB) */ } SMMUS2Cfg; @@ -108,7 +108,7 @@ typedef struct SMMUTransCfg { uint64_t ttb; /* TT base address */ uint8_t oas; /* output address width */ uint8_t tbi; /* Top Byte Ignore */ - uint16_t asid; + int asid; SMMUTransTableInfo tt[2]; /* Used by stage-2 only. */ struct SMMUS2Cfg s2cfg; @@ -132,8 +132,8 @@ typedef struct SMMUPciBus { typedef struct SMMUIOTLBKey { uint64_t iova; - uint16_t asid; - uint16_t vmid; + int asid; + int vmid; uint8_t tg; uint8_t level; } SMMUIOTLBKey; @@ -205,11 +205,11 @@ SMMUDevice *smmu_find_sdev(SMMUState *s, uint32_t sid); SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg, SMMUTransTableInfo *tt, hwaddr iova); void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, SMMUTLBEntry *entry); -SMMUIOTLBKey smmu_get_iotlb_key(uint16_t asid, uint16_t vmid, uint64_t iova, +SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, uint64_t iova, uint8_t tg, uint8_t level); void smmu_iotlb_inv_all(SMMUState *s); -void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid); -void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid); +void smmu_iotlb_inv_asid(SMMUState *s, int asid); +void smmu_iotlb_inv_vmid(SMMUState *s, int vmid); void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, uint8_t tg, uint64_t num_pages, uint8_t ttl); From ec31ef9115a7511c90fcdc0d89cb1491c2702003 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:07 +0000 Subject: [PATCH 10/26] hw/arm/smmu: Introduce CACHED_ENTRY_TO_ADDR MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Soon, smmuv3_do_translate() will be used to translate the CD and the TTBx, instead of re-writting the same logic to convert the returned cached entry to an address, add a new macro CACHED_ENTRY_TO_ADDR. Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-8-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmuv3.c | 3 +-- include/hw/arm/smmu-common.h | 3 +++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index e71b842162..dc63e07d68 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -951,8 +951,7 @@ epilogue: switch (status) { case SMMU_TRANS_SUCCESS: entry.perm = cached_entry->entry.perm; - entry.translated_addr = cached_entry->entry.translated_addr + - (addr & cached_entry->entry.addr_mask); + entry.translated_addr = CACHED_ENTRY_TO_ADDR(cached_entry, addr); entry.addr_mask = cached_entry->entry.addr_mask; trace_smmuv3_translate_success(mr->parent_obj.name, sid, addr, entry.translated_addr, entry.perm, diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index 5cb3024464..f793b54388 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -37,6 +37,9 @@ #define VMSA_IDXMSK(isz, strd, lvl) ((1ULL << \ VMSA_BIT_LVL(isz, strd, lvl)) - 1) +#define CACHED_ENTRY_TO_ADDR(ent, addr) ((ent)->entry.translated_addr + \ + ((addr) & (ent)->entry.addr_mask)) + /* * Page table walk error types */ From 9dd6aa9b05d535f6d66e7b07134bddf1689ab23f Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:08 +0000 Subject: [PATCH 11/26] hw/arm/smmuv3: Translate CD and TT using stage-2 table MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit According to ARM SMMU architecture specification (ARM IHI 0070 F.b), In "5.2 Stream Table Entry": [51:6] S1ContextPtr If Config[1] == 1 (stage 2 enabled), this pointer is an IPA translated by stage 2 and the programmed value must be within the range of the IAS. In "5.4.1 CD notes": The translation table walks performed from TTB0 or TTB1 are always performed in IPA space if stage 2 translations are enabled. This patch implements translation of the S1 context descriptor pointer and TTBx base addresses through the S2 stage (IPA -> PA) smmuv3_do_translate() is updated to have one arg which is translation class, this is useful to: - Decide wether a translation is stage-2 only or use the STE config. - Populate the class in case of faults, WALK_EABT is left unchanged for stage-1 as it is always IN, while stage-2 would match the used class (TT, IN, CD), this will change slightly when the ptw supports nested translation as it can also issue TT event with class IN. In case for stage-2 only translation, used in the context of nested translation, the stage and asid are saved and restored before and after calling smmu_translate(). Translating CD or TTBx can fail for the following reasons: 1) Large address size: This is described in (3.4.3 Address sizes of SMMU-originated accesses) - For CD ptr larger than IAS, for SMMUv3.1, it can trigger either C_BAD_STE or Translation fault, we implement the latter as it requires no extra code. - For TTBx, if larger than the effective stage 1 output address size, it triggers C_BAD_CD. 2) Faults from PTWs (7.3 Event records) - F_ADDR_SIZE: large address size after first level causes stage 2 Address Size fault (Also in 3.4.3 Address sizes of SMMU-originated accesses) - F_PERMISSION: Same as an address translation. However, when CLASS == CD, the access is implicitly Data and a read. - F_ACCESS: Same as an address translation. - F_TRANSLATION: Same as an address translation. - F_WALK_EABT: Same as an address translation. These are already implemented in the PTW logic, so no extra handling required. As in CD and TTBx translation context, the iova is not known, setting the InputAddr was removed from "smmuv3_do_translate" and set after from "smmuv3_translate" with the new function "smmuv3_fixup_event" Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-9-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmuv3.c | 120 +++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 103 insertions(+), 17 deletions(-) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index dc63e07d68..5c5fee2799 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -337,14 +337,35 @@ static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf, } +static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, + SMMUTransCfg *cfg, + SMMUEventInfo *event, + IOMMUAccessFlags flag, + SMMUTLBEntry **out_entry, + SMMUTranslationClass class); /* @ssid > 0 not supported yet */ -static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid, - CD *buf, SMMUEventInfo *event) +static int smmu_get_cd(SMMUv3State *s, STE *ste, SMMUTransCfg *cfg, + uint32_t ssid, CD *buf, SMMUEventInfo *event) { dma_addr_t addr = STE_CTXPTR(ste); int ret, i; + SMMUTranslationStatus status; + SMMUTLBEntry *entry; trace_smmuv3_get_cd(addr); + + if (cfg->stage == SMMU_NESTED) { + status = smmuv3_do_translate(s, addr, cfg, event, + IOMMU_RO, &entry, SMMU_CLASS_CD); + + /* Same PTW faults are reported but with CLASS = CD. */ + if (status != SMMU_TRANS_SUCCESS) { + return -EINVAL; + } + + addr = CACHED_ENTRY_TO_ADDR(entry, addr); + } + /* TODO: guarantee 64-bit single-copy atomicity */ ret = dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf), MEMTXATTRS_UNSPECIFIED); @@ -659,10 +680,13 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste, return 0; } -static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event) +static int decode_cd(SMMUv3State *s, SMMUTransCfg *cfg, + CD *cd, SMMUEventInfo *event) { int ret = -EINVAL; int i; + SMMUTranslationStatus status; + SMMUTLBEntry *entry; if (!CD_VALID(cd) || !CD_AARCH64(cd)) { goto bad_cd; @@ -713,9 +737,26 @@ static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event) tt->tsz = tsz; tt->ttb = CD_TTB(cd, i); + if (tt->ttb & ~(MAKE_64BIT_MASK(0, cfg->oas))) { goto bad_cd; } + + /* Translate the TTBx, from IPA to PA if nesting is enabled. */ + if (cfg->stage == SMMU_NESTED) { + status = smmuv3_do_translate(s, tt->ttb, cfg, event, IOMMU_RO, + &entry, SMMU_CLASS_TT); + /* + * Same PTW faults are reported but with CLASS = TT. + * If TTBx is larger than the effective stage 1 output addres + * size, it reports C_BAD_CD, which is handled by the above case. + */ + if (status != SMMU_TRANS_SUCCESS) { + return -EINVAL; + } + tt->ttb = CACHED_ENTRY_TO_ADDR(entry, tt->ttb); + } + tt->had = CD_HAD(cd, i); trace_smmuv3_decode_cd_tt(i, tt->tsz, tt->ttb, tt->granule_sz, tt->had); } @@ -767,12 +808,12 @@ static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg, return 0; } - ret = smmu_get_cd(s, &ste, 0 /* ssid */, &cd, event); + ret = smmu_get_cd(s, &ste, cfg, 0 /* ssid */, &cd, event); if (ret) { return ret; } - return decode_cd(cfg, &cd, event); + return decode_cd(s, cfg, &cd, event); } /** @@ -832,58 +873,80 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, SMMUTransCfg *cfg, SMMUEventInfo *event, IOMMUAccessFlags flag, - SMMUTLBEntry **out_entry) + SMMUTLBEntry **out_entry, + SMMUTranslationClass class) { SMMUPTWEventInfo ptw_info = {}; SMMUState *bs = ARM_SMMU(s); SMMUTLBEntry *cached_entry = NULL; + int asid, stage; + bool desc_s2_translation = class != SMMU_CLASS_IN; + + /* + * The function uses the argument class to identify which stage is used: + * - CLASS = IN: Means an input translation, determine the stage from STE. + * - CLASS = CD: Means the addr is an IPA of the CD, and it would be + * translated using the stage-2. + * - CLASS = TT: Means the addr is an IPA of the stage-1 translation table + * and it would be translated using the stage-2. + * For the last 2 cases instead of having intrusive changes in the common + * logic, we modify the cfg to be a stage-2 translation only in case of + * nested, and then restore it after. + */ + if (desc_s2_translation) { + asid = cfg->asid; + stage = cfg->stage; + cfg->asid = -1; + cfg->stage = SMMU_STAGE_2; + } cached_entry = smmu_translate(bs, cfg, addr, flag, &ptw_info); + + if (desc_s2_translation) { + cfg->asid = asid; + cfg->stage = stage; + } + if (!cached_entry) { /* All faults from PTW has S2 field. */ event->u.f_walk_eabt.s2 = (ptw_info.stage == SMMU_STAGE_2); switch (ptw_info.type) { case SMMU_PTW_ERR_WALK_EABT: event->type = SMMU_EVT_F_WALK_EABT; - event->u.f_walk_eabt.addr = addr; event->u.f_walk_eabt.rnw = flag & 0x1; event->u.f_walk_eabt.class = (ptw_info.stage == SMMU_STAGE_2) ? - SMMU_CLASS_IN : SMMU_CLASS_TT; + class : SMMU_CLASS_TT; event->u.f_walk_eabt.addr2 = ptw_info.addr; break; case SMMU_PTW_ERR_TRANSLATION: if (PTW_RECORD_FAULT(cfg)) { event->type = SMMU_EVT_F_TRANSLATION; - event->u.f_translation.addr = addr; event->u.f_translation.addr2 = ptw_info.addr; - event->u.f_translation.class = SMMU_CLASS_IN; + event->u.f_translation.class = class; event->u.f_translation.rnw = flag & 0x1; } break; case SMMU_PTW_ERR_ADDR_SIZE: if (PTW_RECORD_FAULT(cfg)) { event->type = SMMU_EVT_F_ADDR_SIZE; - event->u.f_addr_size.addr = addr; event->u.f_addr_size.addr2 = ptw_info.addr; - event->u.f_addr_size.class = SMMU_CLASS_IN; + event->u.f_addr_size.class = class; event->u.f_addr_size.rnw = flag & 0x1; } break; case SMMU_PTW_ERR_ACCESS: if (PTW_RECORD_FAULT(cfg)) { event->type = SMMU_EVT_F_ACCESS; - event->u.f_access.addr = addr; event->u.f_access.addr2 = ptw_info.addr; - event->u.f_access.class = SMMU_CLASS_IN; + event->u.f_access.class = class; event->u.f_access.rnw = flag & 0x1; } break; case SMMU_PTW_ERR_PERMISSION: if (PTW_RECORD_FAULT(cfg)) { event->type = SMMU_EVT_F_PERMISSION; - event->u.f_permission.addr = addr; event->u.f_permission.addr2 = ptw_info.addr; - event->u.f_permission.class = SMMU_CLASS_IN; + event->u.f_permission.class = class; event->u.f_permission.rnw = flag & 0x1; } break; @@ -896,6 +959,27 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, return SMMU_TRANS_SUCCESS; } +/* + * Sets the InputAddr for an SMMU_TRANS_ERROR, as it can't be + * set from all contexts, as smmuv3_get_config() can return + * translation faults in case of nested translation (for CD + * and TTBx). But in that case the iova is not known. + */ +static void smmuv3_fixup_event(SMMUEventInfo *event, hwaddr iova) +{ + switch (event->type) { + case SMMU_EVT_F_WALK_EABT: + case SMMU_EVT_F_TRANSLATION: + case SMMU_EVT_F_ADDR_SIZE: + case SMMU_EVT_F_ACCESS: + case SMMU_EVT_F_PERMISSION: + event->u.f_walk_eabt.addr = iova; + break; + default: + break; + } +} + /* Entry point to SMMU, does everything. */ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, IOMMUAccessFlags flag, int iommu_idx) @@ -944,7 +1028,8 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr, goto epilogue; } - status = smmuv3_do_translate(s, addr, cfg, &event, flag, &cached_entry); + status = smmuv3_do_translate(s, addr, cfg, &event, flag, + &cached_entry, SMMU_CLASS_IN); epilogue: qemu_mutex_unlock(&s->mutex); @@ -975,6 +1060,7 @@ epilogue: entry.perm); break; case SMMU_TRANS_ERROR: + smmuv3_fixup_event(&event, addr); qemu_log_mask(LOG_GUEST_ERROR, "%s translation failed for iova=0x%"PRIx64" (%s)\n", mr->parent_obj.name, addr, smmu_event_string(event.type)); From 7eb57be1efb8cdcf613fd8b089295ed5a4f48d7d Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:09 +0000 Subject: [PATCH 12/26] hw/arm/smmu-common: Rework TLB lookup for nesting MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the next patch, combine_tlb() will be added which combines 2 TLB entries into one for nested translations, which chooses the granule and level from the smallest entry. This means that with nested translation, an entry can be cached with the granule of stage-2 and not stage-1. However, currently, the lookup for an IOVA is done with input stage granule, which is stage-1 for nested configuration, which will not work with the above logic. This patch reworks lookup in that case, so it falls back to stage-2 granule if no entry is found using stage-1 granule. Also, drop aligning the iova to avoid over-aligning in case the iova is cached with a smaller granule, the TLB lookup will align the iova anyway for each granule and level, and the page table walker doesn't consider the page offset bits. Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-10-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 64 +++++++++++++++++++++++++++++--------------- 1 file changed, 43 insertions(+), 21 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 7fbf8e22fe..e374cae0db 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -66,8 +66,10 @@ SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, uint64_t iova, return key; } -SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg, - SMMUTransTableInfo *tt, hwaddr iova) +static SMMUTLBEntry *smmu_iotlb_lookup_all_levels(SMMUState *bs, + SMMUTransCfg *cfg, + SMMUTransTableInfo *tt, + hwaddr iova) { uint8_t tg = (tt->granule_sz - 10) / 2; uint8_t inputsize = 64 - tt->tsz; @@ -88,6 +90,36 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg, } level++; } + return entry; +} + +/** + * smmu_iotlb_lookup - Look up for a TLB entry. + * @bs: SMMU state which includes the TLB instance + * @cfg: Configuration of the translation + * @tt: Translation table info (granule and tsz) + * @iova: IOVA address to lookup + * + * returns a valid entry on success, otherwise NULL. + * In case of nested translation, tt can be updated to include + * the granule of the found entry as it might different from + * the IOVA granule. + */ +SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg, + SMMUTransTableInfo *tt, hwaddr iova) +{ + SMMUTLBEntry *entry = NULL; + + entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova); + /* + * For nested translation also try the s2 granule, as the TLB will insert + * it if the size of s2 tlb entry was smaller. + */ + if (!entry && (cfg->stage == SMMU_NESTED) && + (cfg->s2cfg.granule_sz != tt->granule_sz)) { + tt->granule_sz = cfg->s2cfg.granule_sz; + entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova); + } if (entry) { cfg->iotlb_hits++; @@ -569,18 +601,21 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, IOMMUAccessFlags flag, SMMUPTWEventInfo *info) { - uint64_t page_mask, aligned_addr; SMMUTLBEntry *cached_entry = NULL; SMMUTransTableInfo *tt; int status; /* - * Combined attributes used for TLB lookup, as only one stage is supported, - * it will hold attributes based on the enabled stage. + * Combined attributes used for TLB lookup, holds the attributes for + * the input stage. */ SMMUTransTableInfo tt_combined; - if (cfg->stage == SMMU_STAGE_1) { + if (cfg->stage == SMMU_STAGE_2) { + /* Stage2. */ + tt_combined.granule_sz = cfg->s2cfg.granule_sz; + tt_combined.tsz = cfg->s2cfg.tsz; + } else { /* Select stage1 translation table. */ tt = select_tt(cfg, addr); if (!tt) { @@ -590,22 +625,9 @@ SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, } tt_combined.granule_sz = tt->granule_sz; tt_combined.tsz = tt->tsz; - - } else { - /* Stage2. */ - tt_combined.granule_sz = cfg->s2cfg.granule_sz; - tt_combined.tsz = cfg->s2cfg.tsz; } - /* - * TLB lookup looks for granule and input size for a translation stage, - * as only one stage is supported right now, choose the right values - * from the configuration. - */ - page_mask = (1ULL << tt_combined.granule_sz) - 1; - aligned_addr = addr & ~page_mask; - - cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, aligned_addr); + cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, addr); if (cached_entry) { if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) { info->type = SMMU_PTW_ERR_PERMISSION; @@ -616,7 +638,7 @@ SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, } cached_entry = g_new0(SMMUTLBEntry, 1); - status = smmu_ptw(cfg, aligned_addr, flag, cached_entry, info); + status = smmu_ptw(cfg, addr, flag, cached_entry, info); if (status) { g_free(cached_entry); return NULL; From d7cdf89c276abeb392cbf9b9d0dde5060a8c778f Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:10 +0000 Subject: [PATCH 13/26] hw/arm/smmu-common: Add support for nested TLB MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This patch adds support for nested (combined) TLB entries. The main function combine_tlb() is not used here but in the next patches, but to simplify the patches it is introduced first. Main changes: 1) New field added in the SMMUTLBEntry struct: parent_perm, for nested TLB, holds the stage-2 permission, this can be used to know the origin of a permission fault from a cached entry as caching the “and” of the permissions loses this information. SMMUPTWEventInfo is used to hold information about PTW faults so the event can be populated, the value of stage used to be set based on the current stage for TLB permission faults, however with the parent_perm, it is now set based on which perm has the missing permission When nesting is not enabled it has the same value as perm which doesn't change the logic. 2) As combined TLB implementation is used, the combination logic chooses: - tg and level from the entry which has the smallest addr_mask. - Based on that the iova that would be cached is recalculated. - Translated_addr is chosen from stage-2. Reviewed-by: Eric Auger Reviewed-by: Jean-Philippe Brucker Signed-off-by: Mostafa Saleh Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-11-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 37 ++++++++++++++++++++++++++++++++---- include/hw/arm/smmu-common.h | 1 + 2 files changed, 34 insertions(+), 4 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index e374cae0db..bf55b9c5a4 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -426,7 +426,8 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, tlbe->entry.translated_addr = gpa; tlbe->entry.iova = iova & ~mask; tlbe->entry.addr_mask = mask; - tlbe->entry.perm = PTE_AP_TO_PERM(ap); + tlbe->parent_perm = PTE_AP_TO_PERM(ap); + tlbe->entry.perm = tlbe->parent_perm; tlbe->level = level; tlbe->granule = granule_sz; return 0; @@ -547,7 +548,8 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, tlbe->entry.translated_addr = gpa; tlbe->entry.iova = ipa & ~mask; tlbe->entry.addr_mask = mask; - tlbe->entry.perm = s2ap; + tlbe->parent_perm = s2ap; + tlbe->entry.perm = tlbe->parent_perm; tlbe->level = level; tlbe->granule = granule_sz; return 0; @@ -562,6 +564,30 @@ error: return -EINVAL; } +/* + * combine S1 and S2 TLB entries into a single entry. + * As a result the S1 entry is overriden with combined data. + */ +static void __attribute__((unused)) combine_tlb(SMMUTLBEntry *tlbe, + SMMUTLBEntry *tlbe_s2, + dma_addr_t iova, + SMMUTransCfg *cfg) +{ + if (tlbe_s2->entry.addr_mask < tlbe->entry.addr_mask) { + tlbe->entry.addr_mask = tlbe_s2->entry.addr_mask; + tlbe->granule = tlbe_s2->granule; + tlbe->level = tlbe_s2->level; + } + + tlbe->entry.translated_addr = CACHED_ENTRY_TO_ADDR(tlbe_s2, + tlbe->entry.translated_addr); + + tlbe->entry.iova = iova & ~tlbe->entry.addr_mask; + /* parent_perm has s2 perm while perm keeps s1 perm. */ + tlbe->parent_perm = tlbe_s2->entry.perm; + return; +} + /** * smmu_ptw - Walk the page tables for an IOVA, according to @cfg * @@ -629,9 +655,12 @@ SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, addr); if (cached_entry) { - if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) { + if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & + cached_entry->parent_perm & IOMMU_WO)) { info->type = SMMU_PTW_ERR_PERMISSION; - info->stage = cfg->stage; + info->stage = !(cached_entry->entry.perm & IOMMU_WO) ? + SMMU_STAGE_1 : + SMMU_STAGE_2; return NULL; } return cached_entry; diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index f793b54388..0877536472 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -77,6 +77,7 @@ typedef struct SMMUTLBEntry { IOMMUTLBEntry entry; uint8_t level; uint8_t granule; + IOMMUAccessFlags parent_perm; } SMMUTLBEntry; /* Stage-2 configuration. */ From f42a0a57c0a22210456e83676f43e88782e43942 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:11 +0000 Subject: [PATCH 14/26] hw/arm/smmu-common: Support nested translation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When nested translation is requested, do the following: - Translate stage-1 table address IPA into PA through stage-2. - Translate stage-1 table walk output (IPA) through stage-2. - Create a single TLB entry from stage-1 and stage-2 translations using logic introduced before. smmu_ptw() has a new argument SMMUState which include the TLB as stage-1 table address can be cached in there. Also in smmu_ptw(), a separate path used for nesting to simplify the code, although some logic can be combined. With nested translation class of translation fault can be different, from the class of the translation, as faults from translating stage-1 tables are considered as CLASS_TT and not CLASS_IN, a new member "is_ipa_descriptor" added to "SMMUPTWEventInfo" to differ faults from walking stage 1 translation table and faults from translating an IPA for a transaction. Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-12-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 74 +++++++++++++++++++++++++++++++----- hw/arm/smmuv3.c | 14 +++++++ include/hw/arm/smmu-common.h | 7 ++-- 3 files changed, 82 insertions(+), 13 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index bf55b9c5a4..912b89b5ee 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -318,8 +318,41 @@ SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova) return NULL; } +/* Translate stage-1 table address using stage-2 page table. */ +static inline int translate_table_addr_ipa(SMMUState *bs, + dma_addr_t *table_addr, + SMMUTransCfg *cfg, + SMMUPTWEventInfo *info) +{ + dma_addr_t addr = *table_addr; + SMMUTLBEntry *cached_entry; + int asid; + + /* + * The translation table walks performed from TTB0 or TTB1 are always + * performed in IPA space if stage 2 translations are enabled. + */ + asid = cfg->asid; + cfg->stage = SMMU_STAGE_2; + cfg->asid = -1; + cached_entry = smmu_translate(bs, cfg, addr, IOMMU_RO, info); + cfg->asid = asid; + cfg->stage = SMMU_NESTED; + + if (cached_entry) { + *table_addr = CACHED_ENTRY_TO_ADDR(cached_entry, addr); + return 0; + } + + info->stage = SMMU_STAGE_2; + info->addr = addr; + info->is_ipa_descriptor = true; + return -EINVAL; +} + /** * smmu_ptw_64_s1 - VMSAv8-64 Walk of the page tables for a given IOVA + * @bs: smmu state which includes TLB instance * @cfg: translation config * @iova: iova to translate * @perm: access type @@ -331,7 +364,7 @@ SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova) * Upon success, @tlbe is filled with translated_addr and entry * permission rights. */ -static int smmu_ptw_64_s1(SMMUTransCfg *cfg, +static int smmu_ptw_64_s1(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) { @@ -381,6 +414,11 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, goto error; } baseaddr = get_table_pte_address(pte, granule_sz); + if (cfg->stage == SMMU_NESTED) { + if (translate_table_addr_ipa(bs, &baseaddr, cfg, info)) { + goto error; + } + } level++; continue; } else if (is_page_pte(pte, level)) { @@ -568,10 +606,8 @@ error: * combine S1 and S2 TLB entries into a single entry. * As a result the S1 entry is overriden with combined data. */ -static void __attribute__((unused)) combine_tlb(SMMUTLBEntry *tlbe, - SMMUTLBEntry *tlbe_s2, - dma_addr_t iova, - SMMUTransCfg *cfg) +static void combine_tlb(SMMUTLBEntry *tlbe, SMMUTLBEntry *tlbe_s2, + dma_addr_t iova, SMMUTransCfg *cfg) { if (tlbe_s2->entry.addr_mask < tlbe->entry.addr_mask) { tlbe->entry.addr_mask = tlbe_s2->entry.addr_mask; @@ -591,6 +627,7 @@ static void __attribute__((unused)) combine_tlb(SMMUTLBEntry *tlbe, /** * smmu_ptw - Walk the page tables for an IOVA, according to @cfg * + * @bs: smmu state which includes TLB instance * @cfg: translation configuration * @iova: iova to translate * @perm: tentative access type @@ -599,11 +636,15 @@ static void __attribute__((unused)) combine_tlb(SMMUTLBEntry *tlbe, * * return 0 on success */ -int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, - SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) +int smmu_ptw(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t iova, + IOMMUAccessFlags perm, SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) { + int ret; + SMMUTLBEntry tlbe_s2; + dma_addr_t ipa; + if (cfg->stage == SMMU_STAGE_1) { - return smmu_ptw_64_s1(cfg, iova, perm, tlbe, info); + return smmu_ptw_64_s1(bs, cfg, iova, perm, tlbe, info); } else if (cfg->stage == SMMU_STAGE_2) { /* * If bypassing stage 1(or unimplemented), the input address is passed @@ -621,7 +662,20 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, return smmu_ptw_64_s2(cfg, iova, perm, tlbe, info); } - g_assert_not_reached(); + /* SMMU_NESTED. */ + ret = smmu_ptw_64_s1(bs, cfg, iova, perm, tlbe, info); + if (ret) { + return ret; + } + + ipa = CACHED_ENTRY_TO_ADDR(tlbe, iova); + ret = smmu_ptw_64_s2(cfg, ipa, perm, &tlbe_s2, info); + if (ret) { + return ret; + } + + combine_tlb(tlbe, &tlbe_s2, iova, cfg); + return 0; } SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, @@ -667,7 +721,7 @@ SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t addr, } cached_entry = g_new0(SMMUTLBEntry, 1); - status = smmu_ptw(cfg, addr, flag, cached_entry, info); + status = smmu_ptw(bs, cfg, addr, flag, cached_entry, info); if (status) { g_free(cached_entry); return NULL; diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 5c5fee2799..0faa08c8d8 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -910,6 +910,20 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, if (!cached_entry) { /* All faults from PTW has S2 field. */ event->u.f_walk_eabt.s2 = (ptw_info.stage == SMMU_STAGE_2); + /* + * Fault class is set as follows based on "class" input to + * the function and to "ptw_info" from "smmu_translate()" + * For stage-1: + * - EABT => CLASS_TT (hardcoded) + * - other events => CLASS_IN (input to function) + * For stage-2 => CLASS_IN (input to function) + * For nested, for all events: + * - CD fetch => CLASS_CD (input to function) + * - walking stage 1 translation table => CLASS_TT (from + * is_ipa_descriptor or input in case of TTBx) + * - s2 translation => CLASS_IN (input to function) + */ + class = ptw_info.is_ipa_descriptor ? SMMU_CLASS_TT : class; switch (ptw_info.type) { case SMMU_PTW_ERR_WALK_EABT: event->type = SMMU_EVT_F_WALK_EABT; diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index 0877536472..a51005e8b8 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -63,6 +63,7 @@ typedef struct SMMUPTWEventInfo { SMMUStage stage; SMMUPTWEventType type; dma_addr_t addr; /* fetched address that induced an abort, if any */ + bool is_ipa_descriptor; /* src for fault in nested translation. */ } SMMUPTWEventInfo; typedef struct SMMUTransTableInfo { @@ -184,9 +185,9 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev) * smmu_ptw - Perform the page table walk for a given iova / access flags * pair, according to @cfg translation config */ -int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, - SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info); - +int smmu_ptw(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t iova, + IOMMUAccessFlags perm, SMMUTLBEntry *tlbe, + SMMUPTWEventInfo *info); /* * smmu_translate - Look for a translation in TLB, if not, do a PTW. From 1ea8a6f59b8d6dbcc5d0aa59380ae45a18231b88 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:12 +0000 Subject: [PATCH 15/26] hw/arm/smmu: Support nesting in smmuv3_range_inval() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit With nesting, we would need to invalidate IPAs without over-invalidating stage-1 IOVAs. This can be done by distinguishing IPAs in the TLBs by having ASID=-1. To achieve that, rework the invalidation for IPAs to have a separate function, while for IOVA invalidation ASID=-1 means invalidate for all ASIDs. Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-13-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 47 ++++++++++++++++++++++++++++++++++++ hw/arm/smmuv3.c | 23 ++++++++++++------ hw/arm/trace-events | 2 +- include/hw/arm/smmu-common.h | 3 ++- 4 files changed, 66 insertions(+), 9 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 912b89b5ee..4b0857ab4d 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -195,6 +195,25 @@ static gboolean smmu_hash_remove_by_asid_vmid_iova(gpointer key, gpointer value, ((entry->iova & ~info->mask) == info->iova); } +static gboolean smmu_hash_remove_by_vmid_ipa(gpointer key, gpointer value, + gpointer user_data) +{ + SMMUTLBEntry *iter = (SMMUTLBEntry *)value; + IOMMUTLBEntry *entry = &iter->entry; + SMMUIOTLBPageInvInfo *info = (SMMUIOTLBPageInvInfo *)user_data; + SMMUIOTLBKey iotlb_key = *(SMMUIOTLBKey *)key; + + if (SMMU_IOTLB_ASID(iotlb_key) >= 0) { + /* This is a stage-1 address. */ + return false; + } + if (info->vmid != SMMU_IOTLB_VMID(iotlb_key)) { + return false; + } + return ((info->iova & ~entry->addr_mask) == entry->iova) || + ((entry->iova & ~info->mask) == info->iova); +} + void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, uint8_t tg, uint64_t num_pages, uint8_t ttl) { @@ -223,6 +242,34 @@ void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, &info); } +/* + * Similar to smmu_iotlb_inv_iova(), but for Stage-2, ASID is always -1, + * in Stage-1 invalidation ASID = -1, means don't care. + */ +void smmu_iotlb_inv_ipa(SMMUState *s, int vmid, dma_addr_t ipa, uint8_t tg, + uint64_t num_pages, uint8_t ttl) +{ + uint8_t granule = tg ? tg * 2 + 10 : 12; + int asid = -1; + + if (ttl && (num_pages == 1)) { + SMMUIOTLBKey key = smmu_get_iotlb_key(asid, vmid, ipa, tg, ttl); + + if (g_hash_table_remove(s->iotlb, &key)) { + return; + } + } + + SMMUIOTLBPageInvInfo info = { + .iova = ipa, + .vmid = vmid, + .mask = (num_pages << granule) - 1}; + + g_hash_table_foreach_remove(s->iotlb, + smmu_hash_remove_by_vmid_ipa, + &info); +} + void smmu_iotlb_inv_asid(SMMUState *s, int asid) { trace_smmu_iotlb_inv_asid(asid); diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 0faa08c8d8..ebf29f3adf 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1168,7 +1168,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, int vmid, } } -static void smmuv3_range_inval(SMMUState *s, Cmd *cmd) +static void smmuv3_range_inval(SMMUState *s, Cmd *cmd, SMMUStage stage) { dma_addr_t end, addr = CMD_ADDR(cmd); uint8_t type = CMD_TYPE(cmd); @@ -1193,9 +1193,13 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd) } if (!tg) { - trace_smmuv3_range_inval(vmid, asid, addr, tg, 1, ttl, leaf); + trace_smmuv3_range_inval(vmid, asid, addr, tg, 1, ttl, leaf, stage); smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, 1); - smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, 1, ttl); + if (stage == SMMU_STAGE_1) { + smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, 1, ttl); + } else { + smmu_iotlb_inv_ipa(s, vmid, addr, tg, 1, ttl); + } return; } @@ -1211,9 +1215,14 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd) uint64_t mask = dma_aligned_pow2_mask(addr, end, 64); num_pages = (mask + 1) >> granule; - trace_smmuv3_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf); + trace_smmuv3_range_inval(vmid, asid, addr, tg, num_pages, + ttl, leaf, stage); smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, num_pages); - smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, num_pages, ttl); + if (stage == SMMU_STAGE_1) { + smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, num_pages, ttl); + } else { + smmu_iotlb_inv_ipa(s, vmid, addr, tg, num_pages, ttl); + } addr += mask + 1; } } @@ -1368,7 +1377,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) cmd_error = SMMU_CERROR_ILL; break; } - smmuv3_range_inval(bs, &cmd); + smmuv3_range_inval(bs, &cmd, SMMU_STAGE_1); break; case SMMU_CMD_TLBI_S12_VMALL: { @@ -1393,7 +1402,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) * As currently only either s1 or s2 are supported * we can reuse same function for s2. */ - smmuv3_range_inval(bs, &cmd); + smmuv3_range_inval(bs, &cmd, SMMU_STAGE_2); break; case SMMU_CMD_TLBI_EL3_ALL: case SMMU_CMD_TLBI_EL3_VA: diff --git a/hw/arm/trace-events b/hw/arm/trace-events index 09ccd39548..7d9c1703da 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -46,7 +46,7 @@ smmuv3_cmdq_cfgi_ste_range(int start, int end) "start=0x%x - end=0x%x" smmuv3_cmdq_cfgi_cd(uint32_t sid) "sid=0x%x" smmuv3_config_cache_hit(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache HIT for sid=0x%x (hits=%d, misses=%d, hit rate=%d)" smmuv3_config_cache_miss(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache MISS for sid=0x%x (hits=%d, misses=%d, hit rate=%d)" -smmuv3_range_inval(int vmid, int asid, uint64_t addr, uint8_t tg, uint64_t num_pages, uint8_t ttl, bool leaf) "vmid=%d asid=%d addr=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" ttl=%d leaf=%d" +smmuv3_range_inval(int vmid, int asid, uint64_t addr, uint8_t tg, uint64_t num_pages, uint8_t ttl, bool leaf, int stage) "vmid=%d asid=%d addr=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" ttl=%d leaf=%d stage=%d" smmuv3_cmdq_tlbi_nh(void) "" smmuv3_cmdq_tlbi_nh_asid(int asid) "asid=%d" smmuv3_cmdq_tlbi_s12_vmid(int vmid) "vmid=%d" diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index a51005e8b8..da9ff45fb5 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -217,7 +217,8 @@ void smmu_iotlb_inv_asid(SMMUState *s, int asid); void smmu_iotlb_inv_vmid(SMMUState *s, int vmid); void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, uint8_t tg, uint64_t num_pages, uint8_t ttl); - +void smmu_iotlb_inv_ipa(SMMUState *s, int vmid, dma_addr_t ipa, uint8_t tg, + uint64_t num_pages, uint8_t ttl); /* Unmap the range of all the notifiers registered to any IOMMU mr */ void smmu_inv_notifiers_all(SMMUState *s); From eb41313c4be1d9b3c8fcd43ee706c10eabd5c188 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:13 +0000 Subject: [PATCH 16/26] hw/arm/smmu: Introduce smmu_iotlb_inv_asid_vmid MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Soon, Instead of doing TLB invalidation by ASID only, VMID will be also required. Add smmu_iotlb_inv_asid_vmid() which invalidates by both ASID and VMID. However, at the moment this function is only used in SMMU_CMD_TLBI_NH_ASID which is a stage-1 command, so passing VMID = -1 keeps the original behaviour. Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-14-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 20 +++++++++++++------- hw/arm/smmuv3.c | 2 +- hw/arm/trace-events | 2 +- include/hw/arm/smmu-common.h | 2 +- 4 files changed, 16 insertions(+), 10 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 4b0857ab4d..e7f9c758fa 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -159,13 +159,14 @@ void smmu_iotlb_inv_all(SMMUState *s) g_hash_table_remove_all(s->iotlb); } -static gboolean smmu_hash_remove_by_asid(gpointer key, gpointer value, - gpointer user_data) +static gboolean smmu_hash_remove_by_asid_vmid(gpointer key, gpointer value, + gpointer user_data) { - int asid = *(int *)user_data; + SMMUIOTLBPageInvInfo *info = (SMMUIOTLBPageInvInfo *)user_data; SMMUIOTLBKey *iotlb_key = (SMMUIOTLBKey *)key; - return SMMU_IOTLB_ASID(*iotlb_key) == asid; + return (SMMU_IOTLB_ASID(*iotlb_key) == info->asid) && + (SMMU_IOTLB_VMID(*iotlb_key) == info->vmid); } static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value, @@ -270,10 +271,15 @@ void smmu_iotlb_inv_ipa(SMMUState *s, int vmid, dma_addr_t ipa, uint8_t tg, &info); } -void smmu_iotlb_inv_asid(SMMUState *s, int asid) +void smmu_iotlb_inv_asid_vmid(SMMUState *s, int asid, int vmid) { - trace_smmu_iotlb_inv_asid(asid); - g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid); + SMMUIOTLBPageInvInfo info = { + .asid = asid, + .vmid = vmid, + }; + + trace_smmu_iotlb_inv_asid_vmid(asid, vmid); + g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid_vmid, &info); } void smmu_iotlb_inv_vmid(SMMUState *s, int vmid) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index ebf29f3adf..847fc56676 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1357,7 +1357,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) trace_smmuv3_cmdq_tlbi_nh_asid(asid); smmu_inv_notifiers_all(&s->smmu_state); - smmu_iotlb_inv_asid(bs, asid); + smmu_iotlb_inv_asid_vmid(bs, asid, -1); break; } case SMMU_CMD_TLBI_NH_ALL: diff --git a/hw/arm/trace-events b/hw/arm/trace-events index 7d9c1703da..4aa71b1b19 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -11,7 +11,7 @@ smmu_ptw_page_pte(int stage, int level, uint64_t iova, uint64_t baseaddr, uint6 smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB" smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64 smmu_iotlb_inv_all(void) "IOTLB invalidate all" -smmu_iotlb_inv_asid(int asid) "IOTLB invalidate asid=%d" +smmu_iotlb_inv_asid_vmid(int asid, int vmid) "IOTLB invalidate asid=%d vmid=%d" smmu_iotlb_inv_vmid(int vmid) "IOTLB invalidate vmid=%d" smmu_iotlb_inv_iova(int asid, uint64_t addr) "IOTLB invalidate asid=%d addr=0x%"PRIx64 smmu_inv_notifiers_mr(const char *name) "iommu mr=%s" diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index da9ff45fb5..eaee867e45 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -213,7 +213,7 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, SMMUTLBEntry *entry); SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, uint64_t iova, uint8_t tg, uint8_t level); void smmu_iotlb_inv_all(SMMUState *s); -void smmu_iotlb_inv_asid(SMMUState *s, int asid); +void smmu_iotlb_inv_asid_vmid(SMMUState *s, int asid, int vmid); void smmu_iotlb_inv_vmid(SMMUState *s, int vmid); void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, uint8_t tg, uint64_t num_pages, uint8_t ttl); From b8fa4c233bd4235081080fca919159fd770db53f Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:14 +0000 Subject: [PATCH 17/26] hw/arm/smmu: Support nesting in the rest of commands MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Some commands need rework for nesting, as they used to assume S1 and S2 are mutually exclusive: - CMD_TLBI_NH_ASID: Consider VMID if stage-2 is supported - CMD_TLBI_NH_ALL: Consider VMID if stage-2 is supported, otherwise invalidate everything, this required a new vmid invalidation function for stage-1 only (ASID >= 0) Also, rework trace events to reflect the new implementation. Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-15-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 16 ++++++++++++++++ hw/arm/smmuv3.c | 28 ++++++++++++++++++++++++++-- hw/arm/trace-events | 4 +++- include/hw/arm/smmu-common.h | 1 + 4 files changed, 46 insertions(+), 3 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index e7f9c758fa..00d7579cd3 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -178,6 +178,16 @@ static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value, return SMMU_IOTLB_VMID(*iotlb_key) == vmid; } +static gboolean smmu_hash_remove_by_vmid_s1(gpointer key, gpointer value, + gpointer user_data) +{ + int vmid = *(int *)user_data; + SMMUIOTLBKey *iotlb_key = (SMMUIOTLBKey *)key; + + return (SMMU_IOTLB_VMID(*iotlb_key) == vmid) && + (SMMU_IOTLB_ASID(*iotlb_key) >= 0); +} + static gboolean smmu_hash_remove_by_asid_vmid_iova(gpointer key, gpointer value, gpointer user_data) { @@ -288,6 +298,12 @@ void smmu_iotlb_inv_vmid(SMMUState *s, int vmid) g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid); } +inline void smmu_iotlb_inv_vmid_s1(SMMUState *s, int vmid) +{ + trace_smmu_iotlb_inv_vmid_s1(vmid); + g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid_s1, &vmid); +} + /* VMSAv8-64 Translation */ /** diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 847fc56676..b05f2ab929 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1349,25 +1349,49 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) case SMMU_CMD_TLBI_NH_ASID: { int asid = CMD_ASID(&cmd); + int vmid = -1; if (!STAGE1_SUPPORTED(s)) { cmd_error = SMMU_CERROR_ILL; break; } + /* + * VMID is only matched when stage 2 is supported, otherwise set it + * to -1 as the value used for stage-1 only VMIDs. + */ + if (STAGE2_SUPPORTED(s)) { + vmid = CMD_VMID(&cmd); + } + trace_smmuv3_cmdq_tlbi_nh_asid(asid); smmu_inv_notifiers_all(&s->smmu_state); - smmu_iotlb_inv_asid_vmid(bs, asid, -1); + smmu_iotlb_inv_asid_vmid(bs, asid, vmid); break; } case SMMU_CMD_TLBI_NH_ALL: + { + int vmid = -1; + if (!STAGE1_SUPPORTED(s)) { cmd_error = SMMU_CERROR_ILL; break; } + + /* + * If stage-2 is supported, invalidate for this VMID only, otherwise + * invalidate the whole thing. + */ + if (STAGE2_SUPPORTED(s)) { + vmid = CMD_VMID(&cmd); + trace_smmuv3_cmdq_tlbi_nh(vmid); + smmu_iotlb_inv_vmid_s1(bs, vmid); + break; + } QEMU_FALLTHROUGH; + } case SMMU_CMD_TLBI_NSNH_ALL: - trace_smmuv3_cmdq_tlbi_nh(); + trace_smmuv3_cmdq_tlbi_nsnh(); smmu_inv_notifiers_all(&s->smmu_state); smmu_iotlb_inv_all(bs); break; diff --git a/hw/arm/trace-events b/hw/arm/trace-events index 4aa71b1b19..593cc571da 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -13,6 +13,7 @@ smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "base smmu_iotlb_inv_all(void) "IOTLB invalidate all" smmu_iotlb_inv_asid_vmid(int asid, int vmid) "IOTLB invalidate asid=%d vmid=%d" smmu_iotlb_inv_vmid(int vmid) "IOTLB invalidate vmid=%d" +smmu_iotlb_inv_vmid_s1(int vmid) "IOTLB invalidate vmid=%d" smmu_iotlb_inv_iova(int asid, uint64_t addr) "IOTLB invalidate asid=%d addr=0x%"PRIx64 smmu_inv_notifiers_mr(const char *name) "iommu mr=%s" smmu_iotlb_lookup_hit(int asid, int vmid, uint64_t addr, uint32_t hit, uint32_t miss, uint32_t p) "IOTLB cache HIT asid=%d vmid=%d addr=0x%"PRIx64" hit=%d miss=%d hit rate=%d" @@ -47,7 +48,8 @@ smmuv3_cmdq_cfgi_cd(uint32_t sid) "sid=0x%x" smmuv3_config_cache_hit(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache HIT for sid=0x%x (hits=%d, misses=%d, hit rate=%d)" smmuv3_config_cache_miss(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache MISS for sid=0x%x (hits=%d, misses=%d, hit rate=%d)" smmuv3_range_inval(int vmid, int asid, uint64_t addr, uint8_t tg, uint64_t num_pages, uint8_t ttl, bool leaf, int stage) "vmid=%d asid=%d addr=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" ttl=%d leaf=%d stage=%d" -smmuv3_cmdq_tlbi_nh(void) "" +smmuv3_cmdq_tlbi_nh(int vmid) "vmid=%d" +smmuv3_cmdq_tlbi_nsnh(void) "" smmuv3_cmdq_tlbi_nh_asid(int asid) "asid=%d" smmuv3_cmdq_tlbi_s12_vmid(int vmid) "vmid=%d" smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x" diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index eaee867e45..d1a4a64551 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -215,6 +215,7 @@ SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, uint64_t iova, void smmu_iotlb_inv_all(SMMUState *s); void smmu_iotlb_inv_asid_vmid(SMMUState *s, int asid, int vmid); void smmu_iotlb_inv_vmid(SMMUState *s, int vmid); +void smmu_iotlb_inv_vmid_s1(SMMUState *s, int vmid); void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, uint8_t tg, uint64_t num_pages, uint8_t ttl); void smmu_iotlb_inv_ipa(SMMUState *s, int vmid, dma_addr_t ipa, uint8_t tg, From 46727727965f0d0769f167e300bf11aca77aadd6 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:15 +0000 Subject: [PATCH 18/26] hw/arm/smmuv3: Support nested SMMUs in smmuv3_notify_iova() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit IOMMUTLBEvent only understands IOVA, for stage-1 or stage-2 SMMU instances we consider the input address as the IOVA, but when nesting is used, we can't mix stage-1 and stage-2 addresses, so for nesting only stage-1 is considered the IOVA and would be notified. Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-16-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmuv3.c | 39 +++++++++++++++++++++++++-------------- hw/arm/trace-events | 2 +- 2 files changed, 26 insertions(+), 15 deletions(-) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index b05f2ab929..a3cb30501e 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1096,27 +1096,38 @@ epilogue: * @iova: iova * @tg: translation granule (if communicated through range invalidation) * @num_pages: number of @granule sized pages (if tg != 0), otherwise 1 + * @stage: Which stage(1 or 2) is used */ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr, IOMMUNotifier *n, int asid, int vmid, dma_addr_t iova, uint8_t tg, - uint64_t num_pages) + uint64_t num_pages, int stage) { SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu); + SMMUEventInfo eventinfo = {.inval_ste_allowed = true}; + SMMUTransCfg *cfg = smmuv3_get_config(sdev, &eventinfo); IOMMUTLBEvent event; uint8_t granule; - SMMUv3State *s = sdev->smmu; + + if (!cfg) { + return; + } + + /* + * stage is passed from TLB invalidation commands which can be either + * stage-1 or stage-2. + * However, IOMMUTLBEvent only understands IOVA, for stage-1 or stage-2 + * SMMU instances we consider the input address as the IOVA, but when + * nesting is used, we can't mix stage-1 and stage-2 addresses, so for + * nesting only stage-1 is considered the IOVA and would be notified. + */ + if ((stage == SMMU_STAGE_2) && (cfg->stage == SMMU_NESTED)) + return; if (!tg) { - SMMUEventInfo eventinfo = {.inval_ste_allowed = true}; - SMMUTransCfg *cfg = smmuv3_get_config(sdev, &eventinfo); SMMUTransTableInfo *tt; - if (!cfg) { - return; - } - if (asid >= 0 && cfg->asid != asid) { return; } @@ -1125,7 +1136,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr, return; } - if (STAGE1_SUPPORTED(s)) { + if (stage == SMMU_STAGE_1) { tt = select_tt(cfg, iova); if (!tt) { return; @@ -1151,7 +1162,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr, /* invalidate an asid/vmid/iova range tuple in all mr's */ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, uint8_t tg, - uint64_t num_pages) + uint64_t num_pages, int stage) { SMMUDevice *sdev; @@ -1160,10 +1171,10 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, int vmid, IOMMUNotifier *n; trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, vmid, - iova, tg, num_pages); + iova, tg, num_pages, stage); IOMMU_NOTIFIER_FOREACH(n, mr) { - smmuv3_notify_iova(mr, n, asid, vmid, iova, tg, num_pages); + smmuv3_notify_iova(mr, n, asid, vmid, iova, tg, num_pages, stage); } } } @@ -1194,7 +1205,7 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd, SMMUStage stage) if (!tg) { trace_smmuv3_range_inval(vmid, asid, addr, tg, 1, ttl, leaf, stage); - smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, 1); + smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, 1, stage); if (stage == SMMU_STAGE_1) { smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, 1, ttl); } else { @@ -1217,7 +1228,7 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd, SMMUStage stage) num_pages = (mask + 1) >> granule; trace_smmuv3_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf, stage); - smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, num_pages); + smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, num_pages, stage); if (stage == SMMU_STAGE_1) { smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, num_pages, ttl); } else { diff --git a/hw/arm/trace-events b/hw/arm/trace-events index 593cc571da..be6c8f720b 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -55,7 +55,7 @@ smmuv3_cmdq_tlbi_s12_vmid(int vmid) "vmid=%d" smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x" smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s" smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s" -smmuv3_inv_notifiers_iova(const char *name, int asid, int vmid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64 +smmuv3_inv_notifiers_iova(const char *name, int asid, int vmid, uint64_t iova, uint8_t tg, uint64_t num_pages, int stage) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" stage=%d" # strongarm.c strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d" From f9131185937c16da9258a565bbfc543e238d323b Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:16 +0000 Subject: [PATCH 19/26] hw/arm/smmuv3: Handle translation faults according to SMMUPTWEventInfo MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously, to check if faults are enabled, it was sufficient to check the current stage of translation and check the corresponding record_faults flag. However, with nesting, it is possible for stage-1 (nested) translation to trigger a stage-2 fault, so we check SMMUPTWEventInfo as it would have the correct stage set from the page table walk. Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-17-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmuv3.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index a3cb30501e..2d61637aed 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -34,9 +34,10 @@ #include "smmuv3-internal.h" #include "smmu-internal.h" -#define PTW_RECORD_FAULT(cfg) (((cfg)->stage == SMMU_STAGE_1) ? \ - (cfg)->record_faults : \ - (cfg)->s2cfg.record_faults) +#define PTW_RECORD_FAULT(ptw_info, cfg) (((ptw_info).stage == SMMU_STAGE_1 && \ + (cfg)->record_faults) || \ + ((ptw_info).stage == SMMU_STAGE_2 && \ + (cfg)->s2cfg.record_faults)) /** * smmuv3_trigger_irq - pulse @irq if enabled and update @@ -933,7 +934,7 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, event->u.f_walk_eabt.addr2 = ptw_info.addr; break; case SMMU_PTW_ERR_TRANSLATION: - if (PTW_RECORD_FAULT(cfg)) { + if (PTW_RECORD_FAULT(ptw_info, cfg)) { event->type = SMMU_EVT_F_TRANSLATION; event->u.f_translation.addr2 = ptw_info.addr; event->u.f_translation.class = class; @@ -941,7 +942,7 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, } break; case SMMU_PTW_ERR_ADDR_SIZE: - if (PTW_RECORD_FAULT(cfg)) { + if (PTW_RECORD_FAULT(ptw_info, cfg)) { event->type = SMMU_EVT_F_ADDR_SIZE; event->u.f_addr_size.addr2 = ptw_info.addr; event->u.f_addr_size.class = class; @@ -949,7 +950,7 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, } break; case SMMU_PTW_ERR_ACCESS: - if (PTW_RECORD_FAULT(cfg)) { + if (PTW_RECORD_FAULT(ptw_info, cfg)) { event->type = SMMU_EVT_F_ACCESS; event->u.f_access.addr2 = ptw_info.addr; event->u.f_access.class = class; @@ -957,7 +958,7 @@ static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, } break; case SMMU_PTW_ERR_PERMISSION: - if (PTW_RECORD_FAULT(cfg)) { + if (PTW_RECORD_FAULT(ptw_info, cfg)) { event->type = SMMU_EVT_F_PERMISSION; event->u.f_permission.addr2 = ptw_info.addr; event->u.f_permission.class = class; From 58377c363291d76d0a40be4be34f8bc42785e484 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:17 +0000 Subject: [PATCH 20/26] hw/arm/smmuv3: Support and advertise nesting MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Everything is in place, consolidate parsing of STE cfg and setting translation stage. Advertise nesting if stage requested is "nested". Reviewed-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-18-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmuv3.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 2d61637aed..3db6c7c135 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -261,6 +261,9 @@ static void smmuv3_init_regs(SMMUv3State *s) /* Based on sys property, the stages supported in smmu will be advertised.*/ if (s->stage && !strcmp("2", s->stage)) { s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S2P, 1); + } else if (s->stage && !strcmp("nested", s->stage)) { + s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1); + s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S2P, 1); } else { s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1); } @@ -425,8 +428,6 @@ static bool s2_pgtable_config_valid(uint8_t sl0, uint8_t t0sz, uint8_t gran) static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste) { - cfg->stage = SMMU_STAGE_2; - if (STE_S2AA64(ste) == 0x0) { qemu_log_mask(LOG_UNIMP, "SMMUv3 AArch32 tables not supported\n"); @@ -509,6 +510,27 @@ bad_ste: return -EINVAL; } +static void decode_ste_config(SMMUTransCfg *cfg, uint32_t config) +{ + + if (STE_CFG_ABORT(config)) { + cfg->aborted = true; + return; + } + if (STE_CFG_BYPASS(config)) { + cfg->bypassed = true; + return; + } + + if (STE_CFG_S1_ENABLED(config)) { + cfg->stage = SMMU_STAGE_1; + } + + if (STE_CFG_S2_ENABLED(config)) { + cfg->stage |= SMMU_STAGE_2; + } +} + /* Returns < 0 in case of invalid STE, 0 otherwise */ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg, STE *ste, SMMUEventInfo *event) @@ -525,13 +547,9 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg, config = STE_CONFIG(ste); - if (STE_CFG_ABORT(config)) { - cfg->aborted = true; - return 0; - } + decode_ste_config(cfg, config); - if (STE_CFG_BYPASS(config)) { - cfg->bypassed = true; + if (cfg->aborted || cfg->bypassed) { return 0; } @@ -704,7 +722,6 @@ static int decode_cd(SMMUv3State *s, SMMUTransCfg *cfg, /* we support only those at the moment */ cfg->aa64 = true; - cfg->stage = SMMU_STAGE_1; cfg->oas = oas2bits(CD_IPS(cd)); cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas); From 6783a184bb381350c43b4e35b89cd188b40c7266 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Mon, 15 Jul 2024 08:45:18 +0000 Subject: [PATCH 21/26] hw/arm/smmu: Refactor SMMU OAS MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit SMMUv3 OAS is currently hardcoded in the code to 44 bits, for nested configurations that can be a problem, as stage-2 might be shared with the CPU which might have different PARANGE, and according to SMMU manual ARM IHI 0070F.b: 6.3.6 SMMU_IDR5, OAS must match the system physical address size. This patch doesn't change the SMMU OAS, but refactors the code to make it easier to do that: - Rely everywhere on IDR5 for reading OAS instead of using the SMMU_IDR5_OAS macro, so, it is easier just to change IDR5 and it propagages correctly. - Add additional checks when OAS is greater than 48bits. - Remove unused functions/macros: pa_range/MAX_PA. Reviewed-by: Eric Auger Signed-off-by: Mostafa Saleh Reviewed-by: Jean-Philippe Brucker Reviewed-by: Alex Bennée Message-id: 20240715084519.1189624-19-smostafa@google.com Signed-off-by: Peter Maydell --- hw/arm/smmu-common.c | 7 ++++--- hw/arm/smmuv3-internal.h | 13 ------------- hw/arm/smmuv3.c | 35 ++++++++++++++++++++++++++++------- 3 files changed, 32 insertions(+), 23 deletions(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 00d7579cd3..d73ad62211 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -452,7 +452,8 @@ static int smmu_ptw_64_s1(SMMUState *bs, SMMUTransCfg *cfg, inputsize = 64 - tt->tsz; level = 4 - (inputsize - 4) / stride; indexmask = VMSA_IDXMSK(inputsize, stride, level); - baseaddr = extract64(tt->ttb, 0, 48); + + baseaddr = extract64(tt->ttb, 0, cfg->oas); baseaddr &= ~indexmask; while (level < VMSA_LEVELS) { @@ -576,8 +577,8 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, * Get the ttb from concatenated structure. * The offset is the idx * size of each ttb(number of ptes * (sizeof(pte)) */ - uint64_t baseaddr = extract64(cfg->s2cfg.vttb, 0, 48) + (1 << stride) * - idx * sizeof(uint64_t); + uint64_t baseaddr = extract64(cfg->s2cfg.vttb, 0, cfg->s2cfg.eff_ps) + + (1 << stride) * idx * sizeof(uint64_t); dma_addr_t indexmask = VMSA_IDXMSK(inputsize, stride, level); baseaddr &= ~indexmask; diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h index 0f3ecec804..0ebf2eebcf 100644 --- a/hw/arm/smmuv3-internal.h +++ b/hw/arm/smmuv3-internal.h @@ -602,19 +602,6 @@ static inline int oas2bits(int oas_field) return -1; } -static inline int pa_range(STE *ste) -{ - int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS); - - if (!STE_S2AA64(ste)) { - return 40; - } - - return oas2bits(oas_field); -} - -#define MAX_PA(ste) ((1 << pa_range(ste)) - 1) - /* CD fields */ #define CD_VALID(x) extract32((x)->word[0], 31, 1) diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 3db6c7c135..3971976389 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -402,10 +402,10 @@ static bool s2t0sz_valid(SMMUTransCfg *cfg) } if (cfg->s2cfg.granule_sz == 16) { - return (cfg->s2cfg.tsz >= 64 - oas2bits(SMMU_IDR5_OAS)); + return (cfg->s2cfg.tsz >= 64 - cfg->s2cfg.eff_ps); } - return (cfg->s2cfg.tsz >= MAX(64 - oas2bits(SMMU_IDR5_OAS), 16)); + return (cfg->s2cfg.tsz >= MAX(64 - cfg->s2cfg.eff_ps, 16)); } /* @@ -426,8 +426,11 @@ static bool s2_pgtable_config_valid(uint8_t sl0, uint8_t t0sz, uint8_t gran) return nr_concat <= VMSA_MAX_S2_CONCAT; } -static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste) +static int decode_ste_s2_cfg(SMMUv3State *s, SMMUTransCfg *cfg, + STE *ste) { + uint8_t oas = FIELD_EX32(s->idr[5], IDR5, OAS); + if (STE_S2AA64(ste) == 0x0) { qemu_log_mask(LOG_UNIMP, "SMMUv3 AArch32 tables not supported\n"); @@ -460,7 +463,15 @@ static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste) } /* For AA64, The effective S2PS size is capped to the OAS. */ - cfg->s2cfg.eff_ps = oas2bits(MIN(STE_S2PS(ste), SMMU_IDR5_OAS)); + cfg->s2cfg.eff_ps = oas2bits(MIN(STE_S2PS(ste), oas)); + /* + * For SMMUv3.1 and later, when OAS == IAS == 52, the stage 2 input + * range is further limited to 48 bits unless STE.S2TG indicates a + * 64KB granule. + */ + if (cfg->s2cfg.granule_sz != 16) { + cfg->s2cfg.eff_ps = MIN(cfg->s2cfg.eff_ps, 48); + } /* * It is ILLEGAL for the address in S2TTB to be outside the range * described by the effective S2PS value. @@ -536,6 +547,7 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg, STE *ste, SMMUEventInfo *event) { uint32_t config; + uint8_t oas = FIELD_EX32(s->idr[5], IDR5, OAS); int ret; if (!STE_VALID(ste)) { @@ -579,8 +591,8 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg, * Stage-1 OAS defaults to OAS even if not enabled as it would be used * in input address check for stage-2. */ - cfg->oas = oas2bits(SMMU_IDR5_OAS); - ret = decode_ste_s2_cfg(cfg, ste); + cfg->oas = oas2bits(oas); + ret = decode_ste_s2_cfg(s, cfg, ste); if (ret) { goto bad_ste; } @@ -706,6 +718,7 @@ static int decode_cd(SMMUv3State *s, SMMUTransCfg *cfg, int i; SMMUTranslationStatus status; SMMUTLBEntry *entry; + uint8_t oas = FIELD_EX32(s->idr[5], IDR5, OAS); if (!CD_VALID(cd) || !CD_AARCH64(cd)) { goto bad_cd; @@ -724,7 +737,7 @@ static int decode_cd(SMMUv3State *s, SMMUTransCfg *cfg, cfg->aa64 = true; cfg->oas = oas2bits(CD_IPS(cd)); - cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas); + cfg->oas = MIN(oas2bits(oas), cfg->oas); cfg->tbi = CD_TBI(cd); cfg->asid = CD_ASID(cd); cfg->affd = CD_AFFD(cd); @@ -753,6 +766,14 @@ static int decode_cd(SMMUv3State *s, SMMUTransCfg *cfg, goto bad_cd; } + /* + * An address greater than 48 bits in size can only be output from a + * TTD when, in SMMUv3.1 and later, the effective IPS is 52 and a 64KB + * granule is in use for that translation table + */ + if (tt->granule_sz != 16) { + cfg->oas = MIN(cfg->oas, 48); + } tt->tsz = tsz; tt->ttb = CD_TTB(cd, i); From 31d93fedf41c24b0badb38cd9317590d1ef74e37 Mon Sep 17 00:00:00 2001 From: Daniyal Khan Date: Wed, 17 Jul 2024 16:01:47 +1000 Subject: [PATCH 22/26] target/arm: Use float_status copy in sme_fmopa_s MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We made a copy above because the fp exception flags are not propagated back to the FPST register, but then failed to use the copy. Cc: qemu-stable@nongnu.org Fixes: 558e956c719 ("target/arm: Implement FMOPA, FMOPS (non-widening)") Signed-off-by: Daniyal Khan Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Alex Bennée Message-id: 20240717060149.204788-2-richard.henderson@linaro.org [rth: Split from a larger patch] Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Alex Bennée Signed-off-by: Peter Maydell --- target/arm/tcg/sme_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c index e2e0575039..5a6dd76489 100644 --- a/target/arm/tcg/sme_helper.c +++ b/target/arm/tcg/sme_helper.c @@ -916,7 +916,7 @@ void HELPER(sme_fmopa_s)(void *vza, void *vzn, void *vzm, void *vpn, if (pb & 1) { uint32_t *a = vza_row + H1_4(col); uint32_t *m = vzm + H1_4(col); - *a = float32_muladd(n, *m, *a, 0, vst); + *a = float32_muladd(n, *m, *a, 0, &fpst); } col += 4; pb >>= 4; From 207d30b5fdb5b45a36f26eefcf52fe2c1714dd4f Mon Sep 17 00:00:00 2001 From: Richard Henderson Date: Wed, 17 Jul 2024 16:01:48 +1000 Subject: [PATCH 23/26] target/arm: Use FPST_F16 for SME FMOPA (widening) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This operation has float16 inputs and thus must use the FZ16 control not the FZ control. Cc: qemu-stable@nongnu.org Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)") Reported-by: Daniyal Khan Signed-off-by: Richard Henderson Reviewed-by: Alex Bennée Message-id: 20240717060149.204788-3-richard.henderson@linaro.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2374 Signed-off-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Peter Maydell --- target/arm/tcg/translate-sme.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c index 46c7fce8b4..185a8a917b 100644 --- a/target/arm/tcg/translate-sme.c +++ b/target/arm/tcg/translate-sme.c @@ -304,6 +304,7 @@ static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz, } static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz, + ARMFPStatusFlavour e_fpst, gen_helper_gvec_5_ptr *fn) { int svl = streaming_vec_reg_size(s); @@ -319,15 +320,18 @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz, zm = vec_full_reg_ptr(s, a->zm); pn = pred_full_reg_ptr(s, a->pn); pm = pred_full_reg_ptr(s, a->pm); - fpst = fpstatus_ptr(FPST_FPCR); + fpst = fpstatus_ptr(e_fpst); fn(za, zn, zm, pn, pm, fpst, tcg_constant_i32(desc)); return true; } -TRANS_FEAT(FMOPA_h, aa64_sme, do_outprod_fpst, a, MO_32, gen_helper_sme_fmopa_h) -TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a, MO_32, gen_helper_sme_fmopa_s) -TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a, MO_64, gen_helper_sme_fmopa_d) +TRANS_FEAT(FMOPA_h, aa64_sme, do_outprod_fpst, a, + MO_32, FPST_FPCR_F16, gen_helper_sme_fmopa_h) +TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a, + MO_32, FPST_FPCR, gen_helper_sme_fmopa_s) +TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a, + MO_64, FPST_FPCR, gen_helper_sme_fmopa_d) /* TODO: FEAT_EBF16 */ TRANS_FEAT(BFMOPA, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_bfmopa) From f103ecccb68184b53a9c9bd80536cf2d4080f834 Mon Sep 17 00:00:00 2001 From: Daniyal Khan Date: Wed, 17 Jul 2024 16:01:49 +1000 Subject: [PATCH 24/26] tests/tcg/aarch64: Add test cases for SME FMOPA (widening) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Daniyal Khan Signed-off-by: Richard Henderson Reviewed-by: Alex Bennée Message-id: 20240717060149.204788-4-richard.henderson@linaro.org Message-Id: 172090222034.13953.16888708708822922098-1@git.sr.ht [rth: Split test from a larger patch, tidy assembly] Signed-off-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Peter Maydell --- tests/tcg/aarch64/Makefile.target | 5 ++- tests/tcg/aarch64/sme-fmopa-1.c | 63 +++++++++++++++++++++++++++++++ tests/tcg/aarch64/sme-fmopa-2.c | 56 +++++++++++++++++++++++++++ tests/tcg/aarch64/sme-fmopa-3.c | 63 +++++++++++++++++++++++++++++++ 4 files changed, 185 insertions(+), 2 deletions(-) create mode 100644 tests/tcg/aarch64/sme-fmopa-1.c create mode 100644 tests/tcg/aarch64/sme-fmopa-2.c create mode 100644 tests/tcg/aarch64/sme-fmopa-3.c diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index b53218e115..8cc62eb456 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -70,8 +70,9 @@ endif # SME Tests ifneq ($(CROSS_AS_HAS_ARMV9_SME),) -AARCH64_TESTS += sme-outprod1 sme-smopa-1 sme-smopa-2 -sme-outprod1 sme-smopa-1 sme-smopa-2: CFLAGS += $(CROSS_AS_HAS_ARMV9_SME) +SME_TESTS = sme-outprod1 sme-smopa-1 sme-smopa-2 sme-fmopa-1 sme-fmopa-2 sme-fmopa-3 +AARCH64_TESTS += $(SME_TESTS) +$(SME_TESTS): CFLAGS += $(CROSS_AS_HAS_ARMV9_SME) endif # System Registers Tests diff --git a/tests/tcg/aarch64/sme-fmopa-1.c b/tests/tcg/aarch64/sme-fmopa-1.c new file mode 100644 index 0000000000..652c4ea090 --- /dev/null +++ b/tests/tcg/aarch64/sme-fmopa-1.c @@ -0,0 +1,63 @@ +/* + * SME outer product, 1 x 1. + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include + +static void foo(float *dst) +{ + asm(".arch_extension sme\n\t" + "smstart\n\t" + "ptrue p0.s, vl4\n\t" + "fmov z0.s, #1.0\n\t" + /* + * An outer product of a vector of 1.0 by itself should be a matrix of 1.0. + * Note that we are using tile 1 here (za1.s) rather than tile 0. + */ + "zero {za}\n\t" + "fmopa za1.s, p0/m, p0/m, z0.s, z0.s\n\t" + /* + * Read the first 4x4 sub-matrix of elements from tile 1: + * Note that za1h should be interchangeable here. + */ + "mov w12, #0\n\t" + "mova z0.s, p0/m, za1v.s[w12, #0]\n\t" + "mova z1.s, p0/m, za1v.s[w12, #1]\n\t" + "mova z2.s, p0/m, za1v.s[w12, #2]\n\t" + "mova z3.s, p0/m, za1v.s[w12, #3]\n\t" + /* + * And store them to the input pointer (dst in the C code): + */ + "st1w {z0.s}, p0, [%0]\n\t" + "add x0, x0, #16\n\t" + "st1w {z1.s}, p0, [x0]\n\t" + "add x0, x0, #16\n\t" + "st1w {z2.s}, p0, [x0]\n\t" + "add x0, x0, #16\n\t" + "st1w {z3.s}, p0, [x0]\n\t" + "smstop" + : : "r"(dst) + : "x12", "d0", "d1", "d2", "d3", "memory"); +} + +int main() +{ + float dst[16] = { }; + + foo(dst); + + for (int i = 0; i < 16; i++) { + if (dst[i] != 1.0f) { + goto failure; + } + } + /* success */ + return 0; + + failure: + for (int i = 0; i < 16; i++) { + printf("%f%c", dst[i], i % 4 == 3 ? '\n' : ' '); + } + return 1; +} diff --git a/tests/tcg/aarch64/sme-fmopa-2.c b/tests/tcg/aarch64/sme-fmopa-2.c new file mode 100644 index 0000000000..15f0972d83 --- /dev/null +++ b/tests/tcg/aarch64/sme-fmopa-2.c @@ -0,0 +1,56 @@ +/* + * SME outer product, FZ vs FZ16 + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include +#include + +static void test_fmopa(uint32_t *result) +{ + asm(".arch_extension sme\n\t" + "smstart\n\t" /* Z*, P* and ZArray cleared */ + "ptrue p2.b, vl16\n\t" /* Limit vector length to 16 */ + "ptrue p5.b, vl16\n\t" + "movi d0, #0x00ff\n\t" /* fp16 denormal */ + "movi d16, #0x00ff\n\t" + "mov w15, #0x0001000000\n\t" /* FZ=1, FZ16=0 */ + "msr fpcr, x15\n\t" + "fmopa za3.s, p2/m, p5/m, z16.h, z0.h\n\t" + "mov w15, #0\n\t" + "st1w {za3h.s[w15, 0]}, p2, [%0]\n\t" + "add %0, %0, #16\n\t" + "st1w {za3h.s[w15, 1]}, p2, [%0]\n\t" + "mov w15, #2\n\t" + "add %0, %0, #16\n\t" + "st1w {za3h.s[w15, 0]}, p2, [%0]\n\t" + "add %0, %0, #16\n\t" + "st1w {za3h.s[w15, 1]}, p2, [%0]\n\t" + "smstop" + : "+r"(result) : + : "x15", "x16", "p2", "p5", "d0", "d16", "memory"); +} + +int main(void) +{ + uint32_t result[4 * 4] = { }; + + test_fmopa(result); + + if (result[0] != 0x2f7e0100) { + printf("Test failed: Incorrect output in first 4 bytes\n" + "Expected: %08x\n" + "Got: %08x\n", + 0x2f7e0100, result[0]); + return 1; + } + + for (int i = 1; i < 16; ++i) { + if (result[i] != 0) { + printf("Test failed: Non-zero word at position %d\n", i); + return 1; + } + } + + return 0; +} diff --git a/tests/tcg/aarch64/sme-fmopa-3.c b/tests/tcg/aarch64/sme-fmopa-3.c new file mode 100644 index 0000000000..3bfec34914 --- /dev/null +++ b/tests/tcg/aarch64/sme-fmopa-3.c @@ -0,0 +1,63 @@ +/* + * SME outer product, [ 1 2 3 4 ] squared + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include +#include +#include +#include + +static const float i_1234[4] = { + 1.0f, 2.0f, 3.0f, 4.0f +}; + +static const float expected[4] = { + 4.515625f, 5.750000f, 6.984375f, 8.218750f +}; + +static void test_fmopa(float *result) +{ + asm(".arch_extension sme\n\t" + "smstart\n\t" /* ZArray cleared */ + "ptrue p2.b, vl16\n\t" /* Limit vector length to 16 */ + "ld1w {z0.s}, p2/z, [%1]\n\t" + "mov w15, #0\n\t" + "mov za3h.s[w15, 0], p2/m, z0.s\n\t" + "mov za3h.s[w15, 1], p2/m, z0.s\n\t" + "mov w15, #2\n\t" + "mov za3h.s[w15, 0], p2/m, z0.s\n\t" + "mov za3h.s[w15, 1], p2/m, z0.s\n\t" + "msr fpcr, xzr\n\t" + "fmopa za3.s, p2/m, p2/m, z0.h, z0.h\n\t" + "mov w15, #0\n\t" + "st1w {za3h.s[w15, 0]}, p2, [%0]\n" + "add %0, %0, #16\n\t" + "st1w {za3h.s[w15, 1]}, p2, [%0]\n\t" + "mov w15, #2\n\t" + "add %0, %0, #16\n\t" + "st1w {za3h.s[w15, 0]}, p2, [%0]\n\t" + "add %0, %0, #16\n\t" + "st1w {za3h.s[w15, 1]}, p2, [%0]\n\t" + "smstop" + : "+r"(result) : "r"(i_1234) + : "x15", "x16", "p2", "d0", "memory"); +} + +int main(void) +{ + float result[4 * 4] = { }; + int ret = 0; + + test_fmopa(result); + + for (int i = 0; i < 4; i++) { + float actual = result[i]; + if (fabsf(actual - expected[i]) > 0.001f) { + printf("Test failed at element %d: Expected %f, got %f\n", + i, expected[i], actual); + ret = 1; + } + } + return ret; +} From 71328d82896642d79d4e538e5a56c5e6762a219a Mon Sep 17 00:00:00 2001 From: Akihiko Odaki Date: Tue, 16 Jul 2024 21:50:30 +0900 Subject: [PATCH 25/26] tests/arm-cpu-features: Do not assume PMU availability MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Asahi Linux supports KVM but lacks PMU support. Signed-off-by: Akihiko Odaki Reviewed-by: Philippe Mathieu-Daudé Message-id: 20240716-pmu-v3-1-8c7c1858a227@daynix.com Signed-off-by: Peter Maydell --- tests/qtest/arm-cpu-features.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c index 966c65d5c3..cfd6f77353 100644 --- a/tests/qtest/arm-cpu-features.c +++ b/tests/qtest/arm-cpu-features.c @@ -509,6 +509,7 @@ static void test_query_cpu_model_expansion_kvm(const void *data) assert_set_feature(qts, "host", "kvm-no-adjvtime", false); if (g_str_equal(qtest_get_arch(), "aarch64")) { + bool kvm_supports_pmu; bool kvm_supports_steal_time; bool kvm_supports_sve; char max_name[8], name[8]; @@ -537,11 +538,6 @@ static void test_query_cpu_model_expansion_kvm(const void *data) assert_has_feature_enabled(qts, "host", "aarch64"); - /* Enabling and disabling pmu should always work. */ - assert_has_feature_enabled(qts, "host", "pmu"); - assert_set_feature(qts, "host", "pmu", false); - assert_set_feature(qts, "host", "pmu", true); - /* * Some features would be enabled by default, but they're disabled * because this instance of KVM doesn't support them. Test that the @@ -551,11 +547,18 @@ static void test_query_cpu_model_expansion_kvm(const void *data) assert_has_feature(qts, "host", "sve"); resp = do_query_no_props(qts, "host"); + kvm_supports_pmu = resp_get_feature(resp, "pmu"); kvm_supports_steal_time = resp_get_feature(resp, "kvm-steal-time"); kvm_supports_sve = resp_get_feature(resp, "sve"); vls = resp_get_sve_vls(resp); qobject_unref(resp); + if (kvm_supports_pmu) { + /* If we have pmu then we should be able to toggle it. */ + assert_set_feature(qts, "host", "pmu", false); + assert_set_feature(qts, "host", "pmu", true); + } + if (kvm_supports_steal_time) { /* If we have steal-time then we should be able to toggle it. */ assert_set_feature(qts, "host", "kvm-steal-time", false); From 30a1690f2402e6c1582d5b3ebcf7940bfe2fad4b Mon Sep 17 00:00:00 2001 From: Akihiko Odaki Date: Tue, 16 Jul 2024 21:50:33 +0900 Subject: [PATCH 26/26] hvf: arm: Do not advance PC when raising an exception hvf did not advance PC when raising an exception for most unhandled system registers, but it mistakenly advanced PC when raising an exception for GICv3 registers. Cc: qemu-stable@nongnu.org Fixes: a2260983c655 ("hvf: arm: Add support for GICv3") Signed-off-by: Akihiko Odaki Message-id: 20240716-pmu-v3-4-8c7c1858a227@daynix.com Reviewed-by: Peter Maydell Signed-off-by: Peter Maydell --- target/arm/hvf/hvf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c index ef9bc42738..eb090e67a2 100644 --- a/target/arm/hvf/hvf.c +++ b/target/arm/hvf/hvf.c @@ -1278,6 +1278,7 @@ static int hvf_sysreg_read(CPUState *cpu, uint32_t reg, uint32_t rt) /* Call the TCG sysreg handler. This is only safe for GICv3 regs. */ if (!hvf_sysreg_read_cp(cpu, reg, &val)) { hvf_raise_exception(cpu, EXCP_UDEF, syn_uncategorized()); + return 1; } break; case SYSREG_DBGBVR0_EL1: