mirror of https://github.com/xemu-project/xemu.git
![]() The function is a no-op if 'vta' is zero but we're still doing a lot of stuff in this function regardless. vext_set_elems_1s() will ignore every single time (since vta is zero) and we just wasted time. Skip it altogether in this case. Aside from the code simplification there's a noticeable emulation performance gain by doing it. For a regular C binary that does a vectors operation like this: ======= #define SZ 10000000 int main () { int *a = malloc (SZ * sizeof (int)); int *b = malloc (SZ * sizeof (int)); int *c = malloc (SZ * sizeof (int)); for (int i = 0; i < SZ; i++) c[i] = a[i] + b[i]; return c[SZ - 1]; } ======= Emulating it with qemu-riscv64 and RVV takes ~0.3 sec: $ time ~/work/qemu/build/qemu-riscv64 \ -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out real 0m0.303s user 0m0.281s sys 0m0.023s With this skip we take ~0.275 sec: $ time ~/work/qemu/build/qemu-riscv64 \ -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out real 0m0.274s user 0m0.252s sys 0m0.019s This performance gain adds up fast when executing heavy benchmarks like SPEC. Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn> Message-Id: <20230427205708.246679-2-dbarboza@ventanamicro.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> |
||
---|---|---|
.. | ||
alpha | ||
arm | ||
avr | ||
cris | ||
hexagon | ||
hppa | ||
i386 | ||
loongarch | ||
m68k | ||
microblaze | ||
mips | ||
nios2 | ||
openrisc | ||
ppc | ||
riscv | ||
rx | ||
s390x | ||
sh4 | ||
sparc | ||
tricore | ||
xtensa | ||
Kconfig | ||
meson.build |