From 7237c7ce772794d6be10b8b987fe4d02dfd76562 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:01 +0800 Subject: [PATCH 01/29] qapi/machine: Fix an incorrect comment of SMPConfiguration MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The explanation of @cores should be "number of cores per die" but not "number of cores per thread". Let's fix it. Fixes: 1e63fe685804 ("machine: pass QAPI struct to mc->smp_parse") Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-2-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- qapi/machine.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qapi/machine.json b/qapi/machine.json index f1c4983b64..0e91a57a76 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1393,7 +1393,7 @@ # # @dies: number of dies per socket in the CPU topology # -# @cores: number of cores per thread in the CPU topology +# @cores: number of cores per die in the CPU topology # # @threads: number of threads per core in the CPU topology # From c2511b1632e109130df524121dfb7d2413216d3c Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:02 +0800 Subject: [PATCH 02/29] machine: Deprecate "parameter=0" SMP configurations MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the SMP configuration, we should either provide a topology parameter with a reasonable value (greater than zero) or just omit it and QEMU will compute the missing value. The users shouldn't provide a configuration with any parameter of it specified as zero (e.g. -smp 8,sockets=0) which could possibly cause unexpected results in the -smp parsing. So we deprecate this kind of configurations since 6.2 by adding the explicit sanity check. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-3-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/about/deprecated.rst | 15 +++++++++++++++ hw/core/machine.c | 14 ++++++++++++++ qemu-options.hx | 12 +++++++----- 3 files changed, 36 insertions(+), 5 deletions(-) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 2f7db9a98d..0bed6ecb1d 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -160,6 +160,21 @@ Use ``-display sdl`` instead. Use ``-display curses`` instead. +``-smp`` ("parameter=0" SMP configurations) (since 6.2) +''''''''''''''''''''''''''''''''''''''''''''''''''''''' + +Specified CPU topology parameters must be greater than zero. + +In the SMP configuration, users should either provide a CPU topology +parameter with a reasonable value (greater than zero) or just omit it +and QEMU will compute the missing value. + +However, historically it was implicitly allowed for users to provide +a parameter with zero value, which is meaningless and could also possibly +cause unexpected results in the -smp parsing. So support for this kind of +configurations (e.g. -smp 8,sockets=0) is deprecated since 6.2 and will +be removed in the near future, users have to ensure that all the topology +members described with -smp are greater than zero. Plugin argument passing through ``arg=<string>`` (since 6.1) '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' diff --git a/hw/core/machine.c b/hw/core/machine.c index 067f42b528..4e409261c9 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -835,6 +835,20 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, return; } + /* + * Specified CPU topology parameters must be greater than zero, + * explicit configuration like "cpus=0" is not allowed. + */ + if ((config->has_cpus && config->cpus == 0) || + (config->has_sockets && config->sockets == 0) || + (config->has_dies && config->dies == 0) || + (config->has_cores && config->cores == 0) || + (config->has_threads && config->threads == 0) || + (config->has_maxcpus && config->maxcpus == 0)) { + warn_report("Deprecated CPU topology (considered invalid): " + "CPU topology parameters must be greater than zero"); + } + mc->smp_parse(ms, config, errp); if (*errp) { goto out_free; diff --git a/qemu-options.hx b/qemu-options.hx index ceca52818a..4a092a092a 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -233,11 +233,13 @@ SRST of computing the CPU maximum count. Either the initial CPU count, or at least one of the topology parameters - must be specified. Values for any omitted parameters will be computed - from those which are given. Historically preference was given to the - coarsest topology parameters when computing missing values (ie sockets - preferred over cores, which were preferred over threads), however, this - behaviour is considered liable to change. + must be specified. The specified parameters must be greater than zero, + explicit configuration like "cpus=0" is not allowed. Values for any + omitted parameters will be computed from those which are given. + Historically preference was given to the coarsest topology parameters + when computing missing values (ie sockets preferred over cores, which + were preferred over threads), however, this behaviour is considered + liable to change. ERST DEF("numa", HAS_ARG, QEMU_OPTION_numa, From 5d8b5a505571b7927095015c805646f78fc56578 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:03 +0800 Subject: [PATCH 03/29] machine: Minor refactor/fix for the smp parsers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit To pave the way for the functional improvement in later patches, make some refactor/cleanup for the smp parsers, including using local maxcpus instead of ms->smp.max_cpus in the calculation, defaulting dies to 0 initially like other members, cleanup the sanity check for dies. We actually also fix a hidden defect by avoiding directly using the provided *zero value* in the calculation, which could cause a segment fault (e.g. using dies=0 in the calculation). Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-4-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 18 ++++++++++-------- hw/i386/pc.c | 23 ++++++++++++++--------- 2 files changed, 24 insertions(+), 17 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 4e409261c9..8e719e2932 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -752,8 +752,9 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) unsigned sockets = config->has_sockets ? config->sockets : 0; unsigned cores = config->has_cores ? config->cores : 0; unsigned threads = config->has_threads ? config->threads : 0; + unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0; - if (config->has_dies && config->dies != 0 && config->dies != 1) { + if (config->has_dies && config->dies > 1) { error_setg(errp, "dies not supported by this machine's CPU topology"); return; } @@ -766,8 +767,8 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) sockets = sockets > 0 ? sockets : 1; cpus = cores * threads * sockets; } else { - ms->smp.max_cpus = config->has_maxcpus ? config->maxcpus : cpus; - sockets = ms->smp.max_cpus / (cores * threads); + maxcpus = maxcpus > 0 ? maxcpus : cpus; + sockets = maxcpus / (cores * threads); } } else if (cores == 0) { threads = threads > 0 ? threads : 1; @@ -784,26 +785,27 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) return; } - ms->smp.max_cpus = config->has_maxcpus ? config->maxcpus : cpus; + maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (ms->smp.max_cpus < cpus) { + if (maxcpus < cpus) { error_setg(errp, "maxcpus must be equal to or greater than smp"); return; } - if (sockets * cores * threads != ms->smp.max_cpus) { + if (sockets * cores * threads != maxcpus) { error_setg(errp, "Invalid CPU topology: " "sockets (%u) * cores (%u) * threads (%u) " "!= maxcpus (%u)", sockets, cores, threads, - ms->smp.max_cpus); + maxcpus); return; } ms->smp.cpus = cpus; + ms->smp.sockets = sockets; ms->smp.cores = cores; ms->smp.threads = threads; - ms->smp.sockets = sockets; + ms->smp.max_cpus = maxcpus; } static void machine_get_smp(Object *obj, Visitor *v, const char *name, diff --git a/hw/i386/pc.c b/hw/i386/pc.c index df457eceba..92c78d9933 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -718,9 +718,13 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err { unsigned cpus = config->has_cpus ? config->cpus : 0; unsigned sockets = config->has_sockets ? config->sockets : 0; - unsigned dies = config->has_dies ? config->dies : 1; + unsigned dies = config->has_dies ? config->dies : 0; unsigned cores = config->has_cores ? config->cores : 0; unsigned threads = config->has_threads ? config->threads : 0; + unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0; + + /* directly default dies to 1 if it's omitted */ + dies = dies > 0 ? dies : 1; /* compute missing values, prefer sockets over cores over threads */ if (cpus == 0 || sockets == 0) { @@ -730,8 +734,8 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err sockets = sockets > 0 ? sockets : 1; cpus = cores * threads * dies * sockets; } else { - ms->smp.max_cpus = config->has_maxcpus ? config->maxcpus : cpus; - sockets = ms->smp.max_cpus / (cores * threads * dies); + maxcpus = maxcpus > 0 ? maxcpus : cpus; + sockets = maxcpus / (dies * cores * threads); } } else if (cores == 0) { threads = threads > 0 ? threads : 1; @@ -748,27 +752,28 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err return; } - ms->smp.max_cpus = config->has_maxcpus ? config->maxcpus : cpus; + maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (ms->smp.max_cpus < cpus) { + if (maxcpus < cpus) { error_setg(errp, "maxcpus must be equal to or greater than smp"); return; } - if (sockets * dies * cores * threads != ms->smp.max_cpus) { + if (sockets * dies * cores * threads != maxcpus) { error_setg(errp, "Invalid CPU topology deprecated: " "sockets (%u) * dies (%u) * cores (%u) * threads (%u) " "!= maxcpus (%u)", sockets, dies, cores, threads, - ms->smp.max_cpus); + maxcpus); return; } ms->smp.cpus = cpus; - ms->smp.cores = cores; - ms->smp.threads = threads; ms->smp.sockets = sockets; ms->smp.dies = dies; + ms->smp.cores = cores; + ms->smp.threads = threads; + ms->smp.max_cpus = maxcpus; } static From 9a52b508061163df4dae05e708cd2a9cd790ad04 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:04 +0800 Subject: [PATCH 04/29] machine: Uniformly use maxcpus to calculate the omitted parameters MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We are currently using maxcpus to calculate the omitted sockets but using cpus to calculate the omitted cores/threads. This makes cmdlines like: -smp cpus=8,maxcpus=16 -smp cpus=8,cores=4,maxcpus=16 -smp cpus=8,threads=2,maxcpus=16 work fine but the ones like: -smp cpus=8,sockets=2,maxcpus=16 -smp cpus=8,sockets=2,cores=4,maxcpus=16 -smp cpus=8,sockets=2,threads=2,maxcpus=16 break the sanity check. Since we require for a valid config that the product of "sockets * cores * threads" should equal to the maxcpus, we should uniformly use maxcpus to calculate their omitted values. Also the if-branch of "cpus == 0 || sockets == 0" was split into two branches of "cpus == 0" and "sockets == 0" so that we can clearly read that we are parsing the configuration with a preference on cpus over sockets over cores over threads. Note: change in this patch won't affect any existing working cmdlines but improves consistency and allows more incomplete configs to be valid. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-5-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 30 +++++++++++++++--------------- hw/i386/pc.c | 30 +++++++++++++++--------------- 2 files changed, 30 insertions(+), 30 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 8e719e2932..596e758133 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -760,24 +760,26 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) } /* compute missing values, prefer sockets over cores over threads */ - if (cpus == 0 || sockets == 0) { + maxcpus = maxcpus > 0 ? maxcpus : cpus; + + if (cpus == 0) { + sockets = sockets > 0 ? sockets : 1; cores = cores > 0 ? cores : 1; threads = threads > 0 ? threads : 1; - if (cpus == 0) { - sockets = sockets > 0 ? sockets : 1; - cpus = cores * threads * sockets; - } else { - maxcpus = maxcpus > 0 ? maxcpus : cpus; - sockets = maxcpus / (cores * threads); - } + cpus = sockets * cores * threads; + maxcpus = maxcpus > 0 ? maxcpus : cpus; + } else if (sockets == 0) { + cores = cores > 0 ? cores : 1; + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (cores * threads); } else if (cores == 0) { threads = threads > 0 ? threads : 1; - cores = cpus / (sockets * threads); - cores = cores > 0 ? cores : 1; + cores = maxcpus / (sockets * threads); } else if (threads == 0) { - threads = cpus / (cores * sockets); - threads = threads > 0 ? threads : 1; - } else if (sockets * cores * threads < cpus) { + threads = maxcpus / (sockets * cores); + } + + if (sockets * cores * threads < cpus) { error_setg(errp, "cpu topology: " "sockets (%u) * cores (%u) * threads (%u) < " "smp_cpus (%u)", @@ -785,8 +787,6 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) return; } - maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (maxcpus < cpus) { error_setg(errp, "maxcpus must be equal to or greater than smp"); return; diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 92c78d9933..e37e84cc7b 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -727,24 +727,26 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err dies = dies > 0 ? dies : 1; /* compute missing values, prefer sockets over cores over threads */ - if (cpus == 0 || sockets == 0) { + maxcpus = maxcpus > 0 ? maxcpus : cpus; + + if (cpus == 0) { + sockets = sockets > 0 ? sockets : 1; cores = cores > 0 ? cores : 1; threads = threads > 0 ? threads : 1; - if (cpus == 0) { - sockets = sockets > 0 ? sockets : 1; - cpus = cores * threads * dies * sockets; - } else { - maxcpus = maxcpus > 0 ? maxcpus : cpus; - sockets = maxcpus / (dies * cores * threads); - } + cpus = sockets * dies * cores * threads; + maxcpus = maxcpus > 0 ? maxcpus : cpus; + } else if (sockets == 0) { + cores = cores > 0 ? cores : 1; + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (dies * cores * threads); } else if (cores == 0) { threads = threads > 0 ? threads : 1; - cores = cpus / (sockets * dies * threads); - cores = cores > 0 ? cores : 1; + cores = maxcpus / (sockets * dies * threads); } else if (threads == 0) { - threads = cpus / (cores * dies * sockets); - threads = threads > 0 ? threads : 1; - } else if (sockets * dies * cores * threads < cpus) { + threads = maxcpus / (sockets * dies * cores); + } + + if (sockets * dies * cores * threads < cpus) { error_setg(errp, "cpu topology: " "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < " "smp_cpus (%u)", @@ -752,8 +754,6 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err return; } - maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (maxcpus < cpus) { error_setg(errp, "maxcpus must be equal to or greater than smp"); return; From 7d8c5a39628820f6927b8b70c8f54872f5d1a196 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:05 +0800 Subject: [PATCH 05/29] machine: Set the value of cpus to match maxcpus if it's omitted MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently we directly calculate the omitted cpus based on the given incomplete collection of parameters. This makes some cmdlines like: -smp maxcpus=16 -smp sockets=2,maxcpus=16 -smp sockets=2,dies=2,maxcpus=16 -smp sockets=2,cores=4,maxcpus=16 not work. We should probably set the value of cpus to match maxcpus if it's omitted, which will make above configs start to work. So the calculation logic of cpus/maxcpus after this patch will be: When both maxcpus and cpus are omitted, maxcpus will be calculated from the given parameters and cpus will be set equal to maxcpus. When only one of maxcpus and cpus is given then the omitted one will be set to its counterpart's value. Both maxcpus and cpus may be specified, but maxcpus must be equal to or greater than cpus. Note: change in this patch won't affect any existing working cmdlines but allows more incomplete configs to be valid. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-6-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 29 ++++++++++++++++------------- hw/i386/pc.c | 29 ++++++++++++++++------------- qemu-options.hx | 11 ++++++++--- 3 files changed, 40 insertions(+), 29 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 596e758133..d8f458db60 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -760,25 +760,28 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) } /* compute missing values, prefer sockets over cores over threads */ - maxcpus = maxcpus > 0 ? maxcpus : cpus; - - if (cpus == 0) { + if (cpus == 0 && maxcpus == 0) { sockets = sockets > 0 ? sockets : 1; cores = cores > 0 ? cores : 1; threads = threads > 0 ? threads : 1; - cpus = sockets * cores * threads; + } else { maxcpus = maxcpus > 0 ? maxcpus : cpus; - } else if (sockets == 0) { - cores = cores > 0 ? cores : 1; - threads = threads > 0 ? threads : 1; - sockets = maxcpus / (cores * threads); - } else if (cores == 0) { - threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * threads); - } else if (threads == 0) { - threads = maxcpus / (sockets * cores); + + if (sockets == 0) { + cores = cores > 0 ? cores : 1; + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (cores * threads); + } else if (cores == 0) { + threads = threads > 0 ? threads : 1; + cores = maxcpus / (sockets * threads); + } else if (threads == 0) { + threads = maxcpus / (sockets * cores); + } } + maxcpus = maxcpus > 0 ? maxcpus : sockets * cores * threads; + cpus = cpus > 0 ? cpus : maxcpus; + if (sockets * cores * threads < cpus) { error_setg(errp, "cpu topology: " "sockets (%u) * cores (%u) * threads (%u) < " diff --git a/hw/i386/pc.c b/hw/i386/pc.c index e37e84cc7b..f24a1d72ad 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -727,25 +727,28 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err dies = dies > 0 ? dies : 1; /* compute missing values, prefer sockets over cores over threads */ - maxcpus = maxcpus > 0 ? maxcpus : cpus; - - if (cpus == 0) { + if (cpus == 0 && maxcpus == 0) { sockets = sockets > 0 ? sockets : 1; cores = cores > 0 ? cores : 1; threads = threads > 0 ? threads : 1; - cpus = sockets * dies * cores * threads; + } else { maxcpus = maxcpus > 0 ? maxcpus : cpus; - } else if (sockets == 0) { - cores = cores > 0 ? cores : 1; - threads = threads > 0 ? threads : 1; - sockets = maxcpus / (dies * cores * threads); - } else if (cores == 0) { - threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * dies * threads); - } else if (threads == 0) { - threads = maxcpus / (sockets * dies * cores); + + if (sockets == 0) { + cores = cores > 0 ? cores : 1; + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (dies * cores * threads); + } else if (cores == 0) { + threads = threads > 0 ? threads : 1; + cores = maxcpus / (sockets * dies * threads); + } else if (threads == 0) { + threads = maxcpus / (sockets * dies * cores); + } } + maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads; + cpus = cpus > 0 ? cpus : maxcpus; + if (sockets * dies * cores * threads < cpus) { error_setg(errp, "cpu topology: " "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < " diff --git a/qemu-options.hx b/qemu-options.hx index 4a092a092a..2eac830fdb 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -220,9 +220,14 @@ SRST Simulate a SMP system with '\ ``n``\ ' CPUs initially present on the machine type board. On boards supporting CPU hotplug, the optional '\ ``maxcpus``\ ' parameter can be set to enable further CPUs to be - added at runtime. If omitted the maximum number of CPUs will be - set to match the initial CPU count. Both parameters are subject to - an upper limit that is determined by the specific machine type chosen. + added at runtime. When both parameters are omitted, the maximum number + of CPUs will be calculated from the provided topology members and the + initial CPU count will match the maximum number. When only one of them + is given then the omitted one will be set to its counterpart's value. + Both parameters may be specified, but the maximum number of CPUs must + be equal to or greater than the initial CPU count. Both parameters are + subject to an upper limit that is determined by the specific machine + type chosen. To control reporting of CPU topology information, the number of sockets, dies per socket, cores per die, and threads per core can be specified. From 52082d3ba4e81bf9df23d3cd5ddfb2a620e9b267 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:06 +0800 Subject: [PATCH 06/29] machine: Improve the error reporting of smp parsing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We have two requirements for a valid SMP configuration: the product of "sockets * cores * threads" must represent all the possible cpus, i.e., max_cpus, and then must include the initially present cpus, i.e., smp_cpus. So we only need to ensure 1) "sockets * cores * threads == maxcpus" at first and then ensure 2) "maxcpus >= cpus". With a reasonable order of the sanity check, we can simplify the error reporting code. When reporting an error message we also report the exact value of each topology member to make users easily see what's going on. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-7-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 22 +++++++++------------- hw/i386/pc.c | 24 ++++++++++-------------- 2 files changed, 19 insertions(+), 27 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index d8f458db60..e38ab760e6 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -782,25 +782,21 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) maxcpus = maxcpus > 0 ? maxcpus : sockets * cores * threads; cpus = cpus > 0 ? cpus : maxcpus; - if (sockets * cores * threads < cpus) { - error_setg(errp, "cpu topology: " - "sockets (%u) * cores (%u) * threads (%u) < " - "smp_cpus (%u)", - sockets, cores, threads, cpus); + if (sockets * cores * threads != maxcpus) { + error_setg(errp, "Invalid CPU topology: " + "product of the hierarchy must match maxcpus: " + "sockets (%u) * cores (%u) * threads (%u) " + "!= maxcpus (%u)", + sockets, cores, threads, maxcpus); return; } if (maxcpus < cpus) { - error_setg(errp, "maxcpus must be equal to or greater than smp"); - return; - } - - if (sockets * cores * threads != maxcpus) { error_setg(errp, "Invalid CPU topology: " + "maxcpus must be equal to or greater than smp: " "sockets (%u) * cores (%u) * threads (%u) " - "!= maxcpus (%u)", - sockets, cores, threads, - maxcpus); + "== maxcpus (%u) < smp_cpus (%u)", + sockets, cores, threads, maxcpus, cpus); return; } diff --git a/hw/i386/pc.c b/hw/i386/pc.c index f24a1d72ad..9216ad163d 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -749,25 +749,21 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads; cpus = cpus > 0 ? cpus : maxcpus; - if (sockets * dies * cores * threads < cpus) { - error_setg(errp, "cpu topology: " - "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < " - "smp_cpus (%u)", - sockets, dies, cores, threads, cpus); + if (sockets * dies * cores * threads != maxcpus) { + error_setg(errp, "Invalid CPU topology: " + "product of the hierarchy must match maxcpus: " + "sockets (%u) * dies (%u) * cores (%u) * threads (%u) " + "!= maxcpus (%u)", + sockets, dies, cores, threads, maxcpus); return; } if (maxcpus < cpus) { - error_setg(errp, "maxcpus must be equal to or greater than smp"); - return; - } - - if (sockets * dies * cores * threads != maxcpus) { - error_setg(errp, "Invalid CPU topology deprecated: " + error_setg(errp, "Invalid CPU topology: " + "maxcpus must be equal to or greater than smp: " "sockets (%u) * dies (%u) * cores (%u) * threads (%u) " - "!= maxcpus (%u)", - sockets, dies, cores, threads, - maxcpus); + "== maxcpus (%u) < smp_cpus (%u)", + sockets, dies, cores, threads, maxcpus, cpus); return; } From afc8e9aaa7915ebfe1af8eba6994a38b3e556ba9 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:07 +0800 Subject: [PATCH 07/29] qtest/numa-test: Use detailed -smp CLIs in pc_dynamic_cpu_cfg MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since commit 80d7835749 (qemu-options: rewrite help for -smp options), the preference of sockets/cores in -smp parsing is considered liable to change, and actually we are going to change it in a coming commit. So it'll be more stable to use detailed -smp CLIs in testing if we have strong dependency on the parsing results. pc_dynamic_cpu_cfg currently assumes/needs that there will be 2 CPU sockets with "-smp 2". To avoid breaking the test because of parsing logic change, now explicitly use "-smp 2,sockets=2". Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-8-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- tests/qtest/numa-test.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tests/qtest/numa-test.c b/tests/qtest/numa-test.c index c677cd63c4..fd7a2e80a0 100644 --- a/tests/qtest/numa-test.c +++ b/tests/qtest/numa-test.c @@ -265,7 +265,8 @@ static void pc_dynamic_cpu_cfg(const void *data) QTestState *qs; g_autofree char *cli = NULL; - cli = make_cli(data, "-nodefaults --preconfig -machine smp.cpus=2"); + cli = make_cli(data, "-nodefaults --preconfig " + "-machine smp.cpus=2,smp.sockets=2"); qs = qtest_init(cli); /* create 2 numa nodes */ From bbb0c0ec6da8a739b02fbff941b44afc5513d6e4 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:08 +0800 Subject: [PATCH 08/29] qtest/numa-test: Use detailed -smp CLIs in test_def_cpu_split MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since commit 80d7835749 (qemu-options: rewrite help for -smp options), the preference of sockets/cores in -smp parsing is considered liable to change, and actually we are going to change it in a coming commit. So it'll be more stable to use detailed -smp CLIs in the testcases that have strong dependency on the parsing results. Currently, test_def_cpu_split use "-smp 8" and will get 8 CPU sockets based on current parsing rule. But if we change to prefer cores over sockets we will get one CPU socket with 8 cores, and this testcase will not get expected numa set by default on x86_64 (Ok on aarch64). So now explicitly use "-smp 8,sockets=8" to avoid affect from parsing logic change. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-9-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- tests/qtest/numa-test.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tests/qtest/numa-test.c b/tests/qtest/numa-test.c index fd7a2e80a0..90bf68a5b3 100644 --- a/tests/qtest/numa-test.c +++ b/tests/qtest/numa-test.c @@ -42,7 +42,8 @@ static void test_def_cpu_split(const void *data) g_autofree char *s = NULL; g_autofree char *cli = NULL; - cli = make_cli(data, "-machine smp.cpus=8 -numa node,memdev=ram -numa node"); + cli = make_cli(data, "-machine smp.cpus=8,smp.sockets=8 " + "-numa node,memdev=ram -numa node"); qts = qtest_init(cli); s = qtest_hmp(qts, "info numa"); From 4a0af2930a4e4f64ce551152fdb4b9e7be106408 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:09 +0800 Subject: [PATCH 09/29] machine: Prefer cores over sockets in smp parsing since 6.2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the real SMP hardware topology world, it's much more likely that we have high cores-per-socket counts and few sockets totally. While the current preference of sockets over cores in smp parsing results in a virtual cpu topology with low cores-per-sockets counts and a large number of sockets, which is just contrary to the real world. Given that it is better to make the virtual cpu topology be more reflective of the real world and also for the sake of compatibility, we start to prefer cores over sockets over threads in smp parsing since machine type 6.2 for different arches. In this patch, a boolean "smp_prefer_sockets" is added, and we only enable the old preference on older machines and enable the new one since type 6.2 for all arches by using the machine compat mechanism. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/arm/virt.c | 1 + hw/core/machine.c | 35 ++++++++++++++++++++++++++--------- hw/i386/pc.c | 35 ++++++++++++++++++++++++++--------- hw/i386/pc_piix.c | 1 + hw/i386/pc_q35.c | 1 + hw/ppc/spapr.c | 1 + hw/s390x/s390-virtio-ccw.c | 1 + include/hw/boards.h | 1 + qemu-options.hx | 3 ++- 9 files changed, 60 insertions(+), 19 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 1d59f0e59f..8c13deb5db 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -2815,6 +2815,7 @@ static void virt_machine_6_1_options(MachineClass *mc) virt_machine_6_2_options(mc); compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len); + mc->smp_prefer_sockets = true; /* qemu ITS was introduced with 6.2 */ vmc->no_tcg_its = true; diff --git a/hw/core/machine.c b/hw/core/machine.c index e38ab760e6..f2a34d98c0 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -748,6 +748,7 @@ void machine_set_cpu_numa_node(MachineState *machine, static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) { + MachineClass *mc = MACHINE_GET_CLASS(ms); unsigned cpus = config->has_cpus ? config->cpus : 0; unsigned sockets = config->has_sockets ? config->sockets : 0; unsigned cores = config->has_cores ? config->cores : 0; @@ -759,7 +760,7 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) return; } - /* compute missing values, prefer sockets over cores over threads */ + /* compute missing values based on the provided ones */ if (cpus == 0 && maxcpus == 0) { sockets = sockets > 0 ? sockets : 1; cores = cores > 0 ? cores : 1; @@ -767,14 +768,30 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) } else { maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (sockets == 0) { - cores = cores > 0 ? cores : 1; - threads = threads > 0 ? threads : 1; - sockets = maxcpus / (cores * threads); - } else if (cores == 0) { - threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * threads); - } else if (threads == 0) { + if (mc->smp_prefer_sockets) { + /* prefer sockets over cores before 6.2 */ + if (sockets == 0) { + cores = cores > 0 ? cores : 1; + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (cores * threads); + } else if (cores == 0) { + threads = threads > 0 ? threads : 1; + cores = maxcpus / (sockets * threads); + } + } else { + /* prefer cores over sockets since 6.2 */ + if (cores == 0) { + sockets = sockets > 0 ? sockets : 1; + threads = threads > 0 ? threads : 1; + cores = maxcpus / (sockets * threads); + } else if (sockets == 0) { + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (cores * threads); + } + } + + /* try to calculate omitted threads at last */ + if (threads == 0) { threads = maxcpus / (sockets * cores); } } diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 9216ad163d..6cc32f4048 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -716,6 +716,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level) */ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) { + MachineClass *mc = MACHINE_GET_CLASS(ms); unsigned cpus = config->has_cpus ? config->cpus : 0; unsigned sockets = config->has_sockets ? config->sockets : 0; unsigned dies = config->has_dies ? config->dies : 0; @@ -726,7 +727,7 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err /* directly default dies to 1 if it's omitted */ dies = dies > 0 ? dies : 1; - /* compute missing values, prefer sockets over cores over threads */ + /* compute missing values based on the provided ones */ if (cpus == 0 && maxcpus == 0) { sockets = sockets > 0 ? sockets : 1; cores = cores > 0 ? cores : 1; @@ -734,14 +735,30 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err } else { maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (sockets == 0) { - cores = cores > 0 ? cores : 1; - threads = threads > 0 ? threads : 1; - sockets = maxcpus / (dies * cores * threads); - } else if (cores == 0) { - threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * dies * threads); - } else if (threads == 0) { + if (mc->smp_prefer_sockets) { + /* prefer sockets over cores before 6.2 */ + if (sockets == 0) { + cores = cores > 0 ? cores : 1; + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (dies * cores * threads); + } else if (cores == 0) { + threads = threads > 0 ? threads : 1; + cores = maxcpus / (sockets * dies * threads); + } + } else { + /* prefer cores over sockets since 6.2 */ + if (cores == 0) { + sockets = sockets > 0 ? sockets : 1; + threads = threads > 0 ? threads : 1; + cores = maxcpus / (sockets * dies * threads); + } else if (sockets == 0) { + threads = threads > 0 ? threads : 1; + sockets = maxcpus / (dies * cores * threads); + } + } + + /* try to calculate omitted threads at last */ + if (threads == 0) { threads = maxcpus / (sockets * dies * cores); } } diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 6cc834aff6..932e6e2cfe 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -432,6 +432,7 @@ static void pc_i440fx_6_1_machine_options(MachineClass *m) m->is_default = false; compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len); compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len); + m->smp_prefer_sockets = true; } DEFINE_I440FX_MACHINE(v6_1, "pc-i440fx-6.1", NULL, diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 5481d5c965..57ab07e5b7 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -372,6 +372,7 @@ static void pc_q35_6_1_machine_options(MachineClass *m) m->alias = NULL; compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len); compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len); + m->smp_prefer_sockets = true; } DEFINE_Q35_MACHINE(v6_1, "pc-q35-6.1", NULL, diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index b7bee5f4ff..2986108b5a 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -4685,6 +4685,7 @@ static void spapr_machine_6_1_class_options(MachineClass *mc) spapr_machine_6_2_class_options(mc); compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len); smc->pre_6_2_numa_affinity = true; + mc->smp_prefer_sockets = true; } DEFINE_SPAPR_MACHINE(6_1, "6.1", false); diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 61aeccb163..5401c985cf 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -814,6 +814,7 @@ static void ccw_machine_6_1_class_options(MachineClass *mc) { ccw_machine_6_2_class_options(mc); compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len); + mc->smp_prefer_sockets = true; } DEFINE_CCW_MACHINE(6_1, "6.1", false); diff --git a/include/hw/boards.h b/include/hw/boards.h index 463a5514f9..2ae039b74f 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -247,6 +247,7 @@ struct MachineClass { bool nvdimm_supported; bool numa_mem_supported; bool auto_enable_numa; + bool smp_prefer_sockets; const char *default_ram_id; HotplugHandler *(*get_hotplug_handler)(MachineState *machine, diff --git a/qemu-options.hx b/qemu-options.hx index 2eac830fdb..8ef178180d 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -244,7 +244,8 @@ SRST Historically preference was given to the coarsest topology parameters when computing missing values (ie sockets preferred over cores, which were preferred over threads), however, this behaviour is considered - liable to change. + liable to change. Prior to 6.2 the preference was sockets over cores + over threads. Since 6.2 the preference is cores over sockets over threads. ERST DEF("numa", HAS_ARG, QEMU_OPTION_numa, From 69fc28a78dbff2d3414bbafce38320b4433ed583 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:10 +0800 Subject: [PATCH 10/29] machine: Use ms instead of global current_machine in sanity-check MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the sanity-check of smp_cpus and max_cpus against mc in function machine_set_smp(), we are now using ms->smp.max_cpus for the check but using current_machine->smp.max_cpus in the error message. Tweak this by uniformly using the local ms. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-11-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index f2a34d98c0..12d7416053 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -881,7 +881,7 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, } else if (ms->smp.max_cpus > mc->max_cpus) { error_setg(errp, "Invalid SMP CPUs %d. The max CPUs " "supported by machine '%s' is %d", - current_machine->smp.max_cpus, + ms->smp.max_cpus, mc->name, mc->max_cpus); } From 003f230e37d724ac52004d7f8270159da105780f Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:11 +0800 Subject: [PATCH 11/29] machine: Tweak the order of topology members in struct CpuTopology MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Now that all the possible topology parameters are integrated in struct CpuTopology, tweak the order of topology members to be "cpus/sockets/ dies/cores/threads/maxcpus" for readability and consistency. We also tweak the comment by adding explanation of dies parameter. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-12-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 10 +++++----- include/hw/boards.h | 7 ++++--- 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 12d7416053..83cbdcce47 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -829,11 +829,11 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name, { MachineState *ms = MACHINE(obj); SMPConfiguration *config = &(SMPConfiguration){ - .has_cores = true, .cores = ms->smp.cores, + .has_cpus = true, .cpus = ms->smp.cpus, .has_sockets = true, .sockets = ms->smp.sockets, .has_dies = true, .dies = ms->smp.dies, + .has_cores = true, .cores = ms->smp.cores, .has_threads = true, .threads = ms->smp.threads, - .has_cpus = true, .cpus = ms->smp.cpus, .has_maxcpus = true, .maxcpus = ms->smp.max_cpus, }; if (!visit_type_SMPConfiguration(v, name, &config, &error_abort)) { @@ -1060,10 +1060,10 @@ static void machine_initfn(Object *obj) /* default to mc->default_cpus */ ms->smp.cpus = mc->default_cpus; ms->smp.max_cpus = mc->default_cpus; - ms->smp.cores = 1; - ms->smp.dies = 1; - ms->smp.threads = 1; ms->smp.sockets = 1; + ms->smp.dies = 1; + ms->smp.cores = 1; + ms->smp.threads = 1; } static void machine_finalize(Object *obj) diff --git a/include/hw/boards.h b/include/hw/boards.h index 2ae039b74f..2a1bba86c0 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -275,17 +275,18 @@ typedef struct DeviceMemoryState { /** * CpuTopology: * @cpus: the number of present logical processors on the machine - * @cores: the number of cores in one package - * @threads: the number of threads in one core * @sockets: the number of sockets on the machine + * @dies: the number of dies in one socket + * @cores: the number of cores in one die + * @threads: the number of threads in one core * @max_cpus: the maximum number of logical processors on the machine */ typedef struct CpuTopology { unsigned int cpus; + unsigned int sockets; unsigned int dies; unsigned int cores; unsigned int threads; - unsigned int sockets; unsigned int max_cpus; } CpuTopology; From e4a97a893bcd7511aba812969d1fa6fe42dc1931 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:12 +0800 Subject: [PATCH 12/29] machine: Make smp_parse generic enough for all arches MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently the only difference between smp_parse and pc_smp_parse is the support of dies parameter and the related error reporting. With some arch compat variables like "bool dies_supported", we can make smp_parse generic enough for all arches and the PC specific one can be removed. Making smp_parse() generic enough can reduce code duplication and ease the code maintenance, and also allows extending the topology with more arch specific members (e.g., clusters) in the future. Suggested-by: Andrew Jones <drjones@redhat.com> Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-13-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 91 +++++++++++++++++++++++++++++++++++---------- hw/i386/pc.c | 84 +---------------------------------------- include/hw/boards.h | 9 +++++ 3 files changed, 81 insertions(+), 103 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 83cbdcce47..12872d7715 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -746,20 +746,69 @@ void machine_set_cpu_numa_node(MachineState *machine, } } +/* + * Report information of a machine's supported CPU topology hierarchy. + * Topology members will be ordered from the largest to the smallest + * in the string. + */ +static char *cpu_hierarchy_to_string(MachineState *ms) +{ + MachineClass *mc = MACHINE_GET_CLASS(ms); + GString *s = g_string_new(NULL); + + g_string_append_printf(s, "sockets (%u)", ms->smp.sockets); + + if (mc->smp_props.dies_supported) { + g_string_append_printf(s, " * dies (%u)", ms->smp.dies); + } + + g_string_append_printf(s, " * cores (%u)", ms->smp.cores); + g_string_append_printf(s, " * threads (%u)", ms->smp.threads); + + return g_string_free(s, false); +} + +/* + * smp_parse - Generic function used to parse the given SMP configuration + * + * Any missing parameter in "cpus/maxcpus/sockets/cores/threads" will be + * automatically computed based on the provided ones. + * + * In the calculation of omitted sockets/cores/threads: we prefer sockets + * over cores over threads before 6.2, while preferring cores over sockets + * over threads since 6.2. + * + * In the calculation of cpus/maxcpus: When both maxcpus and cpus are omitted, + * maxcpus will be computed from the given parameters and cpus will be set + * equal to maxcpus. When only one of maxcpus and cpus is given then the + * omitted one will be set to its given counterpart's value. Both maxcpus and + * cpus may be specified, but maxcpus must be equal to or greater than cpus. + * + * For compatibility, apart from the parameters that will be computed, newly + * introduced topology members which are likely to be target specific should + * be directly set as 1 if they are omitted (e.g. dies for PC since 4.1). + */ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) { MachineClass *mc = MACHINE_GET_CLASS(ms); unsigned cpus = config->has_cpus ? config->cpus : 0; unsigned sockets = config->has_sockets ? config->sockets : 0; + unsigned dies = config->has_dies ? config->dies : 0; unsigned cores = config->has_cores ? config->cores : 0; unsigned threads = config->has_threads ? config->threads : 0; unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0; - if (config->has_dies && config->dies > 1) { + /* + * If not supported by the machine, a topology parameter must be + * omitted or specified equal to 1. + */ + if (!mc->smp_props.dies_supported && dies > 1) { error_setg(errp, "dies not supported by this machine's CPU topology"); return; } + dies = dies > 0 ? dies : 1; + /* compute missing values based on the provided ones */ if (cpus == 0 && maxcpus == 0) { sockets = sockets > 0 ? sockets : 1; @@ -773,55 +822,57 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) if (sockets == 0) { cores = cores > 0 ? cores : 1; threads = threads > 0 ? threads : 1; - sockets = maxcpus / (cores * threads); + sockets = maxcpus / (dies * cores * threads); } else if (cores == 0) { threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * threads); + cores = maxcpus / (sockets * dies * threads); } } else { /* prefer cores over sockets since 6.2 */ if (cores == 0) { sockets = sockets > 0 ? sockets : 1; threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * threads); + cores = maxcpus / (sockets * dies * threads); } else if (sockets == 0) { threads = threads > 0 ? threads : 1; - sockets = maxcpus / (cores * threads); + sockets = maxcpus / (dies * cores * threads); } } /* try to calculate omitted threads at last */ if (threads == 0) { - threads = maxcpus / (sockets * cores); + threads = maxcpus / (sockets * dies * cores); } } - maxcpus = maxcpus > 0 ? maxcpus : sockets * cores * threads; + maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads; cpus = cpus > 0 ? cpus : maxcpus; - if (sockets * cores * threads != maxcpus) { + ms->smp.cpus = cpus; + ms->smp.sockets = sockets; + ms->smp.dies = dies; + ms->smp.cores = cores; + ms->smp.threads = threads; + ms->smp.max_cpus = maxcpus; + + /* sanity-check of the computed topology */ + if (sockets * dies * cores * threads != maxcpus) { + g_autofree char *topo_msg = cpu_hierarchy_to_string(ms); error_setg(errp, "Invalid CPU topology: " "product of the hierarchy must match maxcpus: " - "sockets (%u) * cores (%u) * threads (%u) " - "!= maxcpus (%u)", - sockets, cores, threads, maxcpus); + "%s != maxcpus (%u)", + topo_msg, maxcpus); return; } if (maxcpus < cpus) { + g_autofree char *topo_msg = cpu_hierarchy_to_string(ms); error_setg(errp, "Invalid CPU topology: " "maxcpus must be equal to or greater than smp: " - "sockets (%u) * cores (%u) * threads (%u) " - "== maxcpus (%u) < smp_cpus (%u)", - sockets, cores, threads, maxcpus, cpus); + "%s == maxcpus (%u) < smp_cpus (%u)", + topo_msg, maxcpus, cpus); return; } - - ms->smp.cpus = cpus; - ms->smp.sockets = sockets; - ms->smp.cores = cores; - ms->smp.threads = threads; - ms->smp.max_cpus = maxcpus; } static void machine_get_smp(Object *obj, Visitor *v, const char *name, diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 6cc32f4048..28e1b83b9d 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -710,88 +710,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level) } } -/* - * This function is very similar to smp_parse() - * in hw/core/machine.c but includes CPU die support. - */ -static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) -{ - MachineClass *mc = MACHINE_GET_CLASS(ms); - unsigned cpus = config->has_cpus ? config->cpus : 0; - unsigned sockets = config->has_sockets ? config->sockets : 0; - unsigned dies = config->has_dies ? config->dies : 0; - unsigned cores = config->has_cores ? config->cores : 0; - unsigned threads = config->has_threads ? config->threads : 0; - unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0; - - /* directly default dies to 1 if it's omitted */ - dies = dies > 0 ? dies : 1; - - /* compute missing values based on the provided ones */ - if (cpus == 0 && maxcpus == 0) { - sockets = sockets > 0 ? sockets : 1; - cores = cores > 0 ? cores : 1; - threads = threads > 0 ? threads : 1; - } else { - maxcpus = maxcpus > 0 ? maxcpus : cpus; - - if (mc->smp_prefer_sockets) { - /* prefer sockets over cores before 6.2 */ - if (sockets == 0) { - cores = cores > 0 ? cores : 1; - threads = threads > 0 ? threads : 1; - sockets = maxcpus / (dies * cores * threads); - } else if (cores == 0) { - threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * dies * threads); - } - } else { - /* prefer cores over sockets since 6.2 */ - if (cores == 0) { - sockets = sockets > 0 ? sockets : 1; - threads = threads > 0 ? threads : 1; - cores = maxcpus / (sockets * dies * threads); - } else if (sockets == 0) { - threads = threads > 0 ? threads : 1; - sockets = maxcpus / (dies * cores * threads); - } - } - - /* try to calculate omitted threads at last */ - if (threads == 0) { - threads = maxcpus / (sockets * dies * cores); - } - } - - maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads; - cpus = cpus > 0 ? cpus : maxcpus; - - if (sockets * dies * cores * threads != maxcpus) { - error_setg(errp, "Invalid CPU topology: " - "product of the hierarchy must match maxcpus: " - "sockets (%u) * dies (%u) * cores (%u) * threads (%u) " - "!= maxcpus (%u)", - sockets, dies, cores, threads, maxcpus); - return; - } - - if (maxcpus < cpus) { - error_setg(errp, "Invalid CPU topology: " - "maxcpus must be equal to or greater than smp: " - "sockets (%u) * dies (%u) * cores (%u) * threads (%u) " - "== maxcpus (%u) < smp_cpus (%u)", - sockets, dies, cores, threads, maxcpus, cpus); - return; - } - - ms->smp.cpus = cpus; - ms->smp.sockets = sockets; - ms->smp.dies = dies; - ms->smp.cores = cores; - ms->smp.threads = threads; - ms->smp.max_cpus = maxcpus; -} - static void pc_machine_done(Notifier *notifier, void *data) { @@ -1755,7 +1673,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) mc->auto_enable_numa_with_memdev = true; mc->has_hotpluggable_cpus = true; mc->default_boot_order = "cad"; - mc->smp_parse = pc_smp_parse; mc->block_default_type = IF_IDE; mc->max_cpus = 255; mc->reset = pc_machine_reset; @@ -1766,6 +1683,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) hc->unplug = pc_machine_device_unplug_cb; mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE; mc->nvdimm_supported = true; + mc->smp_props.dies_supported = true; mc->default_ram_id = "pc.ram"; object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size", diff --git a/include/hw/boards.h b/include/hw/boards.h index 2a1bba86c0..72a23e4e0f 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -108,6 +108,14 @@ typedef struct { CPUArchId cpus[]; } CPUArchIdList; +/** + * SMPCompatProps: + * @dies_supported - whether dies are supported by the machine + */ +typedef struct { + bool dies_supported; +} SMPCompatProps; + /** * MachineClass: * @deprecation_reason: If set, the machine is marked as deprecated. The @@ -248,6 +256,7 @@ struct MachineClass { bool numa_mem_supported; bool auto_enable_numa; bool smp_prefer_sockets; + SMPCompatProps smp_props; const char *default_ram_id; HotplugHandler *(*get_hotplug_handler)(MachineState *machine, From 7687b2b3edc3f29ad58e8d6593d5c10dde406c34 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:13 +0800 Subject: [PATCH 13/29] machine: Remove smp_parse callback from MachineClass MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Now we have a generic smp parser for all arches, and there will not be any other arch specific ones, so let's remove the callback from MachineClass and call the parser directly. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-14-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 3 +-- include/hw/boards.h | 5 ----- 2 files changed, 1 insertion(+), 7 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 12872d7715..8b0f1aed83 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -918,7 +918,7 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, "CPU topology parameters must be greater than zero"); } - mc->smp_parse(ms, config, errp); + smp_parse(ms, config, errp); if (*errp) { goto out_free; } @@ -947,7 +947,6 @@ static void machine_class_init(ObjectClass *oc, void *data) /* Default 128 MB as guest ram size */ mc->default_ram_size = 128 * MiB; mc->rom_file_has_mr = true; - mc->smp_parse = smp_parse; /* numa node memory size aligned on 8MB by default. * On Linux, each node's border has to be 8MB aligned diff --git a/include/hw/boards.h b/include/hw/boards.h index 72a23e4e0f..fa284e01e9 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -177,10 +177,6 @@ typedef struct { * kvm-type may be NULL if it is not needed. * @numa_mem_supported: * true if '--numa node.mem' option is supported and false otherwise - * @smp_parse: - * The function pointer to hook different machine specific functions for - * parsing "smp-opts" from QemuOpts to MachineState::CpuTopology and more - * machine specific topology fields, such as smp_dies for PCMachine. * @hotplug_allowed: * If the hook is provided, then it'll be called for each device * hotplug to check whether the device hotplug is allowed. Return @@ -217,7 +213,6 @@ struct MachineClass { void (*reset)(MachineState *state); void (*wakeup)(MachineState *state); int (*kvm_type)(MachineState *machine, const char *arg); - void (*smp_parse)(MachineState *ms, SMPConfiguration *config, Error **errp); BlockInterfaceType block_default_type; int units_per_default_bus; From 2b52619994ab48504c5fc0d32a1af24159405ce0 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:14 +0800 Subject: [PATCH 14/29] machine: Move smp_prefer_sockets to struct SMPCompatProps MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Now we have a common structure SMPCompatProps used to store information about SMP compatibility stuff, so we can also move smp_prefer_sockets there for cleaner code. No functional change intended. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-15-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/arm/virt.c | 2 +- hw/core/machine.c | 2 +- hw/i386/pc_piix.c | 2 +- hw/i386/pc_q35.c | 2 +- hw/ppc/spapr.c | 2 +- hw/s390x/s390-virtio-ccw.c | 2 +- include/hw/boards.h | 3 ++- 7 files changed, 8 insertions(+), 7 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 8c13deb5db..7170aaacd5 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -2815,7 +2815,7 @@ static void virt_machine_6_1_options(MachineClass *mc) virt_machine_6_2_options(mc); compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len); - mc->smp_prefer_sockets = true; + mc->smp_props.prefer_sockets = true; /* qemu ITS was introduced with 6.2 */ vmc->no_tcg_its = true; diff --git a/hw/core/machine.c b/hw/core/machine.c index 8b0f1aed83..54f04a5ac6 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -817,7 +817,7 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) } else { maxcpus = maxcpus > 0 ? maxcpus : cpus; - if (mc->smp_prefer_sockets) { + if (mc->smp_props.prefer_sockets) { /* prefer sockets over cores before 6.2 */ if (sockets == 0) { cores = cores > 0 ? cores : 1; diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 932e6e2cfe..6ad0d763c5 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -432,7 +432,7 @@ static void pc_i440fx_6_1_machine_options(MachineClass *m) m->is_default = false; compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len); compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len); - m->smp_prefer_sockets = true; + m->smp_props.prefer_sockets = true; } DEFINE_I440FX_MACHINE(v6_1, "pc-i440fx-6.1", NULL, diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 57ab07e5b7..fcc6e4eb2b 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -372,7 +372,7 @@ static void pc_q35_6_1_machine_options(MachineClass *m) m->alias = NULL; compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len); compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len); - m->smp_prefer_sockets = true; + m->smp_props.prefer_sockets = true; } DEFINE_Q35_MACHINE(v6_1, "pc-q35-6.1", NULL, diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 2986108b5a..163c90388a 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -4685,7 +4685,7 @@ static void spapr_machine_6_1_class_options(MachineClass *mc) spapr_machine_6_2_class_options(mc); compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len); smc->pre_6_2_numa_affinity = true; - mc->smp_prefer_sockets = true; + mc->smp_props.prefer_sockets = true; } DEFINE_SPAPR_MACHINE(6_1, "6.1", false); diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 5401c985cf..653587ea62 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -814,7 +814,7 @@ static void ccw_machine_6_1_class_options(MachineClass *mc) { ccw_machine_6_2_class_options(mc); compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len); - mc->smp_prefer_sockets = true; + mc->smp_props.prefer_sockets = true; } DEFINE_CCW_MACHINE(6_1, "6.1", false); diff --git a/include/hw/boards.h b/include/hw/boards.h index fa284e01e9..5adbcbb99b 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -110,9 +110,11 @@ typedef struct { /** * SMPCompatProps: + * @prefer_sockets - whether sockets are preferred over cores in smp parsing * @dies_supported - whether dies are supported by the machine */ typedef struct { + bool prefer_sockets; bool dies_supported; } SMPCompatProps; @@ -250,7 +252,6 @@ struct MachineClass { bool nvdimm_supported; bool numa_mem_supported; bool auto_enable_numa; - bool smp_prefer_sockets; SMPCompatProps smp_props; const char *default_ram_id; From e7f944bb94a375e8ee7469ffa535ea6e11ce59e1 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Fri, 1 Oct 2021 19:04:03 +0200 Subject: [PATCH 15/29] machine: Use g_autoptr in machine_set_smp Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 54f04a5ac6..d49ebc24e2 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -897,7 +897,7 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, { MachineClass *mc = MACHINE_GET_CLASS(obj); MachineState *ms = MACHINE(obj); - SMPConfiguration *config; + g_autoptr(SMPConfiguration) config = NULL; ERRP_GUARD(); if (!visit_type_SMPConfiguration(v, name, &config, errp)) { @@ -920,7 +920,7 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, smp_parse(ms, config, errp); if (*errp) { - goto out_free; + return; } /* sanity-check smp_cpus and max_cpus against mc */ @@ -935,9 +935,6 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, ms->smp.max_cpus, mc->name, mc->max_cpus); } - -out_free: - qapi_free_SMPConfiguration(config); } static void machine_class_init(ObjectClass *oc, void *data) From 8bdfec393a2ac67df9cecc0983d130db4a6bba58 Mon Sep 17 00:00:00 2001 From: Yanan Wang <wangyanan55@huawei.com> Date: Wed, 29 Sep 2021 10:58:15 +0800 Subject: [PATCH 16/29] machine: Put all sanity-check in the generic SMP parser MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Put both sanity-check of the input SMP configuration and sanity-check of the output SMP configuration uniformly in the generic parser. Then machine_set_smp() will become cleaner, also all the invalid scenarios can be tested only by calling the parser. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-16-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/core/machine.c | 62 +++++++++++++++++++++++------------------------ 1 file changed, 30 insertions(+), 32 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index d49ebc24e2..3920a2f2af 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -798,6 +798,20 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) unsigned threads = config->has_threads ? config->threads : 0; unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0; + /* + * Specified CPU topology parameters must be greater than zero, + * explicit configuration like "cpus=0" is not allowed. + */ + if ((config->has_cpus && config->cpus == 0) || + (config->has_sockets && config->sockets == 0) || + (config->has_dies && config->dies == 0) || + (config->has_cores && config->cores == 0) || + (config->has_threads && config->threads == 0) || + (config->has_maxcpus && config->maxcpus == 0)) { + warn_report("Deprecated CPU topology (considered invalid): " + "CPU topology parameters must be greater than zero"); + } + /* * If not supported by the machine, a topology parameter must be * omitted or specified equal to 1. @@ -873,6 +887,22 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) topo_msg, maxcpus, cpus); return; } + + if (ms->smp.cpus < mc->min_cpus) { + error_setg(errp, "Invalid SMP CPUs %d. The min CPUs " + "supported by machine '%s' is %d", + ms->smp.cpus, + mc->name, mc->min_cpus); + return; + } + + if (ms->smp.max_cpus > mc->max_cpus) { + error_setg(errp, "Invalid SMP CPUs %d. The max CPUs " + "supported by machine '%s' is %d", + ms->smp.max_cpus, + mc->name, mc->max_cpus); + return; + } } static void machine_get_smp(Object *obj, Visitor *v, const char *name, @@ -895,46 +925,14 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name, static void machine_set_smp(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { - MachineClass *mc = MACHINE_GET_CLASS(obj); MachineState *ms = MACHINE(obj); g_autoptr(SMPConfiguration) config = NULL; - ERRP_GUARD(); if (!visit_type_SMPConfiguration(v, name, &config, errp)) { return; } - /* - * Specified CPU topology parameters must be greater than zero, - * explicit configuration like "cpus=0" is not allowed. - */ - if ((config->has_cpus && config->cpus == 0) || - (config->has_sockets && config->sockets == 0) || - (config->has_dies && config->dies == 0) || - (config->has_cores && config->cores == 0) || - (config->has_threads && config->threads == 0) || - (config->has_maxcpus && config->maxcpus == 0)) { - warn_report("Deprecated CPU topology (considered invalid): " - "CPU topology parameters must be greater than zero"); - } - smp_parse(ms, config, errp); - if (*errp) { - return; - } - - /* sanity-check smp_cpus and max_cpus against mc */ - if (ms->smp.cpus < mc->min_cpus) { - error_setg(errp, "Invalid SMP CPUs %d. The min CPUs " - "supported by machine '%s' is %d", - ms->smp.cpus, - mc->name, mc->min_cpus); - } else if (ms->smp.max_cpus > mc->max_cpus) { - error_setg(errp, "Invalid SMP CPUs %d. The max CPUs " - "supported by machine '%s' is %d", - ms->smp.max_cpus, - mc->name, mc->max_cpus); - } } static void machine_class_init(ObjectClass *oc, void *data) From 988f7b8bfeffbf521814d1e48c321f7674277512 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 2 Sep 2021 11:35:25 +0200 Subject: [PATCH 17/29] i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID By default, KVM allows the guest to use all currently supported PV features even when they were not announced in guest visible CPUIDs. Introduce a new "kvm-pv-enforce-cpuid" flag to limit the supported feature set to the exposed features. The feature is supported by Linux >= 5.10 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- target/i386/cpu.c | 2 ++ target/i386/cpu.h | 3 +++ target/i386/kvm/kvm.c | 10 ++++++++++ 3 files changed, 15 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index cacec605bf..598019de12 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6860,6 +6860,8 @@ static Property x86_cpu_properties[] = { DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true), DEFINE_PROP_BOOL("kvm-no-smi-migration", X86CPU, kvm_no_smi_migration, false), + DEFINE_PROP_BOOL("kvm-pv-enforce-cpuid", X86CPU, kvm_pv_enforce_cpuid, + false), DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true), DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true), DEFINE_PROP_BOOL("x-migrate-smi-count", X86CPU, migrate_smi_count, diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 29552dc2a7..c990150373 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1802,6 +1802,9 @@ struct X86CPU { /* Stop SMI delivery for migration compatibility with old machines */ bool kvm_no_smi_migration; + /* Forcefully disable KVM PV features not exposed in guest CPUIDs */ + bool kvm_pv_enforce_cpuid; + /* Number of physical address bits supported */ uint32_t phys_bits; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 7f1b060e6d..d6a70c27e5 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -1629,6 +1629,16 @@ int kvm_arch_init_vcpu(CPUState *cs) cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused); + if (cpu->kvm_pv_enforce_cpuid) { + r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1); + if (r < 0) { + fprintf(stderr, + "failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s", + strerror(-r)); + abort(); + } + } + for (i = 0; i <= limit; i++) { if (cpuid_i == KVM_MAX_CPUID_ENTRIES) { fprintf(stderr, "unsupported level value: 0x%x\n", limit); From 70367f091777419f42e5f68f4206deb641335877 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 2 Sep 2021 11:35:26 +0200 Subject: [PATCH 18/29] i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID By default, KVM allows the guest to use all currently supported Hyper-V enlightenments when Hyper-V CPUID interface was exposed, regardless of if some features were not announced in guest visible CPUIDs. hv-enforce-cpuid feature alters this behavior and only allows the guest to use exposed Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/hyperv.txt | 17 ++++++++++++++--- target/i386/cpu.c | 1 + target/i386/cpu.h | 1 + target/i386/kvm/kvm.c | 9 +++++++++ 4 files changed, 25 insertions(+), 3 deletions(-) diff --git a/docs/hyperv.txt b/docs/hyperv.txt index 000638a2fd..072709a68f 100644 --- a/docs/hyperv.txt +++ b/docs/hyperv.txt @@ -203,8 +203,11 @@ When the option is set to 'on' QEMU will always enable the feature, regardless of host setup. To keep guests secure, this can only be used in conjunction with exposing correct vCPU topology and vCPU pinning. -4. Development features -======================== +4. Supplementary features +========================= + +4.1. hv-passthrough +=================== In some cases (e.g. during development) it may make sense to use QEMU in 'pass-through' mode and give Windows guests all enlightenments currently supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU @@ -215,8 +218,16 @@ values from KVM to QEMU. "hv-passthrough" overrides all other "hv-*" settings on the command line. Also, enabling this flag effectively prevents migration as the list of enabled enlightenments may differ between target and destination hosts. +4.2. hv-enforce-cpuid +===================== +By default, KVM allows the guest to use all currently supported Hyper-V +enlightenments when Hyper-V CPUID interface was exposed, regardless of if +some features were not announced in guest visible CPUIDs. 'hv-enforce-cpuid' +feature alters this behavior and only allows the guest to use exposed Hyper-V +enlightenments. -4. Useful links + +5. Useful links ================ Hyper-V Top Level Functional specification and other information: https://github.com/MicrosoftDocs/Virtualization-Documentation diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 598019de12..2a19eba56d 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6834,6 +6834,7 @@ static Property x86_cpu_properties[] = { DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU, hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF), DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false), + DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false), DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true), DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false), diff --git a/target/i386/cpu.h b/target/i386/cpu.h index c990150373..8a7209bbf2 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1719,6 +1719,7 @@ struct X86CPU { uint32_t hyperv_version_id[4]; uint32_t hyperv_limits[3]; uint32_t hyperv_nested[4]; + bool hyperv_enforce_cpuid; bool check_cpuid; bool enforce_cpuid; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index d6a70c27e5..fbe6b7ac72 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -1531,6 +1531,15 @@ static int hyperv_init_vcpu(X86CPU *cpu) cpu->hyperv_nested[0] = evmcs_version; } + if (cpu->hyperv_enforce_cpuid) { + ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENFORCE_CPUID, 0, 1); + if (ret < 0) { + error_report("failed to enable KVM_CAP_HYPERV_ENFORCE_CPUID: %s", + strerror(-ret)); + return ret; + } + } + return 0; } From 050716292a63f4969b32cac32b85774521738ef5 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 2 Sep 2021 11:35:27 +0200 Subject: [PATCH 19/29] i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids() In preparation to enabling Hyper-V + APICv/AVIC move HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the 'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature which we may not always want to give. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- target/i386/kvm/kvm.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index fbe6b7ac72..a9a8f77df3 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -821,9 +821,7 @@ static struct { .desc = "virtual APIC (hv-vapic)", .flags = { {.func = HV_CPUID_FEATURES, .reg = R_EAX, - .bits = HV_APIC_ACCESS_AVAILABLE}, - {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX, - .bits = HV_APIC_ACCESS_RECOMMENDED} + .bits = HV_APIC_ACCESS_AVAILABLE} } }, [HYPERV_FEAT_TIME] = { @@ -1366,6 +1364,7 @@ static int hyperv_fill_cpuids(CPUState *cs, c->ebx |= HV_POST_MESSAGES | HV_SIGNAL_EVENTS; } + /* Not exposed by KVM but needed to make CPU hotplug in Windows work */ c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE; @@ -1374,6 +1373,10 @@ static int hyperv_fill_cpuids(CPUState *cs, c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX); c->ebx = cpu->hyperv_spinlock_attempts; + if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) { + c->eax |= HV_APIC_ACCESS_RECOMMENDED; + } + if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) { c->eax |= HV_NO_NONARCH_CORESHARING; } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) { From e1f9a8e8c90ae54387922e33e5ac4fd759747d01 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 2 Sep 2021 11:35:28 +0200 Subject: [PATCH 20/29] i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. Normally, Hyper-V SynIC disables these hardware features and suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15 gains support for conditional APICv/AVIC disablement, the feature stays on until the guest tries to use AutoEOI feature with SynIC. With 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/ Hyper-V versions should follow the recommendation and not use the (unwanted) feature. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-7-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/hyperv.txt | 10 +++++++++- target/i386/cpu.c | 4 ++++ target/i386/cpu.h | 1 + target/i386/kvm/hyperv-proto.h | 1 + target/i386/kvm/kvm.c | 10 +++++++++- 5 files changed, 24 insertions(+), 2 deletions(-) diff --git a/docs/hyperv.txt b/docs/hyperv.txt index 072709a68f..cd1ea3bbe9 100644 --- a/docs/hyperv.txt +++ b/docs/hyperv.txt @@ -189,7 +189,15 @@ enabled. Requires: hv-vpindex, hv-synic, hv-time, hv-stimer -3.17. hv-no-nonarch-coresharing=on/off/auto +3.18. hv-avic (hv-apicv) +======================= +The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. +Normally, Hyper-V SynIC disables these hardware feature and suggests the guest +to use paravirtualized AutoEOI feature. +Note: enabling this feature on old hardware (without APICv/AVIC support) may +have negative effect on guest's performace. + +3.19. hv-no-nonarch-coresharing=on/off/auto =========================================== This enlightenment tells guest OS that virtual processors will never share a physical core unless they are reported as sibling SMT threads. This information diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 2a19eba56d..8154343cc4 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6644,6 +6644,8 @@ static void x86_cpu_initfn(Object *obj) object_property_add_alias(obj, "sse4_1", obj, "sse4.1"); object_property_add_alias(obj, "sse4_2", obj, "sse4.2"); + object_property_add_alias(obj, "hv-apicv", obj, "hv-avic"); + if (xcc->model) { x86_cpu_load_model(cpu, xcc->model); } @@ -6831,6 +6833,8 @@ static Property x86_cpu_properties[] = { HYPERV_FEAT_IPI, 0), DEFINE_PROP_BIT64("hv-stimer-direct", X86CPU, hyperv_features, HYPERV_FEAT_STIMER_DIRECT, 0), + DEFINE_PROP_BIT64("hv-avic", X86CPU, hyperv_features, + HYPERV_FEAT_AVIC, 0), DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU, hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF), DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false), diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 8a7209bbf2..65f0ee2caf 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1056,6 +1056,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS]; #define HYPERV_FEAT_EVMCS 12 #define HYPERV_FEAT_IPI 13 #define HYPERV_FEAT_STIMER_DIRECT 14 +#define HYPERV_FEAT_AVIC 15 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY #define HYPERV_SPINLOCK_NEVER_NOTIFY 0xFFFFFFFF diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h index 5fbb385cc1..89f81afda7 100644 --- a/target/i386/kvm/hyperv-proto.h +++ b/target/i386/kvm/hyperv-proto.h @@ -66,6 +66,7 @@ #define HV_APIC_ACCESS_RECOMMENDED (1u << 3) #define HV_SYSTEM_RESET_RECOMMENDED (1u << 4) #define HV_RELAXED_TIMING_RECOMMENDED (1u << 5) +#define HV_DEPRECATING_AEOI_RECOMMENDED (1u << 9) #define HV_CLUSTER_IPI_RECOMMENDED (1u << 10) #define HV_EX_PROCESSOR_MASKS_RECOMMENDED (1u << 11) #define HV_ENLIGHTENED_VMCS_RECOMMENDED (1u << 14) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index a9a8f77df3..68faf72e34 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -924,6 +924,13 @@ static struct { }, .dependencies = BIT(HYPERV_FEAT_STIMER) }, + [HYPERV_FEAT_AVIC] = { + .desc = "AVIC/APICv support (hv-avic/hv-apicv)", + .flags = { + {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX, + .bits = HV_DEPRECATING_AEOI_RECOMMENDED} + } + }, }; static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max, @@ -1373,7 +1380,8 @@ static int hyperv_fill_cpuids(CPUState *cs, c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX); c->ebx = cpu->hyperv_spinlock_attempts; - if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) { + if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC) && + !hyperv_feat_enabled(cpu, HYPERV_FEAT_AVIC)) { c->eax |= HV_APIC_ACCESS_RECOMMENDED; } From af7228b88dbe80ed5d5258b49be8b48ab351a476 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 2 Sep 2021 11:35:29 +0200 Subject: [PATCH 21/29] i386: Make Hyper-V version id configurable Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to WS2008R2 and it is known that certain tools in Windows check this. It seems useful to provide some flexibility by making it possible to change this info at will. CPUID information is defined in TLFS as: EAX: Build Number EBX Bits 31-16: Major Version Bits 15-0: Minor Version ECX Service Pack EDX Bits 31-24: Service Branch Bits 23-0: Service Number Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-8-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/hyperv.txt | 14 ++++++++++++++ target/i386/cpu.c | 15 +++++++++++---- target/i386/cpu.h | 7 ++++++- target/i386/kvm/kvm.c | 26 ++++++++++++++++---------- 4 files changed, 47 insertions(+), 15 deletions(-) diff --git a/docs/hyperv.txt b/docs/hyperv.txt index cd1ea3bbe9..7803495468 100644 --- a/docs/hyperv.txt +++ b/docs/hyperv.txt @@ -211,6 +211,20 @@ When the option is set to 'on' QEMU will always enable the feature, regardless of host setup. To keep guests secure, this can only be used in conjunction with exposing correct vCPU topology and vCPU pinning. +3.20. hv-version-id-{build,major,minor,spack,sbranch,snumber} +============================================================= +This changes Hyper-V version identification in CPUID 0x40000002.EAX-EDX from the +default (WS2008R2). +- hv-version-id-build sets 'Build Number' (32 bits) +- hv-version-id-major sets 'Major Version' (16 bits) +- hv-version-id-minor sets 'Minor Version' (16 bits) +- hv-version-id-spack sets 'Service Pack' (32 bits) +- hv-version-id-sbranch sets 'Service Branch' (8 bits) +- hv-version-id-snumber sets 'Service Number' (24 bits) + +Note: hv-version-id-* are not enlightenments and thus don't enable Hyper-V +identification when specified without any other enlightenments. + 4. Supplementary features ========================= diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 8154343cc4..d1d057fabe 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6258,10 +6258,6 @@ static void x86_cpu_hyperv_realize(X86CPU *cpu) cpu->hyperv_interface_id[2] = 0; cpu->hyperv_interface_id[3] = 0; - /* Hypervisor system identity */ - cpu->hyperv_version_id[0] = 0x00001bbc; - cpu->hyperv_version_id[1] = 0x00060001; - /* Hypervisor implementation limits */ cpu->hyperv_limits[0] = 64; cpu->hyperv_limits[1] = 0; @@ -6840,6 +6836,17 @@ static Property x86_cpu_properties[] = { DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false), DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false), + /* WS2008R2 identify by default */ + DEFINE_PROP_UINT32("hv-version-id-build", X86CPU, hyperv_ver_id_build, + 0x1bbc), + DEFINE_PROP_UINT16("hv-version-id-major", X86CPU, hyperv_ver_id_major, + 0x0006), + DEFINE_PROP_UINT16("hv-version-id-minor", X86CPU, hyperv_ver_id_minor, + 0x0001), + DEFINE_PROP_UINT32("hv-version-id-spack", X86CPU, hyperv_ver_id_sp, 0), + DEFINE_PROP_UINT8("hv-version-id-sbranch", X86CPU, hyperv_ver_id_sb, 0), + DEFINE_PROP_UINT32("hv-version-id-snumber", X86CPU, hyperv_ver_id_sn, 0), + DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true), DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false), DEFINE_PROP_BOOL("x-force-features", X86CPU, force_features, false), diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 65f0ee2caf..3edaad7688 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1717,10 +1717,15 @@ struct X86CPU { OnOffAuto hyperv_no_nonarch_cs; uint32_t hyperv_vendor_id[3]; uint32_t hyperv_interface_id[4]; - uint32_t hyperv_version_id[4]; uint32_t hyperv_limits[3]; uint32_t hyperv_nested[4]; bool hyperv_enforce_cpuid; + uint32_t hyperv_ver_id_build; + uint16_t hyperv_ver_id_major; + uint16_t hyperv_ver_id_minor; + uint32_t hyperv_ver_id_sp; + uint8_t hyperv_ver_id_sb; + uint32_t hyperv_ver_id_sn; bool check_cpuid; bool enforce_cpuid; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 68faf72e34..f25837f63f 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -1258,14 +1258,18 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp) cpu->hyperv_interface_id[3] = hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EDX); - cpu->hyperv_version_id[0] = + cpu->hyperv_ver_id_build = hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EAX); - cpu->hyperv_version_id[1] = - hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX); - cpu->hyperv_version_id[2] = + cpu->hyperv_ver_id_major = + hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX) >> 16; + cpu->hyperv_ver_id_minor = + hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX) & 0xffff; + cpu->hyperv_ver_id_sp = hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_ECX); - cpu->hyperv_version_id[3] = - hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EDX); + cpu->hyperv_ver_id_sb = + hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EDX) >> 24; + cpu->hyperv_ver_id_sn = + hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EDX) & 0xffffff; cpu->hv_max_vps = hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_EAX); @@ -1351,10 +1355,12 @@ static int hyperv_fill_cpuids(CPUState *cs, c = &cpuid_ent[cpuid_i++]; c->function = HV_CPUID_VERSION; - c->eax = cpu->hyperv_version_id[0]; - c->ebx = cpu->hyperv_version_id[1]; - c->ecx = cpu->hyperv_version_id[2]; - c->edx = cpu->hyperv_version_id[3]; + c->eax = cpu->hyperv_ver_id_build; + c->ebx = (uint32_t)cpu->hyperv_ver_id_major << 16 | + cpu->hyperv_ver_id_minor; + c->ecx = cpu->hyperv_ver_id_sp; + c->edx = (uint32_t)cpu->hyperv_ver_id_sb << 24 | + (cpu->hyperv_ver_id_sn & 0xffffff); c = &cpuid_ent[cpuid_i++]; c->function = HV_CPUID_FEATURES; From f701ecec2bbaae2d04985eba87924a7329534e9a Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 2 Sep 2021 11:35:30 +0200 Subject: [PATCH 22/29] i386: Change the default Hyper-V version to match WS2016 KVM implements some Hyper-V 2016 functions so providing WS2008R2 version is somewhat incorrect. While generally guests shouldn't care about it and always check feature bits, it is known that some tools in Windows actually check version info. For compatibility reasons make the change for 6.2 machine types only. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-9-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/hyperv.txt | 2 +- hw/i386/pc.c | 6 +++++- target/i386/cpu.c | 6 +++--- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/docs/hyperv.txt b/docs/hyperv.txt index 7803495468..5d99fd9a72 100644 --- a/docs/hyperv.txt +++ b/docs/hyperv.txt @@ -214,7 +214,7 @@ exposing correct vCPU topology and vCPU pinning. 3.20. hv-version-id-{build,major,minor,spack,sbranch,snumber} ============================================================= This changes Hyper-V version identification in CPUID 0x40000002.EAX-EDX from the -default (WS2008R2). +default (WS2016). - hv-version-id-build sets 'Build Number' (32 bits) - hv-version-id-major sets 'Major Version' (16 bits) - hv-version-id-minor sets 'Minor Version' (16 bits) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 28e1b83b9d..86223acfd3 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -93,7 +93,11 @@ #include "trace.h" #include CONFIG_DEVICES -GlobalProperty pc_compat_6_1[] = {}; +GlobalProperty pc_compat_6_1[] = { + { TYPE_X86_CPU, "hv-version-id-build", "0x1bbc" }, + { TYPE_X86_CPU, "hv-version-id-major", "0x0006" }, + { TYPE_X86_CPU, "hv-version-id-minor", "0x0001" }, +}; const size_t pc_compat_6_1_len = G_N_ELEMENTS(pc_compat_6_1); GlobalProperty pc_compat_6_0[] = { diff --git a/target/i386/cpu.c b/target/i386/cpu.c index d1d057fabe..a7b1b6aa93 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6838,11 +6838,11 @@ static Property x86_cpu_properties[] = { /* WS2008R2 identify by default */ DEFINE_PROP_UINT32("hv-version-id-build", X86CPU, hyperv_ver_id_build, - 0x1bbc), + 0x3839), DEFINE_PROP_UINT16("hv-version-id-major", X86CPU, hyperv_ver_id_major, - 0x0006), + 0x000A), DEFINE_PROP_UINT16("hv-version-id-minor", X86CPU, hyperv_ver_id_minor, - 0x0001), + 0x0000), DEFINE_PROP_UINT32("hv-version-id-spack", X86CPU, hyperv_ver_id_sp, 0), DEFINE_PROP_UINT8("hv-version-id-sbranch", X86CPU, hyperv_ver_id_sb, 0), DEFINE_PROP_UINT32("hv-version-id-snumber", X86CPU, hyperv_ver_id_sn, 0), From bcfdfae78f111fa3c0f81b2708098a545201bb68 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Tue, 7 Sep 2021 16:27:08 +0200 Subject: [PATCH 23/29] docs: name included files ".rst.inc" Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/{ci-definitions.rst => ci-definitions.rst.inc} | 0 docs/devel/{ci-jobs.rst => ci-jobs.rst.inc} | 0 docs/devel/{ci-runners.rst => ci-runners.rst.inc} | 0 docs/devel/ci.rst | 6 +++--- 4 files changed, 3 insertions(+), 3 deletions(-) rename docs/devel/{ci-definitions.rst => ci-definitions.rst.inc} (100%) rename docs/devel/{ci-jobs.rst => ci-jobs.rst.inc} (100%) rename docs/devel/{ci-runners.rst => ci-runners.rst.inc} (100%) diff --git a/docs/devel/ci-definitions.rst b/docs/devel/ci-definitions.rst.inc similarity index 100% rename from docs/devel/ci-definitions.rst rename to docs/devel/ci-definitions.rst.inc diff --git a/docs/devel/ci-jobs.rst b/docs/devel/ci-jobs.rst.inc similarity index 100% rename from docs/devel/ci-jobs.rst rename to docs/devel/ci-jobs.rst.inc diff --git a/docs/devel/ci-runners.rst b/docs/devel/ci-runners.rst.inc similarity index 100% rename from docs/devel/ci-runners.rst rename to docs/devel/ci-runners.rst.inc diff --git a/docs/devel/ci.rst b/docs/devel/ci.rst index 8d95247188..d106610096 100644 --- a/docs/devel/ci.rst +++ b/docs/devel/ci.rst @@ -8,6 +8,6 @@ found at:: https://wiki.qemu.org/Testing/CI -.. include:: ci-definitions.rst -.. include:: ci-jobs.rst -.. include:: ci-runners.rst +.. include:: ci-definitions.rst.inc +.. include:: ci-jobs.rst.inc +.. include:: ci-runners.rst.inc From f9df7aac758fed5cc2fb9210ad0edba79434aeed Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Tue, 7 Sep 2021 16:42:18 +0200 Subject: [PATCH 24/29] docs: move notes inside the body of the document Make all documents start with a heading. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/multi-process.rst | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/docs/devel/multi-process.rst b/docs/devel/multi-process.rst index 69699329d6..e5758a79ab 100644 --- a/docs/devel/multi-process.rst +++ b/docs/devel/multi-process.rst @@ -1,15 +1,17 @@ -This is the design document for multi-process QEMU. It does not -necessarily reflect the status of the current implementation, which -may lack features or be considerably different from what is described -in this document. This document is still useful as a description of -the goals and general direction of this feature. - -Please refer to the following wiki for latest details: -https://wiki.qemu.org/Features/MultiProcessQEMU - Multi-process QEMU =================== +.. note:: + + This is the design document for multi-process QEMU. It does not + necessarily reflect the status of the current implementation, which + may lack features or be considerably different from what is described + in this document. This document is still useful as a description of + the goals and general direction of this feature. + + Please refer to the following wiki for latest details: + https://wiki.qemu.org/Features/MultiProcessQEMU + QEMU is often used as the hypervisor for virtual machines running in the Oracle cloud. Since one of the advantages of cloud computing is the ability to run many VMs from different tenants in the same cloud From 8b8939e44fc8315885598a9e1ee8deeea3b68a96 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Tue, 7 Sep 2021 16:25:50 +0200 Subject: [PATCH 25/29] docs: put "make" information together in build-system.rst Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/build-system.rst | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/docs/devel/build-system.rst b/docs/devel/build-system.rst index 3baec158f2..0f636d620e 100644 --- a/docs/devel/build-system.rst +++ b/docs/devel/build-system.rst @@ -380,6 +380,16 @@ phony target, while benchmarks are run with ``make bench``. Meson test suites such as ``unit`` can be ran with ``make check-unit`` too. It is also possible to run tests defined in meson.build with ``meson test``. +Useful make targets +------------------- + +``help`` + Print a help message for the most common build targets. + +``print-VAR`` + Print the value of the variable VAR. Useful for debugging the build + system. + Important files for the build system ==================================== @@ -473,14 +483,3 @@ Built by Makefile: meson.build. The rules are produced from Meson's JSON description of tests (obtained with "meson introspect --tests") through the script scripts/mtest2make.py. - - -Useful make targets -------------------- - -``help`` - Print a help message for the most common build targets. - -``print-VAR`` - Print the value of the variable VAR. Useful for debugging the build - system. From 768f14f94ec50fe57a3964fff75d8b3456b588b5 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Tue, 7 Sep 2021 16:53:01 +0200 Subject: [PATCH 26/29] docs: reorganize qgraph.rst Clean up the heading levels to use === --- ~~~, and move the command line building near to the other execution steps. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/qgraph.rst | 132 +++++++++++++++++++++--------------------- 1 file changed, 67 insertions(+), 65 deletions(-) diff --git a/docs/devel/qgraph.rst b/docs/devel/qgraph.rst index c2882c3a33..db44d71002 100644 --- a/docs/devel/qgraph.rst +++ b/docs/devel/qgraph.rst @@ -1,8 +1,7 @@ .. _qgraph: -======================================== Qtest Driver Framework -======================================== +====================== In order to test a specific driver, plain libqos tests need to take care of booting QEMU with the right machine and devices. @@ -31,13 +30,15 @@ so the sdhci-test should only care of linking its qgraph node with that interface. In this way, if the command line of a sdhci driver is changed, only the respective qgraph driver node has to be adjusted. +QGraph concepts +--------------- + The graph is composed by nodes that represent machines, drivers, tests and edges that define the relationships between them (``CONSUMES``, ``PRODUCES``, and ``CONTAINS``). - Nodes -^^^^^^ +~~~~~ A node can be of four types: @@ -64,7 +65,7 @@ Notes for the nodes: drivers name, otherwise they won't be discovered Edges -^^^^^^ +~~~~~ An edge relation between two nodes (drivers or machines) ``X`` and ``Y`` can be: @@ -73,7 +74,7 @@ An edge relation between two nodes (drivers or machines) ``X`` and ``Y`` can be: - ``X CONTAINS Y``: ``Y`` is part of ``X`` component Execution steps -^^^^^^^^^^^^^^^ +~~~~~~~~~~~~~~~ The basic framework steps are the following: @@ -92,8 +93,64 @@ The basic framework steps are the following: Depending on the QEMU binary used, only some drivers/machines will be available and only test that are reached by them will be executed. +Command line +~~~~~~~~~~~~ + +Command line is built by using node names and optional arguments +passed by the user when building the edges. + +There are three types of command line arguments: + +- ``in node`` : created from the node name. For example, machines will + have ``-M <machine>`` to its command line, while devices + ``-device <device>``. It is automatically done by the framework. +- ``after node`` : added as additional argument to the node name. + This argument is added optionally when creating edges, + by setting the parameter ``after_cmd_line`` and + ``extra_edge_opts`` in ``QOSGraphEdgeOptions``. + The framework automatically adds + a comma before ``extra_edge_opts``, + because it is going to add attributes + after the destination node pointed by + the edge containing these options, and automatically + adds a space before ``after_cmd_line``, because it + adds an additional device, not an attribute. +- ``before node`` : added as additional argument to the node name. + This argument is added optionally when creating edges, + by setting the parameter ``before_cmd_line`` in + ``QOSGraphEdgeOptions``. This attribute + is going to add attributes before the destination node + pointed by the edge containing these options. It is + helpful to commands that are not node-representable, + such as ``-fdsev`` or ``-netdev``. + +While adding command line in edges is always used, not all nodes names are +used in every path walk: this is because the contained or produced ones +are already added by QEMU, so only nodes that "consumes" will be used to +build the command line. Also, nodes that will have ``{ "abstract" : true }`` +as QMP attribute will loose their command line, since they are not proper +devices to be added in QEMU. + +Example:: + + QOSGraphEdgeOptions opts = { + .before_cmd_line = "-drive id=drv0,if=none,file=null-co://," + "file.read-zeroes=on,format=raw", + .after_cmd_line = "-device scsi-hd,bus=vs0.0,drive=drv0", + + opts.extra_device_opts = "id=vs0"; + }; + + qos_node_create_driver("virtio-scsi-device", + virtio_scsi_device_create); + qos_node_consumes("virtio-scsi-device", "virtio-bus", &opts); + +Will produce the following command line: +``-drive id=drv0,if=none,file=null-co://, -device virtio-scsi-device,id=vs0 -device scsi-hd,bus=vs0.0,drive=drv0`` + Troubleshooting unavailable tests -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + If there is no path from an available machine to a test then that test will be unavailable and won't execute. This can happen if a test or driver did not set up its qgraph node correctly. It can also happen if the necessary machine type @@ -151,7 +208,7 @@ Typically this is because the QEMU binary lacks support for the necessary machine type or device. Creating a new driver and its interface -""""""""""""""""""""""""""""""""""""""""" +--------------------------------------- Here we continue the ``sdhci`` use case, with the following scenario: @@ -489,7 +546,7 @@ or inverting the consumes edge in consumed_by:: arm/raspi2b --contains--> generic-sdhci Adding a new test -""""""""""""""""" +----------------- Given the above setup, adding a new test is very simple. ``sdhci-test``, taken from ``tests/qtest/sdhci-test.c``:: @@ -565,62 +622,7 @@ and for the binary ``QTEST_QEMU_BINARY=./qemu-system-arm``: Additional examples are also in ``test-qgraph.c`` -Command line: -"""""""""""""" - -Command line is built by using node names and optional arguments -passed by the user when building the edges. - -There are three types of command line arguments: - -- ``in node`` : created from the node name. For example, machines will - have ``-M <machine>`` to its command line, while devices - ``-device <device>``. It is automatically done by the framework. -- ``after node`` : added as additional argument to the node name. - This argument is added optionally when creating edges, - by setting the parameter ``after_cmd_line`` and - ``extra_edge_opts`` in ``QOSGraphEdgeOptions``. - The framework automatically adds - a comma before ``extra_edge_opts``, - because it is going to add attributes - after the destination node pointed by - the edge containing these options, and automatically - adds a space before ``after_cmd_line``, because it - adds an additional device, not an attribute. -- ``before node`` : added as additional argument to the node name. - This argument is added optionally when creating edges, - by setting the parameter ``before_cmd_line`` in - ``QOSGraphEdgeOptions``. This attribute - is going to add attributes before the destination node - pointed by the edge containing these options. It is - helpful to commands that are not node-representable, - such as ``-fdsev`` or ``-netdev``. - -While adding command line in edges is always used, not all nodes names are -used in every path walk: this is because the contained or produced ones -are already added by QEMU, so only nodes that "consumes" will be used to -build the command line. Also, nodes that will have ``{ "abstract" : true }`` -as QMP attribute will loose their command line, since they are not proper -devices to be added in QEMU. - -Example:: - - QOSGraphEdgeOptions opts = { - .before_cmd_line = "-drive id=drv0,if=none,file=null-co://," - "file.read-zeroes=on,format=raw", - .after_cmd_line = "-device scsi-hd,bus=vs0.0,drive=drv0", - - opts.extra_device_opts = "id=vs0"; - }; - - qos_node_create_driver("virtio-scsi-device", - virtio_scsi_device_create); - qos_node_consumes("virtio-scsi-device", "virtio-bus", &opts); - -Will produce the following command line: -``-drive id=drv0,if=none,file=null-co://, -device virtio-scsi-device,id=vs0 -device scsi-hd,bus=vs0.0,drive=drv0`` - Qgraph API reference -^^^^^^^^^^^^^^^^^^^^ +-------------------- .. kernel-doc:: tests/qtest/libqos/qgraph.h From e9adb4ace229dfa742176e9ddb629dbb6a6081bc Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Tue, 7 Sep 2021 17:06:07 +0200 Subject: [PATCH 27/29] docs: reorganize tcg-plugins.rst Clean up the heading levels to use === --- ~~~, and create a new "writing plugins" section. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/tcg-plugins.rst | 117 ++++++++++++++++++------------------- 1 file changed, 58 insertions(+), 59 deletions(-) diff --git a/docs/devel/tcg-plugins.rst b/docs/devel/tcg-plugins.rst index dac5101a3c..842ae01a4c 100644 --- a/docs/devel/tcg-plugins.rst +++ b/docs/devel/tcg-plugins.rst @@ -3,7 +3,6 @@ Copyright (c) 2019, Linaro Limited Written by Emilio Cota and Alex Bennée -================ QEMU TCG Plugins ================ @@ -16,60 +15,8 @@ only monitor it passively. However they can do this down to an individual instruction granularity including potentially subscribing to all load and store operations. -API Stability -============= - -This is a new feature for QEMU and it does allow people to develop -out-of-tree plugins that can be dynamically linked into a running QEMU -process. However the project reserves the right to change or break the -API should it need to do so. The best way to avoid this is to submit -your plugin upstream so they can be updated if/when the API changes. - -API versioning --------------- - -All plugins need to declare a symbol which exports the plugin API -version they were built against. This can be done simply by:: - - QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION; - -The core code will refuse to load a plugin that doesn't export a -``qemu_plugin_version`` symbol or if plugin version is outside of QEMU's -supported range of API versions. - -Additionally the ``qemu_info_t`` structure which is passed to the -``qemu_plugin_install`` method of a plugin will detail the minimum and -current API versions supported by QEMU. The API version will be -incremented if new APIs are added. The minimum API version will be -incremented if existing APIs are changed or removed. - -Exposure of QEMU internals --------------------------- - -The plugin architecture actively avoids leaking implementation details -about how QEMU's translation works to the plugins. While there are -conceptions such as translation time and translation blocks the -details are opaque to plugins. The plugin is able to query select -details of instructions and system configuration only through the -exported *qemu_plugin* functions. - -Query Handle Lifetime ---------------------- - -Each callback provides an opaque anonymous information handle which -can usually be further queried to find out information about a -translation, instruction or operation. The handles themselves are only -valid during the lifetime of the callback so it is important that any -information that is needed is extracted during the callback and saved -by the plugin. - -API -=== - -.. kernel-doc:: include/qemu/qemu-plugin.h - Usage -===== +----- Any QEMU binary with TCG support has plugins enabled by default. Earlier releases needed to be explicitly enabled with:: @@ -87,8 +34,45 @@ Arguments are plugin specific and can be used to modify their behaviour. In this case the howvec plugin is being asked to use inline ops to count and break down the hint instructions by type. -Plugin Life cycle -================= +Writing plugins +--------------- + +API versioning +~~~~~~~~~~~~~~ + +This is a new feature for QEMU and it does allow people to develop +out-of-tree plugins that can be dynamically linked into a running QEMU +process. However the project reserves the right to change or break the +API should it need to do so. The best way to avoid this is to submit +your plugin upstream so they can be updated if/when the API changes. + +All plugins need to declare a symbol which exports the plugin API +version they were built against. This can be done simply by:: + + QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION; + +The core code will refuse to load a plugin that doesn't export a +``qemu_plugin_version`` symbol or if plugin version is outside of QEMU's +supported range of API versions. + +Additionally the ``qemu_info_t`` structure which is passed to the +``qemu_plugin_install`` method of a plugin will detail the minimum and +current API versions supported by QEMU. The API version will be +incremented if new APIs are added. The minimum API version will be +incremented if existing APIs are changed or removed. + +Lifetime of the query handle +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Each callback provides an opaque anonymous information handle which +can usually be further queried to find out information about a +translation, instruction or operation. The handles themselves are only +valid during the lifetime of the callback so it is important that any +information that is needed is extracted during the callback and saved +by the plugin. + +Plugin life cycle +~~~~~~~~~~~~~~~~~ First the plugin is loaded and the public qemu_plugin_install function is called. The plugin will then register callbacks for various plugin @@ -111,11 +95,26 @@ callback which can then ensure atomicity itself. Finally when QEMU exits all the registered *atexit* callbacks are invoked. +Exposure of QEMU internals +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The plugin architecture actively avoids leaking implementation details +about how QEMU's translation works to the plugins. While there are +conceptions such as translation time and translation blocks the +details are opaque to plugins. The plugin is able to query select +details of instructions and system configuration only through the +exported *qemu_plugin* functions. + +API +~~~ + +.. kernel-doc:: include/qemu/qemu-plugin.h + Internals -========= +--------- Locking -------- +~~~~~~~ We have to ensure we cannot deadlock, particularly under MTTCG. For this we acquire a lock when called from plugin code. We also keep the @@ -142,7 +141,7 @@ requested. The plugin isn't completely uninstalled until the safe work has executed while all vCPUs are quiescent. Example Plugins -=============== +--------------- There are a number of plugins included with QEMU and you are encouraged to contribute your own plugins plugins upstream. There is a From 9fce3601761779f14d8d1ea32e2b6abf2f704edb Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Wed, 22 Sep 2021 17:06:57 +0200 Subject: [PATCH 28/29] docs: move gcov section at the end of testing.rst gcov testing applies to all tests, not just make check. Move it out of the make check section. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/testing.rst | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst index 64c9744795..a80df07f6d 100644 --- a/docs/devel/testing.rst +++ b/docs/devel/testing.rst @@ -114,25 +114,6 @@ check-block are in the "auto" group). See the "QEMU iotests" section below for more information. -GCC gcov support ----------------- - -``gcov`` is a GCC tool to analyze the testing coverage by -instrumenting the tested code. To use it, configure QEMU with -``--enable-gcov`` option and build. Then run ``make check`` as usual. - -If you want to gather coverage information on a single test the ``make -clean-gcda`` target can be used to delete any existing coverage -information before running a single test. - -You can generate a HTML coverage report by executing ``make -coverage-html`` which will create -``meson-logs/coveragereport/index.html``. - -Further analysis can be conducted by running the ``gcov`` command -directly on the various .gcda output files. Please read the ``gcov`` -documentation for more information. - QEMU iotests ============ @@ -1302,3 +1283,22 @@ exercise as many corner cases as possible. It is a useful test suite to run to exercise QEMU's linux-user code:: https://linux-test-project.github.io/ + +GCC gcov support +================ + +``gcov`` is a GCC tool to analyze the testing coverage by +instrumenting the tested code. To use it, configure QEMU with +``--enable-gcov`` option and build. Then run the tests as usual. + +If you want to gather coverage information on a single test the ``make +clean-gcda`` target can be used to delete any existing coverage +information before running a single test. + +You can generate a HTML coverage report by executing ``make +coverage-html`` which will create +``meson-logs/coveragereport/index.html``. + +Further analysis can be conducted by running the ``gcov`` command +directly on the various .gcda output files. Please read the ``gcov`` +documentation for more information. From 16e79e1b01a698908e14eda3078d4a8e7b1b9c2b Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini@redhat.com> Date: Tue, 7 Sep 2021 17:21:58 +0200 Subject: [PATCH 29/29] docs: reorganize testing.rst Clean up the heading levels to use === --- ~~~ ^^^ '''. Reorganize the outline for the Avocado part, and always include headings for the class names. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- docs/devel/testing.rst | 146 +++++++++++++++++++++-------------------- 1 file changed, 76 insertions(+), 70 deletions(-) diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst index a80df07f6d..7500f076c2 100644 --- a/docs/devel/testing.rst +++ b/docs/devel/testing.rst @@ -1,11 +1,10 @@ -=============== Testing in QEMU =============== This document describes the testing infrastructure in QEMU. Testing with "make check" -========================= +------------------------- The "make check" testing family includes most of the C based tests in QEMU. For a quick help, run ``make check-help`` from the source tree. @@ -24,7 +23,7 @@ expect the executables to exist and will fail with obscure messages if they cannot find them. Unit tests ----------- +~~~~~~~~~~ Unit tests, which can be invoked with ``make check-unit``, are simple C tests that typically link to individual QEMU object files and exercise them by @@ -67,7 +66,7 @@ and copy the actual command line which executes the unit test, then run it from the command line. QTest ------ +~~~~~ QTest is a device emulation testing framework. It can be very useful to test device models; it could also control certain aspects of QEMU (such as virtual @@ -81,7 +80,7 @@ QTest cases can be executed with make check-qtest QAPI schema tests ------------------ +~~~~~~~~~~~~~~~~~ The QAPI schema tests validate the QAPI parser used by QMP, by feeding predefined input to the parser and comparing the result with the reference @@ -108,14 +107,14 @@ parser (either fixing a bug or extending/modifying the syntax). To do this: ``qapi-schema += foo.json`` check-block ------------ +~~~~~~~~~~~ ``make check-block`` runs a subset of the block layer iotests (the tests that are in the "auto" group). See the "QEMU iotests" section below for more information. QEMU iotests -============ +------------ QEMU iotests, under the directory ``tests/qemu-iotests``, is the testing framework widely used to test block layer related features. It is higher level @@ -152,7 +151,7 @@ More options are supported by the ``./check`` script, run ``./check -h`` for help. Writing a new test case ------------------------ +~~~~~~~~~~~~~~~~~~~~~~~ Consider writing a tests case when you are making any changes to the block layer. An iotest case is usually the choice for that. There are already many @@ -206,7 +205,8 @@ test failure. If using such devices are explicitly desired, consider adding ``locking=off`` option to disable image locking. Debugging a test case ------------------------ +~~~~~~~~~~~~~~~~~~~~~ + The following options to the ``check`` script can be useful when debugging a failing test: @@ -235,7 +235,7 @@ a failing test: ``$TEST_DIR/qemu-machine-<random_string>``. Test case groups ----------------- +~~~~~~~~~~~~~~~~ "Tests may belong to one or more test groups, which are defined in the form of a comment in the test source file. By convention, test groups are listed @@ -285,10 +285,10 @@ Note that the following group names have a special meaning: .. _container-ref: Container based tests -===================== +--------------------- Introduction ------------- +~~~~~~~~~~~~ The container testing framework in QEMU utilizes public images to build and test QEMU in predefined and widely accessible Linux @@ -303,7 +303,7 @@ The container images are also used to augment the generation of tests for testing TCG. See :ref:`checktcg-ref` for more details. Docker Prerequisites --------------------- +~~~~~~~~~~~~~~~~~~~~ Install "docker" with the system package manager and start the Docker service on your development machine, then make sure you have the privilege to run @@ -334,7 +334,7 @@ exploit the whole host with Docker bind mounting or other privileged operations. So only do it on development machines. Podman Prerequisites --------------------- +~~~~~~~~~~~~~~~~~~~~ Install "podman" with the system package manager. @@ -346,7 +346,7 @@ Install "podman" with the system package manager. The last command should print an empty table, to verify the system is ready. Quickstart ----------- +~~~~~~~~~~ From source tree, type ``make docker-help`` to see the help. Testing can be started without configuring or building QEMU (``configure`` and @@ -362,7 +362,7 @@ is downloaded and initialized automatically), in which the ``test-build`` job is executed. Registry --------- +~~~~~~~~ The QEMU project has a container registry hosted by GitLab at ``registry.gitlab.com/qemu-project/qemu`` which will automatically be @@ -376,7 +376,7 @@ locally by using the ``NOCACHE`` build option: make docker-image-debian10 NOCACHE=1 Images ------- +~~~~~~ Along with many other images, the ``centos8`` image is defined in a Dockerfile in ``tests/docker/dockerfiles/``, called ``centos8.docker``. ``make docker-help`` @@ -391,7 +391,7 @@ mainly used to do necessary host side setup. One such setup is ``binfmt_misc``, for example, to make qemu-user powered cross build containers work. Tests ------ +~~~~~ Different tests are added to cover various configurations to build and test QEMU. Docker tests are the executables under ``tests/docker`` named @@ -402,7 +402,7 @@ source and build it. The full list of tests is printed in the ``make docker-help`` help. Debugging a Docker test failure -------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When CI tasks, maintainers or yourself report a Docker test failure, follow the below steps to debug it: @@ -419,7 +419,7 @@ below steps to debug it: the prompt for debug. Options -------- +~~~~~~~ Various options can be used to affect how Docker tests are done. The full list is in the ``make docker`` help text. The frequently used ones are: @@ -433,7 +433,7 @@ list is in the ``make docker`` help text. The frequently used ones are: failure" section. Thread Sanitizer -================ +---------------- Thread Sanitizer (TSan) is a tool which can detect data races. QEMU supports building and testing with this tool. @@ -443,7 +443,7 @@ For more information on TSan: https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual Thread Sanitizer in Docker ---------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~ TSan is currently supported in the ubuntu2004 docker. The test-tsan test will build using TSan and then run make check. @@ -458,7 +458,7 @@ We recommend using DEBUG=1 to allow launching the test from inside the docker, and to allow review of the warnings generated by TSan. Building and Testing with TSan ------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It is possible to build and test with TSan, with a few additional steps. These steps are normally done automatically in the docker. @@ -497,7 +497,7 @@ This allows for running the test and then checking the warnings afterwards. If you want TSan to stop and exit with error on warnings, use exitcode=66. TSan Suppressions ------------------ +~~~~~~~~~~~~~~~~~ Keep in mind that for any data race warning, although there might be a data race detected by TSan, there might be no actual bug here. TSan provides several different mechanisms for suppressing warnings. In general it is recommended @@ -523,7 +523,7 @@ More information on the file format can be found here under "Blacklist Format": https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags TSan Annotations ----------------- +~~~~~~~~~~~~~~~~ include/qemu/tsan.h defines annotations. See this file for more descriptions of the annotations themselves. Annotations can be used to suppress TSan warnings or give TSan more information so that it can detect proper @@ -540,14 +540,14 @@ The full set of annotations can be found here: https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp VM testing -========== +---------- This test suite contains scripts that bootstrap various guest images that have necessary packages to build QEMU. The basic usage is documented in ``Makefile`` help which is displayed with ``make vm-help``. Quickstart ----------- +~~~~~~~~~~ Run ``make vm-help`` to list available make targets. Invoke a specific make command to run build test in an image. For example, ``make vm-build-freebsd`` @@ -562,7 +562,7 @@ concerned about attackers taking control of the guest and potentially exploiting a QEMU security bug to compromise the host. QEMU binaries -------------- +~~~~~~~~~~~~~ By default, qemu-system-x86_64 is searched in $PATH to run the guest. If there isn't one, or if it is older than 2.10, the test won't work. In this case, @@ -571,20 +571,20 @@ provide the QEMU binary in env var: ``QEMU=/path/to/qemu-2.10+``. Likewise the path to qemu-img can be set in QEMU_IMG environment variable. Make jobs ---------- +~~~~~~~~~ The ``-j$X`` option in the make command line is not propagated into the VM, specify ``J=$X`` to control the make jobs in the guest. Debugging ---------- +~~~~~~~~~ Add ``DEBUG=1`` and/or ``V=1`` to the make command to allow interactive debugging and verbose output. If this is not enough, see the next section. ``V=1`` will be propagated down into the make jobs in the guest. Manual invocation ------------------ +~~~~~~~~~~~~~~~~~ Each guest script is an executable script with the same command line options. For example to work with the netbsd guest, use ``$QEMU_SRC/tests/vm/netbsd``: @@ -608,7 +608,7 @@ For example to work with the netbsd guest, use ``$QEMU_SRC/tests/vm/netbsd``: $ ./netbsd --interactive --image /var/tmp/netbsd.img sh Adding new guests ------------------ +~~~~~~~~~~~~~~~~~ Please look at existing guest scripts for how to add new guests. @@ -641,7 +641,7 @@ the script's ``main()``. recommended. Image fuzzer testing -==================== +-------------------- An image fuzzer was added to exercise format drivers. Currently only qcow2 is supported. To start the fuzzer, run @@ -654,7 +654,7 @@ Alternatively, some command different from "qemu-img info" can be tested, by changing the ``-c`` option. Acceptance tests using the Avocado Framework -============================================ +-------------------------------------------- The ``tests/acceptance`` directory hosts functional tests, also known as acceptance level tests. They're usually higher level tests, and @@ -693,7 +693,7 @@ Tests based on ``avocado_qemu.Test`` can easily: - http://avocado-framework.readthedocs.io/en/latest/api/utils/avocado.utils.html Running tests -------------- +~~~~~~~~~~~~~ You can run the acceptance tests simply by executing: @@ -791,7 +791,7 @@ of Avocado or ``make check-acceptance``, and can also be queried using: tests/venv/bin/avocado list tests/acceptance Manual Installation -------------------- +~~~~~~~~~~~~~~~~~~~ To manually install Avocado and its dependencies, run: @@ -804,7 +804,7 @@ Alternatively, follow the instructions on this link: https://avocado-framework.readthedocs.io/en/latest/guides/user/chapters/installing.html Overview --------- +~~~~~~~~ The ``tests/acceptance/avocado_qemu`` directory provides the ``avocado_qemu`` Python module, containing the ``avocado_qemu.Test`` @@ -840,7 +840,7 @@ in the current directory, tagged as "quick", run: avocado run -t quick . The ``avocado_qemu.Test`` base test class ------------------------------------------ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The ``avocado_qemu.Test`` class has a number of characteristics that are worth being mentioned right away. @@ -890,7 +890,7 @@ At test "tear down", ``avocado_qemu.Test`` handles all the QEMUMachines shutdown. The ``avocado_qemu.LinuxTest`` base test class -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The ``avocado_qemu.LinuxTest`` is further specialization of the ``avocado_qemu.Test`` class, so it contains all the characteristics of @@ -933,7 +933,7 @@ execution of a QEMU binary, giving its users: a more succinct and intuitive way QEMU binary selection -~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^ The QEMU binary used for the ``self.vm`` QEMUMachine instance will primarily depend on the value of the ``qemu_bin`` parameter. If it's @@ -954,20 +954,23 @@ The resulting ``qemu_bin`` value will be preserved in the ``avocado_qemu.Test`` as an attribute with the same name. Attribute reference -------------------- +~~~~~~~~~~~~~~~~~~~ + +Test +^^^^ Besides the attributes and methods that are part of the base ``avocado.Test`` class, the following attributes are available on any ``avocado_qemu.Test`` instance. vm -~~ +'' A QEMUMachine instance, initially configured according to the given ``qemu_bin`` parameter. arch -~~~~ +'''' The architecture can be used on different levels of the stack, e.g. by the framework or by the test itself. At the framework level, it will @@ -984,7 +987,7 @@ name. If one is not given explicitly, it will either be set to ``:avocado: tags=arch:VALUE`` tag, it will be set to ``VALUE``. cpu -~~~ +''' The cpu model that will be set to all QEMUMachine instances created by the test. @@ -995,7 +998,7 @@ name. If one is not given explicitly, it will either be set to ``:avocado: tags=cpu:VALUE`` tag, it will be set to ``VALUE``. machine -~~~~~~~ +''''''' The machine type that will be set to all QEMUMachine instances created by the test. @@ -1006,20 +1009,20 @@ name. If one is not given explicitly, it will either be set to ``:avocado: tags=machine:VALUE`` tag, it will be set to ``VALUE``. qemu_bin -~~~~~~~~ +'''''''' The preserved value of the ``qemu_bin`` parameter or the result of the dynamic probe for a QEMU binary in the current working directory or source tree. LinuxTest -~~~~~~~~~ +^^^^^^^^^ Besides the attributes present on the ``avocado_qemu.Test`` base class, the ``avocado_qemu.LinuxTest`` adds the following attributes: distro -...... +'''''' The name of the Linux distribution used as the guest image for the test. The name should match the **Provider** column on the list @@ -1028,7 +1031,7 @@ of images supported by the avocado.utils.vmimage library: https://avocado-framework.readthedocs.io/en/latest/guides/writer/libs/vmimage.html#supported-images distro_version -.............. +'''''''''''''' The version of the Linux distribution as the guest image for the test. The name should match the **Version** column on the list @@ -1037,7 +1040,7 @@ of images supported by the avocado.utils.vmimage library: https://avocado-framework.readthedocs.io/en/latest/guides/writer/libs/vmimage.html#supported-images distro_checksum -............... +''''''''''''''' The sha256 hash of the guest image file used for the test. @@ -1046,7 +1049,7 @@ same name), no validation on the integrity of the image will be performed. Parameter reference -------------------- +~~~~~~~~~~~~~~~~~~~ To understand how Avocado parameters are accessed by tests, and how they can be passed to tests, please refer to:: @@ -1060,8 +1063,11 @@ like the following: PARAMS (key=qemu_bin, path=*, default=./qemu-system-x86_64) => './qemu-system-x86_64 +Test +^^^^ + arch -~~~~ +'''' The architecture that will influence the selection of a QEMU binary (when one is not explicitly given). @@ -1074,31 +1080,30 @@ This parameter has a direct relation with the ``arch`` attribute. If not given, it will default to None. cpu -~~~ +''' The cpu model that will be set to all QEMUMachine instances created by the test. machine -~~~~~~~ +''''''' The machine type that will be set to all QEMUMachine instances created by the test. - qemu_bin -~~~~~~~~ +'''''''' The exact QEMU binary to be used on QEMUMachine. LinuxTest -~~~~~~~~~ +^^^^^^^^^ Besides the parameters present on the ``avocado_qemu.Test`` base class, the ``avocado_qemu.LinuxTest`` adds the following parameters: distro -...... +'''''' The name of the Linux distribution used as the guest image for the test. The name should match the **Provider** column on the list @@ -1107,7 +1112,7 @@ of images supported by the avocado.utils.vmimage library: https://avocado-framework.readthedocs.io/en/latest/guides/writer/libs/vmimage.html#supported-images distro_version -.............. +'''''''''''''' The version of the Linux distribution as the guest image for the test. The name should match the **Version** column on the list @@ -1116,7 +1121,7 @@ of images supported by the avocado.utils.vmimage library: https://avocado-framework.readthedocs.io/en/latest/guides/writer/libs/vmimage.html#supported-images distro_checksum -............... +''''''''''''''' The sha256 hash of the guest image file used for the test. @@ -1124,7 +1129,8 @@ If this value is not set in the code or by this parameter no validation on the integrity of the image will be performed. Skipping tests --------------- +~~~~~~~~~~~~~~ + The Avocado framework provides Python decorators which allow for easily skip tests running under certain conditions. For example, on the lack of a binary on the test system or when the running environment is a CI system. For further @@ -1139,7 +1145,7 @@ environment variables became a kind of standard way to enable/disable tests. Here is a list of the most used variables: AVOCADO_ALLOW_LARGE_STORAGE -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^ Tests which are going to fetch or produce assets considered *large* are not going to run unless that ``AVOCADO_ALLOW_LARGE_STORAGE=1`` is exported on the environment. @@ -1148,7 +1154,7 @@ The definition of *large* is a bit arbitrary here, but it usually means an asset which occupies at least 1GB of size on disk when uncompressed. AVOCADO_ALLOW_UNTRUSTED_CODE -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ There are tests which will boot a kernel image or firmware that can be considered not safe to run on the developer's workstation, thus they are skipped by default. The definition of *not safe* is also arbitrary but @@ -1159,7 +1165,7 @@ You should export ``AVOCADO_ALLOW_UNTRUSTED_CODE=1`` on the environment in order to allow tests which make use of those kind of assets. AVOCADO_TIMEOUT_EXPECTED -~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^ The Avocado framework has a timeout mechanism which interrupts tests to avoid the test suite of getting stuck. The timeout value can be set via test parameter or property defined in the test class, for further details:: @@ -1173,7 +1179,7 @@ compiled with debug flags. Therefore, the ``AVOCADO_TIMEOUT_EXPECTED`` variable has been used to determine whether those tests should run or not. GITLAB_CI -~~~~~~~~~ +^^^^^^^^^ A number of tests are flagged to not run on the GitLab CI. Usually because they proved to the flaky or there are constraints on the CI environment which would make them fail. If you encounter a similar situation then use that @@ -1186,7 +1192,7 @@ variable as shown on the code snippet below to skip the test: do_something() Uninstalling Avocado --------------------- +~~~~~~~~~~~~~~~~~~~~ If you've followed the manual installation instructions above, you can easily uninstall Avocado. Start by listing the packages you have @@ -1204,7 +1210,7 @@ Avocado is installed will be cleaned up as part of ``make check-clean``. .. _checktcg-ref: Testing with "make check-tcg" -============================= +----------------------------- The check-tcg tests are intended for simple smoke tests of both linux-user and softmmu TCG functionality. However to build test @@ -1237,7 +1243,7 @@ itself. See :ref:`container-ref` for more details. Running subset of tests ------------------------ +~~~~~~~~~~~~~~~~~~~~~~~ You can build the tests for one architecture:: @@ -1251,7 +1257,7 @@ Adding ``V=1`` to the invocation will show the details of how to invoke QEMU for the test which is useful for debugging tests. TCG test dependencies ---------------------- +~~~~~~~~~~~~~~~~~~~~~ The TCG tests are deliberately very light on dependencies and are either totally bare with minimal gcc lib support (for softmmu tests) @@ -1285,7 +1291,7 @@ to run to exercise QEMU's linux-user code:: https://linux-test-project.github.io/ GCC gcov support -================ +---------------- ``gcov`` is a GCC tool to analyze the testing coverage by instrumenting the tested code. To use it, configure QEMU with