* Fixes & doc updates for the new "boot order" s390x bios feature

* Provide a "loadparm" property for scsi-hd & scsi-cd devices on s390x
  (required for the "boot order" feature)
* Fix the floating-point multiply-and-add NaN rules on s390x
* Raise timeout on cross-accel build jobs to 60m

-----BEGIN PGP SIGNATURE-----

iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmc7ercRHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbVjyg//ZuhSDCj+oBSU6vwM7Lwh3CS6GwZvGECU
h60V3tizKypiRNtTJRXHoWcx95brXmoZgI+QQhDEXe3fFLkOEKT6AIlDhrKZRUsd
rpLPr6O8TVKO+rSE7JVJAP3X1tpOOQDxnq83uWBv53b0S+Da0VwDRtI9gcugRMmh
d58P8Q1bV344fQdcrebejstpSUG7RxSA4Plj2uSQx4mSHT7cy/hN+vA34Ha7reE3
tcN9yfQq3Rmfvt0MV5I9Umd6JXEoDlEAwjSNsWRsCzo69jBZwiMtXSH8LyLtwRTp
C919G/MIRuhvImF74dStLVCr82sNq54YR1NP6CGcmqPH76FOH8Mx3vmx9Cxj9ckA
6NI6SvIg++bW2O1efG2apz8p5fjbDzYXSAbHnaWTcEu3gPgH4PQ5QXoyKaDymvWV
JIh5/gXEy+twEXgIBsdWQ44A9E06lL/tNfKnqGdXK4ZYF2JIrI+Lq7AKBee7tebP
+72I4PljHLSHQ3GxdkoOeJ8ahu70IBdSz2/VEIwOWK1wIf5C5WFNBerLJyDmkyx8
xIvIm0vlRLwPcuOC711nlaMaKqTNT+8W4DIqIY6fHs2Jy0psMdgey1uHQxYEj9Kh
fg7CvalK8n3MkGAwTqAvRJIwMFe0a4Ss6c6CaemSaYa38ud/pCNnv+IT+Eqr+mjq
6y5PZWNrZi0=
=UaDH
-----END PGP SIGNATURE-----

Merge tag 'pull-request-2024-11-18' of https://gitlab.com/thuth/qemu into staging

# gpg: Signature made Mon 18 Nov 2024 17:34:47 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-11-18' of https://gitlab.com/thuth/qemu:
  .gitlab-ci.d: Raise timeout on cross-accel build jobs to 60m
  pc-bios: Update the s390 bios images with the recent fixes
  pc-bios/s390-ccw: Re-initialize receive queue index before each boot attempt
  pc-bios/s390x: Initialize machine loadparm before probing IPL devices
  pc-bios/s390x: Initialize cdrom type to false for each IPL device
  hw: Add "loadparm" property to scsi disk devices for booting on s390x
  hw/s390x: Restrict "loadparm" property to devices that can be used for booting
  docs/system/bootindex: Make it clear that s390x can also boot from virtio-net
  docs/system/s390x/bootdevices: Update loadparm documentation
  tests/tcg/s390x: Add the floating-point multiply-and-add test
  target/s390x: Fix the floating-point multiply-and-add NaN rules
  hw/usb: Use __attribute__((packed)) vs __packed

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-18 20:23:59 +00:00 · 2024-11-18 20:23:59 +00:00 · 2c471a8291
parent 3428a3894c 4483d98ab8
commit 2c471a8291
23 changed files with 568 additions and 177 deletions
--- a/.gitlab-ci.d/crossbuild-template.yml
+++ b/.gitlab-ci.d/crossbuild-template.yml
@ -57,7 +57,7 @@
  extends: .base_job_template
  stage: build
  image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:$QEMU_CI_CONTAINER_TAG
-  timeout: 30m
+  timeout: 60m
  cache:
    paths:
      - ccache/
--- a/docs/system/bootindex.rst
+++ b/docs/system/bootindex.rst
@ -53,7 +53,7 @@ booting.  For instance, the x86 PC BIOS boot specification allows only one
 disk to be bootable.  If boot from disk fails for some reason, the x86 BIOS
 won't retry booting from another disk.  It can still try to boot from
 floppy or net, though. In the case of s390x BIOS, the BIOS will try up to
-8 total devices, any number of which may be disks.
+8 total devices, any number of which may be disks or virtio-net devices.

 Sometimes, firmware cannot map the device path QEMU wants firmware to
 boot from to a boot method.  It doesn't happen for devices the firmware
--- a/docs/system/s390x/bootdevices.rst
+++ b/docs/system/s390x/bootdevices.rst
@ -79,7 +79,29 @@ The second way to use this parameter is to use a number in the range from 0
 to 31. The numbers that can be used here correspond to the numbers that are
 shown when using the ``PROMPT`` option, and the s390-ccw bios will then try
 to automatically boot the kernel that is associated with the given number.
-Note that ``0`` can be used to boot the default entry.
+Note that ``0`` can be used to boot the default entry. If the machine
+``loadparm`` is not assigned a value, then the default entry is used.
+
+By default, the machine ``loadparm`` applies to all boot devices. If multiple
+devices are assigned a ``bootindex`` and the ``loadparm`` is to be different
+between them, an independent ``loadparm`` may be assigned on a per-device basis.
+
+An example guest using per-device ``loadparm``::
+
+  qemu-system-s390x -drive if=none,id=dr1,file=primary.qcow2 \
+                   -device virtio-blk,drive=dr1,bootindex=1 \
+                   -drive if=none,id=dr2,file=secondary.qcow2 \
+                   -device virtio-blk,drive=dr2,bootindex=2,loadparm=3
+
+In this case, the primary boot device will attempt to IPL using the default
+entry (because no ``loadparm`` is specified for this device or for the
+machine). If that device fails to boot, the secondary device will attempt to
+IPL using entry number 3.
+
+If a ``loadparm`` is specified on both the machine and a device, the per-device
+value will supersede the machine value.  Per-device ``loadparm`` values are
+only used for devices with an assigned ``bootindex``. The machine ``loadparm``
+is used when attempting to boot without a ``bootindex``.


 Booting from a network device
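
The precedence rules added to bootdevices.rst above can be sketched as a small standalone function. This is only an illustration: `effective_loadparm` and `loadparm_examples` are hypothetical names, not QEMU functions. It assumes that an unset ``loadparm`` is represented as NULL or an empty string, that an unset ``bootindex`` is negative (the scsi-disk.c hunk in this diff also tests `bootindex < 0`), and it models "the default entry" as the string "0", which the documentation says selects the default entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU): pick the loadparm that applies
 * to one boot attempt. NULL or "" means "not assigned"; a negative
 * bootindex means the device has no bootindex.
 */
static const char *effective_loadparm(const char *machine_lp,
                                      const char *device_lp,
                                      int bootindex)
{
    int device_set = device_lp != NULL && device_lp[0] != '\0';
    int machine_set = machine_lp != NULL && machine_lp[0] != '\0';

    /* Per-device values are only used for devices with a bootindex. */
    if (bootindex >= 0 && device_set) {
        return device_lp;
    }
    /* Otherwise the machine-wide value applies, if there is one. */
    if (machine_set) {
        return machine_lp;
    }
    /* Nothing assigned: modeled here as "0", the default entry. */
    return "0";
}

/* The two-disk example from the documentation hunk above. */
void loadparm_examples(void)
{
    /* dr1: bootindex=1, no loadparm anywhere -> default entry */
    assert(strcmp(effective_loadparm(NULL, NULL, 1), "0") == 0);
    /* dr2: bootindex=2, loadparm=3 -> entry 3 */
    assert(strcmp(effective_loadparm(NULL, "3", 2), "3") == 0);
}
```

The function returns a pointer to one of its inputs or to a string literal, so the caller must not free the result.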
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@ -597,6 +597,25 @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
    }
    return 3; /* default NaN */
+#elif defined(TARGET_S390X)
+    if (infzero) {
+        float_raise(float_flag_invalid | float_flag_invalid_imz, status);
+        return 3;
+    }
+
+    if (is_snan(a_cls)) {
+        return 0;
+    } else if (is_snan(b_cls)) {
+        return 1;
+    } else if (is_snan(c_cls)) {
+        return 2;
+    } else if (is_qnan(a_cls)) {
+        return 0;
+    } else if (is_qnan(b_cls)) {
+        return 1;
+    } else {
+        return 2;
+    }
 #elif defined(TARGET_SPARC)
    /* For (inf,0,nan) return c. */
    if (infzero) {
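
The selection order in the TARGET_S390X branch above can be tried out in isolation with the following sketch. It is a simplified model, not the softfloat code: `enum nan_kind` and `pick_nan_muladd_s390x` are illustrative names standing in for `FloatClass` and `pickNaNMulAdd`, and the exception-flag side effect is omitted. Because the rule prefers operand a over b, the order in which the helpers pass the two multiplicands matters, which is presumably why the fpu_helper.c and vec_fpu_helper.c hunks elsewhere in this diff swap them.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for softfloat's FloatClass (illustrative only). */
enum nan_kind { NOT_NAN, QUIET_NAN, SIGNALING_NAN };

/*
 * Return which operand supplies the NaN result:
 * 0 = a, 1 = b, 2 = c, 3 = default NaN.
 * As in the original, the result is only meaningful when at least one
 * operand is a NaN.
 */
static int pick_nan_muladd_s390x(enum nan_kind a, enum nan_kind b,
                                 enum nan_kind c, bool infzero)
{
    if (infzero) {
        /* The real code also raises the invalid flag here. */
        return 3;
    }

    /* Signaling NaNs win over quiet NaNs, in operand order a, b, c. */
    if (a == SIGNALING_NAN) {
        return 0;
    } else if (b == SIGNALING_NAN) {
        return 1;
    } else if (c == SIGNALING_NAN) {
        return 2;
    } else if (a == QUIET_NAN) {
        return 0;
    } else if (b == QUIET_NAN) {
        return 1;
    } else {
        return 2;
    }
}

void pick_nan_examples(void)
{
    /* A signaling NaN in b beats a quiet NaN in a. */
    assert(pick_nan_muladd_s390x(QUIET_NAN, SIGNALING_NAN, NOT_NAN, false) == 1);
    /* With only quiet NaNs, the first operand wins. */
    assert(pick_nan_muladd_s390x(QUIET_NAN, QUIET_NAN, QUIET_NAN, false) == 0);
}
```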
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@ -58,6 +58,32 @@ static bool check_prop_still_unset(Object *obj, const char *name,
    return false;
 }

+bool qdev_prop_sanitize_s390x_loadparm(uint8_t *loadparm, const char *str,
+                                       Error **errp)
+{
+    int i, len;
+
+    len = strlen(str);
+    if (len > 8) {
+        error_setg(errp, "'loadparm' can only contain up to 8 characters");
+        return false;
+    }
+
+    for (i = 0; i < len; i++) {
+        uint8_t c = qemu_toupper(str[i]); /* mimic HMC */
+
+        if (qemu_isalnum(c) || c == '.' || c == ' ') {
+            loadparm[i] = c;
+        } else {
+            error_setg(errp,
+                       "invalid character in 'loadparm': '%c' (ASCII 0x%02x)",
+                       c, c);
+            return false;
+        }
+    }
+
+    return true;
+}

 /* --- drive --- */
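
For experimenting with the validation rules outside of QEMU, the check in `qdev_prop_sanitize_s390x_loadparm` can be approximated with standard C as below. This is a sketch under assumptions: `sanitize_loadparm` is a hypothetical name, it uses `<ctype.h>` in the default "C" locale instead of QEMU's `qemu_toupper`/`qemu_isalnum`, it reports errors only through its return value, and it pads the output with spaces itself (in the diff the padding is done by the caller, `s390_ipl_fmt_loadparm`).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LOADPARM_LEN 8

/*
 * Upper-case the input, accept only [A-Z0-9. ] and at most 8 characters.
 * On success 'out' holds exactly 8 bytes, space padded and NOT
 * NUL-terminated. On failure 'out' may be partially written.
 */
static bool sanitize_loadparm(unsigned char out[LOADPARM_LEN], const char *str)
{
    size_t len = strlen(str);

    if (len > LOADPARM_LEN) {
        return false;
    }

    memset(out, ' ', LOADPARM_LEN);
    for (size_t i = 0; i < len; i++) {
        int c = toupper((unsigned char)str[i]);

        if (isalnum(c) || c == '.' || c == ' ') {
            out[i] = (unsigned char)c;
        } else {
            return false;
        }
    }
    return true;
}

void sanitize_examples(void)
{
    unsigned char lp[LOADPARM_LEN];

    assert(sanitize_loadparm(lp, "prompt"));
    assert(memcmp(lp, "PROMPT  ", LOADPARM_LEN) == 0);
    assert(!sanitize_loadparm(lp, "toolongvalue"));
}
```

Since the result is a fixed 8-byte field without a terminator, callers that want a C string need to allocate one extra byte and terminate it themselves.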

--- a/hw/s390x/ccw-device.c
+++ b/hw/s390x/ccw-device.c
@ -73,7 +73,7 @@ static void ccw_device_set_loadparm(Object *obj, Visitor *v,
    s390_ipl_fmt_loadparm(dev->loadparm, val, errp);
 }

-static const PropertyInfo ccw_loadparm = {
+const PropertyInfo ccw_loadparm = {
    .name  = "ccw_loadparm",
    .description = "Up to 8 chars in set of [A-Za-z0-9. ] to pass"
            " to the guest loader/kernel",
@ -85,8 +85,6 @@ static Property ccw_device_properties[] = {
    DEFINE_PROP_CSS_DEV_ID("devno", CcwDevice, devno),
    DEFINE_PROP_CSS_DEV_ID_RO("dev_id", CcwDevice, dev_id),
    DEFINE_PROP_CSS_DEV_ID_RO("subch_id", CcwDevice, subch_id),
-    DEFINE_PROP("loadparm", CcwDevice, loadparm, ccw_loadparm,
-            typeof(uint8_t[8])),
    DEFINE_PROP_END_OF_LIST(),
 };

--- a/hw/s390x/ccw-device.h
+++ b/hw/s390x/ccw-device.h
@ -51,4 +51,9 @@ static inline CcwDevice *to_ccw_dev_fast(DeviceState *d)

 OBJECT_DECLARE_TYPE(CcwDevice, CCWDeviceClass, CCW_DEVICE)

+extern const PropertyInfo ccw_loadparm;
+
+#define DEFINE_PROP_CCW_LOADPARM(_n, _s, _f) \
+    DEFINE_PROP(_n, _s, _f, ccw_loadparm, typeof(uint8_t[8]))
+
 #endif
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@ -418,21 +418,9 @@ static uint64_t s390_ipl_map_iplb_chain(IplParameterBlock *iplb_chain)

 void s390_ipl_fmt_loadparm(uint8_t *loadparm, char *str, Error **errp)
 {
-    int i;
-
    /* Initialize the loadparm with spaces */
    memset(loadparm, ' ', LOADPARM_LEN);
-    for (i = 0; i < LOADPARM_LEN && str[i]; i++) {
-        uint8_t c = qemu_toupper(str[i]); /* mimic HMC */
-
-        if (qemu_isalnum(c) || c == '.' || c == ' ') {
-            loadparm[i] = c;
-        } else {
-            error_setg(errp, "LOADPARM: invalid character '%c' (ASCII 0x%02x)",
-                       c, c);
-            return;
-        }
-    }
+    qdev_prop_sanitize_s390x_loadparm(loadparm, str, errp);
 }

 void s390_ipl_convert_loadparm(char *ascii_lp, uint8_t *ebcdic_lp)
@ -452,6 +440,7 @@ static bool s390_build_iplb(DeviceState *dev_st, IplParameterBlock *iplb)
    SCSIDevice *sd;
    int devtype;
    uint8_t *lp;
+    g_autofree void *scsi_lp = NULL;

    /*
     * Currently allow IPL only from CCW devices.
@ -463,6 +452,10 @@ static bool s390_build_iplb(DeviceState *dev_st, IplParameterBlock *iplb)
        switch (devtype) {
        case CCW_DEVTYPE_SCSI:
            sd = SCSI_DEVICE(dev_st);
+            scsi_lp = object_property_get_str(OBJECT(sd), "loadparm", NULL);
+            if (scsi_lp && strlen(scsi_lp) > 0) {
+                lp = scsi_lp;
+            }
            iplb->len = cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN);
            iplb->blk0_len =
                cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN - S390_IPLB_HEADER_LEN);
--- a/hw/s390x/virtio-ccw-blk.c
+++ b/hw/s390x/virtio-ccw-blk.c
@ -48,6 +48,7 @@ static Property virtio_ccw_blk_properties[] = {
                    VIRTIO_CCW_FLAG_USE_IOEVENTFD_BIT, true),
    DEFINE_PROP_UINT32("max_revision", VirtioCcwDevice, max_rev,
                       VIRTIO_CCW_MAX_REV),
+    DEFINE_PROP_CCW_LOADPARM("loadparm", CcwDevice, loadparm),
    DEFINE_PROP_END_OF_LIST(),
 };

--- a/hw/s390x/virtio-ccw-net.c
+++ b/hw/s390x/virtio-ccw-net.c
@ -51,6 +51,7 @@ static Property virtio_ccw_net_properties[] = {
                    VIRTIO_CCW_FLAG_USE_IOEVENTFD_BIT, true),
    DEFINE_PROP_UINT32("max_revision", VirtioCcwDevice, max_rev,
                       VIRTIO_CCW_MAX_REV),
+    DEFINE_PROP_CCW_LOADPARM("loadparm", CcwDevice, loadparm),
    DEFINE_PROP_END_OF_LIST(),
 };

--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@ -32,6 +32,7 @@
 #include "migration/vmstate.h"
 #include "hw/scsi/emulation.h"
 #include "scsi/constants.h"
+#include "sysemu/arch_init.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "hw/block/block.h"
@ -111,6 +112,7 @@ struct SCSIDiskState {
    char *vendor;
    char *product;
    char *device_id;
+    char *loadparm;     /* only for s390x */
    bool tray_open;
    bool tray_locked;
    /*
@ -3135,6 +3137,43 @@ BlockAIOCB *scsi_dma_writev(int64_t offset, QEMUIOVector *iov,
    return blk_aio_pwritev(s->qdev.conf.blk, offset, iov, 0, cb, cb_opaque);
 }

+static char *scsi_property_get_loadparm(Object *obj, Error **errp)
+{
+    return g_strdup(SCSI_DISK_BASE(obj)->loadparm);
+}
+
+static void scsi_property_set_loadparm(Object *obj, const char *value,
+                                       Error **errp)
+{
+    void *lp_str;
+
+    if (object_property_get_int(obj, "bootindex", NULL) < 0) {
+        error_setg(errp, "'loadparm' is only valid for boot devices");
+        return;
+    }
+
+    lp_str = g_malloc0(strlen(value) + 1); /* +1 keeps the string NUL-terminated */
+    if (!qdev_prop_sanitize_s390x_loadparm(lp_str, value, errp)) {
+        g_free(lp_str);
+        return;
+    }
+    SCSI_DISK_BASE(obj)->loadparm = lp_str;
+}
+
+static void scsi_property_add_specifics(DeviceClass *dc)
+{
+    ObjectClass *oc = OBJECT_CLASS(dc);
+
+    /* The loadparm property is only supported on s390x */
+    if (arch_type & QEMU_ARCH_S390X) {
+        object_class_property_add_str(oc, "loadparm",
+                                      scsi_property_get_loadparm,
+                                      scsi_property_set_loadparm);
+        object_class_property_set_description(oc, "loadparm",
+                                              "load parameter (s390x only)");
+    }
+}
+
 static void scsi_disk_base_class_initfn(ObjectClass *klass, void *data)
 {
    DeviceClass *dc = DEVICE_CLASS(klass);
@ -3218,6 +3257,8 @@ static void scsi_hd_class_initfn(ObjectClass *klass, void *data)
    dc->desc = "virtual SCSI disk";
    device_class_set_props(dc, scsi_hd_properties);
    dc->vmsd  = &vmstate_scsi_disk_state;
+
+    scsi_property_add_specifics(dc);
 }

 static const TypeInfo scsi_hd_info = {
@ -3258,6 +3299,8 @@ static void scsi_cd_class_initfn(ObjectClass *klass, void *data)
    dc->desc = "virtual SCSI CD-ROM";
    device_class_set_props(dc, scsi_cd_properties);
    dc->vmsd  = &vmstate_scsi_disk_state;
+
+    scsi_property_add_specifics(dc);
 }

 static const TypeInfo scsi_cd_info = {
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@ -662,6 +662,7 @@ static Property vfio_ccw_properties[] = {
    DEFINE_PROP_LINK("iommufd", VFIOCCWDevice, vdev.iommufd,
                     TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
 #endif
+    DEFINE_PROP_CCW_LOADPARM("loadparm", CcwDevice, loadparm),
    DEFINE_PROP_END_OF_LIST(),
 };

--- a/include/hw/qdev-properties-system.h
+++ b/include/hw/qdev-properties-system.h
@ -3,6 +3,9 @@

 #include "hw/qdev-properties.h"

+bool qdev_prop_sanitize_s390x_loadparm(uint8_t *loadparm, const char *str,
+                                       Error **errp);
+
 extern const PropertyInfo qdev_prop_chr;
 extern const PropertyInfo qdev_prop_macaddr;
 extern const PropertyInfo qdev_prop_reserved_region;
--- a/include/hw/usb/dwc2-regs.h
+++ b/include/hw/usb/dwc2-regs.h
@ -838,7 +838,7 @@
 struct dwc2_dma_desc {
        uint32_t status;
        uint32_t buf;
-} __packed;
+} QEMU_PACKED;

 /* Host Mode DMA descriptor status quadlet */

--- a/pc-bios/s390-ccw.img
+++ b/pc-bios/s390-ccw.img
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@ -191,7 +191,7 @@ static void boot_setup(void)
 {
    char lpmsg[] = "LOADPARM=[________]\n";

-    if (memcmp(iplb.loadparm, NO_LOADPARM, LOADPARM_LEN) != 0) {
+    if (have_iplb && memcmp(iplb.loadparm, NO_LOADPARM, LOADPARM_LEN) != 0) {
        ebcdic_to_ascii((char *) iplb.loadparm, loadparm_str, LOADPARM_LEN);
    } else {
        sclp_get_loadparm_ascii(loadparm_str);
@ -242,6 +242,7 @@ static bool find_boot_device(void)
 static int virtio_setup(void)
 {
    VDev *vdev = virtio_get_device();
+    vdev->is_cdrom = false;
    int ret;

    switch (vdev->senseid.cu_model) {
@ -315,6 +316,7 @@ void main(void)
    css_setup();
    have_iplb = store_iplb(&iplb);
    if (!have_iplb) {
+        boot_setup();
        probe_boot_device();
    }

--- a/pc-bios/s390-ccw/virtio-net.c
+++ b/pc-bios/s390-ccw/virtio-net.c
@ -51,6 +51,8 @@ int virtio_net_init(void *mac_addr)
    void *buf;
    int i;

+    rx_last_idx = 0;
+
    vdev->guest_features[0] = VIRTIO_NET_F_MAC_BIT;
    virtio_setup_ccw(vdev);

--- a/target/s390x/tcg/fpu_helper.c
+++ b/target/s390x/tcg/fpu_helper.c
@ -780,7 +780,7 @@ uint32_t HELPER(kxb)(CPUS390XState *env, Int128 a, Int128 b)
 uint64_t HELPER(maeb)(CPUS390XState *env, uint64_t f1,
                      uint64_t f2, uint64_t f3)
 {
-    float32 ret = float32_muladd(f2, f3, f1, 0, &env->fpu_status);
+    float32 ret = float32_muladd(f3, f2, f1, 0, &env->fpu_status);
    handle_exceptions(env, false, GETPC());
    return ret;
 }
@ -789,7 +789,7 @@ uint64_t HELPER(maeb)(CPUS390XState *env, uint64_t f1,
 uint64_t HELPER(madb)(CPUS390XState *env, uint64_t f1,
                      uint64_t f2, uint64_t f3)
 {
-    float64 ret = float64_muladd(f2, f3, f1, 0, &env->fpu_status);
+    float64 ret = float64_muladd(f3, f2, f1, 0, &env->fpu_status);
    handle_exceptions(env, false, GETPC());
    return ret;
 }
@ -798,7 +798,7 @@ uint64_t HELPER(madb)(CPUS390XState *env, uint64_t f1,
 uint64_t HELPER(mseb)(CPUS390XState *env, uint64_t f1,
                      uint64_t f2, uint64_t f3)
 {
-    float32 ret = float32_muladd(f2, f3, f1, float_muladd_negate_c,
+    float32 ret = float32_muladd(f3, f2, f1, float_muladd_negate_c,
                                 &env->fpu_status);
    handle_exceptions(env, false, GETPC());
    return ret;
@ -808,7 +808,7 @@ uint64_t HELPER(mseb)(CPUS390XState *env, uint64_t f1,
 uint64_t HELPER(msdb)(CPUS390XState *env, uint64_t f1,
                      uint64_t f2, uint64_t f3)
 {
-    float64 ret = float64_muladd(f2, f3, f1, float_muladd_negate_c,
+    float64 ret = float64_muladd(f3, f2, f1, float_muladd_negate_c,
                                 &env->fpu_status);
    handle_exceptions(env, false, GETPC());
    return ret;
--- a/target/s390x/tcg/vec_fpu_helper.c
+++ b/target/s390x/tcg/vec_fpu_helper.c
@ -621,8 +621,8 @@ static void vfma32(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
    int i;

    for (i = 0; i < 4; i++) {
-        const float32 a = s390_vec_read_float32(v2, i);
-        const float32 b = s390_vec_read_float32(v3, i);
+        const float32 a = s390_vec_read_float32(v3, i);
+        const float32 b = s390_vec_read_float32(v2, i);
        const float32 c = s390_vec_read_float32(v4, i);
        float32 ret = float32_muladd(a, b, c, flags, &env->fpu_status);

@ -645,8 +645,8 @@ static void vfma64(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
    int i;

    for (i = 0; i < 2; i++) {
-        const float64 a = s390_vec_read_float64(v2, i);
-        const float64 b = s390_vec_read_float64(v3, i);
+        const float64 a = s390_vec_read_float64(v3, i);
+        const float64 b = s390_vec_read_float64(v2, i);
        const float64 c = s390_vec_read_float64(v4, i);
        const float64 ret = float64_muladd(a, b, c, flags, &env->fpu_status);

@ -664,8 +664,8 @@ static void vfma128(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
                    const S390Vector *v4, CPUS390XState *env, bool s, int flags,
                    uintptr_t retaddr)
 {
-    const float128 a = s390_vec_read_float128(v2);
-    const float128 b = s390_vec_read_float128(v3);
+    const float128 a = s390_vec_read_float128(v3);
+    const float128 b = s390_vec_read_float128(v2);
    const float128 c = s390_vec_read_float128(v4);
    uint8_t vxc, vec_exc = 0;
    float128 ret;
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@ -74,8 +74,11 @@ $(Z13_TESTS): CFLAGS+=-march=z13 -O2
 TESTS+=$(Z13_TESTS)

 ifneq ($(CROSS_CC_HAS_Z14),)
-Z14_TESTS=vfminmax
+Z14_TESTS=fma vfminmax
+fma: float.h
+fma: LDFLAGS+=-lm
 vfminmax: LDFLAGS+=-lm
+vfminmax: float.h
 $(Z14_TESTS): CFLAGS+=-march=z14 -O2
 TESTS+=$(Z14_TESTS)
 endif
--- a/tests/tcg/s390x/float.h
+++ b/tests/tcg/s390x/float.h
@ -0,0 +1,104 @@
+/*
+ * Helpers for floating-point tests.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#ifndef FLOAT_H
+#define FLOAT_H
+
+/*
+ * Floating-point value classes.
+ */
+#define N_FORMATS 3
+#define CLASS_MINUS_INF 0
+#define CLASS_MINUS_FN 1
+#define CLASS_MINUS_ZERO 2
+#define CLASS_PLUS_ZERO 3
+#define CLASS_PLUS_FN 4
+#define CLASS_PLUS_INF 5
+#define CLASS_QNAN 6
+#define CLASS_SNAN 7
+#define N_SIGNED_CLASSES 8
+static const size_t float_sizes[N_FORMATS] = {
+    /* M4 == 2: short    */ 4,
+    /* M4 == 3: long     */ 8,
+    /* M4 == 4: extended */ 16,
+};
+static const size_t e_bits[N_FORMATS] = {
+    /* M4 == 2: short    */ 8,
+    /* M4 == 3: long     */ 11,
+    /* M4 == 4: extended */ 15,
+};
+struct float_class {
+    size_t n;
+    unsigned char v[2][16];
+};
+static const struct float_class signed_floats[N_FORMATS][N_SIGNED_CLASSES] = {
+    /* M4 == 2: short */
+    {
+        /* -inf */ {1, {{0xff, 0x80, 0x00, 0x00}}},
+        /* -Fn */  {2, {{0xc2, 0x28, 0x00, 0x00},
+                        {0xc2, 0x29, 0x00, 0x00}}},
+        /* -0 */   {1, {{0x80, 0x00, 0x00, 0x00}}},
+        /* +0 */   {1, {{0x00, 0x00, 0x00, 0x00}}},
+        /* +Fn */  {2, {{0x42, 0x28, 0x00, 0x00},
+                        {0x42, 0x2a, 0x00, 0x00}}},
+        /* +inf */ {1, {{0x7f, 0x80, 0x00, 0x00}}},
+        /* QNaN */ {2, {{0x7f, 0xff, 0xff, 0xff},
+                        {0x7f, 0xff, 0xff, 0xfe}}},
+        /* SNaN */ {2, {{0x7f, 0xbf, 0xff, 0xff},
+                        {0x7f, 0xbf, 0xff, 0xfd}}},
+    },
+
+    /* M4 == 3: long */
+    {
+        /* -inf */ {1, {{0xff, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* -Fn */  {2, {{0xc0, 0x45, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+                        {0xc0, 0x46, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* -0 */   {1, {{0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* +0 */   {1, {{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* +Fn */  {2, {{0x40, 0x45, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+                        {0x40, 0x47, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* +inf */ {1, {{0x7f, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* QNaN */ {2, {{0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+                        {0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}}},
+        /* SNaN */ {2, {{0x7f, 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+                        {0x7f, 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfd}}},
+    },
+
+    /* M4 == 4: extended */
+    {
+        /* -inf */ {1, {{0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* -Fn */  {2, {{0xc0, 0x04, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+                        {0xc0, 0x04, 0x51, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* -0 */   {1, {{0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* +0 */   {1, {{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* +Fn */  {2, {{0x40, 0x04, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+                        {0x40, 0x04, 0x52, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* +inf */ {1, {{0x7f, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}}},
+        /* QNaN */ {2, {{0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+                        {0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}}},
+        /* SNaN */ {2, {{0x7f, 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+                        {0x7f, 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfd}}},
+    },
+};
+static const unsigned char default_nans[N_FORMATS][16] = {
+    /* M4 == 2: short    */ {0x7f, 0xc0, 0x00, 0x00},
+    /* M4 == 3: long     */ {0x7f, 0xf8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* M4 == 4: extended */ {0x7f, 0xff, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+};
+
+static void dump_v(FILE *f, const void *v, size_t n)
+{
+    for (int i = 0; i < n; i++) {
+        fprintf(f, "%02x", ((const unsigned char *)v)[i]);
+    }
+}
+
+static void snan_to_qnan(char *v, int fmt)
+{
+    size_t bit = 1 + e_bits[fmt];
+    v[bit / 8] |= 1 << (7 - (bit % 8));
+}
+
+#endif
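
As a quick check of the bit arithmetic in `snan_to_qnan()`, the sketch below applies the same computation to the first SNaN sample of each format from the table above. The names `quiet_nan` and `quiet_nan_examples` are made up for this illustration, and unsigned bytes are used instead of `char`. The quiet bit is the first fraction bit, that is, bit `1 + e_bits[fmt]` counted from the most significant bit of the big-endian value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Exponent widths for short, long and extended formats, as in float.h. */
static const size_t e_bits[3] = { 8, 11, 15 };

/* Same computation as snan_to_qnan(), on unsigned bytes. */
static void quiet_nan(unsigned char *v, int fmt)
{
    size_t bit = 1 + e_bits[fmt];   /* skip the sign and exponent bits */

    v[bit / 8] |= (unsigned char)(1u << (7 - (bit % 8)));
}

void quiet_nan_examples(void)
{
    /* short: 7fbfffff becomes 7fffffff */
    unsigned char s[4] = { 0x7f, 0xbf, 0xff, 0xff };
    const unsigned char s_q[4] = { 0x7f, 0xff, 0xff, 0xff };
    /* long: 7ff7ffffffffffff becomes 7fffffffffffffff */
    unsigned char l[8] = { 0x7f, 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
    const unsigned char l_q[8] = { 0x7f, 0xff, 0xff, 0xff,
                                   0xff, 0xff, 0xff, 0xff };

    quiet_nan(s, 0);
    assert(memcmp(s, s_q, sizeof(s)) == 0);
    quiet_nan(l, 1);
    assert(memcmp(l, l_q, sizeof(l)) == 0);
}
```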
--- a/tests/tcg/s390x/fma.c
+++ b/tests/tcg/s390x/fma.c
@ -0,0 +1,233 @@
+/*
+ * Test floating-point multiply-and-add instructions.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <fenv.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "float.h"
+
+union val {
+    float e;
+    double d;
+    long double x;
+    char buf[16];
+};
+
+/*
+ * PoP tables as close to the original as possible.
+ */
+static const char *table1[N_SIGNED_CLASSES][N_SIGNED_CLASSES] = {
+     /*         -inf           -Fn          -0             +0             +Fn          +inf           QNaN         SNaN     */
+    {/* -inf */ "P(+inf)",     "P(+inf)",   "Xi: T(dNaN)", "Xi: T(dNaN)", "P(-inf)",   "P(-inf)",     "P(b)",      "Xi: T(b*)"},
+    {/* -Fn  */ "P(+inf)",     "P(a*b)",    "P(+0)",       "P(-0)",       "P(a*b)",    "P(-inf)",     "P(b)",      "Xi: T(b*)"},
+    {/* -0   */ "Xi: T(dNaN)", "P(+0)",     "P(+0)",       "P(-0)",       "P(-0)",     "Xi: T(dNaN)", "P(b)",      "Xi: T(b*)"},
+    {/* +0   */ "Xi: T(dNaN)", "P(-0)",     "P(-0)",       "P(+0)",       "P(+0)",     "Xi: T(dNaN)", "P(b)",      "Xi: T(b*)"},
+    {/* +Fn  */ "P(-inf)",     "P(a*b)",    "P(-0)",       "P(+0)",       "P(a*b)",    "P(+inf)",     "P(b)",      "Xi: T(b*)"},
+    {/* +inf */ "P(-inf)",     "P(-inf)",   "Xi: T(dNaN)", "Xi: T(dNaN)", "P(+inf)",   "P(+inf)",     "P(b)",      "Xi: T(b*)"},
+    {/* QNaN */ "P(a)",        "P(a)",      "P(a)",        "P(a)",        "P(a)",      "P(a)",        "P(a)",      "Xi: T(b*)"},
+    {/* SNaN */ "Xi: T(a*)",   "Xi: T(a*)", "Xi: T(a*)",   "Xi: T(a*)",   "Xi: T(a*)", "Xi: T(a*)",   "Xi: T(a*)", "Xi: T(a*)"},
+};
+
+static const char *table2[N_SIGNED_CLASSES][N_SIGNED_CLASSES] = {
+     /*         -inf           -Fn        -0         +0         +Fn        +inf           QNaN    SNaN     */
+    {/* -inf */ "T(-inf)",     "T(-inf)", "T(-inf)", "T(-inf)", "T(-inf)", "Xi: T(dNaN)", "T(c)", "Xi: T(c*)"},
+    {/* -Fn  */ "T(-inf)",     "R(p+c)",  "R(p)",    "R(p)",    "R(p+c)",  "T(+inf)",     "T(c)", "Xi: T(c*)"},
+    {/* -0   */ "T(-inf)",     "R(c)",    "T(-0)",   "Rezd",    "R(c)",    "T(+inf)",     "T(c)", "Xi: T(c*)"},
+    {/* +0   */ "T(-inf)",     "R(c)",    "Rezd",    "T(+0)",   "R(c)",    "T(+inf)",     "T(c)", "Xi: T(c*)"},
+    {/* +Fn  */ "T(-inf)",     "R(p+c)",  "R(p)",    "R(p)",    "R(p+c)",  "T(+inf)",     "T(c)", "Xi: T(c*)"},
+    {/* +inf */ "Xi: T(dNaN)", "T(+inf)", "T(+inf)", "T(+inf)", "T(+inf)", "T(+inf)",     "T(c)", "Xi: T(c*)"},
+    {/* QNaN */ "T(p)",        "T(p)",    "T(p)",    "T(p)",    "T(p)",    "T(p)",        "T(p)", "Xi: T(c*)"},
+     /* SNaN: can't happen */
+};
+
+static void interpret_tables(union val *r, bool *xi, int fmt,
+                             int cls_a, const union val *a,
+                             int cls_b, const union val *b,
+                             int cls_c, const union val *c)
+{
+    const char *spec1 = table1[cls_a][cls_b];
+    const char *spec2;
+    union val p;
+    int cls_p;
+
+    *xi = false;
+
+    if (strcmp(spec1, "P(-inf)") == 0) {
+        cls_p = CLASS_MINUS_INF;
+    } else if (strcmp(spec1, "P(+inf)") == 0) {
+        cls_p = CLASS_PLUS_INF;
+    } else if (strcmp(spec1, "P(-0)") == 0) {
+        cls_p = CLASS_MINUS_ZERO;
+    } else if (strcmp(spec1, "P(+0)") == 0) {
+        cls_p = CLASS_PLUS_ZERO;
+    } else if (strcmp(spec1, "P(a)") == 0) {
+        cls_p = cls_a;
+        memcpy(&p, a, sizeof(p));
+    } else if (strcmp(spec1, "P(b)") == 0) {
+        cls_p = cls_b;
+        memcpy(&p, b, sizeof(p));
+    } else if (strcmp(spec1, "P(a*b)") == 0) {
+        /*
+         * In the general case splitting fma into multiplication and addition
+         * doesn't work, but this is the case with our test inputs.
+         */
+        cls_p = cls_a == cls_b ? CLASS_PLUS_FN : CLASS_MINUS_FN;
+        switch (fmt) {
+        case 0:
+            p.e = a->e * b->e;
+            break;
+        case 1:
+            p.d = a->d * b->d;
+            break;
+        case 2:
+            p.x = a->x * b->x;
+            break;
+        default:
+            fprintf(stderr, "Unsupported fmt: %d\n", fmt);
+            exit(1);
+        }
+    } else if (strcmp(spec1, "Xi: T(dNaN)") == 0) {
+        memcpy(r, default_nans[fmt], sizeof(*r));
+        *xi = true;
+        return;
+    } else if (strcmp(spec1, "Xi: T(a*)") == 0) {
+        memcpy(r, a, sizeof(*r));
+        snan_to_qnan(r->buf, fmt);
+        *xi = true;
+        return;
+    } else if (strcmp(spec1, "Xi: T(b*)") == 0) {
+        memcpy(r, b, sizeof(*r));
+        snan_to_qnan(r->buf, fmt);
+        *xi = true;
+        return;
+    } else {
+        fprintf(stderr, "Unsupported spec1: %s\n", spec1);
+        exit(1);
+    }
+
+    spec2 = table2[cls_p][cls_c];
+    if (strcmp(spec2, "T(-inf)") == 0) {
+        memcpy(r, signed_floats[fmt][CLASS_MINUS_INF].v[0], sizeof(*r));
+    } else if (strcmp(spec2, "T(+inf)") == 0) {
+        memcpy(r, signed_floats[fmt][CLASS_PLUS_INF].v[0], sizeof(*r));
+    } else if (strcmp(spec2, "T(-0)") == 0) {
+        memcpy(r, signed_floats[fmt][CLASS_MINUS_ZERO].v[0], sizeof(*r));
+    } else if (strcmp(spec2, "T(+0)") == 0 || strcmp(spec2, "Rezd") == 0) {
+        memcpy(r, signed_floats[fmt][CLASS_PLUS_ZERO].v[0], sizeof(*r));
+    } else if (strcmp(spec2, "R(c)") == 0 || strcmp(spec2, "T(c)") == 0) {
+        memcpy(r, c, sizeof(*r));
+    } else if (strcmp(spec2, "R(p)") == 0 || strcmp(spec2, "T(p)") == 0) {
+        memcpy(r, &p, sizeof(*r));
+    } else if (strcmp(spec2, "R(p+c)") == 0 || strcmp(spec2, "T(p+c)") == 0) {
+        switch (fmt) {
+        case 0:
+            r->e = p.e + c->e;
+            break;
+        case 1:
+            r->d = p.d + c->d;
+            break;
+        case 2:
+            r->x = p.x + c->x;
+            break;
+        default:
+            fprintf(stderr, "Unsupported fmt: %d\n", fmt);
+            exit(1);
+        }
+    } else if (strcmp(spec2, "Xi: T(dNaN)") == 0) {
+        memcpy(r, default_nans[fmt], sizeof(*r));
+        *xi = true;
+    } else if (strcmp(spec2, "Xi: T(c*)") == 0) {
+        memcpy(r, c, sizeof(*r));
+        snan_to_qnan(r->buf, fmt);
+        *xi = true;
+    } else {
+        fprintf(stderr, "Unsupported spec2: %s\n", spec2);
+        exit(1);
+    }
+}
+
+struct iter {
+    int fmt;
+    int cls[3];
+    int val[3];
+};
+
+static bool iter_next(struct iter *it)
+{
+    int i;
+
+    for (i = 2; i >= 0; i--) {
+        if (++it->val[i] != signed_floats[it->fmt][it->cls[i]].n) {
+            return true;
+        }
+        it->val[i] = 0;
+
+        if (++it->cls[i] != N_SIGNED_CLASSES) {
+            return true;
+        }
+        it->cls[i] = 0;
+    }
+
+    return ++it->fmt != N_FORMATS;
+}
+
+int main(void)
+{
+    int ret = EXIT_SUCCESS;
+    struct iter it = {};
+
+    do {
+        size_t n = float_sizes[it.fmt];
+        union val a, b, c, exp, res;
+        bool xi_exp, xi;
+
+        memcpy(&a, signed_floats[it.fmt][it.cls[0]].v[it.val[0]], sizeof(a));
+        memcpy(&b, signed_floats[it.fmt][it.cls[1]].v[it.val[1]], sizeof(b));
+        memcpy(&c, signed_floats[it.fmt][it.cls[2]].v[it.val[2]], sizeof(c));
+
+        interpret_tables(&exp, &xi_exp, it.fmt,
+                         it.cls[1], &b, it.cls[2], &c, it.cls[0], &a);
+
+        memcpy(&res, &a, sizeof(res));
+        feclearexcept(FE_ALL_EXCEPT);
+        switch (it.fmt) {
+        case 0:
+            asm("maebr %[a],%[b],%[c]"
+                : [a] "+f" (res.e) : [b] "f" (b.e), [c] "f" (c.e));
+            break;
+        case 1:
+            asm("madbr %[a],%[b],%[c]"
+                : [a] "+f" (res.d) : [b] "f" (b.d), [c] "f" (c.d));
+            break;
+        case 2:
+            asm("wfmaxb %[a],%[c],%[b],%[a]"
+                : [a] "+v" (res.x) : [b] "v" (b.x), [c] "v" (c.x));
+            break;
+        default:
+            fprintf(stderr, "Unsupported fmt: %d\n", it.fmt);
+            exit(1);
+        }
+        xi = fetestexcept(FE_ALL_EXCEPT) == FE_INVALID;
+
+        if (memcmp(&res, &exp, n) != 0 || xi != xi_exp) {
+            fprintf(stderr, "[  FAILED  ] ");
+            dump_v(stderr, &b, n);
+            fprintf(stderr, " * ");
+            dump_v(stderr, &c, n);
+            fprintf(stderr, " + ");
+            dump_v(stderr, &a, n);
+            fprintf(stderr, ": actual=");
+            dump_v(stderr, &res, n);
+            fprintf(stderr, "/%d, expected=", (int)xi);
+            dump_v(stderr, &exp, n);
+            fprintf(stderr, "/%d\n", (int)xi_exp);
+            ret = EXIT_FAILURE;
+        }
+    } while (iter_next(&it));
+
+    return ret;
+}
--- a/tests/tcg/s390x/vfminmax.c
+++ b/tests/tcg/s390x/vfminmax.c
@@ -4,6 +4,8 @@
 #include <stdio.h>
 #include <string.h>

+#include "float.h"
+
 /*
 * vfmin/vfmax instruction execution.
 */
@@ -21,98 +23,21 @@ static void vfminmax(unsigned int op,
                     unsigned int m4, unsigned int m5, unsigned int m6,
                     void *v1, const void *v2, const void *v3)
 {
-   insn[3] = (m6 << 4) | m5;
-   insn[4] = (m4 << 4) | 0x0e;
-   insn[5] = op;
+    insn[3] = (m6 << 4) | m5;
+    insn[4] = (m4 << 4) | 0x0e;
+    insn[5] = op;

    asm("vl %%v25,%[v2]\n"
        "vl %%v26,%[v3]\n"
        "ex 0,%[insn]\n"
        "vst %%v24,%[v1]\n"
        : [v1] "=m" (*(char (*)[16])v1)
-        : [v2] "m" (*(char (*)[16])v2)
-        , [v3] "m" (*(char (*)[16])v3)
-        , [insn] "m"(insn)
+        : [v2] "m" (*(const char (*)[16])v2)
+        , [v3] "m" (*(const char (*)[16])v3)
+        , [insn] "m" (insn)
        : "v24", "v25", "v26");
 }

-/*
- * Floating-point value classes.
- */
-#define N_FORMATS 3
-#define N_SIGNED_CLASSES 8
-static const size_t float_sizes[N_FORMATS] = {
-    /* M4 == 2: short    */ 4,
-    /* M4 == 3: long     */ 8,
-    /* M4 == 4: extended */ 16,
-};
-static const size_t e_bits[N_FORMATS] = {
-    /* M4 == 2: short    */ 8,
-    /* M4 == 3: long     */ 11,
-    /* M4 == 4: extended */ 15,
-};
-static const unsigned char signed_floats[N_FORMATS][N_SIGNED_CLASSES][2][16] = {
-    /* M4 == 2: short */
-    {
-        /* -inf */ {{0xff, 0x80, 0x00, 0x00},
-                    {0xff, 0x80, 0x00, 0x00}},
-        /* -Fn */  {{0xc2, 0x28, 0x00, 0x00},
-                    {0xc2, 0x29, 0x00, 0x00}},
-        /* -0 */   {{0x80, 0x00, 0x00, 0x00},
-                    {0x80, 0x00, 0x00, 0x00}},
-        /* +0 */   {{0x00, 0x00, 0x00, 0x00},
-                    {0x00, 0x00, 0x00, 0x00}},
-        /* +Fn */  {{0x42, 0x28, 0x00, 0x00},
-                    {0x42, 0x2a, 0x00, 0x00}},
-        /* +inf */ {{0x7f, 0x80, 0x00, 0x00},
-                    {0x7f, 0x80, 0x00, 0x00}},
-        /* QNaN */ {{0x7f, 0xff, 0xff, 0xff},
-                    {0x7f, 0xff, 0xff, 0xfe}},
-        /* SNaN */ {{0x7f, 0xbf, 0xff, 0xff},
-                    {0x7f, 0xbf, 0xff, 0xfd}},
-    },
-
-    /* M4 == 3: long */
-    {
-        /* -inf */ {{0xff, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0xff, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* -Fn */  {{0xc0, 0x45, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0xc0, 0x46, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* -0 */   {{0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* +0 */   {{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* +Fn */  {{0x40, 0x45, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x40, 0x47, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* +inf */ {{0x7f, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x7f, 0xf0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* QNaN */ {{0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-                    {0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}},
-        /* SNaN */ {{0x7f, 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-                    {0x7f, 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfd}},
-    },
-
-    /* M4 == 4: extended */
-    {
-        /* -inf */ {{0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* -Fn */  {{0xc0, 0x04, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0xc0, 0x04, 0x51, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* -0 */   {{0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* +0 */   {{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* +Fn */  {{0x40, 0x04, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x40, 0x04, 0x52, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* +inf */ {{0x7f, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
-                    {0x7f, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}},
-        /* QNaN */ {{0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-                    {0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}},
-        /* SNaN */ {{0x7f, 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-                    {0x7f, 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfd}},
-    },
-};
-
 /*
 * PoP tables as close to the original as possible.
 */
@@ -285,13 +210,6 @@ struct signed_test {
    },
 };

-static void dump_v(FILE *f, const void *v, size_t n)
-{
-    for (int i = 0; i < n; i++) {
-        fprintf(f, "%02x", ((const unsigned char *)v)[i]);
-    }
-}
-
 static int signed_test(struct signed_test *test, int m4, int m5,
                       const void *v1_exp, bool xi_exp,
                       const void *v2, const void *v3)
@@ -320,10 +238,28 @@ static int signed_test(struct signed_test *test, int m4, int m5,
    return 0;
 }

-static void snan_to_qnan(char *v, int m4)
+struct iter {
+    int cls[2];
+    int val[2];
+};
+
+static bool iter_next(struct iter *it, int fmt)
 {
-    size_t bit = 1 + e_bits[m4 - 2];
-    v[bit / 8] |= 1 << (7 - (bit % 8));
+    int i;
+
+    for (i = 1; i >= 0; i--) {
+        if (++it->val[i] != signed_floats[fmt][it->cls[i]].n) {
+            return true;
+        }
+        it->val[i] = 0;
+
+        if (++it->cls[i] != N_SIGNED_CLASSES) {
+            return true;
+        }
+        it->cls[i] = 0;
+    }
+
+    return false;
 }

 int main(void)
@@ -333,72 +269,71 @@ int main(void)

    for (i = 0; i < sizeof(signed_tests) / sizeof(signed_tests[0]); i++) {
        struct signed_test *test = &signed_tests[i];
-        int m4;
+        int fmt;

-        for (m4 = 2; m4 <= 4; m4++) {
-            const unsigned char (*floats)[2][16] = signed_floats[m4 - 2];
-            size_t float_size = float_sizes[m4 - 2];
+        for (fmt = 0; fmt < N_FORMATS; fmt++) {
+            size_t float_size = float_sizes[fmt];
+            int m4 = fmt + 2;
            int m5;

            for (m5 = 0; m5 <= 8; m5 += 8) {
                char v1_exp[16], v2[16], v3[16];
                bool xi_exp = false;
+                struct iter it = {};
                int pos = 0;
-                int i2;

-                for (i2 = 0; i2 < N_SIGNED_CLASSES * 2; i2++) {
-                    int i3;
+                do {
+                    const char *spec = test->table[it.cls[0]][it.cls[1]];

-                    for (i3 = 0; i3 < N_SIGNED_CLASSES * 2; i3++) {
-                        const char *spec = test->table[i2 / 2][i3 / 2];
+                    memcpy(&v2[pos],
+                           signed_floats[fmt][it.cls[0]].v[it.val[0]],
+                           float_size);
+                    memcpy(&v3[pos],
+                           signed_floats[fmt][it.cls[1]].v[it.val[1]],
+                           float_size);
+                    if (strcmp(spec, "T(a)") == 0 ||
+                        strcmp(spec, "Xi: T(a)") == 0) {
+                        memcpy(&v1_exp[pos], &v2[pos], float_size);
+                    } else if (strcmp(spec, "T(b)") == 0 ||
+                               strcmp(spec, "Xi: T(b)") == 0) {
+                        memcpy(&v1_exp[pos], &v3[pos], float_size);
+                    } else if (strcmp(spec, "Xi: T(a*)") == 0) {
+                        memcpy(&v1_exp[pos], &v2[pos], float_size);
+                        snan_to_qnan(&v1_exp[pos], fmt);
+                    } else if (strcmp(spec, "Xi: T(b*)") == 0) {
+                        memcpy(&v1_exp[pos], &v3[pos], float_size);
+                        snan_to_qnan(&v1_exp[pos], fmt);
+                    } else if (strcmp(spec, "T(M(a,b))") == 0) {
+                        /*
+                         * Comparing floats is risky, since the compiler might
+                         * generate the same instruction that we are testing.
+                         * Compare ints instead. This works, because we get
+                         * here only for +-Fn, and the corresponding test
+                         * values have identical exponents.
+                         */
+                        int v2_int = *(int *)&v2[pos];
+                        int v3_int = *(int *)&v3[pos];

-                        memcpy(&v2[pos], floats[i2 / 2][i2 % 2], float_size);
-                        memcpy(&v3[pos], floats[i3 / 2][i3 % 2], float_size);
-                        if (strcmp(spec, "T(a)") == 0 ||
-                            strcmp(spec, "Xi: T(a)") == 0) {
+                        if ((v2_int < v3_int) ==
+                            ((test->op == VFMIN) != (v2_int < 0))) {
                            memcpy(&v1_exp[pos], &v2[pos], float_size);
-                        } else if (strcmp(spec, "T(b)") == 0 ||
-                                   strcmp(spec, "Xi: T(b)") == 0) {
-                            memcpy(&v1_exp[pos], &v3[pos], float_size);
-                        } else if (strcmp(spec, "Xi: T(a*)") == 0) {
-                            memcpy(&v1_exp[pos], &v2[pos], float_size);
-                            snan_to_qnan(&v1_exp[pos], m4);
-                        } else if (strcmp(spec, "Xi: T(b*)") == 0) {
-                            memcpy(&v1_exp[pos], &v3[pos], float_size);
-                            snan_to_qnan(&v1_exp[pos], m4);
-                        } else if (strcmp(spec, "T(M(a,b))") == 0) {
-                            /*
-                             * Comparing floats is risky, since the compiler
-                             * might generate the same instruction that we are
-                             * testing. Compare ints instead. This works,
-                             * because we get here only for +-Fn, and the
-                             * corresponding test values have identical
-                             * exponents.
-                             */
-                            int v2_int = *(int *)&v2[pos];
-                            int v3_int = *(int *)&v3[pos];
-
-                            if ((v2_int < v3_int) ==
-                                ((test->op == VFMIN) != (v2_int < 0))) {
-                                memcpy(&v1_exp[pos], &v2[pos], float_size);
-                            } else {
-                                memcpy(&v1_exp[pos], &v3[pos], float_size);
-                            }
                        } else {
-                            fprintf(stderr, "Unexpected spec: %s\n", spec);
-                            return 1;
-                        }
-                        xi_exp |= spec[0] == 'X';
-                        pos += float_size;
-
-                        if ((m5 & 8) || pos == 16) {
-                            ret |= signed_test(test, m4, m5,
-                                               v1_exp, xi_exp, v2, v3);
-                            pos = 0;
-                            xi_exp = false;
+                            memcpy(&v1_exp[pos], &v3[pos], float_size);
                        }
+                    } else {
+                        fprintf(stderr, "Unexpected spec: %s\n", spec);
+                        return 1;
                    }
-                }
+                    xi_exp |= spec[0] == 'X';
+                    pos += float_size;
+
+                    if ((m5 & 8) || pos == 16) {
+                        ret |= signed_test(test, m4, m5,
+                                           v1_exp, xi_exp, v2, v3);
+                        pos = 0;
+                        xi_exp = false;
+                    }
+                } while (iter_next(&it, fmt));

                if (pos != 0) {
                    ret |= signed_test(test, m4, m5, v1_exp, xi_exp, v2, v3);