This normalization was added in 02ac5e95c8 (and changed to use floats in 4bf031c064). However, this normalization introduces NaN values in some cases, which is causing problems for the version of Mesa in use on FifoCI (currently 20.3.5). Although Mesa's NaN behavior is corrected by b3f3287eac (21.2.0), FifoCI is currently stuck with the older version.
This normalization was added in 02ac5e95c8, and changed to use floats in 4bf031c064. The conversion to floats means that sometimes there is insufficient precision for the normalization process, which results in values of NaN or infinity. Performing the whole process with doubles prevents that, but games also sometimes set the values to NaN or infinity directly (possibly accidentally due to the values not being initialized due to them not being used in the current configuration?).
The version of Mesa currently in use on FifoCI (20.3.5) has issues with NaN. Although this bug has been fixed (b3f3287eac in 21.2.0), FifoCI is stuck with the older version.
This change may or may not be incorrect, but it should result in the same behavior as already present in Dolphin, while working around the Mesa bug.
CARDUCode, GBAUCode, and INITUCode previously didn't have an implementation of it. In practice it's unlikely that this caused an issue, since these uCodes are only active for a few frames at most, but now that GBAUCode doesn't have global state, we can implement it there. I also implemented it for CARDUCode, although our CARDUCode implementation does not have all states handled yet - this is simply future-proofing so that when the card uCode is properly implemented, the save state version does not need to be bumped. INITUCode does not have any state to save, though.
The accuracy improvements are:
* The request mail must be 0xabba0000 exactly; both the low and high parts are checked
* The address is masked with 0x0fffffff
* Before, the global state meant that after the GBA uCode had been used once, it would accept 0xcdd1 commands immediately. Now, it only accepts them after execution has finished.
These lookup tables total 4 megabytes, and contain data that's entirely redundant to the actual cache state (as part of an optimization, though I'm not sure whether the optimization actually is useful). This change instead recomputes these lookup tables when loading the state (which involves filling the lookup table with a marker (0xff), and then setting the 128 * 8 valid entries (1 kilobyte)).