Commit Graph

7060 Commits

Author SHA1 Message Date
Wunkolo 06daedf077 [a64] Implement `LSE` and `FP16C` detection
Adds two new flags for allowing the use of LSE and FP16C
2024-06-23 14:00:26 -07:00
Wunkolo 96d444da9c [a64] Implement `OPCODE_UNPACK`
This is a very literal translation from the x64 code into ARM and may not be very optimized. Passes unit tests save for a couple of off-by-one errors.
2024-06-23 14:00:26 -07:00
Wunkolo 6478623d47 [a64] Fix `OPCODE_PACK` saturation edge-cases
Passes cpu-ppc-tests
2024-06-23 14:00:26 -07:00
Wunkolo 40d908b596 [a64] Implement `OPCODE_PACK`(2101010, 4202020, 8-in-16, 16-in-32) 2024-06-23 14:00:26 -07:00
Wunkolo 7c094dc6cf [a64] Implement `OPCODE_LOAD_CLOCK` `clock_source_raw`
Uses the `CNTVCT_EL0`-register and applies frequency scaling
2024-06-23 14:00:26 -07:00
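A minimal sketch of the kind of frequency scaling that 7c094dc6cf mentions, assuming the goal is to convert raw counter ticks into ticks at a different rate. The helper name `ScaleTicks` and its parameters are hypothetical and not taken from the commit. Splitting the value into whole seconds and a remainder keeps the intermediate product small for large tick counts.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from the commit): rescale a raw counter
// value that ticks at `src_hz` into ticks at `dst_hz`.
// On AArch64 the raw value and its rate would come from the CNTVCT_EL0 and
// CNTFRQ_EL0 system registers; here they are plain arguments.
constexpr uint64_t ScaleTicks(uint64_t ticks, uint64_t src_hz,
                              uint64_t dst_hz) {
  // Whole seconds scale exactly; only the sub-second remainder needs a
  // multiply-then-divide, which stays far below the 64-bit limit.
  const uint64_t whole_seconds = ticks / src_hz;
  const uint64_t remainder = ticks % src_hz;
  return whole_seconds * dst_hz + (remainder * dst_hz) / src_hz;
}
```

The frequencies passed in are up to the caller; the sketch only shows the arithmetic, not how the backend emits it.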
Wunkolo 9b5a690706 [a64] Optimize `OPCODE_MEMSET`
Use pair-stores rather than single stores to write 32 bytes of data at a time.
2024-06-23 14:00:26 -07:00
Wunkolo 6e2910b25e [a64] Optimize memory-address calculation
The LSL can be embedded into the ADD to remove an additional instruction.
What was `cset`+`lsl`+`add` should now just be `cset`+`add ... LSL 12`
2024-06-23 14:00:26 -07:00
Wunkolo e2d1e5d7f8 [a64] Optimize vector-constant generation
Uses MOVI to optimize some cases of constants rather than EOR.
MOVI is a register-renaming idiom on many microarchitectures.
2024-06-23 14:00:26 -07:00
Wunkolo a7ae117c90 [a64] Implement `b` `bl` `br` `blr` `cbnz` `cbz` instruction-stepping 2024-06-23 14:00:26 -07:00
Wunkolo c3efaaa286 [a64] Implement instruction stepping.
Uses `0x0000'dead` as an instruction-stepping sentinel value.
Support for basic jumping instructions like `b`, `bl`, `br`, and `blr`.
2024-06-23 14:00:26 -07:00
Wunkolo f7bd0c89a3 [a64] Implement guest-debugger stack-walks 2024-06-23 14:00:26 -07:00
Wunkolo eb0736eb25 [a64] Reduce function prolog/epilog to 16 bytes
Just need to store `fp` and `lr`
2024-06-23 14:00:26 -07:00
Wunkolo a54226578e [a64] Implement memory tracing 2024-06-23 14:00:26 -07:00
Wunkolo f1235be462 [a64] Fix `ATOMIC_COMPARE_EXCHANGE_I32` comparison type
This fixes 32-bit atomic-compare-exchanges.
The upper-half of the input register _must_ be clipped off.

This fixes a deadlock in some games.
2024-06-23 14:00:25 -07:00
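To illustrate why the upper half matters in f1235be462, here is a small model, assuming the expected value lives in a 64-bit host register whose upper 32 bits may hold stale data. The function `CompareExchange32` is a hypothetical stand-in written with `std::atomic`, not the emitted A64 code. If the comparison used all 64 bits, a lock loop waiting for the exchange to succeed could spin forever.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical model (not the backend's code): `expected_reg` and
// `desired_reg` mimic 64-bit host registers carrying 32-bit guest values.
inline bool CompareExchange32(std::atomic<uint32_t>& target,
                              uint64_t expected_reg, uint64_t desired_reg) {
  // Clip the upper halves so that only the low 32 bits take part in the
  // comparison and the store.
  uint32_t expected = static_cast<uint32_t>(expected_reg);
  const uint32_t desired = static_cast<uint32_t>(desired_reg);
  return target.compare_exchange_strong(expected, desired);
}
```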
Wunkolo c33f543503 [a64] Implement `kDebugInfoTraceFunctions` and `kDebugInfoTraceFunctionCoverage`
Relies on armv8.1-a atomic features
2024-06-23 14:00:25 -07:00
Wunkolo bec248c2f8 [a64] Fix `OPCODE_CNTLZ`
8- and 16-bit CNTLZ need their bit counts corrected to match the original element type
2024-06-23 14:00:25 -07:00
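A sketch of the correction described in bec248c2f8, assuming the narrow value is zero-extended and counted with a 32-bit count-leading-zeros. The names `Clz32`, `Clz8`, and `Clz16` are illustrative only. The 32-bit count includes the extra zero bits introduced by the extension, so they have to be subtracted again.

```cpp
#include <cstdint>

// Portable 32-bit count-leading-zeros, standing in for the host instruction.
// Returns 32 for an input of zero.
constexpr uint32_t Clz32(uint32_t v) {
  uint32_t n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1) {
    ++n;
  }
  return n;
}

// Zero-extension to 32 bits adds 24 (or 16) leading zeros that do not belong
// to the original element, so subtract them from the 32-bit count.
constexpr uint32_t Clz8(uint8_t v) { return Clz32(v) - 24; }
constexpr uint32_t Clz16(uint16_t v) { return Clz32(v) - 16; }
```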
Wunkolo b9d0752b40 [a64] Optimize `OPCODE_MUL_ADD`
Use `FMADD` and `FMLA`
Tests are the same, though now it should run a bit faster.
The tests that still fail seem to involve primarily denormals and other subtle
precision issues.

Ex:
```
i> 00002358   - vmaddfp_7298_GEN
!> 00002358 Register v4 assert failed:
!> 00002358   Expected: v4 == [00000000, 00000000, 00000000, 00000000]
!> 00002358     Actual: v4 == [000D000E, 00138014, 000E4CDC, 0018B34D]
!> 00002358     TEST FAILED
```

Host-To-Guest and Guest-To-Host thunks should probably restore/preserve
the FPCR to maintain these roundings.
2024-06-23 14:00:25 -07:00
Wunkolo 684904c487 [a64] Implement `PERMUTE_V128`(int16)
Passes `vmrghh` and `vmrglh` unit-tests
2024-06-23 14:00:25 -07:00
Wunkolo 7eca228027 [a64] Fix `VECTOR_CONVERT_F2I` rounding
```
4.2.2.4 Floating-Point Rounding and Conversion Instructions
...
Floating-point conversions to integers (vctuxs, vctsxs) use round-toward-zero (truncate).
...
```

This passes all of the `vctuxs` and `vctsxs` unit tests
2024-06-23 14:00:25 -07:00
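The round-toward-zero rule quoted in 7eca228027 can be sketched in scalar form as follows. `TruncToInt32` is a hypothetical helper. The saturation bounds and the NaN-to-zero result are assumptions added so the cast is well defined; they mirror the behavior of the A64 `FCVTZS` instruction and are not taken from the quoted text.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of a truncating float-to-int32 conversion.
inline int32_t TruncToInt32(float f) {
  if (std::isnan(f)) {
    return 0;  // Assumption: NaN converts to zero, as FCVTZS does.
  }
  const float t = std::trunc(f);  // Round toward zero, never to nearest.
  // Assumption: out-of-range values saturate instead of wrapping.
  if (t >= 2147483648.0f) {
    return std::numeric_limits<int32_t>::max();
  }
  if (t < -2147483648.0f) {
    return std::numeric_limits<int32_t>::min();
  }
  return static_cast<int32_t>(t);
}
```

Note that 2.7 becomes 2 and -2.7 becomes -2, whereas round-to-nearest would give 3 and -3.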
Wunkolo d3d3ea3149 [a64] Fix `FPCR` starting bit index 2024-06-23 14:00:25 -07:00
Wunkolo 1919dda336 [a64] Fix `OPCODE_VECTOR_CONVERT_{I2F,F2I}`
😳
2024-06-23 14:00:25 -07:00
Wunkolo 0e2f756cdd [a64] Implement `VECTOR_CONVERT_{F2I,I2F}` 2024-06-23 14:00:25 -07:00
Wunkolo e2d141e505 [a64] Fix `OPCODE_VECTOR_SHA`(constant)
Values should be modulo-element-size
2024-06-23 14:00:25 -07:00
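As an illustration of "modulo-element-size" in e2d141e505, the sketch below assumes 8-bit lanes and a per-lane arithmetic right shift. `VectorShaI8` is a hypothetical name, and the scalar loop only models the semantics, not the emitted vector code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical per-lane model of an arithmetic right shift on 8-bit elements.
inline std::array<int8_t, 16> VectorShaI8(
    const std::array<int8_t, 16>& src, const std::array<uint8_t, 16>& amount) {
  std::array<int8_t, 16> out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    // Reduce the shift amount modulo the element width (8 bits), so a
    // constant of 9 shifts by 1 rather than shifting everything out.
    const unsigned shift = amount[i] & 7u;
    // Right shift of a negative value is arithmetic on mainstream compilers
    // and guaranteed to be so from C++20 onward.
    out[i] = static_cast<int8_t>(src[i] >> shift);
  }
  return out;
}
```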
Wunkolo 41eeae16f5 [a64] Fix `MUL_HI_I32` operands 2024-06-23 14:00:25 -07:00
Wunkolo 28b629e529 [a64] Fix `OPCODE_MAX`
Was not handling constant arguments properly
2024-06-23 14:00:25 -07:00
Wunkolo be0c7932ad [a64] Refactor `OPCODE_ATOMIC_COMPARE_EXCHANGE`
Much more explicit arguments while trying to debug a deadlock
2024-06-23 14:00:25 -07:00
Wunkolo 42d41a52f1 [a64] Fix floating-point `BRANCH_FALSE` 2024-06-23 14:00:25 -07:00
Wunkolo 6b4ff8bb62 [CPU] Fix multi-arch cpu-test support 2024-06-23 14:00:25 -07:00
Wunkolo edfd2f219b [a64] Implement `OPCODE_VECTOR_AVERAGE`
Passes generated unit tests
2024-06-23 14:00:25 -07:00
Wunkolo 1ad0d7e514 [a64] Fix `SELECT_V128_V128`
Fixes potential input-register stomping; the operand order was seemingly wrong as well.

Passes generated unit tests.
2024-06-23 14:00:25 -07:00
Wunkolo de040f0b42 [a64] Fix `OPCODE_SPLAT`
Writing to the wrong register!
2024-06-23 14:00:25 -07:00
Wunkolo 207e2c11fd [a64] Implement `VECTOR_COMPARE_{EQ,UGT,UGE,SGT,SGE}_V128` 2024-06-23 14:00:25 -07:00
Wunkolo 2e2f47f2de [a64] Fix `AND_NOT_V128`
Operand order was wrong.
2024-06-23 14:00:25 -07:00
Wunkolo 87cca91405 [a64] Fix `PERMUTE_V128` out-of-index case 2024-06-23 14:00:25 -07:00
Wunkolo 3adb86ce58 [a64] Implement `OPCODE_VECTOR_SUB` 2024-06-23 14:00:25 -07:00
Wunkolo 737f2b582b [UI] Implement Arm64 host register info 2024-06-23 14:00:25 -07:00
Wunkolo f5e14d6a40 [a64] Fix `SET_ROUNDING_MODE_I32` exception 2024-06-23 14:00:25 -07:00
Wunkolo 046e8edc2a [a64] Fix `SELECT` register usage 2024-06-23 14:00:25 -07:00
Wunkolo f73c8fe947 [a64] Implement `OPCODE_SWIZZLE` 2024-06-23 14:00:24 -07:00
Wunkolo c4b263894d [a64] Implement `PERMUTE_I32` 2024-06-23 14:00:24 -07:00
Wunkolo b532ab5f48 [a64] Implement `PERMUTE_V128`(int8) 2024-06-23 14:00:24 -07:00
Wunkolo 50d7ad5114 [a64] Fix non-const MUL_I32
Was picking up `W0` rather than src1
2024-06-23 14:00:24 -07:00
Wunkolo 866ce9756a [a64] Fix signed MUL_HI 2024-06-23 14:00:24 -07:00
Wunkolo 1bdc243e05 [a64] Fix ADDC carry-bit assignment 2024-06-23 14:00:24 -07:00
Wunkolo 6f0ff9e54b [a64] Preserve X0 when resolving functions
Fixes indirect branches
2024-06-23 14:00:24 -07:00
Wunkolo 31b2ccd3bb [a64] Protect address-generation from imm-overflow 2024-06-23 14:00:24 -07:00
Wunkolo c495fe726f [PPC] Add a64 backend testing support 2024-06-23 14:00:24 -07:00
Wunkolo fbc306f702 [a64] Implement multi-arch capstone support 2024-06-23 14:00:24 -07:00
Wunkolo 6e83e2a42d [a64] Fix instruction constant generation
Fixes some offset generation as well
2024-06-23 14:00:24 -07:00
Wunkolo dc6666d4d2 [a64] Update guest calling conventions
Guest-function calls will use W17 for indirect calls
2024-06-23 14:00:24 -07:00