Wunkolo
06daedf077
[a64] Implement `LSE` and `FP16C` detection
...
Adds two new flags for allowing the use of LSE and FP16C
2024-06-23 14:00:26 -07:00
Wunkolo
96d444da9c
[a64] Implement `OPCODE_UNPACK`
...
This is a very literal translation of the x64 code to ARM and may not be well optimized. Passes unit tests save for a couple of off-by-one errors.
2024-06-23 14:00:26 -07:00
Wunkolo
6478623d47
[a64] Fix `OPCODE_PACK` saturation edge-cases
...
Passes cpu-ppc-tests
2024-06-23 14:00:26 -07:00
Wunkolo
40d908b596
[a64] Implement `OPCODE_PACK`(2101010, 4202020, 8-in-16, 16-in-32)
2024-06-23 14:00:26 -07:00
Wunkolo
7c094dc6cf
[a64] Implement `OPCODE_LOAD_CLOCK` `clock_source_raw`
...
Uses the `CNTVCT_EL0` register and applies frequency scaling
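The frequency scaling mentioned above can be sketched roughly as follows. This is only an illustration, not the backend's actual code: `ScaleTicks` is a hypothetical helper, and the source and target frequencies are whatever the caller supplies. Splitting the value into whole seconds and a remainder avoids overflowing the intermediate product, assuming `(from_hz - 1) * to_hz` fits in 64 bits.

```cpp
#include <cstdint>

// Hypothetical helper: rescales a raw counter value from the counter's own
// frequency (from_hz) to a target frequency (to_hz).
// Assumes from_hz != 0 and that (from_hz - 1) * to_hz fits in 64 bits.
inline uint64_t ScaleTicks(uint64_t ticks, uint64_t from_hz, uint64_t to_hz) {
  const uint64_t whole = ticks / from_hz;  // full seconds elapsed
  const uint64_t rem = ticks % from_hz;    // leftover ticks within a second
  // Scale both parts separately so ticks * to_hz is never formed directly.
  return whole * to_hz + (rem * to_hz) / from_hz;
}
```

For example, `ScaleTicks(raw, 24000000, 50000000)` would rescale a counter running at 24 MHz to a 50 MHz timebase; both figures are arbitrary example values.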
2024-06-23 14:00:26 -07:00
Wunkolo
9b5a690706
[a64] Optimize `OPCODE_MEMSET`
...
Use pair-stores rather than single stores to write 32 bytes of data at a time.
2024-06-23 14:00:26 -07:00
Wunkolo
6e2910b25e
[a64] Optimize memory-address calculation
...
The LSL can be embedded into the ADD to remove an additional instruction.
What was `cset`+`lsl`+`add` should now just be `cset`+`add ... LSL 12`
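As a rough illustration in C++ terms, the two forms compute the same address; the change only folds the shift into the add. The function names and the `flag` parameter are hypothetical stand-ins for whatever condition `cset` materializes.

```cpp
#include <cstdint>

// Three-step form: materialize the flag, shift it, then add.
inline uint64_t AddressSeparateShift(uint64_t base, bool flag) {
  const uint64_t bit = flag ? 1u : 0u;  // cset
  const uint64_t offset = bit << 12;    // lsl
  return base + offset;                 // add
}

// Two-step form: the shift is folded into the add's shifted-register operand.
inline uint64_t AddressFusedShift(uint64_t base, bool flag) {
  const uint64_t bit = flag ? 1u : 0u;  // cset
  return base + (bit << 12);            // add ..., LSL 12
}
```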
2024-06-23 14:00:26 -07:00
Wunkolo
e2d1e5d7f8
[a64] Optimize vector-constant generation
...
Uses MOVI rather than EOR to generate some constants.
MOVI is a register-renaming idiom on many microarchitectures.
2024-06-23 14:00:26 -07:00
Wunkolo
a7ae117c90
[a64] Implement `b` `bl` `br` `blr` `cbnz` `cbz` instruction-stepping
2024-06-23 14:00:26 -07:00
Wunkolo
c3efaaa286
[a64] Implement instruction stepping.
...
Uses `0x0000'dead` as an instruction-stepping sentinel value.
Support for basic jumping instructions like `b`, `bl`, `br`, and `blr`.
2024-06-23 14:00:26 -07:00
Wunkolo
f7bd0c89a3
[a64] Implement guest-debugger stack-walks
2024-06-23 14:00:26 -07:00
Wunkolo
eb0736eb25
[a64] Reduce function prolog/epilog to 16 bytes
...
Just need to store `fp` and `lr`
2024-06-23 14:00:26 -07:00
Wunkolo
a54226578e
[a64] Implement memory tracing
2024-06-23 14:00:26 -07:00
Wunkolo
f1235be462
[a64] Fix `ATOMIC_COMPARE_EXCHANGE_I32` comparison type
...
This fixes 32-bit atomic-compare-exchanges.
The upper-half of the input register _must_ be clipped off.
This fixes a deadlock in some games.
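A minimal sketch of why the clipping matters, assuming the operands arrive in 64-bit registers whose upper halves may hold stale bits. `CompareExchange32` is a hypothetical helper, not the backend's code. If the comparison were done at 64-bit width, a stale upper half could make it fail forever, which is one plausible way for a spin loop to never make progress.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: performs a 32-bit compare-exchange using values that
// were handed over in 64-bit registers.
inline bool CompareExchange32(std::atomic<uint32_t>& target,
                              uint64_t expected_reg, uint64_t desired_reg) {
  // Clip the upper halves so only the low 32 bits take part in the compare.
  uint32_t expected = static_cast<uint32_t>(expected_reg);
  const uint32_t desired = static_cast<uint32_t>(desired_reg);
  return target.compare_exchange_strong(expected, desired);
}
```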
2024-06-23 14:00:25 -07:00
Wunkolo
c33f543503
[a64] Implement `kDebugInfoTraceFunctions` and `kDebugInfoTraceFunctionCoverage`
...
Relies on armv8.1-a atomic features
2024-06-23 14:00:25 -07:00
Wunkolo
bec248c2f8
[a64] Fix `OPCODE_CNTLZ`
...
8-bit and 16-bit CNTLZ need their bit-count corrected for the original element type
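A sketch of the correction, assuming the count is taken on a 32-bit register holding the zero-extended value. The helper names are hypothetical. The extra leading zeros contributed by the wider register are subtracted so the result matches the original element width.

```cpp
#include <cstdint>

// Portable 32-bit count-leading-zeros; returns 32 for an input of 0.
inline uint32_t Clz32(uint32_t x) {
  uint32_t n = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1) {
    ++n;
  }
  return n;
}

// The value is zero-extended to 32 bits, so remove the 24 (or 16) zeros
// that belong to the wider register rather than to the element itself.
inline uint32_t Clz8(uint8_t x) { return Clz32(x) - 24; }
inline uint32_t Clz16(uint16_t x) { return Clz32(x) - 16; }
```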
2024-06-23 14:00:25 -07:00
Wunkolo
b9d0752b40
[a64] Optimize `OPCODE_MUL_ADD`
...
Use `FMADD` and `FMLA`
Tests are the same, though now it should run a bit faster.
The tests that fail primarily involve denormals and other subtle precision issues, it seems.
Ex:
```
i> 00002358 - vmaddfp_7298_GEN
!> 00002358 Register v4 assert failed:
!> 00002358 Expected: v4 == [00000000, 00000000, 00000000, 00000000]
!> 00002358 Actual: v4 == [000D000E, 00138014, 000E4CDC, 0018B34D]
!> 00002358 TEST FAILED
```
Host-To-Guest and Guest-To-Host thunks should probably restore/preserve
the FPCR to maintain these roundings.
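For context, a fused multiply-add rounds once instead of twice, which is why its results can differ from a separate multiply and add. The sketch below uses the standard `std::fma`, not the backend's emitter; `FusedMulAdd` is a hypothetical wrapper.

```cpp
#include <cmath>

// Computes a * b + c with a single rounding step.
inline float FusedMulAdd(float a, float b, float c) {
  return std::fma(a, b, c);
}

// With a = 1 + 2^-12, the exact product a * a is 1 + 2^-11 + 2^-24.
// Rounding that product to float first (round-to-nearest-even) drops the
// 2^-24 term; the fused form keeps it until the final rounding.
```

Calling `FusedMulAdd(a, a, -(1.0f + std::ldexp(1.0f, -11)))` with `a = 1.0f + std::ldexp(1.0f, -12)` yields exactly 2^-24.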
2024-06-23 14:00:25 -07:00
Wunkolo
684904c487
[a64] Implement `PERMUTE_V128`(int16)
...
Passes `vmrghh` and `vmrglh` unit-tests
2024-06-23 14:00:25 -07:00
Wunkolo
7eca228027
[a64] Fix `VECTOR_CONVERT_F2I` rounding
...
```
4.2.2.4 Floating-Point Rounding and Conversion Instructions
...
Floating-point conversions to integers (vctuxs, vctsxs) use round-toward-zero (truncate).
...
```
This passes all of the `vctuxs` and `vctsxs` unit tests
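A minimal scalar sketch of round-toward-zero conversion for one lane. The saturation bounds and the NaN-to-zero result are assumptions added for this illustration and are not stated in the commit; `ConvertToInt32Truncate` is a hypothetical name.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Converts a float to int32 by truncating toward zero.
// Assumption: out-of-range inputs saturate and NaN converts to 0.
inline int32_t ConvertToInt32Truncate(float value) {
  if (std::isnan(value)) return 0;
  if (value >= 2147483648.0f) return std::numeric_limits<int32_t>::max();
  if (value <= -2147483648.0f) return std::numeric_limits<int32_t>::min();
  // A C++ float-to-integer conversion discards the fractional part.
  return static_cast<int32_t>(value);
}
```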
2024-06-23 14:00:25 -07:00
Wunkolo
d3d3ea3149
[a64] Fix `FPCR` starting bit index
2024-06-23 14:00:25 -07:00
Wunkolo
1919dda336
[a64] Fix `OPCODE_VECTOR_CONVERT_{I2F,F2I}`
...
😳
2024-06-23 14:00:25 -07:00
Wunkolo
0e2f756cdd
[a64] Implement `VECTOR_CONVERT_{F2I,I2F}`
2024-06-23 14:00:25 -07:00
Wunkolo
e2d141e505
[a64] Fix `OPCODE_VECTOR_SHA`(constant)
...
Values should be modulo-element-size
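As a scalar illustration for a single 8-bit lane, assuming `SHA` denotes an arithmetic right shift: the constant shift amount is reduced modulo the element width before shifting. `ShiftRightArithmetic8` is a hypothetical helper.

```cpp
#include <cstdint>

// Arithmetic right shift of one 8-bit lane with the amount taken modulo 8.
// Note: right-shifting a negative value is implementation-defined before
// C++20; mainstream compilers perform an arithmetic shift.
inline int8_t ShiftRightArithmetic8(int8_t value, uint32_t amount) {
  const uint32_t masked = amount % 8;  // modulo element size in bits
  return static_cast<int8_t>(value >> masked);
}
```

With this masking, a shift amount of 9 behaves like a shift by 1 and a shift amount of 8 leaves the lane unchanged.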
2024-06-23 14:00:25 -07:00
Wunkolo
41eeae16f5
[a64] Fix `MUL_HI_I32` operands
2024-06-23 14:00:25 -07:00
Wunkolo
28b629e529
[a64] Fix `OPCODE_MAX`
...
Was not handling constant arguments properly
2024-06-23 14:00:25 -07:00
Wunkolo
be0c7932ad
[a64] Refactor `OPCODE_ATOMIC_COMPARE_EXCHANGE`
...
Much more explicit arguments while trying to debug a deadlock
2024-06-23 14:00:25 -07:00
Wunkolo
42d41a52f1
[a64] Fix floating-point `BRANCH_FALSE`
2024-06-23 14:00:25 -07:00
Wunkolo
6b4ff8bb62
[CPU] Fix multi-arch cpu-test support
2024-06-23 14:00:25 -07:00
Wunkolo
edfd2f219b
[a64] Implement `OPCODE_VECTOR_AVERAGE`
...
Passes generated unit tests
2024-06-23 14:00:25 -07:00
Wunkolo
1ad0d7e514
[a64] Fix `SELECT_V128_V128`
...
Potential input-register stomping, and the operand order was seemingly wrong.
Passes generated unit tests.
2024-06-23 14:00:25 -07:00
Wunkolo
de040f0b42
[a64] Fix `OPCODE_SPLAT`
...
Writing to the wrong register!
2024-06-23 14:00:25 -07:00
Wunkolo
207e2c11fd
[a64] Implement `VECTOR_COMPARE_{EQ,UGT,UGE,SGT,SGE}_V128`
2024-06-23 14:00:25 -07:00
Wunkolo
2e2f47f2de
[a64] Fix `AND_NOT_V128`
...
Operand order is wrong.
2024-06-23 14:00:25 -07:00
Wunkolo
87cca91405
[a64] Fix `PERMUTE_V128` out-of-index case
2024-06-23 14:00:25 -07:00
Wunkolo
3adb86ce58
[a64] Implement `OPCODE_VECTOR_SUB`
2024-06-23 14:00:25 -07:00
Wunkolo
737f2b582b
[UI] Implement Arm64 host register info
2024-06-23 14:00:25 -07:00
Wunkolo
f5e14d6a40
[a64] Fix `SET_ROUNDING_MODE_I32` exception
2024-06-23 14:00:25 -07:00
Wunkolo
046e8edc2a
[a64] Fix `SELECT` register usage
2024-06-23 14:00:25 -07:00
Wunkolo
f73c8fe947
[a64] Implement `OPCODE_SWIZZLE`
2024-06-23 14:00:24 -07:00
Wunkolo
c4b263894d
[a64] Implement `PERMUTE_I32`
2024-06-23 14:00:24 -07:00
Wunkolo
b532ab5f48
[a64] Implement `PERMUTE_V128`(int8)
2024-06-23 14:00:24 -07:00
Wunkolo
50d7ad5114
[a64] Fix non-const MUL_I32
...
Was picking up `W0` rather than src1
2024-06-23 14:00:24 -07:00
Wunkolo
866ce9756a
[a64] Fix signed MUL_HI
2024-06-23 14:00:24 -07:00
Wunkolo
1bdc243e05
[a64] Fix ADDC carry-bit assignment
2024-06-23 14:00:24 -07:00
Wunkolo
6f0ff9e54b
[a64] Preserve X0 when resolving functions
...
Fixes indirect branches
2024-06-23 14:00:24 -07:00
Wunkolo
31b2ccd3bb
[a64] Protect address-generation from imm-overflow
2024-06-23 14:00:24 -07:00
Wunkolo
c495fe726f
[PPC] Add a64 backend testing support
2024-06-23 14:00:24 -07:00
Wunkolo
fbc306f702
[a64] Implement multi-arch capstone support
2024-06-23 14:00:24 -07:00
Wunkolo
6e83e2a42d
[a64] Fix instruction constant generation
...
Fixes some offset generation as well
2024-06-23 14:00:24 -07:00
Wunkolo
dc6666d4d2
[a64] Update guest calling conventions
...
Guest-function calls will use W17 for indirect calls
2024-06-23 14:00:24 -07:00