dolphin

Commit Graph

Author	SHA1	Message	Date
JosJuice	b5c5371848	Arm64Emitter: Don't optimize ADD to MOV for SP Unlike ADD (immediate), MOV (register) treats SP as ZR. Therefore the ADDI2R optimization that was added in `67791d227c` can't optimize ADD to MOV when exactly one of the registers is SP. There currently isn't any code in Dolphin that calls ADDI2R with parameters that would trigger this case.	2024-02-06 21:58:07 +01:00
JosJuice	d8c78f2a92	JitArm64: Fix the "do nothing" cases of ANDI2R and friends So somehow I forgot that AArch64 uses three-operand encoding... Fixes a regression from `6303416201` which manifested in various ways, such as incorrect rendering of the Wind Waker title screen.	2023-12-21 20:51:32 +01:00
JosJuice	dc60bc5f1e	JitArm64: Improve codegen in ANDI2R and friends The codegen for the functions themselves, not for the emitted code. This seems to save 32 bytes per function. We also get rid of the oddity we had before where ANDI2R would do masking for 32-bit operations but the other functions wouldn't.	2023-12-17 18:13:32 +01:00
JosJuice	a8e1e1ae48	JitArm64: Optimize additional cases of ANDI2R and friends Now we'll never need a scratch register for values that are all zeroes or all ones.	2023-12-17 18:13:32 +01:00
JosJuice	6303416201	JitArm64: Optimize ANDI2R and friends to no-ops when possible This optimizes rlwnmx with mask == 0xFFFFFFFF.	2023-12-17 18:13:30 +01:00
JosJuice	e0eb4ef5bc	JitArm64: Use enum class for LogicalImm size parameter This should prevent issues like the one fixed in the previous commit from happening again.	2023-12-16 16:48:26 +01:00
JosJuice	67791d227c	JitArm64: Add special zero case to ADDI2R This normally doesn't reduce the instruction count, but is nonetheless useful on CPUs that can do 0-cycle moves.	2023-12-01 21:31:11 +01:00
JosJuice	25ffb0dbfc	JitArm64: Mask input to 32-bit ADDI2R In case the input was a s32 that got sign extended as part of conversion to u64.	2023-12-01 21:26:37 +01:00
JosJuice	c248a69268	JitArm64: Add utility for calling a function with arguments With this, situations where multiple arguments need to be moved from multiple registers become easy to handle, and we also get compile-time checking that the number of arguments is correct.	2023-11-01 19:01:58 +01:00
JosJuice	6e88c44d5d	Move SmallVector to Common We had one implementation of this type of data structure in Arm64Emitter and one in VideoCommon. This moves the Arm64Emitter implementation to its own file and adds begin and end functions to it, so that VideoCommon can use it. You may notice that the license header for the new file is CC0. I wrote the Arm64Emitter implementation of SmallVector, so this should be no problem.	2023-08-22 13:19:49 +02:00
Lioncash	784a216927	Common/MathUtil: Move IntLog2 into MathUtil namespace Gets this out of the global namespace.	2023-04-15 03:35:05 -04:00
JosJuice	b5b8871bce	Arm64Emitter: Fix SHRN/SHRN2 The "vector shift by immediate" category encodes the shift amount for right shifts as `size - amount`, whereas left shifts use `amount`. We're not actually using SHRN/SHRN2 anywhere, which is why this has gone undetected.	2022-12-10 11:20:23 +01:00
JosJuice	06e60ac327	JitArm64: Implement accurate NaNs For quite some time now, we've had a setting on x86-64 that makes Dolphin handle NaNs in a more accurate but slower way. There's only one game that cares about this, Dragon Ball: Revenge of King Piccolo, and what that game cares about more specifically is that the default NaN (or "generated NaN" as I believe it's called in PowerPC documentation) is the same as on PowerPC. On ARM, the default NaN is the same as on PowerPC, so for the longest time we didn't need to do anything special to get Dragon Ball: Revenge of King Piccolo working. However, in `93e636a` I changed how we handle FMA instructions in a way that resulted in the sign of NaNs becoming inverted for nmadd/nmsub instructions, breaking the game. To fix this, let's implement the AccurateNaNs setting, like on x86-64.	2022-12-03 19:41:32 +01:00
JosJuice	f45d3a6a2c	JitArm64: Optimize ps_mergeXX 1. In some cases, ps_merge01 can be implemented using one instruction. 2. When we need two instructions for ps_merge01, it's best to start with a MOV to avoid false dependencies on the destination register. 3. ps_merge10 can be implemented using a single EXT instruction.	2022-11-26 18:14:58 +01:00
JosJuice	4dbf0b8e90	JitArm64: Reimplement Force25BitPrecision The previous implementation of Force25BitPrecision was essentially a translation of the x86-64 implementation. It worked, but we can make a more efficient implementation by using an AArch64 instruction I don't believe x86-64 has an equivalent of: URSHR. The latency is the same as before, but the instruction count and register count are both reduced.	2022-10-22 10:03:52 +02:00
JosJuice	84375a91d9	Arm64Emitter: Combine immh and immb for Emit(Scalar)ShiftImm This simplifies the callers of EmitShiftImm and EmitScalarShiftImm.	2022-10-19 20:20:39 +02:00
Pokechu22	a34d5e5960	Arm64Emitter: Add additional alignment assertions Before, unaligned values would be silently ignored in most cases.	2022-09-18 23:33:24 -07:00
JosJuice	52661dcc76	Arm64Emitter: Fix encoding of size for ADD (vector) This was causing a bug in the rounding of paired single multiplication operands. If Force25BitPrecision was called for quad registers, the element size of its ADD instruction would get treated as if it was 16 instead of the intended 64, which would cause the result of the calculation to be incorrect if the carry had to pass a 16-bit boundary. Fixes one of the two bugs reported in https://bugs.dolphin-emu.org/issues/12998.	2022-08-05 21:49:28 +02:00
Pokechu22	1a92699455	Cast to int for enums that are not formattable	2022-01-13 11:11:08 -08:00
Pokechu22	558de04cfc	Common/Assert: Actually use the ASSERT_MSG's log type parameter Since it was unused, nonexistent values were used in a few places. I've replaced them.	2022-01-09 12:44:14 -08:00
Pokechu22	44e93e91d7	Common/Assert: Switch to fmt	2022-01-09 12:43:11 -08:00
Pokechu22	2025763420	Treewide: Adjust order of includes	2021-12-10 14:49:57 -08:00
JMC47	e5a4a86672	Merge pull request #10055 from JosJuice/jitarm64-reuse-memory JitArm64: Codegen space reuse	2021-11-20 17:35:24 -05:00
Merry	7c2b09e156	Arm64Emitter: Add FRINTI instruction	2021-11-06 19:15:26 +00:00
JosJuice	44beaeaff5	Arm64Emitter: Check end of allocated space when emitting code JitArm64 port of `5b52b3e`.	2021-10-13 21:52:16 +02:00
JosJuice	09cdb076a3	JitArm64: divwx - Optimize constant dividend When the dividend is known at compile time, we can eliminate some of the branching and precompute the result for the overflow case.	2021-08-26 14:50:01 +02:00
JosJuice	a90b0a1c93	JitArm64: Implement mtfsfx The sixth and final part of implementing the FPSCR system register instructions.	2021-07-31 23:50:20 +02:00
JosJuice	8af5095ff4	JitArm64: Stop using hand-encoded logical immediates	2021-07-12 22:25:49 +02:00
JosJuice	10861ed8ce	JitArm64: Turn IsImmLogical into a constexpr constructor	2021-07-10 20:31:28 +02:00
JosJuice	ab024b218e	JitArm64: Accept LogicalImm struct as bitwise inst parameter	2021-07-10 20:13:11 +02:00
JosJuice	cbbd3d3956	Arm64Emitter: Fix 64-bit TBZ/TBNZ encoding We haven't actually used 64-bit TBZ/TBNZ anywhere in Dolphin, so this mistake hasn't broken anything, but let's fix it regardless.	2021-07-07 12:21:07 +02:00
Pierre Bourdon	e149ad4f0a	treewide: convert GPLv2+ license info to SPDX tags SPDX standardizes how source code conveys its copyright and licensing information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX tags are adopted in many large projects, including things like the Linux kernel.	2021-07-05 04:35:56 +02:00
JosJuice	e0c81ae54a	JitArm64: Fix MSVC warnings	2021-05-28 15:34:08 +02:00
Skyler Saleh	948764d37b	Apple M1: Build, Analytics, and Memory Management Analytics: - Incorporated fix to allow the full set of analytics that was recommended by spotlightishere BuildMacOSUniversalBinary: - The x86_64 slice for a universal binary is now built for 10.12 - The universal binary build script now can be configured through command line options instead of modifying the script itself. - os.system calls were replaced with equivalent subprocess calls - Formatting was reworked to be more PEP 8 compliant - The script was refactored to make it more modular - The com.apple.security.cs.disable-library-validation entitlement was removed Memory Management: - Changed the JITPageWriteExecute() functions to incorporate support for nesting Other: - Fixed several small lint errors - Fixed doc and formatting mistakes - Several small refactors to make things clearer	2021-05-22 15:25:17 -07:00
Skyler Saleh	4ecb3084b7	Apple M1 Support for macOS This commit adds support for compiling Dolphin for ARM on macOS so that it can run natively on the M1 processors without running through Rosetta 2 emulation, providing a 30-50% performance speedup and fewer hitches from Rosetta 2. It consists of several key changes: - Adding support for W^X allocation (MAP_JIT) for the ARM JIT - Adding the machine context and config info to identify the M1 processor - Additions to the build system and docs to support building universal binaries - Adding code signing entitlements to access the MAP_JIT functionality - Updating the MoltenVK libvulkan.dylib to a newer version with M1 support	2021-05-22 15:25:17 -07:00
Mai M	1054abc9cc	Merge pull request #9712 from JosJuice/jitarm64-fmul-rounding JitArm64: Fix fmul rounding issues	2021-05-20 10:25:02 -04:00
JosJuice	11be2314fe	JitArm64: Fix fmul rounding issues This is a port of `4f18f60` to JitArm64.	2021-05-15 23:27:34 +02:00
JosJuice	85226e09f0	JitArm64: Implement fres	2021-05-15 19:16:32 +02:00
JosJuice	749db94dec	Arm64Emitter: Implement more variants of FMOV	2021-05-13 10:13:59 +02:00
JosJuice	1d106ceaf5	JitArm64: Optimize ConvertSingleToDouble, part 2 If we can prove that FCVT will provide a correct conversion, we can use FCVT. This makes the common case a bit faster and the less likely cases (unfortunately including zero, which FCVT actually can convert correctly) a bit slower.	2021-04-25 15:56:19 +02:00
JosJuice	a45a0a2066	Merge pull request #9494 from Dentomologist/convert_arm64reg_to_enum_class Arm64Gen: Convert ARM64Reg to enum class	2021-03-17 00:05:23 +01:00
Dentomologist	f0f206714f	Arm64Gen: Convert ARM64Reg to enum class Most changes are just adding ARM64Reg:: in front of the constants.	2021-03-13 10:10:59 -08:00
Dentomologist	686314b548	Arm64Gen: Move constant and make constexpr Namespace-scope variable was only used in one function so move it there	2021-03-07 10:09:59 -08:00
Dentomologist	dffcbcc6c4	Arm64Gen: Remove unused constant	2021-03-07 10:09:59 -08:00
JosJuice	1e500d96b0	JitArm64: Workaround for GCC ICE	2021-02-15 23:46:08 +01:00
JosJuice	9ad4f724e4	Arm64Emitter: Use ORR in MOVI2R	2021-02-13 21:04:13 +01:00
JosJuice	0d5ed06daf	Arm64Emitter: Improve MOVI2R More or less a complete rewrite of the function which aims to be equally good or better for each given input, without relying on special cases like the old implementation did. In particular, we now have more extensive support for MOVN, as mentioned in a TODO comment.	2021-02-13 20:23:03 +01:00
JosJuice	4e107935ac	Arm64Emitter: Allow specifying 21st bit of ADRP imm	2021-02-13 11:33:27 +01:00
JosJuice	d226b8f825	Arm64Emitter: Remove optimize parameter from MOVI2R I don't really see the use of this. (Maybe in the past it was used for when we need a constant number of instructions for backpatching? But we don't use MOVI2R for that now.)	2021-02-13 11:33:27 +01:00
MerryMage	8aa2013a2d	Arm64Emitter: Add additional assertions to BFI/UBFIZ	2021-01-31 12:04:57 +00:00
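
Two of the encoding fixes listed above (`52661dcc76` and `b5b8871bce`) are easier to follow with a small model. The following is a minimal Python sketch, not Dolphin's code: the function names `lanewise_add` and `shift_imm_field` are hypothetical and exist only for illustration. The first function shows why an ADD (vector) encoded with 16-bit elements instead of 64-bit elements loses a carry that crosses a 16-bit boundary. The second shows how the combined immh:immb field of the AArch64 "vector shift by immediate" category differs between right and left shifts.

```python
def lanewise_add(a: int, b: int, lane_bits: int, total_bits: int = 128) -> int:
    """Model of a vector ADD: each lane is added independently,
    so a carry never propagates from one lane into the next."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift  # drop the carry out of the lane
    return result


def shift_imm_field(esize: int, amount: int, right: bool) -> int:
    """Combined immh:immb value for a vector shift by immediate.

    esize is the element size in bits (8, 16, 32 or 64). For narrowing
    shifts such as SHRN it is the destination element size.
    """
    if esize not in (8, 16, 32, 64):
        raise ValueError("unsupported element size")
    if right:
        # Right shifts: the field holds (2 * esize - amount),
        # i.e. the low bits hold (esize - amount).
        if not 1 <= amount <= esize:
            raise ValueError("right shift amount out of range")
        return 2 * esize - amount
    # Left shifts: the field holds (esize + amount),
    # i.e. the low bits hold the amount itself.
    if not 0 <= amount < esize:
        raise ValueError("left shift amount out of range")
    return esize + amount


if __name__ == "__main__":
    # With 64-bit lanes the carry out of bit 15 is kept...
    print(hex(lanewise_add(0xFFFF, 1, lane_bits=64)))
    # ...with 16-bit lanes it is discarded.
    print(hex(lanewise_add(0xFFFF, 1, lane_bits=16)))
    # The same shift amount encodes differently for right and left shifts.
    print(shift_imm_field(8, 1, right=True), shift_imm_field(8, 1, right=False))
```

In both cases the emitter has to pass the intended element size and shift direction into the encoding. The model only covers the arithmetic of those two fields and ignores the rest of the instruction word.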


155 Commits