Commit Graph

34535 Commits

Author SHA1 Message Date
Vicki Pfau 4ce3362bce SI/DeviceGBA: Fix SI timings to actually closely match hardware 2021-04-26 01:36:43 -07:00
JosJuice ac679eb24d
Merge pull request #9666 from leoetlino/jit-block-hashtable
Jit: Optimize block link queries by using hash tables
2021-04-25 18:45:41 +02:00
JosJuice a2c8050eba DolphinQt/Android: Unify the JIT naming scheme
I think the AArch64 JIT has come far enough that it doesn't have to
be called experimental anymore.

I'm also labeling the x86-64 JIT as x86-64 for consistence with the
AArch64 JIT. This will especially be helpful if we start supporting
AArch64 on macOS, as AArch64 macOS can run both the x86-64 JIT and
the AArch64 JIT depending on whether you enable Rosetta 2.
2021-04-25 17:19:50 +02:00
JMC47 5da85f3a25
Merge pull request #9458 from JosJuice/arm-fpu-round
JitArm64: Set flush-to-zero/rounding mode and improve float/double conversion accuracy
2021-04-25 10:23:19 -04:00
JosJuice 69c14d6ec3 JitArm64: Fix frspx with single precision source
I haven't observed this breaking any game, but it didn't match
the behavior of the interpreter as far as I could tell from
reading the code, in that denormals weren't being flushed.
2021-04-25 15:56:59 +02:00
JosJuice 54451ac731 JitArm64: Use ConvertSingleToDoubleLower in RW when faster 2021-04-25 15:56:59 +02:00
JosJuice 9d6263f306 JitArm64: Add unit tests for single/double conversion 2021-04-25 15:56:58 +02:00
JosJuice 2a9d88739c JitArm64: Skip accurate single/double conversion if store-safe 2021-04-25 15:56:58 +02:00
JosJuice 1d106ceaf5 JitArm64: Optimize ConvertSingleToDouble, part 2
If we can prove that FCVT will provide a correct conversion,
we can use FCVT. This makes the common case a bit faster
and the less likely cases (unfortunately including zero,
which FCVT actually can convert correctly) a bit slower.
2021-04-25 15:56:19 +02:00
JosJuice 018e247624 JitArm64: Optimize ConvertSingleToDouble, part 1 2021-04-25 15:56:19 +02:00
JosJuice 28e4869c43 JitArm64: Optimize ConvertDoubleToSingle 2021-04-25 15:56:19 +02:00
JosJuice 6e0a5876ef JitArm64: Use accurate single/double conversions
Our old conversion approach became a lot more inaccurate when
enabling flush-to-zero, to the point of obviously breaking games.
2021-04-25 15:56:19 +02:00
JosJuice 39eccf6603 JitArm64: Call RW before FCMPE in fselx
Needed because the next commit will make RW clobber flags.
2021-04-25 15:56:19 +02:00
JosJuice 949686bbe7 JitArm64: Factor out single/double conversion code to functions
Preparation for following commits.

This commit intentionally doesn't touch paired stores,
since paired stores are supposed to flush to zero.
(Consistent with Jit64.)
2021-04-25 15:56:19 +02:00
JosJuice fdf7744a53 JitArm64: Move float conversion code out of EmitBackpatchRoutine
This simplifies some of the following commits. It does require
an extra register, but hey, we have 32 of them.

Something I think would be nice to add to the register cache
in the future is the ability to keep both the single and double
version of a guest register in two different host registers
when that is useful. That way, the extra register we write to
here can be read by a later instruction, saving us from
having to perform the same conversion again.
2021-04-25 15:56:19 +02:00
JosJuice f96ee475e4 Implement ArmFPURoundMode.cpp
Fixes https://bugs.dolphin-emu.org/issues/12388. Might also fix
other games that have problems with float/paired instructions
in JitArm64, but I haven't tested any.
2021-04-25 15:56:19 +02:00
Léo Lam aa3a96f048
Merge pull request #9644 from JosJuice/jit-fallback-discard
Jits: Fix interpreter fallback handling of discarded registers
2021-04-25 13:20:41 +02:00
JosJuice b3b5016f54 Jits: Fix interpreter fallback handling of discarded registers
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.

Fixes a 62ce1c7 regression.
2021-04-25 13:01:40 +02:00
JosJuice 0f563ffd59 Translation resources sync with Transifex 2021-04-24 21:26:12 +02:00
Léo Lam 1c6232e95f
Merge pull request #9646 from PatrickFerry/sw-textureencoder-alignedwidth
SW: Fix alignedWidth in TextureEncoder
2021-04-24 20:13:10 +02:00
Léo Lam 18174d3ed6
Merge pull request #9649 from leoetlino/cmake-auto-update-track
Make it possible to enable auto-updates by default with CMake builds
2021-04-24 19:44:51 +02:00
JosJuice 91669c25fe
Merge pull request #9650 from leoetlino/consistent-build-binary-dirs
Put x86_64 Windows binaries in Binary/x64 for consistency with ARM64
2021-04-24 19:22:05 +02:00
Léo Lam 302e8136a3
Merge pull request #5624 from Orphis/cmake_windows
cmake: Windows fixes
2021-04-24 18:11:23 +02:00
Florent Castelli 6910fab63f
cmake: Replace /Zi with /Z7 for sccache support
Also allows better parallelization since there's no contention on a PDB
file with compiling.
2021-04-24 17:59:34 +02:00
Florent Castelli 712b078a5b
cmake: Search for sccache too in CCache module
sccache is a ccache rewrite that supports GCC, Clang and MSVC.
2021-04-24 17:59:34 +02:00
Léo Lam c812ab6a63
Jit: Optimize block link queries by using hash tables
Repeated erase() + iteration on a std::multimap is extremely slow.

Slow enough that it causes a 7 second long stutter during some
transitions in F-Zero X (a N64 VC game that triggers many, many icache
invalidations).

And slow enough that JitBaseBlockCache::DestroyBlock shows up on a
flame graph as taking >50% of total CPU time on the CPU-GPU thread:
https://i.imgur.com/vvqiFL6.png

This commit optimises those block link queries by replacing the
std::multimap (which is typically implemented with red-black trees)
with hash tables.

Master: https://i.imgur.com/vvqiFL6.png / 7s stutters
(starting from 5.0-2021 and with branch following disabled)

This commit: https://i.imgur.com/hAO74fy.png / ~0.7s stutters, which
is pretty close to 5.0 stable. (5.0-2021 introduced the performance
regression and it is especially noticeable when branch following
is disabled, which is the case for all N64 VC games since 5.0-8377.)
2021-04-24 17:20:59 +02:00
JMC47 18e84361d9
Merge pull request #9660 from ezio1900/master
VideoCommon: Fix scissorOffset, handle negative value correctly
2021-04-23 21:21:53 -04:00
ezio1900 97ea3a603e VideoCommon: Fix scissorOffset, handle negative value correctly
VideoCommon: Change the type of BPMemory.scissorOffset to 10bit signed: S32X10Y10
VideoBackends: Fix Software Clipper.PerspectiveDivide function, use BPMemory.scissorOffset instead of hard code 342
2021-04-24 08:46:21 +08:00
JosJuice be5775614c
Merge pull request #9619 from leoetlino/scoped-fd
IOS/FS: Add a scoped FD class to make it harder to leak FDs
2021-04-23 21:53:25 +02:00
JosJuice f0bd6b105f
Merge pull request #9663 from leoetlino/mios-hle-patch
Fix IPL crash when launching MIOS-patched games
2021-04-23 21:00:29 +02:00
JMC47 cfc4af76a9
Merge pull request #9321 from Pokechu22/sw-copyregion
Software: Fix out of bounds accesses in CopyRegion
2021-04-23 14:07:46 -04:00
JMC47 4ab92d4757
Merge pull request #9350 from Pokechu22/sw-viewport
Software: Invert backface test when viewport is positive
2021-04-23 14:06:02 -04:00
Léo Lam 1686b637df
MIOS: Fix SConfig::OnNewTitleLoad not being called
Oversight from #9545, which moved the "new game has been loaded" logic
to a separate OnNewTitleLoad function that has to be called explicitly
*after* a title has loaded.

Coupled with the commit that makes Dolphin not clobber 0x1800-0x3000
when using MIOS, this fixes Wind Waker and other MIOS-patched games
when they are launched from the System Menu.
2021-04-22 21:55:07 +02:00
Léo Lam 568428ca67
HLE: Do not clobber 0x1800-0x3000 when using MIOS to fix IPL crash
MIOS puts patch data in low MEM1 (0x1800-0x3000) for its own use.
Overwriting data in this range can cause the IPL to crash when
launching games that get patched by MIOS.
See https://bugs.dolphin-emu.org/issues/11952 for more info.

Not applying the Gecko HLE patches means that Gecko codes will not work
under MIOS, but this is better than the alternative of having specific
games crash.
2021-04-22 21:50:05 +02:00
JosJuice 4d37dad20d
Merge pull request #9659 from leoetlino/tp-korean-gameini
GameINI: Fix file path for RZDK01 INI
2021-04-20 08:01:12 +02:00
Léo Lam bfaed2b0b1
GameINI: Fix file path for RZDK01 INI 2021-04-20 01:43:02 +02:00
Léo Lam 34348fad1d
Merge pull request #9658 from lioncash/fallthrough
BlockingLoop/Jit64: Indicate explicit fallthrough where applicable
2021-04-20 00:28:13 +02:00
Lioncash adebc499f9 Jit64: Indicate explicit [[fallthrough]] within load helper 2021-04-19 17:37:44 -04:00
Lioncash e1dfcda8a6 BlockingLoop: Add explicit [[fallthrough]] annotations 2021-04-19 17:34:46 -04:00
Léo Lam cf80ed7f2d
Merge pull request #9653 from JosJuice/android-import-nand
Android: Add "Import BootMii NAND Backup"
2021-04-19 21:57:40 +02:00
JosJuice ceacd0930b Android: Add "Import BootMii NAND Backup" 2021-04-19 21:38:22 +02:00
Léo Lam ec5fbeb0d6
Merge pull request #9654 from JosJuice/android-12-early
Android: Early changes to adapt for Android 12
2021-04-19 21:25:18 +02:00
Léo Lam 045c5a1fdc
Merge pull request #9655 from PPLToast/ztp-korea-ini
GameINI: Add speed hack for Korean TP
2021-04-19 19:58:22 +02:00
Patrick A. Ferry f6a4368192 SW: Fix alignedWidth in TextureEncoder
This was causing issues in Software Renderer. Look at bug 11487
2021-04-19 02:06:52 +01:00
ash!! 43ceba4fef optimize TextureCacheBase::SerializeTexture, ::DeserializeTexture
texture serialization and deserialization used to involve many memory
allocations and deallocations, along with many copies to and from
those allocations. avoid those by reserving a memory region inside the
output and writing there directly, skipping the allocation and copy to
an intermediate buffer entirely.
2021-04-18 13:40:42 -07:00
PPLToast ac3c728f13
Add speed hack for Korean TP 2021-04-18 22:20:25 +03:00
JosJuice 5a1a642495 Android: Early changes to adapt for Android 12
We don't actually need to do this until we bump targetSdkVersion
to Android 12 (which we can't do yet since we're not compatible
with scoped storage), but I figured I'd get it out of the way early.

Not tested on Android 12, but tested to not break stuff on Android 10.
2021-04-18 18:33:43 +02:00
JMC47 821e51cda4
Merge pull request #7214 from stenzek/cp-access-sync
Fifo: Run/sync with the GPU on command processor register access
2021-04-18 11:17:25 -04:00
JosJuice dbd39ab2a0
Merge pull request #9642 from CrunchBite/xlink-bba-fix
Fix crash when stopping a game that does not use the BBA when XLink Kai BBA is selected in configuration
2021-04-18 10:43:32 +02:00
JosJuice 92308f5e34
Merge pull request #9645 from leoetlino/fifoplayer-optimization
FifoPlayer: Copy data with memcpy instead of one byte at a time
2021-04-18 10:43:26 +02:00