So to compensate lets bring back some speed to the emulation.
change a little the way the vertex are send to the gpu,
This first implementation changes dx9 a lot and dx11 a little to increase the parallelism between the cpu and gpu.
ogl: is my next step in ogl is a little more trickier so i have to take a little more time.
the original concept is Marcos idea, with my little touch to make it even more faster.
what to look for: SPEEEEEDDD :).
please test it a lot and let me know if you see any problem.
in dx9 the code is prepared to fall back to the previous implementation if your card does not support the amount of buffers needed.
So if you did not experience any speed gains you know where is the problem :).
for the ones with more experience and compression of the code please test changing the amount and size of the buffers to tune this for your specific machine.
The current values are the sweet spot for my machine.
All must Thanks Marcos, I hate him for giving good ideas when I'm full of work.
Most of the InvalidateICache calls are for a 32 bytes block: this is the
number of bytes invalidated by PowerPC dcb*/icb* instructions. Profiling
shows that a lot of CPU time is spent checking if there are any JIT blocks
covered by these 32 bytes (using std::map::lower_bound).
This patch adds a bitset containing the state of every 32 bytes block in
RAM (JIT cached/not JIT cached). Using that, a 32 bytes InvalidateICache
can check in the bitset if any JIT block might be invalidated. A bitset
check is a lot faster than an std::map::lower_bound operation, improving
performance of JitCache::InvalidateICache by more than 100%.
Some practical numbers:
* Xenoblade Chronicles (PAL)
56.04FPS -> 59.28FPS (+5.78%)
* The Last Story (PAL)
30.9FPS -> 32.83FPS (+6.25%)
* Super Mario Galaxy (PAL)
59.76FPS -> 62.46FPS (+4.52%)
This function still takes more time than it should - more optimization in
this area might be possible (specializing for 32 bytes blocks to avoid
useless memcpy, for example).
Very useful to compare performance between two builds, check the impact of
a configuration option, etc. FPS log is stored in User/Logs/fps.txt and is
reset each time you launch a game. Only enabled if you check the "Log FPS
to file" option in your graphics settings.
Could be improved a bit: currently logs only every 1s (so you can't really
see small variations), maybe output more infos to the fps.txt like
average/stddev (but Excel/Libreoffice/Google Docs can compute that easily
too).
These merges, while in theory improving emulation accuracy, cause issues
in other parts of the emulator based on invalid assumptions. memcard-delay
fixed some of these issues in the EXI memcard code, but several other
problems still exist and I don't have the time to debug that right now.
The problem here was the logic that detects SDL in the main CMakeLists.txt
is not the same as it is in DolphinWX/CmakeLists.txt to set libraries. When
using SDL from Externals it failed at link time because -lSDL was never set.
This fixes the problem by using the same condition logic to set the libs
as used when detecting SDL in the first place.
This was not needed for most games before because the external exception was
itself delayed. aram-dma-fixes changed that and made the external exception
happen a lot quicker, breaking games that relied on the memcard operations
delay.
Fixes issue 5583.
Reason:
- It's wrong, zcomploc can't be emulated perfectly in HW backends without severely impacting performance.
- It provides virtually no advantages over the previous hack while introducing lots of code.
- There is a better alternative: If people insist on having some sort of valid zcomploc emulation, I suggest rendering each primitive separately while using a _clean_ dual-pass approach to emulate zcomploc.
This reverts commit 0efd4e5c29.
This reverts commit b4ec836aca.
This reverts commit bb4c9e2205.
This reverts commit 146b02615c.
The GL EFB cache did not clamp correctly the coordinates when computing
the rectangle it needed to cache, leading to negative values being used
as indexes and often crashes.
Fixes issue 5510.
This adds support for drivers supporting sine, square and triangle
periodic haptic effects. This allows rumble to work on devices/drivers
supporting these effects, such as an xbox controller using the xpad
driver under Linux.
Dolphin code already builds against SDL2 but the build system never
checks for SDL2, which is the what latest SDL is called now. SDL2
replaces SDL 1.3. This allows Dolphin to be build against SDL2, which
activates certain new features such as the haptic interface.
To use, install OpenVPN's TAP device driver. Then create a network bridge between the TAP and your device connected to the internet.
TODO:
proper overlapped read - can look at qemu impl
non-windows impl
MTMSR is executed.
This commits fixes issue 617. WWE Day of Reckoning 1 and 2 are now playable
with Dolphin.
The changes are not implemented for JitIL yet.
the intent is to replace the haphazard scheduling and finger-crossing associated with saving/loading with the correct and minimal necessary wait for each thread to reach a known safe location before commencing the savestate operation, and for any already-paused components to not need to be resumed to do so.
* misc-speedups:
fixed and reenabled and slightly optimized the JIT version of fcmpo/fcmpu.
slightly more precise speed percent display (this is really minor)
a small thread synchronization speedup for dual core mode. it's most noticeable in games where the CPU is running behind compared to the GPU.
Conflicts:
Source/Core/Core/Src/PowerPC/Jit64/Jit.cpp
The Fifo.cpp changes from rdaefb3b550e2 was not merged as there was no performance benefit.
This branch reduces the number of useless state flushes in the video
emulation layer by checking whether a BP/XF change will have an effect
or not. Greatly reduces the number of GL calls per frame.
Thanks to degasus for his help!
When a game writes the same value that was already configured to a BP
register, Dolphin previously flushed the GPU pipeline and reconfigured
the internal video state (calling SetScissor/SetLineWidth/SetDepthMode).
Some of these useless writes still need to perform actions, for example
writes to the EFB copy trigger or the texture preload registers (which
need to reload the texture from memory).
This allows users to easily check whether their Wii dump is corrupted or not
using the Dolphin properties window. Right click on a game, Properties,
Filesystem tab, then right click on the game partition and select "Check
partition integrity".
This may have some false negatives due to the unused clusters heuristic (see
the comment in VolumeWiiCrypted.cpp). False positives are unlikely.
* JIT-Exceptions:
JitIL code cleanup
Changed the JIT code to make the FPU exception timing more accurate. The exception is now triggered at the first FP instruction instead of the start of the block. Rearranged the JIT exception code for a tiny speed-up. Only external exceptions are checked at the end of the block. All other exceptions are checked at the time they occur.
Fixes issue 5382.
Conflicts:
Source/Core/Core/Src/PowerPC/Jit64/Jit_LoadStore.cpp
* AudioStreaming:
Reset the stream playing flag on init.
force VolumeDirectory to align files to 32KB (only streaming audio files really need to be aligned...)
Removed the DTK Music option. It is now always enabled.
Added the response for audio streaming disc offset requests. Generate an AI interrupt at the end of the audio streaming loop. Fixes Pac-man Fever and the background music in Eternal Darkness.
Fixed the erroneous looping in audio streaming games like Eternal Darkness and Zoids: Battle Legends. Thanks for the tip, tueidj.
Rearranged the JIT exception code for a tiny speed-up. Only external exceptions are checked at the end of the block. All other exceptions are checked at the time they occur.
some how I neglected to remember that r+ requires the file to exist.
still should fix the issue with 0 byte memory cards.
This reverts commit 6bfb8c9597.
This adds an "Analyzer" tab to the fifoplayer dialog which allows to conveniently browse through all register pokes that are being sent by the game each frame.
There's also a search function, but it doesn't work all that well for anything but simple searches at the moment. However, I'm merging this anyway since I'm not sure if I'm going to finish this.
Note that due to recent fifo changes, it's not yet possible to run fifoplayer in dual-core mode.
please test for regressions, speed and for other issues fixed, as a example, the black color in water splash in super mario galaxy are fixed with this rev.
please as soon as yo find a bug let me know.
a little code cleaning to avoid duplicated execution of AlphaPreTest and a little correction to some comments from the previous commits.
this change must behave exactly like last revision, if something is broken please let me know
The decided way to find Wii Remotes is by their bluetooth name, so
this patch introduces common code to identify if a given string is a
valid Wiimote name.
On Mac, when scanning bluetooth, consult the function with all found
bluetooth devices.
There are two ways to send commands to Wii Remotes:
- On command channel, with a first byte of 0x52. This works on
Nintendo RVL-CNT-01, but not Nintendo RVL-CNT-01-TR wiimotes.
- On interrupt channel, with a first byte of 0xa2. This works on
Nintendo RVL-CNT-01 and Nintendo RVL-CNT-01-TR wiimotes.
This patch switches Mac from the former to the latter. Windows and
Linux remain unchanged.
This reverts commit c9dfcf8cf7.
That commit attempted to support all Wii Remotes on Mac OS X, but the
logic was incorrect, and as a result the original (non-TR) Wii Remotes
were broken by that change.
Future patches will address this problem in a better way.
/fp:fast was introducing FP precision problems, and mixing it with some
/fp:precise code caused strange game behaviors, DSI exceptions and freezes.
This commit should fix most of the issues introduced by 3.0-73 (r95517a97).
Thanks to hatarumoroboshi@hotmail.com for tracking a lot of these Win32 bugs.
Fixes issue 4906.
Fixes issue 5138.
Probably fixes (not tested) issue 5067.
to marcosvitali.
Added an external exception check when the CPU writes to the FIFO. This allows
the CPU time to service FIFO overflows. Fixes random hangs caused by FIFO
overflows and desyncs like in "The Last Story" and "Battalion Wars 2". Thanks
to marcosvitali for the research.
Added some code to unlink invalidated blocks so that the recompiled block can be
linked (speed-up).
This release still fixed the hangs produced by fifo overflow without sacrifice
performance. For example you can test Tutorial moves at the beginning of The last history now
is fluid 30/60.
Fixed possibles random hangs in DC mode.
Fixed hangs in DC mode in (Simpsons, Monkey Island, Pokemon XD, etc)
Implemented accurate management of Pixel Engine Interrupts. Now the GPU loop
is stopped when a PE Interrupt needs to be managed and resumed when Pixel Engine
finish.
Fixed Metroid Prime 3 and 2 desync. And other games with desync because of
FIFO Reset. That happens because FIFO_RW_DISTANCE_HI must be written first, for checking
fifo.CPReadWriteDistance == 0, so some fifo resets was not managed in the right
way.
Fixed Super Monkey Ball in some cases when the game write the
WriteReadDistance need to be safe like the SafeCPRead.
Improved the CheckException for the GatherPipe writes in JIT, now only the
External Exceptions are processed.
Fixed definitely Pokemon XD in dual core mode. This game is doing something
not allowed. It attach to CPU the same fifo attached to the GPU in multibuffer
mode. I added a check to prevent overwrite the GPU FIFO with the CPU FIFO. If
the game do that on breakpoint the solution can fail.
Fixed ReadWriteDistance calc when CPRead > CPWrite.
Added Token and Finish cause to GP Jit checking.
Additional cleanup in CommandProcessor.
Fixes issue 5209
Fixes issue 5055
Fixes issue 4889
Fixes issue 4061
Fixes issue 4010
Fixes issue 3902
Don't use isascii() - just do it ourselves
Bump required wxw version (for shared libs)
There still seems to be linking issues on some linux distros, I can't reproduce it though...