So to compensate lets bring back some speed to the emulation.
change a little the way the vertex are send to the gpu,
This first implementation changes dx9 a lot and dx11 a little to increase the parallelism between the cpu and gpu.
ogl: is my next step in ogl is a little more trickier so i have to take a little more time.
the original concept is Marcos idea, with my little touch to make it even more faster.
what to look for: SPEEEEEDDD :).
please test it a lot and let me know if you see any problem.
in dx9 the code is prepared to fall back to the previous implementation if your card does not support the amount of buffers needed.
So if you did not experience any speed gains you know where is the problem :).
for the ones with more experience and compression of the code please test changing the amount and size of the buffers to tune this for your specific machine.
The current values are the sweet spot for my machine.
All must Thanks Marcos, I hate him for giving good ideas when I'm full of work.
PS and VS making. Untested and won't work for now.
Add in program shader cache files.
Readd NativeVertexFormat stuffs.
Add in PS and VS cache things.
SetShaders in places.
Fixed EFB cache index computations in OpenGL renderer.
The previous computation was very likely to go out of array bounds,
which could result in crashes on EFB access.
Also, the cache size was rounded down instead of up. This is a problem
since EFB_HEIGHT (528) is not a multiple of EFB_CACHE_RECT_SIZE (64).
Very useful to compare performance between two builds, check the impact of
a configuration option, etc. FPS log is stored in User/Logs/fps.txt and is
reset each time you launch a game. Only enabled if you check the "Log FPS
to file" option in your graphics settings.
Could be improved a bit: currently logs only every 1s (so you can't really
see small variations), maybe output more infos to the fps.txt like
average/stddev (but Excel/Libreoffice/Google Docs can compute that easily
too).
Reason:
- It's wrong, zcomploc can't be emulated perfectly in HW backends without severely impacting performance.
- It provides virtually no advantages over the previous hack while introducing lots of code.
- There is a better alternative: If people insist on having some sort of valid zcomploc emulation, I suggest rendering each primitive separately while using a _clean_ dual-pass approach to emulate zcomploc.
This reverts commit 0efd4e5c29.
This reverts commit b4ec836aca.
This reverts commit bb4c9e2205.
This reverts commit 146b02615c.
the intent is to replace the haphazard scheduling and finger-crossing associated with saving/loading with the correct and minimal necessary wait for each thread to reach a known safe location before commencing the savestate operation, and for any already-paused components to not need to be resumed to do so.
This branch reduces the number of useless state flushes in the video
emulation layer by checking whether a BP/XF change will have an effect
or not. Greatly reduces the number of GL calls per frame.
Thanks to degasus for his help!
When a game writes the same value that was already configured to a BP
register, Dolphin previously flushed the GPU pipeline and reconfigured
the internal video state (calling SetScissor/SetLineWidth/SetDepthMode).
Some of these useless writes still need to perform actions, for example
writes to the EFB copy trigger or the texture preload registers (which
need to reload the texture from memory).
This adds an "Analyzer" tab to the fifoplayer dialog which allows to conveniently browse through all register pokes that are being sent by the game each frame.
There's also a search function, but it doesn't work all that well for anything but simple searches at the moment. However, I'm merging this anyway since I'm not sure if I'm going to finish this.
Note that due to recent fifo changes, it's not yet possible to run fifoplayer in dual-core mode.
please test for regressions, speed and for other issues fixed, as a example, the black color in water splash in super mario galaxy are fixed with this rev.
please as soon as yo find a bug let me know.
a little code cleaning to avoid duplicated execution of AlphaPreTest and a little correction to some comments from the previous commits.
this change must behave exactly like last revision, if something is broken please let me know