So to compensate lets bring back some speed to the emulation.
change a little the way the vertex are send to the gpu,
This first implementation changes dx9 a lot and dx11 a little to increase the parallelism between the cpu and gpu.
ogl: is my next step in ogl is a little more trickier so i have to take a little more time.
the original concept is Marcos idea, with my little touch to make it even more faster.
what to look for: SPEEEEEDDD :).
please test it a lot and let me know if you see any problem.
in dx9 the code is prepared to fall back to the previous implementation if your card does not support the amount of buffers needed.
So if you did not experience any speed gains you know where is the problem :).
for the ones with more experience and compression of the code please test changing the amount and size of the buffers to tune this for your specific machine.
The current values are the sweet spot for my machine.
All must Thanks Marcos, I hate him for giving good ideas when I'm full of work.
Very useful to compare performance between two builds, check the impact of
a configuration option, etc. FPS log is stored in User/Logs/fps.txt and is
reset each time you launch a game. Only enabled if you check the "Log FPS
to file" option in your graphics settings.
Could be improved a bit: currently logs only every 1s (so you can't really
see small variations), maybe output more infos to the fps.txt like
average/stddev (but Excel/Libreoffice/Google Docs can compute that easily
too).
Reason:
- It's wrong, zcomploc can't be emulated perfectly in HW backends without severely impacting performance.
- It provides virtually no advantages over the previous hack while introducing lots of code.
- There is a better alternative: If people insist on having some sort of valid zcomploc emulation, I suggest rendering each primitive separately while using a _clean_ dual-pass approach to emulate zcomploc.
This reverts commit 0efd4e5c29.
This reverts commit b4ec836aca.
This reverts commit bb4c9e2205.
This reverts commit 146b02615c.
The GL EFB cache did not clamp correctly the coordinates when computing
the rectangle it needed to cache, leading to negative values being used
as indexes and often crashes.
Fixes issue 5510.
the intent is to replace the haphazard scheduling and finger-crossing associated with saving/loading with the correct and minimal necessary wait for each thread to reach a known safe location before commencing the savestate operation, and for any already-paused components to not need to be resumed to do so.
please test for regressions, speed and for other issues fixed, as a example, the black color in water splash in super mario galaxy are fixed with this rev.
please as soon as yo find a bug let me know.
to marcosvitali.
Added an external exception check when the CPU writes to the FIFO. This allows
the CPU time to service FIFO overflows. Fixes random hangs caused by FIFO
overflows and desyncs like in "The Last Story" and "Battalion Wars 2". Thanks
to marcosvitali for the research.
Added some code to unlink invalidated blocks so that the recompiled block can be
linked (speed-up).
This release still fixed the hangs produced by fifo overflow without sacrifice
performance. For example you can test Tutorial moves at the beginning of The last history now
is fluid 30/60.
Fixed possibles random hangs in DC mode.
Fixed hangs in DC mode in (Simpsons, Monkey Island, Pokemon XD, etc)
Implemented accurate management of Pixel Engine Interrupts. Now the GPU loop
is stopped when a PE Interrupt needs to be managed and resumed when Pixel Engine
finish.
Fixed Metroid Prime 3 and 2 desync. And other games with desync because of
FIFO Reset. That happens because FIFO_RW_DISTANCE_HI must be written first, for checking
fifo.CPReadWriteDistance == 0, so some fifo resets was not managed in the right
way.
Fixed Super Monkey Ball in some cases when the game write the
WriteReadDistance need to be safe like the SafeCPRead.
Improved the CheckException for the GatherPipe writes in JIT, now only the
External Exceptions are processed.
Fixed definitely Pokemon XD in dual core mode. This game is doing something
not allowed. It attach to CPU the same fifo attached to the GPU in multibuffer
mode. I added a check to prevent overwrite the GPU FIFO with the CPU FIFO. If
the game do that on breakpoint the solution can fail.
Fixed ReadWriteDistance calc when CPRead > CPWrite.
Added Token and Finish cause to GP Jit checking.
Additional cleanup in CommandProcessor.
Fixes issue 5209
Fixes issue 5055
Fixes issue 4889
Fixes issue 4061
Fixes issue 4010
Fixes issue 3902
For example you can test Tutorial moves at the beginning of The last history now is fluid 30/60.
Shuffle2: I've delete the hacky line, I think is not necessary anymore. Additional some clean in CommandProcessor.
Please test The Last Story and others games affected in the previous commits and give me a feedback.