The code is now a mirror of the ::add. So 1 insert == 1 erase
This way it won't crash on future update. And it will support future GS
memory wrapping improvement.
ZoE2:
RemoveAt overhead plummet to 0.5%. It was 17% !
However insertion is a bit slower. Due to the begin() after the push_front
v2: use std:: for lists and arrays
Removes Alpine Racer 3 hack. Issue has been resolved.
Moves NanoBreaker hack. Issue has been resolved for OpenGL and hack has
been moved to DX only.
Moves Tri-Ace games hacks. Hacks are also necessary for OpenGL with "Partial" CRC Hack Level to prevent massive slowdown.
Move Tales Of Legendia hack back as it's also necessary for OpenGL with "Partial" CRC Hack Level to prevent graphical issues.
Close: https://github.com/PCSX2/pcsx2/issues/1698
Added PAL and NTSC-U CRC's for Ar tonelico II.
Unfortunately it requires at least GCC6. If a nice guy can check the generated code on GCC6.
I don't know clang status.
Here the only example, I have found on the web
https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/
Current generated code in GSTextureCache::SourceMap::Add
38b3: bsf eax,esi
38b6: add esp,0x10
38b9: test esi,esi
38bb: jne 387e <GSTextureCache::SourceMap::Add(GSTextureCache::Source*, GIFRegTEX0 const&, GSOffset*)+0x6e>
BSF already set the Z flag when input (esi) is 0. So it would be better
to not put a silly add before the jump and to skip the test operation.
OMG, Zone of Ender got a speed boost from 11 fps to 45 fps
Seriously, the goal is to allow benchmarking GSdx without too much overhead of the main renderer draw call
Note: unlike the null renderer, texture/vertex uploading, 2D draw, texture conversions are still done.
* move the post-processing frame into the OSD tab
* Rename Global Settings to Renderer Settings
* put monitor and indicator check box on the same line
At least we have a similar number of options by tab
This reverts commit f77c1900fa.
Conflicts:
plugins/GSdx/GSTextureCache.cpp
Another fix was done later for Jak cut scene (or FMV). One game got a regression (don't remember which)
It's only ever updated after the queue is updated, so its state will
always lag slightly behind it. It's sufficient to just use empty().
This seems to fix some caching issues that were noticeable on Skylake
CPUs (#998).
In the previous code, the worker thread would notify the MTGS thread
while the mutex is still locked, which could cause the MTGS thread to
wake up and immediately go back to sleep again since it can't lock the
mutex.
Use a separate mutex for waiting, which avoids the issue.
Some PSX games seem to store image data of the drawing results in an undeterminate area out of range from the current context buffer. At such cases, calculate the height of both the frame memory rectangles combined.
What happens on "Crash bash" -
* At first draw, scissoring is limited to SCAY0- 0 & SCAY1- 255
* At second draw, scissoring is limited to SCAY0- 255 & SCAY0-511
Previously, we limited the height to the value of one single output texture, so instead of that let's calculate the total height of both the two buffers combined to prevent such issues.
Previously, we only calculated the width of a single output circuit which lead to missing a single pixel from the other output circuit which in turn causes offset issues in Persona games, I have customized GetDisplayRect() to now also calculate the dimensions of the merged rectangle when both the output circuits are enabled through the PMODE register, so this hack is no longer needed. :)
TL;DR - The above commit of mine accurately handles the offset issues by calculating union of the rects, removing this stupid hack. (not insulting any other developers, this stupid hack was mine :)
Passes the merged output circuit as the base size for texture cache scaling code. Helps fixing scaling issues where games use both of the output circuits for rendering.
Future Note: Alter the behavior of IsEnabled() check always preferring the second output circuit for some weird reason. I plan on changing it to a better auto-output circuit selection mechanism but that could probably be done some time in the future.
* Upgrade the counter to signed 32 bits. 16 bits is too small to contains the 64K value.
* Read ThreadProc/m_count when the mutex is locked
* Use old value of the fetch instead to read back the new value