Unfortunately it requires at least GCC6. If a nice guy can check the generated code on GCC6.
I don't know clang status.
Here the only example, I have found on the web
https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/
Current generated code in GSTextureCache::SourceMap::Add
38b3: bsf eax,esi
38b6: add esp,0x10
38b9: test esi,esi
38bb: jne 387e <GSTextureCache::SourceMap::Add(GSTextureCache::Source*, GIFRegTEX0 const&, GSOffset*)+0x6e>
BSF already set the Z flag when input (esi) is 0. So it would be better
to not put a silly add before the jump and to skip the test operation.
OMG, Zone of Ender got a speed boost from 11 fps to 45 fps
Seriously, the goal is to allow benchmarking GSdx without too much overhead of the main renderer draw call
Note: unlike the null renderer, texture/vertex uploading, 2D draw, texture conversions are still done.
* move the post-processing frame into the OSD tab
* Rename Global Settings to Renderer Settings
* put monitor and indicator check box on the same line
At least we have a similar number of options by tab
This reverts commit f77c1900fa.
Conflicts:
plugins/GSdx/GSTextureCache.cpp
Another fix was done later for Jak cut scene (or FMV). One game got a regression (don't remember which)
It's only ever updated after the queue is updated, so its state will
always lag slightly behind it. It's sufficient to just use empty().
This seems to fix some caching issues that were noticeable on Skylake
CPUs (#998).
In the previous code, the worker thread would notify the MTGS thread
while the mutex is still locked, which could cause the MTGS thread to
wake up and immediately go back to sleep again since it can't lock the
mutex.
Use a separate mutex for waiting, which avoids the issue.
Some PSX games seem to store image data of the drawing results in an undeterminate area out of range from the current context buffer. At such cases, calculate the height of both the frame memory rectangles combined.
What happens on "Crash bash" -
* At first draw, scissoring is limited to SCAY0- 0 & SCAY1- 255
* At second draw, scissoring is limited to SCAY0- 255 & SCAY0-511
Previously, we limited the height to the value of one single output texture, so instead of that let's calculate the total height of both the two buffers combined to prevent such issues.
Previously, we only calculated the width of a single output circuit which lead to missing a single pixel from the other output circuit which in turn causes offset issues in Persona games, I have customized GetDisplayRect() to now also calculate the dimensions of the merged rectangle when both the output circuits are enabled through the PMODE register, so this hack is no longer needed. :)
TL;DR - The above commit of mine accurately handles the offset issues by calculating union of the rects, removing this stupid hack. (not insulting any other developers, this stupid hack was mine :)
Passes the merged output circuit as the base size for texture cache scaling code. Helps fixing scaling issues where games use both of the output circuits for rendering.
Future Note: Alter the behavior of IsEnabled() check always preferring the second output circuit for some weird reason. I plan on changing it to a better auto-output circuit selection mechanism but that could probably be done some time in the future.
* Upgrade the counter to signed 32 bits. 16 bits is too small to contains the 64K value.
* Read ThreadProc/m_count when the mutex is locked
* Use old value of the fetch instead to read back the new value
Avoid reading past the end of the disk.
Avoid waiting when there are prefetches remaining.
Fix the maths so that the first prefetch after a request attempts to
read the next block of sectors and not the block of sectors that was
just read (which will just be skipped anyway because the data has just
been cached).
Avoid potential prefetch after disk is swapped (though disc swap doesn't
work properly if you just eject and insert a different disk).
Stop prefetching on disk read failure (Suikoden hits this case - 2048
byte reads are requested, but only 2352 byte reads will succeed).
Also reduce the read retry count to 2.
Use a reinterpret_cast instead of casting the function pointer address
to a void** and dereferencing it.
Also remove an unnecessary (void) and avoid including stdafx.h.
Don't adjust 'image' and just use an additional offset.
'success' was kinda unnecessary when true or false could just be
directly returned.
Move 'compression' clamping out to GSPng::Save instead.
And throw in a whole bunch of const for good measure.
Adds a device select option that hides bindings and disables binding new
inputs from all non-selected devices on the bindings list. This also
avoids input conflict issues when one controller is recognized as
several devices through different APIs.
Updates the UI by reducing the height of the plugin window. This has
been achieved by removing some buttons below the diagnostics and
bindings list and incorporating those functions into the
lists(accessible by right-clicking in the list). The binding
configurations on the Pad tabs have been moved to a separate page, like
the Forcefeedback bindings, to separate the configuration from the
bindings.
Adds a skip deadzone option to the Pad tabs.
With the normal deadzone, if the control input value is below the
deadzone threshold, the input is ignored.
However, some controllers also benefit from shortening the input range
by skipping a deadzone.
JMMT uses a bigger display height on NTSC progressive scan mode, which is not really unusual hence adjust the saturation hack to only take effect on interlaced NTSC mode.
However, the whole double screen issue on FMV still exists. As a bit of information, this game has the second output disabled but seems to have some valid data inside of it, maybe the second output data is leaked into the first one? most likely a bug in the frambuffer data management rather than a CRTC issue (needs to be investigated)
Previously, the height of the frame offset was also considered for the total height of the texture which was obviously wrong as the portion before the offset value isn't part of the frame memory.
In the previous code, the threads were created and destroyed in the base
class constructor and destructor, so the threads could potentially be
active while the object is in a partially constructed or destroyed state.
The thread however, relies on a virtual function to process the queue
items, and the vtable might not be in the desired state when the object
is partially constructed or destroyed.
This probably only matters during object destruction - no items are in
the queue during object construction so the virtual function won't be
called, but items may still be queued up when the destructor is called,
so the virtual function can be called. It wasn't an issue because all
uses of the thread explicitly waited for the queues to be empty before
invoking the destructor.
Adjust the constructor to take a std::function parameter, which the
thread will use instead to process queue items, and avoid inheriting
from the GSJobQueue class. This will also eliminate the need to
explicitly wait for all jobs to finish (unless there are other external
factors, of course), which would probably make future code safer.