* Avoid the generation of memory barrier (mfence)
* Based on the fact that it used to work on previous code without any
barrier
v2:
* keep basic code in reset path
* use relaxed access for isBusy. The variable doesn't carry any load/store
dependency; it is only a hint to optimize the semaphore post
The read pointer is only written by the MTGS thread, so the MTGS thread
can read it with relaxed ordering
The write pointer is only written by the EE thread, so the EE thread
can read it with relaxed ordering
Remove volatile stuff
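
The pointer ownership rule above can be sketched as a small single-producer/single-consumer ring. This is only an illustration: the class SpscRing, its members, and the int payload are hypothetical and are not the MTGS code. Each thread reads the index it owns with relaxed ordering and reads the other thread's index with acquire ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-producer/single-consumer ring (not the MTGS code).
class SpscRing {
public:
    explicit SpscRing(std::size_t capacity)
        : m_data(capacity), m_read(0), m_write(0) {
        assert(capacity >= 2); // one slot always stays empty
    }

    // Producer thread only.
    bool push(int value) {
        // The producer is the only writer of m_write: a relaxed load is enough.
        const std::size_t w = m_write.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1) % m_data.size();
        // m_read is written by the consumer: acquire pairs with its release.
        if (next == m_read.load(std::memory_order_acquire))
            return false; // full
        m_data[w] = value;
        m_write.store(next, std::memory_order_release); // publish the slot
        return true;
    }

    // Consumer thread only.
    bool pop(int& value) {
        // The consumer is the only writer of m_read: a relaxed load is enough.
        const std::size_t r = m_read.load(std::memory_order_relaxed);
        if (r == m_write.load(std::memory_order_acquire))
            return false; // empty
        value = m_data[r];
        m_read.store((r + 1) % m_data.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<int> m_data;
    std::atomic<std::size_t> m_read;
    std::atomic<std::size_t> m_write;
};
```

None of these loads and stores uses sequentially consistent ordering, so on x86 a compiler is not expected to emit mfence for them.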
* Avoid the generation of memory barrier (mfence)
* Based on the fact that it used to work on previous code without any
barrier
v2:
* keep "slow" and safe atomic on reset path
* use relaxed atomic for m_RingBufferIsBusy
The signal doesn't carry any dependency with other loads/stores. Instead it is
used as a hint to post the semaphore
As an aside, there is a potential threading bug
T1 does
* wait sema
* busy = true;
* while (!queue.empty) do work...
* busy = false;
* go back to wait sema
T2 does
* post sema only if busy is false
If T1 stops after the while loop on the queue but before clearing busy, T2
won't post the event. When T1 resumes, it will block on the semaphore
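
The interleaving described above can be replayed deterministically in a single-threaded model. This is a sketch under assumptions: the Model struct and every function name are hypothetical, and T2 is modelled as skipping the post whenever it sees busy set.

```cpp
#include <cassert>

// Single-threaded model of the interleaving; all names are hypothetical.
struct Model {
    int  sema  = 0;     // semaphore count
    bool busy  = false; // hint flag written by T1
    int  queue = 0;     // pending work items
};

// T2: enqueue one item and post the semaphore only when T1 looks idle.
void t2_push(Model& m) {
    ++m.queue;
    if (!m.busy)
        ++m.sema;
}

// T1, first half: take the semaphore, mark busy, drain the queue.
void t1_wake_and_drain(Model& m) {
    assert(m.sema > 0); // T1 only runs after a post
    --m.sema;
    m.busy = true;
    m.queue = 0; // stands for "while (!queue.empty) do work..."
}

// T1, second half: clear busy and go back to waiting on the semaphore.
// Returns true if T1 would now block while work is still queued.
bool t1_finish_and_check_stuck(Model& m) {
    m.busy = false;
    return m.sema == 0 && m.queue > 0;
}

// Replays the problematic order of events.
bool replay_lost_wakeup() {
    Model m;
    t2_push(m);           // T2 posts: sema becomes 1
    t1_wake_and_drain(m); // T1 drains the queue, busy is still true
    t2_push(m);           // T2 sees busy == true and skips the post
    return t1_finish_and_check_stuck(m);
}
```

In this model replay_lost_wakeup() returns true: one item is queued, the semaphore count is 0, and T1 stays blocked until some later post arrives.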
Use a relaxed atomic to read the exit variable in the hot path
Wait in the destructor until exit is deasserted, so we are sure the
thread will "soon" return
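
A minimal sketch of this exit handshake, assuming the thread itself clears the flag just before returning. The Worker class and its members are hypothetical names, not the actual code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker illustrating the exit handshake.
class Worker {
public:
    Worker() : m_exit(false), m_thread(&Worker::run, this) {}

    ~Worker() {
        assert(m_thread.joinable());
        m_exit.store(true); // assert the exit request
        // Wait until the thread deasserts the flag: it has left the hot loop
        // and is about to return.
        while (m_exit.load())
            std::this_thread::yield();
        m_thread.join();
    }

private:
    void run() {
        // Hot path: the flag carries no data dependency, so relaxed is enough.
        while (!m_exit.load(std::memory_order_relaxed))
            std::this_thread::yield(); // stand-in for real work
        m_exit.store(false); // acknowledge the request
    }

    // Declared before m_thread so it is initialized before the thread starts.
    std::atomic<bool> m_exit;
    std::thread m_thread;
};

// Starts and stops one worker; returns once the destructor has completed.
bool start_and_stop_worker() {
    { Worker w; }
    return true;
}
```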
Value can range from 1 to 9. Default is 4 and it is potentially the
best option. Feel free to test some values on your system; behavior
might depend on the number of cores and threads
Value is exponential, so 4 is 2 times more pixels than 3.
Small values increase thread overhead, big values increase wait/sync latency
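
As a quick check of the scaling, here is a sketch that assumes the pixel count grows as a power of two of the setting, which matches "4 is 2 times more pixels than 3". The helper name relative_pixels is hypothetical and the result is only a relative figure, not an actual pixel count.

```cpp
#include <cassert>

// Hypothetical helper: relative number of pixels for a given setting,
// assuming the setting is a base-2 exponent.
unsigned relative_pixels(int value) {
    assert(value >= 1 && value <= 9); // range stated in the message
    return 1u << value;
}
```

With this assumption relative_pixels(4) is twice relative_pixels(3), and the largest setting (9) is 256 times the smallest (1).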
Memory overhead per thread is only 256KB
However it will reduce the probability of blocking the push thread to nearly 0
I tested a couple of dumps and only managed to reach 4000 elements with 1 extra thread.
Apply a factor of 2 to the VRAM size to get the amount of memory available for textures.
The driver is allowed to put some textures in RAM. Of course it is bad for performance
but it won't crash.
Due to the 4GB per-process limit, I keep a (reasonable) maximum of 3.8GB.
In order to avoid a crash when memory is too low, an exception will be raised,
with no guarantee on rendering and a big performance impact. In this situation
you ought to reduce upscaling or disable the large framebuffer.
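
The budget rule can be sketched as follows. The function names, the reading of "3.8GB" as 3.8 GiB, and the use of std::runtime_error are assumptions for illustration, not the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>

constexpr std::uint64_t GiB = 1024ull * 1024ull * 1024ull;
// Assumption: "3.8GB" is interpreted as 3.8 GiB.
constexpr std::uint64_t kMaxBudget = GiB * 38 / 10;

// Texture budget: twice the VRAM size, capped below the 4GB process limit.
std::uint64_t texture_budget(std::uint64_t vram_bytes) {
    assert(vram_bytes <= 64 * GiB); // keeps 2 * vram_bytes far from overflow
    return std::min(2 * vram_bytes, kMaxBudget);
}

// Throws instead of letting an oversized allocation crash later.
void check_allocation(std::uint64_t used, std::uint64_t request,
                      std::uint64_t budget) {
    if (used + request > budget)
        throw std::runtime_error("texture memory budget exceeded");
}
```

A 1 GiB card gets a 2 GiB budget under this sketch, while a 4 GiB card is clamped to the cap.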
* Fixes a bug where the NTSC video mode was automatically used when the video mode was uninitialized. The bug was only temporary, lasting until the SMODE register was written.
Technically there's no term called "RegionMode", and the values obtained through the SMODE1 register are actually used to identify the video mode of the game, not any region mode.
* Convert "GS_VideoMode" into an enum class
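
A sketch of what a scoped enum with an explicit uninitialized state can look like. Only the name GS_VideoMode comes from the message above; the enumerator names, the VideoState struct, and describe() are assumptions for illustration.

```cpp
#include <cassert>

// Enumerator names are assumptions; only the type name is from the message.
enum class GS_VideoMode : int {
    Uninitialized,
    NTSC,
    PAL,
};

// Hypothetical holder: the default is Uninitialized, never silently NTSC.
struct VideoState {
    GS_VideoMode mode = GS_VideoMode::Uninitialized;
};

// Scoped enumerators do not convert implicitly to int, so every use has to
// name the mode explicitly.
const char* describe(GS_VideoMode mode) {
    switch (mode) {
        case GS_VideoMode::Uninitialized: return "uninitialized";
        case GS_VideoMode::NTSC:          return "NTSC";
        case GS_VideoMode::PAL:           return "PAL";
    }
    assert(false && "unknown video mode");
    return "";
}
```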
* Do the first vsync (start counter) after the sleep
* Dump data after the rendering, to avoid counting extra destructor and sleep time
* Dump data into a basic CSV file (if people want nice graphs)
There is only a single event queue, so you need to detect the pad based
on the configuration
Mouse/Wiimote is limited to the first pad
Related to issue #1441
In file included from GSRenderer.cpp:23:0:
GSRenderer.h: In constructor ‘GSRenderer::GSRenderer()’:
GSRenderer.h:58:12: warning: ‘GSRenderer::m_dev’ will be initialized after [-Wreorder]
GSDevice* m_dev;
^
GSRenderer.h:52:13: warning: ‘GSVector2i GSRenderer::m_real_size’ [-Wreorder]
GSVector2i m_real_size;
^
GSRenderer.cpp:32:1: warning: when initialized here [-Wreorder]
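
For reference, -Wreorder fires when the constructor's initializer list names members in a different order than their declarations; members are always initialized in declaration order. A minimal sketch of the fixed pattern, using stand-in types (Size2i, Device, Renderer) rather than the real GSVector2i, GSDevice, and GSRenderer:

```cpp
#include <cassert>

struct Size2i { int x, y; }; // stand-in for GSVector2i
struct Device {};            // stand-in for GSDevice

class Renderer {
public:
    // Initializers listed in declaration order: m_real_size, then m_dev.
    Renderer() : m_real_size{0, 0}, m_dev(nullptr) {}

    void set_device(Device* dev) {
        assert(dev != nullptr);
        m_dev = dev;
    }
    bool has_device() const { return m_dev != nullptr; }
    int width() const { return m_real_size.x; }

private:
    Size2i m_real_size; // declared first, so initialized first
    Device* m_dev;      // declared second
};
```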
Grouping bytes in debugger memory window, following pointers and history
Goto in register view
Printing strings in pointer registers
Memory view can be resized correctly by ctrl+wheel
Improvement in function 5900DebugInterface::isValidAddress
usb-kbd: Remove unused variable.
usb-ohci: Add proper casts for the variables.
vl: Add proper casts for the variables.
USB: Add proper casts for the variables.
GLLoader: cast passed parameters to required type.
GSDeviceOGL: cast variables to required type and silence warnings.
GSRendererOGL: cast variables to required type and silence warnings.
x86emitter: Convert variable type from u8 to bool.
recVTLB: Cast "sign" to bool to prevent a warning.
R5900OpcodeImpl: Cast all the values in array to u64 instead of s64.
It creates some slowdowns for an unknown reason. My best hypothesis is
that the stencil will be cleared too, which is slow.
Let's keep the code for the future, when stencil will be dropped.
Fix #1420
If the baseline and display rectangle offsets differ by small values, then consider the status of the frame memory offsets. This prevents blurring in Tenchu: Fatal Shadows, Worms 3D, ProStroke Golf, and Vexx
* Improve frame buffer height management on custom resolution. Width seems to be fine with the same size as scaled image output.
* Prevent offset issues on Persona 3 based on the data from merge circuit.
Note: Fixes custom resolution upscaling on ICO 50Hz/60Hz mode when large frame buffer is enabled. Previously, 60Hz mode only displayed half of the screen, and 50Hz mode only worked due to the scissor hack.
The TLS buffers used by the FastFormatUnicode and FastFormatAscii
classes seem to be responsible for PCSX2 not terminating properly on
Windows under certain conditions (using MTVU before commit
1111e03901, using CDVDgigaherz without a
disc, possibly other conditions).
When PCSX2 shuts down and the FastFormatBuffers are being cleaned up,
the call to pthread_key_delete() would end up calling
WaitForSingleObject(e, INFINITE) and waiting indefinitely for an event
to trigger. It never does get triggered (for reasons unknown) and
therefore PCSX2 doesn't terminate properly.
Remove the usage of TLS buffers in the FastFormatString classes - it
fixes the termination issue on Windows and doesn't seem to have much
effect on performance.