In FB_ONLY mode the alpha test impacts (discard) only the depth value.
If there is no depth buffer, we don't care about depth write. So alpha
test is useless and we can do the draw with a single draw call and no program
switch
Extend GSVector to support float move
Initial code likely used integer move for performance reason. However due to
the nan correction, register is now in float domain.
CID 168626 (#1 of 1): Uninitialized scalar field
uninit_member: Non-static class member m_end_block is not initialized in this constructor nor in any functions that it calls.
Fix rendering issue on letters on Kengo/burnout 3/...
Default algo will execute the alpha test in 2 passes. However due to blending
you can't handle accurately the color.
Fortunately for us, the rendering uses an always pass depth test so you
can execute first all the color rendering (which doesn't depends on the alpha test)
And then the depth part which depends on the alpha test.
* Code was factorized a bit with the help of max_z
* Add an extra optimization if test is ZTST_GEQUAL and min z value is
the biggest value. Z test will always be pass.
Note: due to float rounding (23 bits mantissa vs 24 bits depth) the test
is done against 0xFF_FFFE and not 0xFF_FFFF. It is wrong but GPU will
also use float so impact will be null.
The solution files are unused and for ancient Visual Studio versions -
GSDumpGUI has its own solution file, and bin2cpp is included in the main
solution file.
The property sheets have either fallen out of use or were never used in
the first place.
Fixes a regression introduced by 46ba9aa117,
where the Linux GS replayer would always use the options in inis/GSdx.ini
(or use the default options if that doesn't exist) to replay the dump,
instead of using the GSdx.ini from the specified ini folder.
Update done on f712c5c6d0
Previous code use the size of the draw to compute latest block. I
don't know why I use .x/.y which are the origin offset so the start of the block.
Check the instruction set first in GPUinit, GPUconfigure and GPUtext
to prevent unsupported vector instructions from being executed.
Move the vector initialisation in GPUinit to a separate function - it
avoids a vzeroupper instruction.
vector push_back causes a SIGILL signal on a Nehalem (SSE4.2) QEMU VM
when compiled with GCC 6.1.1.
However, an empty constructor causes illegal instruction exceptions to be
generated on a Windows VM.
So here's an inbetween that looks stupid but works on what I've tested.
Fix#1457 (GTA)
The game uses a depth format for a pure color buffer (cokes do ravage
in gaming industry)
However I'm really afraid that it migth break another effect in other games.
Combine all the different configurations together so the project files
are more generic and maintainable.
Also standardise the layout so all the project files will be similar and
all have the same standard elements (even if empty).
Add 64-bit configurations.
Additional specifics:
spu2-x: FLOAT_SAMPLES preprocessor definition removed since it's unused.
Use a relaxed atomic to read the exit variable in the hot path
Wait that exit is deasserted in the destructor, so we are sure the
thread will "soon" return
Value could range from 1 to 9. Default is 4 and it is potentially the
best option. Feel free to test some values on your system, behavior
might depends on the core number and thread number
Value is exponential so 4 is 2 times more pixels than 3.
Small value increased thread overhead, big value increase wait/sync latency
memory overhead by thead is only 256KB
However it will reduce the probability to block the push thread to nearly 0
I tested a couple of dumps and only manage 4000 element with 1 extrathread.
Add a factor 2 on the VRAM to get the quantity of available memory for the textures.
The driver is allowed to put some textures in RAM. Of course it is bad for performance
but it won't crash.
Due to the 4GB by process limit, I keep a (reasonable) maximum of 3.8GB.
In order to avoid a crash when memory is too low an exception will be risen
with no guarantee on rendering and big performance impact. In this situation
you ought to reduce upscaling/disable large framebuffer.
* Does the first vsync (start counter) after the sleep
* Dump data after the rendering, avoid to count extra destructor,sleep time
* Dump data into a basic csv file (if people want nice graph)
In file included from GSRenderer.cpp:23:0:
GSRenderer.h: In constructor ‘GSRenderer::GSRenderer()’:
GSRenderer.h:58:12: warning: ‘GSRenderer::m_dev’ will be initialized after [-Wreorder]
GSDevice* m_dev;
^
GSRenderer.h:52:13: warning: ‘GSVector2i GSRenderer::m_real_size’ [-Wreorder]
GSVector2i m_real_size;
^
GSRenderer.cpp:32:1: warning: when initialized here [-Wreorder]
GLLoader: cast passed parameters to required type.
GSDeviceOGL: cast variables to required type and silence warnings.
GSRendererOGL: cast variables to required type and silence warnings.
It creates some slowdowns for unknown reason. My best hypothesis is
that stencil will be cleared too which is slow.
Let's keep the code for the future when stencil will be dropped.
Fix#1420
If baseline and display rectangle offsets differ by small values then consider the status of frame memory offsets, prevents blurring on Tenchu: Fatal Shadows, Worms 3D, ProStroke Golf, Vexx
* Improve frame buffer height management on custom resolution. Width seems to be fine with the same size as scaled image output.
* Prevent offset issues on Persona 3 based on the data from merge circuit.
Note: Fixes custom resolution upscaling on ICO 50Hz/60Hz mode when large frame buffer is enabled. previously 60Hz mode only displayed half of the screen and 50Hz mode only worked due to the scissor hack.
* Ignore Frame memory offsets for calculating dimensions value of display rectangle.
* Remove hack which limited scaling size based on the scissor value.
Note: With the following commit, SilentHill 2 now properly outputs the desired resolution by the users on custom resolution. Previously if we set 1024 x 1024 , it'll output a lower height value which was caused by the hack removed on this commit.
It ought to be the same in performance but code will be easier this way
v2: print the sync status
v3: use a performance print so it doesn't spam the console