Also, some tab/space mismatches removed from VideoOGL, and some places I missed in VideoDX[number] projects.
Now, the Core is literally the only project with tab/space mismatches (on a large scale).
See Render.cpp, PixelShaderGen.cpp, and PixelShaderManager.cpp for most of the changes.
See VertexShaderManager.cpp for a logging typo fix.
See SWRenderer.cpp for a small typo fix for a message that gets swprintf'd in DrawDebugText.
See SWVertexLoader.cpp for a typo fix of an assert message.
Should slightly improve the readability of some of those files.
Sorry for a direct commit to the main branch but i need fast feedback, and i don't want to leave problematic code in the main branch for a long time.
if this approach does not work for the drivers with problems will transform dual source blend to an option in the D3D9 backend.
I appreciate the help of the people that tested my last commit and thanks to neobrain for pointing this solution.
this implementation does not work in windows xp (sorry no support for dual source blending there).
this should improve speed on older hardware or in newer hardware using super sampling.
disable partial fix for 4x supersampling as I'm interested in knowing the original issue with the implementation to fix it correctly.
remove the deprecation label from the plugin while I'm working on it.
Conflicts:
Source/Core/VideoCommon/Src/PixelShaderGen.cpp
Source/Plugins/Plugin_VideoSoftware/Src/SWRenderer.cpp
immediate-removal is a new created branch seperated from master but reverted the revert of immediate-removal
so we get less conflicts by merging
the amount and size of the buffer is now changed to "new hardware" frienly values and will fall back to the right values if hardware does not support them.
my next commit will be to a branch, with my ogl work.
So to compensate lets bring back some speed to the emulation.
change a little the way the vertex are send to the gpu,
This first implementation changes dx9 a lot and dx11 a little to increase the parallelism between the cpu and gpu.
ogl: is my next step in ogl is a little more trickier so i have to take a little more time.
the original concept is Marcos idea, with my little touch to make it even more faster.
what to look for: SPEEEEEDDD :).
please test it a lot and let me know if you see any problem.
in dx9 the code is prepared to fall back to the previous implementation if your card does not support the amount of buffers needed.
So if you did not experience any speed gains you know where is the problem :).
for the ones with more experience and compression of the code please test changing the amount and size of the buffers to tune this for your specific machine.
The current values are the sweet spot for my machine.
All must Thanks Marcos, I hate him for giving good ideas when I'm full of work.
Very useful to compare performance between two builds, check the impact of
a configuration option, etc. FPS log is stored in User/Logs/fps.txt and is
reset each time you launch a game. Only enabled if you check the "Log FPS
to file" option in your graphics settings.
Could be improved a bit: currently logs only every 1s (so you can't really
see small variations), maybe output more infos to the fps.txt like
average/stddev (but Excel/Libreoffice/Google Docs can compute that easily
too).
Reason:
- It's wrong, zcomploc can't be emulated perfectly in HW backends without severely impacting performance.
- It provides virtually no advantages over the previous hack while introducing lots of code.
- There is a better alternative: If people insist on having some sort of valid zcomploc emulation, I suggest rendering each primitive separately while using a _clean_ dual-pass approach to emulate zcomploc.
This reverts commit 0efd4e5c29.
This reverts commit b4ec836aca.
This reverts commit bb4c9e2205.
This reverts commit 146b02615c.