rogerman
f97c633441
OpenGL Renderer: Okay, let's try using GL_AMD_conservative_depth for those AMD drivers that outright lie about supporting GL_ARB_conservative_depth. (Related to commit 4d6a132 and commit 39f9483.)
2018-12-30 02:12:54 -08:00
rogerman
39f9483034
OpenGL Renderer: Also require GLSL 4.00 when using the GL_ARB_conservative_depth extension. (Related to commit 4d6a132.)
2018-12-30 01:36:54 -08:00
rogerman
4d6a132116
OpenGL Renderer: Mitigate some of the performance penalty of using the NDS Style Depth Calculation option.
...
- GPUs that support the GL_ARB_conservative_depth extension will benefit more from this commit. (OpenGL 3.2 only.)
- Also fix some miscellaneous bugs.
2018-12-29 22:37:37 -08:00
rogerman
0c0bd5144e
Cocoa Port: Do a small optimization when doing video output framebuffer fetches for Metal display views.
2018-12-28 15:39:09 -08:00
rogerman
aeea0ea46a
OpenGL Renderer: Remove the material_6bit_to_float LUT, since we already have an equivalent existing LUT -- divide6bitBy63_LUT.
2018-12-26 22:35:34 -08:00
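For context on the LUT consolidation above, a lookup table of this kind can be sketched as follows. This is a minimal illustration that assumes, from its name, that divide6bitBy63_LUT maps each 6-bit value v to v / 63.0f; the function names MakeDivide6bitBy63Table and Material6bitToFloat are hypothetical and are not taken from the DeSmuME source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not DeSmuME's actual code): builds a 64-entry table
// where entry v holds v / 63.0f, so 0 maps to 0.0f and 63 maps to 1.0f.
inline std::array<float, 64> MakeDivide6bitBy63Table()
{
    std::array<float, 64> table{};
    for (std::size_t i = 0; i < table.size(); i++)
    {
        table[i] = static_cast<float>(i) / 63.0f;
    }
    return table;
}

// Hypothetical lookup wrapper: converts a 6-bit material component to a
// normalized float by indexing the table instead of dividing each time.
inline float Material6bitToFloat(std::uint8_t value6)
{
    static const std::array<float, 64> table = MakeDivide6bitBy63Table();
    assert(value6 < 64); // only 6-bit inputs are valid indices
    return table[value6];
}
```

Since two tables built this way would hold identical values, keeping only one of them loses nothing, which is consistent with the rationale given in the commit message.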
rogerman
c1357c1451
OpenGL Renderer: Do some minor performance improving tweaks.
...
- Most notably, fix a performance regression where polygon drawing was no longer getting batched due to an incorrect polygon-facing test. (Regression from commit dab414c.)
2018-12-26 19:48:22 -08:00
rogerman
062d9a65a7
Cocoa Port: Do a minor optimization for Metal display views running on macOS v10.13 High Sierra and later.
2018-12-24 21:35:17 -08:00
rogerman
022cf3c702
Cocoa Port: Looks like all macOS versions 10.13 High Sierra and later don't support P-Buffers, so properly handle this error condition and disable the OpenGL 3D renderer when trying to run it without FBOs on these newer macOS versions.
2018-12-22 14:53:19 -08:00
rogerman
7bb438020b
OpenGL Renderer: Fix bug where the OpenGL renderer would completely fail to run if the user's ancient GPU doesn't support shaders. (Regression from commit 7080e21.)
...
- Also do some minor improvements to the code robustness when creating an OpenGLRenderer object.
2018-12-21 16:21:44 -08:00
rogerman
589524823b
OpenGL Renderer: Oops! Finish doing the shader rework started in commit 7080e21 for legacy OpenGL so that it works the way it's supposed to. Doing this now fixes legacy OpenGL for (hopefully) all GPU drivers and also allows for all of the same shader optimizations as OpenGL 3.2.
2018-12-18 20:21:18 -08:00
rogerman
d3e4b6010c
OpenGL Renderer: Eliminate the requirement for 66 varying floats in the Fog shader by replacing the varying floats with constants. Also fixes an issue with the geometry shader in legacy OpenGL. (Regressions from commit 7080e21 and commit 37afaef. Fixes #240.)
2018-12-18 16:47:53 -08:00
zeromus
30212212b5
winport - set PreferredToolArchitecture to x64
2018-12-18 13:53:42 -05:00
rogerman
37afaefa2f
OpenGL Renderer: Replace the accuracy/performance tradeoff "Enable Depth Equals Test Tolerance" with "Enable NDS-Style Depth Calculation", where disabling this option allows the host GPU to natively calculate depth which significantly improves performance in many games.
...
- New Behavior: In addition to emulating the existing Depth Equals Test Tolerance, NDS-Style Depth Calculation accounts for all NDS depth calculations within the fragment shader. Most notably, disabling this option forgoes the W-depth / Z-depth differentiation that the NDS uses, instead preferring the GPU's native Z-depth calculation. Using the GPU's native depth calculation significantly improves performance, but many games use W-depth calculations or are sensitive to subtleties in the Z-depth calculation, and so this option must remain ON by default for compatibility's sake.
- Also fixes a shader initialization issue on the Windows port. (Regression from commit 7080e21.)
2018-12-18 10:50:41 -08:00
rogerman
7080e2156b
OpenGL Renderer: Rework the rendering shaders so that the shader program code is more dynamically generated. This may yield some performance improvement for certain 3D rendering cases, especially when running on lesser GPUs with fewer and/or slower shader execution units.
2018-12-17 16:16:50 -08:00
rogerman
ae8fb2c3bb
GPU: Fix a bug where using VRAM as a display capture source would sometimes cause graphical glitches under certain conditions. (Regression from commit 2c6a5f9.)
2018-12-17 15:33:16 -08:00
zeromus
88d930ce82
winport - fix loading files named things like Splookékrong from commandline, fixes #238
2018-12-15 17:26:28 -05:00
rogerman
7ff5c5eece
Render 3D: Improve the overall rendering accuracy of Edge Mark. Most notably, Edge Mark now properly renders at screen edges. As of now, the current algorithm is as accurate as it's ever going to get under our current 3D rendering engine.
2018-12-14 17:08:16 -08:00
rogerman
e6d6f2e10d
Cocoa Port: Fix a bug where clipboard copies and screenshots taken from Metal display views will cause the image to be Y-flipped.
2018-12-14 15:41:05 -08:00
rogerman
2c6a5f9868
GPU: Do some code cleanup of the display capture code.
2018-12-12 18:34:47 -08:00
zeromus
e604631413
winport - fix things named like Blorkénflarge in the recent roms menu ( #238 )
2018-12-12 15:50:52 -05:00
rogerman
8c2379f6f8
Firmware: Fix various endianness issues. Most importantly, this fixes a bug with touch input not working correctly on big-endian systems. (Regression from commit bb38022.)
2018-12-12 02:49:44 -08:00
rogerman
471f53e506
Cocoa Port: Fix various issues on the PPC build.
...
- Fix compiling issues for big-endian systems.
- Fix bug where the Recent ROMs menu and also launching the app while loading a ROM file would fail to load the ROM on macOS v10.5 Leopard.
- Fix bug where GPU main memory display mode would show incorrect pixels on big-endian systems when running at 15-bit color depth.
- As an unintended collateral improvement, GPUEngineA::_HandleDisplayModeMainMemory() now has SSE2-accelerated versions for 18-bit and 24-bit color depths. This was done less for its performance benefit (main memory display mode is an extremely rare feature) and more for better code consistency and code completeness.
2018-12-11 17:45:36 -08:00
zeromus
e56059872f
winport - fix loading games named things like "Yokémorp". It was probably only open through drag and drop.
...
Probably broke japanese. If I did, write a bug so I can fix japanese and break latin characters again.
2018-12-11 18:20:39 -05:00
zeromus
b5477b608b
Merge pull request #236 from NetwideRogue/master
...
don't clobber existing screenshots
2018-12-06 08:08:40 -06:00
Declan Hoare
a3eebbac21
don't clobber existing screenshots
2018-12-06 23:59:18 +11:00
rogerman
35e834ff2c
GFX3D: Revert the polygon sorting code back to its original state, which should result in a minor performance improvement for high polygon-count scenes.
...
- After years of testing, no one has reported running into the assert in gfx3d_ysort_compare() so I think we should be safe in reverting std::stable_sort() back to std::sort().
- For the sorting function, use gfx3d_ysort_compare_orig() since this function compiles down to fewer instructions than gfx3d_ysort_compare_kalven() does, resulting in better sorting performance.
- Of note, I'm pretty sure that SF commit r5132 is what fixed the original bug (see SF#1461 for more details) by getting rid of the NaN comparisons that were tripping up std::sort(). In the future, we should research why we're dividing by 0 in the first place, since r5132 is clearly a hack of a fix.
2018-12-05 14:37:33 -08:00
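To illustrate why NaN comparisons can trip up std::sort(), as the commit above mentions: std::sort() requires its comparator to be a strict weak ordering, and a plain `<` on floats breaks that requirement once a NaN is present. The sketch below is an illustration only and is not the project's gfx3d_ysort_compare code; all function names in it are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Hypothetical comparator: a plain less-than on a float sort key.
inline bool LessByY(float a, float b)
{
    return a < b;
}

// Two keys are "equivalent" under a comparator when neither orders before the other.
inline bool Equivalent(float a, float b)
{
    return !LessByY(a, b) && !LessByY(b, a);
}

// A strict weak ordering requires equivalence to be transitive.
// Returns true when (a, b, c) violates that requirement.
inline bool BreaksStrictWeakOrdering(float a, float b, float c)
{
    return Equivalent(a, b) && Equivalent(b, c) && !Equivalent(a, c);
}

// One possible NaN-safe alternative (an assumption, not the r5132 fix):
// treat every NaN as ordering after all ordinary numbers.
inline bool LessByYNaNLast(float a, float b)
{
    if (std::isnan(a)) return false; // NaN never orders before anything
    if (std::isnan(b)) return true;  // any ordinary number orders before NaN
    return a < b;
}
```

With a NaN in the middle, 1.0f is "equivalent" to NaN and NaN is "equivalent" to 2.0f, yet 1.0f is not equivalent to 2.0f. Passing such a comparator to std::sort() is undefined behavior, which is why removing the source of the NaN values (for example, the division by zero noted in the commit) is the cleaner long-term fix.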
zeromus
d80a84b762
Merge pull request #207 from cosmo-ray/fix-linux-gcc8.2-warnings
...
Fix gcc8.2 warnings on linux
2018-12-05 14:10:52 -06:00
zeromus
9ea1b5cbda
Merge pull request #223 from intact/gtk-fix-screenshot-path
...
Gtk+ Port: Use Desktop or Home as fallback directory for screenshots
2018-12-05 14:10:42 -06:00
rogerman
355e4a0fb4
OpenGL Renderer: Remove a now defunct framebuffer texture, significantly reducing VRAM usage at the higher resolutions.
2018-12-04 21:49:09 -08:00
rogerman
3d573e150f
OpenGL Renderer: Properly clear the framebuffer during a power-off condition, just like how SoftRasterizer does it. (Related to commit 66b5da1 and commit 759a039. Fixes #234.)
...
- Also do a minor performance optimization by only doing the framebuffer clear once for each power-off condition, rather than repeatedly and unnecessarily clearing the framebuffer for each and every V-blank.
2018-12-04 21:23:58 -08:00
rogerman
df22c6e14d
Cocoa Port: Fix some intermittent issues related to launching the app while loading a ROM file (i.e. double-clicking an NDS ROM file to launch DeSmuME.app).
2018-12-04 01:26:27 -08:00
rogerman
b9a8bafe8b
GPU: Do some code cleanup of the display capture code.
2018-12-03 16:22:12 -08:00
zeromus
fe93b70de8
Merge pull request #233 from Jules-A/hideConsole
...
[Windows] Hide the console window in the current session when disabling in menu.
2018-12-02 11:00:52 -06:00
Jules.A
157717a61f
Hide the console window before freeing so if the console is attached to another process (currently occurring under Win10) it is at least hidden.
2018-12-03 00:52:09 +08:00
rogerman
c3614a7e95
GPU: Do some minor code cleanup.
2018-12-01 21:56:32 -08:00
rogerman
fb8d937239
GPU: Fix bug where GPUEngineA::RenderLine_Layer3D() was trying to run with uninitialized values. (Regression from commit 37a8ca0.)
2018-12-01 21:38:03 -08:00
rogerman
37a8ca0983
GPU: Do a bunch of minor tweaks and code cleanup to the various pixel compositor methods. Significantly reduces the compiled code size.
...
- Of note, when running at custom resolutions, we are now being more aggressive in performing early tests for rejecting pixels as soon as possible. This may yield a minor performance improvement in some very specific rendering scenarios that require the window test.
2018-12-01 20:44:45 -08:00
rogerman
6a1d9e4848
GPU: Rendering complete OBJ layer lines is now SSE2-accelerated at the native resolution. This change is less of a performance enhancement and more of improving the code consistency. As of now, ALL complete OBJ layer lines, whether internally generated or read from VRAM, whether rendering at native resolution or custom resolution, should now be SSE2-accelerated. This commit finalizes this concept. (Related to commit 8e9e7c4 and commit 60c01bd.)
2018-12-01 15:46:23 -08:00
rogerman
60c01bd63a
GPU: Do a few more minor optimizations to rendering complete OBJ layer lines. Most notably, all complete OBJ layer lines, not just ones reading directly from custom VRAM, now benefit from the SSE2-accelerated code.
2018-11-30 20:12:58 -08:00
rogerman
9a53e8be69
GPU: Fix bug in GPUEngineBase::_PixelUnknownEffectWithMask16_SSE2() where blending effects for OBJ layers were being handled incorrectly. This bugfix only affects SSE2-enabled systems. (Regression from commit 8e9e7c4. Fixes #232.)
2018-11-30 17:10:56 -08:00
rogerman
2c5c2f6186
GPU: Use the same technique as the GPUEngineBase::_CompositeVRAMLineDeferred() bug fix in commit 6bcd19b in order to do a tiny optimization to GPUEngineBase::_CompositeLineDeferred(). Also makes the code more consistent.
2018-11-29 22:25:37 -08:00
rogerman
d0330fc96e
Cocoa Port: Upgrade Interface Builder .xib files to 3.2 format with minimum deployment target of macOS 10.6. This change effectively drops support for building DeSmuME directly from a PowerPC Mac. Fixes #231.
...
- It is still possible to create a PowerPC binary, but this now requires some extra steps. From now on, you must use an Intel Mac running Mavericks or earlier to re-save the .xib files with a deployment target of macOS 10.5 in Interface Builder 3.2, and then use Xcode 3 to build a PowerPC binary using the Xcode 3 project file.
2018-11-29 22:02:07 -08:00
zeromus
3e37352bee
winport - menu option to control CLI console visibility
...
fixes #230
2018-11-29 16:25:59 -06:00
rogerman
6bcd19b3cb
GPU: Fix a bug in GPUEngineBase::_CompositeVRAMLineDeferred() where compInfo.target.xCustom was overstepping its bounds in X-dimension only custom buffers. This had the effect of causing undefined coloring when running at custom resolutions. (Regression from commit 8e9e7c4. Fixes #228 and fixes #229.)
2018-11-29 13:12:56 -08:00
zeromus
f6938dc80a
Merge pull request #217 from SuuperW/paths
...
Paths
2018-11-29 15:04:31 -06:00
rogerman
8e9e7c4a2a
GPU: Enable SSE2-accelerated custom-sized VRAM reads through the OBJ layer. This significantly improves the performance of many games, such as those that make use of dual-screen 3D, when running at the higher resolutions.
2018-11-29 02:00:21 -08:00
rogerman
6fc6ceb294
SoftRasterizer: For SSE2-enabled systems only, fix a rare graphical glitch that can sometimes occur in some games. (Regression from commit 21f04c9.)
2018-11-28 17:31:49 -08:00
rogerman
4f543aa8ca
Cocoa Port: Yet another attempt at eliminating microstuttering in Metal display views. While it hasn't been completely eliminated yet, it shouldn't be as bad now.
2018-11-28 13:36:02 -08:00
rogerman
1f9b9e02a4
SoftRasterizer: Fix build issues on Windows. (Regression from commit 21f04c9.)
2018-11-23 15:30:11 -08:00
rogerman
21f04c9ef2
SoftRasterizer: Do some minor improvements to both performance and code size.
2018-11-23 14:59:13 -08:00