6126 Commits

Author SHA1 Message Date
rogerman d50c8f8e3e GPU:
- Nope! Apparently, GPUEngineBase::_RenderPixel_CheckWindows16_SSE2() does need to be forced inline, or else performance will drop! (Regression from r5485.)
2016-07-03 02:17:42 +00:00
rogerman b314a49dee GPU:
- GPUEngineBase::_RenderPixel_CheckWindows16_SSE2() no longer needs to be forced inline.
- Do some code cleanup.
2016-07-03 01:46:50 +00:00
rogerman a815b1c458 Cocoa Port:
- Fix another tiny blending bug in the 5xBRZ fragment shader. (Related to r5379.)
2016-07-02 22:07:26 +00:00
rogerman 2953799788 Linux Port (CLI / GTK / Glade):
- Fix builds that were broken due to new libretro-common API additions. (Regression from r5398.)
- KNOWN REGRESSION: In order to hasten the process of restoring the ability to build the Linux ports, the additional command-line options that are available in the Linux ports have been disabled. Maybe someone else can restore their functionality.
2016-07-02 20:23:56 +00:00
rogerman 3c5461e786 Core:
- Fix some issues with building on older compilers.
2016-07-02 19:47:13 +00:00
rogerman 9683b7e070 GPU:
- Once again, tell the 3D renderer which framebuffers need to be flushed per frame so that we can avoid flushing unneeded framebuffers. This fixes a performance regression with many 3D games. (Regression from r5383.)
2016-07-01 20:19:13 +00:00
rogerman e6dac5ec96 GPU:
- Further optimize the SSE2 versions of ConvertColor555To6665Opaque() and ConvertColor555To8888Opaque().
2016-07-01 18:32:10 +00:00
rogerman a05ddab710 Texture Handler:
- Include SSSE3 versions for unpacking the following texture types: I2, I4, and A5I3.
- As a side-effect of working on these optimizations, the SSE2 versions of ConvertColor555To6665Opaque() and ConvertColor555To8888Opaque() are now a little faster.
2016-07-01 10:15:57 +00:00
rogerman 0d9d59455f GPU:
- Remove GPUEngineBase::_RenderPixel_CheckWindows8_SSE2() and GPUEngineBase::_RenderPixel8_SSE2(). I don’t see us ever needing to use these methods in the future.
- Replace patterns of por(pand,pandn) with pblendvb where appropriate. (Requires SSE4.1)
2016-07-01 00:13:32 +00:00
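
For readers unfamiliar with the por(pand,pandn) pattern mentioned above, here is a minimal scalar sketch of the idea, one byte at a time. It is not the project's code: the function names SelectAndOr and SelectBlend are hypothetical, and the real change operates on 16-byte SSE registers rather than single bytes.

```cpp
#include <cstdint>

// Classic three-operation select: (mask & a) | (~mask & b).
// This mirrors por(pand(mask, a), pandn(mask, b)), where pandn
// computes (~mask & b).
inline std::uint8_t SelectAndOr(std::uint8_t mask, std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>((mask & a) | (~mask & b));
}

// Blend-style select: take a when the mask byte's high bit is set,
// otherwise take b. This models what pblendvb does for each byte lane.
inline std::uint8_t SelectBlend(std::uint8_t mask, std::uint8_t a, std::uint8_t b)
{
    return (mask & 0x80) ? a : b;
}
```

The two forms agree only when every mask byte is either 0x00 or 0xFF, which is what SSE comparison instructions produce. For a partial mask such as 0x80 the and/or form mixes bits from both inputs, while the blend form picks one input whole.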
rogerman a35edad4fc GPU:
- The need to read the 3D framebuffer is now checked on a per-line basis instead of solely at line 0. Once more, this fixes the map rendering in Advance Wars: Dual Strike during some conversations. (Regression from r5429.)
2016-06-29 19:16:23 +00:00
rogerman 031f65fe83 GPU:
- Fix a bug in window processing. Fixes the appearance of chips in the Gear Matrix of Kingdom Hearts Re:coded. (Regression from r5473.)
2016-06-29 16:13:34 +00:00
rogerman bce02d9cc8 Render3D:
- Remove duplicate lookup table.
- Better optimize Render3D_SSE2::ClearFramebuffer(). Should improve performance for games that do their clears using image buffers.
2016-06-29 06:29:11 +00:00
rogerman 558e405511 GPU:
- Ensure that window states are updated when the framebuffer size changes. (Regression from r5473.)
2016-06-29 01:03:11 +00:00
rogerman 8cfea593f8 GPU:
- Improve the SSE2 optimizations in the compositor.
2016-06-29 00:28:02 +00:00
rogerman 425643bf92 GPU:
- GPUEngineBase::_ColorEffectBlend() now supports RGB666 and RGB888 color formats.
- Use some SSSE3-specific optimizations in GPUEngineA::_RenderLine_DispCapture_BlendFunc_SSE2().
- Do some minor cleanup.
2016-06-27 06:11:39 +00:00
rogerman edf1d305d4 GPU:
- Fix a small bug where uninitialized variables were being used. (Related to r5470.)
2016-06-26 08:04:32 +00:00
rogerman d04c8eeae7 GPU:
- Continue rework towards supporting RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- More basic blending methods now support RGB666 and RGB888 color formats.
- Don’t reset some sprite-related state buffers if the OBJ layer is disabled.
- Replace instances of std::min() with ternary operators.
- Better optimize SSE2 versions of ConvertColor8888To5551() and ConvertColor6665To5551().
- Use some SSSE3-specific optimizations in GPUEngineBase::_ColorEffectBlend() and GPUEngineBase::_ColorEffectBlend3D().
- Fix some compiling issues with some SSE2 color conversion functions on older compilers.
2016-06-26 05:46:42 +00:00
zeromus cdd5892c60 fix vs2010 compiling. gpu.cpp compiling is slow... :( 2016-06-24 18:29:00 +00:00
rogerman dde0da24ab GPU:
- Avoid generating autovectorized SSE2 code for loops where a hand-coded SSE2 loop already exists. (MSVC and Clang only.)
2016-06-23 20:30:24 +00:00
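
As a rough illustration of how autovectorization can be switched off for one loop on those two compilers, here is a sketch using the per-loop pragmas that Clang and MSVC provide. The macro name NO_AUTOVECTORIZE_LOOP and the function SumTail are hypothetical and are not taken from the project's source. On other compilers the macro expands to nothing.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper macro: expands to a per-loop "do not vectorize" hint.
#if defined(__clang__)
    #define NO_AUTOVECTORIZE_LOOP _Pragma("clang loop vectorize(disable)")
#elif defined(_MSC_VER)
    #define NO_AUTOVECTORIZE_LOOP __pragma(loop(no_vector))
#else
    #define NO_AUTOVECTORIZE_LOOP
#endif

// Scalar loop of the kind that typically follows a hand-coded SIMD loop
// and handles the remaining elements starting at index i.
inline std::uint32_t SumTail(const std::uint16_t *src, std::size_t i, std::size_t count)
{
    std::uint32_t sum = 0;

    NO_AUTOVECTORIZE_LOOP
    for (; i < count; i++)
    {
        sum += src[i];
    }

    return sum;
}
```

The hint must sit directly before the loop it applies to. Since a hand-written SIMD loop would already have consumed most of the data, keeping the leftover loop scalar avoids generating a second, redundant vector version.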
rogerman 3f895b85fb Cocoa Port:
- Fix a performance issue where if the status bar is hidden while Vertical Sync is enabled, then status text updates will cause a severe slowdown due to conflicting vertical syncs. (Fixed by setting the ‘hidden’ flag of the statusText control to YES while the status bar is hidden.)
2016-06-23 01:37:07 +00:00
rogerman 051e58a4fd GPU:
- Reorder some functions to fix building on older compilers.
2016-06-23 01:32:53 +00:00
rogerman 314bb2130d OpenGL Renderer:
- Revert a change in setting the fog render bit for translucent fragments. Fixes the appearance of the Air Robo GP in Solatorobo: Red the Hunter. (Regression from r5464.)
2016-06-22 17:11:54 +00:00
rogerman 0cb3bd723f OpenGL Renderer:
- Fix a bug with depth writes, which also fixes bugs with fog and edge mark. (Fixes bug #1522.)
2016-06-22 09:27:52 +00:00
rogerman 4ae207fb03 GPU:
- Reduce overall register contention in some color blending methods.
2016-06-21 20:30:52 +00:00
rogerman 03d8ee62aa Cocoa Port:
- In the OpenGL blitter, only allow source filters (such as Deposterize) to run on native-sized framebuffers. This is being done since the visual impact on custom-sized framebuffers, even those at 2x size, is not enough to warrant the additional GPU load. This behavior is now consistent with the pixel scalers, which only run on native-sized framebuffers and not on custom-sized framebuffers.
- Fix a bug in the OpenGL blitter where the Deposterize filter wouldn’t run if the pixel scaler was set to None.
2016-06-20 21:22:51 +00:00
rogerman 4d2307538d GPU:
- Add 555-to-6665 opaque color conversion.
- Add UNALIGNED switch to 555-to-8888, 555-to-6665, 8888-to-5551, and 6665-to-5551 color buffer conversion functions, allowing clients to inform these functions that the incoming buffer pointers may not be 16-byte aligned.
- Rendered lines from GPUEngineBase::_HandleDisplayModeOff(), GPUEngineA::_HandleDisplayModeVRAM(), and GPUEngineA::_HandleDisplayModeMainMemory() now output colors with the alpha bits filled in. This is working towards a time when clients that work directly in 16-bit and 32-bit colorspaces don’t have to fill in the alpha bits themselves.
- Unify more color conversion code.
2016-06-20 18:47:45 +00:00
rogerman d1a8663acb GPU:
- In the SSE2 version of ConvertColor555To8888Opaque(), change the algorithm to use computation instead of memory lookups. Although memory lookups are faster on newer CPUs, computation is much faster on older CPUs, which have smaller caches and longer memory latencies. I believe this is the correct decision, since older CPUs are the ones that need as much performance as they can get.
2016-06-18 22:20:07 +00:00
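
The trade-off described above (computation versus memory lookups) can be sketched in scalar C++ as follows. This is an illustration under stated assumptions, not the project's SSE2 code: the function names are hypothetical, the input is assumed to hold red in bits 0-4, green in bits 5-9 and blue in bits 10-14, the output is assumed to be packed with red in the low byte and alpha in the high byte, and the 5-to-8-bit expansion by bit replication is one common choice rather than a confirmed detail of the source.

```cpp
#include <cstdint>
#include <vector>

// Expand a 5-bit channel to 8 bits by replicating its top bits,
// so that 0 maps to 0 and 31 maps to 255.
inline std::uint8_t Expand5To8(std::uint32_t c5)
{
    return static_cast<std::uint8_t>((c5 << 3) | (c5 >> 2));
}

// Computation approach: derive the 32-bit color with shifts and ORs only.
inline std::uint32_t Convert555To8888Opaque_Compute(std::uint16_t c)
{
    const std::uint32_t r = Expand5To8(c & 0x1Fu);
    const std::uint32_t g = Expand5To8((c >> 5) & 0x1Fu);
    const std::uint32_t b = Expand5To8((c >> 10) & 0x1Fu);
    return r | (g << 8) | (b << 16) | 0xFF000000u; // alpha forced opaque
}

// Lookup approach: precompute all 32768 results once, then index by color.
inline std::vector<std::uint32_t> Build555To8888OpaqueTable()
{
    std::vector<std::uint32_t> table(32768);
    for (std::uint32_t c = 0; c < 32768; c++)
    {
        table[c] = Convert555To8888Opaque_Compute(static_cast<std::uint16_t>(c));
    }
    return table;
}
```

A table like this occupies 128 KB, which is the kind of footprint that competes for cache space on CPUs with smaller caches, whereas the computed version touches no memory beyond its input.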
rogerman 0110fe22d6 Windows Port:
- Oops! Missed a small typo that still caused compiling on Windows to fail. (Related to r5458.)
2016-06-18 01:44:15 +00:00
rogerman 9e07cc95b4 Windows Port:
- Fix compiling on Windows due to new color conversion code. (Regression from r5455.)

GPU:
- The SSE2 version of ConvertColor555To8888Opaque() now uses memory lookups instead of calculating things through.
2016-06-18 01:38:51 +00:00
rogerman 29ff68cda9 GPU:
- Add color 555 to 8888-opaque conversions.
- In the new color buffer conversion functions, change the FragmentColor data types to u32. (Related to r5455.)
2016-06-17 22:36:56 +00:00
rogerman 0d162bdb9f Cocoa Port:
- Change gpuColorFormat property data type from UInt32 to NSUInteger.
2016-06-17 21:33:43 +00:00
rogerman f8e0585d26 GPU:
- Unify all colorspace conversion code.
- Fix bug with VRAM-to-VRAM capture.

OpenGL Renderer:
- Try and fix a possible bug with applying fog to transparent fragments.
2016-06-17 04:22:51 +00:00
rogerman b543e309c5 Cocoa Port:
- Fix bug where the texture smoothing option was not getting saved to the user defaults file. (Related to r5451.)
2016-06-10 21:11:51 +00:00
rogerman 5d2b5054ba OpenGL Renderer:
- Remove some code duplication in OpenGLRenderer::SetupTexture().
2016-06-10 20:48:03 +00:00
rogerman 41d9061ce4 OpenGL Renderer:
- Fix building on platforms that aren’t OS X. (Regression from r5450. Fixes bug #1561.)
2016-06-10 18:20:55 +00:00
rogerman 490e34f81e Cocoa Port:
- Add support for texture smoothing. (Related to r5450.)
2016-06-10 04:16:28 +00:00
rogerman 9a9f006397 OpenGL Renderer:
- Texture sampling now works with bilinear filtering, mipmapping, and anisotropic filtering! These texture smoothing features can be used by enabling the new CommonSettings.GFX3D_Renderer_TextureSmoothing flag.
2016-06-10 03:57:32 +00:00
rogerman ce5765006f Cocoa Port:
- Fix some possible issues with HUD text rendering.
2016-06-09 18:47:54 +00:00
rogerman d682d15142 Cocoa Port:
- Fix builds that were broken due to new libretro-common API additions. (Regression from r5438.)
2016-06-09 18:46:55 +00:00
zeromus b157132dbc change build system to support dev+ with gdb stub enabled. I think that's basically where it was at historically 2016-06-02 18:17:22 +00:00
zeromus d24883ee85 more helpful --help for arm9gdb etc 2016-06-02 18:15:22 +00:00
zeromus 82904b4a74 fix bug entering cheats with values > 7FFFFFFF 2016-05-25 05:09:44 +00:00
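
The commit above does not say what caused the bug. One common way for values above 7FFFFFFF to be mishandled is parsing user input into a signed 32-bit integer, so the sketch below shows an unsigned parse with range checking instead. The function name ParseHexU32 is hypothetical and this is not the project's cheat-entry code.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Parse a hexadecimal string into a full-range unsigned 32-bit value.
// Returns false on empty input, trailing junk, or a value above FFFFFFFF.
inline bool ParseHexU32(const char *text, std::uint32_t &out)
{
    if (text == nullptr || *text == '\0')
    {
        return false;
    }

    errno = 0;
    char *end = nullptr;
    const unsigned long long value = std::strtoull(text, &end, 16);

    if (errno != 0 || end == text || *end != '\0' || value > 0xFFFFFFFFull)
    {
        return false;
    }

    out = static_cast<std::uint32_t>(value);
    return true;
}
```

Parsing into a wider unsigned type first and then checking the range keeps 80000000 through FFFFFFFF intact while still rejecting input that does not fit in 32 bits.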
zeromus 9767f79346 support cheats to any address, not just main memory. 2016-05-23 17:11:33 +00:00
zeromus 9b33859c68 fix a bug making vs2015 builds unable to open roms on XP systems 2016-05-14 05:00:39 +00:00
rogerman 45b559eae6 Cocoa Port:
- Simplify some drawing code in the OpenGL blitter.
2016-05-13 06:26:38 +00:00
zeromus afb63d0b2f fix crashes in bilinear final filter + HD prescaling (buffer overflows in sloppy filter code, as usual) 2016-05-09 22:23:54 +00:00
zeromus 13032f6712 fix garbage polygon rendering (error in gfx3d matrix math overflows) in spectrobes: beyond the portals 2016-04-24 19:14:07 +00:00
zeromus caec37ef25 fix newish crash on windows when shutting down with --num-cores 1 2016-04-24 19:13:07 +00:00
zeromus 73f5067ebc VFAT: use retro_dir and retro_stat instead of additional fs- layer 2016-04-22 01:38:45 +00:00
zeromus b03347dc48 retro_dirent and retro_stat tidy and bugfixes: windows retro_dir would have missed the first entry; retro_stat wasn't extern "C"'d; retro_dirent_is_dir didn't need a path argument (path can always be gotten from RDIR in a trivial operation) 2016-04-22 01:38:26 +00:00