- In the OpenGL blitter, replace some calls to glBufferSubDataARB() with glMapBufferARB(). This may fix an intermittent crash that can occur with the Intel HD Graphics 3000 OpenGL driver, though the fix is unconfirmed.
- Move towards completing support for changing the output framebuffer color format to RGB666 or RGB888. Significantly increases the generated code size, but this is necessary for performance. (Related to r5433. This rework is still incomplete.)
- Parse and cache the WININ and WINOUT registers, instead of using them directly.
- Parse and cache the Target1 bits of the BLDCNT register.
- Remove some template parameters which are now suspected to no longer improve performance, most notably LAYERID. Should significantly reduce the generated code size.
- Do a tiny optimization for GPUEngineBase::_RenderPixel16_SSE2().
- Fix a bug where the 3D layer would fail to draw correctly on non-SSE2 systems if the output framebuffer’s color format is RGB666 or RGB888. (Regression from r5492.)
- Do some minor code cleanup.
- 2D layer compositing now supports RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- Fix a couple of bugs in GPUEngineBase::_ColorEffectBlend3D() when dealing with RGBA6665 or RGBA8888 color formats.
- Establish some assumptions about what the 3D layer’s color format will be with respect to the output framebuffer’s color format. This is being done in order to simplify the code.
- The new rules are as follows: If the output framebuffer’s color format is RGB666 or RGB888, then the 3D layer’s color format will be RGBA6665 or RGBA8888, respectively. If the output framebuffer’s color format is RGB555, then the 3D layer’s color format will be RGBA6665.
- 3D layer compositing now supports RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- Fix a bug in GPUEngineBase::_ColorEffectBlend3D() where variables were left undefined when the source and destination color formats were mismatched.
- Partially fix a bug with affine and extended BG layers on big-endian systems. Layers that perform rotation or scaling are not fixed yet.
- Loosen a restriction on taking the faster code path in GPUEngineBase::_RenderPixelIterate_Final().
- Silence a compiler warning on non-SSE2 systems.
- Fix a bug where the 3D framebuffer would flush incorrectly if both flipping and colorspace conversion occur on the CPU. (Regression from r5455.)
- GPUEngineBase::_RenderPixel_CheckWindows16_SSE2() does need to be forced inline after all, or else performance will drop. (Regression from r5485.)
- Fix builds that were broken due to new libretro-common API additions. (Regression from r5398.)
- KNOWN REGRESSION: In order to hasten the process of restoring the ability to build the Linux ports, the additional command-line options that are available in the Linux ports have been disabled. Maybe someone else can restore their functionality.
- Once again, tell the 3D renderer which framebuffers need to be flushed per frame so that we can avoid flushing unneeded framebuffers. This fixes a performance regression with many 3D games. (Regression from r5383.)
- Include SSSE3 versions for unpacking the following texture types: I2, I4, and A5I3.
- As a side-effect of working on these optimizations, the SSE2 versions of ConvertColor555To6665Opaque() and ConvertColor555To8888Opaque() are now a little faster.
- Remove GPUEngineBase::_RenderPixel_CheckWindows8_SSE2() and GPUEngineBase::_RenderPixel8_SSE2(). I don’t see us ever needing to use these methods in the future.
- Replace patterns of por(pand,pandn) with pblendvb where appropriate. (Requires SSE4.1.)
- The need to read the 3D framebuffer is now checked on a per-line basis instead of solely at line 0. Once more, this fixes the map rendering in Advance Wars: Dual Strike during some conversations. (Regression from r5429.)
- Remove a duplicate lookup table.
- Better optimize Render3D_SSE2::ClearFramebuffer(). Should improve performance for games that do their clears using image buffers.
- GPUEngineBase::_ColorEffectBlend() now supports RGB666 and RGB888 color formats.
- Use some SSSE3-specific optimizations in GPUEngineA::_RenderLine_DispCapture_BlendFunc_SSE2().
- Do some minor cleanup.
- Continue rework towards supporting RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- More basic blending methods now support RGB666 and RGB888 color formats.
- Don’t reset some sprite-related state buffers if the OBJ layer is disabled.
- Replace instances of std::min() with ternary operators.
- Better optimize SSE2 versions of ConvertColor8888To5551() and ConvertColor6665To5551().
- Use some SSSE3-specific optimizations in GPUEngineBase::_ColorEffectBlend() and GPUEngineBase::_ColorEffectBlend3D().
- Fix some compiling issues with some SSE2 color conversion functions on older compilers.
- Fix a performance issue where status text updates cause a severe slowdown, due to conflicting vertical syncs, when the status bar is hidden while Vertical Sync is enabled. (Fixed by setting the ‘hidden’ flag of the statusText control to YES while the status bar is hidden.)
- Revert a change in setting the fog render bit for translucent fragments. Fixes the appearance of the Air Robo GP in Solatorobo: Red the Hunter. (Regression from r5464.)
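
The rule above that pairs the output framebuffer's color format with the 3D layer's color format can be sketched as a small lookup. This is only an illustration: the enum and function names below are hypothetical and do not come from the source tree. The channel-widening helpers use plain bit replication, which is an assumed technique and may differ from what the actual conversion functions do.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not the emulator's real identifiers.
enum class OutputFormat  { RGB555, RGB666, RGB888 };
enum class Layer3DFormat { RGBA6665, RGBA8888 };

// Mapping as stated in the entry above: an RGB888 output pairs with an
// RGBA8888 3D layer, while RGB666 and RGB555 outputs both pair with RGBA6665.
constexpr Layer3DFormat Layer3DFormatFor(OutputFormat out)
{
    return (out == OutputFormat::RGB888) ? Layer3DFormat::RGBA8888
                                         : Layer3DFormat::RGBA6665;
}

// Widen a 5-bit channel (0..31) by bit replication (assumed technique):
// the top bits are copied into the newly created low bits so that
// 0 maps to 0 and 31 maps to the maximum value of the wider channel.
constexpr uint8_t Expand5To6(uint8_t c)
{
    return static_cast<uint8_t>((c << 1) | (c >> 4));
}

constexpr uint8_t Expand5To8(uint8_t c)
{
    return static_cast<uint8_t>((c << 3) | (c >> 2));
}

// Compile-time checks of the mapping and of the channel range endpoints.
static_assert(Layer3DFormatFor(OutputFormat::RGB555) == Layer3DFormat::RGBA6665, "RGB555 -> RGBA6665");
static_assert(Layer3DFormatFor(OutputFormat::RGB666) == Layer3DFormat::RGBA6665, "RGB666 -> RGBA6665");
static_assert(Layer3DFormatFor(OutputFormat::RGB888) == Layer3DFormat::RGBA8888, "RGB888 -> RGBA8888");
static_assert(Expand5To6(31) == 63 && Expand5To8(31) == 255, "full-scale stays full-scale");
```

Fixing the 3D layer's format per output format in this way means the compositing code only has to handle three format pairings rather than every combination, which is the simplification the entry describes.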