- Remove duplicate lookup table.
- Further optimize Render3D_SSE2::ClearFramebuffer(). This should improve performance for games that do their clears using image buffers.
- GPUEngineBase::_ColorEffectBlend() now supports RGB666 and RGB888 color formats.
- Use some SSSE3-specific optimizations in GPUEngineA::_RenderLine_DispCapture_BlendFunc_SSE2().
- Do some minor cleanup.
- Continue rework towards supporting RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- More basic blending methods now support RGB666 and RGB888 color formats.
- Don’t reset some sprite-related state buffers if the OBJ layer is disabled.
- Replace instances of std::min() with ternary operators.
- Better optimize SSE2 versions of ConvertColor8888To5551() and ConvertColor6665To5551().
- Use some SSSE3-specific optimizations in GPUEngineBase::_ColorEffectBlend() and GPUEngineBase::_ColorEffectBlend3D().
- Fix some compilation issues with some SSE2 color conversion functions on older compilers.
- Fix a performance issue where status text updates cause a severe slowdown, due to conflicting vertical syncs, if the status bar is hidden while Vertical Sync is enabled. (Fixed by setting the ‘hidden’ flag of the statusText control to YES while the status bar is hidden.)
- Revert a change in setting the fog render bit for translucent fragments. Fixes the appearance of the Air Robo GP in Solatorobo: Red the Hunter. (Regression from r5464.)
- In the OpenGL blitter, only allow source filters (such as Deposterize) to run on native-sized framebuffers. The visual impact on custom-sized framebuffers, even those at 2x size, is not enough to warrant the additional GPU load. This behavior is now consistent with the pixel scalers, which only run on native-sized framebuffers and not on custom-sized framebuffers.
- Fix a bug in the OpenGL blitter where the Deposterize filter wouldn’t run if the pixel scaler was set to None.
- Add 555-to-6665 opaque color conversion.
- Add UNALIGNED switch to 555-to-8888, 555-to-6665, 8888-to-5551, and 6665-to-5551 color buffer conversion functions, allowing clients to inform these functions that the incoming buffer pointers may not be 16-byte aligned.
- Rendered lines from GPUEngineBase::_HandleDisplayModeOff(), GPUEngineA::_HandleDisplayModeVRAM(), and GPUEngineA::_HandleDisplayModeMainMemory() now output colors with the alpha bits filled in. This is a step towards clients that work directly in 16-bit and 32-bit colorspaces no longer having to fill in the alpha bits themselves.
- Unify more color conversion code.
- In the SSE2 version of ConvertColor555To8888Opaque(), change the algorithm to use computation instead of memory lookups. Although memory lookups are faster on newer CPUs, computation is much faster on older CPUs, which have smaller caches and longer memory latencies. I believe this is the correct decision, since older CPUs are the ones that need as much performance as they can get.
- Fix a compile failure on Windows caused by the new color conversion code. (Regression from r5455.)
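
To illustrate the computation-based approach mentioned for ConvertColor555To8888Opaque(), here is a minimal scalar sketch. It is not the actual SSE2 code. It assumes the 555 source stores red in the low 5 bits, that the 32-bit result places red in the low byte and alpha in the high byte, and that 5-bit channels are widened by bit replication. All names prefixed with Sketch are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: widen a 5-bit channel to 8 bits by replicating
// its top 3 bits into the low bits, so 0x1F maps to 0xFF and 0 maps to 0.
static inline uint32_t SketchExpand5To8(uint32_t c)
{
    return (c << 3) | (c >> 2);
}

// Convert one 555 color to an opaque 8888 color using arithmetic only.
// Assumed layouts: source is 0BBBBBGGGGGRRRRR, result is 0xAABBGGRR.
static inline uint32_t SketchConvert555To8888Opaque(uint16_t src)
{
    const uint32_t r = SketchExpand5To8( src        & 0x1F);
    const uint32_t g = SketchExpand5To8((src >>  5) & 0x1F);
    const uint32_t b = SketchExpand5To8((src >> 10) & 0x1F);
    return r | (g << 8) | (b << 16) | 0xFF000000u; // alpha forced to opaque
}

// Convert a whole buffer; no lookup table is touched, only the two buffers.
static void SketchConvertBuffer555To8888Opaque(const uint16_t *src, uint32_t *dst, size_t pixCount)
{
    assert(src != nullptr && dst != nullptr);
    for (size_t i = 0; i < pixCount; i++)
    {
        dst[i] = SketchConvert555To8888Opaque(src[i]);
    }
}
```

Under these assumptions, the only memory traffic is the source and destination buffers themselves, which is the property that favors computation on CPUs with small caches.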
GPU:
- The SSE2 version of ConvertColor555To8888Opaque() now uses memory lookups instead of computing the values directly.
- Add 555-to-8888 opaque color conversion.
- In the new color buffer conversion functions, change the FragmentColor data types to u32. (Related to r5455.)
- Unify all colorspace conversion code.
- Fix bug with VRAM-to-VRAM capture.
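
As a rough illustration of the memory-lookup approach to 555-to-8888 opaque conversion, the sketch below precomputes one 32-bit result per possible 15-bit color and indexes that table at conversion time. It is scalar code, not the SSE2 implementation. The channel order (red in the low bits of the source and in the low byte of the result) and all Sketch-prefixed names are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: build a 32768-entry table mapping every 15-bit color
// (0BBBBBGGGGGRRRRR, assumed) to an opaque 0xAABBGGRR value (assumed).
static std::vector<uint32_t> SketchBuild555To8888OpaqueLUT()
{
    std::vector<uint32_t> lut(32768);
    for (uint32_t c = 0; c < 32768; c++)
    {
        const uint32_t r5 =  c        & 0x1F;
        const uint32_t g5 = (c >>  5) & 0x1F;
        const uint32_t b5 = (c >> 10) & 0x1F;

        // Widen each channel from 5 to 8 bits by bit replication.
        const uint32_t r8 = (r5 << 3) | (r5 >> 2);
        const uint32_t g8 = (g5 << 3) | (g5 >> 2);
        const uint32_t b8 = (b5 << 3) | (b5 >> 2);

        lut[c] = r8 | (g8 << 8) | (b8 << 16) | 0xFF000000u;
    }
    return lut;
}

// The table is built once, on first use, and then shared.
static const std::vector<uint32_t>& SketchGetLUT()
{
    static const std::vector<uint32_t> lut = SketchBuild555To8888OpaqueLUT();
    return lut;
}

// Conversion is a single table read; bit 15 of the source is ignored.
static inline uint32_t SketchLookup555To8888Opaque(uint16_t src)
{
    return SketchGetLUT()[src & 0x7FFF];
}
```

A table of this shape occupies 128 KiB (32768 entries of 4 bytes each), so how well it performs depends on how much of it stays resident in cache.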
OpenGL Renderer:
- Attempt to fix a possible bug with applying fog to transparent fragments.
- Texture sampling now works with bilinear filtering, mipmapping, and anisotropic filtering! These texture smoothing features can be used by enabling the new CommonSettings.GFX3D_Renderer_TextureSmoothing flag.