- Begin unifying pixel rendering. The BG and OBJ layers are now rendered using the same method.
- Pass the destination buffer pointer and line index as function parameters instead of storing them in object variables.
- Rendering a BG layer (for debugging purposes) is now completely handled in the core code.
- Do some other code cleanup.
- Clearing to the backdrop color has been changed from a pixel operation to a scanline operation.
- Clearing to black when the GPU engine is disabled has been changed from a scanline operation to a framebuffer operation.
- Applying the master brightness has been changed from a scanline operation to a framebuffer operation.
- Resetting the BGnX and BGnY registers now occurs at the end of line 191 instead of at the start of line 0.
- Per zeromus’ suggestion, remove GetNativeFramebuffer() and GetCustomFramebuffer() from the GPUSubsystem class. Users must parse the NDSDisplayInfo struct returned from GetDisplayInfo() instead.
- Per zeromus’ suggestion, rename Get/SetWillAutoBlitNativeToCustomBuffer() to Get/SetWillAutoResolveToCustomBuffer().
- Add some more notes to the NDSDisplayInfo struct to help clarify the meaning of each field.
- Fix bug where 3D rendering may not always finish on line 0, causing lingering 3D artifacts in certain games. Now it is always forced to finish. (Regression from r5255.)
- Bring back the backdrop clearing optimization from r5198 when rendering in the native resolution.
- Do some minor code cleanup.
- Fix possible crash in direct-color sprite rendering caused by assuming aligned memory access; incoming sprite coordinates can make the access unaligned. (Regression from r5256.)
- Do SSE2 optimization for direct-color sprite renders.
- Make ARM9_LCD cache-aligned. This allows SSE2 to perform aligned loads/stores on certain operations, improving performance.
- Further templatize some methods.
- Do some misc. code cleanup.
- Do heavy code cleanup.
- Split the engine-specific functionality of the main and sub engines into the new GPUEngineA and GPUEngineB subclasses.
- Templatize some parameters. Greatly increases the generated code size, but restores (and possibly improves) performance from r5251.
- Be smarter about manually inlining functions. Greatly reduces the generated code size, and fixes optimized builds on MSVC. (Regression from r5248.)
- This change may affect performance. This will need additional testing.
- Add support for handling combination native/custom rendering sizes.
- As a side effect of supporting this feature, pixel scalers now work as intended when high-resolution rendering is enabled (but only if the incoming display framebuffer is at the native size).
- Finish support for combination native/custom rendering sizes. Can give a significant performance improvement when running the GPUs at a custom size, but only for frontends that support this feature.
- Clean up and optimize OAM attributes handling. (Special thanks to Twinaphex from libretro for pointing this out to us.)
- Add SSE2 optimizations to display capture operations.
- Do a whole bunch more code cleanup.
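
Several entries above mention templatizing parameters, sharing one rendering method between layers, and passing the destination buffer as a function parameter. The following is only a minimal sketch of that general pattern, not the emulator's actual code: `RenderLine`, `RenderLineForLayer`, the layer ID values, and the use of bit 15 as an opacity flag are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical shared line renderer. LAYERID is a compile-time parameter, so each
// instantiation has the layer ID baked in as a constant (no runtime lookup), at the
// cost of one generated copy of the loop per layer.
// The destination buffers arrive as parameters instead of object variables.
template <int LAYERID>
void RenderLine(uint16_t *dstColorLine, uint8_t *dstLayerIDLine,
                const uint16_t *srcLine, size_t width)
{
    assert(dstColorLine != nullptr && dstLayerIDLine != nullptr && srcLine != nullptr);

    for (size_t x = 0; x < width; x++)
    {
        // Assumption for this sketch: bit 15 clear means the source pixel is transparent.
        if ((srcLine[x] & 0x8000) == 0)
        {
            continue;
        }

        dstColorLine[x] = srcLine[x];
        dstLayerIDLine[x] = static_cast<uint8_t>(LAYERID);
    }
}

// Maps a runtime layer ID onto the matching template instantiation.
// IDs 0-3 stand in for the BG layers and 4 for the OBJ layer (hypothetical numbering).
inline void RenderLineForLayer(int layerID, uint16_t *dstColorLine, uint8_t *dstLayerIDLine,
                               const uint16_t *srcLine, size_t width)
{
    switch (layerID)
    {
        case 0: RenderLine<0>(dstColorLine, dstLayerIDLine, srcLine, width); break;
        case 1: RenderLine<1>(dstColorLine, dstLayerIDLine, srcLine, width); break;
        case 2: RenderLine<2>(dstColorLine, dstLayerIDLine, srcLine, width); break;
        case 3: RenderLine<3>(dstColorLine, dstLayerIDLine, srcLine, width); break;
        case 4: RenderLine<4>(dstColorLine, dstLayerIDLine, srcLine, width); break;
        default: break; // unknown layer: draw nothing
    }
}
```

Each `RenderLine<N>` is compiled separately, which is the usual source of the code-size growth noted above; keeping the templated body small and being selective about forced inlining is one common way to limit it.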