- Revert the last resort execution of workFunc in Task::Impl::finish(). Windows now has much better compliance with the behavior of pthread_cond_wait(), so the last resort execution is no longer necessary.
- Add additional checks for workFunc in Task::Impl::execute() and Task::Impl::finish() to make their reentrancy more robust on Windows.
- Add a last resort execution of workFunc in Task::Impl::finish() in the case where taskProc() misses the wake up signal from Task::Impl::execute() when running on Windows.
- EXPERIMENTAL: Revert task.cpp and pthreads.c to what they were back in r5538, but change scond_wait() to explicitly unlock the mutex before calling WaitForSingleObject().
- When shutting down, ensure that the existing task is finished if its running before continuing with the shutdown process.
- Explicitly declare thunkTaskProc() as static.
- If a GPU engine is disabled or has master brightness at full intensity, fill the output framebuffer on line 191 instead of on line 0.
- Replace global variable Render3DFramesPerSecond with accessor method GPUSubsystem::GetFPSRender3D().
- Factor out the generic colorspace handling routines out of GPU.cpp/GPU.h into their own separate files.
- Add vectorized routines using AVX2 and AltiVec.
- Fix bug where the OBJ layer wasn’t doing the window test. Fixes graphical issues in Mario Kart DS. (Regression from r5515. Fixes bug #1572 and #1574.)
- The NOWINDOWSENABLEDHINT template parameter is no longer an optional hint; it is now required functionality. It has been renamed to WILLPERFORMWINDOWTEST to reflect this change.
- Window testing is now a per-scanline operation instead of a per-pixel operation. Removes the performance penalty of window testing at larger framebuffer sizes.
- In the OpenGL blitter, replace some calls to glBufferSubDataARB() with glMapBufferARB(). This, maybe, possibly, fixes an intermittent crash that can occur with the Intel HD Graphics 3000 OpenGL driver.
- Move towards completing support for changing the output framebuffer color format to RGB666 or RGB888. Significantly increases the generated code size, but this is necessary for performance. (Related to r5433. This rework is still incomplete.)
- Parse and cache the WININ and WINOUT registers, instead of using them directly.
- Parse and cache the Target1 bits of the BLDCNT register.
- Remove some template parameters which are now suspected to no longer improve performance, most notably LAYERID. Should significantly reduce the generated code size.