- The 3D renderers are now responsible for managing the texture unpack buffers instead of relying on the TexCacheItem itself to do it.
- The OpenGL 3D renderer now uses a fixed 4MB buffer for unpacking textures, instead of maintaining extra copies of each unpacked texture in main memory even after they’ve been uploaded to the GPU.
- Rework TexCacheItem::GetTexture() so that instantiating a new object, dumping the packed data, and dumping the palette are performed as separate operations.
- Invalid OpenGL textures are now updated instead of being completely replaced.
- NDSTextureUnpack4x4() now uses the srcIndex pointer parameter instead of recalculating the palette address.
- Delete the now obsolete MemSpan-based texture unpacking functions.
- Texture items in cache are now searched using std::map instead of std::multimap.
- Texture item search keys now ignore the render-specific bits of the texture attributes (the repeat mode, flip mode, and coordinate transformation mode bits). This helps reduce the number of duplicate textures in the cache.
- Searching for a texture and unpacking a texture are now performed as separate operations.
- Texture unpacking functions now use restrict pointers instead of normal pointers.
- Revert the last resort execution of workFunc in Task::Impl::finish(). Windows now has much better compliance with the behavior of pthread_cond_wait(), so the last resort execution is no longer necessary.
- Add additional checks for workFunc in Task::Impl::execute() and Task::Impl::finish() to make their reentrancy more robust on Windows.
- Add a last resort execution of workFunc in Task::Impl::finish() for the case where taskProc() misses the wake-up signal from Task::Impl::execute() when running on Windows.
- EXPERIMENTAL: Revert task.cpp and pthreads.c to what they were back in r5538, but change scond_wait() to explicitly unlock the mutex before calling WaitForSingleObject().
- When shutting down, ensure that the existing task is finished if it is running before continuing with the shutdown process.
- Explicitly declare thunkTaskProc() as static.
- If a GPU engine is disabled or has master brightness at full intensity, fill the output framebuffer on line 191 instead of on line 0.
- Replace global variable Render3DFramesPerSecond with accessor method GPUSubsystem::GetFPSRender3D().
- Factor the generic colorspace handling routines out of GPU.cpp/GPU.h into their own separate files.
- Add vectorized routines using AVX2 and AltiVec.
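
The texture cache key change noted above (std::map lookup with the render-specific bits ignored) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `CachedTexture`, `TextureCache`, `MakeTextureKey`, and `kRenderOnlyBits` are hypothetical names, and the mask assumes the usual NDS TEXIMAGE_PARAM layout in which bits 16-19 hold the repeat and flip flags and bits 30-31 hold the coordinate transformation mode.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for a cached texture; not the project's TexCacheItem.
struct CachedTexture {
    uint32_t textureAttributes;  // attributes as first seen, render bits included
    uint32_t paletteAttributes;
};

// Assumed layout: bits 16-19 = repeat S/T and flip S/T,
// bits 30-31 = texture coordinate transformation mode.
// These bits affect how a texture is sampled, not its unpacked texels.
constexpr uint32_t kRenderOnlyBits = 0xC00F0000u;

using TextureKey = uint64_t;

// Build a search key from the bits that actually select texture data.
inline TextureKey MakeTextureKey(uint32_t textureAttributes, uint32_t paletteAttributes) {
    const uint32_t dataBits = textureAttributes & ~kRenderOnlyBits;
    return (static_cast<uint64_t>(paletteAttributes) << 32) | dataBits;
}

class TextureCache {
public:
    // Search only: returns nullptr when no matching item is cached.
    CachedTexture* Find(uint32_t texAttr, uint32_t palAttr) {
        auto it = items_.find(MakeTextureKey(texAttr, palAttr));
        return (it != items_.end()) ? &it->second : nullptr;
    }

    // Returns the existing item for this key, or adds a new one.
    // std::map::insert leaves an existing entry untouched.
    CachedTexture& FindOrAdd(uint32_t texAttr, uint32_t palAttr) {
        auto result = items_.insert(
            std::make_pair(MakeTextureKey(texAttr, palAttr), CachedTexture{texAttr, palAttr}));
        return result.first->second;
    }

    std::size_t Size() const { return items_.size(); }

private:
    // One item per key, so std::map is enough; no std::multimap needed.
    std::map<TextureKey, CachedTexture> items_;
};
```

With a key built this way, two polygons that reference the same texture data but differ only in repeat, flip, or coordinate transformation settings resolve to a single cache entry.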
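
The Task-related entries above deal with a worker thread missing its wake-up signal. One common way to avoid that class of problem is to wait on a condition variable with a predicate that checks shared state, so a signal sent before the worker starts waiting is not lost. The sketch below uses the C++ standard library rather than the project's own pthread wrappers; `SimpleTask` and its members are hypothetical and do not reproduce Task::Impl.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-worker task; an illustration, not the project's Task::Impl.
class SimpleTask {
public:
    SimpleTask() : worker_(&SimpleTask::Run, this) {}

    ~SimpleTask() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            exit_ = true;
        }
        wake_.notify_all();
        worker_.join();  // any pending work is completed before the thread exits
    }

    // Queue one unit of work, waiting first if earlier work is still pending.
    void Execute(std::function<void()> work) {
        std::unique_lock<std::mutex> lock(mutex_);
        done_.wait(lock, [this] { return !work_; });
        work_ = std::move(work);
        wake_.notify_one();
    }

    // Block until the queued work, if any, has completed.
    void Finish() {
        std::unique_lock<std::mutex> lock(mutex_);
        done_.wait(lock, [this] { return !work_; });
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // The predicate is evaluated under the mutex before sleeping,
            // so work queued before this call is still seen.
            wake_.wait(lock, [this] { return exit_ || work_; });
            if (!work_) return;  // woken with nothing queued: exit was requested

            std::function<void()> work = work_;
            lock.unlock();
            work();              // run the work without holding the mutex
            lock.lock();
            work_ = nullptr;
            done_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable done_;
    std::function<void()> work_;
    bool exit_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

Because both waits test state guarded by the mutex, correctness does not depend on the exact timing of the notification, which is the property a last resort execution path would otherwise have to compensate for.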