To do so I had to re-add the casting bloat removed in revision 6102. Also, for some odd reason the NVidia OpenCL drivers don't like 8-bit rotations, but are okay with 2- and 4-bit rotations. These are apparently bugs in the NVidia drivers that will hopefully be fixed in future versions.
Also, on Linux make sure the TextureDecoder.cl file is copied from the shared data directory to the user's directory.
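For readers unfamiliar with the rotation issue above, here is a minimal sketch in plain C of one plausible workaround: expressing an 8-bit rotation as two 4-bit rotations. The helper names rotl32 and rotl8_via_4 are hypothetical, and this is not necessarily how TextureDecoder.cl does it.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. Valid for n in 1..31. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

/* Hypothetical workaround: build an 8-bit rotation out of two
 * 4-bit rotations, which the drivers reportedly accept. */
static uint32_t rotl8_via_4(uint32_t x)
{
    return rotl32(rotl32(x, 4), 4);
}
```

Both forms give the same result for every input, so the split costs only one extra rotate instruction.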
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@6611 8ced0084-cf51-0410-be5f-012b33b47a6e
- RGBA8 (DX9/OGL): 10x speed up on Radeon 5450, 2x speed up on other cards due to swizzle registers.
- RGB565: 2-3x speed up on all hardware
- Removed OpenCL compiler warnings (e.g. redefine).
OpenCL is now optimally complete for DX9/OGL. The code is very fast on all supported hardware. No more updates are needed unless the spec changes or drivers improve. When I started, the OpenCL code was as slow as or slower than the CPU. Now, using the lowest-end Radeon that supports the code, a Mobility Radeon 45xx, I experience a substantial 2-10x speedup over the CPU. The benefits are more pronounced with modern hardware: a Radeon 5870 runs this code 20x faster than a 4550. Even ignoring speedups, the code benefits users by keeping intermittent texture loads off the CPU (unless the GPU is your bottleneck), which leaves the CPU free for more important tasks.
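As a reference for what the RGB565 path computes per pixel, here is a minimal scalar sketch in plain C. It assumes the 16-bit value has already been byte-swapped to host order, and it uses bit replication to widen each channel. The type rgba8 and the function decode_rgb565 are hypothetical names, not taken from TextureDecoder.cl.

```c
#include <stdint.h>

/* Hypothetical output pixel: 8 bits per channel. */
typedef struct { uint8_t r, g, b, a; } rgba8;

/* Decode one RGB565 pixel (RRRRRGGG GGGBBBBB, host byte order). */
static rgba8 decode_rgb565(uint16_t px)
{
    uint8_t r5 = (uint8_t)((px >> 11) & 0x1F);
    uint8_t g6 = (uint8_t)((px >> 5) & 0x3F);
    uint8_t b5 = (uint8_t)(px & 0x1F);
    rgba8 out;

    /* Replicate the top bits into the low bits so that the
     * maximum channel value maps to 255 and zero maps to 0. */
    out.r = (uint8_t)((r5 << 3) | (r5 >> 2));
    out.g = (uint8_t)((g6 << 2) | (g6 >> 4));
    out.b = (uint8_t)((b5 << 3) | (b5 >> 2));
    out.a = 0xFF; /* RGB565 carries no alpha, so treat it as opaque */
    return out;
}
```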
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@5775 8ced0084-cf51-0410-be5f-012b33b47a6e
Changes:
- IA4: 2x Speed up for all hardware and ATI glitch fixed (blocky text)
- IA8: 2x Speed up for all hardware
- New DX11 OCL Textures: I4, I8, IA4, IA8
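For context, a minimal scalar sketch in plain C of an IA4 decode, assuming the commonly documented layout of alpha in the high nibble and intensity in the low nibble. The names ia8_pixel, expand4to8 and decode_ia4 are hypothetical, and the real kernel processes several pixels at once.

```c
#include <stdint.h>

/* Hypothetical output: 8-bit intensity plus 8-bit alpha. */
typedef struct { uint8_t intensity, alpha; } ia8_pixel;

/* Widen a 4-bit value to 8 bits by repeating the nibble,
 * so 0x0 maps to 0x00 and 0xF maps to 0xFF. */
static uint8_t expand4to8(uint8_t v)
{
    return (uint8_t)((v << 4) | v);
}

/* Decode one IA4 pixel, assumed layout AAAAIIII. */
static ia8_pixel decode_ia4(uint8_t px)
{
    ia8_pixel out;
    out.alpha = expand4to8((uint8_t)(px >> 4));
    out.intensity = expand4to8((uint8_t)(px & 0x0F));
    return out;
}
```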
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@5766 8ced0084-cf51-0410-be5f-012b33b47a6e
Changes:
- Strict casting as required by NVidia. Now NVidia cards should work.
- Fixed Alpha CMPR bug.
Please tell me if you find any bugs. The one known bug is that the 'Press' texture in Paper Mario, which is meant to flash rainbow colours, appears black. Other than that, everything should work on every video card.
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@5759 8ced0084-cf51-0410-be5f-012b33b47a6e
New OpenCL updates:
- OpenCL bug with ATI SDK (GPU or CPU) fixed.
- IA4 texture loop unrolled. 12x speed up on 4xxx series.
- Completed rewriting RGB5A3 texture decode. 20% faster.
- Redundant code removed from CMPR and RGB5A3 (Alpha, shift).
- Made use of optimised OpenCL functions (upsample, bitselect).
- Cleaner code.
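For readers who do not know the two OpenCL built-ins named above, here is a sketch of their semantics written as plain C equivalents for single scalar values. The function names upsample_u8 and bitselect_u32 are hypothetical; in OpenCL C the built-ins are upsample and bitselect and also work on vector types.

```c
#include <stdint.h>

/* Scalar equivalent of OpenCL upsample(uchar hi, uchar lo):
 * joins two 8-bit values into one 16-bit value. */
static uint16_t upsample_u8(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Scalar equivalent of OpenCL bitselect(a, b, c):
 * each result bit comes from a where the mask c is 0,
 * and from b where the mask c is 1. */
static uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}
```

These replace hand-written shift, mask and OR sequences, which is presumably why they help in a texture decoder that spends most of its time repacking bit fields.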
Tested and working with DX9 plugin. DX11 plugin will NOT work due to a recent commit affecting VideoCommon. You can use this file with an older DX11 plugin (~r5730), however.
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@5753 8ced0084-cf51-0410-be5f-012b33b47a6e
enable newline normalization
get revision number via `hg svn info` for svnrev.h
ignore incremental/generated binary files (windows/VS at least)
leave a comment if some files need native eol set in svnprops
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@5637 8ced0084-cf51-0410-be5f-012b33b47a6e
Fixed RGB5A3 decoding with alpha
New CMPR decoding: blocks with no alpha decode well, but I still have to figure out the problems with transparent blocks. Disabled for now.
Added better error reporting to the base OpenCL functions
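To illustrate the RGB5A3 case with alpha mentioned above, here is a minimal scalar sketch in plain C. It assumes the documented layout: when the top bit is set the pixel is opaque RGB555, otherwise it is 3-bit alpha followed by 4 bits per colour channel. The input is assumed to be in host byte order already, and the names rgba8 and decode_rgb5a3 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical output pixel: 8 bits per channel. */
typedef struct { uint8_t r, g, b, a; } rgba8;

/* Decode one RGB5A3 pixel (host byte order). */
static rgba8 decode_rgb5a3(uint16_t px)
{
    rgba8 out;
    if (px & 0x8000) {
        /* 1RRRRRGG GGGBBBBB: opaque, 5 bits per channel */
        uint8_t r5 = (uint8_t)((px >> 10) & 0x1F);
        uint8_t g5 = (uint8_t)((px >> 5) & 0x1F);
        uint8_t b5 = (uint8_t)(px & 0x1F);
        out.r = (uint8_t)((r5 << 3) | (r5 >> 2));
        out.g = (uint8_t)((g5 << 3) | (g5 >> 2));
        out.b = (uint8_t)((b5 << 3) | (b5 >> 2));
        out.a = 0xFF;
    } else {
        /* 0AAARRRR GGGGBBBB: 3-bit alpha, 4 bits per channel */
        uint8_t a3 = (uint8_t)((px >> 12) & 0x07);
        uint8_t r4 = (uint8_t)((px >> 8) & 0x0F);
        uint8_t g4 = (uint8_t)((px >> 4) & 0x0F);
        uint8_t b4 = (uint8_t)(px & 0x0F);
        out.r = (uint8_t)((r4 << 4) | r4);
        out.g = (uint8_t)((g4 << 4) | g4);
        out.b = (uint8_t)((b4 << 4) | b4);
        /* Repeat the 3 alpha bits to fill 8 bits: 7 maps to 255 */
        out.a = (uint8_t)((a3 << 5) | (a3 << 2) | (a3 >> 1));
    }
    return out;
}
```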
git-svn-id: https://dolphin-emu.googlecode.com/svn/trunk@4439 8ced0084-cf51-0410-be5f-012b33b47a6e