pcsx2/plugins/GSdx_legacy/res/glsl/nvidia_throughput.txt

27 lines
2.3 KiB
Plaintext

Table 2. Throughput of Native Arithmetic Instructions. (Number of Operations per Clock Cycle per Multiprocessor) Compute Capability
Architecture , FER , FER , KPL , MAX
32-bit floating-point add multiply multiply-add , 32 , 48 , 192 , 128
64-bit floating-point add multiply multiply-add , 16 , 4 , 8 , 1
32-bit floating-point reciprocal reciprocal square root log2f/exp2f/sine/cosine , 4 , 8 , 32 , 32
32-bit integer add extended-precision add subtract extended-precision subtract , 32 , 48 , 160 , 128
32-bit integer multiply multiply-add extended-precision multiply-add , 16 , 16 , 32 , Multiple instructions
32-bit integer shift , 16 , 16 , 32 , 64
compare minimum maximum , 32 , 48 , 160 , 64
32-bit integer bit reverse bit field extract/insert , 16 , 16 , 32 , 64
32-bit bitwise AND / OR / XOR , 32 , 160 , 160 , 128
count of leading zeros most significant non-sign bit , 16 , 16 , 32 , Multiple instructions
population count , 16 , 16 , 32 , 32
warp shuffle , N/A , N/A , 32 , 32
sum of absolute difference , 16 , 16 , 32 , 64
SIMD video instructions vabsdiff2 , N/A , N/A , 160 , Multiple instructions
SIMD video instructions vabsdiff4 , N/A , N/A , 160 , Multiple instructions
All other SIMD video instructions , 16 , 16 , 32 , Multiple instructions
Type conversions from 8/16-bit integer to 32-bit types , 16 , 16 , 128 , 32
Type conversions from and to 64-bit types , 16 , 4 , 8 , 4
All other type conversions , 16 , 16 , 32 , 32
Some tips:
* bit field operations are as fast as shift operations.