6913 Commits

Author SHA1 Message Date
Triang3l c238d8af55 [Vulkan] Fix FragStencilRef store type 2023-03-30 22:28:56 +03:00
Wunkolo f357f26eae [Build] Add parallel PPC test generation
Uses `multiprocessing` to generate multiple PowerPC assembly tests
in parallel.

Some results on my i9-11900k(8c/16t):

Before:
```
Measure-Command {.\xb gentests}

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 11
Milliseconds      : 200
Ticks             : 112007585
TotalDays         : 0.000129638408564815
TotalHours        : 0.00311132180555556
TotalMinutes      : 0.186679308333333
TotalSeconds      : 11.2007585
TotalMilliseconds : 11200.7585
```

After:
```
Measure-Command {.\xb gentests}

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 5
Milliseconds      : 426
Ticks             : 54265895
TotalDays         : 6.28077488425926E-05
TotalHours        : 0.00150738597222222
TotalMinutes      : 0.0904431583333333
TotalSeconds      : 5.4265895
TotalMilliseconds : 5426.5895
```

This is a speedup of over **2x**!
2023-02-05 20:56:37 -06:00
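A minimal sketch of the parallel approach described in this commit. The worker `generate_test` and the file names are hypothetical stand-ins; the real `xb gentests` logic is not shown in this log.

```python
import multiprocessing


def generate_test(source_name):
    # Hypothetical stand-in for assembling one PPC test source file.
    # It only derives an output name so the sketch stays runnable.
    return source_name.rsplit(".", 1)[0] + ".bin"


def generate_all(source_names, processes=None):
    # Pool.map spreads the inputs across worker processes and returns
    # the results in input order. processes=None uses the CPU count.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(generate_test, source_names)


if __name__ == "__main__":
    print(generate_all(["instr_add.s", "instr_sub.s", "instr_vperm.s"]))
```

Each test file is independent of the others, which is what makes a plain process pool a reasonable fit here.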
Shoegzer 4a2f4d9cfe Add include to fix compiling 2023-01-29 21:10:20 +03:00
Gliniak 4e87d1f9d1 [Kernel/Thread] Set TLS slot to 0 while freeing 2023-01-28 17:49:12 -06:00
Wunkolo e55cb737c1 [x64] Add AVX512 optimization for `OPCODE_SELECT`(F64) 2022-12-28 14:20:20 -06:00
Wunkolo ba75a016b4 [x64] Add AVX512 optimization for `OPCODE_SELECT`(V128)
Uses `vpternlogd` to collapse the bitwise select operation into one
instruction, though a `vmovdqa` instruction is still needed since
`vpternlogd` reads and writes its first argument.
2022-12-28 14:20:20 -06:00
Wunkolo 7c21b327ff [x64] Add `x64_util.h`
Used to help with generating instruction-specific constants.  Currently
used for the ternary-logic constants(`vpternlog*`).
2022-12-28 14:20:20 -06:00
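The ternary-logic constants mentioned above can be derived mechanically. The sketch below is an illustration in Python, not the contents of `x64_util.h`; the helper name `ternlog_imm8` is hypothetical. It uses the usual truth-table masks for the three operands of `vpternlog*`.

```python
# Truth-table masks for the three operands of a ternary-logic instruction.
A, B, C = 0xF0, 0xCC, 0xAA


def ternlog_imm8(func):
    # Evaluating the boolean function on the masks yields its 8-bit table.
    return func(A, B, C) & 0xFF


# Bitwise select: take bits of b where a is set, otherwise bits of c.
select_imm = ternlog_imm8(lambda a, b, c: (a & b) | (~a & c))
print(hex(select_imm))  # 0xca
```

Because the first operand is both an input and the destination, a select that must preserve its inputs needs a copy first, which matches the `vmovdqa` noted in the `OPCODE_SELECT`(V128) commit.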
Gliniak eb25fe4f4a [CPU] Increase amount of possible labels used in FinalizationPass
Instead of using decimal notation for labels, let's use hexadecimal.
That will increase the number of possible combinations by a lot.
2022-12-28 14:19:55 -06:00
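A rough illustration of why hexadecimal labels give more combinations. The fixed width of 4 characters and the helper names are assumptions for the example; the actual label format in FinalizationPass is not shown in this log.

```python
def label_capacity(width, base):
    # Number of distinct labels that fit in `width` characters.
    return base ** width


def format_label(index, width=4):
    # Zero-padded uppercase hexadecimal label (hypothetical format).
    return format(index, "0{}X".format(width))


print(label_capacity(4, 10))  # 10000
print(label_capacity(4, 16))  # 65536
print(format_label(43981))    # ABCD
```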
Joel Linn 9eef64d3fb [SDL2] Print version on startup 2022-12-28 14:19:02 -06:00
Joel Linn 76561d5add [SDL2] Update to version 2.24.2 2022-12-28 14:19:02 -06:00
p01arst0rm 12c8d5348c added fxaa LICENSE file 2022-12-28 14:18:25 -06:00
p01arst0rm 2c1aadd2d2 remove dlmalloc 2022-12-28 14:17:50 -06:00
p01arst0rm a1bb6cc142 moved vswhere to tools directory 2022-12-28 14:17:24 -06:00
Gliniak 859bb89555 [Kernel] Support for loading achievement data 2022-12-28 14:16:43 -06:00
Joel Linn da9c90835b [ImGui] Use new key API 2022-12-28 14:16:32 -06:00
Joel Linn b641e39c4d [ImGui] Use new ImageButton signature 2022-12-28 14:16:32 -06:00
Joel Linn d6b5cbd634 [ImGui] Fix deprecated SetCursorPos use 2022-12-28 14:16:32 -06:00
Joel Linn 29985f69e1 [ImGui] Fix fsr sharpness slider scaling
https://github.com/ocornut/imgui/issues/3361#issuecomment-1287045159
2022-12-28 14:16:32 -06:00
Joel Linn c8a39bad29 [ImGui] Update to v1.89 2022-12-28 14:16:32 -06:00
Joel Linn f452d6a007 [UI] Fix UB (moved mem) in file picker
- References to vector data become UB after vector size changes.
- Add one extra level of indirection to pin the wide string memory
  location regardless of vector memory
2022-12-28 14:16:32 -06:00
Joel Linn 7877331d8a [ImGui] Use ImDrawCmd::IdxOffset field
c80e8b964c
https://github.com/ocornut/imgui/issues/4845#issuecomment-1003329113
2022-12-28 14:16:32 -06:00
Joel Linn 8f88bb237e [ImGui] Fix empty IDs 2022-12-28 14:16:32 -06:00
Joel Linn b107af68bc [ImGui] Fix removed flags 2022-12-28 14:16:32 -06:00
Joel Linn f04cfb3b65 [ImGui] Update to v1.88 2022-12-28 14:16:32 -06:00
Triang3l e97eb75b94 [Vulkan] Update variableMultisampleRate comments (actually supported) [ci skip] 2022-12-04 14:55:56 +03:00
Triang3l 0b4f5ef286 [SPIR-V] Decorate whole gl_PerVertex with Invariant
Block members can be decorated with Invariant only since SPIR-V 1.5 Revision 2. In earlier versions, Invariant can be used only for variables. Mesa warns about this.
2022-12-03 14:27:43 +03:00
Joel Linn 7dd715ea6f [CI, Drone] Disable HighResolutionTimer test cases 2022-11-20 16:41:55 -06:00
Triang3l 778333b1b5 [UI] Fix ClearInput not called in ImGuiDrawer after deferred dialog removal
Also clean up the code involved in dialog registration, and update the explanation of why dialog removal is delayed until the end of drawing (the original was written back when window listener and UI drawer callback registration during the execution of the callbacks was deferred, but that was wrong as that might result in execution of callbacks belonging to now-deleted objects).
2022-10-31 18:57:54 +03:00
Triang3l a37b57ca8d [GPU] Fix tiled mip tail extent calculation
Previously, for mips, the dimensions of the texture weren't rounded to powers of two before calculating the mip tail extent. As a result, for a texture 260 blocks tall, whose mip tail contains mips ending at Y of up to 36, the Y extent was calculated as 32. With rounding to powers of two, it would have been 64.

However, with the GetTiledAddressUpperBound functions, none of this is necessary at all (and neither is rounding the extents in TextureGuestLayout::Level to 32x32x4 blocks) - using the same code for calculating the XYZ extents of tiled textures as for linear textures now, which, for the mip tail, calculates the actual maximum coordinates of the mips stored in it - and rounding to tiles is done internally by GetTiledAddressUpperBound.
2022-10-23 21:26:47 +03:00
Triang3l 74f1f6bb6d [Vulkan] Check depthClamp feature 2022-10-23 19:01:17 +03:00
Triang3l 4add1f67b1 [D3D12] Replace unused shared memory view with a null view
Fixes the PIX validation warning about missing resource states on every guest draw. Also potentially prevents drivers from making assumptions about the shared memory buffer based on the bindings, though no such cases are currently known.
2022-10-23 18:09:47 +03:00
Wunkolo 5fde7c6aa5 [x64] Add AVX512 optimizations for `PERMUTE_V128`
Uses the single-instruction AVX512 `vperm*` instructions to accelerate
the `INT8_TYPE` and `INT16_TYPE` permutation opcodes.

The `INT8_TYPE` case is accelerated using the `AVX512VBMI` subset of
AVX512, available since Ice Lake (Intel) and Zen 4 (AMD).
2022-10-21 08:47:31 -05:00
Wunkolo f207239349 [x64] Add `kX64EmitAVX512VBMI` feature-flag and detection
Allows access to byte-element 2-register permutations (32-byte lookup
tables) and to 64-bit multi-shifts.
This is added mainly to accelerate the assembly of our `PERMUTE`
opcode.
2022-10-21 08:47:31 -05:00
Wunkolo d73088e5ca [x64] Add AVX512 optimization for `OPCODE_VECTOR_SUB`(saturated)
Passes the `vsubuws` and `vsubsws` unit-tests from https://github.com/xenia-project/xenia/pull/1348
2022-10-21 08:45:43 -05:00
Triang3l cdb40ddb28 [DXBC] Fix interpolator copying from v# to r# in PS
The bit count was of `(1<<i)-1` itself (thus couldn't handle interpolators with a smaller index skipped), not of `bits&((1<<i)-1)`.
2022-10-18 13:12:37 +03:00
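The difference between the two bit counts in this fix can be shown with a small sketch. The function names and the example mask are hypothetical; only the two expressions come from the commit message.

```python
def popcount(value):
    # Number of set bits in a non-negative integer.
    return bin(value).count("1")


def packed_index(used_bits, i):
    # Count only the used interpolators with an index below i.
    return popcount(used_bits & ((1 << i) - 1))


def packed_index_buggy(i):
    # Counts every index below i, used or not, so it always returns i.
    return popcount((1 << i) - 1)


used = 0b1101  # interpolators 0, 2 and 3 are used; 1 is skipped
print(packed_index(used, 3))  # 2
print(packed_index_buggy(3))  # 3
```

With interpolator 1 skipped, the old expression points one slot too far for every interpolator after the gap.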
Triang3l 45050b2380 [GPU] Vulkan fragment shader interlock RB and related fixes/cleanup
Also fixes addressing of MSAA samples 2 and 3 for 64bpp color render targets in the ROV RB implementation on Direct3D 12.
Additionally, with FSI/ROV, alpha test and alpha to coverage are done only if the render target 0 was dynamically written to (according to the Direct3D 9 rules for writing to color render targets, though not sure if they actually apply to the alpha tests on Direct3D 9, but for safety).
There is also some code cleanup for things spotted during the development of the feature.
2022-10-09 22:06:41 +03:00
Joel Linn 9ab4db285c [Premake] Update premake-cmake
- Handle compiler flags per-file. Removes ffmpeg warnings
- Switch to JoelLinn fork since original author stopped maintaining
  and other forks don't seem to care about PRs
2022-09-22 06:36:43 -05:00
Rick Gibbed 3bfa3b05e1 Lint fix. 2022-09-22 06:34:21 -05:00
Wunkolo addd8c94e5 [x64] Add AVX512 optimization for `OPCODE_VECTOR_ADD`(saturated)
Uses a single `vpternlogd` to test for signed/unsigned
overflow/underflow. Then utilizes AVX512 mask operations to create
either `0x7FFFFFFF` or `0x80000000` arithmetically.
2022-09-14 11:39:03 -05:00
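A scalar model of the signed case may help in reading this commit. It is a sketch under assumptions, not the emitted AVX512 code: the overflow test below is one common three-input bitwise expression (the kind a single ternary-logic instruction can evaluate), and the commit's actual expression is not shown in this log.

```python
def to_signed32(x):
    # Wrap an integer to a signed 32-bit value.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def add_saturate_s32(a, b):
    wrapped = to_signed32(a + b)
    # Signed overflow: both inputs have a sign that the wrapped sum lacks.
    overflow = ((a ^ wrapped) & (b ^ wrapped)) < 0
    if not overflow:
        return wrapped
    # 0x7FFFFFFF plus the sign bit of a gives 0x7FFFFFFF or 0x80000000.
    saturated = 0x7FFFFFFF + (1 if a < 0 else 0)
    return to_signed32(saturated)


print(add_saturate_s32(2**31 - 1, 1))  # 2147483647
print(add_saturate_s32(-(2**31), -1))  # -2147483648
print(add_saturate_s32(-5, 3))         # -2
```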
Wunkolo 9fd684594b [x64] Add AVX512 optimization for `OPCODE_VECTOR_CONVERT_F2I`(unsigned)
`vcvttps2udq` already saturates overflowing and unordered values to `0xFFFFFFFF`. Using mask registers, zeroes are written to negative values within the same instruction.
2022-09-12 13:52:57 -05:00
Wunkolo 90fffe1de7 [PPC] Fix memory assert formatting
This was still using printf-style format specifiers, causing memory
asserts to show up like this while testing.

```
!> 0000438C Memory 10001040 assert failed:
!> 0000438C   Expected: %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X
!> 0000438C     Actual: %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X
!> 0000438C     TEST FAILED
```

Updated them so they format correctly:

```
!> 00002CCC Memory 10001040 assert failed:
!> 00002CCC   Expected: FC FD FE FF 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
!> 00002CCC     Actual: FC FD FE FF 00 00 00 00 00 00 00 00 00 00 00 00
!> 00002CCC     TEST FAILED
```
2022-09-05 13:47:48 -05:00
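The symptom above is what a brace-style formatter does with printf-style specifiers. The sketch uses Python's `str.format` as an analogue; which formatting library the emulator's logging uses is an assumption not stated in this log.

```python
data = bytes([0xFC, 0xFD, 0xFE, 0xFF])

printf_style = " ".join(["%02X"] * len(data))
brace_style = " ".join(["{:02X}"] * len(data))

# A brace-style formatter finds no replacement fields in a printf-style
# string, so the specifiers come out verbatim and the arguments are ignored.
print(printf_style.format(*data))  # %02X %02X %02X %02X
print(brace_style.format(*data))   # FC FD FE FF
```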
Wunkolo b0cc3db4d8 [x64] Add AVX512 optimization for `NOT_V128` 2022-09-05 13:47:30 -05:00
Triang3l 7595cdb52b [Vulkan] Non-GS point sprites + minor SPIR-V fixes 2022-07-27 17:14:28 +03:00
Triang3l ff7ef05063 [SPIR-V] Clamp cube face using NClamp, not NMax/FMin 2022-07-26 17:08:12 +03:00
Triang3l 66c995f3aa [SPIR-V] Saturate point sprite coordinates 2022-07-26 17:04:22 +03:00
Triang3l 8fb5da18ea [Vulkan] Add forgotten fullDrawIndexUint32 check 2022-07-26 16:24:14 +03:00
Triang3l 9fa41c27bc [Vulkan] Point sprite geometry shader 2022-07-26 16:01:20 +03:00
Triang3l f248e23079 [DXBC] Skip backface check in point PsParamGen 2022-07-25 21:48:25 +03:00
Triang3l 77e85ecaa4 [Vulkan] 32-bit index fetch without fullDrawIndexUint32 2022-07-25 16:53:12 +03:00
Triang3l 37579d3bf0 [GPU] Treat non-adaptive-tessellated patches as 1-control-point 2022-07-24 17:38:26 +03:00