Commit Graph

6931 Commits

Author SHA1 Message Date
Adrian 4a3b04d4ee [XAM] Implemented XamGetCurrentTitleId 2023-06-09 19:43:15 -05:00
Gliniak 858af5ae75 [XAM] xeXamContentCreate - Disposition cleanup 2023-06-09 19:42:48 -05:00
Gliniak e110527bfe [Base] ListFiles: Prevent leakage of file descriptors 2023-06-09 19:41:27 -05:00
Wunkolo 6ee2e3718f [x64] Add AVX512 optimizations for `OPCODE_VECTOR_COMPARE_UGT`(Integer)
AVX512 has native unsigned integer comparison instructions, removing
the need to XOR the most-significant-bit with a constant in memory to
use the signed comparison instructions. These instructions only write to
a k-mask register though and need an additional call to `vpmovm2*` to
turn the mask-register into a vector-mask register.

As of Icelake:
`vpcmpu*` is all L3/T1
`vpmovm2d` is L1/T0.33
`vpmovm2{b,w}` is L3/T0.33

As of Zen4:
`vpcmpu*` is all L3/T0.50
`vpmovm2*` is all L1/T0.25
2023-05-29 14:57:09 -05:00
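A scalar sketch of the trick that commit 6ee2e3718f avoids, and of the native path it adds, may help here. This is an illustrative Python model of one 32-bit lane, not Xenia code; the function names are hypothetical.

```python
# Illustration only: scalar model of one 32-bit lane.
# All names here are hypothetical and do not come from the Xenia source.

SIGN_BIT = 0x80000000
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & SIGN_BIT else x


def ugt_via_signed(a: int, b: int) -> int:
    """Unsigned a > b using only a signed compare.

    Flipping the most-significant bit of both operands maps unsigned order
    onto signed order, which is the XOR-with-a-constant step the commit
    message describes.
    """
    flipped_a = to_signed32(a ^ SIGN_BIT)
    flipped_b = to_signed32(b ^ SIGN_BIT)
    return MASK32 if flipped_a > flipped_b else 0


def ugt_native(a: int, b: int) -> int:
    """Unsigned a > b done directly, then widened to a lane mask."""
    k_bit = (a & MASK32) > (b & MASK32)  # models the single k-mask bit
    return MASK32 if k_bit else 0        # models expanding that bit to a lane
```

Both functions return the same all-ones or all-zeros lane mask for any pair of 32-bit inputs; the second simply skips the sign-bit flip.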
Wunkolo 121bf93cbe [PPC] Implement `vsubcuw`
Other half of #2125. I don't know of any title that utilizes this instruction, but I went ahead and implemented it for completeness.

Verified the implementation with `instr__gen_vsubcuw` from #1348. Can be grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vsubcuw.s
```
2023-05-29 14:56:12 -05:00
Wunkolo 93b77fb775 [PPC] Implement `vaddcuw`
I don't know of any title that utilizes this instruction, but I went
ahead and implemented it for completeness.

Verified the implementation with `instr__gen_vaddcuw` from #1348. Can be
grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vaddcuw.s
```
2023-05-29 14:56:00 -05:00
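For reference, the two carry-out instructions from commits 121bf93cbe and 93b77fb775 can be modelled per 32-bit lane as below. This is a minimal Python sketch based on my reading of the AltiVec definitions (carry out of `a + b`, and carry out of `a + ~b + 1`); it is not the Xenia implementation, and the list-of-lanes representation is an assumption made for illustration.

```python
# Illustration only: per-lane model of the carry-out instructions.
# Vectors are represented as plain lists of four 32-bit integers.

MASK32 = 0xFFFFFFFF


def vaddcuw(a, b):
    """Each lane is 1 if the unsigned 32-bit add a + b carries out, else 0."""
    return [((x & MASK32) + (y & MASK32)) >> 32 for x, y in zip(a, b)]


def vsubcuw(a, b):
    """Each lane is the carry out of a + ~b + 1.

    That is 1 when no borrow occurs (a >= b unsigned) and 0 otherwise.
    """
    return [((x & MASK32) + (~y & MASK32) + 1) >> 32 for x, y in zip(a, b)]
```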
Triang3l ed64e3072b [GPU] Remove implicit bool cast in memexport checks 2023-05-05 21:38:45 +03:00
Triang3l 0e81293b02 [GPU] Remove a dangerous comment about break after exece [ci skip]
There can be jumps across an exece, so the code beyond it may still be
executed.
2023-05-05 21:32:02 +03:00
Triang3l 53f98d1fe6 [GPU/D3D12] Memexport from anywhere in control flow + 8/16bpp memexport
There's no limit on the number of memory exports in a shader on the real
Xenos, and exports can be done anywhere, including in loops. Now, instead
of deferring the exports to the end of the shader and assuming that export
allocs are executed only once, Xenia flushes exports when it reaches an
alloc, when it reaches the end of the shader, or when a pixel with
outstanding exports is killed. (On Xenos, allocs terminate memory exports,
as do individual ALU instructions with `serialize`. The latter case is not
handled, for simplicity: it's only truly mandatory to flush memory exports
before starting a new one.)

To determine which eM# registers need to be flushed to memory, Xenia
traverses the successors of each exec that potentially writes any eM#, and
records which eM# registers might have been written before each reached
control flow instruction, until a flush point or the end of the shader is
reached.
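
The traversal described in this paragraph can be sketched as a forward propagation over the control flow graph. The Python below is a simplified model under assumed data structures (`successors`, `writes` and `flush_points` are hypothetical names, and control flow instructions are reduced to integer indices); the actual translator code may be organised quite differently.

```python
# Illustration only: propagate "possibly written eM# registers" forward
# through a control flow graph until a flush point is reached.

def propagate_em_writes(successors, writes, flush_points):
    """Return, per instruction, the eM# indices possibly written before it.

    successors:   dict mapping an instruction index to its successor indices
    writes:       dict mapping an exec index to the set of eM# it may write
    flush_points: set of instruction indices at which exports are flushed
    """
    maybe_written = {node: set() for node in successors}
    for start, regs in writes.items():
        stack = [(succ, frozenset(regs)) for succ in successors[start]]
        while stack:
            node, pending = stack.pop()
            if pending <= maybe_written[node]:
                continue  # nothing new reaches this instruction (ends loops)
            maybe_written[node] |= pending
            if node in flush_points:
                continue  # the flush consumes the pending exports
            stack.extend((succ, pending) for succ in successors[node])
    return maybe_written
```

In this model, the set recorded at a flush point is what that flush would have to write out, and an instruction inside a loop accumulates writes arriving from every path that reaches it.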

Also, some games export to sub-32bpp formats. These are now supported via
an atomic AND that clears the bits of the dword to be replaced, followed by
an atomic OR that inserts the new byte/short.
2023-05-05 21:32:02 +03:00
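The sub-32bpp export described at the end of commit 53f98d1fe6 amounts to a two-step read-modify-write on the containing dword. The following Python sketch models it on a plain list of dwords; the function name, the little-endian byte placement within the dword, and the absence of real atomics are all simplifying assumptions, not details taken from the shader translator.

```python
# Illustration only: model of writing an 8-bit or 16-bit value into a
# dword-addressed buffer with an AND (clear) followed by an OR (insert).
# On the GPU both steps would be atomic operations; here they are plain
# Python updates. Byte placement within the dword is assumed little-endian.

def export_sub_dword(memory, byte_address, value, size_bytes):
    """Store `value` (1 or 2 bytes wide) at `byte_address` in `memory`."""
    dword_index = byte_address // 4
    shift = (byte_address % 4) * 8
    field_mask = ((1 << (size_bytes * 8)) - 1) << shift
    # Step 1: AND clears only the bits being replaced.
    memory[dword_index] &= ~field_mask & 0xFFFFFFFF
    # Step 2: OR inserts the new byte/short into the cleared field.
    memory[dword_index] |= (value << shift) & field_mask


buffer = [0xAABBCCDD]
export_sub_dword(buffer, 1, 0x11, 1)    # replace byte 1
export_sub_dword(buffer, 2, 0x2233, 2)  # replace the upper short
```

Because each step touches only the target field, neighbouring bytes in the same dword written by other invocations are preserved.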
Triang3l 8aaa6f1f7d [SPIR-V] Wrap 4-operand ops and 1-3-operand GLSL std calls 2023-04-19 21:44:24 +03:00
Triang3l 19d56001d2 [SPIR-V] Wrap NoContraction operations 2023-04-19 11:53:45 +03:00
Triang3l 78f1d55a36 [SPIR-V] Use Builder createSelectionMerge directly 2023-04-19 11:11:28 +03:00
Triang3l 64d2a80f79 [SPIR-V] Cleanup ALU emulation conditionals 2023-04-19 10:35:09 +03:00
Triang3l eede38ff63 [SPIR-V] Remove more vec2-4 reserve calls 2023-04-18 22:05:02 +03:00
Triang3l 887fda55c2 [SPIR-V] Remove temp reserve for 4 or less elements 2023-04-13 22:43:44 +03:00
Triang3l 75d805245d [DXBC] `discard` pixels from `kill` with ROV instead of returning
Keep the current lane active as it may be needed for derivatives.
2023-04-09 20:13:22 +03:00
Triang3l 88c645d818 [D3D12] Don't use emit_then_cut due to RDNA 3 crash 2023-04-09 18:07:44 +03:00
Triang3l baa2ff78d8 [Vulkan] Add missing stencil reference unpack in RT transfer + formatting fix 2023-03-30 22:40:40 +03:00
Triang3l c238d8af55 [Vulkan] Fix FragStencilRef store type 2023-03-30 22:28:56 +03:00
Wunkolo f357f26eae [Build] Add parallel PPC test generation
Utilizes `multiprocessing` to allow multiple PowerPC assembly tests to be
generated in parallel.

Some results on my i9-11900K (8c/16t):

Before:
```
Measure-Command {.\xb gentests}

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 11
Milliseconds      : 200
Ticks             : 112007585
TotalDays         : 0.000129638408564815
TotalHours        : 0.00311132180555556
TotalMinutes      : 0.186679308333333
TotalSeconds      : 11.2007585
TotalMilliseconds : 11200.7585
```

After:
```
Measure-Command {.\xb gentests}

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 5
Milliseconds      : 426
Ticks             : 54265895
TotalDays         : 6.28077488425926E-05
TotalHours        : 0.00150738597222222
TotalMinutes      : 0.0904431583333333
TotalSeconds      : 5.4265895
TotalMilliseconds : 5426.5895
```

This is a speedup of over **2x** (11.20 s down to 5.43 s)!
2023-02-05 20:56:37 -06:00
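The general pattern behind commit f357f26eae, fanning independent per-file jobs out to a `multiprocessing` pool, looks roughly like the sketch below. Only `multiprocessing.Pool` and its `map` method are real APIs here; `generate_test`, `generate_all` and the file names are hypothetical stand-ins and do not reflect what the `xb` script actually does.

```python
# Illustration only: run independent per-file jobs in a process pool.
import multiprocessing


def generate_test(source_name: str) -> str:
    """Hypothetical stand-in for assembling one PowerPC test source.

    A real job would invoke the assembler; this one only derives an
    output name so that the example stays self-contained.
    """
    return source_name.replace(".s", ".bin")


def generate_all(source_names, processes=None):
    """Run generate_test over all sources in parallel, preserving order."""
    # processes=None lets the pool default to the number of CPUs.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(generate_test, source_names)


if __name__ == "__main__":
    # The guard matters on platforms that spawn worker processes by
    # re-importing this module, such as Windows.
    print(generate_all(["instr_add.s", "instr_vaddcuw.s", "instr_vsubcuw.s"]))
```

The jobs must be independent of one another for this to be safe, which holds when each test source produces its own output files.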
Shoegzer 4a2f4d9cfe Add include to fix compiling 2023-01-29 21:10:20 +03:00
Gliniak 4e87d1f9d1 [Kernel/Thread] Set TLS slot to 0 while freeing 2023-01-28 17:49:12 -06:00
Wunkolo e55cb737c1 [x64] Add AVX512 optimization for `OPCODE_SELECT`(F64) 2022-12-28 14:20:20 -06:00
Wunkolo ba75a016b4 [x64] Add AVX512 optimization for `OPCODE_SELECT`(V128)
Uses `vpternlogd` to collapse the bitwise select operation into one
instruction, though it needs an extra `vmovdqa` instruction since
`vpternlogd` both reads and writes its first argument.
2022-12-28 14:20:20 -06:00
Wunkolo 7c21b327ff [x64] Add `x64_util.h`
Used to help with generating instruction-specific constants. Currently
used for the ternary-logic constants (`vpternlog*`).
2022-12-28 14:20:20 -06:00
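The ternary-logic constants mentioned in commits ba75a016b4 and 7c21b327ff follow a well-known recipe: evaluate the desired boolean expression on the three truth-table columns `0xF0`, `0xCC` and `0xAA`, and the low 8 bits of the result are the `vpternlog*` immediate. The Python sketch below demonstrates the recipe and checks it against a bitwise emulation; the helper names are hypothetical, and which operand Xenia uses as the selector is not stated in the log, so the operand order here is an assumption.

```python
# Illustration only: derive and check a vpternlog-style 8-bit immediate.

# Truth-table columns for the first, second and third operand.
A, B, C = 0xF0, 0xCC, 0xAA


def ternlog_imm(expr):
    """Evaluate a three-input bitwise expression on the columns."""
    return expr(A, B, C) & 0xFF


def ternlog(imm, a, b, c, bits=32):
    """Emulate the instruction: each result bit is a lookup into imm."""
    result = 0
    for i in range(bits):
        index = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        result |= ((imm >> index) & 1) << i
    return result


# Bitwise select with the first operand as the mask (assumed order):
# take bits of b where the mask is 1, and bits of c where it is 0.
SELECT_IMM = ternlog_imm(lambda a, b, c: (a & b) | (~a & c))

selected = ternlog(SELECT_IMM, 0xFFFF0000, 0x12345678, 0x9ABCDEF0)
```

With this operand order the immediate comes out as `0xCA`, and `selected` combines the upper half of the second operand with the lower half of the third.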
Gliniak eb25fe4f4a [CPU] Increase amount of possible labels used in FinalizationPass
Instead of using decimal notation for labels, let's use hexadecimal.
That will increase the number of possible combinations by a lot.
2022-12-28 14:19:55 -06:00
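The gain from commit eb25fe4f4a is easy to quantify if labels are assumed to occupy a fixed number of characters: each character then carries 16 possibilities instead of 10. The sketch below uses a width of 4 purely as an example; the real label width and format used by FinalizationPass are not given in the log, and both function names are hypothetical.

```python
# Illustration only: capacity of fixed-width numeric labels.
# The width of 4 is an arbitrary example, not the value used by Xenia.

def label_capacity(width: int, base: int) -> int:
    """Number of distinct labels expressible in `width` digits of `base`."""
    return base ** width


def format_hex_label(index: int, width: int = 4) -> str:
    """Format an index as a zero-padded hexadecimal label."""
    if index >= label_capacity(width, 16):
        raise ValueError("index does not fit in the label width")
    return format(index, "0{}x".format(width))


decimal_labels = label_capacity(4, 10)  # 10000
hex_labels = label_capacity(4, 16)      # 65536
```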
Joel Linn 9eef64d3fb [SDL2] Print version on startup 2022-12-28 14:19:02 -06:00
Joel Linn 76561d5add [SDL2] Update to version 2.24.2 2022-12-28 14:19:02 -06:00
p01arst0rm 12c8d5348c added fxaa LICENSE file 2022-12-28 14:18:25 -06:00
p01arst0rm 2c1aadd2d2 remove dlmalloc 2022-12-28 14:17:50 -06:00
p01arst0rm a1bb6cc142 moved vswhere to tools directory 2022-12-28 14:17:24 -06:00
Gliniak 859bb89555 [Kernel] Support for loading achievement data 2022-12-28 14:16:43 -06:00
Joel Linn da9c90835b [ImGui] Use new key API 2022-12-28 14:16:32 -06:00
Joel Linn b641e39c4d [ImGui] Use new ImageButton signature 2022-12-28 14:16:32 -06:00
Joel Linn d6b5cbd634 [ImGui] Fix deprecated SetCursorPos use 2022-12-28 14:16:32 -06:00
Joel Linn 29985f69e1 [ImGui] Fix fsr sharpness slider scaling
https://github.com/ocornut/imgui/issues/3361#issuecomment-1287045159
2022-12-28 14:16:32 -06:00
Joel Linn c8a39bad29 [ImGui] Update to v1.89 2022-12-28 14:16:32 -06:00
Joel Linn f452d6a007 [UI] Fix UB (moved mem) in file picker
- References to vector data become UB after vector size changes.
- Add one extra level of indirection to pin the wide string memory
  location regardless of vector memory
2022-12-28 14:16:32 -06:00
Joel Linn 7877331d8a [ImGui] Use ImDrawCmd::IdxOffset field
c80e8b964c
https://github.com/ocornut/imgui/issues/4845#issuecomment-1003329113
2022-12-28 14:16:32 -06:00
Joel Linn 8f88bb237e [ImGui] Fix empty IDs 2022-12-28 14:16:32 -06:00
Joel Linn b107af68bc [ImGui] Fix removed flags 2022-12-28 14:16:32 -06:00
Joel Linn f04cfb3b65 [ImGui] Update to v1.88 2022-12-28 14:16:32 -06:00
Triang3l e97eb75b94 [Vulkan] Update variableMultisampleRate comments (actually supported) [ci skip] 2022-12-04 14:55:56 +03:00
Triang3l 0b4f5ef286 [SPIR-V] Decorate whole gl_PerVertex with Invariant
Block members can be decorated with Invariant only since SPIR-V 1.5 Revision 2. In earlier versions, Invariant can be used only for variables. Mesa warns about this.
2022-12-03 14:27:43 +03:00
Joel Linn 7dd715ea6f [CI, Drone] Disable HighResolutionTimer test cases 2022-11-20 16:41:55 -06:00
Triang3l 778333b1b5 [UI] Fix ClearInput not called in ImGuiDrawer after deferred dialog removal
Also cleanup the code involved in dialog registration, and update the explanation of why dialog removal is delayed until the end of drawing (the original was written back when window listener and UI drawer callback registration during the execution of the callbacks was deferred, but that was wrong as that might result in execution of callbacks belonging to now-deleted objects).
2022-10-31 18:57:54 +03:00
Triang3l a37b57ca8d [GPU] Fix tiled mip tail extent calculation
Previously, for mips, the dimensions of the texture weren't rounded to powers of two before calculating the mip tail extent, resulting in the mip tail for a 260 blocks tall texture, that contains mips ending at Y of up to 36, having the Y extent calculated as 32. With rounding to powers of two, it would have been 64.

However, with the GetTiledAddressUpperBound functions, none of this is necessary at all (and neither is rounding the extents in TextureGuestLayout::Level to 32x32x4 blocks) - using the same code for calculating the XYZ extents of tiled textures as for linear textures now, which, for the mip tail, calculates the actual maximum coordinates of the mips stored in it - and rounding to tiles is done internally by GetTiledAddressUpperBound.
2022-10-23 21:26:47 +03:00
Triang3l 74f1f6bb6d [Vulkan] Check depthClamp feature 2022-10-23 19:01:17 +03:00
Triang3l 4add1f67b1 [D3D12] Replace unused shared memory view with a null view
Fixes the PIX validation warning about missing resource states on every guest draw. Also potentially prevents drivers from making assumptions about the shared memory buffer based on the bindings, though no such cases are currently known.
2022-10-23 18:09:47 +03:00
Wunkolo 5fde7c6aa5 [x64] Add AVX512 optimizations for `PERMUTE_V128`
Uses the single-instruction AVX512 `vperm*` instructions to accelerate
the `INT8_TYPE` and `INT16_TYPE` permutation opcodes.

The `INT8_TYPE` case is accelerated using the `AVX512VBMI` subset of
AVX512, available since Icelake (Intel) and Zen4 (AMD).
2022-10-21 08:47:31 -05:00