Commit Graph

99 Commits

Author SHA1 Message Date
Gloria d62fe21d47
RADV bug fix (#139)
* Use correct typing for stencil, dispatch launch on UI thread
* Clean up some LaunchPath code
2023-03-06 08:31:05 +01:00
chss95cs@gmail.com 20638c2e61 Use Sleep(0) instead of SwitchToThread; this should waste less power and help the OS with scheduling.
PM4 buffer handling made a virtual member of CommandProcessor, with the implementation/declaration placed into reusable macro files. This is probably the biggest boost here.
Optimized SET_CONSTANT/ LOAD_CONSTANT pm4 ops based on the register range they start writing at, this was also a nice boost

Expose X64 extension flags to code outside of x64 backend, so we can detect and use things like avx512, xop, avx2, etc in normal code
Add freelists for HIR structures to try to reduce the number of last level cache misses during optimization (currently disabled... fixme later)

Analyzed PGO feedback and reordered branches, uninlined functions, moved code out into different functions based on info from it in the PM4 functions, this gave like a 2% boost at best.

Added support for the db16cyc opcode, which is used often in Xbox 360 spinlocks. Before, it was just being translated to a nop; now on x64 we translate it to _mm_pause, but may change that in the future to reduce CPU time wasted.

Texture util: all our divisors were powers of 2, so we now look up a shift instead of dividing. This made texture scaling slightly faster, more so on Intel processors, which seem to be worse at integer division. GetGuestTextureLayout is now a little faster, although it is still one of the heaviest functions in the emulator when scaling is on.
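The shift lookup described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `ShiftForPow2Divisor` and `DivideByPow2` are hypothetical, and the shift is computed once with a loop rather than read from whatever table the emulator actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Xenia's actual code): returns log2(divisor) for a
// power-of-two divisor, so that x / divisor can be computed as x >> shift.
inline uint32_t ShiftForPow2Divisor(uint32_t divisor) {
  // Precondition: divisor is a nonzero power of two.
  assert(divisor != 0 && (divisor & (divisor - 1)) == 0);
  uint32_t shift = 0;
  while ((divisor >> shift) != 1) {
    ++shift;
  }
  return shift;
}

// Divides an unsigned value by a power of two using a precomputed shift
// instead of an integer division instruction.
inline uint32_t DivideByPow2(uint32_t value, uint32_t shift) {
  return value >> shift;
}
```

The shift would be computed once per divisor and reused in the hot loop, which is where the saving over a hardware divide would come from.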

xe_unlikely_mutex was not a good choice for the guest clock lock. Running theory: on Intel processors another thread may take a significant time to update the clock, maybe because of the uint64 division? Really not sure, but switched it to xe_mutex. This fixed audio stutter that I had introduced to 1 or 2 games and fixed performance on that N64 Rare game with the monkeys.
Took another crack at DMA implementation, another failure.
Instead of passing as a parameter, keep the ringbuffer reader as the first member of commandprocessor so it can be accessed through this
Added macro for noalias
Applied noalias to Memory::LookupHeap. This reduced the size of the executable by 7 kb.
Reworked kernel shim template, this shaved like 100kb off the exe and eliminated the indirect calls from the shim to the actual implementation. We still unconditionally generate string representations of kernel calls though :(, unless it is kHighFrequency

Add nvapi extensions support, currently unused. Will use CPUVISIBLE memory at some point
Inserted prefetches in a few places based on feedback from vtune.
Add native implementation of SHA int8 if all elements are the same

Vectorized comparisons for SetViewport, SetScissorRect
Vectorized ranged comparisons for WriteRegister
Add XE_MSVC_ASSUME
Move FormatInfo::name out of the structure, instead look up the name in a different table. Debug related data and critical runtime data are best kept apart
Templated UpdateSystemConstantValues based on ROV/RTV and primitive_polygonal
Add ArchFloatMask functions, these are for storing the results of floating point comparisons without doing costly float->int pipeline transfers (vucomiss/setb)
Use floatmasks in UpdateSystemConstantValues for checking if dirty, only transfer to int at end of function.
Instead of dirty |= (x != y) in UpdateSystemConstantValues, now we do dirty_u32 |= (x ^ y). If any of them are not equal, dirty_u32 will be nonzero; if they're all equal it will be zero. This is more friendly to register renaming, and the lack of dependencies on EFLAGS lets the compiler reorder better.
Add PrefetchSamplerParameters to D3D12TextureCache
use PrefetchSamplerParameters in UpdateBindings to eliminate cache misses that vtune detected
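The XOR-based dirty check mentioned above for UpdateSystemConstantValues can be illustrated with a small sketch. The struct and function names here (`ExampleConstants`, `FloatBits`, `UpdateExampleConstants`) are hypothetical and stand in for the real system constants, which are not shown in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a block of system constants.
struct ExampleConstants {
  uint32_t flags;
  float scale;
  uint32_t offset;
};

// Reinterprets a float as its raw 32-bit pattern without undefined behavior.
inline uint32_t FloatBits(float value) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Accumulates old ^ new for every field. The accumulator stays zero only if
// every field is bit-identical, so no per-field compare or branch is needed.
// Note: this compares bit patterns, so 0.0f and -0.0f count as different.
inline bool UpdateExampleConstants(ExampleConstants* current,
                                   const ExampleConstants& next) {
  assert(current != nullptr);
  uint32_t dirty_u32 = 0;
  dirty_u32 |= current->flags ^ next.flags;
  dirty_u32 |= FloatBits(current->scale) ^ FloatBits(next.scale);
  dirty_u32 |= current->offset ^ next.offset;
  *current = next;
  return dirty_u32 != 0;
}
```

The single conversion to bool at the end mirrors the idea of only transferring the result to an integer flag once, at the end of the function.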

Add PrefetchTextureBinding to D3D12TextureCache
Prefetch texture bindings to get rid of more misses vtune detected (more accesses out of order with random strides)
Rewrote DMAC, still terrible though and have disabled it for now.
Replace tiny memcmp of 6 U64 in render_target_cache with an inline loop. MSVC fails to make it a loop and instead does a thunk to their memcmp function, which is optimized for larger sizes.
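A fixed-size comparison of six 64-bit words, written as an inline loop rather than a memcmp call, might look like the sketch below. The names `kKeyWordCount` and `KeysEqual` are assumptions for illustration and do not come from render_target_cache.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed key size for this sketch: six 64-bit words, as in the entry above.
constexpr size_t kKeyWordCount = 6;

// Compares two fixed-size keys with a short loop that the compiler can
// unroll, avoiding a call into a general-purpose memcmp.
inline bool KeysEqual(const uint64_t* a, const uint64_t* b) {
  assert(a != nullptr && b != nullptr);
  uint64_t diff = 0;
  for (size_t i = 0; i < kKeyWordCount; ++i) {
    diff |= a[i] ^ b[i];
  }
  return diff == 0;
}
```

Unlike memcmp, this only answers equal or not equal and gives no ordering, which is enough for a cache-key match.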

PrefetchTextureBinding in AreActiveTextureSRVKeysUpToDate
Replace memcmp calls for pipelinedescription with handwritten cmp
Directly write some registers that don't have special handling in PM4 functions
Changed EstimateMaxY to try to eliminate mispredictions that vtune was reporting, msvc ended up turning the changed code into a series of blends

in ExecutePacketType3_EVENT_WRITE_EXT, instead of writing extents to an array on the stack and then doing xe_copy_and_swap_16 of the data to its dest, pre-swap each constant and then store those. msvc manages to unroll that into wider stores
stop logging XE_SWAP every time we receive XE_SWAP, stop logging the start and end of each viz query

Prefetch watch nodes in FireWatches based on feedback from vtune
Removed dead code from texture_info.cc
NOINLINE on GpuSwap, PGO builds did it so we should too.
2022-09-11 14:14:48 -07:00
illusion0001 f62ac9868a Make portable default for new install 2022-09-04 22:42:40 -05:00
Gliniak 6e1e62378f Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-07-17 21:27:52 +02:00
Triang3l 037310f8dc [Android] Unified xenia-app with windowed apps and build prerequisites 2022-07-11 21:45:57 +03:00
Gliniak 1d00372e6b Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-07-10 10:50:39 +02:00
Triang3l 88c055eb30 [CPU] Null backend enough for GPU trace viewing 2022-07-06 23:28:06 +03:00
Gliniak 6e753c6399 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-07-04 08:11:04 +02:00
Triang3l bbae909fd7 [GPU] Reasons to keep non-Vulkan backends [ci skip] 2022-07-03 20:39:44 +03:00
Triang3l ed61e15fc3 [App] Make D3D12 the default GPU backend on Windows again 2022-07-03 19:49:11 +03:00
Gliniak 5247220e73 Merge remote-tracking branch 'GliniakRepo/patchingSystem' into canary_pr 2022-05-19 10:01:33 +02:00
Margen67 99e3a1a4b1 Disable Vulkan 2022-05-19 09:39:58 +02:00
Gliniak c73cdb506a Initial support for xex patching 2022-04-26 13:26:49 +02:00
Triang3l c47b874a4d Merge branch 'master' into vulkan 2022-03-21 20:57:02 +03:00
Joel Linn 986dcf4f65 [Base] Check success of sync primitive creation
- Mainly use `assert`s, since failure is very rare
- Forward failure of `CreateSemaphore` to guests because it is easier
  to trigger with invalid initial parameters.
2022-03-08 12:17:57 -06:00
Triang3l 922efb13ce Merge branch 'master' into vulkan 2022-02-03 21:12:10 +03:00
Triang3l fe3f0f26e4 [UI] Image post-processing and full presentation/window rework
[GPU] Add FXAA post-processing
[UI] Add FidelityFX FSR and CAS post-processing
[UI] Add blue noise dithering from 10bpc to 8bpc
[GPU] Apply the DC PWL gamma ramp closer to the spec, supporting fully white color
[UI] Allow the GPU CP thread to present on the host directly, bypassing the UI thread OS paint event
[UI] Allow variable refresh rate (or tearing)
[UI] Present the newest frame (restart) on DXGI
[UI] Replace GraphicsContext with a far more advanced Presenter with more coherent surface connection and UI overlay state management
[UI] Connect presentation to windows via the Surface class, not native window handles
[Vulkan] Switch to simpler Vulkan setup with no instance/device separation due to interdependencies and to pass fewer objects around
[Vulkan] Lower the minimum required Vulkan version to 1.0
[UI/GPU] Various cleanup, mainly ComPtr usage
[UI] Support per-monitor DPI awareness v2 on Windows
[UI] DPI-scale Dear ImGui
[UI] Replace the remaining non-detachable window delegates with unified window event and input listeners
[UI] Allow listeners to safely destroy or close the window, and to register/unregister listeners without use-after-free and the ABA problem
[UI] Explicit Z ordering of input listeners and UI overlays, top-down for input, bottom-up for drawing
[UI] Add explicit window lifecycle phases
[UI] Replace Window virtual functions with explicit desired state, its application, actual state, its feedback
[UI] GTK: Apply the initial size to the drawing area
[UI] Limit internal UI frame rate to that of the monitor
[UI] Hide the cursor using a timer instead of polling due to no repeated UI thread paints with GPU CP thread presentation, and only within the window
2022-01-29 13:22:03 +03:00
Triang3l ecccd02f8a Merge branch 'master' into vulkan 2021-09-12 14:10:36 +03:00
Triang3l 6ce5330f5f [UI] Loop thread to main thread WindowedAppContext 2021-08-28 19:38:24 +03:00
emoose f2c706f943 [App] Add cache:\ mount for older games that use it 2021-08-18 17:34:59 -05:00
Triang3l 4617dc5569 Merge branch 'master' into vulkan 2020-12-13 20:04:12 +03:00
Triang3l 9a4643d0f2 [GPU] Non-ROV f24 trunc/round, host shader modifications, cache dir 2020-12-07 22:31:46 +03:00
Joel Linn b30fcbd29a [HID] Change order to xinput, sdl, winkey 2020-11-28 14:22:50 -06:00
Triang3l 48c97dd3b4 [Base] Android and Arm platform defines 2020-11-21 16:26:26 +03:00
Triang3l dfa181a529 [Vulkan] Provider init, Android platform defines 2020-09-06 22:08:36 +03:00
Triang3l 7b93670dbd [Vulkan] Remove old Vulkan code, change shaders directory, create empty Vulkan backend 2020-08-31 21:44:29 +03:00
Triang3l dffdf92e39 [Vulkan] Remove stillborn vk project 2020-08-22 23:31:52 +03:00
Jonathan Goyvaerts 92e445f01a [App] Add portable as a launch option in addition to checking for portable.txt existence 2020-08-21 20:31:19 +03:00
gibbed fdfc55c8fd [App] Support a relative content path. 2020-04-13 12:57:14 -05:00
Sandy Carter c8e64da4eb filesystem: use std for PathExists
Remove custom platform implementation of `PathExists` and replace uses
with `std::filesystem::exists`.
2020-04-09 09:44:48 -05:00
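For context, a call site after this kind of change might look like the following sketch. The wrapper name `ExamplePathExists` is hypothetical and is kept only to show the standard call in isolation; the commit itself replaces uses directly with `std::filesystem::exists`.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical wrapper for illustration. Uses the non-throwing overload of
// std::filesystem::exists, which reports errors through std::error_code and
// returns false when the status cannot be determined.
inline bool ExamplePathExists(const std::filesystem::path& path) {
  std::error_code ec;
  return std::filesystem::exists(path, ec);
}
```

The throwing overload, `std::filesystem::exists(path)`, is the shorter form when exceptions are acceptable at the call site.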
gibbed a48bb71c2f Overhaul logging. 2020-04-07 16:09:41 -05:00
gibbed 5bf0b34445 C++17ification.
C++17ification!

- Filesystem interaction now uses std::filesystem::path.
- Uses of const char* and std::string have been changed to
  std::string_view where appropriate.
- Usage of printf-style functions changed to use fmt.
2020-04-07 16:09:41 -05:00
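As a rough illustration of the first two bullets in this entry, a helper in the post-C++17 style could take a `std::filesystem::path` and a `std::string_view` rather than raw C strings. `JoinContentPath` is a made-up name for this sketch, not a function from the codebase.

```cpp
#include <filesystem>
#include <string_view>

// Hypothetical helper: appends a file name, passed as a non-owning
// std::string_view, to a root directory held as a std::filesystem::path.
inline std::filesystem::path JoinContentPath(
    const std::filesystem::path& root, std::string_view name) {
  return root / std::filesystem::path(name);
}
```

Taking `std::string_view` lets callers pass string literals, `std::string`, or substrings without an extra allocation, while `std::filesystem::path` handles the platform's separator.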
Triang3l cde092ece1 [D3D12] Persistent shader and PSO storage 2020-03-21 19:22:19 +03:00
Triang3l b1d3fd2ad3 [App/Config] Add storage_root cvar and make content_root inside it by default, move game configs from content 2020-03-13 09:42:29 +03:00
Rick Gibbed 4ca0d0a656 [App] Remove inadvertent constexpr. 2020-02-22 13:50:17 -06:00
gibbed a6e6f0f7bf Lint cleanup. 2020-02-22 13:29:07 -06:00
Joel Linn 160f218210 [APU/Linux] Implement cross platform audio using SDL2 library. 2020-02-10 14:01:47 -06:00
Joel Linn 64f3925c7d [HID/Linux] Implement cross platform controller input using SDL2 library. 2020-02-10 13:41:19 -06:00
Prism Tutaj fc37f3e93a [App] Fix discord cvar 2020-02-09 16:21:51 -06:00
gibbed 3e6c2bb47c Fix up handling of positional options in cvar handling.
- Fix up handling of positional options in cvar handling so that executables
  other than app can handle them properly.
- Fix command-line arguments for xenia-vfs-dump.
2019-08-24 07:41:55 -05:00
Triang3l 2334e475de [Vulkan v2] Physical device, [D3D12] Small cleanup 2019-08-08 00:08:20 +03:00
gibbed a1c9d57afc [App] Make target into a transient cvar. 2019-08-04 02:18:03 -05:00
gibbed b2f62b1982 Clean up cvars (rename, recategorize). 2019-08-03 23:46:03 -05:00
gibbed 0ac83f99dc [App] Add winkey input driver last. 2019-08-03 20:47:39 -05:00
gibbed f2dac86b3f [App] Use make_unique when creating a derived type instance. 2019-08-03 20:46:03 -05:00
gibbed 02ea74becd [App] Only create input nop driver when explicitly requested. 2019-08-03 20:07:19 -05:00
gibbed e5eb59df71 [App] Remove unnecessary type aliasing (which also broke Travis). 2019-08-03 18:10:49 -05:00
gibbed f5cddbbf3f [App] Simplify and improve factory template.
[App] Rework audio and input system creation.
2019-08-03 17:36:50 -05:00
gibbed 848e2a4088 [App] Rework graphics system creation. 2019-08-03 16:42:38 -05:00
Triang3l 890a32bd98 [App] Only start D3D12 if DLL exists 2019-08-03 22:33:09 +03:00