Commit Graph

6130 Commits

Author SHA1 Message Date
Gliniak 2509b03b81 [Kernel] Disable high definition mode for internal resolutions lower than 1024x768
This will allow some games to render in lower than normal resolution
2023-10-22 20:24:43 +02:00
Gliniak e633bf3d98 [Kernel] Add support for positive timeout_ticks
used in NetDll_WSAWaitForMultipleEvents
2023-10-22 15:15:49 +02:00
Gliniak 91c67b9af2 [Emulator] Separate host and guest cache directories
Since now modules and shaders are stored within cache_host directory.
This will require coping modules and shaders directories from previous directory to new one
2023-10-21 18:28:14 +02:00
disjtqz a9dd1c89bc [Kernel] shrink ObCreateObject 2023-10-21 17:33:07 +02:00
disjtqz 6a1d612495 [Kernel] Stub out xeObSplitName until cleaner version can be written 2023-10-21 17:33:07 +02:00
disjtqz 275454089e [Kernel] Implement ObCreateObject 2023-10-21 17:33:07 +02:00
none d8aa14da73
Small fixes for better cross-platform compatibility (#200)
* Add ifdef check before the Microsoft-specific movsq in memory.cc

* Added ifdef before Microsoft-specific movsq and replaced with memcpy in other cases. In memory.cc

* Update image_sha_bytes_ from 16 to 20  xex_module.h

The value 16 is less than the expected value 20, causing a buffer overflow during sha1 finalization.

* Update image_sha_bytes_ loop from 16 to 20 iterations xex_module.cc

* Update mapped_memory_posix.cc: Must resize file to map_length.

* Should not map nullptr with MAP_FIXED flag. Update memory_posix.cc.
2023-10-21 17:07:29 +02:00
Adrian 9f1605a2e7 launch module fallback 2023-10-21 16:46:31 +02:00
Gliniak d84df6e47f [Memory] Added check to prevent crashes when title tries to get access to unavailable range 2023-10-21 10:15:26 +02:00
Gliniak d532d8eb61 [Memory] Adjust allocation range in 64k virtual range
Previosly there was no consideration of cutted space for XPS and MMIO
2023-10-21 10:15:26 +02:00
Gliniak 413d60bf54 [XAM] Added XamUserCreateStatsEnumerator 2023-10-20 17:54:54 +02:00
Gliniak 068b3811d9 [XAM] Added missing disposition return value from XamContentCreate 2023-10-20 12:21:11 +02:00
Gliniak bff24e3c18 [XAM] Fixed incorrect enumerator size
This should fix MCLA
2023-10-15 18:15:33 +02:00
disjtqz 1cf9b168e5 [Kernel] delete host to guest mapping in case of address collision 2023-10-14 22:35:49 +02:00
disjtqz 9b3601c6fa [Kernel] add object type stub functions, export object types 2023-10-14 22:35:49 +02:00
disjtqz 6a08208dc8 Proper misalignment for AllocatePool, add guest object table 2023-10-14 19:29:25 +02:00
disjtqz ee424ae14a [Kernel] nonintrusive guest to host object mapping 2023-10-14 15:01:12 +02:00
disjtqz d36b1b3830 [GPU] gpu_allow_invalid_fetch_constants true by default 2023-10-12 23:03:14 +02:00
disjtqz a7b047b2a2 Implement kernel processes 2023-10-12 22:13:40 +02:00
disjtqz ba7397952d implement missing packet_disassembler code 2023-10-11 19:26:42 +02:00
disjtqz 43fd396db7 implement dynamically allocateable guest to host callbacks 2023-10-11 18:37:23 +02:00
disjtqz d0a6cec024 total apc rework 2023-10-11 17:43:59 +02:00
disjtqz b5ddd30572 moved xsemaphore to xthread.d
add typed guest pointer template
add X_KSPINLOCK, rework spinlock functions.
rework irql related code, use irql on pcr instead of on XThread
add guest linked list helper functions
renamed ProcessInfoBlock to X_KPROCESS
assigned names to many kernel structure fields
2023-10-11 17:43:59 +02:00
disjtqz 32f7241526 fix user apc/kernel apc mixup 2023-10-10 10:58:41 +02:00
Gliniak e613134793 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2023-10-02 10:07:05 +02:00
disjtqz fe7dc26e3f place locals on backend pages 2023-10-02 09:50:13 +02:00
disjtqz 67f16c4e31 implement more accurately inaccurate frsqrte 2023-10-02 09:47:30 +02:00
disjtqz 79465708aa implement bit-perfect vrsqrtefp 2023-10-01 11:08:17 +02:00
disjtqz cfecdcbeab [GPU] Vsync timing requires far less cpu now to be accurate 2023-09-29 20:17:11 +02:00
disjtqz 294b968fdf remove vsync_interval; replace with vsync_fps. 2023-09-29 20:17:11 +02:00
disjtqz 911055c44f add writeback base/size and xps base/size registers, EVENT_WRITE_SHD check 2023-09-21 21:44:23 +02:00
disjtqz 3876a1632a fix matchvalueandref freeze 2023-09-19 17:30:57 +02:00
Gliniak f6b5424a9f [VFS] Fixed invalid month decoding in decode_fat_timestamp 2023-09-14 12:32:51 +03:00
Gliniak 0f331b5313 [Testing] Added test project for vfs
- Added test case for: decode_fat_timestamp
- Changed location of: decode_fat_timestamp
2023-09-14 12:32:51 +03:00
Gliniak 9554f82c10 [VFS] More cleanup in Zarchive loader 2023-09-04 21:09:40 +02:00
Gliniak 9496c04ac5 [VFS] Zarchive cleanup & style changes 2023-09-03 21:31:17 +02:00
Romain Tisserand 2cd2d1d620 [VFS] Add support for loading ZArchive files 2023-09-03 21:16:34 +02:00
Gliniak e191f2d8d0 Separation of STFS, SVOD into different entities 2023-09-03 19:37:22 +02:00
Gliniak c1bd30eb7f Fixed invalid month decoding in decode_fat_timestamp 2023-09-02 10:49:58 +02:00
Gliniak d9e1c5024f Added test project for vfs
- Added test case for: decode_fat_timestamp
- Changed location of: decode_fat_timestamp
2023-09-02 10:49:58 +02:00
Gliniak ce9a82ccf8 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2023-09-01 18:20:29 +02:00
Adrian dc29307a55 [Patcher] Fixed loading of disabled plugins 2023-08-08 22:17:58 +02:00
Byrom90 f743cf1e65 Implimented xex timestamp logging & ldr_data field. Allows custom plugins to access a modules timedatestamp via LDR_DATA_TABLE_ENTRY. 2023-08-08 08:45:43 +02:00
Gliniak 44fc8f9412 [Patcher] Fixed unitialized variable. is_any_plugin_loaded_
This will resolve random: [plugins applied] note
2023-08-08 08:21:44 +02:00
Gliniak 2a4d7feaae [GPU] Unified clear_memory_page_state to be used in D3D12 & Vulkan 2023-08-06 21:56:35 +02:00
Adrian 91f976e524 [Patcher] Plugin Loader
Added a plugin loader which can be enabled with allow_plugins in the config.
2023-08-06 21:36:49 +02:00
Gliniak 6e86eacf5a Added Listing directories and filtering filenames by regex 2023-08-03 20:37:00 +02:00
Gliniak 7a1eca7265 [Config] Redesigned config dumper 2023-07-29 21:41:34 +02:00
Adrian a180813fcf [Core] Fix config resetting 2023-07-29 18:14:22 +02:00
illusion0001 df7146818d [Config] Print contents of loaded config file 2023-07-29 03:12:53 -07:00
JeBobs ef91193a70 updated disclaimer, archive detection, clean up flags
No more beeps, no more persistent warnings, no more insults, just a single silent disclaimer that also helps the user get started using games from their console.

Updated the archived format detection to add tar/gz, and updated the box to state that xenia does not support running archived games.
2023-07-27 23:07:26 +02:00
Gliniak c5e6352c34 [CPU] Added constant propagation pass for: OPCODE_AND_NOT 2023-07-27 23:41:45 +03:00
illusion0001 2ab2a65ae1 Remove anti piracy message 2023-07-27 11:37:35 -05:00
Adriano Martins 1887ea0795 [Base] Add missing #include <cstdint> to utf8.cc 2023-07-27 13:02:54 +03:00
illusion0001 33e8c8e1ff
Update emulator.cc 2023-07-26 18:10:20 -05:00
illusion0001 616717b31e
Remove beeps 2023-07-26 18:09:17 -05:00
chris 8985abfcc9 Lowered the frequency of the Beeps down to 60hz, at higher volumes it could be downright painful.
Open to "how to rip games" section of quickstart, skipping install which is obviously completed
2023-07-26 17:46:18 -04:00
chris afef35c1c0 Implement RtlRandom (which despite its name is located in Xam, not xboxkrnl, which is why its not in xboxkrnl_rtl)
Stub XamVoiceSubmitPacket
Stubs for __CAP_Start_Profiling/End_Profiling/Enter_Function/Exit_Function
add a note about io_status_block in NtDeviceIoControlFile, change the param's type
move the X_IOCTL_ constants from NtDeviceIoControlFile's body into xbox.h
add X_STATUS_INVALID_IMAGE_FORMAT to xbox.h
Implement XexLoadImageHeaders
Much more correct version of IoCreateDevice, properly initializes/allocates device extension data
Stub version of IoDeleteDevice
Open the quickstart guide in the browser the first time a user opens the emulator
Add some persistent flags that are stored in the registry on windows, not in the config. These are just used to indicate whether one-time tasks have run, like showing the quickstart guide. On non-windows platforms a default value is returned that skips any of these tasks so they don't run every single time.

If the user opens a .iso file, show a warning telling them that we do not condone or support piracy. If the user closes the messagebox within two seconds set a sticky flag that will show them the warning every time instead. Otherwise the warning is never shown again. The Beep function is used to spook them a bit, to hopefully make them less likely to skip the message
If a user opens an archive file, which we do not support, explain to them that they're dumb.

Removed messages intended to "catch" pirates.  If they're out there, we don't want to know about them. A huge chunk of the discourse on our discord is now about piracy. Hopefully these changes will deter them from coming to us for now, but some of these messages may backfire
2023-07-23 14:26:10 -04:00
Margen67 ba936e8038
Don't have controller hotkeys enabled by default 2023-07-21 13:13:40 -07:00
Adrian 7c5b9372e7 [App] Misc App fixes
- Display xex filenames in title selection.
- Fixed fullscreen controller hotkeys executing from a non-UI thread.
- Fixed recent_titles_entry_amount not disabling if set to <= 0.
- Use RunTitle instead of LaunchPath which does more error checking.
2023-07-11 07:53:41 +02:00
Margen67 71b8c2357e Fix broken errors, log ISO path 2023-06-28 00:25:44 -07:00
Gliniak 921923472b [Kernel] Fixed issue in NtResumeThread introduced in previous commit 2023-06-18 09:48:30 +02:00
Adrian 68b2174378 [XAM] Fixed XexGetModuleHandle 2023-06-15 23:23:24 +02:00
Adrian 6a23f1457b [XAM] Plugin Functions 2023-06-15 21:16:31 +02:00
Margen67 28c67b9384 [APU] Add apu_ prefix to max_queued_frames cvar
Also add note about the minimum value.
2023-06-12 07:21:44 -07:00
Gliniak 9333373872 [Thread] Set pointer for: stack_kernel
Based on disassembly and info from console it is used to store initial thread data like:
 - start_address, context, etc.
2023-06-12 07:40:52 +02:00
Gliniak 4053f7bb0e [Kernel] Implemented: ObLookupAnyThreadByThreadId 2023-06-12 07:32:25 +02:00
Gliniak 00aba94b98 [NET] NetDll___WSAFDIsSet: Fixed incorrect endianness of fd_count
Plus: limit it to 64 entries
Thanks to Bo98 for pointing that out
2023-06-09 19:47:56 -05:00
Roy Stewart 07e81fe172 [Base] Filter out relative directories on linux 2023-06-09 19:47:28 -05:00
Roy Stewart 41c423109f [Base] Set the path for posix file info 2023-06-09 19:43:49 -05:00
Adrian 4a3b04d4ee [XAM] Implemented XamGetCurrentTitleId 2023-06-09 19:43:15 -05:00
Gliniak 858af5ae75 [XAM] xeXamContentCreate - Disposition cleanup 2023-06-09 19:42:48 -05:00
Gliniak e110527bfe [Base] ListFiles: Prevent leakage of file descriptors 2023-06-09 19:41:27 -05:00
Wunkolo 6ee2e3718f [x64] Add AVX512 optimizations for `OPCODE_VECTOR_COMPARE_UGT`(Integer)
AVX512 has native unsigned integer comparisons instructions, removing
the need to XOR the most-significant-bit with a constant in memory to
use the signed comparison instructions. These instructions only write to
a k-mask register though and need an additional call to `vpmovm2*` to
turn the mask-register into a vector-mask register.

As of Icelake:
`vpcmpu*` is all L3/T1
`vpmovm2d` is L1/T0.33
`vpmovm2{b,w}` is L3/T0.33

As of Zen4:
`vpcmpu*` is all L3/T0.50
`vpmovm2*` is all L1/T0.25
2023-05-29 14:57:09 -05:00
Wunkolo 121bf93cbe [PPC] Implement `vsubcuw`
Other half of #2125. I don't know of any title that utilizes this instruction, but I went ahead and implemented it for completeness.

Verified the implementation with `instr__gen_vsubcuw` from #1348. Can be grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vsubcuw.s
```
2023-05-29 14:56:12 -05:00
Wunkolo 93b77fb775 [PPC] Implement `vaddcuw`
I don't know of any title that utilizes this instruction, but I went
ahead and implemented it for completeness.

Verified the implementation with `instr__gen_vaddcuw` from #1348. Can be
grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vaddcuw.s
```
2023-05-29 14:56:00 -05:00
gmriggs 85a93029a7 fix array index out of bounds crash in EmulatorWindow::ProcessControllerHotkey 2023-05-19 15:26:19 -04:00
Triang3l ed64e3072b [GPU] Remove implicit bool cast in memexport checks 2023-05-05 21:38:45 +03:00
Triang3l 0e81293b02 [GPU] Remove a dangerous comment about break after exece [ci skip]
There can be jumps across an exece, so the code beyond it may still be
executed.
2023-05-05 21:32:02 +03:00
Triang3l 53f98d1fe6 [GPU/D3D12] Memexport from anywhere in control flow + 8/16bpp memexport
There's no limit on the number of memory exports in a shader on the real
Xenos, and exports can be done anywhere, including in loops. Now, instead
of deferring the exports to the end of the shader, and assuming that export
allocs are executed only once, Xenia flushes exports when it reaches an
alloc (allocs terminate memory exports on Xenos, as well as individual ALU
instructions with `serialize`, but not handling this case for simplicity,
it's only truly mandatory to flush memory exports before starting a new
one), the end of the shader, or a pixel with outstanding exports is killed.

To know which eM# registers need to be flushed to the memory, traversing
the successors of each exec potentially writing any eM#, and specifying
that certain eM# registers might have potentially been written before each
reached control flow instruction, until a flush point or the end of the
shader is reached.

Also, some games export to sub-32bpp formats. These are now supported via
atomic AND clearing the bits of the dword to replace followed by an atomic
OR inserting the new byte/short.
2023-05-05 21:32:02 +03:00
chss95cs@gmail.com dcb98683cf Replace most magic numbers from previous commit with named constants, if the constants names are known.
Instead of low 32 bits of ptr, use hash of file path for sector information.
2023-05-01 09:32:33 -04:00
chss95cs@gmail.com b270c59d0c Check for null string passed to Win32Thread::set_name
Move impl of XamAlloc into XamAllocImpl
Implement XamAllocEx
Implement XamIsCurrentTitleDash
Implement XamGetLocaleDateFormat
Implement "stub" of XFileFsDeviceInformation, XFileSectorInformation (log warnings when they're executed so this doesnt get mistaken for an actual implementation)
Added xboxkrnl_memory file
Moved meat of MmAllocatePhysicalMemoryEx into xeMmAllocatePhysicalMemoryEx, exposed its definition via xboxkrnl_memory.

Moved impl of RtlImageHeaderNt into a helper function
Implement RtlImageDirectoryEntryToData
Implement KeSuspendThread
Check for bad handles passed in XOVERLAPPED, fixes crash in 415608D8
2023-04-30 13:22:49 -04:00
Radosław Gliński 37051afcaf
Merge branch 'xenia-project:master' into canary_experimental 2023-04-26 13:34:57 +02:00
Gliniak 94759861c5 [Kernel] Implemented: NtAllocateEncryptedMemory 2023-04-26 08:37:39 +02:00
chss95cs@gmail.com c86233cc80 Detect corrupted xiso images that have file entries that are out of range and show a fatal error. 2023-04-23 12:52:03 -04:00
chss95cs@gmail.com 2fa2f1a78c Add more wrapper functions to ppc_context_t in kernel, want to switch…
… over to referencing state through ppc_context as much as possible, it'll make implementing things like kernel processes much easier in the future

Move forward definitions of kernel types into kernel_fwd.h

Stub implementation of  XamLoaderGetMediaInfoEx
Partially implement XamSetDashContext
implement XamGetDashContext
Stub implementation of XamUserIsUnsafeProgrammingAllowed
Stub implementation of XamUserGetSubscriptionType
Expanded the supported object types for ObReferenceObjectByHandle and wrapped the logic for encoding the type in a constexpr function
ObReferenceObjectByName was taking lpstring_t for first param, but the function actually takes X_ANSI_STRING ptr

NtReleaseMutant actually does not return anything from KeReleaseMutant, it just checks the handle and returns whether it was invalid.

Changed the raise/lower irql functions to just set the irql on the pcr instead of setting it on field of Processor, processor is a shared object and irql is per-thread

Semi-stub implementation of KeGetImagePageTableEntry. I locked at it in the HV and got it so the values are in the same range the HV returns + actually reflect the page & memory range, but i doubt its equal to the values the hv returns on real hw. Used by modern dashboards, don't know for what.
Log error message for ObDereferenceObject w/ null ptr.

Allocate a special fixed page that dashboards reference, currently don't know what the data on that page is supposed to be.
Add current_irql field to X_KPCR
Added Device object member of XObject::Type enum
Added some notes about other gpu registers, found a table of register names and indices in xam
2023-04-23 10:39:52 -04:00
Gliniak caddaa509a [UI] Disable achievements notifications from default to prevent ingame spam
Seems like a lot of titles like to spam this randomly and do not track internally if
achievement was unlocked or not.
2023-04-23 10:55:49 +02:00
Triang3l 8aaa6f1f7d [SPIR-V] Wrap 4-operand ops and 1-3-operand GLSL std calls 2023-04-19 21:44:24 +03:00
Triang3l 19d56001d2 [SPIR-V] Wrap NoContraction operations 2023-04-19 11:53:45 +03:00
Triang3l 78f1d55a36 [SPIR-V] Use Builder createSelectionMerge directly 2023-04-19 11:11:28 +03:00
Triang3l 64d2a80f79 [SPIR-V] Cleanup ALU emulation conditionals 2023-04-19 10:35:09 +03:00
Triang3l eede38ff63 [SPIR-V] Remove more vec2-4 reserve calls 2023-04-18 22:05:02 +03:00
chss95cs@gmail.com 1f86dc0454 Check for and allow null critical sections, but log them.
stub XeKeysGetConsoleType
Removed the breakpoints in HandleCppException and RtlRaiseException until we have a real implementation of them. Some apps can continue fine afterwards.
Stub version of HalGetCurrentAVPack
Implement MmIsAddressValid
Implement RtlGetStackLimits
2023-04-16 17:34:46 -04:00
chss95cs@gmail.com 27c4cef1b5 Added logger flags, for selectively disabling categories of logging (cpu, apu, kernel). Need to make more log messages make use of these flags.
The "close window" keyboard hotkey (Guide-B) now toggles between loglevel -1 and the loglevel set in your config.
Added LoggerBatch class, which accumulates strings into the threads scratch buffer. This is only intended to be used for very high frequency debug logging. if it exhausts the thread buffer, it just silently stops.
Cleaned nearly 8 years of dust off of the pm4 packet disassembler code, now supports all packets that the command processor supports.
Added extremely verbose logging for gpu register writes. This is not compiled in outside of debug builds, requires LogLevel::Debug and log_guest_driven_gpu_register_written_values = true.
Added full logging of all PM4 packets in the cp. This is not compiled in outside of debug builds, requires LogLevel::Debug and disassemble_pm4.
Piggybacked an implementation of guest callstack backtraces using the stackpoints from enable_host_guest_stack_synchronization. If enable_host_guest_stack_synchronization = false, no backtraces can be obtained.
Added log_ringbuffer_kickoff_initiator_bts. when a thread updates the cp's read pointer, it dumps the backtrace of that thread
Changed the names of the gpu registers CALLBACK_ADDRESS and CALLBACK_CONTEXT to the correct names.
Added a note about CP_PROG_COUNTER
Added CP_RB_WPTR to the gpu register table
Added notes about CP_RB_CNTL and CP_RB_RPTR_ADDR. Both aren't necessary for HLE
Changed name of UNKNOWN_0E00 gpu register to TC_CNTL_STATUS. Games only seem to write 1 to it (L2 invalidate)
2023-04-16 12:42:42 -04:00
chss95cs@gmail.com ab21e1e0f0 Several changes for timestamp bundle:
Fully defined the structure.
Single copy of it + single timer across all modules, managing it is now the responsibility of KernelState.

add global_critical_region::PrepareToAcquire, which uses Prefetchw on the global crit. We now know we can use Prefetchw on all cpus that have AVX.
add  KeQueryInterruptTime, which is used by some dashboards.

add threading::NanoSleep
2023-04-16 10:08:01 -04:00
chrisps 12c9135843
Merge branch 'xenia-project:master' into canary_experimental 2023-04-16 09:11:39 -04:00
chss95cs@gmail.com e75e0425e0 forward branch for double-clear condition in reserved store 2023-04-15 16:22:37 -04:00
chss95cs@gmail.com 7fb4b4cd41 Attempt to emulate reserved load/store more closely. can't do anything for stores of the same value that are done via a non-reserved store to a reserved location
uses a bitmap that splits up the memory space into 65k blocks per bit.  Currently is using the guest virtual address but should be using physical addresses instead.

Currently if a guest does a reserve on a location and then a reserved store to a totally different location we trigger a breakpoint. This should never happen
Also removed the NEGATED_MUL_blah operations. They weren't necessary, nothing special is needed for the negated result variants.

Added a log message for when watched physical memory has a race, it just would be nice to know when it happens and in what games.
2023-04-15 16:06:07 -04:00
Triang3l 887fda55c2 [SPIR-V] Remove temp reserve for 4 or less elements 2023-04-13 22:43:44 +03:00
Triang3l 75d805245d [DXBC] `discard` pixels from `kill` with ROV instead of returning
Keep the current lane active as it may be needed for derivatives.
2023-04-09 20:13:22 +03:00
Gliniak 5e0c67438c Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2023-04-09 17:28:04 +02:00
Gliniak 7fce8fce9d Revert "[D3D12] Added workaround for broken Geometry Shader in AMD 7900 Series GPU"
This reverts commit 2afd2cc4d6.
2023-04-09 17:27:55 +02:00
Triang3l 88c645d818 [D3D12] Don't use emit_then_cut due to RDNA 3 crash 2023-04-09 18:07:44 +03:00
chss95cs@gmail.com 22dd2e52e6 Temporarily disable prefetchw check, odd report from user who definitely has prefetchw, but it is not being reported as present 2023-04-02 17:11:08 -04:00
chss95cs@gmail.com 356300d14c Do not check if we should show the prefetchw error message if we already have an avx error message.
mtmsrd writes the EE bit only now.
2023-04-02 14:26:01 -04:00
chss95cs@gmail.com 6ccdc4d0df setup initial value of MSR on ppc context
Fixed PrefetchW feature check
Added prefetchw check to startup AVX check, there should be no CPUs that support AVX but not PrefetchW.
Init VRSAVE to all ones.
Removed unused disable_global_lock flag.
2023-04-01 14:48:56 -04:00
Triang3l baa2ff78d8 [Vulkan] Add missing stencil reference unpack in RT transfer + formatting fix 2023-03-30 22:40:40 +03:00
Triang3l c238d8af55 [Vulkan] Fix FragStencilRef store type 2023-03-30 22:28:56 +03:00
Adrian 8678becda6 [XAM] StartupEx 2023-03-13 11:00:52 +01:00
Gliniak 23bd18cfca [GPU] Check if memory page is available while copying data 2023-03-11 16:20:01 +01:00
Gliniak eb5da8e557 [UI] Changed default UI font to Tahoma. If it's not available use embedded font
- Additionally allow user to provide own font size
- Forced font scaling in notification window to be reasonable size
2023-03-11 13:07:07 +01:00
Gliniak 9fa6e94772 [Achievements] Present notification in language selected by user 2023-03-08 13:24:55 +01:00
Gliniak 202ab76300 [Kernel] Changed default notification position 2023-03-08 10:59:08 +01:00
Gliniak 0ec65be5ff [UI] Notification & Custom Font Support 2023-03-08 09:36:49 +01:00
Adrian 069d33c03f [XAM] Implemented Functions
Implemented:
- RtlSleep
- SleepEx
- Sleep
- GetTickCount
- GetModuleHandleA
- XamGetCurrentTitleId
2023-03-06 08:31:40 +01:00
Gloria d62fe21d47
RADV bug fix (#139)
* Use correct typing for stencil, dispatch launch on UI thread
* Clean up some LaunchPath code
2023-03-06 08:31:05 +01:00
Adrian 84571f8fe6 Allow patched arrays to start with 0x 2023-02-17 17:05:16 +01:00
Gliniak c74a047655 [Win] Revert XE_USE_KUSER_SHARED back to 0
Also limit queued audio frames
2023-02-11 18:20:21 +01:00
Adrian 321dd75e05 [VFS] Fixed allow_game_relative_writes to write invalid cached entries
This PR fixes https://github.com/xenia-canary/xenia-canary/issues/123 a bug where files would not write to the host if they were deleted from the host filesystem during runtime.
2023-02-08 22:38:16 +01:00
Adrian 333d7c2767 [UI] Added build to exception message 2023-02-05 18:31:12 +01:00
Adrian 3d3810fa98 [App] Minor Fixes
Added F9
[Controller Hotkeys] Call RunTitle from the UI thread
2023-02-03 21:23:35 +01:00
Gliniak 036f19c66e [UI] Uplift of TR - 2092
Plus fixed invalid path, they're not stored anymore
2023-02-02 19:02:23 +01:00
Adrian 4c220770ec Dynamic TU patch
Load the correct TU patch for the loaded launch module
2023-01-31 21:11:38 +01:00
Adrian b10c84b340
Added controller hotkeys cvar (#119)
* Added controller hotkeys setting

Added option to disable controller hotkeys
Minor Changes

* Fixed locked input system

The input system lock should be released even if a controller is not connected.
2023-01-29 19:26:25 +01:00
Shoegzer 4a2f4d9cfe Add include to fix compiling 2023-01-29 21:10:20 +03:00
Gliniak 89f3598426 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2023-01-29 11:25:56 +01:00
Gliniak 4e87d1f9d1 [Kernel/Thread] Set TLS slot to 0 while freeing 2023-01-28 17:49:12 -06:00
Emma 4ecd1bffda
Add configurable toggle to mark code segments as writable (#100) 2023-01-23 23:44:59 +01:00
Margen67 aea9714bd0 Make present_safe_area 100
So people stop asking why their games are cropped off.
2023-01-23 01:22:43 -08:00
Adrian 504fb9f205 Title selection & bug fixes
Added title selection
Fixed controller hotkeys for multiple connected controllers
Fixed ImGUI dialog box stacking
Added a new font for ImGUI
2023-01-22 22:49:51 +01:00
Adrian 459497f0b6
Implemented Controller Hotkeys (#111)
Implemented Controller Hotkeys

Added controller hotkeys
Added guide button support for XInput and winkey

The hotkey configurations can be found in HID -> Display controller hotkeys
If the Xbox Gamebar overlay is enabled then use the Back button instead of the Guide button.

- Fixed hotkey thread destruction
- Fixed XINPUT_STATE by padding 4 bytes
- Added hotkey vibration for user feedback
- Replaced MessageBoxA with ImGuiDialog::ShowMessageBox

Co-authored-by: Margen67 <Margen67@users.noreply.github.com>
2023-01-13 09:17:43 +01:00
Gliniak ea1003c6bf [CPU] Check if flags pointer exists
There is specific situation in FM4 on terminating thread that runs dll
where flags pointer doesn't exist
2023-01-09 16:13:09 +01:00
Gliniak d24d3295c6 [XMA] Clear host data on context clear + swap buffer if decoding fails 2023-01-06 19:27:41 +01:00
Gliniak 668244adb7 [XAM] Fixed importing savefiles from different games (at least partially) 2023-01-06 16:35:40 +01:00
Gliniak 40f6b5b3ac [Kernel] Improvements to host and guest objects handling 2023-01-06 09:13:21 +01:00
Gliniak 5b0fe5aada [XBDM] Added Stub For: DmGetConsoleDebugMemoryStatus 2023-01-05 22:34:26 +01:00
Gliniak 81412c59c1 [Net] NetDll___WSAFDIsSet: Fixed incorrect endianness of fd_count
Plus: limit it to 64 entries
2023-01-05 20:33:12 +01:00
Gliniak 81aaf98e04 [Memory] Added option to ignore offset for ranged physical allocations
Certain titles check if input address matches output and if it doesn't then just crash
2023-01-05 08:56:22 +01:00
Gliniak 39c509b57f [APU] Resolved context stuck with is_stream_done_ flag and no space left 2023-01-03 19:49:21 +01:00
Gliniak b7bc0425ba Revert "[APU] Clear host data while reseting context"
This reverts commit 1451ca4266.
2023-01-01 11:23:47 +01:00
Gliniak 2afd2cc4d6 [D3D12] Added workaround for broken Geometry Shader in AMD 7900 Series GPU 2022-12-31 11:27:01 +01:00
Gliniak 26415cb8b1 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-12-31 11:19:01 +01:00
Wunkolo e55cb737c1 [x64] Add AX512 optimization for `OPCODE_SELECT`(F64) 2022-12-28 14:20:20 -06:00
Wunkolo ba75a016b4 [x64] Add AX512 optimization for `OPCODE_SELECT`(V128)
Uses `vpternlogd` to collapse the bitwise select operation into one
instruction. Though it needs a `vmovdqa` instruction since `vpternlogd`
reads and writes to the first argument.
2022-12-28 14:20:20 -06:00
Wunkolo 7c21b327ff [x64] Add `x64_util.h`
Used to help with generating instruction-specific constants.  Currently
used for the ternary-logic constants(`vpternlog*`).
2022-12-28 14:20:20 -06:00
Gliniak eb25fe4f4a [CPU] Increase amount of possible labels used in FinalizationPass
Instead of using decimal notaation for labels let's use hexadecimal.
That will increase amount of possible combination by a lot.
2022-12-28 14:19:55 -06:00
Joel Linn 9eef64d3fb [SDL2] Print version on startup 2022-12-28 14:19:02 -06:00
Gliniak 859bb89555 [Kernel] Support for loading achievement data 2022-12-28 14:16:43 -06:00
Joel Linn da9c90835b [ImGui] Use new key API 2022-12-28 14:16:32 -06:00
Joel Linn b641e39c4d [ImGui] Use new ImageButton signature 2022-12-28 14:16:32 -06:00
Joel Linn d6b5cbd634 [ImGui] Fix deprecated SetCursorPos use 2022-12-28 14:16:32 -06:00
Joel Linn 29985f69e1 [ImGui] Fix fsr sharpness slider scaling
https://github.com/ocornut/imgui/issues/3361#issuecomment-1287045159
2022-12-28 14:16:32 -06:00
Joel Linn f452d6a007 [UI] Fix UB (moved mem) in file picker
- References to vector data become UB after vector size changes.
- Add one extra level of indirection to pin the wide string memory
  location regardless of vector memory
2022-12-28 14:16:32 -06:00
Joel Linn 7877331d8a [ImGui] Use ImDrawCmd::IdxOffset field
c80e8b964c
https://github.com/ocornut/imgui/issues/4845#issuecomment-1003329113
2022-12-28 14:16:32 -06:00
Joel Linn 8f88bb237e [ImGui] Fix empty IDs 2022-12-28 14:16:32 -06:00
Joel Linn b107af68bc [ImGui] Fix removed flags 2022-12-28 14:16:32 -06:00
Gliniak 9f0d3d4c5b [Kernel] Simple support for loading achievement data 2022-12-25 16:11:59 +01:00
Gliniak a25ea03a28 [Threading] Revert sleepdelay0_for_maybeyield back to 0 2022-12-21 20:12:40 +01:00
Gliniak 783845c56e [XAM] Cleanup in xam_input.
- Added some missing xinput flags
- Simplified XamInputGetCapabilities
- Replaced hardcoded checks to flag
2022-12-20 19:33:55 +01:00
InvoxiPlayGames d13c3b7306 fix XamInputGetCapabilities breaking subtypes 2022-12-19 04:09:27 +00:00
Radosław Gliński 8c43160fc6
Merge pull request #106 from Bo98/canary_burnout5_pr1
CPU threading improvements, Kernel game region improvements
2022-12-18 11:16:53 +01:00
Bo Anderson 6c5423ceed app/emulator_window: store recent.toml in storage root
This makes it consistent with the config file.
2022-12-17 13:36:43 -05:00
Bo Anderson 422ee1fbdc kernal/xam/xam_info: base game region on user_country
It is debatable whether this is correct in the general case.

There's nothing really wrong with 0xFFFF logically.
Burnout Paradise however bases its in-game language on this and does not recognise 0xFFFF.
The game uses Japanese in the default case.

I've avoided the "rest of Asia" code since Burnout Paradise seems to use a different value (0x01F8) for that than what I expected (0x01FC).
2022-12-17 13:34:49 -05:00
Bo Anderson 2a77ed7246 cpu/backend/x64/x64_tracers: fix thread matching 2022-12-17 13:34:48 -05:00
Bo Anderson 9489161d0b kernel: separate host objects from regular handle range
Burnout Paradise statically expects certain thread handle values based on how many objects it knows it is allocating ahead of time.
From this, it calculates an ID by subtracting the thread handle from a base handle of what it expects the first such thread to be assigned.
The value is statically declared in the executable and is not determined automatically.

The host objects in the handle range made these thread handles higher than what the game expects.
Removing these, and allowing 0xF8000000 to be assigned, allows the thread handles to fit perfectly in the range the game expects.

It is not clear what handle range the host objects should be taking. For now though, they're 0-based rather than 0xF8000000-based.
2022-12-17 13:34:47 -05:00
MoistyMarley 4f165825ea Fixed three crashes and refactored
While a title is running any attempt to open another title will result in a crash this commit prevents these crashes via File->Open and File->Open Recent->Title

Opening a recent title with an invalid path caused a crash.
2022-12-16 18:39:42 +00:00
chss95cs@gmail.com 3952997116 Fix issue introduced yesterday where the final fetch constant would never be marked as written
Reorganized SystemPageFlags for sharedmemory, each field now goes into its own array, the three arrays are page aligned in a single virtual allocation
Refactored sharedmemory a bit, use tzcnt if available when finding ranges (faster on pre-zen4 amd cpus)
2022-12-15 08:35:36 -08:00
chrisps 8c6cefcce0
Merge branch 'xenia-canary:canary_experimental' into command_processor_optimizations 2022-12-14 11:35:13 -08:00
chss95cs@gmail.com f931c34ecb Cleaned up for commit, moved WriteRegistersFromMemCommonSense code into WriteRegistersFromMem
optimized copy_and_swap_32_unaligned further
2022-12-14 11:34:33 -08:00
chss95cs@gmail.com 754293ffc3 Fix mistake with fetch constant dirty mask 2022-12-14 10:13:13 -08:00
chss95cs@gmail.com 7a0fd0f32a Remove MaybeYields when vsync is off 2022-12-14 09:56:33 -08:00
chss95cs@gmail.com 080b6f4cbd Partially vectorized GetScissor (loading and unpacking the bitfields from the registers is still scalar) 2022-12-14 09:33:14 -08:00
chss95cs@gmail.com ab6d9dade0 add avx2 codepath for copy_and_swap_32_unaligned
use the new writerange approach in WriteRegisterRangeFromMem_WithKnownBound
2022-12-14 07:53:21 -08:00
Gliniak b2dd489151 [APU] Set first frame offset for next buffer + Note about edgecase 2022-12-13 22:57:36 +01:00
Gliniak e00feb7b0f [APU] Fixed incorrect frame count + removed hopefully useless check from now 2022-12-13 21:47:35 +01:00
chss95cs@gmail.com 82dcf3f951 faster/more compact MatchValueAndRef
Made commandprocessor GetcurrentRingReadcount inline, it was made noinline to match PGO decisions but i think PGO can make extra reg allocation decisions that make this inlining choice yield gains, whereas if we do it manually we lose a tiny bit of performance

Working on a more compact vectorized version of GetScissor to save icache on cmd processor thread
Add WriteRegisterForceinline, will probably end up amending to remove it

add ERMS path for vastcpy

Adding WriteRegistersFromMemCommonSense (name will be changed later), name is because i realized my approach with optimizing writeregisters has been backwards,  instead of handling all checks more quickly within the loop that writes the registers, we need a different loop for each range with unique handling. we also manage to hoist a lot of the logic out of the loops

Use 100ns delay for MaybeYield, noticed we often return almost immediately from the syscall so we end up wasting some cpu, instead we give up the cpu for the min waitable time (on my system, this is 0.5 ms)

Added a note about affinity mask/dynamic process affinity updates to threading_win

Add TextureFetchConstantsWritten
2022-12-13 11:25:33 -08:00
Gliniak 16580f5fae [APU] Fixed crash in error message caused by invalid arguments number 2022-12-13 11:02:59 +01:00
Gliniak 97fdf9c6dd [APU] Resolved crash related to negative amount of bits to copy
This is likely due to hitting somehow valid frame in invalid data
2022-12-13 08:46:48 +01:00
Gliniak 43d7fc5158 [APU] Shuffle checks to hopefully prevent crashing from logger 2022-12-11 21:06:47 +01:00
Gliniak c9cd6f15fc [APU] Fixed logged frame count
Until now info about frames that were provided in log was always incorrect by 1
2022-12-11 13:11:23 +01:00
Gliniak 82ccdd3db5 [APU] Misc Changes:
- Unified APU error messages
- Removed magic number from SetOffset call
- Commented out that annoying assertion from XmaContext::GetNextFrame
- Removed checks for current_input_packet_count and replaced with bool check

Not sure how to call it correctly. I know that calls with packet count == 1 is specific one
and probably handled differently. Is it streaming or how should it be called?
2022-12-11 12:22:23 +01:00
chss95cs@gmail.com 7d49b97e4c Print any module name+ offset in host exception reports
print thread name in host exception reports
trying to force win32 error descriptions to english
Return if output buffer block count is 0 in XmaContext, this is an attempt to fix a divide by zero crash many users have reported
2022-12-09 12:24:06 -08:00
Gliniak 7c5da821d4 [Kernel] Fixed invalid thread pointer in KeEnableFpuExceptions 2022-12-08 21:48:13 +01:00
Radosław Gliński 747fb42bdf
Merge pull request #98 from AdrianCassar/canary_experimental
Added a hotkey to open the previously played title
2022-12-05 18:09:52 +01:00
MoistyMarley 6d2724a861 Added a hotkey to open the previously played title 2022-12-05 17:01:16 +00:00
chss95cs@gmail.com a63f424c0a Directly check PEB for IsDebuggerAttached
Add constexpr getters to magicdiv class so it can be used from jitted x64/dxbc
Track the guest return address as well for guest/host sync, if multiple entries have the same guest stack find the first one with a matching guest retaddr. this fixes epic mickey 2 (which the previous guest-stack change had allowed to go ingame for a bit) and potentially also a crash in fable3.
Break if under debugger when stackpoints are overflowed

Add much more useful output for host exceptions, print out xenia_canary.exe relative offsets if exception is in module, formatmessage for ntstatus/win32err, strerror

Minor d3d12 microoptimization, instead of doing SetEventOnCompletion + WaitForSingleObject do SetEventOnCompletion w/ nullptr so that the wait happens in kernel mode, avoiding two extra context switches

add unimplemented kernel functions:
ExAllocatePoolWithTag
ObReferenceObject

ObDereferenceObject has no return value.
Log a message when ObDereferenceObject/Reference receive unregistered guest kernel objects
gave ObLookupThreadByThreadId its correct error status
hoist object_types initialization out of ObReferenceObjectByHandle
Fix out parameter values on error for a few kernel funcs
add note about msr to KeSetCurrentStackPointers
add X_STATUS_OBJECT_TYPE_MISMATCH check for xeNtSetEvent
add msr_mask field to X_KPCR
2022-12-04 12:38:19 -08:00
Triang3l e97eb75b94 [Vulkan] Update variableMultisampleRate comments (actually supported) [ci skip] 2022-12-04 14:55:56 +03:00
Triang3l 0b4f5ef286 [SPIR-V] Decorate whole gl_PerVertex with Invariant
Block members can be decorated with Invariant only since SPIR-V 1.5 Revision 2. In earlier versions, Invariant can be used only for variables. Mesa warns about this.
2022-12-03 14:27:43 +03:00
Gliniak 1eb61aa9ab Added reccently opened titles list 2022-11-29 10:47:30 +01:00
chrisps 0674b68143
Merge pull request #96 from chrisps/host_guest_stack_synchronization
Host/Guest stack sync, exception messagebox, kernel improvements, minor opt
2022-11-27 10:30:16 -08:00
Gliniak 12005acc98 [APU] Check if splitted frame length is valid 2022-11-27 18:40:27 +01:00
chss95cs@gmail.com 90c771526d "Fix" debug console, we were checking the cvar before any cvars were loaded, and the condition it checks in AttachConsole is somehow always false
Remove dead #if 0'd code in math.h

On amd64, page_size == 4096 constant, on amd64 w/ win32, allocation_granularity == 65536. These values for x86 windows havent changed over the last 20 years so this is probably safe
and gives a modest code size reduction

Enable XE_USE_KUSER_SHARED. This sources host time from KUSER_SHARED instead of from QueryPerformanceCounter, which is far faster, but only has a granularity of 100 nanoseconds.

In some games seemingly random crashes were happening that were hard to trace because
the faulting thread was actually not the one that was misbehaving, another threads stack was underflowing into the faulting thread.

Added a bunch of code to synchronize the guest stack and host stack so that if a guest longjmps the host's stack will be adjusted.
Changes were also made to allow the guest to call into a piece of an existing x64 function.

This synchronization might have a slight performance impact on lower end cpus, to disable it set enable_host_guest_stack_synchronization to false.
It is possible it may have introduced regressions, but i dont know of any yet

So far, i know the synchronization change fixes the "hub crash" in super sonic and allows the game "london 2012" to go ingame.

Removed emit_useless_fpscr_updates, not emitting these updates breaks the raiden game

MapGuestAddressToMachineCode now returns nullptr if no address was found, instead of the start of the function

add Processor::LookupModule

Add Backend::DeinitializeBackendContext
Use WriteRegisterRangeFromRing_WithKnownBound<0, 0xFFFF> in WriteRegisterRangeFromRing for inlining (previously regressed on performance of ExecutePacketType0)

add notes about flags that trap in XamInputGetCapabilities

0 == 3 in XamInputGetCapabilities

Name arg 2 of XamInputSetState

PrefetchW in critical section kernel funcs if available & doing cmpxchg

Add terminated field to X_KTHREAD, set it on termination

Expanded the logic of NtResumeThread/NtSuspendThread to include checking the type of the handle (in release, LookupObject doesnt seem to do anything with the type)
and returning X_STATUS_OBJECT_TYPE_MISMATCH if invalid. Do termination check in NtSuspendThread.

Add basic host exception messagebox, need to flesh it out more (maybe use the new stack tracking stuff if on guest thrd?)

Add rdrand patching hack, mostly affects users with nvidia cards who have many threads on zen

Use page_size_shift in more places

Once again disable precompilation! Raiden is mostly weird ppc asm which probably breaks the precompilation. The code is still useful for running the compiler over the whole of an xex in debug to test for issues
"Fix" debug console, we were checking the cvar before any cvars were loaded, and the condition it checks in AttachConsole is somehow always false

Remove dead #if 0'd code in math.h

On amd64, page_size == 4096 constant, on amd64 w/ win32, allocation_granularity == 65536. These values for x86 windows havent changed over the last 20 years so this is probably safe
and gives a modest code size reduction

Enable XE_USE_KUSER_SHARED. This sources host time from KUSER_SHARED instead of from QueryPerformanceCounter, which is far faster, but only has a granularity of 100 nanoseconds.

In some games seemingly random crashes were happening that were hard to trace because
the faulting thread was actually not the one that was misbehaving, another threads stack was underflowing into the faulting thread.

Added a bunch of code to synchronize the guest stack and host stack so that if a guest longjmps the host's stack will be adjusted.
Changes were also made to allow the guest to call into a piece of an existing x64 function.

This synchronization might have a slight performance impact on lower end cpus, to disable it set enable_host_guest_stack_synchronization to false.
It is possible it may have introduced regressions, but i dont know of any yet

So far, i know the synchronization change fixes the "hub crash" in super sonic and allows the game "london 2012" to go ingame.

Removed emit_useless_fpscr_updates, not emitting these updates breaks the raiden game

MapGuestAddressToMachineCode now returns nullptr if no address was found, instead of the start of the function

add Processor::LookupModule

Add Backend::DeinitializeBackendContext
Use WriteRegisterRangeFromRing_WithKnownBound<0, 0xFFFF> in WriteRegisterRangeFromRing for inlining (previously regressed on performance of ExecutePacketType0)

add notes about flags that trap in XamInputGetCapabilities

0 == 3 in XamInputGetCapabilities

Name arg 2 of XamInputSetState

PrefetchW in critical section kernel funcs if available & doing cmpxchg

Add terminated field to X_KTHREAD, set it on termination

Expanded the logic of NtResumeThread/NtSuspendThread to include checking the type of the handle (in release, LookupObject doesnt seem to do anything with the type)
and returning X_STATUS_OBJECT_TYPE_MISMATCH if invalid. Do termination check in NtSuspendThread.

Add basic host exception messagebox, need to flesh it out more (maybe use the new stack tracking stuff if on guest thrd?)

Add rdrand patching hack, mostly affects users with nvidia cards who have many threads on zen

Use page_size_shift in more places

Once again disable precompilation! Raiden is mostly weird ppc asm which probably breaks the precompilation. The code is still useful for running the compiler over the whole of an xex in debug to test for issues
2022-11-27 09:39:33 -08:00
Gliniak 1451ca4266 [APU] Clear host data while reseting context 2022-11-27 17:00:31 +01:00
Gliniak 9fdfd2ada9 [APU] Removed old hack that invalidates input on decoder error
Added returning parsing error while decoder fails
2022-11-26 17:25:39 +01:00
chss95cs@gmail.com 7a17fad88a fix crash from precompiling out of range funcs, add xexcache version, increment xexcache version (all priors are version 0 thanks to 0 initialization) 2022-11-07 05:40:18 -08:00
chss95cs@gmail.com e21fd22d09 add x_kthread priority/fpu_exceptions_on fields, set fpu_exceptions_on in KeEnableFpuExceptions, set priority in SetPriority
add msr field on context
write to msr for mtmsr/mfmsr, do not have correct default value for msr yet, nor has mtmsrd been reimplemented
do not evaluate assert expressions in release at all, while still avoiding unused variable warnings
2022-11-06 11:03:10 -08:00
chss95cs@gmail.com c1d922eebf Minor decoder optimizations, kernel fixes, cpu backend fixes 2022-11-05 10:50:33 -07:00
Gliniak ba66373d8c [APU][Janky] Fixed issues with incorrect frames on streamed data
This requires a lot more research and test data!
2022-11-03 20:56:36 +01:00
Gliniak dae508500a [APU] Clear remaining packets skip when we're done with current stream
Plus some additional logging
2022-11-03 12:59:47 +01:00
Margen67 4ba14bc35e [APU+HID] Optimizations 2022-11-03 03:56:13 -07:00
Gliniak b23566b823 [APU] Fix incorrect packet frame count when frame ends exactly where packet ends
This resolves looping background sound in GoW
2022-11-03 11:14:37 +01:00
Gliniak 259679d53c [APU] Handle exceeding input offset by switching buffer
This should resolve crashes in FH
2022-11-02 08:47:36 +01:00
chrisps 8186792113
Revert "Minor decoder optimizations, kernel fixes, cpu backend fixes" 2022-11-01 14:45:36 -07:00
chrisps 781871e2d5
Merge pull request #87 from chrisps/canary_experimental
Minor decoder optimizations, kernel fixes, cpu backend fixes
2022-11-01 11:49:10 -07:00
Gliniak c080e2e17c [APU] Resolved crashes related to out of bound readouts 2022-11-01 11:24:01 +01:00
Triang3l 778333b1b5 [UI] Fix ClearInput not called in ImGuiDrawer after deferred dialog removal
Also cleanup the code involved in dialog registration, and update the explanation of why dialog removal is delayed until the end of drawing (the original was written back when window listener and UI drawer callback registration during the execution of the callbacks was deferred, but that was wrong as that might result in execution of callbacks belonging to now-deleted objects).
2022-10-31 18:57:54 +03:00
chss95cs@gmail.com 06bfd624de fix failed debug build from loops variable assert 2022-10-30 12:33:08 -07:00
chss95cs@gmail.com bff264b5fd Fixed RtlCompareString and RtlCompareStringN, they were very wrong, for CompareString the params are struct ptrs not char ptrs
Fixed a ton of clang-cl compiler warnings about unused variables, still many left. Fixed a lot of inconsistent override ones too
2022-10-30 10:47:09 -07:00
chrisps 65b9d93551
Merge branch 'xenia-canary:canary_experimental' into canary_experimental 2022-10-30 09:05:40 -07:00
chss95cs@gmail.com 550d1d0a7c use much faster exp2/cos approximations in ffmpeg, large decrease in cpu usage on my machine on decoder thread
properly byteswap r13 for spinlock
Add PPCOpcodeBits
stub out broken fpscr updating in ppc_hir_builder. it's just code that repeatedly does nothing right now.
add note about 0 opcode bytes being executed to ppc_frontend
Add assert to check that function end is greater than function start, can happen with malformed functions
Disable prefetch and cachecontrol by default, automatic hardware prefetchers already do the job for the most part
minor cleanup in simplification_pass, dont loop optimizations, let the pass manager do it for us
Add experimental "delay_via_maybeyield" cvar, which uses MaybeYield to "emulate" the db16cyc instruction
Add much faster/simpler way of directly calling guest functions, no longer have to do a byte by byte search through the generated code
Generate label string ids on the fly
Fix unused function warnings for prefetch on clang, fix many other clang warnings
Eliminated majority of CallNativeSafes by replacing them with naive generic code paths.
^ Vector rotate left, vector shift left, vector shift right, vector shift arithmetic right, and vector average are included
These naive paths are implemented small loops that stash the two inputs to the stack and load them in gprs from there, they are not particularly fast but should be an order of magnitude faster than callnativesafe
to a host function, which would involve a call, stashing all volatile registers, an indirect call, potentially setting up a stack frame for the arrays that the inputs get stashed to, the actual operations, a return, loading all volatile registers, a return, etc
Added the fast SHR_V128 path back in
Implement signed vector average byte, signed vector average word. previously we were emitting no code for them. signed vector average byte appears in many games
Fix bug with signed vector average 32, we were doing unsigned shift, turning negative values into big positive ones potentially
2022-10-30 08:48:58 -07:00
Gliniak 55877f4c61 [APU] Force buffer swap at the end of stream
Plus some debugging messages and lint fixes
2022-10-25 17:20:45 +02:00
Gliniak 6b11787c93 [APU] Fixed typo that prevented last packet in stream to be processed 2022-10-24 21:33:25 +02:00
Gliniak fac2a89d0f Disallow offset to be set before header, header size fix, audio channels crashfix 2022-10-24 19:43:43 +02:00
Triang3l a37b57ca8d [GPU] Fix tiled mip tail extent calculation
Previously, for mips, the dimensions of the texture weren't rounded to powers of two before calculating the mip tail extent, resulting in the mip tail for a 260 blocks tall texture, that contains mips ending at Y of up to 36, having the Y extent calculated as 32. With rounding to powers of two, it would have been 64.

However, with the GetTiledAddressUpperBound functions, none of this is necessary at all (and neither is rounding the extents in TextureGuestLayout::Level to 32x32x4 blocks) - using the same code for calculating the XYZ extents of tiled textures as for linear textures now, which, for the mip tail, calculates the actual maximum coordinates of the mips stored in it - and rounding to tiles is done internally by GetTiledAddressUpperBound.
2022-10-23 21:26:47 +03:00
Triang3l 74f1f6bb6d [Vulkan] Check depthClamp feature 2022-10-23 19:01:17 +03:00
Triang3l 4add1f67b1 [D3D12] Replace unused shared memory view with a null view
Fixes the PIX validation warning about missing resource states on every guest draw. Also potentially prevents drivers from making assumptions about the shared memory buffer based on the bindings, though no such cases are currently known.
2022-10-23 18:09:47 +03:00
Wunkolo 5fde7c6aa5 [x64] Add AVX512 optimizations for `PERMUTE_V128`
Uses the single-instruction AVX512 `vperm*` instructions to accelerate
the `INT8_TYPE` and `INT16_TYPE` permutation opcodes.

The `INT8_TYPE` is accelerated using `AVX512VBMI` subset of AVX512.
Available since Icelake(Intel) and Zen4(AMD).
2022-10-21 08:47:31 -05:00
Wunkolo f207239349 [x64] Add `kX64EmitAVX512VBMI` feature-flag and detection
Allows access to byte-element 2-register permutations(32-byte look up
tables) and for 64-bit multi-shifts.
Particularly adding this to accelerate the assembly of our `PERMUTE`
opcode.
2022-10-21 08:47:31 -05:00
Wunkolo d73088e5ca [x64] Add AVX512 optimization for `OPCODE_VECTOR_SUB`(saturated)
Passes the `vsubuws` and `vsubsws` unit-tests from https://github.com/xenia-project/xenia/pull/1348
2022-10-21 08:45:43 -05:00
Radosław Gliński 7c375879bc
Merge pull request #85 from chrisps/canary_experimental
Kernel improvements, "fix" crash on sandy bridge/ivy bridge
2022-10-21 14:18:03 +02:00
chrisps 4493d17acc
Update hir_builder.cc 2022-10-20 14:58:27 -07:00
chrisps adc3405537
change else{if} to else if in AndNot 2022-10-20 14:56:55 -07:00
Gliniak 48fea6d9aa Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-10-18 12:19:52 +02:00
Triang3l cdb40ddb28 [DXBC] Fix interpolator copying from v# to r# in PS
The bit count was of `(1<<i)-1` itself (thus couldn't handle interpolators with a smaller index skipped), not of `bits&((1<<i)-1)`.
2022-10-18 13:12:37 +03:00
chss95cs@gmail.com d8b7b3ecec Fix bindless path in d3d12 that i broke in earlier commit (did not affect any users, thats a debug thing)
Fix guest code profiler, it previously only worked with function precomp + all code you were about to execute already discovered
Allow AndNot if type is V128
2022-10-16 07:48:43 -07:00
chss95cs@gmail.com 22e52cbecd Canary can now run on sandy bridge/e and ivy bridge/e
Stubbed out OPCODE_AND_NOT, its fallback implementation if bmi1 was not supported was broken. it's difficult to tell where the actual issue is there.
2022-10-15 05:14:53 -07:00
chss95cs@gmail.com 7204532b1c Implement RtlUpcaseUnicodeChar 2022-10-15 04:29:13 -07:00
chrisps a495709344
Merge branch 'xenia-canary:canary_experimental' into canary_experimental 2022-10-15 03:07:35 -07:00
chss95cs@gmail.com efbeae660c Drastically reduce cpu time wasted by XMADecoderThread spinning, went from 13% of all cpu time to about 0.6% in my tests
Commented out lock in WatchMemoryRange, lock is always held by caller
properly set the value/check the irql for spinlocks in xboxkrnl_threading
2022-10-15 03:07:07 -07:00
Gliniak d262214c1b Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-10-14 20:13:03 +02:00
chss95cs@gmail.com ecf6bfbbdf Stub event query for linux, fix missing semicolon in linux SetEventBoostPriority 2022-10-09 12:30:18 -07:00
Triang3l 45050b2380 [GPU] Vulkan fragment shader interlock RB and related fixes/cleanup
Also fixes addressing of MSAA samples 2 and 3 for 64bpp color render targets in the ROV RB implementation on Direct3D 12.
Additionally, with FSI/ROV, alpha test and alpha to coverage are done only if the render target 0 was dynamically written to (according to the Direct3D 9 rules for writing to color render targets, though not sure if they actually apply to the alpha tests on Direct3D 9, but for safety).
There is also some code cleanup for things spotted during the development of the feature.
2022-10-09 22:06:41 +03:00
Gliniak c923ab78a9 [Config] Added note about internal_display_resolution 2022-10-09 12:33:04 +02:00
Gliniak 7975ea78d4 [Base] BitStream: Prevent readout beyond buffer 2022-10-09 12:24:46 +02:00
Gliniak 17b3939bbf Revert "[Base] Changed size of bitstream accessed data (Risky)"
This reverts commit 061000af01.
2022-10-09 12:18:43 +02:00
chss95cs@gmail.com 2dd6f33f4b Fix debug/ui premake too 2022-10-08 10:34:50 -07:00
chss95cs@gmail.com d8c94b1aee Fix premake filter mistake that broke debug builds (and likely any build other than release) 2022-10-08 10:10:36 -07:00
chss95cs@gmail.com 8f7f7dc6ad fixed wine crash from use of NtSetEventPriorityBoost
add xe::clear_lowest_bit, use it in place of shift-andnot in some bit iteration code
make is_allocated_ and is_enabled_ volatile in xma_context
preallocate avpacket buffer in XMAContext::Setup, the reallocations of the buffer in ffmpeg were showing up on profiles
check is_enabled and is_allocated BEFORE locking an xmacontext. XMA worker was spending most of its time locking and unlocking contexts
Removed XeDMAC, dma:: namespace. It was a bad idea and I couldn't make it work in the end. Kept vastcpy and moved it to the memory namespace instead
Made the rest of global_critical_region's members static. They never needed an instance.
Removed ifdef'ed out code from ring_buffer.h
Added EventInfo struct to threading, added Event::Query to aid with implementing NtQueryEvent.
Removed vector from WaitMultiple, instead use a fixed array of 64 handles that we populate. WaitForMultipleObjects cannot handle more than 64 objects.
Remove XE_MSVC_OPTIMIZE_SMALL() use in x64_sequences, x64 backend is now always size optimized because of premake
Make global_critical_region_ static constexpr in shared_memory.h to get rid of wasteage of 8 bytes (empty class=1byte, +alignment for next member=8)
Move trace-related data to the tail of SharedMemory to keep more important data together
In IssueDraw build an array of fetch constant addresses/sizes, then pre-lock the global lock before doing requestrange for each instead of individually locking within requestrange for each of them
Consistent access specifier protected for pm4_command_processor_declare
Devirtualize WriteOneRegisterFromRing.
Move ExecutePacket and ExecutePrimaryBuffer to pm4_command_buffer_x
Remove many redundant header inclusions access xenia-gpu
Minor microoptimization of ExecutePacketType0

Add TextureCache::RequestTextures for batch invocation of LoadTexturesData

Add TextureCache::LoadTexturesData for reducing the number of times we release and reacquire the global lock.
Ideally you should hold the global lock for as little time as possible, but if you are constantly acquiring and releasing it you are actually more likely to have contention
Add already_locked param to ObjectTable::LookupObject to help with reducing lock acquire/release pairs
Add missing checks to XAudioRegisterRenderDriverClient_entry. this is unlikely to fix anything, it was just an easy thing to do
Add NtQueryEvent system call implementation. I don't actually know of any games that need it.
Instead of using std::vector + push_back in KeWaitForMultipleObjects and xeNtWaitForMultipleObjectsEx use a fixed size array of 64 and track the count. More than 64 objects is not permitted by the kernel. The repeated reallocations from push_back were appearing unusually high on the profiler, but were masked until now by waitformultipleobjects natural overhead
Pre-lock the global lock before looking up each handle for xeNtWaitForMultipleObjectsEx and KeWaitForMultipleObjects.
Pre-lock before looking up the signal and waiter in NtSignalAndWaitForSingleObjectEx
add missing checks to NtWaitForMultipleObjectsEx
Support pre-locking in XObject::GetNativeObject
2022-10-08 09:55:17 -07:00
chss95cs@gmail.com bae63b95c5 Update to latest version of cxxopts 2022-09-30 06:51:25 -07:00
chss95cs@gmail.com b4c175d8a3 Enable SDL_LEAN_AND_MEAN, SDL_RENDER_DISABLED, saves about 500kb in final exe
Build several projects that arent performance critical with /Os and /O1 under msvc windows
2022-09-29 07:26:38 -07:00
chss95cs@gmail.com 7e58a3b320 Fix compiler errors i introduced under clang-cl
remove xe_kernel_export_shim_fn field of Export function_data, trampoline is now the only way exports get invoked
Remove kernelstate argument from string functions in order to conform to the trampoline signature (the argument was unused anyway)
Constant-evaluated initialization of ppc_opcode_disasm_table, removal of unused std::vector fields
Constant-evaluated initialization of export tables
name field on export is just a const char* now, only immutable static strings are ever passed to it
Remove unused callcount field of export.
PM4 compare op function extracted
Globally apply /Oy, /GS-, /Gw on msvc windows
Remove imgui testwindow code call, it took up like 300 kb
2022-09-29 07:04:17 -07:00
Gliniak 203267b106 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-09-23 12:23:53 +02:00
Rick Gibbed 3bfa3b05e1
Lint fix. 2022-09-22 06:34:21 -05:00
Gliniak 7d970967c4 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental 2022-09-20 21:15:12 +02:00
beeanyew cd17f1846f [Input System] xe_mutex revert
On request from chrisps, revert xe_mutex back to xe_unlikely_mutex to avoid mutex deadlocks while initializing hid-winkey.
2022-09-18 15:18:29 +02:00
chss95cs@gmail.com eb8154908c atomic cas use prefetchw if available
remove useless memorybarrier
remove double membarrier in wait pm4 cmd
add int64 cvar
use int64 cvar for x64 feature mask
Rework some functions that were frontend bound according to vtune placing some of their code in different noinline functions, profiling after indicating l1 cache misses decreased and perf of func increased
remove long vpinsrd dep chain code for conversion.h, instead do normal load+bswap or movbe if avail
Much faster entry table via split_map, code size could be improved though
GetResolveInfo was very large and had impact on icache, mark callees as noinline + msvc pragma optimize small
use log2 shifts instead of integer divides in memory
minor optimizations in PhysicalHeap::EnableAccessCallbacks, the majority of time in the function is spent looping, NOT calling Protect! Someone should optimize this function and rework the algo completely
remove wonky scheduling log message, it was spammy and unhelpful
lock count was unnecessary for criticalsection mutex, criticalsection is already a recursive mutex
brief notes i gotta run
2022-09-17 04:04:53 -07:00
Wunkolo addd8c94e5 [x64] Add AVX512 optimization for `OPCODE_VECTOR_ADD`(saturated)
Uses a single `vpternlogd` to test for signed/unsigned
overflow/underflow. Then utilizes AVX512 mask operations to create
either `0x7FFFFFFF` or `0x80000000` arithmetically.
2022-09-14 11:39:03 -05:00
Wunkolo 9fd684594b [x64] Add AVX512 optimization for `OPCODE_VECTOR_CONVERT_F2I`(unsigned)
`vcvttps2udq` already saturates overflowing and unordered values to `0xFFFFFFFF`. Using mask registers, zeroes are written to negative values within the same instruction.
2022-09-12 13:52:57 -05:00
chss95cs@gmail.com 0fd4a2533b Prevent clang-format from moving d3d12_nvapi above the require d3d12 headers 2022-09-11 14:35:33 -07:00
chss95cs@gmail.com 20638c2e61 use Sleep(0) instead of SwitchToThread, should waste less power and help the os with scheduling.
PM4 buffer handling made a virtual member of commandprocessor, place the implementation/declaration into reusable macro files. this is probably the biggest boost here.
Optimized SET_CONSTANT/ LOAD_CONSTANT pm4 ops based on the register range they start writing at, this was also a nice boost

Expose X64 extension flags to code outside of x64 backend, so we can detect and use things like avx512, xop, avx2, etc in normal code
Add freelists for HIR structures to try to reduce the number of last level cache misses during optimization (currently disabled... fixme later)

Analyzed PGO feedback and reordered branches, uninlined functions, moved code out into different functions based on info from it in the PM4 functions, this gave like a 2% boost at best.

Added support for the db16cyc opcode, which is used often in xb360 spinlocks. before it was just being translated to nop, now on x64 we translate it to _mm_pause but may change that in the future to reduce cpu time wasted

texture util - all our divisors were powers of 2, instead we look up a shift. this made texture scaling slightly faster, more so on intel processors which seem to be worse at int divs. GetGuestTextureLayout is now a little faster, although it is still one of the heaviest functions in the emulator when scaling is on.

xe_unlikely_mutex was not a good choice for the guest clock lock, (running theory) on intel processors another thread may take a significant time to update the clock? maybe because of the uint64 division? really not sure, but switched it to xe_mutex. This fixed audio stutter that i had introduced to 1 or 2 games, fixed performance on that n64 rare game with the monkeys.
Took another crack at DMA implementation, another failure.
Instead of passing as a parameter, keep the ringbuffer reader as the first member of commandprocessor so it can be accessed through this
Added macro for noalias
Applied noalias to Memory::LookupHeap. This reduced the size of the executable by 7 kb.
Reworked kernel shim template, this shaved like 100kb off the exe and eliminated the indirect calls from the shim to the actual implementation. We still unconditionally generate string representations of kernel calls though :(, unless it is kHighFrequency

Add nvapi extensions support, currently unused. Will use CPUVISIBLE memory at some point
Inserted prefetches in a few places based on feedback from vtune.
Add native implementation of SHA int8 if all elements are the same

Vectorized comparisons for SetViewport, SetScissorRect
Vectorized ranged comparisons for WriteRegister
Add XE_MSVC_ASSUME
Move FormatInfo::name out of the structure, instead look up the name in a different table. Debug related data and critical runtime data are best kept apart
Templated UpdateSystemConstantValues based on ROV/RTV and primitive_polygonal
Add ArchFloatMask functions, these are for storing the results of floating point comparisons without doing costly float->int pipeline transfers (vucomiss/setb)
Use floatmasks in UpdateSystemConstantValues for checking if dirty, only transfer to int at end of function.
Instead of dirty |= (x == y) in UpdateSystemConstantValues, now we do dirty_u32 |= (x^y). if any of them are not equal, dirty_u32 will be nz, else if theyre all equal it will be zero. This is more friendly to register renaming and the lack of dependencies on EFLAGS lets the compiler reorder better
Add PrefetchSamplerParameters to D3D12TextureCache
use PrefetchSamplerParameters in UpdateBindings to eliminate cache misses that vtune detected

Add PrefetchTextureBinding to D3D12TextureCache
Prefetch texture bindings to get rid of more misses vtune detected (more accesses out of order with random strides)
Rewrote DMAC, still terrible though and have disabled it for now.
Replace tiny memcmp of 6 U64 in render_target_cache with inline loop, msvc fails to make it a loop and instead does a thunk to their memcmp function, which is optimized for larger sizes

PrefetchTextureBinding in AreActiveTextureSRVKeysUpToDate
Replace memcmp calls for pipelinedescription with handwritten cmp
Directly write some registers that dont have special handling in PM4 functions
Changed EstimateMaxY to try to eliminate mispredictions that vtune was reporting, msvc ended up turning the changed code into a series of blends

in ExecutePacketType3_EVENT_WRITE_EXT, instead of writing extents to an array on the stack and then doing xe_copy_and_swap_16 of the data to its dest, pre-swap each constant and then store those. msvc manages to unroll that into wider stores
stop logging XE_SWAP every time we receive XE_SWAP, stop logging the start and end of each viz query

Prefetch watch nodes in FireWatches based on feedback from vtune
Removed dead code from texture_info.cc
NOINLINE on GpuSwap, PGO builds did it so we should too.
2022-09-11 14:14:48 -07:00