xenia-canary

Commit Graph

Author	SHA1	Message	Date
chrisps	0f94eb21c2	Merge pull request #102 from chrisps/error_modules_and_threads_plus_xmacontext_workaround Host exception improvements, bandaid over div by 0 crash	2022-12-10 09:09:52 -08:00
chss95cs@gmail.com	7d49b97e4c	Print any module name+ offset in host exception reports print thread name in host exception reports trying to force win32 error descriptions to english Return if output buffer block count is 0 in XmaContext, this is an attempt to fix a divide by zero crash many users have reported	2022-12-09 12:24:06 -08:00
Gliniak	7c5da821d4	[Kernel] Fixed invalid thread pointer in KeEnableFpuExceptions	2022-12-08 21:48:13 +01:00
Radosław Gliński	747fb42bdf	Merge pull request #98 from AdrianCassar/canary_experimental Added a hotkey to open the previously played title	2022-12-05 18:09:52 +01:00
MoistyMarley	6d2724a861	Added a hotkey to open the previously played title	2022-12-05 17:01:16 +00:00
chrisps	85723f117d	Merge pull request #99 from chrisps/stack_sync2_fence_krnl_hostexcept Improve stack sync, kernel fixes, better host exception reporting	2022-12-04 13:50:06 -08:00
chss95cs@gmail.com	a63f424c0a	Directly check PEB for IsDebuggerAttached Add constexpr getters to magicdiv class so it can be used from jitted x64/dxbc Track the guest return address as well for guest/host sync, if multiple entries have the same guest stack find the first one with a matching guest retaddr. this fixes epic mickey 2 (which the previous guest-stack change had allowed to go ingame for a bit) and potentially also a crash in fable3. Break if under debugger when stackpoints are overflowed Add much more useful output for host exceptions, print out xenia_canary.exe relative offsets if exception is in module, formatmessage for ntstatus/win32err, strerror Minor d3d12 microoptimization, instead of doing SetEventOnCompletion + WaitForSingleObject do SetEventOnCompletion w/ nullptr so that the wait happens in kernel mode, avoiding two extra context switches add unimplemented kernel functions: ExAllocatePoolWithTag ObReferenceObject ObDereferenceObject has no return value. Log a message when ObDereferenceObject/Reference receive unregistered guest kernel objects gave ObLookupThreadByThreadId its correct error status hoist object_types initialization out of ObReferenceObjectByHandle Fix out parameter values on error for a few kernel funcs add note about msr to KeSetCurrentStackPointers add X_STATUS_OBJECT_TYPE_MISMATCH check for xeNtSetEvent add msr_mask field to X_KPCR	2022-12-04 12:38:19 -08:00
Triang3l	e97eb75b94	[Vulkan] Update variableMultisampleRate comments (actually supported) [ci skip]	2022-12-04 14:55:56 +03:00
Triang3l	0b4f5ef286	[SPIR-V] Decorate whole gl_PerVertex with Invariant Block members can be decorated with Invariant only since SPIR-V 1.5 Revision 2. In earlier versions, Invariant can be used only for variables. Mesa warns about this.	2022-12-03 14:27:43 +03:00
Gliniak	1eb61aa9ab	Added reccently opened titles list	2022-11-29 10:47:30 +01:00
chrisps	0674b68143	Merge pull request #96 from chrisps/host_guest_stack_synchronization Host/Guest stack sync, exception messagebox, kernel improvements, minor opt	2022-11-27 10:30:16 -08:00
Gliniak	12005acc98	[APU] Check if splitted frame length is valid	2022-11-27 18:40:27 +01:00
chss95cs@gmail.com	90c771526d	"Fix" debug console, we were checking the cvar before any cvars were loaded, and the condition it checks in AttachConsole is somehow always false Remove dead #if 0'd code in math.h On amd64, page_size == 4096 constant, on amd64 w/ win32, allocation_granularity == 65536. These values for x86 windows havent changed over the last 20 years so this is probably safe and gives a modest code size reduction Enable XE_USE_KUSER_SHARED. This sources host time from KUSER_SHARED instead of from QueryPerformanceCounter, which is far faster, but only has a granularity of 100 nanoseconds. In some games seemingly random crashes were happening that were hard to trace because the faulting thread was actually not the one that was misbehaving, another threads stack was underflowing into the faulting thread. Added a bunch of code to synchronize the guest stack and host stack so that if a guest longjmps the host's stack will be adjusted. Changes were also made to allow the guest to call into a piece of an existing x64 function. This synchronization might have a slight performance impact on lower end cpus, to disable it set enable_host_guest_stack_synchronization to false. It is possible it may have introduced regressions, but i dont know of any yet So far, i know the synchronization change fixes the "hub crash" in super sonic and allows the game "london 2012" to go ingame. Removed emit_useless_fpscr_updates, not emitting these updates breaks the raiden game MapGuestAddressToMachineCode now returns nullptr if no address was found, instead of the start of the function add Processor::LookupModule Add Backend::DeinitializeBackendContext Use WriteRegisterRangeFromRing_WithKnownBound<0, 0xFFFF> in WriteRegisterRangeFromRing for inlining (previously regressed on performance of ExecutePacketType0) add notes about flags that trap in XamInputGetCapabilities 0 == 3 in XamInputGetCapabilities Name arg 2 of XamInputSetState PrefetchW in critical section kernel funcs if available & doing cmpxchg Add terminated field to X_KTHREAD, set it on termination Expanded the logic of NtResumeThread/NtSuspendThread to include checking the type of the handle (in release, LookupObject doesnt seem to do anything with the type) and returning X_STATUS_OBJECT_TYPE_MISMATCH if invalid. Do termination check in NtSuspendThread. Add basic host exception messagebox, need to flesh it out more (maybe use the new stack tracking stuff if on guest thrd?) Add rdrand patching hack, mostly affects users with nvidia cards who have many threads on zen Use page_size_shift in more places Once again disable precompilation! Raiden is mostly weird ppc asm which probably breaks the precompilation. The code is still useful for running the compiler over the whole of an xex in debug to test for issues	2022-11-27 09:39:33 -08:00
Gliniak	1451ca4266	[APU] Clear host data while reseting context	2022-11-27 17:00:31 +01:00
Gliniak	9fdfd2ada9	[APU] Removed old hack that invalidates input on decoder error Added returning parsing error while decoder fails	2022-11-26 17:25:39 +01:00
Joel Linn	7dd715ea6f	[CI, Drone] Disable HighResolutionTimer test cases	2022-11-20 16:41:55 -06:00
chrisps	6e541536dd	Merge pull request #93 from chrisps/canary_experimental add some missing kthread fields, fix assert eval in release	2022-11-07 14:49:31 -08:00
chss95cs@gmail.com	7a17fad88a	fix crash from precompiling out of range funcs, add xexcache version, increment xexcache version (all priors are version 0 thanks to 0 initialization)	2022-11-07 05:40:18 -08:00
chss95cs@gmail.com	e21fd22d09	add x_kthread priority/fpu_exceptions_on fields, set fpu_exceptions_on in KeEnableFpuExceptions, set priority in SetPriority add msr field on context write to msr for mtmsr/mfmsr, do not have correct default value for msr yet, nor has mtmsrd been reimplemented do not evaluate assert expressions in release at all, while still avoiding unused variable warnings	2022-11-06 11:03:10 -08:00
chrisps	3dcbd25e7f	Merge pull request #92 from chrisps/canary_experimental ffmpeg decoder optimizations, kernel fixes, cpu backend fixes, clang warnings, implement some missing kernel functions	2022-11-05 11:59:35 -07:00
chss95cs@gmail.com	c70ae76a69	hopefully switched cxxopts to the main master branch now that the selectany changes are accepted	2022-11-05 11:08:04 -07:00
chss95cs@gmail.com	c1d922eebf	Minor decoder optimizations, kernel fixes, cpu backend fixes	2022-11-05 10:50:33 -07:00
Gliniak	ba66373d8c	[APU][Janky] Fixed issues with incorrect frames on streamed data This requires a lot more research and test data!	2022-11-03 20:56:36 +01:00
Gliniak	dae508500a	[APU] Clear remaining packets skip when we're done with current stream Plus some additional logging	2022-11-03 12:59:47 +01:00
Margen67	4ba14bc35e	[APU+HID] Optimizations	2022-11-03 03:56:13 -07:00
Gliniak	b23566b823	[APU] Fix incorrect packet frame count when frame ends exactly where packet ends This resolves looping background sound in GoW	2022-11-03 11:14:37 +01:00
Gliniak	259679d53c	[APU] Handle exceeding input offset by switching buffer This should resolve crashes in FH	2022-11-02 08:47:36 +01:00
chrisps	ff0f3fcc9d	Merge pull request #89 from xenia-canary/revert-87-canary_experimental Revert "Minor decoder optimizations, kernel fixes, cpu backend fixes"	2022-11-01 14:46:55 -07:00
chrisps	8186792113	Revert "Minor decoder optimizations, kernel fixes, cpu backend fixes"	2022-11-01 14:45:36 -07:00
chrisps	781871e2d5	Merge pull request #87 from chrisps/canary_experimental Minor decoder optimizations, kernel fixes, cpu backend fixes	2022-11-01 11:49:10 -07:00
Gliniak	c080e2e17c	[APU] Resolved crashes related to out of bound readouts	2022-11-01 11:24:01 +01:00
Triang3l	778333b1b5	[UI] Fix ClearInput not called in ImGuiDrawer after deferred dialog removal Also cleanup the code involved in dialog registration, and update the explanation of why dialog removal is delayed until the end of drawing (the original was written back when window listener and UI drawer callback registration during the execution of the callbacks was deferred, but that was wrong as that might result in execution of callbacks belonging to now-deleted objects).	2022-10-31 18:57:54 +03:00
chss95cs@gmail.com	06bfd624de	fix failed debug build from loops variable assert	2022-10-30 12:33:08 -07:00
chss95cs@gmail.com	941237027d	fix ffmpeg submodule ptr	2022-10-30 11:16:05 -07:00
chss95cs@gmail.com	bff264b5fd	Fixed RtlCompareString and RtlCompareStringN, they were very wrong, for CompareString the params are struct ptrs not char ptrs Fixed a ton of clang-cl compiler warnings about unused variables, still many left. Fixed a lot of inconsistent override ones too	2022-10-30 10:47:09 -07:00
chrisps	65b9d93551	Merge branch 'xenia-canary:canary_experimental' into canary_experimental	2022-10-30 09:05:40 -07:00
chss95cs@gmail.com	f5cc54bdae	Fix building on clang-cl, it did not like the cxxopts selectany changes	2022-10-30 09:05:10 -07:00
chss95cs@gmail.com	4fc18949a2	Merge branch 'canary_experimental' of https://github.com/chrisps/xenia-canary into canary_experimental	2022-10-30 08:55:53 -07:00
chss95cs@gmail.com	550d1d0a7c	use much faster exp2/cos approximations in ffmpeg, large decrease in cpu usage on my machine on decoder thread properly byteswap r13 for spinlock Add PPCOpcodeBits stub out broken fpscr updating in ppc_hir_builder. it's just code that repeatedly does nothing right now. add note about 0 opcode bytes being executed to ppc_frontend Add assert to check that function end is greater than function start, can happen with malformed functions Disable prefetch and cachecontrol by default, automatic hardware prefetchers already do the job for the most part minor cleanup in simplification_pass, dont loop optimizations, let the pass manager do it for us Add experimental "delay_via_maybeyield" cvar, which uses MaybeYield to "emulate" the db16cyc instruction Add much faster/simpler way of directly calling guest functions, no longer have to do a byte by byte search through the generated code Generate label string ids on the fly Fix unused function warnings for prefetch on clang, fix many other clang warnings Eliminated majority of CallNativeSafes by replacing them with naive generic code paths. ^ Vector rotate left, vector shift left, vector shift right, vector shift arithmetic right, and vector average are included These naive paths are implemented small loops that stash the two inputs to the stack and load them in gprs from there, they are not particularly fast but should be an order of magnitude faster than callnativesafe to a host function, which would involve a call, stashing all volatile registers, an indirect call, potentially setting up a stack frame for the arrays that the inputs get stashed to, the actual operations, a return, loading all volatile registers, a return, etc Added the fast SHR_V128 path back in Implement signed vector average byte, signed vector average word. previously we were emitting no code for them. signed vector average byte appears in many games Fix bug with signed vector average 32, we were doing unsigned shift, turning negative values into big positive ones potentially	2022-10-30 08:48:58 -07:00
Gliniak	55877f4c61	[APU] Force buffer swap at the end of stream Plus some debugging messages and lint fixes	2022-10-25 17:20:45 +02:00
Gliniak	6b11787c93	[APU] Fixed typo that prevented last packet in stream to be processed	2022-10-24 21:33:25 +02:00
Gliniak	fac2a89d0f	Disallow offset to be set before header, header size fix, audio channels crashfix	2022-10-24 19:43:43 +02:00
Triang3l	a37b57ca8d	[GPU] Fix tiled mip tail extent calculation Previously, for mips, the dimensions of the texture weren't rounded to powers of two before calculating the mip tail extent, resulting in the mip tail for a 260 blocks tall texture, that contains mips ending at Y of up to 36, having the Y extent calculated as 32. With rounding to powers of two, it would have been 64. However, with the GetTiledAddressUpperBound functions, none of this is necessary at all (and neither is rounding the extents in TextureGuestLayout::Level to 32x32x4 blocks) - using the same code for calculating the XYZ extents of tiled textures as for linear textures now, which, for the mip tail, calculates the actual maximum coordinates of the mips stored in it - and rounding to tiles is done internally by GetTiledAddressUpperBound.	2022-10-23 21:26:47 +03:00
Triang3l	74f1f6bb6d	[Vulkan] Check depthClamp feature	2022-10-23 19:01:17 +03:00
Triang3l	4add1f67b1	[D3D12] Replace unused shared memory view with a null view Fixes the PIX validation warning about missing resource states on every guest draw. Also potentially prevents drivers from making assumptions about the shared memory buffer based on the bindings, though no such cases are currently known.	2022-10-23 18:09:47 +03:00
Wunkolo	5fde7c6aa5	[x64] Add AVX512 optimizations for `PERMUTE_V128` Uses the single-instruction AVX512 `vperm*` instructions to accelerate the `INT8_TYPE` and `INT16_TYPE` permutation opcodes. The `INT8_TYPE` is accelerated using `AVX512VBMI` subset of AVX512. Available since Icelake(Intel) and Zen4(AMD).	2022-10-21 08:47:31 -05:00
Wunkolo	f207239349	[x64] Add `kX64EmitAVX512VBMI` feature-flag and detection Allows access to byte-element 2-register permutations(32-byte look up tables) and for 64-bit multi-shifts. Particularly adding this to accelerate the assembly of our `PERMUTE` opcode.	2022-10-21 08:47:31 -05:00
Wunkolo	d73088e5ca	[x64] Add AVX512 optimization for `OPCODE_VECTOR_SUB`(saturated) Passes the `vsubuws` and `vsubsws` unit-tests from https://github.com/xenia-project/xenia/pull/1348	2022-10-21 08:45:43 -05:00
Radosław Gliński	7c375879bc	Merge pull request #85 from chrisps/canary_experimental Kernel improvements, "fix" crash on sandy bridge/ivy bridge	2022-10-21 14:18:03 +02:00
chrisps	4493d17acc	Update hir_builder.cc	2022-10-20 14:58:27 -07:00

7325 Commits
