Externally it is still applied via: ApplyTitleUpdate
But internally it is splitted into:
- FindTitleUpdate
- LoadTitleUpdate
- IsPatchSignatureProper
- ApplyTitleUpdate
Added host window with warning about TU incompatibility
* Add ifdef check before the Microsoft-specific movsq in memory.cc
* Added ifdef before Microsoft-specific movsq and replaced with memcpy in other cases. In memory.cc
* Update image_sha_bytes_ from 16 to 20 xex_module.h
The value 16 is less than the expected value 20, causing a buffer overflow during sha1 finalization.
* Update image_sha_bytes_ loop from 16 to 20 iterations xex_module.cc
* Update mapped_memory_posix.cc: Must resize file to map_length.
* Should not map nullptr with MAP_FIXED flag. Update memory_posix.cc.
add typed guest pointer template
add X_KSPINLOCK, rework spinlock functions.
rework irql related code, use irql on pcr instead of on XThread
add guest linked list helper functions
renamed ProcessInfoBlock to X_KPROCESS
assigned names to many kernel structure fields
AVX512 has native unsigned integer comparisons instructions, removing
the need to XOR the most-significant-bit with a constant in memory to
use the signed comparison instructions. These instructions only write to
a k-mask register though and need an additional call to `vpmovm2*` to
turn the mask-register into a vector-mask register.
As of Icelake:
`vpcmpu*` is all L3/T1
`vpmovm2d` is L1/T0.33
`vpmovm2{b,w}` is L3/T0.33
As of Zen4:
`vpcmpu*` is all L3/T0.50
`vpmovm2*` is all L1/T0.25
Other half of #2125. I don't know of any title that utilizes this instruction, but I went ahead and implemented it for completeness.
Verified the implementation with `instr__gen_vsubcuw` from #1348. Can be grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vsubcuw.s
```
I don't know of any title that utilizes this instruction, but I went
ahead and implemented it for completeness.
Verified the implementation with `instr__gen_vaddcuw` from #1348. Can be
grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vaddcuw.s
```
The "close window" keyboard hotkey (Guide-B) now toggles between loglevel -1 and the loglevel set in your config.
Added LoggerBatch class, which accumulates strings into the threads scratch buffer. This is only intended to be used for very high frequency debug logging. if it exhausts the thread buffer, it just silently stops.
Cleaned nearly 8 years of dust off of the pm4 packet disassembler code, now supports all packets that the command processor supports.
Added extremely verbose logging for gpu register writes. This is not compiled in outside of debug builds, requires LogLevel::Debug and log_guest_driven_gpu_register_written_values = true.
Added full logging of all PM4 packets in the cp. This is not compiled in outside of debug builds, requires LogLevel::Debug and disassemble_pm4.
Piggybacked an implementation of guest callstack backtraces using the stackpoints from enable_host_guest_stack_synchronization. If enable_host_guest_stack_synchronization = false, no backtraces can be obtained.
Added log_ringbuffer_kickoff_initiator_bts. when a thread updates the cp's read pointer, it dumps the backtrace of that thread
Changed the names of the gpu registers CALLBACK_ADDRESS and CALLBACK_CONTEXT to the correct names.
Added a note about CP_PROG_COUNTER
Added CP_RB_WPTR to the gpu register table
Added notes about CP_RB_CNTL and CP_RB_RPTR_ADDR. Both aren't necessary for HLE
Changed name of UNKNOWN_0E00 gpu register to TC_CNTL_STATUS. Games only seem to write 1 to it (L2 invalidate)
uses a bitmap that splits up the memory space into 65k blocks per bit. Currently is using the guest virtual address but should be using physical addresses instead.
Currently if a guest does a reserve on a location and then a reserved store to a totally different location we trigger a breakpoint. This should never happen
Also removed the NEGATED_MUL_blah operations. They weren't necessary, nothing special is needed for the negated result variants.
Added a log message for when watched physical memory has a race, it just would be nice to know when it happens and in what games.
Fixed PrefetchW feature check
Added prefetchw check to startup AVX check, there should be no CPUs that support AVX but not PrefetchW.
Init VRSAVE to all ones.
Removed unused disable_global_lock flag.
Uses `vpternlogd` to collapse the bitwise select operation into one
instruction. Though it needs a `vmovdqa` instruction since `vpternlogd`
reads and writes to the first argument.
Add constexpr getters to magicdiv class so it can be used from jitted x64/dxbc
Track the guest return address as well for guest/host sync, if multiple entries have the same guest stack find the first one with a matching guest retaddr. this fixes epic mickey 2 (which the previous guest-stack change had allowed to go ingame for a bit) and potentially also a crash in fable3.
Break if under debugger when stackpoints are overflowed
Add much more useful output for host exceptions, print out xenia_canary.exe relative offsets if exception is in module, formatmessage for ntstatus/win32err, strerror
Minor d3d12 microoptimization, instead of doing SetEventOnCompletion + WaitForSingleObject do SetEventOnCompletion w/ nullptr so that the wait happens in kernel mode, avoiding two extra context switches
add unimplemented kernel functions:
ExAllocatePoolWithTag
ObReferenceObject
ObDereferenceObject has no return value.
Log a message when ObDereferenceObject/Reference receive unregistered guest kernel objects
gave ObLookupThreadByThreadId its correct error status
hoist object_types initialization out of ObReferenceObjectByHandle
Fix out parameter values on error for a few kernel funcs
add note about msr to KeSetCurrentStackPointers
add X_STATUS_OBJECT_TYPE_MISMATCH check for xeNtSetEvent
add msr_mask field to X_KPCR
Remove dead #if 0'd code in math.h
On amd64, page_size == 4096 constant, on amd64 w/ win32, allocation_granularity == 65536. These values for x86 windows havent changed over the last 20 years so this is probably safe
and gives a modest code size reduction
Enable XE_USE_KUSER_SHARED. This sources host time from KUSER_SHARED instead of from QueryPerformanceCounter, which is far faster, but only has a granularity of 100 nanoseconds.
In some games seemingly random crashes were happening that were hard to trace because
the faulting thread was actually not the one that was misbehaving, another threads stack was underflowing into the faulting thread.
Added a bunch of code to synchronize the guest stack and host stack so that if a guest longjmps the host's stack will be adjusted.
Changes were also made to allow the guest to call into a piece of an existing x64 function.
This synchronization might have a slight performance impact on lower end cpus, to disable it set enable_host_guest_stack_synchronization to false.
It is possible it may have introduced regressions, but i dont know of any yet
So far, i know the synchronization change fixes the "hub crash" in super sonic and allows the game "london 2012" to go ingame.
Removed emit_useless_fpscr_updates, not emitting these updates breaks the raiden game
MapGuestAddressToMachineCode now returns nullptr if no address was found, instead of the start of the function
add Processor::LookupModule
Add Backend::DeinitializeBackendContext
Use WriteRegisterRangeFromRing_WithKnownBound<0, 0xFFFF> in WriteRegisterRangeFromRing for inlining (previously regressed on performance of ExecutePacketType0)
add notes about flags that trap in XamInputGetCapabilities
0 == 3 in XamInputGetCapabilities
Name arg 2 of XamInputSetState
PrefetchW in critical section kernel funcs if available & doing cmpxchg
Add terminated field to X_KTHREAD, set it on termination
Expanded the logic of NtResumeThread/NtSuspendThread to include checking the type of the handle (in release, LookupObject doesnt seem to do anything with the type)
and returning X_STATUS_OBJECT_TYPE_MISMATCH if invalid. Do termination check in NtSuspendThread.
Add basic host exception messagebox, need to flesh it out more (maybe use the new stack tracking stuff if on guest thrd?)
Add rdrand patching hack, mostly affects users with nvidia cards who have many threads on zen
Use page_size_shift in more places
Once again disable precompilation! Raiden is mostly weird ppc asm which probably breaks the precompilation. The code is still useful for running the compiler over the whole of an xex in debug to test for issues
"Fix" debug console, we were checking the cvar before any cvars were loaded, and the condition it checks in AttachConsole is somehow always false
Remove dead #if 0'd code in math.h
On amd64, page_size == 4096 constant, on amd64 w/ win32, allocation_granularity == 65536. These values for x86 windows havent changed over the last 20 years so this is probably safe
and gives a modest code size reduction
Enable XE_USE_KUSER_SHARED. This sources host time from KUSER_SHARED instead of from QueryPerformanceCounter, which is far faster, but only has a granularity of 100 nanoseconds.
In some games seemingly random crashes were happening that were hard to trace because
the faulting thread was actually not the one that was misbehaving, another threads stack was underflowing into the faulting thread.
Added a bunch of code to synchronize the guest stack and host stack so that if a guest longjmps the host's stack will be adjusted.
Changes were also made to allow the guest to call into a piece of an existing x64 function.
This synchronization might have a slight performance impact on lower end cpus, to disable it set enable_host_guest_stack_synchronization to false.
It is possible it may have introduced regressions, but i dont know of any yet
So far, i know the synchronization change fixes the "hub crash" in super sonic and allows the game "london 2012" to go ingame.
Removed emit_useless_fpscr_updates, not emitting these updates breaks the raiden game
MapGuestAddressToMachineCode now returns nullptr if no address was found, instead of the start of the function
add Processor::LookupModule
Add Backend::DeinitializeBackendContext
Use WriteRegisterRangeFromRing_WithKnownBound<0, 0xFFFF> in WriteRegisterRangeFromRing for inlining (previously regressed on performance of ExecutePacketType0)
add notes about flags that trap in XamInputGetCapabilities
0 == 3 in XamInputGetCapabilities
Name arg 2 of XamInputSetState
PrefetchW in critical section kernel funcs if available & doing cmpxchg
Add terminated field to X_KTHREAD, set it on termination
Expanded the logic of NtResumeThread/NtSuspendThread to include checking the type of the handle (in release, LookupObject doesnt seem to do anything with the type)
and returning X_STATUS_OBJECT_TYPE_MISMATCH if invalid. Do termination check in NtSuspendThread.
Add basic host exception messagebox, need to flesh it out more (maybe use the new stack tracking stuff if on guest thrd?)
Add rdrand patching hack, mostly affects users with nvidia cards who have many threads on zen
Use page_size_shift in more places
Once again disable precompilation! Raiden is mostly weird ppc asm which probably breaks the precompilation. The code is still useful for running the compiler over the whole of an xex in debug to test for issues
add msr field on context
write to msr for mtmsr/mfmsr, do not have correct default value for msr yet, nor has mtmsrd been reimplemented
do not evaluate assert expressions in release at all, while still avoiding unused variable warnings