Ported from https://github.com/xqemu/xqemu/pull/113 by dracc https://github.com/dracc
A few GPU timing-related fixes
The NV2A crystal frequency is currently set to the debug unit speed of 13.5 MHz.
Later retail Xboxes had a 16.67 MHz crystal according to @JayFoxRox's research(?).
Also, the current ptimer clock value calculation is flipped around and, on top of that, always overflows.
With these fixes, running get_timers.py against xqemu reports values close to what real hardware gives.
Proposed changes
Set NV2A_CRYSTAL_FREQ to retail speed
Fix ptimer_get_clock() behaviour
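The overflow part of the ptimer fix can be illustrated with a minimal sketch. It assumes the clock is derived by scaling a host nanosecond count by the crystal frequency. The helper names `MulDiv64` and `NsToTicks` and the exact Hz constants are illustrative assumptions, not taken from the xqemu source.

```cpp
#include <cstdint>

// Illustrative constants (assumptions): the description only gives
// 13.5 MHz (debug) and 16.67 MHz (retail), so these are rounded values.
constexpr uint32_t kDebugCrystalHz  = 13500000;
constexpr uint32_t kRetailCrystalHz = 16670000;
constexpr uint32_t kNsPerSecond     = 1000000000;

// Computes a * b / c without overflowing the intermediate product, by
// splitting `a` into 32-bit halves. The result is truncated to 64 bits
// if the true quotient does not fit.
static uint64_t MulDiv64(uint64_t a, uint32_t b, uint32_t c)
{
    uint64_t lo = (a & 0xffffffffu) * b;  // low half times b
    uint64_t hi = (a >> 32) * b;          // high half times b
    hi += lo >> 32;                       // carry from the low product
    uint64_t res_hi = hi / c;
    uint64_t res_lo = (((hi % c) << 32) | (lo & 0xffffffffu)) / c;
    return (res_hi << 32) | res_lo;
}

// Hypothetical helper: converts elapsed nanoseconds to crystal ticks.
static uint64_t NsToTicks(uint64_t ns, uint32_t crystal_hz)
{
    return MulDiv64(ns, crystal_hz, kNsPerSecond);
}
```

A plain `ns * crystal_hz / kNsPerSecond` would wrap around once the product exceeds 64 bits, which at these frequencies happens after roughly 18 minutes of nanosecond time; the split multiplication avoids that.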
Having multiple build types for each branch is proving to be confusing for some users.
Ever since we implemented #1388, there has been no end-user benefit to using Debug builds, as users can get the same logging output by setting the log level to Debug.
Both XInput and DInput are migrated together. It's best to keep them in the XInput folder.
As for the Xapi files, they may not require any plugin? Or possibly they could be put into their own folder?
Cortex (and likely many others)
Calling the unpatched trampoline of the patched function is enough to
solve the issue. This further reinforces that unpatching these functions
and reading from NV2A state is the right thing to do; work will continue
on that as a separate branch.
This allows for both steps to be completely disconnected, easily
allowing patches to be turned on or off based on a set of flags, as well
as preventing the need to clear the HLE cache when switching from
HLE->LLE.
This also allows patches to be seen/modified from a central location,
no more searching through the codebase to determine if a function should
be patched or not, and no more 'FUNC_EXPORTS/GetProcAddress' magic!
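A central, flag-driven patch table could look roughly like the sketch below. All names here (`PatchFlags`, `PatchEntry`, `FindPatch`, and the example function names) are hypothetical and chosen for illustration; they are not the identifiers used in the codebase.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical flag bits controlling which groups of patches are active.
enum PatchFlags : uint32_t {
    PATCH_ALWAYS  = 0,
    PATCH_HLE_D3D = 1u << 0,
    PATCH_HLE_APU = 1u << 1,
};

using PatchFn = void (*)();

struct PatchEntry {
    const char* name;         // symbol name found by the detection step
    PatchFn     patch;        // host-side replacement
    uint32_t    required;     // flags that must all be set to apply it
};

// Stand-ins for real host-side replacements.
static void ExampleD3DPatch() {}
static void ExampleApuPatch() {}
static void ExampleKernelPatch() {}

// The single place where every patch is listed.
static const PatchEntry g_patch_table[] = {
    { "ExampleD3DFunction",    ExampleD3DPatch,    PATCH_HLE_D3D },
    { "ExampleApuFunction",    ExampleApuPatch,    PATCH_HLE_APU },
    { "ExampleKernelFunction", ExampleKernelPatch, PATCH_ALWAYS  },
};

// Returns the patch for `name` if it is listed and every flag it requires
// is present in `active_flags`; otherwise returns nullptr (leave unpatched).
static PatchFn FindPatch(const char* name, uint32_t active_flags)
{
    for (const PatchEntry& entry : g_patch_table) {
        if (std::strcmp(entry.name, name) == 0) {
            bool enabled = (active_flags & entry.required) == entry.required;
            return enabled ? entry.patch : nullptr;
        }
    }
    return nullptr;
}
```

Because the lookup depends only on the table and the active flags, symbol detection results can be kept unchanged while the set of applied patches varies.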
Currently, this is used for HLE only, but could really shine when
extended to introduce optional detour-based logging even when LLE is
enabled.
For example, we could easily add an LLE_D3D_DETOUR flag which, when
enabled, patches functions with a wrapper that simply logs input and
output, calling the original xbox function via a trampoline.
This would be great for debugging, as we'd get a full call trace from
the API level, even when not implementing HLE.
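A logging-only detour of this kind might be sketched as follows. `OriginalFunction`, `g_trampoline`, `LoggingDetour`, and `g_call_log` are hypothetical stand-ins: in a real setup the trampoline would point at the relocated original Xbox code rather than at a host function.

```cpp
#include <string>
#include <vector>

// Collected call trace; a real implementation would use the emulator's logger.
static std::vector<std::string> g_call_log;

// Stand-in for the original guest function.
static int OriginalFunction(int a, int b)
{
    return a + b;
}

// Stand-in for the trampoline that reaches the unpatched original.
using ExampleFn = int (*)(int, int);
static ExampleFn g_trampoline = OriginalFunction;

// Wrapper installed in place of the original: it forwards the call
// unchanged and records the arguments and the result.
static int LoggingDetour(int a, int b)
{
    int result = g_trampoline(a, b);
    g_call_log.push_back("ExampleFunction(" + std::to_string(a) + ", " +
                         std::to_string(b) + ") -> " + std::to_string(result));
    return result;
}
```

The wrapper does not alter behaviour, so the guest still runs its own code path while the host gains an API-level trace.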
There's also the possibility of mixing in some patches even with LLE
enabled, for a hybrid HLE/LLE solution of the same functionality,
but there are no plans to implement that at this moment in time.
- Updated incomplete logic for texm instructions to perform the correct number of dot products
- Updated CMP instruction logic to more closely match CND logic
- Added missing compare to depth instruction
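For reference, the per-component selection rules of the two instructions, as described in the Direct3D ps_1_x documentation, can be sketched as below. This is not the emulator's shader conversion code; the function names `Cnd` and `Cmp` are illustrative, and the sketch ignores the ps_1_1 to ps_1_3 restriction that `cnd` tests only `r0.a`.

```cpp
// cnd: selects src1 when the condition value is greater than 0.5.
static float Cnd(float src0, float src1, float src2)
{
    return (src0 > 0.5f) ? src1 : src2;
}

// cmp: selects src1 when the condition value is greater than or equal to 0.
static float Cmp(float src0, float src1, float src2)
{
    return (src0 >= 0.0f) ? src1 : src2;
}
```

The two differ only in the threshold and in whether the comparison is strict, which is why their conversion logic can share most of its structure.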