6687 Commits

Author SHA1 Message Date
PatrickvL 7e0f8f4b30
Merge pull request #2031 from CookiePLMonster/xdk-3911-ltcg-fixes
Misc fixes after the vertex declaration changes
2020-11-15 22:20:51 +01:00
Silent 770d062014
Fix a merge error in CxbxImpl_SetViewPort 2020-11-15 22:03:43 +01:00
Silent 67c31c650b
Const qualify a few parameters in XbVertexShader.cpp functions
No functional change, just preventing future me from a typo I made.
Const-qualifying those parameters would have prevented it.
2020-11-15 22:03:43 +01:00
Silent dab1da6caf
Fix X_VERTEXSHADER_FLAG_PROGRAM flag for XDK-3948
In this XDK X_VERTEXSHADER_FLAG_PROGRAM appears to have a value of 4,
not 16. Adjusted the code accordingly and added test cases to verify
that assumption.
2020-11-15 22:03:43 +01:00
Silent c3568f6ea3
Fix D3DDevice_SetVertexShader_0 not calling a trampoline
Fixes NASCAR Heat 2002 rendering (again)
2020-11-15 20:28:24 +01:00
Luke Usher 213dd2f86f
Merge pull request #1894 from PatrickvL/vertex_declaration_refactoring
Vertex declaration refactoring
2020-11-14 14:04:58 +00:00
Luke Usher 3f4f141cf1
Merge pull request #2030 from RadWolfie/update-env-method
Action: Fix CI Release Job
2020-11-13 12:08:12 +00:00
RadWolfie a6353f7554 action: fix CI release job 2020-11-13 02:21:30 -06:00
Luke Usher 498cf39664
Merge pull request #2029 from RadWolfie/update-env-method
Action: Replace Deprecated Method to New Method for Set Environment
2020-11-12 08:57:54 +00:00
RadWolfie 2759d9bb8c action: replace deprecated method to new method for set environment 2020-11-11 12:33:27 -06:00
PatrickvL c3993c6abb
Merge pull request #2028 from CookiePLMonster/subhook-update
Update subhook
2020-11-11 00:12:19 +01:00
Silent 50164a5a67
Update subhook 2020-11-10 23:39:02 +01:00
patrickvl 561f76c067 Fixes after rebase
Write our viewport constants after copying dirty constants from PGRAPH
Fixes geometry flickering in some titles
Miscellaneous cleanup (renaming, indentation, function inlining)
Start stream offset from slot offset rather than vertex stride
Vertex declaration debug logging
Xbox clamps fog in pixel shader (not vertex shader)
Fix typos
Introduce HostStreamNumber, to use Xbox stream index on host
Pass HostStreamNumber to Activate
Let CountActiveD3DStreams return an actual count
Call ConvertStream with a regular counter instead of pretended stream index
For clarity in ConvertStream, discern between XboxStreamNumber and HostStreamNumber (even though they're the same value)
Don't check CxbxVertexDeclarationNeedsPatching in GetNbrStreams
Remove now unused CxbxVertexDeclarationNeedsPatching
Rename IndexOfStream into StreamIndex
Remove unused DeclPosition
Rename CurrentStreamNumber into XboxStreamIndex
Reduce size of VertexElements array to X_VSH_MAX_ATTRIBUTES (16)
Rename StreamNumber into XboxStreamNumber
Set XboxStreamIndex only once
Assert VertexStreams won't be accessed outside its size
Assert VertexElements won't be accessed outside its size
For AUTONORMAL, set UsageIndex to 0 instead of (according to docs) incorrect 1
Derive NeedPatching from XboxVertexElementByteSize, instead of setting it alongside
Set dummy vertex buffers using HostStreamNumber argument name
Clamp output fog in vertex shader HLSL.

Also, cleanup passthrough HLSL to write outputs identical to vertex shader template HLSL
Turns out, the scale and offset we send to the Xbox passthrough program should just be identity, regardless of resolution or scale.

This seems to fix sub-pixel differences, noticeable when F7-toggling passthrough mode between our dedicated HLSL vs the Xbox program.
Let F7 toggle passthrough based rendering between an Xbox (-derived) shader or our dedicated HLSL shader. Also fixed the index of the -96 and -95 (scale and offset) passthrough constants. F7-toggling, you can see a slight sub-pixel difference between the two modes, probably related to how scale and offset are calculated and used differently between the two approaches. With this, we can postpone the decision on how we should handle passthrough mode.
Fixup use render target width rather than backbuffer width
Multiply instead of divide in ReverseScreenspaceTransform
Don't scale Z in passthrough
test case GTA III sprites
Scale viewport X and Y as well as height.
Fixes cases where X and Y are nonzero e.g. DoA3 character select
Tidy and simplify GetViewPortOffsetAndScale a bit
Refactor passthrough HLSL to call reverseScreenspaceTransform, which not only uses offset and scale constants, but is now also configurable to handle RHW transformed positions.

This takes us one step closer to merging passthrough HLSL with our generic vertex shader HLSL
- IVB passes the position register in full (FLOAT4 instead of FLOAT3)
Fixes samples that pack 2d coords and texcoord into position
- Remove POSITIONT semantic, as we don't expect it in our shaders or pass it to the fixed function pipeline
scaley
rhw
Don't scale texcoords by default. Not all texcoords are used for texture fetches
LoadVertexShader_4 avoids trashing EAX parameter
Apply g_RenderScaleFactor to passthrough constants.

Also renamed ViewPort into HostViewPort. Added comments, fixed typos, marked unused code.
Extract D3DDevice_SetViewPort into CxbxImpl_SetViewPort

Also fix build
Extract D3DDevice_SetRenderTarget into CxbxImpl_SetRenderTarget, and call that from Direct3D_CreateDevice_End
Split off CxbxUpdateHostTextureScaling()
Typos
Separate setting host textures from texture coord scaling
Call UpdateHostTextures before state apply calls
Fix build
Prepare for more accurate calculation of passthrough constants zero and one (not functional yet).

Plus some cleanup
While at it, implement the conversion of remaining TextureStageStates in a similar way as TextureCoordinateIndex (mentioning known values explicitly in code, LOG_TEST_CASE or EmuLog for unsupported/unexpected input values).
Fix TSS_TCI conversion (and some typos, and reordering of code)
Map texture coordinate indices in fixed function mode only

(cherry picked from commit 32878cac2fc3682ac057af4f74f495d994fa13b8)
- Revert to scaling coordinates for linear textures
- Use texture state to map from stages to texcoord indices
- Add Get method to XboxTextureState

(cherry picked from commit 39dd0144851e49ea2452506293dca5e1f532ac97)
Fix XDK Ripple sample regression in CxbxSetVertexShaderPassthroughProgram, by not setting our own calculations in constant zero and one (and instead rely on Xbox code setting those through pushbuffer commands)
Remove texture normalization from vertex buffer conversion.
Instead, apply the texture scale factor in our vertex shader HLSL
This removed yet another reason for buffer patching, simplifying code more and speeding up rendering a little.

The Ripple XDK sample regressed because of this (or an earlier commit?), which might (or might not) be related to vertex explosions seen in some games.

(That, or it has something to do with the use of non-standard registers for passing in texture coordinates - in any case, a fix for this will probably improve a few games as well).
Set vertex shader constants based on pgraph (and write them to there as well)
Set constant zero and one for passthrough programs
For this, introduce and call CxbxImpl_SetScreenSpaceOffset
Renamed all host update functions to : CxbxUpdateHost...
As it turns out, texture normalization only applies to pre-transformed (X_D3DFVF_XYZRHW) vertex declarations (not just FVF based declarations)!

So, replace final use of VshHandleIsFVF (allowing removal of its declaration) with GetXboxVertexAttributeFormat(), and update CxbxVertexBufferConverter::ConvertStream to use the Xbox AttributeFormat (instead of decoding FVFs).
With this, there's also no more use for DxbxFVF_GetNumberOfTextureCoordinates nor DxbxFVFToVertexSizeInBytes, so these are now removed as well.

I verified this still renders all XDK samples identically, but some games might improve due to this (especially if they have separate sets of texture-coordinates in a single stream). There's a low chance for regressions.
Remove our final SetFVF call on host, by composing an Xbox vertex attribute format according to the registers that have been written to in CxbxImpl_SetVertexData4f

This also allowed cleaning up the code that copies data from g_InlineVertexBuffer_Table to g_InlineVertexBuffer_pData (a pass that we might even be able to skip?)
Extract the code from our D3DDevice_Begin patch towards CxbxImpl_Begin
Rename EmuFlushIVB into CxbxImpl_End
With this, all use of g_InlineVertexBuffer* symbols is limited to XbVertexBuffer.cpp
So, remove all extern declarations on g_InlineVertexBuffer* symbols.

Remove implementation and calls to HLE_write_NV2A_vertex_attribute_slot,
because CxbxSetVertexAttribute already does that with less overhead,
which is already called in CxbxImpl_SetVertexData4f.
Some comments on how we might handle vertex shader constants later on
Disable two LOG_TEST_CASE's
Simplify CxbxSetVertexAttribute
Extract CxbxImpl_SetVertexData4f from our D3DDevice_SetVertexData4f patch
Move the implementation to XbVertexBuffer.cpp
There, extract the part about setting default register values towards a separate function, called CxbxSetVertexAttribute
In CxbxImpl_SetVertexData4f, read starting values for all attributes
For this, refactored HLE_read_NV2A_vertex_attribute_slot into HLE_get_NV2A_vertex_attribute_value_pointer
Convert its float pointer result to required data type per g_InlineVertexBuffer_Table field.
Use the same function in CxbxSetVertexAttribute to write default attribute values
In CxbxImpl_SetVertexShader, call CxbxSetVertexAttribute to set default values for attributes missing from vertex shader
Remove duplicate reset of g_Xbox_VertexShader_FunctionSlots_StartAddress
Remove bNeedRHWReset remnants
Don't set fixed function mode when we don't know what to do
Fixes Amped menu graphics
Make sure we process stream elements in order of offset
Use clamped reciprocal for defined behaviour with rcp(0)
Remove FVF vertex buffer fixups
Reset vertex shader address when setting the passthrough program
Revert "Postpone calling EmuParseVshFunction until after shader cache miss, this should speed up rendering a little"

This reverts commit a4b647e6fe365ca414815afcf813431e7080546d.

Reason : EmuParseVshFunction sets the size needed for ComputeHash, so we can't avoid it!
Silence compiler warning
Reset g_Xbox_VertexShader_FunctionSlots_StartAddress to zero for passthrough mode

Also prepared storing g_Xbox_VertexShader_Ptr (See CXBX_USE_GLOBAL_VERTEXSHADER_POINTER).
Postpone calling EmuParseVshFunction until after shader cache miss, this should speed up rendering a little
Avoid calling trampoline when not assigned
Write binary Xbox shader to our slots for passthrough shaders
Use a version-dependent getter for shader tokens
Make sure EmuParseVshFunction never goes out of bounds (by putting a FLD_FINAL at slot 136 in CxbxSetVertexShaderSlots)
Document vertex shader flags and set more of them in XboxVertexShaderFromFVF
Oops
Took some stuff from NZJenkins dca881d61f
Postpone host update of vertex declaration and shader towards draw-time.
Introduce new fixed-function status boolean
Conversion of FVF to internal vertex shader INCLUDING texture Dimensions.
Avoid treating internal vertex shader as older version
Some more cleanup

Status of this is, that some XDK samples lack geometry, not sure if this is the result of this commit or a prior one. NZJenkins has a branch that shares history with this one, that does show geometry, so perhaps we should mix & match the best parts of these two branches, and continue with the result?!?
Call UpdateViewPortOffsetAndScaleConstants only from CxbxUpdateNativeD3DResources (and after CxbxTransferVertexShaderConstants)
Extracted code into CxbxTransferVertexShaderConstants function, using new (renamed) HLE_read_NV2A_vertex_constant_float4_ptr function
Introduce HLE_read_NV2A_vertex_program_slot and HLE_read_NV2A_vertex_constant_slot functions
Fix missing nv2a registers
Differentiate between two versions of X_D3DVertexShader
In HLE_write_NV2A_vertex_attribute_slot assert failure in pgraph_handle_method()
Call HLE_init_pgraph_plugins() from a better suitable place (EmuD3DInit)
Processed code review comments : Fixed a few typos, document SetVertexShaderInput test-cases, rename inaccurate symbol names, add more comments, add LOG_TEST_CASE("Limiting FVF to 4 textures")
Start using GetXboxVertexStreamInput everywhere g_Xbox_SetStreamSource was accessed
Removed CreateVertexShader patch and implementation
Cache VertexDeclarations based on hash of their contents
Store FVF based VertexAttributeFormat in global variable
GetXboxVertexAttributeFormat returns a pointer now
A lot of cleanup (like IsValidCurrentShader and VshHandleIsValidShader are no longer needed)
Fix post-processing of elements for D3DDECLMETHOD_CROSSUV (normal tessellation)
Move and rename global variables.

Also, partly picked conversion of tessellation-declarations.
Reorder and comment vertex-shader related types
Removed now-obsolete CxbxVertexShader struct, instead use CxbxVertexDeclaration and renamed all references to that.
Start using GetXboxVertexAttributes, which calls the new (temporary) XboxFVFToXboxVertexAttributeFormat function for FVF vertex shader handles

Also removed the now-obsolete SetCxbxVertexShaderHandle() and SetCxbxVertexDeclaration() functions
Introduce GetXboxVertexShader and GetXboxVertexAttributes getters (both not yet used)
In D3DDevice_SwitchTexture use a switch statement instead of an array plus for-loop
Implement our patch on SetVertexShaderInput and introduce GetXboxVertexStreamInput, a getter that honors this g_Xbox_SetStreamSource override (not yet used)
Disabled patches on D3DDevice_GetVertexShaderInput and D3DDevice_SetVertexShaderInputDirect.
Call trampoline in D3DDevice_SetVertexShaderInput (and add a LOG_TEST_CASE)
Disabled patch on D3DDevice_SelectVertexShaderDirect (since all it does, is forward to D3DDevice_SelectVertexShader, which we DO patch)
Call trampoline in D3DDevice_SetVertexShader
CxbxImpl_LoadVertexShader must not skip first program DWORD
Explicit padding in X_VERTEXSHADERINPUT to avoid potential alignment issue
Make Xb2PCRegisterType more compact, and let it support D3DDECLUSAGE_POSITIONT
WIP
Introduce CxbxFVFToXboxVertexAttributeFormat, a function that converts an Xbox FVF handle to the Xbox Vertex attribute format struct. This, so that in a next step we can convert the Xbox Vertex attribute format struct to a CxbxVertexDeclaration (or maybe just straight to a host declaration)
Implement CxbxImpl_LoadVertexShader much closer to reality
Define X_D3DVertexShader.Flags values
CxbxImpl_SelectVertexShader : Only store Handle when it's non-NULL (which must always be a VertexShader, so LOG_TEST_CASE when not)
Use CxbxSetVertexShaderSlots tooling function to reduce duplicate code
2020-11-02 21:39:40 +01:00
PatrickvL 97bf1d9169
Merge pull request #2022 from CookiePLMonster/mm3-fixes
Miscellaneous Direct3D LTCG fixes
2020-11-02 18:36:08 +01:00
Silent 2875342b5c
Fix a resource leak in D3DDevice_Swap 2020-11-02 17:52:56 +01:00
Silent 65d5abc813
Implement D3DDevice_DeleteVertexShader_0
Test case now can be removed, as it existed only due
to no known games using this function.
2020-11-02 17:52:56 +01:00
Silent feef6ffb3d
Refactor LTCG versions of Direct3D_CreateDevice
* Make Direct3D_CreateDevice_4 naked to remove
   the risk of trashing parameters
* Split Direct3D_CreateDevice_16 into two separate functions
   with different calling convention
2020-11-02 17:52:55 +01:00
Silent cbe534cb54
Patch D3D_CommonSetRenderTarget 2020-11-02 17:52:55 +01:00
Silent 62af56b67a
Fix D3DDevice_SetPixelShader_0 corrupting the stack 2020-11-02 17:52:54 +01:00
Silent 8b7f4a5027
Patch D3DDevice_SetRenderTarget_0 and factorize implementations
CreateDevice would try to call a nonexistent guest trampoline,
but in fact we only needed to call the host implementation
2020-10-31 14:44:13 +01:00
Luke Usher 38242d48f9
Merge pull request #2015 from CookiePLMonster/affinity-fix
Fix affinity for EmuCreateDeviceProxy thread
2020-10-28 16:59:52 +00:00
Silent 16efb84eb9
Fix affinity for EmuCreateDeviceProxy thread 2020-10-28 17:47:34 +01:00
Luke Usher 5fe769b906
Merge pull request #2002 from PatrickvL/ps_const_simplfy
Simplify pixel shader constant handling;
2020-10-28 13:40:57 +00:00
PatrickvL 44f0aee5d4
Merge pull request #2012 from CookiePLMonster/interlocked-lockcounts
Interlocked lockcounts
2020-10-28 11:20:11 +01:00
PatrickvL b8bb054402 Revert unintentional subhook change 2020-10-27 18:23:00 +01:00
patrickvl c09a90e459 Remap host pixel shader constant indexes, so that all constants can be set using just one call to SetPixelShaderConstantsF
Also, added more notes and code on the PSDef.PSTextureMode field (which lies outside of the render state pixel shader range), and skip the values of the final combiner constants when checking for uniqueness of pixel shader definitions.
2020-10-27 18:22:59 +01:00
patrickvl eae97f3f07 Optimize setting host pixel shader constants, by collecting all values and set them using a single call.
Also remove one more unused variable
2020-10-27 18:22:59 +01:00
patrickvl 337946db25 Simplify pixel shader constant handling;
Since we've ported over to Direct3D 9, and we're using pixel shader version 1.4, we've got more than enough constants available to remove the need for constant packing.

Also, there was a left-over patch on SetPixelShaderConstant which must no longer be applied, since nowadays we read constant values straight from their corresponding render state slots.
This also implies we no longer need to declare the final combiner constants as part of the shader assembly, because these 2 are also read from their corresponding xbox render state slots, and thus can be transferred to host on each update.

This will likely improve the output of pixel shaders which stay otherwise unchanged but rely on changing constant values.
2020-10-27 18:22:58 +01:00
RadWolfie f8593e692d
Merge pull request #1993 from CookiePLMonster/dsound-improvements-alt
DSound improvements (alternative volume heuristics)
2020-10-26 17:29:31 -05:00
Silent dd0e331528
Thread safety fixes for ERWLOCK 2020-10-26 20:55:24 +01:00
Silent 4323e401d8
Thread safety fixes for RtlCriticalSection 2020-10-26 20:51:19 +01:00
Luke Usher 1c465409c2
Merge pull request #2011 from CookiePLMonster/transform-patches-recursive
Guard against nested SetTransform/MultiplyTransform calls
2020-10-26 18:01:11 +00:00
Silent 4dd9aaeed7
Guard against nested SetTransform/MultiplyTransform calls
In the case of 25 to Life, MultiplyTransform calls SetTransform
which corrupted the host's internal state. Introduce a guard variable
to ensure we call to host only once per the patch chain and keep
the internal state pristine
2020-10-26 18:54:39 +01:00
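The guard described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `NestedCallGuard`, `g_transformDepth`, `g_hostCalls` and the two `...Patch` functions are hypothetical names, and the host call is simulated by a counter. It assumes all calls in a patch chain happen on one thread.

```cpp
// Hypothetical RAII nesting guard: counts how deep we are inside a chain of
// patched functions, so that only the outermost patch forwards to the host.
class NestedCallGuard {
public:
    explicit NestedCallGuard(int& depth) : depth_(depth) { ++depth_; }
    ~NestedCallGuard() { --depth_; }
    NestedCallGuard(const NestedCallGuard&) = delete;
    NestedCallGuard& operator=(const NestedCallGuard&) = delete;

    // True only for the first (outermost) patch in the current call chain.
    bool IsOutermost() const { return depth_ == 1; }

private:
    int& depth_;
};

static int g_transformDepth = 0; // hypothetical guard variable (single-threaded assumption)
static int g_hostCalls = 0;      // stands in for "we called the host once"

// Stand-in for a patched SetTransform.
void SetTransformPatch() {
    NestedCallGuard guard(g_transformDepth);
    if (guard.IsOutermost()) {
        ++g_hostCalls; // forward to host only when not nested
    }
}

// Stand-in for a patched MultiplyTransform whose guest code calls SetTransform.
void MultiplyTransformPatch() {
    NestedCallGuard guard(g_transformDepth);
    SetTransformPatch(); // nested call: must not touch the host a second time
    if (guard.IsOutermost()) {
        ++g_hostCalls;
    }
}
```

With this shape, a `MultiplyTransformPatch()` call that internally reaches `SetTransformPatch()` updates the simulated host exactly once, while a direct `SetTransformPatch()` call still updates it.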
Silent acff986fe1
Add NestedPatchCounter 2020-10-26 18:50:54 +01:00
Luke Usher 3b82af621d
Merge pull request #2007 from CookiePLMonster/burnout-patches
Changes to D3D patches
2020-10-25 19:07:50 +00:00
Silent e81c9fecb8
Unpatch D3DDevice_GetTransform and call to guest in SetTransform and MultiplyTransform
Fixes (not yet visible) rendering in Burnout 3, possibly due to it
having an unpatched LTCG-specific GetTransform or reading from
the D3D state directly.
2020-10-25 18:53:34 +01:00
Silent 5592f81c02
Implement D3D_BlockOnTime_4 2020-10-25 18:52:20 +01:00
Luke Usher 9a773ef7ac
Merge pull request #2003 from CookiePLMonster/fix-apu-timer
Fix APU timer ticking at wrong frequency
2020-10-25 01:41:56 +01:00
Silent d5adbb2ab3
Refactor APU, TSC and ACPI timers to use shared code 2020-10-24 23:53:15 +02:00
PatrickvL 23a3bc4b78
Merge pull request #2005 from CookiePLMonster/fix-vs-precision
Improve reverse screenspace transformation precision
2020-10-24 23:21:48 +02:00
Silent 709a3508ee
Pre-divide reverse scale in reverseScreenspaceTransform
This should improve numerical stability of the reverse transformation
when D24 depth is used by the game, as this caused viewport.z
to be very large (0xFFFFFF).
2020-10-24 23:18:59 +02:00
Silent 9b2c1ba2ce
Submit viewport scale and offset in one batch 2020-10-24 22:36:56 +02:00
Silent 0f88b77bfe
Fix APU timer ticking at wrong frequency 2020-10-24 13:09:37 +02:00
Luke Usher 724a1ca684
Merge pull request #2001 from PatrickvL/fix_ps_sum_reg
Pixel shader fix XFC SUM
2020-10-23 09:10:23 +01:00
patrickvl a412c80b24 Fix how our current pixel shader conversion calculates the final combiner special purpose register 'sum' : it was accidentally multiplying instead of adding its arguments! 2020-10-23 01:05:24 +02:00
Luke Usher 1ee123900b
Merge pull request #1982 from ergo720/InlineVertexBuffer_as_vector
Use std::vector for g_InlineVertexBuffer_Table instead of realloc
2020-10-21 08:38:02 +01:00
Luke Usher 4e6068f6b4
Merge pull request #1998 from CookiePLMonster/thread-creation-delay
Simplify thread creation logic, remove hardcoded delays and tighten affinity changes
2020-10-21 08:36:10 +01:00
PatrickvL 7938142faf
Merge pull request #2000 from CookiePLMonster/copyrects-fallback
Add a fallback to CopyRects for cases which StretchRect can't handle
2020-10-21 01:04:09 +02:00
ergo720 a4d1807b4c Use std::fill and std::copy where possible 2020-10-20 21:26:43 +02:00
ergo720 47ea099a92 Use std::vector for g_InlineVertexBuffer_Table instead of realloc 2020-10-20 21:26:43 +02:00
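As an illustration of the two ergo720 commits above, replacing a realloc-grown table with `std::vector` and using `std::fill` might look like the sketch below. `InlineVertex`, `g_table`, `VertexSlot` and `ClearVertices` are hypothetical; the real g_InlineVertexBuffer_Table element type and growth policy are not shown in this log.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical element type for the sketch.
struct InlineVertex { float x, y, z, rhw; };

// The vector owns the storage; no manual realloc/free bookkeeping is needed.
static std::vector<InlineVertex> g_table;

// Returns the slot at `index`, growing the table when needed.
// New elements are value-initialized (all zero) by resize().
// Note: the returned reference is invalidated by a later growth.
InlineVertex& VertexSlot(std::size_t index) {
    if (index >= g_table.size()) {
        g_table.resize(std::max<std::size_t>(index + 1, g_table.size() * 2));
    }
    return g_table[index];
}

// Overwrites every element with `value`, replacing a hand-written loop.
void ClearVertices(const InlineVertex& value) {
    std::fill(g_table.begin(), g_table.end(), value);
}
```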