Commit Graph

156 Commits

Author SHA1 Message Date
Matt Borgerson ab811bf987 nv2a: Const-ify some function parameters 2025-07-02 21:26:17 -07:00
Matt Borgerson 7908bcbbe6 nv2a: Const-ify LRU callback key parameters 2025-07-02 21:21:08 -07:00
Matt Borgerson 62ab68b2ab nv2a/vk: Drop some useless includes 2025-07-02 20:22:18 -07:00
Matt Borgerson 21284ba3f2 nv2a/vk: Update some copyrights 2025-07-02 20:22:01 -07:00
Matt Borgerson 1a8a8ad03d nv2a/vk: Initialize ShaderBinding in shader_cache_entry_init 2025-07-02 20:11:30 -07:00
Matt Borgerson 90a0187e9b nv2a: Move numeric locale setup to main 2025-07-02 20:11:30 -07:00
Matt Borgerson 4921607c90 nv2a/vk: Group module info and locs in ShaderBinding 2025-07-02 20:11:30 -07:00
Matt Borgerson dd3f4db0a9 nv2a/vk: Cache shader modules 2025-07-02 20:11:30 -07:00
Matt Borgerson dcd524c4bc nv2a: Split nv2a_pgraph_surface_cpu_access trace into read/write 2025-07-02 01:41:09 -07:00
Matt Borgerson 966115336a nv2a: Fix CPU surface access callback race and use-after-free 2025-07-02 01:41:09 -07:00
Matt Borgerson 1489253c68 nv2a/glsl: Add glsl_ prefix to public functions 2025-06-28 00:18:28 -07:00
Matt Borgerson cbcb7c2181 nv2a/glsl: Factor out geometry state to GeomState 2025-06-28 00:18:28 -07:00
Matt Borgerson c29546e2e1 nv2a: Rename update_shader_{constant_locations -> uniform_locs} 2025-06-28 00:18:28 -07:00
Matt Borgerson d3606813eb nv2a/vk: Fix DGROUP_BEGIN order nit 2025-06-28 00:18:28 -07:00
Matt Borgerson d17be812ea nv2a/glsl: Unify dirty shader state check 2025-06-28 00:18:28 -07:00
Matt Borgerson 4e6c6518f9 nv2a: Add _regs suffix to vsh.h and psh.h 2025-06-28 00:18:28 -07:00
Matt Borgerson 22b242b2d6 nv2a/glsl: Let pgraph_gen_vsh_glsl take a pointer to PshState 2025-06-28 00:18:28 -07:00
Matt Borgerson 34e8c62a42 nv2a: Move {Vsh,Psh}State generation into glsl subdir 2025-06-28 00:18:28 -07:00
Matt Borgerson bebffc7d64 nv2a/glsl: Let pgraph_gen_geom_glsl take VshState and GlslOptions 2025-06-28 00:18:28 -07:00
Matt Borgerson c88bac1706 nv2a: Simplify shader uniform declaration and update
This patch moves uniform declaration into {vsh, psh}.h headers, using
macros to generate accessory definitions. Mapping of PGRAPH state to
uniform values is factored out of parallel paths in GL/Vk renderers into
common renderer-agnostic helper functions, with renderer-specific
uniform value update paths being automated.
2025-06-28 00:18:28 -07:00
Matt Borgerson 18872f2eb9 nv2a: Update various copyright headers 2025-06-28 00:18:28 -07:00
Matt Borgerson d3821c5513 nv2a: Structure shader uniform locs 2025-06-28 00:18:28 -07:00
Matt Borgerson 4977e65bd5 nv2a/vk: Clean up layout binding ids 2025-06-28 00:18:28 -07:00
Matt Borgerson 9020913e29 nv2a: Extract GLSL generation options from {Vsh,Psh}State 2025-06-28 00:18:28 -07:00
Matt Borgerson c575b08b5f nv2a: Extract VshState from ShaderState 2025-06-28 00:18:28 -07:00
Matt Borgerson 3ad4eb3101 nv2a: Remove colorkey_mask from PshState
It's a uniform, so we don't want it to be part of the PSH setup state,
which is used as the shader cache key.
2025-06-28 00:18:28 -07:00
Matt Borgerson 1fa242bd00 nv2a/vk: Only include palette in texture key when necessary
Fixes degraded performance due to garbage palette details polluting the
texture cache.
2025-06-23 01:09:51 -07:00
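
A minimal sketch of the idea behind the palette fix above: a cache key should only contain fields that affect the cached result. Every name here (`TextureKey`, `make_texture_key`, `PALETTIZED_FORMATS`, the format strings) is hypothetical and chosen for illustration; none of it is taken from the xemu source.

```python
from typing import NamedTuple, Optional

# Hypothetical format names, for illustration only.
PALETTIZED_FORMATS = {"I8_A8R8G8B8"}


class TextureKey(NamedTuple):
    fmt: str
    width: int
    height: int
    palette_hash: Optional[int]  # None when the format does not use a palette


def make_texture_key(fmt: str, width: int, height: int,
                     palette_hash: int) -> TextureKey:
    """Build a cache key, including palette details only for palettized formats."""
    if fmt in PALETTIZED_FORMATS:
        return TextureKey(fmt, width, height, palette_hash)
    # Leftover palette state must not leak into the key of a non-palettized
    # texture, otherwise identical textures would miss the cache.
    return TextureKey(fmt, width, height, None)


cache = {}
for stale_palette in (0x1111, 0x2222, 0x3333):
    key = make_texture_key("A8R8G8B8", 64, 64, stale_palette)
    cache.setdefault(key, "texture object")

# All three lookups map to the same entry despite differing palette state.
assert len(cache) == 1
```

With the palette hash always included, the loop above would create three entries for the same texture, which is the kind of cache pollution the commit message describes.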
Matt Borgerson a242964793 nv2a/vk: Require fillModeNonSolid feature 2025-06-22 16:44:14 -07:00
Matt Borgerson cbcfd1d1d6 nv2a/vk: Enable wideLines feature before use 2025-06-22 16:44:14 -07:00
Matt Borgerson df1ac31eb2 nv2a/vk: Set line width state dynamically 2025-06-22 16:44:14 -07:00
specialfred453@gmail.com d8afb35c40 nv2a/vk: Scale line width by surface scale 2025-06-22 16:44:14 -07:00
Erik Abair 348b03d6ce nv2a: Handle PGRAPH color keying 2025-06-21 13:25:24 -07:00
coldhex 11dcae01b9 nv2a: implement screen coordinate rounding to 4 bit fractional precision
Xbox triangle rasterization appears to follow the usual top-left rule.
However, since Xemu renders to an OpenGL framebuffer object (FBO) instead
of directly to the default framebuffer, Xemu actually has what could be
called the bottom-left triangle rasterization rule. I'll address that in
another commit.

Also, note that the ProjAdjacentGeometry_0.5625 test in nxdk_pgraph_tests
is very sensitive to floating point rounding errors. For example, the
nxdk_pgraph_tests commit 66b32a0b1feba32a0db7a95d6358e84f7a6246ad changed
the math library which caused the test result to change also on real Xbox
hardware due to floating point rounding error differences in matrix
inverse computation. Apart from the bottom-left rasterization issue, the
differing result between Xbox and the rounding I am proposing here for
Xemu seems to stem from floating point rounding that happens in screen
coordinate calculations before the rounding to 4 bit precision takes place.
Fixing such rounding issues would require carrying out all preceding floating
point computations in exactly the same order and with the same precision as
Xbox. Note that the Xbox Direct3D library seems to add 0.03125 (1/32) to
screen coordinates by default. Likely the idea there was to make floating
point screen coordinates round to the nearest screen coordinates in
4 bit fixed point precision. So the Xbox Direct3D library (and therefore
games) already mitigates precarious rounding when exactly
half-integer coordinates are used by games. Actually they would use
integer coordinates because it is Direct3D 8, but since nv2a appears to
rasterize at half-integer coordinates like OpenGL, Xbox Direct3D
also adds 0.5 to screen coordinates in addition to 1/32.
2025-05-20 13:15:12 -07:00
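
The 1/32 bias mentioned in the commit message above can be illustrated with a short sketch. It assumes that the conversion to 4-bit fractional fixed point truncates (floors); the message only says this is "likely" the reason for the bias, so treat the truncation as an assumption. The function names are hypothetical.

```python
import math

FRAC_BITS = 4           # 4 fractional bits, as in the commit message
STEP = 1 << FRAC_BITS   # 16 sub-pixel steps per pixel
BIAS = 1 / 32           # half of one 1/16 step


def truncate_to_fixed(x: float) -> int:
    """Assumed truncating conversion to 4-bit fractional fixed point."""
    return math.floor(x * STEP)


def round_to_fixed(x: float) -> int:
    """Round to the nearest 1/16 step, with ties rounding up."""
    return math.floor(x * STEP + 0.5)


# Adding 1/32 before truncating matches rounding to the nearest 1/16 step.
for x in (10.0, 10.03, 10.04, 10.49, 100.96875):
    assert truncate_to_fixed(x + BIAS) == round_to_fixed(x)

# Without the bias, 10.04 lands one step lower than its nearest 1/16 step.
assert truncate_to_fixed(10.04) == 160
assert truncate_to_fixed(10.04 + BIAS) == 161
```

The sketch covers only the final conversion step. As the message notes, differences in earlier floating point computations can still change which fixed-point value a coordinate ends up at.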
Erik Abair 428c975f09 nv2a: Allow multiframe RenderDoc captures with nv2a traces
Allows multiple frames to be captured at once by holding shift while pressing
F10.

Temporarily toggles nv2a trace messages if control is held while pressing F10.
2025-05-15 08:37:13 -07:00
Erik Abair d593869429 nv2a: Move point params to uniforms
Co-authored-by: Matt Borgerson <contact@mborgerson.com>
2025-04-30 23:43:38 -07:00
Matt Borgerson 5685a6290c nv2a/vk: Set specular power uniform 2025-04-16 20:26:22 -07:00
Erik Abair 69c8df2a3e nv2a: Partial implementation of SET_SPECULAR_PARAMS 2025-04-16 18:24:46 -07:00
Matt Borgerson 1893b56c38 nv2a/vk: Fix vertex ram buffer dirty bit check 2025-03-19 02:25:33 -07:00
Matt Borgerson b929d4eced nv2a: Drop surface compat clip constraint 2025-03-17 14:48:47 -07:00
Matt Borgerson c3a8b9569f nv2a: Simplify surface clip to scissor size calculation 2025-03-17 14:48:47 -07:00
Logan Stromberg 860bccb722 nv2a: Fix surface clip to scissor origin 2025-03-17 14:32:40 -07:00
wilkovatch a00820746f nv2a: Handle texture dimensions not divisible by 4 in S3TC decoder 2025-03-14 18:44:25 -07:00
Matt Borgerson 6e3dfb36d8 nv2a/vk: Don't set compressed, swizzled when attribute is uniform 2025-03-10 14:23:43 -07:00
coldhex 8dc6c90e11 nv2a/vk: Drop unnecessary dirty check for NV_PGRAPH_ZCOMPRESSOCCLUDE
This was used to enable/disable Vulkan depth clamping, but that was
removed in previous commit.
2025-03-08 14:54:18 -07:00
coldhex 798ad30819 nv2a: Perspective-correct interpolation for w-buffering
z_perspective being true implies w-buffering, in which case the w-coordinate
stored in the depth buffer should also be interpolated in a
perspective-correct way. We do this by calculating w and setting gl_FragDepth
in the fragment shader.

Since enabling polygon offset and setting values using glPolygonOffset
won't have any effect when manually setting gl_FragDepth for w-buffering,
we introduce the depthOffset variable to obtain similar behaviour (but the
glPolygonOffset factor-argument is currently not emulated.) (Note that
glPolygonOffset is OpenGL implementation-dependent and it might be good to
use depthOffset for z-buffering as well, but this is not done here and we
still use OpenGL/Vulkan zbias functionality.)

This also implements depth clipping and clamping in the fragment shader.
If triangles are clipped, the shadows of the small rocks in Halo 2 Beaver
Creek map can have flickering horizontal lines. The shadows are drawn on
the ground in another pass with the same models as for the ground, but for
some reason with depth clamping enabled. The flickering happens if Xemu
clips the ground triangles, but the exact same shadow triangles are depth
clamped, so there are small differences in the coordinates. The shadows
are drawn with depth function GL_EQUAL so there is no tolerance for any
differences. Clipping in the fragment shader solves the problem because
the ground and shadow triangles remain exactly the same regardless of
depth clipping/clamping. For some performance gain, it might be a good
idea to cull triangles by depth in the geometry shader, but this is not
implemented here.

In the programmable vertex shader we always multiply position output by w
because this improves numerical stability in subsequent floating point
computations by modern GPUs. This usually means that the perspective
divide done by the vertex program gets undone.

The magic bounding constants 5.42101e-020 and 1.884467e+019 are replaced
by 5.421011e-20 and 1.8446744e19, i.e. more decimals added. This makes the
32-bit floating point numbers represent exactly 2^(-64) and 2^64 (raw bits
0x1f800000 and 0x5f800000) which seem more likely the correct values
although testing with hardware was not done to this precision.

Testing indicates that the same RCC instruction magic constants are also
applied to both fixed function and programmable vertex shader w-coordinate
output. This bounding replaces the special test for w==0.0 and abs(w)==inf
which used to set vtx_inv_w=1.0 (which did not match Xbox hardware
behaviour.)
2025-03-08 14:54:18 -07:00
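
The claim in the commit message above about the bounding constants can be checked independently of the emulator. The sketch below uses only Python's standard `struct` module to round each decimal literal to a 32-bit float and inspect its raw bits; the helper name `f32_bits` is hypothetical.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the raw IEEE 754 binary32 bit pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


LOWER = 5.421011e-20   # new lower bound quoted in the commit message
UPPER = 1.8446744e19   # new upper bound quoted in the commit message

# The new literals round to exactly 2^-64 and 2^64 in single precision.
assert f32_bits(LOWER) == f32_bits(2.0 ** -64) == 0x1F800000
assert f32_bits(UPPER) == f32_bits(2.0 ** 64) == 0x5F800000

# The old literals do not round to those powers of two.
assert f32_bits(5.42101e-20) != 0x1F800000
assert f32_bits(1.884467e19) != 0x5F800000
```

This confirms only the arithmetic of the literals. Whether the hardware uses exactly these bounds remains, as the message says, untested at this precision.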
Erik Abair f701573d44 nv2a: Use rounded values for alpha testing 2025-03-05 17:12:14 -07:00
Matt Borgerson d76898f63b nv2a: Fix variable shadowing complaints 2025-01-07 00:52:51 -07:00
Matt Borgerson 57c6d82fa3 nv2a/vk: Simplify debug indent loop to a variable field width format 2025-01-07 00:52:51 -07:00
Matt Borgerson 0e50741c28 ui,xbox: Copyright updates on changed files 2025-01-06 23:06:21 -07:00
Matt Borgerson 3106ea97e5 mcpx: Use new bql_[un]lock functions 2025-01-06 23:05:53 -07:00