Commit Graph

102 Commits

Author SHA1 Message Date
RSDuck 58ab33210a handle address wrap around in texture cache
fixes out of bounds access in Mario 64
also slightly optimise paletted texture conversion
2024-10-27 23:32:05 +01:00
Jakly d0a7239f15
fix some bugs with compressed texture look up (#2051) 2024-08-01 22:44:04 +02:00
Arisotura 8fc403cdad update copyright headers 2024-06-15 17:01:19 +02:00
RSDuck 043244a56d
Compute shader renderer (#2041)
* nothing works yet

* don't double buffer 3D framebuffers for the GL Renderer
looks like leftovers from when 3D+2D composition was done in the frontend

* oops

* it works!

* implement display capture for compute renderer
it's actually just all stolen from the regular OpenGL renderer

* fix bad indirect call

* handle cleanup properly

* add hires rendering to the compute shader renderer

* fix UB
also misc changes to use more unsigned multiplication
also fix framebuffer resize

* correct edge filling behaviour when AA is disabled

* fix full color textures

* fix edge marking (polygon id is 6-bit not 5)
also make the code a bit nicer

* take all edge cases into account for XMin/XMax calculation

* use hires coordinate again

* stop using fixed size buffers based on scale factor in shaders
this makes shader compile times tolerable on Wintel
- beginning of the shader cache
- increase size of tile idx in workdesc to 20 bits

* apparently & is not defined on bvec4
why does this even compile on Intel and Nvidia?

* put the texture cache into it's own file

* add compute shader renderer properly to the GUI
also add option to toggle using high resolution vertex coordinates

* unbind sampler object in compute shader renderer

* fix GetRangedBitMask for 64 bit aligned 64 bits
pretty embarassing

* convert NonStupidBitfield.h back to LF only new lines

* actually adapt to latest changes

* fix stupid merge

* actually make compute shader renderer work with newest changes

* show progress on shader compilation

* remove merge leftover
2024-05-13 17:17:39 +02:00
Jesse Talavera 8143f54956
Protect savestates while the threaded software renderer is running (#1864)
* First crack at ensuring the render thread doesn't touch GPU state while it's being serialized

* Get rid of the semaphore wait

* Add some extra fields into GPU3D's serialization

* Oops, TempVertexBuffer is already serialized

* Move vertex serialization into its own method

* Lock the GPU3D state when rendering on the render thread or serializing it

* Revert "Lock the GPU3D state when rendering on the render thread or serializing it"

This reverts commit 2f49a551c1.

* Add comments that describe the synchronization within GPU3D_Soft

- I need to understand it before I can solve my actual problem
- Now I do

* Revert "Revert "Lock the GPU3D state when rendering on the render thread or serializing it""

This reverts commit 1977566a6d.

* Let's try locking the GPU3D state throughout NDS::RunFrame

- Just to see what happens

* Slim down the lock's scope

* Narrow the lock's scope some more

* Remove the lock entirely

* Try protecting the GPU3D state with just a mutex

- I'll clean this up once I know it works

* Remove a duplicate method definition

* Add a missing `noexcept` specifier

* Remove an unused function

* Cut some non-hardware state from `GPU3D`'s savestate

* Assume that the next frame after loading a savestate won't be identical

* Actually, it _is_ worth it

* Don't serialize the clip matrix

- It's recalculated anyway

* Serialize `RenderPolygonRAM` as an array of indexes

* Clean up some comments

- I liked the dialogue style, but oh well

* Try restarting the render thread instead of using the lock

- Let's see what happens

* Put the lock back

* Fix some polygon and vertex indexes being saved incorrectly

- Taking the difference between two pointers results in the number of elements, not the number of bytes

* Remove `SoftRenderer::StateBusy` since it turns out we don't need it

- The real synchronization was the friends we made along the way
2024-01-07 23:39:43 +01:00
Jesse Talavera c867a7f1c0
Make the initial 3D renderer configurable via `NDSArgs` (#1913)
* Allow 3D renderers to be created without passing `GPU` to the constructor

* Make the initial 3D renderer configurable via `NDSArgs`

* Fix a compiler error
2023-12-15 14:53:31 +01:00
Jesse Talavera 9bfc9c08ff
Sprinkle `const` around where appropriate (#1909)
* Sprinkle `const` around where appropriate

- This will make it easier to use `NDS` objects in `const` contexts (e.g. `const` parameters or methods)

* Remove the `const` qualifier on `DSi_DSP::DSPRead16`

- MMIO reads can be non-pure, so this may not be `const` in the future
2023-12-12 11:07:22 +01:00
Jesse Talavera-Greenberg 7caddf9615
Clean up the 3D renderer for enhanced flexibility (#1895)
* Give `GPU2D::Unit` a virtual destructor

- Undefined behavior avoided!

* Add an `array2d` alias

* Move various parts of `GPU2D::SoftRenderer` to `constexpr`

- `SoftRenderer::MosaicTable` is now initialized at compile-time
- Most of the `SoftRenderer::Color*` functions are now `constexpr`
- The aforementioned functions are used with a constant value in at least one place, so they'll be at least partially computed at compile-time

* Generalize `GLRenderer::PrepareCaptureFrame`

- Declare it in the base `Renderer3D` class, but make it empty

* Remove unneeded `virtual` specifiers

* Store `Framebuffer`'s memory in `unique_ptr`s

- Reduce the risk of leaks this way

* Clean up how `GLCompositor` is initialized

- Return it as an `std::optional` instead of a `std::unique_ptr`
- Make `GLCompositor` movable
- Replace `GLCompositor`'s plain arrays with `std::array` to simplify moving

* Pass `GPU` to `GLCompositor`'s important functions instead of passing it to the constructor

* Move `GLCompositor` to be a field within `GLRenderer`

- Some methods were moved up and made `virtual`

* Fix some linker errors

* Set the renderer in the frontend

* Remove unneeded `virtual` specifiers

* Remove `RenderSettings` in favor of just exposing the relevant member variables

* Update the frontend to accommodate the core changes

* Add `constexpr` and `const` to places in the interpolator

* Qualify references to `size_t`

* Construct the `optional` directly instead of using `make_optional`

- It makes the Linux build choke
- I think it's because `GLCompositor`'s constructor is `private`
2023-11-29 15:23:11 +01:00
Jaklyy ad7b1a8c61
only fill edges when translucent if blending is enabled (#1882) 2023-11-25 18:40:07 +01:00
Jesse Talavera-Greenberg 346dd4006e
Move all core types into namespaces (#1886)
* Reorganize namespaces

- Most types are now moved into the `melonDS` namespace
- Only good chance to do this for a while, since a big refactor is next

* Fix the build
2023-11-25 18:32:09 +01:00
Jesse Talavera-Greenberg 4558be0d8e
Refactor the GPU to be object-oriented (#1873)
* Refactor GPU3D to be an object

- Who has two thumbs and is the sworn enemy of global state? This guy!

* Refactor GPU itself to be an object

- Wow, it's used in a lot of places
- Also introduce a new `Melon` namespace for a few classes
- I expect other classes will be moved into `Melon` over time

* Change signature of Renderer3D::SetRenderSettings

- Make it noexcept, and its argument const

* Remove some stray whitespace
2023-11-09 21:54:51 +01:00
Arisotura ac38faef14 update copyright years 2023-11-04 00:21:46 +01:00
Jesse Talavera-Greenberg 9d9ba83731
Clean up some rendering-related resources in DeInit (#1836)
- The unique_ptr destructors will take care of the cleanup
2023-09-24 18:33:14 +02:00
Jesse Talavera-Greenberg db963aa002
Make the NDS teardown more robust (#1798)
* Make cleanup a little more robust to mitigate undefined behavior

- Add some null checks before cleaning up the GPU3D renderer
- Make sure that all deleted objects are null
- Move cleanup logic out of an assert call
- Note that deleting a null pointer is a no-op, so there's no need to check for null beforehand
- Use RAII for GLCompositor instead of Init/DeInit methods

* Replace a DeInit call that I missed

* Make ARMJIT_Memory less likely to generate errors

- Set FastMem7/9Start to nullptr at the end
- Only close and unmap the file if it's initialized

* Make Renderer3D manage its resources with RAII

* Don't try to deallocate frontend resources that aren't loaded

* Make ARMJIT_Memory::DeInit more robust on the Switch

* Reset MemoryFile on Windows to INVALID_HANDLE_VALUE, not nullptr

- There is a difference

* Don't explicitly store a Valid state in GLCompositor or the 3D renderers

- Instead, create them with static methods while making the actual constructors private

* Make initialization of OpenGL resources fail if OpenGL isn't loaded

* assert that OpenGL is loaded instead of returning failure
2023-09-15 15:31:05 +02:00
Jaklyy 2bd12669b2
Edge fill rules for swapped polygons + a few minor fixes to edge cases (#1815)
* fix edge fill rules for swapped polygons

also fixes translucent polygons not being always edge filled.

* fix right edge fill rule

* fix right edge fill rule for realsies

* fix a few more glitchy polygons

specifically quads similar to: (-67,40) (64, 160) (192, 160), (8, 111)

* fix one edge case pixel

i hate this so much

* fix "flat bottom" edge fill

* fix regression + apply changes to shadow masks

fix a regression with certain line polygons not rendering; there seems to be an exception made by the ds'  gpu in order for these polygons to render properly.
also apply these changes to shadow masks because i forgot to

* forgot to remove a line

---------

Co-authored-by: Arisotura <thetotalworm@gmail.com>
2023-08-27 13:32:31 +02:00
Jaklyy d69745b3a8
Fix Incorrect Polygon Swapping Behavior and Implement Correct Rules for Shifting Right Edges Left (#1816)
* fix polygons being swapped incorrectly

"borrowed" this from noods
needs verification that the >= and <= signs aren't actually supposed to be > and <

* proper rules for moving vertical right slopes left

* nvm most of that was actually pointless

that's on me for not checking
2023-08-27 13:29:12 +02:00
Jaklyy dc8efb62b8
Fix aa being upside down on swapped y-major slopes (#1803)
* fix aa being upside down on swapped y-major slopes

* further improvements to swapped aa

in addition to fixing swapped y-major slope aa, now fixes:
swapped x-major slope aa
swapped vertical slope aa

* use templates instead + style/comment tweaks

should force the compiler to precompile if statements like i want it to do, instead of just hoping it does so on its own
2023-08-27 13:28:44 +02:00
Jaklyy d7369857c3
Small Fix to Anti-Aliasing + Edge Marking Behavior (#1680)
* Anti-Alias All Edges

Changing a bunch of 0x3s to 0xF since I figure if they're checking the left and right edge they wanna be checking the top and bottom too now that they're gonna be aa'd. also copy that if statement over since otherwise there won't be anything to blend with.

* small optimization

its probably a tiny bit faster?
idk id need actual benchmarking tools.
doesn't break anything at least.
2023-08-27 13:28:26 +02:00
Jaklyy f454eba3c3
check lower pixel when top pixel ignores fog (#1808) 2023-08-13 05:38:26 +02:00
Arisotura 35cc79787d update copyright headers 2022-01-09 02:15:50 +01:00
Arisotura 19ddaee13b finally decouple Config from the core. baahhahahahah 2021-11-18 01:17:51 +01:00
RSDuck 883fceb6ce use std::swap 🔃 2021-08-21 01:54:45 +02:00
RSDuck f900792dc0 addition to last commit 2021-08-04 14:35:54 +02:00
RSDuck 9181ab19c7 GPU3D soft: prevent out of bounds read 2021-05-24 19:41:24 +02:00
RSDuck 509107fb59 set instead of or stencil buffer for left edges 2021-05-08 00:12:48 +02:00
RSDuck 436b3c4c1d update copyright year and add missing GPL headers 2021-03-12 20:07:40 +01:00
RSDuck bc63531e00 avoid leaking threads in NDSCart_SRAMManager
also atomics
2021-03-11 16:54:43 +01:00
RSDuck 295d60e4cb try to fix build when the compiler is stricter 2021-02-11 19:11:18 +01:00
RSDuck f05bc50d40 use std::function in Thread_Create so we can revert back to using it 2021-02-11 16:00:36 +01:00
Wunk a7029aebae
Allow for a more modular renderer backends (#990)
* Draft GPU3D renderer modularization

* Update sources C++ standard to C++17

The top-level `CMakeLists.txt` is already using the C++17 standard.

* Move GLCompositor into class type

Some other misc fixes to push towards better modularity

* Make renderer-implementation types move-only

These types are going to be holding onto handles
of GPU-side resources and shouldn't ever be copied around.

* Fix OSX: Remove 'register' storage class specifier

`register` has been removed in C++17...
But this keyword hasn't done anything in years anyways.

OSX builds consider this "warning" an error and it
stops the whole build.

* Add RestartFrame to Renderer3D interface

* Move Accelerated property to Renderer3D interface

There are points in the code base where we do:
`renderer != 0` to know if we are feeding
an openGL renderer. Rather than that we can instead just have this be
a property of the renderer itself.
With this pattern a renderer can just say how it wants its data to come
in rather than have everyone know that they're talking to an OpenGL
renderer.

* Remove Accelerated flag from GPU

* Move 2D_Soft interface in separate header

Also make the current 2D engine an "owned" unique_ptr.

* Update alignment attribute to standard alignas

Uses standardized `alignas` rather than compiler-specific
attributes.

https://en.cppreference.com/w/cpp/language/alignas

* Fix Clang: alignas specifier

Alignment must be specified before the array to align the entire array.

https://en.cppreference.com/w/cpp/language/alignas

* Converted Renderer3D Accelerated to variable

This flag is checked a lot during scanline rasterization. So rather
than having an expensive vtable-lookup call during mainline rendering
code, it is now a public constant bool type that is written to only once
during Renderer3D initialization.
2021-02-09 23:38:51 +01:00
RSDuck b78bc4cb66 fixes to the threadedness of the sw rasteriser
also fix #639 and fix #880
2021-01-26 16:42:27 +01:00
RSDuck 7d448d911d use C++ style structs everywhere 2021-01-02 11:38:06 +01:00
RSDuck 6e8bac3909 Merge vram dirty tracking
Squashed commit of the following:

commit b463a05d4b909372f0cd1ad91caa0c77a25e5901
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 30 01:55:35 2020 +0100

    minor fix

commit ce73cebbdf5da243d7ebade82d8799ded9cd6b28
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 30 00:43:08 2020 +0100

    fix dirty flags of BG/OBJ mappings not being reset

commit fc5d73a6178e3adc444398bdd23de8314b5ca8f8
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 30 00:11:13 2020 +0100

    use flat vram for gpu2d everywhere

commit 34ee9fe2bf04fcfa2a5a1c8d78d70007e606f1a2
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Sat Nov 28 19:10:34 2020 +0100

    mark VRAM dirty for display capture

commit e8778fa2f429c6df0eece19d6a5ee83ae23a0cf4
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Sat Nov 28 18:59:31 2020 +0100

    use flat VRAM for textures and texpals
    also skip rendering if nothing changed and a bunch of fixes

commit 53f2041e2e1a28b35702a2ed51de885c36689f71
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Fri Nov 27 18:29:56 2020 +0100

    use vram dirty tracking for extpals
    also preparations to take this further

commit 4cdfa329e95aed26d3b21319c8fd86a04abf20f7
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 16 23:32:22 2020 +0100

    VRAM dirty tracking
2020-11-30 19:49:18 +01:00
RSDuck 2720df9650 make platform objects typesafer and add mutex 2020-11-09 21:52:35 +01:00
Arisotura 0804ab3c78 * rework GPU's settings interface, make it config-agnostic
* make video settings dialog functional, sorta
* fix dialogs that were resizable
2020-05-28 15:53:32 +02:00
Arisotura 83f8e11bc1 update copyright years 2020-02-14 20:18:08 +01:00
Arisotura 3c70015da7 software renderer: fix rendering of line polygons. fixes #350 2019-06-11 03:10:32 +02:00
Arisotura db396e992b welp.
progress
2019-05-21 22:28:46 +02:00
Arisotura c1746f0c60 BAHAHAHHHH
HARK HARK HARK
2019-05-16 20:58:07 +02:00
Arisotura c81bcccadc BAHAHAHAHAHAHAHAA 2019-05-16 16:27:45 +02:00
Arisotura 0a464c504d de-hardcode the GL renderer.
init framebuffer to black.
fix bugs.
2019-05-12 16:32:53 +02:00
Arisotura fb4f972cad hires hax. somewhat functional 2019-05-08 01:58:34 +02:00
Arisotura f8751bd1fb first attempt at things
(also fix softrenderer reset)
2019-04-01 02:51:31 +02:00
Arisotura b0efde8bf7 also, update copyright name 2019-01-22 15:58:29 +01:00
StapleButter f6e6fa05ea some work on extreme/degenerate shit in GPU
* clip against Z then Y then X. apparently, fixes #310. I had also observed hints that the hardware does it this way.
* truncate W to 24 bits before viewport transform.
* mark any polygons that have a W=0 at that point as degenerate. do not render.
2018-12-20 01:31:31 +01:00
StapleButter dd30b417b8 implement proper support for POWCNT1.
fixes #260
2018-12-18 17:04:42 +01:00
StapleButter 68d5e3c782 3D: in Z-buffering mode, margin for 'equal' depth test mode is +-0x200, not +-0xFF
fixes #274
2018-12-13 22:46:12 +01:00
StapleButter b4165cc0a9 3D: keep the rasterizer from accidentally going out of bounds when given very flat X-major edge slopes.
this, by a fucking shitshow of butterfly effect, ends up fixing #234. technically, the rasterizer was going out of bounds, which, under certain circumstances, caused interpolation to shit itself and generate Z values that were out of range (but still ended up in the zbuffer). sometimes those values ended up negative, which caused these glitches when polygons had to be drawn over those.

about fucking time.
2018-11-04 23:21:58 +01:00
StapleButter 4075dad0a8 3D: attempt at fixing that shadow/AA interaction bug in the MKDS character select screen 2018-10-22 01:36:04 +02:00
StapleButter fea7955675 fixor copyright years. 2018-09-15 02:32:13 +02:00