Commit Graph

66 Commits

Author SHA1 Message Date
Jakly f719438a6e
Improve calculation of light colors (#1967)
* maintain precision until all lights are calculated

fixes lugia on the soul silver title screen

* small optimization

* small note

* small cleanup/notes

shouldn't need to check that every time, since the variable shouldn't be able to overflow

* hw doesn't cap difflevel at 255

Should it cap at all?
Can vtx colors overflow...?

* diffuse level appears to be shifted right by 9

fixes some minor inaccuracies

* improve specular lighting a little

* small improvement to diffuse lighting

fixes a few off by ones
- finding by azusa

* small tweaks

* handle overflows of diffuse lighting properly

-credits to azusa once more

* attempt at improving specular lighting calcs

still far from correct, but its a start.
fixes: https://github.com/melonDS-emu/melonDS/issues/1545

* meh

* improve specular lighting further

* add notes

* theory: add half vec instead of subt 1

* implement azusa's specular lighting algorithm

* fix minor edge case with spec lighting

* give proper credit in comments

* fix some bugs/misc tweaks

* more quirky overflow/underflow handling

* fix a spec lighting edgecase

remove some redundant parentheses

* fix an edge case with light vector calcs

* spec recip uses a different calc for light dir?

also remove a check that shouldn't be mathematically possible to trigger

* nvm that thing i thought couldn't trigger was required

also move reciprocal calc into the light vector calc function since i might as well now ig

* replace a bunch of stuff with much *much* simpler algorithms

* misc cleanup

PARENTHESES WOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO

* leave a note abt shininess table's default value being incorrect
2024-09-10 16:13:51 +02:00
Jakly 161bd9d3d2
Default zero dot display register to the 24 bit integer limit (#1968)
* 0 dot disp defaults to the 24 bit integer limit

* useless correction

it goes through the reset function to set the variable on boot anyway but why not have the initialized state be correct too
2024-08-01 22:46:05 +02:00
Arisotura 8fc403cdad update copyright headers 2024-06-15 17:01:19 +02:00
RSDuck 043244a56d
Compute shader renderer (#2041)
* nothing works yet

* don't double buffer 3D framebuffers for the GL Renderer
looks like leftovers from when 3D+2D composition was done in the frontend

* oops

* it works!

* implement display capture for compute renderer
it's actually just all stolen from the regular OpenGL renderer

* fix bad indirect call

* handle cleanup properly

* add hires rendering to the compute shader renderer

* fix UB
also misc changes to use more unsigned multiplication
also fix framebuffer resize

* correct edge filling behaviour when AA is disabled

* fix full color textures

* fix edge marking (polygon id is 6-bit not 5)
also make the code a bit nicer

* take all edge cases into account for XMin/XMax calculation

* use hires coordinate again

* stop using fixed size buffers based on scale factor in shaders
this makes shader compile times tolerable on Wintel
- beginning of the shader cache
- increase size of tile idx in workdesc to 20 bits

* apparently & is not defined on bvec4
why does this even compile on Intel and Nvidia?

* put the texture cache into it's own file

* add compute shader renderer properly to the GUI
also add option to toggle using high resolution vertex coordinates

* unbind sampler object in compute shader renderer

* fix GetRangedBitMask for 64 bit aligned 64 bits
pretty embarassing

* convert NonStupidBitfield.h back to LF only new lines

* actually adapt to latest changes

* fix stupid merge

* actually make compute shader renderer work with newest changes

* show progress on shader compilation

* remove merge leftover
2024-05-13 17:17:39 +02:00
Jesse Talavera 8143f54956
Protect savestates while the threaded software renderer is running (#1864)
* First crack at ensuring the render thread doesn't touch GPU state while it's being serialized

* Get rid of the semaphore wait

* Add some extra fields into GPU3D's serialization

* Oops, TempVertexBuffer is already serialized

* Move vertex serialization into its own method

* Lock the GPU3D state when rendering on the render thread or serializing it

* Revert "Lock the GPU3D state when rendering on the render thread or serializing it"

This reverts commit 2f49a551c1.

* Add comments that describe the synchronization within GPU3D_Soft

- I need to understand it before I can solve my actual problem
- Now I do

* Revert "Revert "Lock the GPU3D state when rendering on the render thread or serializing it""

This reverts commit 1977566a6d.

* Let's try locking the GPU3D state throughout NDS::RunFrame

- Just to see what happens

* Slim down the lock's scope

* Narrow the lock's scope some more

* Remove the lock entirely

* Try protecting the GPU3D state with just a mutex

- I'll clean this up once I know it works

* Remove a duplicate method definition

* Add a missing `noexcept` specifier

* Remove an unused function

* Cut some non-hardware state from `GPU3D`'s savestate

* Assume that the next frame after loading a savestate won't be identical

* Actually, it _is_ worth it

* Don't serialize the clip matrix

- It's recalculated anyway

* Serialize `RenderPolygonRAM` as an array of indexes

* Clean up some comments

- I liked the dialogue style, but oh well

* Try restarting the render thread instead of using the lock

- Let's see what happens

* Put the lock back

* Fix some polygon and vertex indexes being saved incorrectly

- Taking the difference between two pointers results in the number of elements, not the number of bytes

* Remove `SoftRenderer::StateBusy` since it turns out we don't need it

- The real synchronization was the friends we made along the way
2024-01-07 23:39:43 +01:00
RSDuck ed650f2b46 call Reset on 3D renderer object 2023-12-21 21:43:57 +01:00
Jesse Talavera c867a7f1c0
Make the initial 3D renderer configurable via `NDSArgs` (#1913)
* Allow 3D renderers to be created without passing `GPU` to the constructor

* Make the initial 3D renderer configurable via `NDSArgs`

* Fix a compiler error
2023-12-15 14:53:31 +01:00
Jesse Talavera 9bfc9c08ff
Sprinkle `const` around where appropriate (#1909)
* Sprinkle `const` around where appropriate

- This will make it easier to use `NDS` objects in `const` contexts (e.g. `const` parameters or methods)

* Remove the `const` qualifier on `DSi_DSP::DSPRead16`

- MMIO reads can be non-pure, so this may not be `const` in the future
2023-12-12 11:07:22 +01:00
RSDuck 082310d5d5 hopefully reset all GPU3D attributes properly 2023-12-08 23:42:08 +01:00
Jesse Talavera-Greenberg 7caddf9615
Clean up the 3D renderer for enhanced flexibility (#1895)
* Give `GPU2D::Unit` a virtual destructor

- Undefined behavior avoided!

* Add an `array2d` alias

* Move various parts of `GPU2D::SoftRenderer` to `constexpr`

- `SoftRenderer::MosaicTable` is now initialized at compile-time
- Most of the `SoftRenderer::Color*` functions are now `constexpr`
- The aforementioned functions are used with a constant value in at least one place, so they'll be at least partially computed at compile-time

* Generalize `GLRenderer::PrepareCaptureFrame`

- Declare it in the base `Renderer3D` class, but make it empty

* Remove unneeded `virtual` specifiers

* Store `Framebuffer`'s memory in `unique_ptr`s

- Reduce the risk of leaks this way

* Clean up how `GLCompositor` is initialized

- Return it as an `std::optional` instead of a `std::unique_ptr`
- Make `GLCompositor` movable
- Replace `GLCompositor`'s plain arrays with `std::array` to simplify moving

* Pass `GPU` to `GLCompositor`'s important functions instead of passing it to the constructor

* Move `GLCompositor` to be a field within `GLRenderer`

- Some methods were moved up and made `virtual`

* Fix some linker errors

* Set the renderer in the frontend

* Remove unneeded `virtual` specifiers

* Remove `RenderSettings` in favor of just exposing the relevant member variables

* Update the frontend to accommodate the core changes

* Add `constexpr` and `const` to places in the interpolator

* Qualify references to `size_t`

* Construct the `optional` directly instead of using `make_optional`

- It makes the Linux build choke
- I think it's because `GLCompositor`'s constructor is `private`
2023-11-29 15:23:11 +01:00
Jesse Talavera-Greenberg e973236203
Refactor `NDS` and `DSi` to be objects (#1893)
* First crack at refactoring NDS and DSi into objects

- Remove all global/`static` variables in `NDS` and related classes
- Rely more on virtual dispatch when we need to pick methods at runtime
- Pass `NDS&` or `DSi&` to its constituent components where necessary
- Introduce some headers or move some definitions to break `#include` cycles

* Refactor the frontend to accommodate the core's changes

* Move up `SchedList`'s declaration

- Move it to before the components are initialized so the `map`s inside are initialized
- Fields in C++ are initialized in the order they're declared

* Fix a crash when allocating memory

* Fix JIT-free builds

* Fix GDB-free builds

* Fix Linux builds

- Explicitly qualify some member types in NDS, since they share the same name as their classes

* Remove an unnecessary template argument

- This was causing the build to fail on macOS

* Fix ARM and Android builds

* Rename `Constants.h` to `MemConstants.h`

* Add `NDS::IsRunning()`

* Use an `#include` guard instead of `#pragma once`
2023-11-28 23:16:41 +01:00
Jesse Talavera-Greenberg 346dd4006e
Move all core types into namespaces (#1886)
* Reorganize namespaces

- Most types are now moved into the `melonDS` namespace
- Only good chance to do this for a while, since a big refactor is next

* Fix the build
2023-11-25 18:32:09 +01:00
Jesse Talavera-Greenberg 4558be0d8e
Refactor the GPU to be object-oriented (#1873)
* Refactor GPU3D to be an object

- Who has two thumbs and is the sworn enemy of global state? This guy!

* Refactor GPU itself to be an object

- Wow, it's used in a lot of places
- Also introduce a new `Melon` namespace for a few classes
- I expect other classes will be moved into `Melon` over time

* Change signature of Renderer3D::SetRenderSettings

- Make it noexcept, and its argument const

* Remove some stray whitespace
2023-11-09 21:54:51 +01:00
Arisotura ac38faef14 update copyright years 2023-11-04 00:21:46 +01:00
Jesse Talavera-Greenberg db963aa002
Make the NDS teardown more robust (#1798)
* Make cleanup a little more robust to mitigate undefined behavior

- Add some null checks before cleaning up the GPU3D renderer
- Make sure that all deleted objects are null
- Move cleanup logic out of an assert call
- Note that deleting a null pointer is a no-op, so there's no need to check for null beforehand
- Use RAII for GLCompositor instead of Init/DeInit methods

* Replace a DeInit call that I missed

* Make ARMJIT_Memory less likely to generate errors

- Set FastMem7/9Start to nullptr at the end
- Only close and unmap the file if it's initialized

* Make Renderer3D manage its resources with RAII

* Don't try to deallocate frontend resources that aren't loaded

* Make ARMJIT_Memory::DeInit more robust on the Switch

* Reset MemoryFile on Windows to INVALID_HANDLE_VALUE, not nullptr

- There is a difference

* Don't explicitly store a Valid state in GLCompositor or the 3D renderers

- Instead, create them with static methods while making the actual constructors private

* Make initialization of OpenGL resources fail if OpenGL isn't loaded

* assert that OpenGL is loaded instead of returning failure
2023-09-15 15:31:05 +02:00
Arisotura 35cc79787d update copyright headers 2022-01-09 02:15:50 +01:00
RSDuck f792d3e6a1 handle changed VCount+threaded rasteriser more gracefully 2021-08-04 14:21:45 +02:00
RSDuck 436b3c4c1d update copyright year and add missing GPL headers 2021-03-12 20:07:40 +01:00
Wunk a7029aebae
Allow for a more modular renderer backends (#990)
* Draft GPU3D renderer modularization

* Update sources C++ standard to C++17

The top-level `CMakeLists.txt` is already using the C++17 standard.

* Move GLCompositor into class type

Some other misc fixes to push towards better modularity

* Make renderer-implementation types move-only

These types are going to be holding onto handles
of GPU-side resources and shouldn't ever be copied around.

* Fix OSX: Remove 'register' storage class specifier

`register` has been removed in C++17...
But this keyword hasn't done anything in years anyways.

OSX builds consider this "warning" an error and it
stops the whole build.

* Add RestartFrame to Renderer3D interface

* Move Accelerated property to Renderer3D interface

There are points in the code base where we do:
`renderer != 0` to know if we are feeding
an openGL renderer. Rather than that we can instead just have this be
a property of the renderer itself.
With this pattern a renderer can just say how it wants its data to come
in rather than have everyone know that they're talking to an OpenGL
renderer.

* Remove Accelerated flag from GPU

* Move 2D_Soft interface in separate header

Also make the current 2D engine an "owned" unique_ptr.

* Update alignment attribute to standard alignas

Uses standardized `alignas` rather than compiler-specific
attributes.

https://en.cppreference.com/w/cpp/language/alignas

* Fix Clang: alignas specifier

Alignment must be specified before the array to align the entire array.

https://en.cppreference.com/w/cpp/language/alignas

* Converted Renderer3D Accelerated to variable

This flag is checked a lot during scanline rasterization. So rather
than having an expensive vtable-lookup call during mainline rendering
code, it is now a public constant bool type that is written to only once
during Renderer3D initialization.
2021-02-09 23:38:51 +01:00
RSDuck b78bc4cb66 fixes to the threadedness of the sw rasteriser
also fix #639 and fix #880
2021-01-26 16:42:27 +01:00
RSDuck 7d448d911d use C++ style structs everywhere 2021-01-02 11:38:06 +01:00
Arisotura 66cec85a9a GPU: forward BG0HOFS to internal rendering engine register for 3D layer scroll (only when the rendering engine is enabled).
fixes #840

thank you RSDuck and Hydr8gon for your insight into this.
2020-12-10 19:12:08 +01:00
RSDuck 6e8bac3909 Merge vram dirty tracking
Squashed commit of the following:

commit b463a05d4b909372f0cd1ad91caa0c77a25e5901
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 30 01:55:35 2020 +0100

    minor fix

commit ce73cebbdf5da243d7ebade82d8799ded9cd6b28
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 30 00:43:08 2020 +0100

    fix dirty flags of BG/OBJ mappings not being reset

commit fc5d73a6178e3adc444398bdd23de8314b5ca8f8
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 30 00:11:13 2020 +0100

    use flat vram for gpu2d everywhere

commit 34ee9fe2bf04fcfa2a5a1c8d78d70007e606f1a2
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Sat Nov 28 19:10:34 2020 +0100

    mark VRAM dirty for display capture

commit e8778fa2f429c6df0eece19d6a5ee83ae23a0cf4
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Sat Nov 28 18:59:31 2020 +0100

    use flat VRAM for textures and texpals
    also skip rendering if nothing changed and a bunch of fixes

commit 53f2041e2e1a28b35702a2ed51de885c36689f71
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Fri Nov 27 18:29:56 2020 +0100

    use vram dirty tracking for extpals
    also preparations to take this further

commit 4cdfa329e95aed26d3b21319c8fd86a04abf20f7
Author: RSDuck <rsduck@users.noreply.github.com>
Date:   Mon Nov 16 23:32:22 2020 +0100

    VRAM dirty tracking
2020-11-30 19:49:18 +01:00
RSDuck 6977302403 make OpenGL renderer a build option
mostly meant for the Switch port
2020-10-01 00:01:05 +02:00
Arisotura 0804ab3c78 * rework GPU's settings interface, make it config-agnostic
* make video settings dialog functional, sorta
* fix dialogs that were resizable
2020-05-28 15:53:32 +02:00
Arisotura 83f8e11bc1 update copyright years 2020-02-14 20:18:08 +01:00
Arisotura 27f758d353 hack so that the GL renderer can render lines 2019-06-12 03:55:40 +02:00
Arisotura 3c70015da7 software renderer: fix rendering of line polygons. fixes #350 2019-06-11 03:10:32 +02:00
Arisotura 667dee6754 more code botching
it's less shitty tho

but still has bugs
2019-05-24 02:04:41 +02:00
Arisotura db396e992b welp.
progress
2019-05-21 22:28:46 +02:00
Arisotura b493c24128 remove reference to GL version 4.3 from filenames and namespaces 2019-05-20 00:05:37 +02:00
Arisotura de287825ee start work on display capture
also fix a bug in the compositing shader
2019-05-17 22:50:41 +02:00
Arisotura c1746f0c60 BAHAHAHHHH
HARK HARK HARK
2019-05-16 20:58:07 +02:00
Arisotura c81bcccadc BAHAHAHAHAHAHAHAA 2019-05-16 16:27:45 +02:00
Arisotura 0a464c504d de-hardcode the GL renderer.
init framebuffer to black.
fix bugs.
2019-05-12 16:32:53 +02:00
Arisotura 53b2262917 calculate hi-res vertex positions. reduces shaking of polygons when rendering at a higher res. 2019-05-11 15:14:59 +02:00
Arisotura fb4f972cad hires hax. somewhat functional 2019-05-08 01:58:34 +02:00
Arisotura f8751bd1fb first attempt at things
(also fix softrenderer reset)
2019-04-01 02:51:31 +02:00
Arisotura b0efde8bf7 also, update copyright name 2019-01-22 15:58:29 +01:00
StapleButter 669247e8c8 redesign main emu loop to use timestamps instead of being a trainwreck
* cleaner code
* faster in some cases
* more accurate (on-demand compensation for timers and GPU)
* less prone to desyncs
* overall betterer
2019-01-05 05:28:58 +01:00
StapleButter f6e6fa05ea some work on extreme/degenerate shit in GPU
* clip against Z then Y then X. apparently, fixes #310. I had also observed hints that the hardware does it this way.
* truncate W to 24 bits before viewport transform.
* mark any polygons that have a W=0 at that point as degenerate. do not render.
2018-12-20 01:31:31 +01:00
StapleButter dd30b417b8 implement proper support for POWCNT1.
fixes #260
2018-12-18 17:04:42 +01:00
StapleButter a9e7f8bc5b add proper support for GXFIFO stalls.
bad games that blast the GXFIFO and overflow it:
* Super Mario 64 DS
* Rayman RR2

latter seems to get its music streaming crapoed.
2018-11-23 22:21:41 +01:00
StapleButter a2cc7087f7 GPU done 2018-10-18 02:31:01 +02:00
StapleButter fea7955675 fixor copyright years. 2018-09-15 02:32:13 +02:00
StapleButter 2e23ae54b2 3D:
* more accurate polygon edges (still not perfect. heh)
* antialiasing (doesn't always work)
2017-08-28 18:37:07 +02:00
StapleButter 8f031f698b normalize W values in both directions (0123-0157 -> 1230-1570) 2017-07-06 18:54:51 +02:00
StapleButter d5376b4184 3D: Y-sorting 2017-07-05 18:38:10 +02:00
StapleButter 01404ac6c3 3D: move opaque/translucent sorting to GPU3D.cpp 2017-07-05 18:11:00 +02:00
StapleButter fa2db3826e (finally) make the threaded 3D renderer option actually work 2017-06-04 15:55:23 +02:00