rogerman
c7bb41e4b1
matrix.cpp: Rework all matrix function parameters for explicit array sizing in order to aid compiler optimization and (hopefully) aid in code readability. Also add SSE4.1 versions for the main matrix functions.
2018-02-19 11:43:55 -08:00
rogerman
5a61a08727
matrix.cpp: Do some more code cleanup.
2018-02-16 20:11:36 -08:00
rogerman
249afccfca
matrix.cpp: Do a bunch of code cleanup.
2018-02-16 11:59:19 -08:00
rogerman
c41a006b2a
GPU: Add additional basic SIMD-accelerated functions for memset_u16(), memset_u16_fast(), memset_u32(), and memset_u32_fast() for AVX2 and Altivec.
2018-02-13 14:45:17 -08:00
rogerman
5fbaa53b46
GPU: If a custom-sized layer is to be rendered first, GPUEngineBase::_TransitionLineNativeToCustom() will do a line clear instead of an upscaled line copy.
...
- Since this is a very common occurrence in many games, and since doing a clear is faster than doing an upscaled copy, this should give a small performance improvement for the larger framebuffer sizes.
2018-02-13 13:54:10 -08:00
rogerman
43d3883986
SoftRasterizer: Framebuffer clears are now accelerated using AVX2 and Altivec.
2018-02-12 18:03:52 -08:00
rogerman
ab18de05ef
SoftRasterizer: Oops! Fix a performance regression in SoftRasterizerRenderer_SSE2::ClearUsingValues() where the framebuffer was accidentally being cleared twice. (Regression from commit 7509d46.)
2018-02-12 13:42:42 -08:00
rogerman
7509d469b9
SoftRasterizer: Do some multithreading improvements, and also clean up and refactor RasterizerUnit.
...
- Completely encapsulate all stray global variables into the SoftRasterizer class where they belong.
- Framebuffer clears are now fully multithreaded, significantly improving clearing performance.
- Doing multithreaded texture loads and vertex calculations now requires a minimum of 2 threads, down from 4 threads.
- The maximum number of SoftRasterizer threads has been increased from 16 to 32.
2018-02-12 11:35:21 -08:00
rogerman
9e3b694ace
Cocoa Port: Do some minor code cleanup.
2018-02-05 20:29:09 -08:00
rogerman
d1dcbb8218
Cocoa Port: Fix a potential deadlock that may occur on emulation reset.
2018-02-04 13:07:59 -08:00
rogerman
9ee7cd8ec0
NDSSystem.cpp: Check for the GPU struct before calling GPUSubsystem::ForceFrameStop() in GameInfo::closeROM().
2018-02-03 21:59:00 -08:00
rogerman
23be799a67
Cocoa Port: Metal display views no longer lose visible frames when running multiple display views.
2018-02-03 21:21:54 -08:00
rogerman
01c508f93a
Cocoa Port: Remove and replace the high-overhead NSThread with the lower-overhead pthread_t. Improves video display performance when the frame rate is very high (greater than 600 FPS).
2018-02-03 11:31:41 -08:00
rogerman
f9c32c9e79
Cocoa Port: Rework triple buffering for Metal display views yet again. This should fix the performance regression introduced in commit a65ceae9 for the larger custom framebuffer sizes.
2018-01-30 16:26:05 -08:00
zeromus
2a58246eb5
Merge pull request #123 from keelimeguy/master
...
Windows Port: Adding Pen and Touch support for touch screen devices
2018-01-08 16:04:11 -06:00
Keelin Wheeler
b11bde4be4
Windows Port: Adding Pen and Touch support for touch screen devices
2018-01-08 16:37:20 -05:00
rogerman
f2f3680a7c
Cocoa Port: Fix a bug where Metal display view backing textures weren't updating their custom framebuffer sizes. (Regression from commit 4c01e66.)
2017-12-19 15:35:20 -08:00
rogerman
4c01e66a8a
GPU: Instead of using fixed double-buffered output framebuffers, allow clients to request any number of framebuffer pages between 1 and 8.
...
- For all non-Cocoa ports, reduce the number of framebuffer pages from 2 to 1, reducing the memory usage for those ports.
- For the Cocoa port, increase the number of framebuffer pages from 2 to 3 in preparation for a new triple-buffered display scheme.
2017-12-19 14:33:48 -08:00
rogerman
d3b628af47
Cocoa Port: Rework synchronization for Metal display views yet again. It should be a lot better now.
2017-12-17 20:35:00 -08:00
rogerman
a65ceae98f
Cocoa Port: For Metal display views, be much smarter about how we do synchronization. Should fix the performance issues introduced with commit 26ac91ed.
2017-12-11 16:28:42 -08:00
rogerman
1ea95cdde4
Cocoa Port: Do some minor code cleanup.
2017-12-11 16:17:02 -08:00
rogerman
cd2f75e43a
Cocoa Port: Replace all POSIX named semaphores with Mach semaphores and GCD semaphores, which are both faster than POSIX named semaphores.
2017-12-08 11:49:49 -08:00
rogerman
1e36b36bef
Cocoa Port: Remove now obsolete locks from sound functions, since we now call SPU_Emulate_user() in the emulation thread again.
2017-12-07 23:00:28 -08:00
rogerman
cee6867bd8
Cocoa Port: In RunCoreThread(), don’t use a potentially more expensive wait method before doing a cheaper time comparison first.
2017-12-07 21:01:59 -08:00
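The ordering described in the commit above (cheap comparison first, expensive wait only if needed) can be sketched roughly as follows. This is a minimal illustration in standard C++, not the actual RunCoreThread() code: the function name WaitUntilDeadline() is hypothetical, and std::this_thread::sleep_until() stands in for whatever platform wait primitive the port really uses.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: block until 'deadline' only if it has not already passed.
// Returns true if the caller actually had to wait, false otherwise.
static bool WaitUntilDeadline(Clock::time_point deadline)
{
    // Cheap path: read the clock and compare. If the deadline has already
    // passed, skip the wait call entirely.
    if (Clock::now() >= deadline)
    {
        return false;
    }

    // Expensive path: only now pay for a blocking wait.
    std::this_thread::sleep_until(deadline);
    return true;
}
```

When the emulation loop is already running behind schedule, the early return avoids entering the wait call at all.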
rogerman
bac10c7618
Cocoa Port: OpenGL display views no longer use glFlush() when rendering for final flush, since glFlush() has been found to not actually be necessary.
2017-12-05 17:13:24 -08:00
rogerman
26ac91edd0
Cocoa Port: For Metal display views, replace all locks with semaphores, which are the correct synchronization primitive to use here.
...
- Also change the CocoaDSOutput list lock from a mutex to a rwlock, since testing has shown that there is more thread contention here than I previously thought.
2017-12-05 13:43:30 -08:00
rogerman
f9109568b8
Cocoa Port: Improve stability of Metal display views when running CPU-based pixel scalers.
...
- Also fix a bug where, when restoring multiple display windows on startup, only the last display window shown would work properly.
2017-12-03 00:18:30 -08:00
rogerman
39039f2396
Cocoa Port: Have the HUD Settings panel title show the number of the display window that is currently in focus, just like all the other panels.
2017-12-02 19:33:08 -08:00
rogerman
b48666ea9c
Cocoa Port: Add new HUD item, “Show Execution Speed”, which displays the emulator’s execution speed as a percentage.
2017-12-02 15:35:51 -08:00
zeromus
87335dd57a
Merge pull request #121 from spiveeworks/master
...
Update README.LIN (autogen.sh, dependencies)
2017-12-01 01:11:48 -06:00
Spivee
93d01f7bf5
Update README.LIN (autogen.sh, dependencies)
...
I found these dependencies harder to figure out than usual,
since I'm used to installing packages with pregenerated `configure` scripts.
In particular, if `glib` is missing then `configure` will generate with unexpanded macros, which is confusing.
This extra paragraph should be helpful for others.
Thanks for a great program :)
2017-12-01 18:07:11 +11:00
rogerman
66e8a95657
Cocoa Port: Stability improvements for Metal display views.
2017-11-29 21:31:39 -08:00
rogerman
02a3b58edd
Cocoa Port: Fix memory leaks with Metal display views.
...
- Also fix a bug where Metal display views would fail on macOS High Sierra if a CPU-based pixel scaler was used.
2017-11-29 20:02:39 -08:00
rogerman
c81df97a92
Cocoa Port: Restore the ability to use Metal display views on macOS High Sierra.
...
- Also rework the way the HQnx LUTs are loaded in Metal.
2017-11-28 14:06:34 -08:00
rogerman
cd6fbcd5ea
Cocoa Port: In the Metal framebuffer fetcher, further optimize 18-bit to 32-bit color conversions whenever the master brightness does not need to be applied, which is the most typical use case.
2017-11-28 00:53:50 -08:00
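For readers unfamiliar with 18-bit to 32-bit color conversion, the sketch below shows one common way to do it when no brightness adjustment is needed: each 6-bit channel is widened to 8 bits by bit replication. The function names and the packed pixel layout (red in the lowest bits) are assumptions made for illustration; this is not the Metal framebuffer fetcher's actual code, and the master brightness step is left out entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: widen one 6-bit channel (0..63) to 8 bits (0..255)
// by copying the top two bits into the two new low bits.
static inline uint8_t Expand6To8(uint8_t c6)
{
    assert(c6 <= 0x3F); // caller must pass a 6-bit value
    return static_cast<uint8_t>((c6 << 2) | (c6 >> 4));
}

// Assumed input layout: bits 0-5 red, bits 6-11 green, bits 12-17 blue.
// Output layout: 0xAABBGGRR, with alpha forced to fully opaque.
static inline uint32_t ConvertRGB666ToRGBA8888(uint32_t rgb666)
{
    const uint32_t r = Expand6To8(static_cast<uint8_t>( rgb666        & 0x3F));
    const uint32_t g = Expand6To8(static_cast<uint8_t>((rgb666 >> 6)  & 0x3F));
    const uint32_t b = Expand6To8(static_cast<uint8_t>((rgb666 >> 12) & 0x3F));

    return 0xFF000000u | (b << 16) | (g << 8) | r;
}
```

Bit replication maps 0 to 0 and 63 to 255, so full black and full white survive the conversion exactly.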
rogerman
f0564cc4ac
Cocoa Port: Fix a couple of rare edge-case bugs with Metal display views.
2017-11-27 22:53:18 -08:00
rogerman
258ebfd6ea
Cocoa Port: Synchronously force a framebuffer fetch on startup to guarantee that all display windows will appear black.
2017-11-27 21:17:16 -08:00
rogerman
e18dd27d30
Cocoa Port: Fix the Screenshot Capture Tool when running Metal. (Regression from commit f5ead86.)
2017-11-27 21:15:26 -08:00
rogerman
7213c6373b
GPU: All fields for NDSDisplayInfo should be set consistently relative to the NDSDisplayID, not the GPUEngineID.
...
- In practice, this should change nothing, since all pointers somehow managed to point to the correct buffer locations. This should be nothing more than a programming consistency and readability improvement.
2017-11-27 21:07:14 -08:00
zeromus
fa4b027dbd
winport: add --windowed-fullscreen
2017-11-27 18:16:40 -06:00
rogerman
d5b62d3d02
OpenGL Renderer: Improve the robustness of error-checking OpenGL drivers.
2017-11-24 02:02:05 -08:00
rogerman
010efff31b
Linux Port (GTK): Fix OSMesa context creation. (Regression from r4905. Fixes #119.)
...
- Also do some code cleanup.
2017-11-24 00:28:49 -08:00
rogerman
b9ada994df
Linux Port (GTK): Remove and replace legacy colorspace handling routines with the new SIMD-optimized colorspace handling routines.
2017-11-22 17:43:17 -08:00
rogerman
96bd35517b
Cocoa Port: Signal a fetch at startup, after reading the user defaults for GPU Scaling Factor and GPU Color Depth, in order to guarantee that the client-side fetch buffers will be cleared.
2017-11-21 19:05:11 -08:00
rogerman
5890540007
OpenGL: Maintain one more flag to ensure that textures are always initialized. (Fixes #116.)
2017-11-21 11:17:52 -08:00
rogerman
4269925258
GPU: Properly initialize the newer NDSDisplayInfo fields.
...
- This has the side-effect of having the Windows port’s display window
start up with a white screen and HUD showing (if enabled) just like
before, rather than a black screen and HUD possibly hidden.
2017-11-20 23:51:42 -08:00
rogerman
25eae6e1ed
Cocoa Port (OpenEmu Plug-in): Change the video format to 18-bit RGB666, which matches a hardware NDS.
2017-11-20 21:04:50 -08:00
rogerman
a7065311cc
Core: Change some default settings to use more compatible and true-to-hardware settings.
...
- GPU Color Depth (from 24-bit to 18-bit), Advanced SPU Logic (from Disabled to Enabled), SPU Interpolation (from Linear to Cosine), Synchronization Mode (from Dual SPU Sync/Async to Synchronous)
- Just like the previous change to the default JIT block size, let the users themselves disable these settings so that they are more aware that they are sacrificing compatibility for speed.
2017-11-20 16:39:50 -08:00
rogerman
48fee8d590
Windows Port: Don't resize the display window after changing GPU Scaling Factor.
...
- Also remove TCommonSettings.GFX3D_PrescaleHD. It is a useless setting in core because the internal resolution is not limited to integer-multiplied scaling.
- Also fix spelling on the "Maintain Aspect Ratio" menu option.
2017-11-20 14:37:30 -08:00
rogerman
24108e35e2
Core: Change the default max. JIT block size from 100 to 12, since 12 has been tested to be more compatible and safer to use.
...
- There is only a negligible performance difference between 100 and 12.
- It is better for users to change the JIT block size from 12 to 100
themselves, since it might make them more aware that they are
sacrificing compatibility in favor of speed.
2017-11-20 12:57:24 -08:00