- While this results in a 20% performance loss for video display window functions (like video filters), this also dramatically extends battery life. To return to the old way of using the discrete GPU for video display windows, the user must disable Automatic Graphics Switching in their System Preferences.
- The Troubleshooting Window now correctly reports the emulated 3D renderer that is currently active, rather than the one that is selected in the GUI.
- Also fix a bug where creating an OpenGL 3D renderer's context would immediately fall back to Apple Software Renderer if context creation failed. Now, context creation falls back to Apple Software Renderer as the last resort, only after all other Core Profile contexts have failed.
POSIX Ports: When acquiring an EGL context, try calling the client-specific eglGetPlatformDisplay() before falling back to the more generic eglGetDisplay(). Hopefully fixes #864.
feature was originally added via PR #764 to the gtk3 frontend.
this makes it possible to run the scaling on the GPU, avoiding the
incredibly slow software scaling that's otherwise done via cairo
when view->window size is set to anything > 1.0.
note that the window size "scale" needs to be identical to the
chosen GPU scale factor, otherwise software scaling kicks in again.
unlike the scale setting in the CLI port, which simply upscales the
native NDS framebuffer in hardware, this setting scales up even
the actual 3D textures, resulting in a sharper image, at the cost
of higher CPU/GPU usage. a game using demanding 3D scenes, like
zelda phantom hourglass' intro scene, may still be able to tank
the FPS.
the original PR also reported issues when setting the GPU scale to
a fraction, therefore the increments are currently locked to 1.0.
until now, the boost key was hardcoded to 'o' even when the config
said otherwise, and not treated in case of a joypad at all.
when triggered with the o key, it even behaved differently than
the gtk ui - the boost wasn't released together with the key,
but only when pressed again, so it was more like a shortcut for
"disable fps limiter".
this change implements the desired outcome of the second part of PR #822,
but without introducing more hacks and relying on magic values.
closes #822
because the GTK frontends use GDK keysyms, not SDL ones (the former
are being stored in the config file and used by the GTK ui), a temporary
workaround was put into place 14 years ago: the loaded config values
were simply being overwritten with the hardcoded defaults.
this commit removes the overriding of the config, and introduces
a cli frontend specific section "SDLKEYS", which is written by the
GTK2 frontend upon a configuration change.
it tries to convert the GDK keycodes into SDL2 ones while doing so.
an alternative solution (involving fewer code changes) would have been
to do the conversion in the cli frontend, but that would require having
the gdk header available for compilation, which may not be the case
if the user only wants the cli frontend. such a user could now create
the config file on another machine with the GTK frontend, or simply
manually take the desired values from the SDL_keycode.h header.
this change is instigated by one of the changes in PR #822, which simply
removed the workaround and kept parsing on error, which mitigated the
problem for some keys, but not all.
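
A rough sketch of the kind of GDK-to-SDL2 conversion described above. The function name is hypothetical and this is not the code from the commit. The numeric constants are copied from memory of gdkkeysyms.h and SDL_keycode.h so that the sketch builds without either header, and only a handful of keys are covered.

```cpp
#include <cstdint>

// Values mirrored from gdkkeysyms.h (assumption: copied by hand for the sketch).
constexpr uint32_t kGdkKeyReturn = 0xff0d;
constexpr uint32_t kGdkKeyLeft   = 0xff51;
constexpr uint32_t kGdkKeyUp     = 0xff52;
constexpr uint32_t kGdkKeyRight  = 0xff53;
constexpr uint32_t kGdkKeyDown   = 0xff54;

// Values mirrored from SDL_keycode.h: non-printable keys are the scancode
// OR'ed with bit 30.
constexpr uint32_t kSdlScancodeMask = 1u << 30;
constexpr uint32_t kSdlkUnknown = 0;
constexpr uint32_t kSdlkReturn  = '\r';
constexpr uint32_t kSdlkRight   = kSdlScancodeMask | 79;
constexpr uint32_t kSdlkLeft    = kSdlScancodeMask | 80;
constexpr uint32_t kSdlkDown    = kSdlScancodeMask | 81;
constexpr uint32_t kSdlkUp      = kSdlScancodeMask | 82;

// Hypothetical helper: maps a GDK keysym to an SDL2 keycode, or
// kSdlkUnknown when the key is not covered by this sketch.
inline uint32_t gdkKeysymToSdlKeycodeSketch(uint32_t keysym)
{
    // Printable ASCII keysyms equal their character code. SDL2 keycodes
    // for letters use the lowercase character.
    if (keysym >= 0x20 && keysym <= 0x7e) {
        if (keysym >= 'A' && keysym <= 'Z')
            return keysym - 'A' + 'a';
        return keysym;
    }

    switch (keysym) {
        case kGdkKeyReturn: return kSdlkReturn;
        case kGdkKeyLeft:   return kSdlkLeft;
        case kGdkKeyUp:     return kSdlkUp;
        case kGdkKeyRight:  return kSdlkRight;
        case kGdkKeyDown:   return kSdlkDown;
        default:            return kSdlkUnknown;
    }
}
```

Keys that fall through to the unknown value are the ones a cli-only user would have to look up by hand in SDL_keycode.h.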
[GTK*] Some gamepad input rework
- Unbind gamepad keys by default. The default bindings may fit one gamepad model but work weirdly with another, causing issues like "[LINUX/GTK3] Whitescreen freeze when pressing square on ds4" (#834).
- Use a non-blocking method to obtain gamepad keys/axes during configuration to avoid a visible emulator freeze and possible deadlock (see "[linux] editing controlls sometimes (often) freezes the emulator", #843).
closes #834, closes #843
* GTK cheats UI inputs/displays hex for address (offset)
* Added range bound to keep internal cheat data within specified size
The decision to change the value column to store G_TYPE_UINT is purely pragmatic--having all data be treated and displayed as unsigned is simpler than writing a custom display to handle signedness and cleaner than 1-3 byte ints appearing unsigned and 4 byte ints appearing signed.
* Added index and cheat type data to ListStore
My implementation plan is to use a GtkTreeModelFilter to create separate sections for internal and ActionReplay cheats, but I was not convinced the indices in the tree path would correspond to the indices in , hence the additional column.
* Filter raw and AR cheats, display both filters in UI, patch up raw update/delete
* Action Replay UI elements [GTK]
* Memory leak fixes
+ some additional clean-up comments and small bug patches
* Backport to GTK2
- Using the new OpenGLGeometryResource and OpenGLRenderStatesResource classes, the 3.2 Core Profile and ES renderers now use triple-buffering for all geometry rendering resources and framebuffer constants.
- Delete the InitFinalRenderStates() method, which has been obsoleted over the years. Its functionality has been rolled into the InitExtensions() and _RenderGeometryLoopBegin() methods.
- In the 3.2 Core Profile and ES renderers, synchronization of vertex info uploads now occurs at a more reasonable time.
- Geometry rendering is now Y-flipped by default, eliminating the need to Y-flip the final output framebuffer under all conditions.
- Legacy OpenGL no longer needs to perform any color conversion of the final output framebuffer if FBOs are supported.
- Rename some functions and variables to better describe what things are doing now.
- Remove some methods in OpenGLRenderer_2_0 and OpenGLRenderer_2_1 that have negligible contribution to either performance or code simplicity.
- OpenGLRenderer_1_2, OpenGLRenderer_2_0, and OpenGLRenderer_2_1 are now instantiated with their specific variant IDs.
- Calls to malloc_alignedCacheLine() have been replaced with the more appropriate malloc_alignedPage().
- Do some misc. code cleanup.
- Also fixes build errors related to explicitly building SIMD files for colorspacehandler. Don't do it! The proper way is to simply include "colorspacehandler.cpp" alone and let the compiler's preprocessor macros determine which SIMD file to use.
- Release builds now use -Ofast optimization instead of -O3.
- Release builds no longer strip their symbols. This will be needed for profiling and debugging.
- These changes now presume that standard OpenGL and OpenGL ES are mutually exclusive. We will NOT support running a standard OpenGL context and an OpenGL ES context in the same process.
- Clients must explicitly request supporting the OpenGL 3D renderer in their build configuration. Build configurations must define ENABLE_OPENGL_STANDARD (replacing the HAVE_OPENGL macro) or ENABLE_OPENGL_ES. If neither macro is defined, then the OpenGL 3D renderer will be assumed unavailable.
- Meson and Autotools now use better header/library checks for OpenGL functionality with their associated context type.
- Add a new Code::Blocks project file that can make builds for CLI, GTK, and GTK2.
- GTK and GTK2 ports now have the option to run a legacy OpenGL context, a 3.2 Core Profile context, or to simply choose the best context automatically like before.
- GTK and GTK2 ports have GLX and EGL as new context types. OSMesa and SDL have been updated to the latest design pattern.
- GTK and GTK2 ports can now be configured (via Meson or Autotools) to use a GLX, OSMesa, EGL, or SDL context.
- OSMesa contexts are now marked as deprecated. I don't know of anyone who still uses them, and I haven't been able to get them to work correctly for years. Now that we have GLX contexts for compatibility purposes, OSMesa contexts are completely redundant.
- Fix a bug with the GTK port where ancient GPUs without FBO support can still run framebuffers at custom sizes.
- For POSIX ports, move all "avout" and context creation files to the "shared" directory for better file organization.
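
The build-time rules for ENABLE_OPENGL_STANDARD and ENABLE_OPENGL_ES described above might be enforced with a guard along these lines. This is a sketch only: the helper function is hypothetical, and the real headers may organize the checks differently.

```cpp
#include <cstring>

// The two variants are mutually exclusive, so reject builds that ask for both.
#if defined(ENABLE_OPENGL_STANDARD) && defined(ENABLE_OPENGL_ES)
#error "ENABLE_OPENGL_STANDARD and ENABLE_OPENGL_ES cannot both be defined."
#endif

// Hypothetical helper that reports which variant this build was configured for.
inline const char* openglBuildVariantSketch()
{
#if defined(ENABLE_OPENGL_STANDARD)
    return "standard";
#elif defined(ENABLE_OPENGL_ES)
    return "es";
#else
    // Neither macro defined: the OpenGL 3D renderer is assumed unavailable.
    return "unavailable";
#endif
}
```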
- Fix a compiling bug for ES due to missing tokens.
- Fix Fog and Edge Mark feature availability when running legacy OpenGL. (Regressions from commit 0c7cb99 and commit 8b5ac56.)
- glDrawBuffer() now determines its own algorithm at runtime instead of at compile time. This change is being done to be consistent with all of the other Standard vs ES changes.
- Respect the draw buffers ordering rules that are unique to ES.
- Report the actual OpenGL variant being requested when initializing the renderer.
- Add oglrender_deinit() function pointer so that clients can properly handle the destruction of their associated context resources at the correct time.
- Certain internal OpenGL info is now assigned at run time instead of at compile time. This change now allows any of the OpenGL renderers to run side-by-side.
- Fix a potential unaligned access crashing bug in ES when clear images are to be rendered.
- Fix an ES bug where clear images would fail to render when MSAA is enabled.
- Fix a legacy OpenGL bug where toon table colors were not ignoring their alpha bit, according to GBATEK.
- These graphical glitches are resolved only when running 3.2 Core Profile with the GL_ARB_sample_shading extension, available on all modern GPUs.
- Do some minor optimizations to the Edge Mark and Fog shaders.
- Also fix an FBO attachment bug in legacy OpenGL that was introduced in commit 8b5ac56.
- The output framebuffers now bind their own FBOs rather than changing draw targets with glDrawBuffer().
- Rework the general FBO management.
- Legacy OpenGL now outputs native RGBA color if FBOs are supported. This should give a minor performance increase on older GPUs.
- The fixed-function pipeline can now flip the framebuffer on GPU. This greatly reduces the CPU usage when doing the final color conversion and gives a significant performance increase on ancient GPUs.
- There is no more need to switch to the working texture as the destination for the fog output. This change will be essential for future commits.
- The dual-source blending method has been obsoleted and removed.
- FBOs are no longer required for the fog feature, easing requirements for ancient GPUs.
- Ancient GPUs may see a small performance benefit due to shader simplification.
- Centralize all header includes into OGLDisplayOutput.h.
- Update all extension versions of OpenGL functions/tokens that would be core in OpenGL v1.5.
- Change the remaining FBO attachments that I missed the last time to use RGBA8 internal format.
- Fix picky ancient drivers that won't accept GL_RED as a texture internal format. It has been changed to GL_LUMINANCE.
- Fix picky ES drivers that demand that the external formats used in glTexImage2D() and glTexSubImage2D() be exactly the same.
- OpenGL ES doesn't support GL_UNSIGNED_SHORT_1_5_5_5_REV for texture data, and GL_UNSIGNED_SHORT_5_5_5_1 is incompatible with our data. So instead, all 16-bit data will be converted to 32-bit LE before uploading it via the textures, and such ES textures will now take GL_UNSIGNED_BYTE format.
- Remove the #include for EGL/egl.h in OGLRender.h, since EGL shouldn't exist at this level of the code stack.
- ColorspaceConvert5551To8888()
- ColorspaceConvert5551To6665()
- ColorspaceConvertBuffer5551To8888()
- ColorspaceConvertBuffer5551To6665()
- Also rename the existing 16-bit color conversion functions to help further distinguish the functions from one another.
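
For illustration, a scalar 16-bit to 32-bit conversion of the kind the 5551-to-8888 functions perform could look like the sketch below. The function names are hypothetical. The bit layout is an assumption (red in bits 0-4, green in bits 5-9, blue in bits 10-14, alpha in bit 15, with the output packed as R, G, B, A from the low byte up), and the real functions also have SIMD variants that are not shown.

```cpp
#include <cstddef>
#include <cstdint>

// Expands a 5-bit channel to 8 bits by replicating the top bits into the
// low bits, so that 0 maps to 0 and 31 maps to 255.
inline uint32_t expand5To8Sketch(uint32_t c5)
{
    return ((c5 << 3) | (c5 >> 2)) & 0xFFu;
}

// Hypothetical scalar conversion of one 16-bit 5551 color to 32-bit 8888.
inline uint32_t convert5551To8888Sketch(uint16_t src)
{
    const uint32_t r = expand5To8Sketch(src & 0x1Fu);
    const uint32_t g = expand5To8Sketch((src >> 5) & 0x1Fu);
    const uint32_t b = expand5To8Sketch((src >> 10) & 0x1Fu);
    const uint32_t a = (src & 0x8000u) ? 0xFFu : 0x00u; // 1-bit alpha
    return r | (g << 8) | (b << 16) | (a << 24);
}

// Buffer version: converts `count` pixels from src into dst.
inline void convertBuffer5551To8888Sketch(const uint16_t* src, uint32_t* dst, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = convert5551To8888Sketch(src[i]);
}
```

Converting on the CPU this way is what lets an ES texture be uploaded as plain GL_UNSIGNED_BYTE data.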
- glTexParameteri() with GL_TEXTURE_2D_MULTISAMPLE throws INVALID_ENUM when trying to assign a sampler-related state, so these calls have been removed.
- glMapBufferRange() throws GL_INVALID_OPERATION if the buffer size is 0, so check for this condition first.
- Any ES-specific error messages are now reported as "OpenGL ES" instead of "OpenGL".
- Explicitly declare the internal format of all textures and RBOs used as FBO attachments as "GL_RGBA8" instead of the generic "GL_RGBA". This may fix some extremely picky ES drivers that would throw GL_INVALID_ENUM if we don't use a sized internal format.
- Hard code the read pixels format as GL_RGBA and the read data type as GL_UNSIGNED_BYTE. It is meaningless to rely on GL_IMPLEMENTATION_COLOR_READ_FORMAT and GL_IMPLEMENTATION_COLOR_READ_TYPE since we are reading from our own FBOs that are hard coded to GL_RGBA8 format.
- Require that OpenGL ES contexts support the GL_OES_surfaceless_context extension, since we are doing offscreen rendering strictly on FBOs only. This extension has been around for a very long time, and so this serves as a kind of compatibility check.
- Fix a bug where _isShaderFixedLocationSupported wasn't being set to true, causing shaders to fail.
- OpenGL version checks now account for non-compliant ES drivers that contain text before the version number.
- Manually set the default read/draw buffers for all FBOs upon creation. We shouldn't need to do this, since they should always be set later, but just in case...
- Add missing GL_BGRA macro.
- Tidy up the OpenGL naming in some error strings.
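
One way a version check can tolerate text before the version number is to skip ahead to the first digit before parsing. The sketch below illustrates the idea with hypothetical names and is not the renderer's actual parsing code. It would be fooled by a prefix that itself contains a digit.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct GLVersionSketch {
    int  majorVersion;
    int  minorVersion;
    bool valid;
};

// Hypothetical parser: skips any leading non-digit text, then reads
// "<major>.<minor>" and ignores whatever follows.
inline GLVersionSketch parseGLVersionSketch(const std::string& s)
{
    GLVersionSketch result = {0, 0, false};
    std::size_t i = 0;

    // Skip vendor or API text that precedes the number.
    while (i < s.size() && !std::isdigit(static_cast<unsigned char>(s[i])))
        ++i;
    if (i == s.size())
        return result;

    int majorVersion = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
        majorVersion = majorVersion * 10 + (s[i] - '0');
        ++i;
    }

    // A '.' followed by at least one digit must come next.
    if (i >= s.size() || s[i] != '.')
        return result;
    ++i;
    if (i >= s.size() || !std::isdigit(static_cast<unsigned char>(s[i])))
        return result;

    int minorVersion = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
        minorVersion = minorVersion * 10 + (s[i] - '0');
        ++i;
    }

    result.majorVersion = majorVersion;
    result.minorVersion = minorVersion;
    result.valid = true;
    return result;
}
```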
- Add support for fixed locations in shaders for OpenGL 3.3 and later.
- The fog density table texture is now a 2D texture instead of a 1D texture.
- Fix a memory leak where the polygon state texture wasn't being deleted upon the destruction of the OpenGLRenderer object.
- PBO handling now works via glMapBufferRange() instead of glMapBuffer().
- Polygon states can now be uploaded using plain integer textures. 64k UBOs and TBOs are no longer required.
- Slightly change the names of attachment defines so that they can be distinguished from the native OpenGL attachment defines.
- The minimum version driver check now accounts for OpenGL ES.
- OpenGLRendererCreate() can now handle any possible OpenGL variant, as declared in the OpenGLVariantID enum.
- Framebuffer read backs may now be assigned their format and data type. (Legacy OpenGL assigns the format as GL_BGRA, while 3.2 Core Profile assigns the format as GL_RGBA.)
- All platform-specific header includes are now centralized in OGLRender.h.
- Remove all ARB and EXT versions of legacy functions (everything pre 3.0). Legacy functions now only reference their core versions.
- Do some minor code cleanup.
* gtk: make OSD scalable
* Scale save slot indicator (oops), make text outlines look smoother, use
larger font when not scaling
* Save and load HUD layout, prefer raster font on low resolution, select
vector font size close to raster one, make OSDCLASS::scale floating point
* Build fix
* Add reset HUD layout action, only require fontconfig if libagg is found.
* Try another font in case we could not locate monospace
* Detect screen bytes per pixel instead of hardcoding it, define
AGG2D_USE_VECTORFONTS if fontconfig is found.
* Different pixel formats are handled by different draw target
implementations
* [WIP] gtk: implement GPU scale factor feature
* Replace combobox with spin button, fix taking screenshot
* Fix distorted image, add some checks for scale factor value
* Make OSD at least properly visible
- At their own risk, this option allows the user to add any cheat from the database to any game that they want, regardless of any potential dangers that may arise from doing so. Use this option responsibly.
- This change is to help avoid misclicks on the button that may instantly wipe out the user's cheat list.
- The new Actions pop-up menu also exists to incorporate some new cheat list operations that will be coming very soon.
- Multiple cheat database files may now be opened simultaneously, each in their own individual windows.
- Cheat database files are now fully browsable.
- Game entries are now searchable by game title, serial, and CRC.
- Cheat entries are now viewed in a hierarchical layout, better representing the FAT format of the database entries.
- All cheats within a directory can now be selected or deselected in just a single click.
- Error handling is now more robust, informative, and nicer looking.
- Cheat database files are no longer assigned in DeSmuME Preferences; they are now opened through the "File > Open Cheat Database File" menu.
- Recent cheat database files are now saved, and can be quickly accessed through the "File > Open Recent Cheat Database File" menu.
- It is now possible to remove all cheats at once from the Cheat Manager's cheat list.
- Fix a bug where loading all database game entries from an encrypted database would result in reading gobbledygook.
- Fix a bug where calling CheatDBFile::LoadGameList() for all database game entries would always return 0 entries rather than the actual number of found entries.
- The file description is no longer limited to 16 characters.
- Folder notes are now included in the description strings of exported cheat items. These can be important for cheats that include operating instructions in their associated folder notes.
- Fix a bug where reading the last game entry of the database file would fail.
- Fix a potential bug where reading a game entry from an encrypted database file would fail if the initial entry data resides very close to a 512-byte boundary.
- Fix a bug where deleting a CHEATSEXPORT object without calling CHEATSEXPORT.close() would result in its associated file remaining open.
- Fix a bug where deleting a CHEATSEXPORT object without calling CHEATSEXPORT.close() would result in CHEATSEXPORT.cheats leaking memory.
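
The usual C++ remedy for this class of bug is to have the destructor perform the same cleanup as close(). The class below is a hypothetical stand-in that shows the pattern. It is not the CHEATSEXPORT implementation, and its members are invented for the sketch.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for an exporter that owns a file handle and a list.
class CheatExportSketch {
public:
    CheatExportSketch() : fp_(nullptr) {}

    // Destroying the object without calling close() still releases everything.
    ~CheatExportSketch() { close(); }

    // Copying would lead to a double fclose(), so forbid it.
    CheatExportSketch(const CheatExportSketch&) = delete;
    CheatExportSketch& operator=(const CheatExportSketch&) = delete;

    bool open(const char* path)
    {
        close(); // release any previously opened file first
        fp_ = std::fopen(path, "rb");
        return fp_ != nullptr;
    }

    // Safe to call any number of times.
    void close()
    {
        if (fp_ != nullptr) {
            std::fclose(fp_);
            fp_ = nullptr;
        }
        cheats_.clear();
    }

    bool isOpen() const { return fp_ != nullptr; }

private:
    std::FILE* fp_;
    std::vector<std::string> cheats_;
};
```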
- Add the following C++ classes: ClientCheatSearcher, ClientCheatDatabase
- Remove the following Obj-C classes: CocoaDSCheatSearch, CocoaDSCheatSearchParams
- Remove duplicate GUI code from EmuControllerDelegate.mm and preferencesWindowDelegate.mm
- All basic functionality for managing game session cheat items, the cheat database list, and cheat search are now managed through CocoaDSCheatManager.
- Add new ClientCheatList C++ class, further reducing dependence on Objective-C code.
- Making any changes to the cheat list or to any cheat items no longer requires the acquisition of an R/W lock.
- Add new ClientCheatItem C++ class to handle cheat items, greatly reducing dependence on Objective-C code.
- Remove a bunch of methods from CocoaDSCheatItem and CocoaDSCheatManager that were never used and are no longer planned to ever be used in the new code refactor.
- The Cheat Manager window may now be resized.
- The Action Replay code editor now uses Monaco 13 font instead of the system default font.
- The command for "Enable/Disable Cheats" has been renamed to "Enable/Disable Cheat System" to help clarify that the command affects the entire cheat system as a whole, as opposed to enabling/disabling individual cheat items.
- Some minor API changes were made, but only Windows and Cocoa were actually tested. Tried to make sure that Linux ports were updated to the new API, but haven't tested it.
- CommonSettings.GFX3D_TXTHack has been repurposed to switch between fixed-point math and float-based math.
- Fix various rendering bugs that were caused by a loss of Z precision introduced in commit 7751b59.
- In Pokemon Diamond/Pearl, the bug that caused random black dots to appear on the ground has been fixed.
- Also remove the unused NDSVertexf struct. There shall be only one representation of the NDS vertex data, and that shall be the fixed-point values of NDSVertex.
- Add SIMD-float32 data types, and also add macros to track SIMD data-type availability.
- Also fix some bugs where 3D would fail to render on big-endian systems. (Regression from commit a67e040.)
- To determine polygon facing, use GFX3D's CPoly.isPolyBackFacing instead of using GLSL's gl_FrontFacing. This eases OpenGL version requirements and improves older GPU compatibility a little.
- Also fix a bug where GPUs that support FBOs, but not shaders, were unable to read out their framebuffers properly.
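
Determining polygon facing on the CPU generally comes down to the sign of the projected polygon's area. The sketch below shows that test for a triangle. The names are hypothetical, and treating counter-clockwise winding (in a Y-up coordinate system) as front-facing is an assumption, not a statement about how CPoly.isPolyBackFacing is computed.

```cpp
struct Vec2Sketch {
    float x;
    float y;
};

// Twice the signed area of triangle (a, b, c). Positive when the vertices
// wind counter-clockwise in a Y-up coordinate system.
inline float signedArea2Sketch(const Vec2Sketch& a, const Vec2Sketch& b, const Vec2Sketch& c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Assumption for this sketch: counter-clockwise means front-facing.
inline bool isBackFacingSketch(const Vec2Sketch& a, const Vec2Sketch& b, const Vec2Sketch& c)
{
    return signedArea2Sketch(a, b, c) < 0.0f;
}
```

Because the result is a plain flag computed once per polygon, every renderer can consume it without needing gl_FrontFacing support in its shaders.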
- Specifically, viewport transformation, face calculation, and face culling are now handled in GFX3D, and are now standard behaviors for all 3D renderers. This reorganization makes more sense since the 3D renderers are primarily responsible for rasterization and framebuffer post-processing, rather than for processing geometry.
- As a positive side-effect, the OpenGL renderer gains a small performance improvement as well as better accuracy in face culling.
- Historically (ever since commit b1e4934 and commit d5bb6fd), VERTLIST_SIZE (based on POLYLIST_SIZE) could reference an index over 65535, but POLY.vertIndexes has always been an unsigned 16-bit value with a max value of 65535, making for a potential overflow. In practice, an overflow has never happened in the past 15 years because a hardware NDS has a limit of 6144 polygons (or 24576 vertices), and even a VERTLIST_SIZE of 80000 is far more than enough to accommodate that. Rather than increasing POLY.vertIndexes to 32-bit, it makes more sense to reduce POLYLIST_SIZE to 16383 to keep VERTLIST_SIZE below 65536. Even with this change, POLYLIST_SIZE and VERTLIST_SIZE should be more than enough in practice.
- Also, now that we're performing polygon clipping for all client 3D renderers (ever since commit e06d11f), we're finally accounting for the fact that the clipped polygon list size is larger than POLYLIST_SIZE. Client 3D renderers have been updated to now reflect this change and avoid theoretical overflow issues that have never actually happened in practice.
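
The size arithmetic can be checked at compile time. In the sketch below the constant names are hypothetical, and the factor of 4 vertices per polygon is inferred from the 6144 polygon / 24576 vertex figures quoted above.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

constexpr std::size_t kPolyListSizeSketch = 16383;
constexpr std::size_t kVertListSizeSketch = kPolyListSizeSketch * 4; // 65532

// True when every index into a list of `listSize` elements fits in a u16.
constexpr bool indicesFitInU16Sketch(std::size_t listSize)
{
    return listSize == 0 || (listSize - 1) <= std::numeric_limits<uint16_t>::max();
}

static_assert(indicesFitInU16Sketch(kVertListSizeSketch),
              "vertex indices must fit in an unsigned 16-bit value");

// The old vertex list size of 80000 could produce indices that do not fit.
static_assert(!indicesFitInU16Sketch(80000),
              "an 80000-entry list overflows 16-bit indices");
```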
- Since the shininess table is only ever used for vertex generation, it makes no sense for the table to be included for the rasterization step. The shininess table has been moved to the NDSGeometryEngine class, which is a class dedicated to just vertex and polygon generation. This reorganization seems more sensible.
- There are two significant behavior changes in this commit that will require further testing.
- Behavior change: Before, MTX_LOAD_4x4 and MTX_LOAD_4x3 commands would update the individual values in the current matrix for each command. Now, these commands will batch the values into a temporary matrix until the temp matrix is complete, and then copy the temp matrix into the current matrix. This now matches the batching behavior that the other matrix commands already do.
- Behavior change: Before, there was a single shared temporary multiplication matrix used to batch the values of incoming MTX_MULT_4x4, MTX_MULT_4x3, and MTX_MULT_3x3 commands, theoretically allowing these commands to be used interchangeably and overwrite values from previous commands until the last command made a completed multiplier matrix. Now, there are 3 separate temporary multiplier matrices, one for each of the MTX_MULT_* commands, which means that each command type must now complete its own multiplier matrix before it can perform a matrix multiply.
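
The new MTX_LOAD_4x4 batching behavior can be pictured with a small model: incoming values accumulate in a temporary matrix, and the current matrix changes only once all 16 have arrived. The struct and member names below are hypothetical and the sketch leaves out everything else the geometry engine does.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of batching MTX_LOAD_4x4 parameters.
struct MatrixLoaderSketch {
    std::array<int32_t, 16> current{};  // the matrix that rendering sees
    std::array<int32_t, 16> pending{};  // temporary matrix being filled
    std::size_t received = 0;

    // Feeds one command parameter. Returns true when this parameter
    // completed the temporary matrix and it was copied into `current`.
    bool pushLoad4x4(int32_t value)
    {
        pending[received++] = value;
        if (received < pending.size())
            return false;

        current = pending;
        received = 0;
        return true;
    }
};
```

Under the second behavior change, each MTX_MULT_* command type would get its own temporary matrix and counter of this kind rather than sharing one.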
- The integer-based Box Test should be just as good as the old float-based one, as tested by "American Girl: Julie Finds a Way". Of course, there are still bugs compared to a hardware NDS, but we haven't transitioned to rendering with integer-based vertices yet to make a meaningful test possible.
- Rename a bunch of variables to better reflect their intended usage.
- Add new data types for organizing 3D vectors and coordinates.
- C functions that are called in response to 3D commands now follow the exact same pattern: “static void gfx3d_glFuncName(const u32 param)”
- Be super explicit about the usage of numeric data types and their typecasts to improve their code visibility.
- Remove implementation-specific ambiguity of right bit-shifting signed numerics (Logical-Shift-Right vs Arithmetic-Shift-Right) in the following functions: gfx3d_glLightDirection_cache(), gfx3d_glNormal(), gfx3d_glTexCoord(), gfx3d_glVertex16b(), gfx3d_glVertex10b(), gfx3d_glVertex3_cord(), gfx3d_glVertex_rel(), gfx3d_glVecTest().
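
Before C++20, right-shifting a negative signed integer is implementation-defined, so code that relies on it for sign extension can behave differently across compilers. The helpers below sketch two common ways to avoid that dependence. They are hypothetical and are not taken from the listed gfx3d functions.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low `bits` bits of `value` (1 <= bits <= 31) using only
// unsigned masking and a subtraction, with no shift of a negative number.
inline int32_t signExtendSketch(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    const uint32_t mask    = (1u << bits) - 1u;
    const uint32_t signBit = 1u << (bits - 1);
    const uint32_t field   = value & mask;
    return static_cast<int32_t>(field ^ signBit) - static_cast<int32_t>(signBit);
}

// Arithmetic shift right that only ever shifts non-negative values.
// For negative v, ~v is non-negative, and ~(~v >> n) equals floor(v / 2^n).
inline int32_t arithmeticShiftRightSketch(int32_t v, unsigned shift)
{
    assert(shift < 32);
    return (v >= 0) ? (v >> shift) : ~((~v) >> shift);
}
```

For example, a 10-bit packed field holding 0x3FF sign-extends to -1 regardless of how the compiler implements `>>` on signed values.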
- Also remove Viewer3D_State.indexList. This member is obsolete since Viewer3D_State.gList.clippedPolyList is generated in the same order described by indexList.
- Encapsulate and rename some more lists to make their intended purpose more descriptive.
- Most importantly, refactor the GFX3D_Clipper::ClipPoly() method into the standalone GFX3D_GenerateClippedPoly() function, which drops all class member dependencies and is much more straightforward to use.
- Remove the GFX3D_Clipper class. It has been slowly gutted over the years, but the loss of the ClipPoly() method makes the class obsolete, putting the final nail in its coffin.
- Using the new GFX3D_GenerateClippedPoly() function, gfx3d_PerformClipping() operates on the unsorted clipped polygon list directly. This means that the polygon clipping step requires one less buffer copy.
- Clipped polygons no longer retain direct pointers to POLY structs, instead using their index member to reference a POLY struct at a POLY array location. All 3D renderers have been updated to reflect this change.
- Rename a bunch of variables that reference POLY, CPoly, and VERT structs to clearly differentiate them between raw NDS data and our own internally processed data.
- Fix a potential bug in GFX3D_GenerateRenderLists() where sorting clipped polygons would reorder their polygon indices without reordering their other data members along with the indices, causing the data members to desync. OpenGL rendering was immune to this bug, but SoftRasterizer might have possibly seen it. In any case, this fix is the correct behavior.
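
The desync described above is avoided when whole clipped-polygon records are sorted, so that each index travels with the rest of its data. The sketch below uses hypothetical struct layouts and a made-up sort key to show index-based references combined with a whole-record sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical raw polygon with some value used for ordering.
struct PolySketch {
    int sortKey;
};

// Hypothetical clipped polygon: refers to its source polygon by array
// index instead of by pointer, and carries its own extra data.
struct ClippedPolySketch {
    std::size_t index;
    int         vertexCount;
};

// Sorts whole records, so `index` and `vertexCount` always stay together.
inline void sortClippedPolysSketch(std::vector<ClippedPolySketch>& clipped,
                                   const std::vector<PolySketch>& polys)
{
    std::stable_sort(clipped.begin(), clipped.end(),
        [&polys](const ClippedPolySketch& a, const ClippedPolySketch& b) {
            return polys[a.index].sortKey < polys[b.index].sortKey;
        });
}
```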
- Problems only occurred when LOADING a save state. However, SAVING a save state on commit 5426509e should be okay, so such save states should not be broken.
- Viewports are now processed on VIEWPORT register write instead of being processed at render time.
- CLEAR_DEPTH, CLRIMAGE_OFFSET, EDGE_COLOR, FOG_TABLE, and TOON_TABLE register writes are now handled more consistently.
- The fogDensityTable check for force-drawing clear images in gfx3d_VBlankEndSignal() has been removed. The changes made in commit 8438a5a6, together with this commit, will always cause this check to fail. Therefore, this check is now obsolete.
- Change a bunch of GFX3D-related structs from C++ style constructed structs into C-style POD structs.
- Better account for UltraSPARC's unique memory page size.
- malloc_alignedCacheLine() no longer returns 16-byte aligned memory if the architecture is neither 32-bit nor 64-bit. Now, the function only returns 64-byte alignment for 64-bit architectures OR 32-byte alignment for 32-bit architectures.
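
A portable sketch of the alignment rule: pick 64 or 32 bytes from the pointer width, then over-allocate and align by hand. The function names are hypothetical, using sizeof(void*) to tell 64-bit from 32-bit is an assumption, and the real malloc_alignedCacheLine() may rely on platform allocators instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// 64-byte alignment on 64-bit targets, 32-byte alignment otherwise.
constexpr std::size_t cacheLineAlignmentSketch()
{
    return (sizeof(void*) >= 8) ? 64 : 32;
}

// Over-allocates, aligns the returned address, and stores the original
// malloc() pointer just below the aligned block so it can be freed later.
inline void* mallocAlignedCacheLineSketch(std::size_t length)
{
    const std::size_t alignment = cacheLineAlignmentSketch();
    void* raw = std::malloc(length + alignment + sizeof(void*));
    if (raw == nullptr)
        return nullptr;

    const std::uintptr_t start   = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    const std::uintptr_t aligned = (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Must be paired with mallocAlignedCacheLineSketch().
inline void freeAlignedSketch(void* p)
{
    if (p != nullptr)
        std::free(reinterpret_cast<void**>(p)[-1]);
}
```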
- The new code works by pre-swapping big-endian words on disp_fifo.buf write, rather than swapping the big-endian words during disp_fifo.buf read.
- There is a behavior change here. Before, 8-bit and 16-bit writes to disp_fifo.buf would increment disp_fifo.tail. Now, 8-bit and 16-bit writes only increment disp_fifo.tail when the most significant bit within the FIFO value's 32-bit boundary is written to.
- Behavior is unchanged when doing 32-bit writes. In practice, the rare games that use display FIFO have only ever done 32-bit writes, so this scenario is well tested.
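
The new tail-increment rule can be modeled as follows. This is a simplified sketch with hypothetical names and a made-up buffer size. It assumes the byte offset within the 32-bit register selects the lane being written, with offset 3 holding the most significant byte, and it does not model the big-endian pre-swapping.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDispFifoSizeSketch = 16; // hypothetical size

struct DispFifoSketch {
    std::array<uint32_t, kDispFifoSizeSketch> buf{};
    std::size_t tail = 0;

    // 32-bit writes always complete a FIFO entry.
    void write32(uint32_t v)
    {
        buf[tail] = v;
        advance();
    }

    // 16-bit writes complete the entry only when the upper half is written.
    void write16(unsigned byteOffset, uint16_t v)
    {
        const unsigned shift = (byteOffset & 2u) * 8u;
        buf[tail] = (buf[tail] & ~(0xFFFFu << shift)) | (static_cast<uint32_t>(v) << shift);
        if ((byteOffset & 2u) != 0)
            advance();
    }

    // 8-bit writes complete the entry only when the top byte is written.
    void write8(unsigned byteOffset, uint8_t v)
    {
        const unsigned shift = (byteOffset & 3u) * 8u;
        buf[tail] = (buf[tail] & ~(0xFFu << shift)) | (static_cast<uint32_t>(v) << shift);
        if ((byteOffset & 3u) == 3u)
            advance();
    }

private:
    void advance() { tail = (tail + 1) % kDispFifoSizeSketch; }
};
```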
The entry index gets converted to a (temporary) string, from which a
pointer to internal data is taken and only consumed outside the loop,
where the parent variable no longer exists.
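
A minimal sketch of this dangling-pointer pattern and one way to avoid it. The function name is hypothetical and the loop is reduced to the essentials. The idea is to keep an owning std::string alive for as long as its characters are needed.

```cpp
#include <string>

// Buggy pattern, shown only as a comment:
//
//   const char* key = nullptr;
//   for (int i = 0; i < entryCount; ++i) {
//       std::string tmp = std::to_string(i);
//       key = tmp.c_str();   // points into tmp's storage
//   }                        // tmp is destroyed here
//   use(key);                // dangling pointer
//
// Fixed pattern: the string that owns the characters outlives the loop.
inline std::string lastEntryKeySketch(int entryCount)
{
    std::string key;
    for (int i = 0; i < entryCount; ++i) {
        key = std::to_string(i); // copy the characters into the owner
    }
    return key; // still valid: the caller receives its own string
}
```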
- Fixes a mismatched register warning in arm_enable_runfast_mode() when compiling for AArch64.
- Fix compiling check_arm_cpu_feature() on non-ARM architectures by being super explicit and pedantic about checking for __ARM_ARCH; none of this compiler-assumes-a-macro-equals-zero-if-undefined stuff.
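
The pedantic form of such a check tests that the macro exists before comparing its value. The sketch below shows the shape of it. The macro and function names are hypothetical, and the threshold of 7 is only an example.

```cpp
// Without the defined() test, an undefined __ARM_ARCH would silently be
// treated as 0 inside #if, which some compilers warn about.
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
#define SKETCH_HAS_ARMV7_OR_LATER 1
#else
#define SKETCH_HAS_ARMV7_OR_LATER 0
#endif

// Hypothetical helper reporting what the build was compiled for.
inline bool isArmV7OrLaterBuildSketch()
{
    return SKETCH_HAS_ARMV7_OR_LATER != 0;
}
```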
Catmull-Rom can give outputs greater than 16-bit, so we must use 15-bit precision. Also, make sure to use floor() to force a round-down regardless of host rounding behaviour.
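
To see why the overshoot matters, the sketch below evaluates a standard Catmull-Rom spline on full-scale 16-bit samples and then on the same samples halved to a 15-bit range. The function names are hypothetical. Interpreting "15-bit precision" as halving the inputs is an assumption made for this illustration and may not match how the fix is actually implemented.

```cpp
#include <cmath>
#include <cstdint>

// Standard Catmull-Rom interpolation between p1 and p2, for t in [0, 1].
inline double catmullRomSketch(double p0, double p1, double p2, double p3, double t)
{
    return 0.5 * ((2.0 * p1)
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t * t * t);
}

// Assumption: inputs are halved to a 15-bit range before interpolating, so
// that the spline's overshoot still fits in 16 bits. floor() always rounds
// toward negative infinity, independent of the host's rounding mode.
inline int32_t interpolate15BitSketch(int16_t s0, int16_t s1, int16_t s2, int16_t s3, double t)
{
    const double r = catmullRomSketch(s0 / 2.0, s1 / 2.0, s2 / 2.0, s3 / 2.0, t);
    return static_cast<int32_t>(std::floor(r));
}
```

With the samples (-32768, 32767, 32767, -32768) at t = 0.5, the full-scale spline evaluates to 40958.875, which is outside the signed 16-bit range, while the halved version stays inside it.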
- This commit doesn't actually do anything, but it is the "proper" way for Apple OSes to deal with file paths that interact with lower-level C file functions.
- This change results in a small, yet measurable, performance improvement.
- Note that this change has the side-effect of enabling saturation logic for the following functions: MatrixMultVec3x3(), MatrixTranslate(), MatrixScale(). This is a change in their behavior, since these functions did not perform saturation logic before. This will need additional testing.
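
Saturation logic in a fixed-point multiply typically means clamping the wide intermediate result instead of letting it wrap. The sketch below assumes values with 12 fractional bits and uses hypothetical function names. It is not the code from the matrix functions named above. It also divides rather than shifts, so negative results round toward zero.

```cpp
#include <cstdint>
#include <limits>

// Clamps a 64-bit value into the signed 32-bit range.
inline int32_t saturateToS32Sketch(int64_t v)
{
    if (v > std::numeric_limits<int32_t>::max())
        return std::numeric_limits<int32_t>::max();
    if (v < std::numeric_limits<int32_t>::min())
        return std::numeric_limits<int32_t>::min();
    return static_cast<int32_t>(v);
}

// Multiplies two fixed-point values with 12 fractional bits, saturating
// the result instead of wrapping on overflow.
inline int32_t fixedMulSaturateSketch(int32_t a, int32_t b)
{
    const int64_t product = static_cast<int64_t>(a) * static_cast<int64_t>(b);
    return saturateToS32Sketch(product / 4096);
}
```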
The only time we need fatDir to be not-empty is when we are using a custom FAT directory (i.e. `slot1_R4_path_type/sameAsRom == false`). When we are using the ROM path (`slot1_R4_path_type/sameAsRom == true`), fatDir is not even used, and we should be able to load the FAT image.
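
Expressed as a predicate, the rule reads roughly as follows. The function name and parameters are hypothetical stand-ins for the real setting and path variable.

```cpp
#include <string>

// fatDir only matters when a custom FAT directory is in use.
inline bool canLoadFatImageSketch(bool sameAsRom, const std::string& fatDir)
{
    if (sameAsRom)
        return true;        // ROM path is used, fatDir is ignored
    return !fatDir.empty(); // custom directory must actually be set
}
```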
- Replace all references to "OS X" with "macOS".
- Make the example bug report reflect a user running Monterey instead of Mojave.
- Simplify the notes regarding a Penryn-era Core 2 Duo as the minimum recommended processor. I forgot that release builds drop the CPU instruction set from SSE4.1 to SSSE3, and so a 2.4GHz Penryn with SSSE3 isn't that much faster than a 2.4GHz Santa Rosa.
- As a positive side-effect, this fix also allows mipmapped HUD font rendering to work on the OpenEmu plug-in, so that capability has now been enabled.
- Now that we can use OpenGL, we can increase the 3D render scaling to up to 8x for machines that can handle it.
- Also add the Fragment Sampling Hack option for SoftRasterizer for certain games that need it to 'fix' texture rendering.
- Also add the Smooth Textures option for OpenGL for games that can benefit from it.
- So apparently, the buffers used to upload the font texture data must remain in memory for the entire lifetime of the texture when running on Monterey. It is a mystery why the OpenEmu plug-in requires this for Monterey, as this is not required for older macOS versions, nor is it required in any way on the standalone app.
- Also remove the copy of the HUD font path. Since we're now copying the font file itself into memory, retaining a copy of the font path is no longer necessary.
- Brings back compatibility for OpenEmu v1.0.4. (This was done because v0.9.11 could run on it. Note that the advanced display features still require the latest OpenEmu version.)
- Also fixes issues with HUD font rendering.
- Includes native binaries for Intel 64-bit, Intel 64-bit Haswell, and ARM64.
- The Dynamic Recompiler engine for ARM64 processors is now enabled by default, greatly increasing performance on Apple Silicon Macs.
- Now includes the full suite of dual-screen display layouts, screen rotations, display gap options, and so on.
- The Heads Up Display is now included.
- Can use the latest features of OpenEmu v2.3.3 running on macOS Monterey, but is also backwards compatible with OpenEmu v2.0.9.1 running on macOS El Capitan.
- Also includes general stability improvements.
- If you have an ARM processor, expect anywhere between 10% to 50% improvement to CPU emulation performance for most games.
- But it isn't quite ready for prime time just yet. There is a crashing bug related to munmap() at emit_core.cpp:93 that causes the app to crash when resetting the emulator after a game has already been run.
- The Cocoa port now enables the new ARM JIT engine for Macs running Apple Silicon processors, but only for dev+ builds. This is due to the crashing bug noted above.
- There should be no functional changes in this commit. We're simply mirroring the latest SDK for changes to come.
- Fix all deprecations, with the exception of [OEGameCore changeDisplayMode]. This one will be quite involved.
- The features contained therein aren't ready for prime time, and so they are being pushed out to the next release.
- These menu options are still accessible on dev+ builds.
- This change basically returns us to using 1D textures for color LUTs instead of using uniform arrays. 1D textures seem to be more compatible for most older hardware.
- Unfortunately, while most older GPUs will work better with this change, this may break the OpenGL renderer on even older GPUs, such as the GeForce 7800 GT (circa 2005).
- I'm estimating that more old GPUs benefit from this change than not, and so using 1D textures is what will stand. The vast majority of users will be using hardware newer than this, and so anyone who can't run OpenGL renderer in 2022 can just switch to SoftRasterizer.
- This bug was found by enabling FIXED_POINT_MATH_FUNCTIONS_USE_ACCUMULATOR_SATURATE. Since this macro is disabled by default, this commit should not affect any normal operation.
- Apparently, there are some Macs that have Intel Haswell CPUs that can run macOS versions earlier than El Capitan, so Metal.framework must be weak-linked for the Final Release.
- All x86_64h Debug builds and all Apple Silicon builds still retain strong linking with Metal.framework.
- This is probably super paranoid and completely overkill, but it makes me feel better to do this. Now there is absolute certainty that nothing can disrupt the drawable order in between rendering and presentation. Microstuttering from mis-ordered drawables can no longer happen.
- Fix a bug where running DeSmuME on a Mac with a non-Haswell 64-bit Intel CPU would fail to switch the GUI icons into Dark Mode, despite the user running Mojave or later.
- Fix a bug where the GUI icons would occasionally fail to correctly switch between Light Mode and Dark Mode if the user changed the system appearance in System Preferences while running Mojave or Catalina.
- Add a new menu option in "Tools > App Appearance Mode" to manually force DeSmuME's app appearance to reflect Light Mode or Dark Mode. (Only available on dev+ builds.)
- Since the executable can now contain 5 binary slices, having the detailed info available can provide extra insight on the user's runtime environment.
- Hardware microphone authorization is now requested on app startup instead of when a ROM is loaded.
- CoreAudioInput is now better at handling situations when the hardware mic is not available, fixing some bugs with the mic level indicator.
- Add some helpful tooltips in the Microphone Settings panel when the hardware mic is not authorized.
- Add a new idle mic icon to denote when the hardware mic is not available. (The gray color should denote a 'software only' status.)
- Further brighten up the microphone icon for when software samples are active to help with visibility when running Dark Mode.
- Access to the matrix stacks has been simplified to the point where MatrixStackInit() and MatrixStackGet() are now obsolete. These functions have been removed.
- Also adds the AVFoundation framework to the "Xcode (Latest).xcodeproj" file in preparation of new UI related to dealing with macOS Mojave's microphone permissions.
- Also do some minor bug fixes with some floating-point functions.
- Also remove __vec4_dotproduct_vec4_fixed_SSE4() since the function didn't work anyways, and since we now have __vec4_dotproduct_vec4_fixed_NEON() to use as an actual working reference.
- This refactor was done to support future additions of SIMD functions using ISAs other than SSE / SSE4.1.
- Add support for fixed-point math functions using accumulators that saturate, following the GEM_TransformVertex() function. This feature requires new testing, and has been disabled for now in order to retain the previous behavior.
- Remove the obsolete and unused functions _MatrixMultVec4x4_NoSIMD() and vector_fix2float(). If non-SIMD testing is required, it should be easy enough to comment out the SIMD code paths in the appropriate function in favor of the plain C code path.
- The only OpenGL version with a revision number is v1.2.1, but the Cocoa port will always use v2.0 or higher. So let's remove the revision number to make things look cleaner.
- Also change the tags "DESMUME RUNTIME INFORMATION" to "DESMUME TROUBLESHOOTING INFORMATION", which is more explanatory when the information is copy/pasted into whatever text field it appears in.
- I solved it by simply reloading the entire outline view instead of picking and choosing specific items to reload. Due to the relatively small number of items in the outline view (fewer than 1000), reloading the entire outline view is still very fast, even on a PowerPC Mac.
- MainMenu.strings files can still be generated from builds from the "DeSmuME (XCode 3).xcodeproj" file, which does result in a cleaner file to begin with.
- Add line for the Video Output Engine. Backend type (OpenGL or Metal), version, and renderer information are all reported.
- Remove section breaking dashes, as this causes GitHub's comment parser to reformat the runtime info in unpredictable ways.
- To compensate for removing the section breaking dashes, add delimiters for the beginning and the ending of the runtime information.
- If the Active Cheat Count is 0, then just report "NO" for Cheats, as this is functionally equivalent, but less confusing to read.
- Commit 9ccc791 was, more or less, a straight port of the SSE2 code, making it less than ideal. This updated version uses more NEON-only instructions to further improve performance.
- These changes shouldn't change existing functionality, but are more to document what the code should actually be doing. Regardless, these changes are truly correct.
- Note that NEON support is assuming the A64 instruction set. But if there is enough user demand for running the A32 instruction set, and if it is feasible to backport the NEON code to A32, then this may be explored at a later date. But for now, we are sticking with A64.
- Apparently, vec_perm() on ppc32 assumes that vec_perm() will always use vectors with 8-bit elements. However, ppc64 vec_perm() can use elements of different sizes, and so we need to typecast every single case of this so that the correct vec_perm() is called on ppc64.
- Apple Silicon builds target macOS 11.0 SDK, so almost all deprecation warnings associated with this have also been fixed. (The remaining deprecation warnings in preferencesWindowDelegate.mm still need to be fixed in some other way.)
- Intel 64-bit developer builds now require macOS 10.12 SDK (Xcode 8 or later). Of note, this produces faster SSE4.1 code by default, but also requires a Penryn-era Core2Duo CPU or later. (Note that Intel 64-bit non-Haswell in release builds still use SSSE3.)
- Improves overall stability when running DeSmuME on macOS 10.5 Leopard.
- In addition, release builds running Intel 64-bit non-Haswell no longer require macOS 10.7 Lion. They can run on Leopard again!
- Finally fix some GUI issues in the About box when running Dark Mode on macOS 10.14 Mojave or later.
* Interface: Added a function to set the ARM9 next instruction
* Comment about potential JIT issues
* Interface: Made setting next instructions no-op with JIT
it's highly annoying to get the red X for any push or pull request
because mac os x interface build is broken since december.
fix it by installing glib which meson complains about.
Since NDS_Init takes no arguments, it should not hurt to call it early
in the gtk frontend, too.
This fixes the segfault in issue #415, although I could not get it to
run a r4 kernel in a quick test.
According to RFC 2396, the single quote character (') is allowed in uri
strings and is not escaped by gtk, so the action string constructed for
the recent files menu must be quoted with " instead of '.
This fixes issue #437
The data returned by Mic_ReadSample() must be transferred in 2 parts.
This must be done by every microphone driver.
I tested with Lunar: Dragon Song to find which values are considered loud,
since you can run away by screaming/blowing into the microphone. Values
from 33-223 don't trigger the escape, values from 0-32 and from 224-255
trigger an escape attempt. Thus 128 would be considered silence.
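The observation above can be read as a threshold test on an unsigned 8-bit sample centred on 128. A minimal sketch, assuming exactly the ranges measured in Lunar: Dragon Song; the helper name and constant are hypothetical and not part of DeSmuME:

```cpp
#include <cstdint>

// Hypothetical helper, not part of DeSmuME. It classifies an unsigned 8-bit
// microphone sample using the thresholds observed in Lunar: Dragon Song:
// 33..223 did not trigger the escape, 0..32 and 224..255 did.
constexpr std::uint8_t kMicSilence = 128;  // midpoint, treated as silence

constexpr bool mic_sample_is_loud(std::uint8_t sample)
{
    return sample <= 32 || sample >= 224;
}
```

Other games may use different thresholds, so this only documents the one measurement.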
the offsets in the dump file are as follows (code snippet taken
from the dump function and annotated with locations from gbatek):
0x2000000
fp.fseek(0x000000,SEEK_SET); fp.fwrite(MMU.MAIN_MEM,0x800000); //arm9 main mem (8192K)
4 DTCM 027C0000h 16KB - - - R/W
or 0x800000 or 0xb000000. DTCM location in real NDS varies, as the program can select where
it's mapped to, apparently.
fp.fseek(0x900000,SEEK_SET); fp.fwrite(MMU.ARM9_DTCM,0x4000); //arm9 DTCM (16K)
fp.fseek(0xA00000,SEEK_SET); fp.fwrite(MMU.ARM9_ITCM,0x8000); //arm9 ITCM (32K)
fp.fseek(0xB00000,SEEK_SET); fp.fwrite(MMU.ARM9_LCD,0xA4000); //LCD mem 656K
0 I/O and VRAM 04000000h 64MB - - R/W R/W
fp.fseek(0xC00000,SEEK_SET); fp.fwrite(MMU.ARM9_VMEM,0x800); //OAM
0xc000bc
fp.fseek(0xD00000,SEEK_SET); fp.fwrite(MMU.ARM7_ERAM,0x10000); //arm7 WRAM (64K)
fp.fseek(0xE00000,SEEK_SET); fp.fwrite(MMU.ARM7_WIRAM,0x10000); //arm7 wifi RAM ?
fp.fseek(0xF00000,SEEK_SET); fp.fwrite(MMU.SWIRAM,0x8000); //arm9/arm7 shared WRAM (32KB)
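For tools that read the dump, the fseek()/fwrite() pairs above can be transcribed into a lookup table. This is only a sketch: the struct, array, and function names are made up for illustration, and the offsets and sizes are copied from the listing rather than verified against the dump function.

```cpp
#include <cstdint>

// Illustrative names only. Offsets and sizes are transcribed from the
// fseek()/fwrite() pairs in the listing above.
struct DumpRegion {
    const char*   name;
    std::uint32_t fileOffset;  // argument passed to fseek()
    std::uint32_t size;        // byte count passed to fwrite()
};

constexpr DumpRegion kDumpRegions[] = {
    {"ARM9 main memory",      0x000000, 0x800000},
    {"ARM9 DTCM",             0x900000, 0x4000},
    {"ARM9 ITCM",             0xA00000, 0x8000},
    {"LCD memory",            0xB00000, 0xA4000},
    {"OAM",                   0xC00000, 0x800},
    {"ARM7 WRAM",             0xD00000, 0x10000},
    {"ARM7 wifi RAM",         0xE00000, 0x10000},
    {"ARM9/ARM7 shared WRAM", 0xF00000, 0x8000},
};

// Returns the region that contains the given file offset, or nullptr when the
// offset falls into one of the unused gaps between regions.
constexpr const DumpRegion* find_dump_region(std::uint32_t offset)
{
    for (const DumpRegion& r : kDumpRegions) {
        if (offset >= r.fileOffset && offset - r.fileOffset < r.size)
            return &r;
    }
    return nullptr;
}
```

With this table, a file offset such as 0xc000bc resolves to the OAM region.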
this is necessary to load saves from other devices or emulators,
as desmume uses its own incompatible format.
it works for importing .sav files from flashcarts, but only if the
file extension is .sav or .SAV - if using .dsv desmume guesses it's
of its own type and looks for a specific string, then fails.
the right code was taken from windows/importSave.cpp - it might make
sense to add the export item at some point too, however that will
probably require some more effort.
A call to the `SDL_DestroyTexture` method was forgotten, resulting in all the textures created to render the game's frames being stored indefinitely until running out of memory
This is a temporary fix so anyone using the interface (mainly `py-desmume` users) can have it working correctly again. Next step is to mirror the changes from POSIX CLI's main.cpp `Draw` method, if stable
- simplified ring-buffer mechanism
- added proper locking for all variables accessed by 2 different threads
- fixed oob writes that occasionally crashed SDL's "Alsa Hotplug thread"
- make buffer sufficiently large to prebuffer enough samples to survive
the occasional SDL_Delay(1) in the frontend.
- fixed ignoring volume set by the SPU.
- improved speed and robustness by not calling malloc over and over in
SDL callback, and copying directly to the SDL buffer if volume is max
(no need to use mixer to lower the volume in that case).
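The combination of a preallocated ring buffer and locking described above could look roughly like the following. This is a sketch under assumptions, not the code in sndsdl.cpp: the class name is hypothetical, a single std::mutex stands in for whatever locking the frontend uses, and underruns are padded with zero bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical ring buffer shared by the emulator thread (writer) and the
// audio callback (reader). Storage is allocated once, so the callback never
// calls malloc, and every access to the shared indices is done under the lock.
class SampleRing {
public:
    explicit SampleRing(std::size_t capacity) : buf_(capacity)
    {
        assert(capacity > 0);
    }

    // Writer side. Stores as much as fits and returns the number of bytes kept.
    std::size_t write(const std::uint8_t* src, std::size_t len)
    {
        std::lock_guard<std::mutex> lock(mtx_);
        const std::size_t n = std::min(len, buf_.size() - used_);
        for (std::size_t i = 0; i < n; ++i)
            buf_[(head_ + used_ + i) % buf_.size()] = src[i];
        used_ += n;
        return n;
    }

    // Reader side. Always fills exactly `len` bytes of dst: buffered data
    // first, then zero padding on underrun. Returns the number of real bytes.
    std::size_t read(std::uint8_t* dst, std::size_t len)
    {
        std::lock_guard<std::mutex> lock(mtx_);
        const std::size_t n = std::min(len, used_);
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = buf_[(head_ + i) % buf_.size()];
        std::fill(dst + n, dst + len, std::uint8_t{0});
        head_ = (head_ + n) % buf_.size();
        used_ -= n;
        return n;
    }

private:
    std::mutex mtx_;
    std::vector<std::uint8_t> buf_;
    std::size_t head_ = 0;  // index of the oldest buffered byte
    std::size_t used_ = 0;  // number of buffered bytes
};
```

Bounding both copies by the buffer size and the requested length is what rules out out-of-bounds writes in this sketch.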
the new command line parameter --scale allows scaling the window by
a floating point factor. SDL2 stretches it in hardware to the desired
size, which makes the scaled window run at almost identical speed to
1x scale.
1) the float format displayed like 50.123456789123456 wouldn't fit
into the window title bar, and
2) most likely not into the char buffer of length 20, of which half
was already used for the desmume string.
additionally:
- don't allocate memory for surfaces and textures over and over
- use one texture for each NDS screen - this makes it easy to
add support for horizontal screen layout.
the command line option existed once, but was turned off when a
new generic commandline parser class was introduced. the entire
array in main.cpp using custom commandline options is currently
unused.
there were 2 logical issues which caused reproducible misbehaviour.
for example when starting up pokemon soulsilver, one can click away
the intro, but it's not possible to click on the "load savegame"
icon.
the issues were:
1) failure to record whether the down event has been
passed to the emulator before abandoning it and turning it into
a click event (on a fast click, both events would happen during
the same SDL_PollEvent loop), and
2) mouse coordinates were discarded unless the mouse down
event was registered. that means if the down and up events happen
on the exact same coordinate, the .x and .y of the mouse weren't
updated at all.
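One way to picture the two fixes is a small state machine that defers the release until the emulator has seen the press, and that stores coordinates on every event. This is a hedged sketch with hypothetical names, not the frontend's actual code:

```cpp
#include <cassert>

// Hypothetical touch state, loosely modelled on the two fixes above.
struct TouchState {
    int  x = 0;
    int  y = 0;
    bool down = false;      // touch is (still) reported as pressed
    bool click = false;     // release arrived before the press was consumed
    bool downSeen = false;  // the emulator has observed the press
};

void on_mouse_down(TouchState& t, int x, int y)
{
    t.x = x;                // fix 2: coordinates are always recorded
    t.y = y;
    t.down = true;
    t.click = false;
    t.downSeen = false;
}

void on_mouse_up(TouchState& t, int x, int y)
{
    t.x = x;
    t.y = y;
    if (t.downSeen)
        t.down = false;     // press was already delivered, release right away
    else
        t.click = true;     // fix 1: keep the press alive for one frame
}

// Called once per emulated frame after all pending events were polled.
// Returns whether the touch screen counts as pressed for this frame.
bool sample_touch(TouchState& t)
{
    const bool pressed = t.down;
    if (t.down) {
        t.downSeen = true;
        if (t.click) {      // deferred release takes effect on the next frame
            t.down = false;
            t.click = false;
        }
    }
    return pressed;
}
```

A fast click whose down and up events arrive in the same polling pass still produces one pressed frame, followed by a released frame.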
this is probably helpful for frontends other than cli that have to repaint
and react on events in the user interface, so they can set a timeout like
100 ms, or simply poll whether the stub is active using timeout 0.
the emulator thread was consuming 100% cpu even when the debugger was
active and execution paused.
a second pipe was added to gdb stub, which allows communication in
direction stub -> emulator/frontend, and also to infinitely block
in the frontend until the debugger returns control, for example
by typing "c" (continue) in gdb.
the other frontends use an inefficient method of running usleep(1000)
or similar in a loop, which will cause high cpu usage too, albeit not
a full 100% but more like 10-20%.
in order not to fill up the pipe with data for frontends that don't use
this mechanism, the functionality needs to be explicitly enabled.
(see functions added to gdbstub.h)
the functions added could in theory also be used to communicate
other data to the frontend, and optimally even replace all the locking
between the 2 sides.
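On the frontend side, waiting on such a pipe with a timeout can be done with poll(2). The helper below is a sketch only: its name is hypothetical (the real functions are the ones added to gdbstub.h), and it assumes the frontend already holds the read end of the stub-to-frontend pipe.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper. Waits until the read end of a pipe has data or the
// timeout expires. timeout_ms follows poll(2): 0 returns immediately and
// -1 blocks until data arrives. Errors (including EINTR) report "no data".
bool wait_for_stub_message(int read_fd, int timeout_ms)
{
    assert(read_fd >= 0);
    struct pollfd pfd;
    pfd.fd = read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    const int rc = poll(&pfd, 1, timeout_ms);
    return rc > 0 && (pfd.revents & POLLIN) != 0;
}
```

A GUI frontend could call this with a timeout of around 100 ms between repaints, while a CLI frontend could pass -1 and sleep without burning CPU until the debugger hands control back.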
- It's a regression from commit 4578728. I'm suspecting that this particular buffer is to be read as 32-bit since all of the other Linux frontends explicitly used 16-bit except for this one.
- Added some additional comments so that I'm not tempted to change the native line tracking paradigm ever again.
- Do some refactoring to make GPUEngineBase::_targetDisplay handle more buffer associations itself instead of relying on GPUEngineBase's copies of the associations.
- For purposes of maintaining a record and making for easier reversions, the code has NOT been fully optimized or cleaned up. This will happen over a period of time as the code settles down through testing.
- All "native" buffers are no longer assumed to be in any color space and are now assumed to always be 15-bit. The native buffers are now referenced using uint16_t pointers and are now suffixed with "16" in order to reflect this change.
- Of note, all clients that reference masterNativeBuffer or nativeBuffer via NDSDisplayInfo must now assume that these native buffers will always be in the 16-bit color space.
- Any 18-bit and 24-bit rendering now happens in the custom buffers.
- Also fixes a bug in PixelOperation_SSE2::_unknownEffectMask32() that would cause 3D layers to appear black if the user was running 15-bit color mode. (Regression from commit 0db9872.)
- This fix has the side effect of greatly increasing the code size.
- Quick testing shows that this fix increases overall graphics performance by 2% - 3%. But is this small performance gain worth the massive increase in code size? Hmmm....
- In practice, no games seemed to be affected by this bug, but even so, this fix is correct.
- While technically unnecessary, when the index is singly incremented, it's better to hard reset an overrunning index to zero in order to improve the theoretical stability of the code.
- Byte swapping can now be independently controlled for both input and output data.
- As an application to this new API, VRAM display mode now shows the correct colors on big-endian systems.
- This also discovers an existing issue with the fog weight calculation code in both OpenGL and SoftRasterizer, since Fog Shift could be zero and thereby cause the calculations to divide by zero. This issue will have to be dealt with at a later time.
- Also rework SoftRasterizerRenderer::_UpdateFogTable() to use the same variable naming scheme as OpenGL. This is done for better code consistency.
- In reality, I'm already looking to scrapping this algorithm in OpenGL for something that could be better in every possible way, but I want to commit this SoftRasterizer-esque algorithm first so that we have a working version of it on record.
- Most notably, each version of the manually vectorized code now resides in their own files.
- Depending on the rendering situation, the new AVX2 code may increase rendering performance by 5% to up to 50%.
- Certain functions automatically gain manual vectorization support since the new GPU code makes use of the new general-purpose copy functions that were added in commit e991b16. In other words, AVX-512 and AltiVec builds also benefit from this.
- Also renames "Altivec" to "AltiVec" to remain consistent with Colorspace Handler's naming.
- Also adds an AltiVec accelerated version of the clear image parser.
- Final Release builds still remain as PowerPC 32-bit, Intel 32-bit, and Intel 64-bit. ARM64 is not supported yet.
- PowerPC 32-bit and Intel 32-bit continue to require macOS v10.5 Leopard like before, but the Intel 64-bit binary now requires macOS v10.7 Lion or later. (Now, the Intel 64-bit binary will simply fail to run on Leopard and Snow Leopard.)
- Specifically, we're now respecting uniform control flow for texture lookups, for which older/stricter drivers will silently fail because they consider texture lookups within conditional blocks to be undefined.
- This change partially reverts commit 87cb2f6, but still preserves the elimination of the destructor, which is probably the code simplification that was originally wanted, I guess.
- Apparently, KVO-based UI updates being made across threads are a big no-no in the macOS v10.14 SDK and later. So now we need to make sure that ALL KVO-based UI updates are done on the main thread only.
- Fixes a crash that can occur on startup or when modifying the Video Pixel Scaler in Preferences. Fixes #321. (Regression from commit 0663661.)
- Fixes a crash on startup where write+execute privileges returned by mprotect() are not supported when compiled against the macOS v10.15 SDK or later. Fixes #335.
- Fixes a possible crash that can occur if an invalid ID is sent when trying to set the 3D Rendering Engine. Maybe fixes #342.
- The script that renames the DeSmuME.app package with the git version now runs as a Build Post-Action script rather than as the last build rule. This is to fix an incompatibility with code signing, which is now forced in Xcode 11 and later.
- Update some variables to comply with newer and stricter compiler rules.
This makes it easier to edit those files in Glade or such, while keeping
it inside the final binary.
As a bonus, XML data is getting minified at the packing step.
It was unmaintained anyway, and the other one is a better base to start
from. And if someone ever needs one of these files, they are preserved
in the git history anyway.
Compilers (at least gcc 10) were already reaching that conclusion, so
this shouldn’t change code generation at all.
This piece of code got introduced in commit
3eb9de4614 when upgrading to save state
version 1 (we’re at version 12), and commit
64073a2558 added some OOP on top of it so
that cp15 was in charge of handling that memory. The code never got
cleaned until now.
getauxval(AT_HWCAP) is the best way to check for features on ARM and
AArch64, as it doesn’t require parsing a file, instead it just returns a
value provided by the kernel in our address space.
This commit should be synchronised with
https://github.com/libretro/libretro-common/pull/176
This was preventing HUD from building. Note that this doesn’t make it
work fully yet, as the pixel format seems wrong, as if AGG was assuming
RGB888 while the buffer is actually RGBx8888 or something like that.
Fixes #375.
This is now using an action parameter to send the slot to save to/load
from.
There was a previous comment about Shift-Fn being broken and a
workaround using Key_Press(), but it doesn’t seem to be broken anymore
so we can use the accelerators instead and remove a static variable.
This one uses the native file chooser the user is used to, which can be
GTK’s on Linux but a more familiar one on other OSes. If
xdg-desktop-portal is installed, it can even use the DE’s native one on
Linux.
At this point we got a fully (?) functional gtk3 port, but it uses a ton
of deprecated functions that will be removed in gtk4. Better enable the
warnings so that we know what to fix before then.
At this point, this version builds. It is full of deprecated widgets
and functions though, which will have to be cleaned over time. It also
doesn’t display any visuals in the DS emulation part yet.
1. didn't like every line in the file being touched
2. DESMUME_NAME is cosmetic; it may have had special meaning in this file. I didn't feel like investigating it any more
a289055e removed the code that updated the old "multisampling enabled"
config. This fix adds a new config value for the multisampling size and
updates the old config in sync. When loading the config, the new
multisampling size value is prioritized, but if it is 0 (possibly not set)
then it falls back to the previous logic using the old boolean config
value.
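The load-time priority described above boils down to a two-step fallback. A minimal sketch, assuming hypothetical field names and assuming that the old boolean maps to a sample count of 4 (that number is an illustration, not taken from this commit):

```cpp
// Hypothetical stand-in for the two config values involved.
struct MultisampleConfig {
    int  multisamplingSize;     // new value; 0 may simply mean "not set"
    bool multisamplingEnabled;  // old boolean value
};

// Prefers the new size. If it is 0, falls back to the old boolean, which is
// mapped to legacySize (assumed to be 4 here) when enabled.
constexpr int effective_multisample_size(const MultisampleConfig& cfg,
                                         int legacySize = 4)
{
    return cfg.multisamplingSize != 0
         ? cfg.multisamplingSize
         : (cfg.multisamplingEnabled ? legacySize : 0);
}

// Keeps the old boolean in sync whenever the new size is changed.
constexpr bool legacy_flag_for_size(int multisamplingSize)
{
    return multisamplingSize > 0;
}
```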
The build failure(s) come from the fact that the posix frontends currently
use deprecated functions for multi-threading. The offending functions are:
g_thread_create and g_thread_supported, both deprecated since 2.32.
g_thread_create is replaced by g_thread_new and g_thread_supported is no
longer needed at all, since for glib >= 2.32 threading is automatically initialized
when the program starts.
* desmume/src/frontend/posix/gtk/main.cpp(common_gtk_main): Add a comment to indicate that moving the following instruction implies that a GTK warning is shown.
* desmume/src/frontend/posix/gtk/main.cpp(common_gtk_main): Move a statement later to avoid a GTK warning.
* desmume/src/frontend/posix/gtk/config_opts.h: Add the "firmware_language" option.
* desmume/src/frontend/posix/gtk/main.cpp: Add the "setfirmwarelanguage" menu item.
* desmume/src/frontend/posix/gtk/main.cpp: Add the "setfirmwarelanguage" action entry.
* desmume/src/frontend/posix/gtk/main.cpp(CallbackSetAudioVolume): Fix its parameters ("*" attached to the name rather than the type).
* desmume/src/frontend/posix/gtk/main.cpp(SetAudioVolume): Fix its indentation (spaces replaced by a tab).
* desmume/src/frontend/posix/gtk/main.cpp(SetFirmwareLanguage): Add this function.
* desmume/src/frontend/posix/gtk/main.cpp(CallbackSetFirmwareLanguage): Add this function.
* desmume/src/frontend/posix/gtk/main.cpp(common_gtk_main): If the command line overriding is enabled, then use the language set on the GUI.
* desmume/src/frontend/posix/gtk/main.cpp: Add the "setaudiovolume" menu item.
* desmume/src/frontend/posix/gtk/main.cpp: Add the "setaudiovolume" action entry.
* desmume/src/frontend/posix/gtk/main.cpp(SetAudioVolume): Add this function.
* desmume/src/frontend/posix/gtk/main.cpp(CallbackSetAudioVolume): Add this function.
* desmume/src/frontend/posix/gtk/main.cpp(common_gtk_main): Add the "SNDSDLSetAudioVolume" function call.
* desmume/src/frontend/posix/shared/sndsdl.cpp: Add the "audio_volume" global variable.
* desmume/src/frontend/posix/shared/sndsdl.cpp(MixAudio): Add the "SDL_MixAudio" function call.
* desmume/src/frontend/posix/shared/sndsdl.cpp(SNDSDLGetAudioVolume): Add this function.
* desmume/src/frontend/posix/shared/sndsdl.cpp(SNDSDLSetAudioVolume): Add this function.
* desmume/src/frontend/posix/shared/sndsdl.h: Add the "audio_volume" global variable.
* desmume/src/frontend/posix/shared/sndsdl.h(SNDSDLGetAudioVolume): Add this function.
* desmume/src/frontend/posix/shared/sndsdl.h(SNDSDLSetAudioVolume): Add this function.
- New 16-bit to 32-bit alpha agnostic conversion functions: ColorspaceConvert555XTo888X_*(), ColorspaceConvert555XTo666X_*().
- Minor optimizations to the following functions: ColorspaceConvert555To8888_*(), ColorspaceConvert555To6665_*(), ColorspaceApplyIntensity32_*().
Since the full text was unreadable when it was 35, changed it to 71, so it's readable now.
Typo corrected
Added Windows Border to sound setting to make it more important
“Wi-fi” is used to certify the interoperability of wireless computer networking devices. So Wifi changed to Wi-fi
This is a tiny follow-up to PR #273, with no actual changes in
functionality. In short:
- gui.setlayermask(top, bottom) -> gui.setlayermask(main, sub)
Display layers correspond to GPU (main or sub), not to screen (top or
bottom), so this change just updates the Lua parameter names to reflect
that.
This change adds a Lua function called gui.setlayermask, which allows Lua
scripts to procedurally modify which render layers are visible. It takes
two arguments: one integer bitfield representing the visibility states for
the main GPU, and the other representing the sub GPU. For example: 0b11111
shows all layers, 0b00000 hides all layers, 0b00001 shows only layer 0
(BG0), 0b10000 shows only layer 4 (OBJ), etc.
Since display layer state is coupled to the frontend, and since frontends
are entirely separate between platforms (i.e., on Windows, toggling
display layer state requires updating the GUI state as well), this
function is only supported and tested on the Windows build. On macOS and
Linux, gui.setlayermask will simply return 0 without doing anything.
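The bitfield passed to gui.setlayermask can be decoded with plain shifts and masks. The sketch below mirrors the examples in the description (bit 0 for BG0 up to bit 4, 0b10000, for OBJ); the enum and helper names are illustrative and are not part of the Lua API or the emulator core.

```cpp
#include <cstdint>

// Bit positions within one GPU's layer mask, following the examples above.
// The names are illustrative only.
enum LayerBit : std::uint32_t { BG0 = 0, BG1 = 1, BG2 = 2, BG3 = 3, OBJ = 4 };

// True if the given layer is marked visible in the mask.
constexpr bool layer_visible(std::uint32_t mask, LayerBit layer)
{
    return ((mask >> layer) & 1u) != 0;
}

// Returns a copy of the mask with one layer switched on or off.
constexpr std::uint32_t with_layer(std::uint32_t mask, LayerBit layer, bool visible)
{
    return visible ? (mask | (1u << layer)) : (mask & ~(1u << layer));
}
```

For instance, with_layer(0b11111, BG2, false) yields 0b11011, a mask that hides only BG2.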
- Specifically, translucent polygons now properly render on top of the back-facing opaque fragments from the opaque polygon rendering pass. Do note that emulating this special NDS rendering quirk is still not complete, as it does not account for drawing any translucent polygons on top of the opaque fragments of back-facing partially-translucent alpha-textured polygons. However, I have not seen any games that actually go so deep into such an edge case. If there is such a game, then this issue will need to be revisited.
- Now that this special rendering quirk is more accurate, this does cost some performance. However, since this rendering quirk is controlled by the "Enable Depth L-Equal Polygon Facing" option, which is OFF by default, this performance loss is deemed acceptable in favor of the increased accuracy.
- 3D renderers no longer perform polygon clipping themselves, instead relying on GFX3D to do it. By default, the clipping mode is ClipperMode_DetermineClipOnly, but 3D renderers can change this by overriding the virtual method Render3D::GetPreferredPolygonClippingMode() and returning their preferred clipping mode.
- Specifically, if the previous frame is determined to draw the entire HD layer directly over the backdrop layer, then the current frame's entire custom framebuffer is asynchronously cleared using line 0's backdrop color since most games will keep the backdrop color constant for all scanlines. Because this is a common rendering case, many 3D games should see a performance improvement when running very large HD framebuffers (8x or higher).
- Also fix a compiling issue for non-SSE2 systems. (Regression from commit 3890431.)
- Most notably, fix a performance regression where polygon drawing was no longer getting batched due to an incorrect polygon-facing test. (Regression from commit dab414c.)
- New Behavior: In addition to emulating the existing Depth Equals Test Tolerance, NDS-Style Depth Calculation accounts for all NDS depth calculations within the fragment shader. Most notably, disabling this option forgoes the W-depth / Z-depth differentiation that the NDS uses, instead preferring the GPU's native Z-depth calculation. Using the GPU's native depth calculation significantly improves performance, but many games use W-depth calculations or are sensitive to subtleties in the Z-depth calculation, and so this option must remain ON by default for compatibility's sake.
- Also fixes a shader initialization issue on the Windows port. (Regression from commit 7080e21.)
- Fix compiling issues for big-endian systems.
- Fix bug where the Recent ROMs menu and also launching the app while loading a ROM file would fail to load the ROM on macOS v10.5 Leopard.
- Fix bug where GPU main memory display mode would show incorrect pixels on big-endian systems when running at 15-bit color depth.
- As an unintended collateral improvement, GPUEngineA::_HandleDisplayModeMainMemory() now has SSE2-accelerated versions for 18-bit and 24-bit color depths. This was done less for its performance benefit (main memory display mode is an extremely rare feature) and more for better code consistency and code completeness.
- After years of testing, no one has reported running into the assert in gfx3d_ysort_compare() so I think we should be safe in reverting std::stable_sort() back to std::sort().
- For the sorting function, use gfx3d_ysort_compare_orig() since this function compiles down to fewer instructions than gfx3d_ysort_compare_kalven() does, resulting in better sorting performance.
- Of note, I'm pretty sure that SF commit r5132 is what fixed the original bug (see SF#1461 for more details) by getting rid of the NaN comparisons that were tripping up std::sort(). In the future, we should research why we're dividing by 0 in the first place, since r5132 is clearly a hack of a fix.
- Also do a minor performance optimization by only doing the framebuffer clear once for each power-off condition, rather than repeatedly and unnecessarily clearing the framebuffer for each and every V-blank.
- Of note, when running at custom resolutions, we are now being more aggressive in performing early tests for rejecting pixels as soon as possible. This may yield a minor performance improvement in some very specific rendering scenarios that require the window test.
- It is still possible to create a PowerPC binary, but this now requires some extra steps. From now on, you must use an Intel Mac running Mavericks or earlier to re-save the .xib files with a deployment target of macOS 10.5 in Interface Builder 3.2, and then use Xcode 3 to build a PowerPC binary using the Xcode 3 project file.
- Apparently, MSVC has a more strict implementation of IEEE-754 single-precision floats (with 23-bit significands) than Clang and GCC, and so we are going to drop 2 LSBs during the calculation so that we're multiplying z by a 22-bit significand. Coincidentally, this now matches what we're doing with the OpenGL renderer, so this improves code consistency.
- New Behavior: Due to the rarity of needing to emulate 'Depth-LEqual polygon facing' and its guaranteed reduction in performance in all games, this accuracy feature is now OFF by default.
- Expose these new settings in the Cocoa port UI.
- SoftRasterizer may now drop at most one LSB, down from dropping 9 LSBs.
- OpenGL will now drop only 2 LSBs, down from 9 LSBs. In this case, dropping 2 LSBs was specifically chosen to ensure that the Dragon Quest IV overworld map continues to work.
- If this change makes the depth inaccuracy too much worse than before, then we may have to make these particular depth calculations optional in the future. This will need additional testing.
- Now, the only two methods for changing any firmware setting are modifying CommonSettings.fwConfig or loading an external NDS firmware binary file.
- All methods for changing the firmware MAC address through the WifiHandler class have been removed.
- The FirmwareConfig struct can now handle the WFC User ID.
- Clients can now retrieve the current MAC address and WFC User ID using NDS_GetCurrentWFCUserID(). It is also possible to retrieve the WFC User ID from CommonSettings.fwConfig.
- Setting up the firmware in NDS_Reset() should now be more consistent. However, this does change some of the loading/unpacking order previously set by NDS_FakeBoot(). This will need additional testing.
- Do a whole bunch of code refactoring and cleanup.
- This change shouldn't actually change any functionality in practice... probably. This change is there to silence a compiler warning more than anything else... hopefully.
- This change obsoletes "CommonSettings.wifi.mode", which now does nothing. Ports that make use of this setting should remove it.
- Also do a bunch of code refactoring and cleanup.
- This fix properly emulates the less-than-or-equal depth test rendering for front-facing polygons drawn on top of opaque back-facing fragments, but only if the front-facing polygon is opaque. Translucent front-facing polygons are not supported at this time due to requiring extensive changes to the rendering logic and shaders in order to emulate this extremely rare and niche NDS feature. (If you require the proper rendering of translucent front-facing polygons on top of back-facing fragments, then you must use SoftRasterizer.)
- The new behavior for the Multisample Antialiasing checkbox: Checked - GFX3D_Renderer_MultisampleSize = 4, Unchecked - GFX3D_Renderer_MultisampleSize = 0. (If someone else wants to make some UI so that GFX3D_Renderer_MultisampleSize can be set to other sizes, then have at it.)
- Add a unique sequence number to fetched frames to ensure that older frames are not drawn after newer frames.
- After much research, finally settle on a method for fetching the NDS framebuffers -- using a MTLBlitCommandEncoder to blit a MTLBuffer to a MTLTexture. It is faster than uploading a texture using [id<MTLTexture> replaceRegion:mipmapLevel:withBytes:bytesPerRow:], and also faster than using a pinned-memory backed linear texture. This method will be the way going forward for fetching framebuffers in Metal.
- All frontends will need to be updated to use the new GFX3D_Renderer_MultisampleSize setting.
- This change obsoletes GFX3D_Renderer_Multisample, which currently does nothing. It will be removed after all frontends are updated.
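As a rough illustration of the fetched-frame sequence number mentioned above, here is a minimal sketch of how a display could drop stale frames. FetchedFrame and FramePresenter are hypothetical names invented for this example; they are not DeSmuME's actual types.

```cpp
#include <cstdint>

// Hypothetical frame record: the sequence number is assumed to be assigned
// once, in increasing order, at fetch time.
struct FetchedFrame
{
	uint64_t sequenceNumber;
};

// Hypothetical presenter that remembers the newest frame it has drawn.
class FramePresenter
{
public:
	// Returns true if the frame should be drawn, or false if an equally new
	// or newer frame was already drawn and this one must be dropped.
	bool ShouldDraw(const FetchedFrame &frame)
	{
		if (_hasDrawn && frame.sequenceNumber <= _lastDrawnSequence)
		{
			return false;
		}

		_lastDrawnSequence = frame.sequenceNumber;
		_hasDrawn = true;
		return true;
	}

private:
	uint64_t _lastDrawnSequence = 0;
	bool _hasDrawn = false;
};
```

With this scheme, a frame that finishes processing late is simply skipped instead of overwriting a newer image.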
The problem with the old if is that the code was still compiled (but optimised out).
As g_thread_supported is a macro, #if works well enough and doesn't generate
a warning.
It seems gcc has a (new?) warning that doesn't allow *ncpy functions
to take any value derived from the source length as the len argument.
I've used strdunp to fix this, but I guess there are other solutions
that don't require free:
use C++ strings?
use strcpy(...); tmp1[strlen(filename) - 4] = 0; ...?
remove the warning in the Makefile?
But as the strdump solution is simple enough, I've kept it.
- Received packets are now queued properly and should no longer be overwritten or lost.
- Received packets under Ad-hoc mode now use the same transfer delay as Infrastructure mode. (Read one halfword every 8 microseconds.)
- Received packet transfer delay only works when the emulation level is set to WifiEmulationLevel_Compatibility. Transfer delay can be disabled by setting the emulation level to WifiEmulationLevel_Normal, which will cause the entire received packet to be transferred immediately.
- If the WiFi emulation level is Off, then always set POWER_US.Disable = 1.
- If POWER_US.Disable == 1, then do not trigger any further WiFi actions.
- The name contained within DeSmuME's frame header has been changed from "NDSWIFI\0" to "DESMUME\0".
- DeSmuME's frame header size has been increased from 12 bytes to 16 bytes.
- Baseband data is now set to a default dataset on reset.
- Baseband data reads/writes now respect the actual R/W behavior of each data byte.
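The received-packet transfer delay described above (one halfword every 8 microseconds) works out to simple arithmetic. The sketch below is an assumption-laden model, not the emulator's code: EmulationLevel is a stand-in for the two levels named above, and rounding an odd trailing byte up to a full halfword is a guess.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for WifiEmulationLevel_Normal / _Compatibility.
enum class EmulationLevel { Normal, Compatibility };

constexpr uint64_t kMicrosecondsPerHalfword = 8;

// Time needed to hand a received packet to the emulated hardware.
// Compatibility: one 16-bit halfword every 8 microseconds.
// Normal: the whole packet is transferred immediately, so the delay is 0.
constexpr uint64_t RXTransferTimeMicroseconds(size_t packetBytes, EmulationLevel level)
{
	return (level == EmulationLevel::Compatibility)
		? ((static_cast<uint64_t>(packetBytes) + 1) / 2) * kMicrosecondsPerHalfword
		: 0;
}
```

Under these assumptions, a 100-byte packet takes 400 microseconds to transfer at the Compatibility level.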
If the position test register is negative, conversion to unsigned
integer is undefined. This breaks games on arm64 where the behaviour is
defined as 'truncate to zero'. Converting to a signed integer first
guarantees the intended behaviour.
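A minimal sketch of the conversion pattern described above. ToRegisterBits is a hypothetical name, and the real code may scale the value before converting; only the float -> signed -> unsigned ordering is the point here.

```cpp
#include <cstdint>

// Casting a negative float straight to an unsigned integer is undefined
// behaviour in C++, so different CPUs may give different results.
// Going through a signed integer first is well defined for values that
// fit in int32_t.
inline uint32_t ToRegisterBits(float value)
{
	const int32_t asSigned = static_cast<int32_t>(value); // truncates toward zero
	return static_cast<uint32_t>(asSigned);               // wraps modulo 2^32
}
```

For example, -1.0f becomes 0xFFFFFFFF on every platform, which is the two's complement bit pattern a register read would be expected to return.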
- GPUEngineBase::_LineCopy() optimizations only apply to 2x, 3x, and 4x scaling.
- Add SSE2 version of 3x CopyLineExpand() when using ELEMENTSIZE==1.
- Add SSE2 versions of CopyLineReduce() and add specific 2x/3x/4x versions of CopyLineReduce_*() algorithms.
- CopyLineExpand() now supports vertical scaling in addition to horizontal scaling.
- GPU buffers that were previously only cache-aligned are now page-aligned if appropriate.
- Also fix a depth bug for scrolling clear images on SSE2 systems by disabling the SSE2-specific code. This issue will need to be researched at a later date.
Make SwitchPath check for all directory delimiters when removing trailing delimiter, remove redundant trailing delimiter logic in CFIRMWARE::GetExternalFilePath().
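For illustration only, trailing delimiter removal that checks more than one delimiter could look like the sketch below. RemoveTrailingDelimiter is a hypothetical helper, and treating '/' and '\\' as the full set of delimiters is an assumption.

```cpp
#include <string>

// Hypothetical helper: strips a single trailing directory delimiter,
// accepting both the POSIX '/' and the Windows '\\' forms.
inline std::string RemoveTrailingDelimiter(std::string path)
{
	if (!path.empty() && (path.back() == '/' || path.back() == '\\'))
	{
		path.pop_back();
	}

	return path;
}
```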
- EXPERIMENTAL_WIFI_COMM no longer disables all of the WiFi-related code. Instead, the WiFi code is always enabled and actually running the code is now controlled using WifiHandler::SetEmulationLevel().
- On the Windows port, EXPERIMENTAL_WIFI_COMM no longer hides all of the WiFi options. Instead, it only affects the user's ability to control the WiFi emulation. (Forces the WiFi emulation level to WifiEmulationLevel_Off if EXPERIMENTAL_WIFI_COMM is undefined.)
- The Cocoa port (and probably other POSIX-based ports) should now work better with the WiFi code.
- WiFi comm interfaces no longer initialize only once upon app startup. Instead, they initialize each time the emulator resets, and then uninitialize each time a ROM is unloaded. Now, users no longer have to restart the app in order to apply any changed WiFi settings. Instead, users only need to reset the emulator or load a new ROM.
- Previously, the SoftAP comm interface wouldn't run if libpcap was unavailable or if a network device wasn't found. Now, the SoftAP comm interface will run without libpcap or a network device, albeit with significantly reduced functionality.
- Previously, saving pcap files required WIFI_LOGGING_LEVEL >= 3. Now, saving pcap files no longer relies on WIFI_LOGGING_LEVEL, instead relying on WIFI_SAVE_PCAP_TO_FILE to enable the functionality.
Changes:
1- Add wifi.emulated flag to common settings
2- Don't emulate wifi unless wifi.emulated is set
3- Add a check box in windows frontend to toggle it, and read/write setting from/to ini file.
- Automatic setting of the SoftRasterizer thread count (the most common use case) now takes into account systems with many CPU cores/hyperthreads. When using Automatic mode, SoftRasterizer will take advantage of more threads on machines like the Mac Pro and iMac Pro.
- Manually assign the thread priorities of the SoftRasterizer threads and other related high-priority threads to better ensure stable performance. Most importantly, the main emulation thread will no longer preempt any SoftRasterizer thread since the main emulation thread has to wait on the results of SoftRasterizer anyways.
- These changes aren't targeted for improving overall performance -- they help stabilize performance so that CPU cycles are used more consistently, which might translate into slightly improved performance, depending on hardware, as a byproduct of doing these changes.
- Fixes regression from commit 5906d44 where HUD would appear smaller when using HD scaling. (only fixed with OpenGL for now)
- Remove use of backlightIntensity for displaying. Fixes bug where screen would appear dark on the first frame after loading a save state. Underlying cause should probably still be fixed, though. (Why would the backlight level affect the display anyway? That setting on the DS is only present because it has its own physical screens and makes no sense here.)
The right hand screen is allowed to be resized in horizontal screen layout to enable the window or full screen display to better utilize the screen area.
Changes:
1- Modify scaling, resizing and update functions to allow for new screen resizing ratio
2- Modify touch input scaling (incl. HUD editing) to adapt to different screen sizes
3- Add GUI menu for user to select the screen resizing ratio
4- Implement saving/loading settings from file similar to other settings
Commit 7548294333 broke compilation of
desmume/src/NDSSystem.cpp if DEVELOPER is defined by --enable-gdb-stub
Needs commit c9ad909a75b0ad89d0bd84829ed536c5ae0ffc93
XInitThreads() is needed in multi-threaded X applications when multiple
threads try to access Xlib. Add the call to the three frontends in
posix/ and add the required autoconf stuff, too.
there's really nothing more to it than this. it should have been done from the beginning.
you wouldn't notice this unless you had a game that stopped rendering 3d, though.
(re #141)
- These changes help to stabilize the performance of AVI recording, making it less sensitive to sudden changes in disk writing speed.
- The maximum amount of frames maintained in memory will either be 1.5 GB worth or 180 frames (or 3 seconds) worth, whichever is less.
- aviout.cpp now uses Windows-style line-endings instead of Unix-style line-endings.
- AVI segments now fill up much closer to the 2 GB file limit than before.
- Error handling in the file writing thread is much more robust.
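The AVI frame queue limit above (1.5 GB or 180 frames, whichever is less) can be sketched as follows. The names are hypothetical, and reading "1.5 GB" as 1.5 GiB is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed limits, taken from the note above.
constexpr uint64_t kMaxQueueBytes  = 1536ull * 1024 * 1024; // 1.5 GiB
constexpr uint64_t kMaxQueueFrames = 180;                   // 3 seconds at 60 FPS

// Returns how many frames of the given size may be held in memory.
inline size_t MaxQueuedFrames(uint64_t bytesPerFrame)
{
	assert(bytesPerFrame > 0);
	const uint64_t framesThatFit = kMaxQueueBytes / bytesPerFrame;
	return static_cast<size_t>(std::min(framesThatFit, kMaxQueueFrames));
}
```

Assuming 24-bit frames that hold both screens, a native 256x384 frame is small enough that the 180-frame cap applies, while at 8x scaling (2048x3072) the memory cap takes over and only 85 frames fit.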
- Since this is a very common occurrence in many games, and since doing a clear is faster than doing an upscaled copy, this should give a small performance improvement for the larger framebuffer sizes.
- Completely encapsulate all stray global variables into the SoftRasterizer class where they belong.
- Framebuffer clears are now fully multithreaded, significantly improving clearing performance.
- Doing multithreaded texture loads and vertex calculations now requires a minimum of 2 threads, down from 4 threads.
- The maximum amount of SoftRasterizer threads has been increased from 16 to 32.
- For all non-Cocoa ports, reduce the number of framebuffer pages from 2 to 1, reducing the memory usage for those ports.
- For the Cocoa port, increase the number of framebuffer pages from 2 to 3 in preparation for a new triple-buffered display scheme.
- Also change the CocoaDSOutput list lock from a mutex to a rwlock, since testing has shown that there is more thread contention here than I previously thought.
I found these dependencies harder to figure out than usual,
since I'm used to installing packages with pregenerated `configure` scripts.
In particular if `glib` is missing then `configure` will generate with unexpanded macros, which is confusing.
This extra paragraph should be helpful for others.
Thanks for a great program :)
- In practice, this should change nothing, since all pointers somehow managed to point to the correct buffer locations. This should be nothing more than a programming consistency and readability improvement.
- This has the side-effect of having the Windows port’s display window
start up with a white screen and HUD showing (if enabled) just like
before, rather than a black screen and HUD possibly hidden.
- GPU Color Depth (from 24-bit to 18-bit), Advanced SPU Logic (from Disabled to Enabled), SPU Interpolation (from Linear to Cosine), Synchronization Mode (from Dual SPU Sync/Async to Synchronous)
- Just like the previous change to the default JIT block size, let the users themselves disable these settings so that they are more aware that they are sacrificing compatibility for speed.
- Also remove TCommonSettings.GFX3D_PrescaleHD. It is a useless setting in core because the internal resolution is not limited to integer-multiplied scaling.
- Also fix spelling on the "Maintain Aspect Ratio" menu option.
- There is only a negligible performance difference between 100 and 12.
- It is better for users to change the JIT block size from 12 to 100
themselves, since it might make them more aware that they are
sacrificing compatibility in favor of speed.
- Also remove the usage of _rwlockFrame and change it to a simple
pthread_mutex_t, since CocoaDSDisplay objects no longer have a need for
a full pthread_rwlock_t.
- Build artifacts are now created in the source code directory itself
instead of in DerivedData, just like how Xcode 3 does it.
- Making Profile builds using the “OS X App” build scheme now
automatically appends the git commit number to the .app bundle name.
- Of special note, Metal display views aren't allowed to run on macOS High Sierra because of an assert bug in [id<MTLDevice> newBufferWithBytesNoCopy:length:options:deallocator:] in this particular version of macOS. Note that Metal display views will continue to work with macOS El Capitan and macOS Sierra.
- Of note, initialization of the 3D rendering engine is also staged, where the pending engine is initialized prior to applying the 3D rendering settings. However, only ports that support this behavior will do this. Ports that do not support this behavior will work the same way as before (initialize the 3D engine immediately).
- Also endian swap the BGnX and BGnY values on big-endian systems. This is a non-functional change, and is only meant to show that the endian swaps are indeed the correct choice for big-endian.
- Changing the display video source now updates the display window properly while the emulator is paused.
- Fix bug in the Screenshot Capture Tool where screenshots would have incorrect colors if taken on a PowerPC Mac.
- Fix bug in the Screenshot Capture Tool where screenshots would be completely black if a CPU-based pixel scaler on OpenGL was used.
- The OpenGL presenter's GPU tiering system has been changed to be more strict. This effectively pushes many older GPUs into lower tiers.
- The following pixel scalers now require at least a Low-Tier GPU (previously only required Bottom-Tier): 2xSaI, Super2xSaI, SuperEagle, HQ3x, HQ3xS, HQ4x, HQ4xS
- The following pixel scalers now require at least a Mid-Tier GPU (previously only required Low-Tier): 2xBRZ, 3xBRZ
- Due to the new changes to the GPU tiering system and allowed pixel scalers per tier, the Screenshot Capture Tool running OpenGL now allows pixel upscaling on the GPU instead of disabling it completely.
- Fix potential crashing bug that may occur if the target directory
isn’t found when clicking the Take Screenshot button.
- In the Display Layout dropdown panel, reposition the Display
Separation menu to be in the center section instead of in the leftmost
section.
- Has the positive side-effect of improving the OpenGL renderer's performance when many vertices are used.
- Also fix the vertex list double-buffering so that it actually works as intended.
- Note: The backlight intensity is only emulated on frontends with 3D-based display methods, such as OpenGL and Metal. CPU-based display methods, such as DirectDraw, SDL and Cairo, are currently unsupported.
- The old Tools > Save Screenshot As has been replaced with the new Screenshot Capture Tool.
- The new Screenshot Capture Tool allows screenshots to be configured to render with the same layout features as a display view.
- Screenshot captures now both render and save to file on their own independent threads.
- Screenshots are now captured using the Take Screenshot button, and files are now automatically named based on ROM name and timestamp.
- All of these features mean that users can now rapidly take screenshots with their own custom layouts, all with little to no slowdown to the emulation.
- Also do a bunch of code cleanup and refactoring as a side-effect of adding these new features.
- Also be more consistent when recovering from an internal emulation halt.
- Also apply enabling the external BIOS, external firmware and firmware boot settings at load/reset time instead of at frame time.
- Framebuffer fetches no longer run on a CocoaDSThread, but instead use a pthread directly. This can be done since framebuffer fetching only serves one function and always receives the same execution message, making a full CocoaDSThread a waste.
- SPU_Emulate_user() is no longer called on a separate thread, and instead is called in the emulation thread directly. For the typical SPU use case (SPU Sound Synchronization w/ Advanced SPU Logic), SPU_Emulate_user() becomes negligible, and so the threading overhead becomes unnecessary. In the use case where Dual SPU Synch/Asynch is used, Advanced SPU Logic is almost always disabled with it, and so the penalty of calling SPU_Emulate_user() on the emulation thread will be more than compensated by the performance increase of turning off Advanced SPU Logic.
- Also do some code cleanup/refactoring here and there.
- Fix a bug where Metal display views can block emulation execution if the user very quickly spams inputs while the input HUD is shown.
- Metal display views are no longer frame capped -- they can now run to their fullest performance potential, as fast as the host hardware will allow. This behavior is now consistent with OpenGL display views.
- As a side-effect, non-layer backed OpenGL display views also have a performance improvement.
- The MSAA sample size limit is now based on the following sizes:
1x Native Resolution - 32xMSAA
2x Native Resolution - 16xMSAA
3x-8x Native Resolution - 8xMSAA
9x and greater Native Resolution - 4xMSAA
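The MSAA limits listed above map directly onto a small lookup. This is only a sketch: the function names are hypothetical, the scale is assumed to be an integer multiple of the native resolution, and clamping to a device-reported maximum is an added assumption.

```cpp
#include <cstddef>

// Largest MSAA sample count allowed for a given framebuffer scale,
// following the table above.
constexpr int MaxMSAASamplesForScale(size_t scale)
{
	return (scale <= 1) ? 32
	     : (scale == 2) ? 16
	     : (scale <= 8) ? 8
	     :                4;
}

// Assumed extra step: never exceed what the host GPU reports.
constexpr int ClampMSAASamples(size_t scale, int deviceMaxSamples)
{
	return (MaxMSAASamplesForScale(scale) < deviceMaxSamples)
		? MaxMSAASamplesForScale(scale)
		: deviceMaxSamples;
}
```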
- Function/method parameters for EMUFILE objects are now passed by
reference instead of passed by pointers, where appropriate. This is
true for the vast majority of EMUFILE usage.
- Eliminate duplicate little-endian read/write functions in
readwrite.cpp. Use the equivalent methods in EMUFILE instead.
- Completely standardize the design patterns and usage of the various
little-endian read/write methods. Should help stabilize interactions
with save state files, as well as try to make save states
cross-compatible between big-endian and little-endian systems.
- Replace EMUFILE fread()/fwrite()/fputc() calls with equivalent
size-specific methods where applicable.
- Misc. code readability and stability improvements.
- Also add macosx_10_5_compat.cpp back into the normal OS X build of
the Xcode Latest project. It’s still needed for compatibility on OS X
10.5 Leopard for x86/x86-64. (Don’t know how this got disabled — it
just somehow mysteriously did.)
- Now applies the sprite window flags consistently between rotozoomed
and non-rotozoomed modes.
- Applying the sprite window flags in rotozoomed modes now ignores
sprite priority as intended.
- Also fix a compiling issue in the Windows build. (Regression from
commit 6acf781.)
- Also fix an issue in the zero-dst-alpha-fragment pass while running
MSAA in legacy OpenGL. (Related to commit 6acf781.)
- Framebuffer conversion now occurs purely in shaders, and also
performs flipping along with conversion. FBOs and PBOs are no longer
required to do this.
- If shaders are not available, then framebuffer flipping will occur if
FBOs are available. PBOs are no longer required to do this.
- Also fix a minor framebuffer attachment bug in the v3.2 renderer.
- Keep OpenGLRenderer_1_2 as the sole OpenGL v1.x class, and then
remove the following classes: OpenGLRenderer_1_3, OpenGLRenderer_1_4,
OpenGLRenderer_1_5.
- Now composites pixels using explicit functions for simple copy,
masked copy, and masked effect.
- On SSE2 systems, pixels composited using a simple copy no longer
require the destination pixels to be loaded first, since all the pixels
are guaranteed to be overwritten anyways.
- Try and move the window test as far up the pipeline as possible so
that pixel rendering can bail as soon as possible if the window test
fails.
- Clean up and further standardize the code for compositing BG layers,
OBJ layers, and 3D layers.
- Adds more possible conditions for compositing to take the fast path
(simple pixel copying).
- Adds SSE2 optimizations for the 2x and 4x scaling cases in
GPUEngineBase::_LineColorCopy().
- NEW FEATURE: Clients may now call GPUSubsystem::SetColorFormat() to
choose the color format of the GPU output, which can be RGB555
(15-bit), RGB666 (18-bit), or RGB888 (24-bit).
- On a special note, the Deposterize filter for 3D textures can now
show its true smoothing effect when clients run the GPU in 24-bit color
mode.
- New behavior for layer-backed views (OpenGL on Mountain Lion and
later, or Metal): Vertical Sync is always enabled.
- New behavior for non-layer-backed views (OpenGL on Lion and earlier):
Vertical Sync is only enabled if frameskip is enabled or if the
execution speed is set to 1x or less while the speed limiter is engaged.
- Remove all associated UI for manually setting Vertical Sync.
- Also add a new menu option to the Tools menu for disabling Metal,
instead forcing display views to use OpenGL. (For developer builds
only.)
- Revert commit abe2e61997. (But retains
the comments about Mario Kart.)
- Partially revert adf682eb23. (But
retains the removal of the LCDC check in ResetDisplayCaptureEnable().)
--rtc-day and --rtc-hour may be used to override the emulated time.
--rtc-day is the day of the week (0-6) and --rtc-hour is the hour (0-23). Time
difference is calculated at emulator start and the emulated RTC then
reports time from the future. This difference is then maintained so that
an hour after the emulator is started means an hour passes according
to the RTC.
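One plausible way to compute such an offset is sketched below. This is not taken from the actual implementation: RTCOffsetSeconds is a hypothetical function, it assumes both options are given, and it assumes the offset is always chosen to be zero or positive so that the emulated clock runs ahead of the host clock.

```cpp
#include <cstdint>

constexpr int64_t kSecondsPerHour = 3600;
constexpr int64_t kSecondsPerDay  = 24 * kSecondsPerHour;
constexpr int64_t kSecondsPerWeek = 7 * kSecondsPerDay;

// Offset (in seconds, 0 to just under one week) to add to the host clock so
// that it lands on the requested weekday and hour. Minutes and seconds are
// left untouched, so the emulated clock keeps ticking in step with the host.
constexpr int64_t RTCOffsetSeconds(int hostDay, int hostHour, int targetDay, int targetHour)
{
	return ((((targetDay - hostDay) * kSecondsPerDay
	        + (targetHour - hostHour) * kSecondsPerHour) % kSecondsPerWeek)
	        + kSecondsPerWeek) % kSecondsPerWeek;
}
```

The offset is computed once at startup and then added to every host time reading, which is what keeps the difference constant while the emulator runs.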
- To note: This fix to Pokemon Black/White does not require
CommonSettings.pokehax to be enabled.
- The CaptureEnable flag is now only read at the start of line 0,
instead of being read directly from the DISPCAPCNT register per line.
In addition, this same state is held all the way through line 192.
- The CaptureEnable flag is now reset at the start of line 192, instead
of near the end of line 191 H-blank. (This is the proper behavior
according to GBATEK.)
- The CaptureEnable flag is now only reset when the VRAM configuration
is LCDC, instead of always being reset. This makes it possible for this
flag to remain set on line 192 if the VRAM configuration is changed to
a non-LCDC configuration.
- CommonSettings.pokehax is now initialized to false.
- Fix a small bug when setting CommonSettings.pokehax via the command
line.
- Fixes custom VRAM reads for OBJ bitmap reads when the read location
doesn’t start at line 0. This behavior is now consistent with how BG
extended layers do it.
- Display views now take the Deposterize filter into account when
determining the direct-to-CPU-filtering state.
- GPUSubsystem now combines the RGB666-to-RGB888 conversions and master
brightness steps into a single postprocessing step.
- Do some minor code cleanup.
- An Apple Metal display view requires macOS 10.11 El Capitan or later,
in addition to a Metal-compatible GPU.
- Apple Metal display views have significantly lower CPU usage than
OpenGL display views.
- OpenGL display views now use a shared fetch object to fetch the emulated GPU framebuffers and store them in shared textures within a shared context. In conjunction with the new double-buffering support from the last commit, this eliminates the copying between the framebuffers and each display view.
- OpenGL display views now use shared HQnx LUT textures, rather than having to initialize and maintain a copy of the LUT textures for each display view.
- OpenGL display views no longer perform any rendering while their associated NSView is hidden, improving the performance of creating new display views.
- OpenGL display views can now DMA directly from pinned-memory both custom-sized framebuffers and CPU-pixel-scaled native-sized framebuffers at the same time.
- Framebuffers are now page-aligned on 4KB boundaries. This is to
improve performance when using the framebuffers directly as pinned AGP
memory.
- Framebuffers are now double-buffered. The target buffer index is now
tracked using the bufferIndex field of NDSDisplayInfo.
- Clients may no longer supply their own buffers to
SetCustomFramebufferSize(). Clients must use the pointers supplied by
NDSDisplayInfo.
- The frameskip flag is now set only on line 0 and remains consistent
for all 192 lines of rendering.
- GPUSubsystem no longer needs a special allocator/deallocator for
itself, so it has been reverted back to a standard C++ new/delete.
- Add a GPUClientFetchObject helper class as an aid to clients that
need to read out the framebuffers. (Should probably move to its own
file at some later date.)
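For the 4KB page alignment of the framebuffers mentioned above, the underlying arithmetic can be sketched like this. The helper names are hypothetical, and the page size is hard-coded to 4096 bytes purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kPageSize = 4096; // 4KB, must be a power of two

// Rounds a buffer size up to the next multiple of the page size.
constexpr size_t RoundUpToPage(size_t byteCount)
{
	return (byteCount + (kPageSize - 1)) & ~(kPageSize - 1);
}

// Checks whether an address sits exactly on a page boundary.
inline bool IsPageAligned(const void *ptr)
{
	return (reinterpret_cast<uintptr_t>(ptr) & (kPageSize - 1)) == 0;
}
```

A buffer whose start address and size both satisfy these checks covers whole pages only, which is generally what pinned-memory transfers want.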
This reverts commit 53c4a27aef.
I forgot that these functions are based on element count, not based on
byte count. Rename “length” to “elementCount” for better clarification.
- Fetching and loading of GPU frame data is now performed as two
separate operations.
- Display windows no longer draw concurrently on background threads;
instead they are updated synchronously.
- Associate the CALayer after the .xib completely loads the NSView for
better compatibility.
- MacOGLDisplayView now creates an NSOpenGLContext instead of a
CGLContextObj, bringing back compatibility with macOS 10.5 Leopard.
- Fix building with the Xcode 3 project.
- Most notably, HandleGPUFrameEndEvent() now sends the entirety of the
NDSDisplayInfo struct to the client.
- The OpenGL blitter now skips the loading, processing and rendering of
disabled screens.
- Begin preparing DisplayView to handle the upcoming Apple Metal
blitter.
- Do some misc. code cleanup and simplification.
- Also disable the 4xBRZ shader for low-tier GPUs. (Testing has shown
that low-tier GPUs have no chance at running this shader in real-time
anyways.)
- Also do some misc. tweaks to other various shaders.
- Move Mac-specific OpenGL code to its own file.
- Eliminate the CocoaDSDisplayDelegate and CocoaDSDisplayVideoDelegate
protocols. Instead, call ClientDisplay3DView class methods directly.
- Fix bug where restored windows would fail to update properly if the
window size would be the same as the one set in user defaults.
(Regression from commit cffc343.)
- Fix bug where changing the rotation to be exactly 180 degrees
different from the old rotation would cause the view to render the
screens with a vertical offset. (Regression from commit cffc343.)
- OGLDisplayLayer respects its own _needUpdateRotationScale and
_needUpdateVertices flags once again, preventing it from repeatedly
uploading already established data to the GPU. (Regression from commit
cffc343.)
- The new Hybrid display orientations show three NDS screens — the top
and bottom screens on the left side, and a larger major screen on the
right side. This feature is intended to better use the widescreen
resolution of most users’ host displays, which are usually 16:9 or
16:10. Three different Hybrid orientations are provided (3:2, 16:9, and
16:10) so that users can choose the display size ratio that they prefer.
- Horizontal display orientation no longer uses the display separation
setting.
- Do some additional code cleanup.
Do note that we need to do this for SoftRasterizer as well, but
SoftRasterizer will need some additional rework on shadow polygon
handling to get all the test cases to work.
Fixes issue #21.
Now that framebuffer sizes can be greater than 256x192, using MSAAx16
is excessive and consumes too much bandwidth. We’re reducing the
maximum sample size of MSAA to 8 since it will significantly reduce
bandwidth consumption at the larger framebuffer sizes while remaining
mostly visually equivalent to MSAAx16.
- Determining the depth write state is now handled purely within the
fragment shader.
- Color and depth write states are now handled more consistently across
fixed-function and fragment shader.
- Force backface culling when drawing shadow polygons.
- Now that Mega Man Zero Collection has proven that it is perfectly
legal to set MASTER_BRIGHT within V-blank, remove the related debug
message.
- Properly silence the texture memory debug message when the
MASTER_BRIGHT intensity is 0.
The --3d-texture-deposterize-enable, --3d-texture-upscale and --3d-texture-smoothing-enable options work independently of the frontend, and so it should be safe to make these options universally available.
Fixes brightness issues in The Legend of Zelda: Spirit Tracks.
(Regression from #ac69f1e.)
The user may still choose to apply master brightness on a
per-framebuffer basis by passing ‘false’ to
GPUEngineBase::SetWillApplyMasterBrightnessPerScanline(). For better
consistency, this behavior has also been changed to only use the master
brightness settings as they were in line 0.
Fix bug where the screen may end up fully black or white if the master
brightness is modified in the middle of the frame. Fixes SourceForge
#1603. (Regression from r5538.)
Moves the Deposterize Textures and Texture Scaling Factor UI items from
OpenGL Options to General Settings in the 3D Rendering Settings panel
and DeSmuME Preferences.
cleanup: move valuearray into guid which was the only place using it. we should probably then move guid into movie later, since it's unlikely anything else will use it.
- Finish refactoring and cleaning up TexCache (now renamed to “TextureCache”) and TexCacheItem (now renamed to “TextureStore”).
- TextureCache items are now evicted based on age and usage instead of arbitrarily.
- The 3D renderers are now responsible for managing the texture unpack buffers instead of relying on the TexCacheItem itself to do it.
- The OpenGL 3D renderer now uses a fixed 4MB buffer for unpacking textures, instead of maintaining extra copies of each unpacked texture in main memory even after they’ve been uploaded to the GPU.
- Rework TexCacheItem::GetTexture() so that instantiating a new object, dumping the packed data, and dumping the palette are performed as separate operations.
- Invalid OpenGL textures are now updated instead of being completely replaced.
- NDSTextureUnpack4x4() now uses the srcIndex pointer parameter instead of recalculating the palette address.
- Delete the now obsolete MemSpan-based texture unpacking functions.
- Texture items in cache are now searched using std::map instead of std::multimap.
- Texture item search keys now ignore the render-specific bits of the texture attributes (repeat mode, flip mode, and coordinate transformation mode bits are ignored). This is to help reduce the number of duplicate textures in the cache.
- Searching a texture and unpacking a texture are now performed as separate operations.
- Texture unpacking functions now use restrict pointers instead of normal pointers.
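As a sketch of the search key masking described above: the bit positions below follow the TEXIMAGE_PARAM layout as documented in GBATEK (repeat and flip in bits 16-19, coordinate transformation mode in bits 30-31). The function name is hypothetical, and the real key may also fold in palette attributes.

```cpp
#include <cstdint>

// Render-specific bits, per the GBATEK TEXIMAGE_PARAM layout (assumed here).
constexpr uint32_t kRepeatFlipBits     = 0x000F0000u; // bits 16-19: repeat S/T, flip S/T
constexpr uint32_t kCoordTransformBits = 0xC0000000u; // bits 30-31: coordinate transformation mode
constexpr uint32_t kRenderSpecificBits = kRepeatFlipBits | kCoordTransformBits;

// Builds a cache search key that ignores the render-specific bits, so that
// textures differing only in how they are sampled share one cache entry.
constexpr uint32_t TextureSearchKey(uint32_t texAttributes)
{
	return texAttributes & ~kRenderSpecificBits;
}
```

Two attribute words that differ only in repeat, flip or transformation mode then produce the same key and resolve to the same unpacked texture.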
- Revert the last resort execution of workFunc in Task::Impl::finish(). Windows now has much better compliance with the behavior of pthread_cond_wait(), so the last resort execution is no longer necessary.
- Add additional checks for workFunc in Task::Impl::execute() and Task::Impl::finish() to make their reentrancy more robust on Windows.
- Add a last resort execution of workFunc in Task::Impl::finish() in the case where taskProc() misses the wake up signal from Task::Impl::execute() when running on Windows.
- EXPERIMENTAL: Revert task.cpp and pthreads.c to what they were back in r5538, but change scond_wait() to explicitly unlock the mutex before calling WaitForSingleObject().
- When shutting down, ensure that the existing task is finished if it's running before continuing with the shutdown process.
- Explicitly declare thunkTaskProc() as static.
- If a GPU engine is disabled or has master brightness at full intensity, fill the output framebuffer on line 191 instead of on line 0.
- Replace global variable Render3DFramesPerSecond with accessor method GPUSubsystem::GetFPSRender3D().
- Factor the generic colorspace handling routines out of GPU.cpp/GPU.h into their own separate files.
- Add vectorized routines using AVX2 and AltiVec.
- Fix bug where the OBJ layer wasn’t doing the window test. Fixes graphical issues in Mario Kart DS. (Regression from r5515. Fixes bug #1572 and #1574.)
- The NOWINDOWSENABLEDHINT template parameter is no longer an optional hint; it is now required functionality. It has been renamed to WILLPERFORMWINDOWTEST to reflect this change.
- Window testing is now a per-scanline operation instead of a per-pixel operation. Removes the performance penalty of window testing at larger framebuffer sizes.
- In the OpenGL blitter, replace some calls to glBufferSubDataARB() with glMapBufferARB(). This, maybe, possibly, fixes an intermittent crash that can occur with the Intel HD Graphics 3000 OpenGL driver.
- Move towards completing support for changing the output framebuffer color format to RGB666 or RGB888. Significantly increases the generated code size, but this is necessary for performance. (Related to r5433. This rework is still incomplete.)
- Parse and cache the WININ and WINOUT registers, instead of using them directly.
- Parse and cache the Target1 bits of the BLDCNT register.
- Remove some template parameters which are now suspected to no longer improve performance, most notably LAYERID. Should significantly reduce the generated code size.
- Do a tiny optimization for GPUEngineBase::_RenderPixel16_SSE2().
- Fix a bug where the 3D layer would fail to draw correctly on non-SSE2 systems if the output framebuffer’s color format is RGB666 or RGB888. (Regression from r5492.)
- Do some minor code cleanup.
- 2D layer compositing now supports RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- Fix a couple of bugs in GPUEngineBase::_ColorEffectBlend3D() when dealing with RGBA6665 or RGBA8888 color formats.
- Establish some assumptions about what the 3D layer’s color format will be with respect to the output framebuffer’s color format. This is being done in order to simplify the code.
- The new rules are as follows: If the output framebuffer’s color format is RGB666 or RGB888, then the 3D layer’s color format will be RGBA6665 and RGBA8888, respectively. If the output framebuffer’s color format is RGB555, then the 3D layer’s color format will be RGBA6665.
- 3D layer compositing now supports RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- Fix a bug in GPUEngineBase::_ColorEffectBlend3D() where variables were left undefined when the source and destination color formats were mismatched.
- Partially fix a bug with affine and extended BG layers on big-endian systems. Such layers that perform rotation or scaling aren’t fixed yet.
- Loosen a restriction on taking the faster code path in GPUEngineBase::_RenderPixelIterate_Final().
- Silence a compiler warning on non-SSE2 systems.
- Fix a bug where if both flipping and colorspace conversion occur on the CPU, then the 3D framebuffer would flush incorrectly. (Regression from r5455.)
- Nope! Apparently, GPUEngineBase::_RenderPixel_CheckWindows16_SSE2() does need to be forced inline, or else performance will drop! (Regression from r5485.)
- Fix builds that were broken due to new libretro-common API additions. (Regression from r5398.)
- KNOWN REGRESSION: In order to hasten the process of restoring the ability to build the Linux ports, the additional command-line options that are available in the Linux ports have been disabled. Maybe someone else can restore their functionality.
- Once again, tell the 3D renderer which framebuffers need to be flushed per frame so that we can avoid flushing unneeded framebuffers. This fixes a performance regression with many 3D games. (Regression from r5383.)
- Include SSSE3 versions for unpacking the following texture types: I2, I4, and A5I3.
- As a side-effect of working on these optimizations, the SSE2 versions of ConvertColor555To6665Opaque() and ConvertColor555To8888Opaque() are now a little faster.
- Remove GPUEngineBase::_RenderPixel_CheckWindows8_SSE2() and GPUEngineBase::_RenderPixel8_SSE2(). I don’t see us ever needing to use these methods in the future.
- Replace patterns of por(pand,pandn) with pblendvb where appropriate. (Requires SSE4.1)
- The need to read the 3D framebuffer is now checked on a per-line basis instead of solely at line 0. Once more, this fixes the map rendering in Advance Wars: Dual Strike during some conversations. (Regression from r5429.)
- Remove duplicate lookup table.
- Better optimize Render3D_SSE2::ClearFramebuffer(). Should improve performance for games that do their clears using image buffers.
- GPUEngineBase::_ColorEffectBlend() now supports RGB666 and RGB888 color formats.
- Use some SSSE3-specific optimizations in GPUEngineA::_RenderLine_DispCapture_BlendFunc_SSE2().
- Do some minor cleanup.
- Continue rework towards supporting RGB666 and RGB888 color formats. (Related to r5433. This rework is still incomplete.)
- More basic blending methods now support RGB666 and RGB888 color formats.
- Don’t reset some sprite-related state buffers if the OBJ layer is disabled.
- Replace instances of std::min() with ternary operators.
- Better optimize SSE2 versions of ConvertColor8888To5551() and ConvertColor6665To5551().
- Use some SSSE3-specific optimizations in GPUEngineBase::_ColorEffectBlend() and GPUEngineBase::_ColorEffectBlend3D().
- Fix some compiling issues with some SSE2 color conversion functions on older compilers.
- Fix a performance issue where if the status bar is hidden while Vertical Sync is enabled, then status text updates will cause a severe slowdown due to conflicting vertical syncs. (Fixed by setting the ‘hidden’ flag of the statusText control to YES while the status bar is hidden.)
- Revert a change in setting the fog render bit for translucent fragments. Fixes the appearance of the Air Robo GP in Solatorobo: Red the Hunter. (Regression from r5464.)
- In the OpenGL blitter, only allow source filters (such as Deposterize) to run on native-sized framebuffers. This is being done since the visual impact on custom-sized framebuffers, even those at 2x size, is not enough to warrant the additional GPU load. This behavior is now consistent with the pixel scalers, which only run on native-sized framebuffers and not on custom-sized framebuffers.
- Fix a bug in the OpenGL blitter where the Deposterize filter wouldn’t run if the pixel scaler was set to None.
- Add 555-to-6665 opaque color conversion.
- Add UNALIGNED switch to 555-to-8888, 555-to-6665, 8888-to-5551, and 6665-to-5551 color buffer conversion functions, allowing clients to inform these functions that the incoming buffer pointers may not be 16-byte aligned.
- Rendered lines from GPUEngineBase::_HandleDisplayModeOff(), GPUEngineA::_HandleDisplayModeVRAM(), and GPUEngineA::_HandleDisplayModeMainMemory() now output colors with the alpha bits filled in. This is working towards a time when clients that work directly in 16-bit and 32-bit colorspaces don’t have to fill in the alpha bits themselves.
- Unify more color conversion code.
- In the SSE2 version of ConvertColor555To8888Opaque(), change the algorithm to use computation instead of memory lookups. Although memory lookups are faster on newer CPUs, computation is much faster on older CPUs, which have smaller caches and longer memory latencies. I believe this is the correct decision, since older CPUs are the ones that need as much performance as they can get.
- Fix compiling on Windows due to new color conversion code. (Regression from r5455.)
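The computation-based approach to ConvertColor555To8888Opaque() described above can be illustrated with a scalar sketch. This is not the SSE2 code from the source tree. The function names, the use of bit replication to expand 5 bits to 8, and the output layout (red in the low byte, alpha in the high byte) are all assumptions made for illustration.

```cpp
#include <cstdint>

// Expand a 5-bit channel to 8 bits by replicating its top bits into the
// low bits, so that 0 maps to 0 and 31 maps to 255.
constexpr uint32_t Expand5To8Sketch(uint32_t c5)
{
    return (c5 << 3) | (c5 >> 2);
}

// Hypothetical scalar helper (assumed name and layout): converts one 15-bit
// color (red in bits 0-4, green in bits 5-9, blue in bits 10-14) into a
// packed 32-bit value with a fully opaque alpha byte. Bit 15 is ignored.
constexpr uint32_t Color555To8888OpaqueSketch(uint16_t src)
{
    return  Expand5To8Sketch( src        & 0x1F)        |
           (Expand5To8Sketch((src >>  5) & 0x1F) <<  8) |
           (Expand5To8Sketch((src >> 10) & 0x1F) << 16) |
            0xFF000000u;
}
```

Because the conversion is pure arithmetic, it touches no table in memory, which is the property the entry above credits for the speedup on CPUs with small caches.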
GPU:
- The SSE2 version of ConvertColor555To8888Opaque() now uses memory lookups instead of computing each value.
- Add color 555 to 8888-opaque conversions.
- In the new color buffer conversion functions, change the FragmentColor data types to u32. (Related to r5455.)
- Unify all colorspace conversion code.
- Fix bug with VRAM-to-VRAM capture.
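For comparison, the memory-lookup approach to 555-to-8888 opaque conversion mentioned in this group can be sketched as a table that is built once and then indexed per pixel. The function names, the table layout, and the output byte order (red in the low byte) are hypothetical and are not taken from the source tree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build all 32768 possible results once (assumed helper, not the real API).
std::vector<uint32_t> BuildColor555To8888OpaqueTableSketch()
{
    std::vector<uint32_t> table(32768);
    for (uint32_t i = 0; i < 32768; i++)
    {
        const uint32_t r = i & 0x1F;
        const uint32_t g = (i >> 5) & 0x1F;
        const uint32_t b = (i >> 10) & 0x1F;
        // Expand each 5-bit channel to 8 bits and force alpha to opaque.
        table[i] =  ((r << 3) | (r >> 2))        |
                   (((g << 3) | (g >> 2)) <<  8) |
                   (((b << 3) | (b >> 2)) << 16) |
                    0xFF000000u;
    }
    return table;
}

// Convert a buffer with one indexed load per pixel. Bit 15 of each source
// pixel is masked off so the index always stays inside the table.
void ConvertBuffer555To8888OpaqueSketch(const std::vector<uint32_t> &table,
                                        const uint16_t *src, uint32_t *dst,
                                        size_t pixelCount)
{
    assert(table.size() == 32768);
    for (size_t i = 0; i < pixelCount; i++)
    {
        dst[i] = table[src[i] & 0x7FFF];
    }
}
```

The table holds 32768 four-byte entries (128 KB), so how well it performs depends on how much of it stays in cache.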
OpenGL Renderer:
- Try to fix a possible bug with applying fog to transparent fragments.
- Texture sampling now works with bilinear filtering, mipmapping, and anisotropic filtering! These texture smoothing features can be used by enabling the new CommonSettings.GFX3D_Renderer_TextureSmoothing flag.
- The framebuffer pointers in NDSDisplayInfo are no longer assumed to be 16-bits per pixel in size. This is being done now in preparation for higher color depth processing. (This feature is not yet implemented.)
- Instead, clients should be reading NDSDisplayInfo.colorFormat to determine the color format of the framebuffers. NDSDisplayInfo.pixelBytes is a convenience field that reports the number of bytes per pixel (either 2 or 4 bytes).
- By default, the framebuffers will continue to be in 16-bit BGR555_Rev format for backwards compatibility.
- Textures can now be automatically upscaled using the xBRZ filter. Textures can be upscaled to 2x or 4x.
- Textures can now be smoothed using a deposterization filter. This can be helpful in smoothing some of the hard color banding that sometimes occurs with xBRZ.
- Only flush the 3D rendering buffers and update the rendering properties if the frame is not skipped.
- Be more accurate when using callbacks for DidRender3DBegin and DidRender3DEnd.
- Make the 3D rendering stage more multithreading friendly.
- HACK: Drop the acknowledgment bits when writing the DISP3DCNT register. Fixes the title screen in “Planet Rescue: Animal Emergency”. (Regression from r5259. Fixes bug #1538.)
- Custom rendering is now determined on a per-scanline basis rather than on a per-framebuffer basis. This greatly improves rendering accuracy and fixes any remaining graphical glitches associated with rendering at custom sizes.
- Fix crashing bug that can occur if BMPAddress maps exactly to the head of the custom VRAM blank region, such as in Hotel Dusk: Room 215. (Regression from r5366.)
- Do some code cleanup.
- Fix crashing bug that can occur if BMPAddress maps into the custom VRAM blank region. (Regression from r5366.)
- Fix bug where a 128-width display capture would actually perform a 256-width capture in custom VRAM. (Regression from r5243.)
- Fix bug where if the display mode is Off or MainMemory, then the destination buffer may not always be the native buffer.
- Remove VRAM display mode’s dependence on the isCustomRenderingNeeded flag.
- Fix possible memory corruption with display capture, at the cost of some performance. (Regression from r5243.)
- Add a couple more rules for determining if the 3D framebuffer will be read directly for display capture.
- Keep track of render states that are updated while rendering, even when the frame isn’t rendered.
- Use the proper address when reading custom VRAM during a BG layer affine extended direct render. Fixes the pencil drawing background in the title screen of Super Mario 64 DS when rendering at a custom resolution.
- Fix bug where if converting the framebuffer on GPU is not supported, but PBO is still supported, then the resulting framebuffer would be flipped with incorrect colors. (Regression from r5359.)
- Read back the pixels in RGBA format instead of BGRA on OpenGL 3.2 devices, since such devices should natively support that type of pixel transfer.
- By default, do not create a separate RGBA6665 buffer for rendering. Instead, directly render to GPUEngineA’s RGBA6665 buffer.
- SoftRasterizer no longer needs to flush the RGBA6665 buffer now that it is rendered to directly.
- Fix the OpenGL renderer’s RGBA5551 buffer flushing on big-endian systems.
- Change the HUD font from Source Sans Pro Semibold to Source Sans Pro Bold.
- HUD text rendering is now more crisp and handles scaling better.
- HUD objects are now clamped to a minimum size.
- HUD objects now scale with the display window instead of remaining at a fixed size. Scaling is linear up to 2x, and then logarithmic up to 3x.
- HUD text now looks sharper on Retina displays.
- Fix bug where restoring full screen windows on startup would fail. (Regression from r5349.)
- Fix bug where the dock would fail to reappear when the last window exited full screen mode. (Regression from r5349.)
- Disable logging when EXPERIMENTAL_WIFI_COMM is disabled.
- Now that Nintendo has discontinued their WFC service, we will no longer block users from trying to connect to it.
- In the OpenGL blitter, use DMA texture uploads for all possible video source cases. Doing this removes a longstanding MAJOR performance bottleneck.
- Native-sized video sees up to a 15% performance improvement, while higher-resolution video can see up to a 100% performance improvement!
- Return to using Snow Leopard style Audio Components. Requires building with Xcode v7.2 or later, since Xcode v7.0 and v7.1 have bugs that will cause AudioUnits to crash. (Related to r5280.)
- Remove the other reference to the DISPCNT.BG0_Enable flag for determining when 3D rendering is enabled. Fixes minimap rendering at custom resolutions in Advance Wars: Dual Strike during some conversations. (Related to r5334.)
- Fix bug where 3D layers still needed to be rendered even when the DISPCNT.BG0_Enable flag is disabled. Fixes minimap rendering in Advance Wars: Dual Strike during some conversations. (Regression from r5255.)
- Begin the process of applying SSE2 optimizations to BG layer compositing.
- In this revision, only Text mode layers use the new SSE2 optimizations. Other BG layer modes have yet to be implemented.
- Replace _mm_set1_epi64x() with _mm_set1_epi32() where appropriate.
- Complete GPUEngineBase::_RenderPixel_SSE2() method.
- Fix potential bug with window checks in GPUEngineBase::_RenderPixel3D_SSE2().
- Do some minor code cleanup.
- Do SSE2 optimization when compositing the 3D layer.
- Add SSE2 optimized version of GPUEngineBase::_RenderPixel() for future use (currently inactive).
- Explicitly make the Render3D class allocate itself with a cache-aligned base pointer. Fixes SSE2-related alignment crashes with OS/compiler combinations that don’t 16-byte align the base pointer for you.
- When clicking one of the Save Settings as Default buttons in one of the settings panels, force the user defaults file to synchronize immediately. This fixes updating the user defaults file on OS X v10.11 El Capitan.
- Fix bug where video settings wouldn’t update immediately while the emulation is paused. (Regression from r5310.)
- Fix bug where if a ROM is unloaded, the previous video frame would remain instead of blacking out as intended. (Regression from r5310.)
- Fix bug where using Frame Jump or executing the emulation faster than 1.00x would cause the execution speed to be limited by Vertical Sync.
- Do some code cleanup on CocoaDSOutput.
- Expand the text box further when the RTC is shown.
- Fix writing to the sub engine’s MASTER_BRIGHT register. Fixes the touch screen display output for “Pirates of the Caribbean: At World’s End”. (Regression from r5261.)
- Account for the fact that extended palette mappings can change independently of the BGnCNT register. Fixes the BG3 layer in Phoenix Wright: Ace Attorney. (Regression from r5286.)
- Fix a bunch of graphical corruption regressions on big-endian systems.
- Also fix rotation/scale sprite colors and the 3D clear color on big-endian systems.
- In the Support Request Form and Bug Report Form, update the reported configuration to reflect the current 3D rendering features.
- Do some minor code cleanup.
- Auto-resolving the native framebuffer is now only performed if the frame isn’t skipped.
- Add some callback routines for the beginning and ending of rendering a frame, and for the beginning and ending of rendering the 3D layer.
- Begin unifying pixel rendering. Rendering the BG and OBJ layers now uses the same method.
- Pass the destination buffer pointer and line index by means of function parameters, instead of using object variables.
- Rendering a BG layer (for debugging purposes) is now completely handled in the core code.
- Do some other code cleanup.
- Clearing to the backdrop color has been changed from a pixel operation to a scanline operation.
- Clearing to black when the GPU engine is disabled has been changed from a scanline operation to a framebuffer operation.
- Applying the master brightness has been changed from a scanline operation to a framebuffer operation.
- Resetting the BGnX and BGnY registers now occurs at the end of line 191 instead of at the start of line 0.
- Per zeromus’ suggestion, remove GetNativeFramebuffer() and GetCustomFramebuffer() from the GPUSubsystem class. Users must parse the NDSDisplayInfo struct returned from GetDisplayInfo() instead.
- Per zeromus’ suggestion, rename Get/SetWillAutoBlitNativeToCustomBuffer() to Get/SetWillAutoResolveToCustomBuffer().
- Add some more notes to the NDSDisplayInfo struct to help clarify the meaning of each field.
- Fix bug where 3D rendering may not always finish on line 0, causing lingering 3D artifacts in certain games. Now it is always forced to finish. (Regression from r5255.)
- Bring back the backdrop clearing optimization from r5198 when rendering in the native resolution.
- Do some minor code cleanup.
- Fix possible crash when doing a direct-color sprite render due to aligned access, since incoming sprite coordinates can cause access to become unaligned. (Regression from r5256.)
- Do SSE2 optimization for direct-color sprite renders.
- Make ARM9_LCD cache-aligned. Allows for SSE2 to perform aligned load/stores on certain operations, improving performance.
- Further templatize some methods.
- Do some misc. code cleanup.
- Do heavy code cleanup.
- Split the engine-specific functionality of the main and sub engines into the new GPUEngineA and GPUEngineB subclasses.
- Templatize some parameters. Greatly increases the generated code size, but restores (and possibly improves) performance from r5251.
- Be smarter about manually inlining functions. Greatly reduces the generated code size, and fixes making optimized builds on MSVC. (Regression from r5248.)
- This change may affect performance. This will need additional testing.
- Add support for handling combination native/custom rendering sizes.
- As a side-effect of supporting this feature, pixel scalers now work as intended when high-resolution rendering is enabled (but only if the incoming display framebuffer is at the native size).
- Finish support for combination native/custom rendering sizes. Can give a significant performance improvement when running the GPUs at a custom size, but only for frontends that support this feature.
- Clean up and optimize OAM attributes handling. (Special thanks to Twinaphex from libretro for pointing this out to us.)
- Add SSE2 optimizations to display capture operations.
- Do a whole bunch more code cleanup.
- Don’t evict the texture cache in the middle of geometry rendering! Fixes app crashing with games like Advance Wars: Days of Ruin that actually need to evict the texture cache. (Regression from r5175.)
- Do some minor code cleanup.
- Revert r5176 until polygon IDs can be handled correctly in one go. Fixes missing polygon issues in certain games such as missing rings in Sonic Chronicles: The Dark Brotherhood and missing loop traces in the Pokemon Ranger: Shadows of Almia title screen. (Addresses one of the issues noted in bug #1253.)
- Change toon highlight blending to match SoftRasterizer. Fixes the “Shadows of Almia” logo in the Pokemon Ranger: Shadows of Almia title screen. (Addresses one of the issues noted in bug #1253.)
- Revert the SSE2 bit shift optimizations that were done in r5216. Fixes a regression related to fog, as well as a regression that caused a flickering problem in the title screen of Pokemon Ranger: Shadows of Almia. (Fixes bug #1487.)
- In addition to the UI controls in the Show Video Settings panel, also add the “Use Vertical Sync” and “Run Filters on GPU” options to the View menu.
- Disable UI controls for Depth Comparison Threshold, since the setting is now obsolete. (Will need to delete UI controls before release.)
- Also add HQ3x/HQ3xS filters to the Pixel Scaler menu in Display Preferences.
- Fix bug where the HQ3x/HQ3xS filters running on the GPU sometimes wouldn’t draw correctly.
- Add new malloc_alignedN() functions for easier dynamic allocation of aligned memory blocks.
- Rework buffer allocations using the new malloc_alignedN() functions.
- To enable SSSE3, also require ENABLE_SSE2 and ENABLE_SSE3.
- Add some more SSE2/SSSE3 optimizations.
- CACHE_ALIGN and malloc_alignedCacheLine() now set 64 byte alignment on 64-bit systems.
- Do a bunch more code cleanup.
- Fix compiling for CLI port. (Regression from r5198.)
- Fix issue with GTK port where the video output framebuffer wasn't getting cleared on reset. (Regression from r5198.)
- Fix bug where the depth LUT wasn’t being generated correctly, causing the clear image depth buffer to malfunction. (Regression from r5187.)
- In SoftRasterizer, obsolete GFX3D_Zelda_Shadow_Depth_Hack for depth-equals tests. We’re now using a fixed tolerance of +/-0x200, according to GBATEK.
- In SoftRasterizer, z-depth is now calculated using the depth LUT instead of with << 9. This spreads the depth value more evenly across the range of [0 - 0x00FFFFFF]. This change will need additional testing.
- Do some small optimizations to SoftRasterizer.
- Do more code cleanup.
- In SoftRasterizer, do multithreading optimization for the fog and edge mark pass. This involved a change to the edge marking algorithm, so this will need additional testing.
- Fix bug where a SoftRasterizer renderer object wouldn’t get destroyed properly. (Regression from r5187.)
- Fix bug where the user wasn’t able to switch between different threaded versions of SoftRasterizer. (Regression from r5187.)
- Fix a potential bug that might occur if an OpenGL renderer object failed to create. (Regression from r5188.)
- Upload the toon table through the render state UBO instead of through a 1D texture. (OpenGL 3.2 only.)
- Small optimization in the edge mark fragment shader.
- Fix compiling issues. (Regression from r5162. Fixes bug #1468.)
- Fix crashing issue when selecting the OpenGL 3.2 renderer. Failure to init should now fall back properly. (Regression from r5180. Fixes bug #1470.)
- In the OpenGL renderer, do better handling of the geometry index buffer, and also load its data in OpenGLRenderer::BeginRender().
- In the OpenGL renderer, remove a bunch of extraneous binds.
- Fix bug in SoftRasterizer where clear-image depth wasn’t being written correctly. (Regression from r5176.)
- Reduce the buffer sizes in the core 3D engine.
- Do even more refactoring.
- In the OpenGL renderer, optimize framebuffer clearing (OpenGL v3.2 only).
- In SoftRasterizer, multithread the rendering state setup (requires at least 4 threads).
- Do more code refactoring.
- In the OpenGL renderer, optimize edge marking performance.
- In SoftRasterizer, fix a bug where edge marking and fog weren’t being drawn if multithreading was off.
- In the 3D Rendering Settings panel in the DeSmuME Preferences view, move the Enable Edge Marking and Enable Fog checkboxes to the General Settings section.
- Update tooltips to reflect the new behavior.
- Bring back vertex draw batching from r4522, fixing the bug that caused it to fail on Metroid Prime Hunters. This gives a small performance improvement for users with older drivers.
- When doing the depth buffer calculation, clamp the depth value to GL_DEPTH_RANGE {0.0, 1.0} in the fragment shader itself. Fixes 3D rendering on older drivers that won’t do the clamp for you. (Regression from r5133.)
- Revert depth buffer calculation change in r5133 for the z-buffer mode. Keep the w-buffer mode change, since that’s the one that works. Fixes the buttons in Blazer Drive. (Regression from r5133.)
- Fix texture coloring bugs with 4x4 compressed textures on big-endian systems. (This should be the last of the texture coloring bugs.)
- Do small optimization to 4x4 compressed texture conversion.
- Do some minor code cleanup.
- Fix bug when using dual display mode with a screen separation where the displays could mistakenly draw ghost lines at the top or bottom of each screen. (Partially addresses bug #1435.)
- Fix compiling when using Xcode 3.
- Don’t set the output frame size with multiple glViewport() calls per frame. Just set it once for the output.
- Do some minor code cleanup.
- Fix bug where the user can force activate the mic while the emulator is idle by manipulating its mute control.
- Fix bug where the mic icon remains black if the emulator resets while in execute.
- Further optimize mic icon updates.
- Fix crashing bug where the app would crash if the SaveRam path is invalid or does not allow for read/write access. (Fixes bugs #1394 and #1426.)
- New behavior: If the SaveRam path is invalid or does not allow for read/write access, warn the user. After the warning, continue emulation as normal.
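The SoftRasterizer entry earlier in this group replaces GFX3D_Zelda_Shadow_Depth_Hack with a fixed +/-0x200 tolerance for depth-equals tests. A minimal sketch of such a comparison follows. The function name is hypothetical, and treating the bounds as inclusive is an assumption, since the entry does not say.

```cpp
#include <cstdint>

// Hypothetical helper: the depth-equals test passes when the incoming depth
// lies within +/-0x200 of the depth already stored in the buffer.
// Inclusive bounds are an assumption made for this sketch.
constexpr bool DepthEqualsWithToleranceSketch(uint32_t newDepth, uint32_t storedDepth)
{
    // Subtract the smaller value from the larger one so the unsigned
    // arithmetic can never wrap around.
    return (newDepth >= storedDepth) ? ((newDepth - storedDepth) <= 0x200u)
                                     : ((storedDepth - newDepth) <= 0x200u);
}
```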
mc.cpp:
- Allow backup memory to operate inside RAM if read/write file access is unavailable.
- Update UI tooltips to reflect current knowledge of DeSmuME’s behavior.
- Move the 3D Rendering Settings tab in DeSmuME Preferences from Display to Emulation.
- Fix bug where the mic level value wasn’t being displayed correctly on OS X v10.5 Leopard.
- Remember the sound output volume setting between app launches.
- Add the Microphone Settings panel for easier control and more verbose info. The new panel also allows the user to change and monitor mic settings without needing a display window's status bar.
- Implement a more proper mic level UI that reports the level per frame instead of per sample. This should improve UI performance.
- Make all methods in the CocoaDSControllerDelegate protocol optional and remove all related protocol methods from the OpenEmu plug-in.
- Try attaching a new hardware input device on startup.
- Remove debug printf stuff when attaching a new hardware input device. The new Microphone Settings panel makes the extra printf stuff unnecessary.
- Remember the hardware mic mute setting between app launches.
- Fix some UI bugs with mic level checks where the mic status icon wasn't showing the correct color under certain conditions.
- Fix UI bug where the mic mute control wasn't being respected under certain conditions.
- The mic status icon now includes a tooltip that reports the name and sample rate of the current input device.
- Handle mic hardware state changes more gracefully when the input device is changed externally.
- Add full support for using hardware microphones on the host machine for emulating the NDS microphone. Finally, mic driven games, such as Nintendogs, are fully playable on the Mac!
- Display windows no longer include an output volume slider directly in the status bar. Instead, the slider has been moved inside a popup button, which now behaves just like OS X's volume menu. This was done for better space efficiency.
- Display windows now include a microphone icon alongside the output volume icon. Like the output volume, there is a slider control to adjust microphone gain and also a mute control.
- The microphone icon changes color depending on the microphone's state (as seen by the NDS, not the host).
- Replace the existing microphone icon with one that looks better and is more modern.
- Adjust the size of the output volume icon from 20x20 pixels to 16x16 pixels.
- Fix yet another font rendering bug in the Input Profile viewer on OS X Yosemite.
- Completely revamp the ROM Info panel to have a more modern and mainstream look and feel.
- The ROM Info panel can now be resized and scrolled through.
- Each individual info section in the ROM Info panel can now be expanded or collapsed.
- Fix bugs where the ROM capacity and ROM used capacity info weren't being calculated correctly.
- Do a bunch of random UI text clipping fixes when running on OS X Yosemite.
- Do a major revamp of the File Migration Assistant. It has been renamed "Game Data Migration Assistant". All following notes will pertain to the new Game Data Migration Assistant.
- Files no longer appear in a single list. They are now organized by app version and file type.
- Users can now select multiple files at once by clicking the checkbox of their corresponding app version or file type.
- File selection is much smarter. If the same file exists in multiple versions and the user selects one of them, all other versions of that file are automatically deselected. This also works in multiple selection cases.
- Remove the Select All and Select None buttons. With the smarter selection UI, these buttons are no longer necessary.
- Provide better user feedback when no files need to be migrated.
- Rework the outline view to be more space efficient.
- Fix bug where switching between CPU-based and GPU-based filters in the DeSmuME Preferences display preview would sometimes fail.
- Fix bug where the DeSmuME Preferences display preview would show incorrect colors when using a CPU-based filter on PowerPC Macs.
Video Filters:
- Fix bug where the Scanline filter would show incorrect colors on big-endian systems.
- Fix some compiling issues if the C++ standard library is set to libc++ w/ C++11 support instead of libstdc++.
- Fix some compiling issues if compiling on OS X Leopard w/ Xcode 3.1.4.
- Fix the behavior of the Display Preferences filter preview.
- Fix an intermittent crash that sometimes occurs when creating a new display window.
- Fix a rare and mysterious crashing bug that sometimes occurs when initializing the HQ4x LUT.
- Fix a longstanding bug where audio frames were accidentally getting dropped when using N-sync and Z-sync methods. Greatly improves the audio quality of the N-sync method.
- Now that N-sync actually works as intended, it is now the default sync method. (N-sync has much better latency compared to the other sync methods, especially compared to P-sync, which was the previous default.)
- Update sync method tooltips to better reflect their actual behavior.
- Fix UI bug where the Advanced SPU Logic control text would get truncated on OS X Yosemite.
- Store the HQnx LUTs on the heap instead of on the stack. Fixes app builds from the Xcode 3 project, where the default stack size is smaller than when using the latest Xcode. (Regression from r5087.)
- Update Xcode 3 project so that builds actually work. (Regression from r5070.)
- Add shader-based equivalents to the following pixel scalers: 2xBRZ, 3xBRZ, 4xBRZ, 5xBRZ. (And yes, these are exact GLSL ports of Zenju's xBRZ scalers, not Hyllian's xBR scalers. These shaders are very demanding on your GPU, so users with older GPUs may want to continue using the CPU-based versions instead.)
- Add a preliminary GPU tiering system to help detect GPU capabilities and allow for better optimizations to be used on newer GPUs.
- Do some optimizations to the following shaders: Bicubic B-Spline, Bicubic Mitchell-Netravali, Lanczos3, EPX.
- Change the shader-based EPX+ color comparisons to be more true to the original CPU-based algorithm.
- Improve color blending on the Deposterize shader.
- Fix possible invalid memory access crashes when Y-sorting, most notably, in Super Mario 64 adventure mode. Using std::stable_sort() instead of std::sort() should have little to no performance impact since we're not sorting a lot of elements here. (Regression from r2436.)
- Initialize the HQnx LUTs only once, instead of doing it per display window.
- Fix issue where the HQnx LUT init code was causing extremely long compile times. (Regression from r5087.)
- Added CPU mutex functions gdbstub_mutex_init/destroy/lock/unlock, which govern access to NDS_ARM9 and NDS_ARM7 structs.
- Added locking and unlocking of the mutex to gdbstub.cpp/processPacket_gdb() and NDSSystem.cpp/NDS_exec()
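The Y-sorting entry above switches from std::sort() to std::stable_sort(). The sketch below shows the general pattern with a hypothetical polygon record; the struct, field names, and comparator are assumptions and do not mirror the real data structures. It only demonstrates that std::stable_sort() keeps elements with equal keys in their original order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a polygon entry that gets Y-sorted.
struct PolySketch
{
    float sortKey;    // assumed per-polygon Y value used for ordering
    int submitOrder;  // position in which the polygon was submitted
};

// Strict "less than" on the key only; polygons with equal keys compare equivalent.
static bool ComparePolySketch(const PolySketch &a, const PolySketch &b)
{
    return a.sortKey < b.sortKey;
}

void YSortSketch(std::vector<PolySketch> &polys)
{
    // std::stable_sort keeps equal-key polygons in submission order.
    std::stable_sort(polys.begin(), polys.end(), ComparePolySketch);
    assert(std::is_sorted(polys.begin(), polys.end(), ComparePolySketch));
}
```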
Cocoa, CLI, GTK, Windows ports:
- Added mutex initialization and destruction to main() functions (cocoa/cocoa_core.mm, cli/main.cpp, gtk/main.cpp, windows/main.cpp)
- Have video from CPU-based pixel scalers transfer to GPU via DMA. Should improve performance on pixel scalers with large scaling sizes, such as HQ4xS and 5xBRZ.
- Add method VideoFilter::SetDstBufferPtr() - allows users to use their own destination buffer instead of having to use the VideoFilter object's internal buffer.
- Delete the Legacy Cocoa port. (Not only was the Tiger build broken in several ways, but all features from the Legacy port have been subsumed into the main Cocoa port now. Therefore, the Legacy port is no longer necessary.)
- Remove the "Xcode 4" and "Xcode 5" project files. These files have been superseded by the one project file "Xcode (Latest)".
- Do a massive cleanup of the #include and header structure.
- Remove a lot of unnecessary dependencies in the headers.
- Make headers responsible for including what they need for themselves. This makes the headers more independent of where they are in the #include order.
- Relocate some structs/classes to more logical locations.
- Clean up some platform-specific #ifdef stuff.
- Add a new developer-oriented build scheme called "OS X App; dev+" to the Xcode4 and Xcode5 projects.
- Add preliminary GDB stub support to the dev+ build. (Use the menu option Tools > Show GDB Stub Control.)
GDB Stub:
- Do some minor cleanup on the GDB stub init code.
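The GDB stub entries above add a mutex that governs access to the NDS_ARM9 and NDS_ARM7 structs from both NDS_exec() and processPacket_gdb(). The pattern can be sketched with std::mutex. This is an illustration only: the real code uses the gdbstub_mutex_init/destroy/lock/unlock functions, and every name below is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for CPU state shared by the emulation loop and the stub.
struct CpuStateSketch
{
    uint32_t regs[16] = {};
    std::mutex lock;
};

// Emulation side: mutate the CPU state only while holding the lock.
void ExecStepSketch(CpuStateSketch &cpu)
{
    std::lock_guard<std::mutex> guard(cpu.lock);
    cpu.regs[15] += 4; // advance the program counter by one 4-byte instruction
}

// Debugger side: read a register under the same lock, so that a packet
// handler never observes the state in the middle of an emulation step.
uint32_t ReadRegisterSketch(CpuStateSketch &cpu, size_t regIndex)
{
    assert(regIndex < 16);
    std::lock_guard<std::mutex> guard(cpu.lock);
    return cpu.regs[regIndex];
}
```

std::lock_guard releases the lock on every exit path, which removes the need for a manual unlock call in this sketch.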
---
add system in EMUFILE_FILE to switch correctly between read/write modes; add system in EMUFILE_FILE to (optionally) track current file position and avoid redundant fseeks - this code is UNTESTED.
---
add better console feedback when gbagame .sav is being scanned, to make developers less likely to think the emulator is frozen
- Fix bug where video filters weren't preferring to use the GPU by default.
- Display threads now each pull a copy of the video frame from the emulation thread, rather than the emulation thread pushing copies of the video frame to each display thread. (Slight performance improvement when many display threads are used.)
- Do a huge refactor of the display code.
- Add support for shader-based filters.
- New feature: The display pipeline has been separated into three parts - Source --> Pixel Scale --> Output. Different sets of filters may be applied to each part of the pipeline.
- Add the following source filters: Deposterize
- Add the following output filters: Bicubic (B-Spline), Bicubic (Mitchell-Netravali), Lanczos2, Lanczos3.
- Add shader-based equivalents to the following pixel scalers: Nearest 2x, Scanline, EPX, EPX+, Super Eagle, 2xSaI, Super 2xSaI. These will be used instead of the CPU-based scalers if "Run filters on GPU if possible" is enabled (default is enabled).
- Remove the following pixel scalers from the UI: Nearest 1.5x, Nearest+ 1.5x, Bilinear 2x, EPX 1.5x, EPX+ 1.5x. These pixel scalers are no longer necessary because the display view is automatically sized to the window. Also, with the new output filters, running a similar pixel scaler along with an output filter will always yield superior results.
- When using multiple threads, ensure that all lines are accounted for when the line count isn't evenly divisible by the thread count.
- Add static method VideoFilter::GetAttributesByID().
- Reallocating the destination buffers now uses its own method. Reverts the changes from r5000.
- Prepare the code for the use of multi-pass filters.
- "Fix" opening ROM files of unknown file extension.
- Fix buffer overflow when last character of ROM game code is not a recognized country code.
- Add country code Chinese (iQue DS).
- Don't render HUD directly to gpu screen.
- Redraw display for some operations which update the HUD display.
- Make HUD aware of swapped screen and/or single screen.
- Fix possible stack overflow if video filter resolution is high enough.
- Fix not being able to toggle HUD when paused.
- Fix HUD 3d fps display when there is frameskip.
- Remove gtk pixbuf usage on drawing DS screens, use only Cairo.
- Use transformation matrix to handle touchscreen coordinates.
- Adapt RGB555-to-RGBA8888 conversion code from Cocoa port, should result in brighter colour.
- Re-enable fullscreen menu item on start.
- Fix a possible (but slim) buffer overflow caused by the usage of sprintf.
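A minimal sketch of why an RGB555-to-RGBA8888 conversion can come out brighter, as the conversion note above suggests. This is not the code adapted from the Cocoa port; the function names and the output byte order (red in the lowest byte, alpha in the highest) are assumptions made for illustration. A plain left shift by 3 maps the maximum 5-bit value 0x1F to 0xF8, so white never reaches full brightness; replicating the top three bits into the low bits maps it to 0xFF.

```cpp
#include <cstdint>

// Expand a 5-bit channel to 8 bits by replicating the top 3 bits
// into the low 3 bits, so 0x1F becomes 0xFF instead of 0xF8.
inline std::uint8_t expand5to8(std::uint8_t c5) {
    return static_cast<std::uint8_t>((c5 << 3) | (c5 >> 2));
}

// NDS 15-bit color layout: bits 0-4 red, 5-9 green, 10-14 blue.
// Assumed output layout: 0xAABBGGRR with alpha forced to opaque.
inline std::uint32_t rgb555ToRgba8888(std::uint16_t color) {
    const std::uint32_t r = expand5to8(static_cast<std::uint8_t>(color & 0x1F));
    const std::uint32_t g = expand5to8(static_cast<std::uint8_t>((color >> 5) & 0x1F));
    const std::uint32_t b = expand5to8(static_cast<std::uint8_t>((color >> 10) & 0x1F));
    return r | (g << 8) | (b << 16) | (0xFFu << 24);
}
```

For example, rgb555ToRgba8888(0x7FFF) yields 0xFFFFFFFF, whereas a shift-only conversion would yield 0xFFF8F8F8.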
Linux (gtk):
- Show error instead of warning if --enable-hud is configured but libagg not found.
- Change F10 to use savestate slot 0 instead of 10.
- Show savestate time on savestate menu.
- Change startup window size back to resizable.
- Add HUD display toggle menu (requires --enable-hud on configure)
- Reorganize menu items to be more like the Windows port
- Change fullscreen hotkey to F11
- Change default video filter to None (user can still switch via menu)
- Rewrite fps limiter and frame skipping code.
- Decrease EmuLoop priority to force screen redraw at maximum rate.
- Add menu option for fps limiter.
- Support boost button.
- Force disable Ubuntu's global menu.
- Add new xBRZ family of filters.
Cocoa Port:
- Refactor all display code. OpenGL code is pushed to a lower level and filter code is pushed towards the UI level.
- Add support for the new xBRZ filters.
- The Execution Control panel no longer always appears on app startup.
- Don't completely block output threads when they are set to idle, since we need to assume that messages will be passed to them at any time. It seems like NSRunLoop is smart enough not to unnecessarily wake the CPU on idle, so the thread block was not necessary.
- Fix bug where user settings were not being applied while the emulator was paused. (Regression from r4970.)
- Optimize the emulator idle state to achieve 0% CPU usage. This greatly reduces the app's energy usage when the emulator is idle.
Cocoa Port (OpenEmu Plug-in):
- Remove some dependencies needed to compile the OpenEmu plug-in.
- Add controls for frame advance, frame jump, and display mode toggle.
- Add new execution control panel. (Emulation > Show Execution Control)
- Reorganize several menu items in the Emulation and View menus.
- Do a bunch of optimization and cleanup of the input handler.
- Add support for using analog inputs in their native format.
- The paddle controller now supports native analog control.
- NOTE: Due to the changes in the input handler, users will need to rebind any hatswitch inputs that were previously used. Only hatswitch inputs were affected by these changes.
- Clean up GBA Cartridge device code, and also add the ability to have the SRAM file be on a different file path from the ROM file.
- Force Rumble Pak to turn off rumble upon disconnect.
- Fix file path issue when trying to use a disk image file for an MPCF device.
- The use of the BOOL datatype in a function pointer for front-ends is not as portable as one might think. Switch to using a standard C++ bool datatype instead.
- Add auto-selection to Rumble Pak for Metroid Prime Pinball.
- Make frame rate transitions more smooth. This has the effect of making video much smoother, with only a negligible effect on execution speed accuracy.
- Make auto frame skip slightly less aggressive.
- Load cheat item icons higher up in the front-end code.
- Remove CocoaDSCheatManager's and CocoaDSCheatSearch's dependence on CocoaDSCore. Just use basic mutex pointers instead.
- Remove embedded scripts for generating SVN_REV from Xcode projects; move the script to a single common script file.
- Make the SVN_REV generation script a bit more robust. Building no longer fails if svnrevision doesn't support SVN 1.7.
- added new games to auto-selection list;
winport:
- added customizing the keys for Taito Paddle Controller;
- the DS no longer needs to be reset when you change the slot2 device;
- fix R4 (now scan directory on reset);
- add new R4 path type: you can now select a directory to scan yourself, or auto-select the same directory as the ROM;
winport:
- in the Slot1 folder selection dialog, the cursor is now set automatically to the currently selected directory;
- add streaming ROM data from disk. This breaks all ports except Windows; the Linux/Mac ports need rom_init_path fixed in NDSSystem.cpp, but I can't test this;
- fix ROM mask (use card size from header instead of file size);
- fix read range from DS card (real DS can't read data from ROM for DSi console);
- temporary fix for the "write enable/disable" mc command;
- Condense the UI in DeSmuME Preferences for autoloading ROMs on startup.
- Rename the "Combo" display mode to "Dual Screen" display mode.
- Do some minor code cleanup.
- Fix building for all Xcode projects. (Regression from r4731).
Core:
- Fix include path in slot1comp_protocol.cpp. (Regression from r4731).
- Remove strongly-typed enum in MMU.h for compilers configured to build for (or can only support) C99. (Regression from r4731).
- Fix building for all Xcode projects. (Regression from r4723).
- Add support for "Retail (Auto-detect)" and "Standard Retail Memory Card + ROM" devices in the SLOT-1 Manager.
General:
- Fix compiler warnings in bios.cpp. (Regression from r4722).
- Fix include path in advanscene.cpp. (Regression from r4723).
- Fix compiling when using GCC and Clang. (Regression from r4692).
- added part of a new boot code (not finished yet and disabled by default).
the entry point of the ARM CPUs is set to xxxx0000h, and the firmware now boots from the BIOS as on a real DS.
all iQue and DS firmware versions (including those patched with FlashME) work now, but games can't be run from the firmware yet.
- Actually use -Ofast and -fvectorize in the Xcode 5 project.
- Replace data type unsigned int with size_t where appropriate.
- Do some minor code cleanup.
- Replace data type unsigned int with size_t where appropriate.
- In the VideoFilter ctor, now require srcWidth and srcHeight to be specified.
- Be more conservative when generating SSurface structs.
- Add Xcode 5 project, based on the Xcode 5 Developer Preview.
- This project is identical with the Xcode 4 project, except that it now uses -Ofast instead of -O3, and also adds the -fvectorize optimization.
- Use Relax IEEE Compliance (-ffast-math) optimization for all builds on all projects.
- Use Enforce Strict Aliasing (-fstrict-aliasing) optimization for all Debug builds on the Xcode 3 and Legacy projects.
- initialize fake firmware user settings on NDS reset when not using external firmware;
winport:
- using global structure for fake firmware user settings;
- add memory viewer features (navigate using a keyboard, toggle view all ASCII chars, saving latest 20 addresses to ini-file, go to address on [ENTER] key hit..)
- Fix bug where using an input mapped Set Speed control did not properly reset the execution speed if the Set Speed Limit slider was previously used.
- Move the display window related methods from the EmuControllerDelegate to DisplayWindowController.
- Fix bug where the SLOT-1 R4 directory path wasn't being saved properly.
- Fix bug where loading an external audio file with a sample rate less than 16000 Hz would cause a crash.
- Fix bug where creating a new display window with a default display mode of Main or Touch would cause the display window to draw incorrectly.
enh: game-specific hack for popular games that randomly corrupt their sprites when going in and out of doors
enh: officially supported arm and arm64 jits and overall improvements on arm hosts
enh: emulator now makes "(backups)" states on every loadstate, in case you hit loadstate by accident
enh: User-selectable MSAA level for OpenGL renderer.
enh: "interface" for dll/so control of a desmume core
enh: optimizations, cpu arch-specific, and otherwise, to all 3d and 2d rendering, ranging from SSE to AVX2
enh: lua - Add raw joystick API and setlayermask API (windows only)
enh: lua - Add gamecode API for game-specific hacks in scripts and 'freelook' script functionality
enh: add options to emulate game cards more badly, to trip AP on purpose
enh: fix some save type / slot type autodetections and save memory import codepaths
enh: add fake impossible debug AR code to select CPU: DFFFFFFF 77777777/99999999
enh: add --rtc-day and --rtc-hour to specify an offset from host RTC
enh: support newer duc files
enh: upgrade and add some upscalers, hq3x, 6xBRZ, etc.
enh: add "interface" frontend for use via dll/so
Windows:
note: windows xp and x86 support is dropped for official builds. windows 7 support will be dropped over my dead body.
bug: fix numerous bugs involving filenames and path with non-latin characters
bug: fix bugs in various display layout, rotate, vsync, gaps, and display method configurations
bug: fix bugs in user configured paths
bug: aviout/wavout is now more robust
bug: fix bugs in window clearing and various display method configurations which leave garbage on screen
enh: add fullscreen display options
enh: major revisions to mic sample feature, loaded as a bunch and rotated with hotkeys
enh: add user-facing option to control console window visibility
enh: add some crude capability for breakpoints to cpu debugger and memory viewer, and other bugfixes
enh: add "screen size ratio" for smaller sub-screens, etc.
enh: add some hotkeys
enh: add option to kill stylus input when outside the NDS screen
enh: improve cheat list UX
enh: optimizations to reduce cpu usage overall and during idle especially for high resolutions, scalers, etc.
enh: improve pen&touch support
Cocoa:
bug: fix issues with v-sync causing frame rate issues under various circumstances
bug: fix issues when running a display window in fullscreen
enh: add native binary support for Apple Silicon CPUs
enh: Macs with an Intel Haswell or later CPU now benefit from the new AVX2 optimizations
enh: add support for Apple's Dark Mode user interface introduced in macOS Mojave
enh: add some new toolbar items for the following: Frame Advance, Enable/Disable HUD, Toggle Displays
enh: turbo inputs can now be configured with a frame-by-frame press/release pattern
enh: display windows now run their video output using Metal, if available
enh: display windows now support HiDPI monitors like Apple's Retina monitors
enh: display windows have new "Hybrid" layouts for better fit on modern widescreen monitors (View > Display Layout)
enh: display windows can now change the video source going to each individual DS screen (View > Display Video Source)
enh: display windows can now run a Heads-Up Display for reporting useful info (View > Show HUD Settings)
enh: add support for changing the NDS stylus pressure (Emulation > Show Stylus Settings)
enh: screenshots can now be captured using a dedicated tool for it (Tools > Show Screenshot Capture Tool)
enh: lots of miscellaneous stability and performance improvements
Linux:
note: SDL2 now employed
note: GTK3 port added, built with meson
note: CLI and GTK ports improved, according to their respective niche (gaming vs functionality)
note: CLI: added horizontal screen layout
note: CLI: added floating-point scale factor support with HW stretching
note: CLI/GTK2: improved gdb stub for game debugging
note: CLI/GTK2/GTK3: various other improvements
0.9.11 -> 0.9.12
We decided to start skipping even versions to disambiguate official releases from several years of interim builds,
and to insulate ourselves from consideration for world record of "longest time between consecutive releases" by
creating confusion as to what constitutes a consecutive release.
0.9.10 -> 0.9.11 (r4908-r5146)
In this version, we have focused on the Cocoa frontend, but there have been some good core fixes over so long.
Notably, the save-related issues resulting in the advice "dont use 0.9.10" have been resolved.
General/Core:
bug: fix large numbers of games not being able to save anymore
bug: fix some missing sound effects due to wrong volumes in some boot scenarios and other things
bug: fix freezes due to tiny looping sounds
bug: fix many big endian issues
bug: fix some apparently rarely-used CPU instructions, no known consequences
bug: fix (block) reading of some GPU registers
bug: fix action replay code type 0xE
bug: fix reading of last 4 bytes of rom
bug: large improvements to stability of GDB stub
bug: add w-buffer support in OpenGL renderers
bug: fix unpredictable crashes in some 3d scenes from w=0
enh: better loading of roms (bad patches) with wrong size info in header
enh: warn user sometimes when 'stream rom from disk' will create malfunctions
enh: add xBRZ filters
enh: add "TXT Hack" for software rasterizer to improve text rendering in some games
Windows:
bug: fix 5x filters
enh: support import of action replay save files (.dss)
enh: add antialiasing option for OpenGL renderers
enh: don't malfunction if saveram is unavailable or read-only
Cocoa:
bug: 16-bit to 32-bit color space conversions no longer darken video or images
bug: fix intermittent issues with loading user defaults on app startup
bug: fix rendering inaccuracies of the video preview in the app display preferences
bug: fix various UI font rendering and text alignment issues on OS X Yosemite
bug: fix crackly sound from N-sync and Z-sync methods
enh: make N-sync method the default sound sync method since it has much lower latency than P-sync method
enh: add support for gdbstub (Tools > Show GDB Stub Control) (only available on custom builds using the dev+ build target)
enh: optimize input handling to use less CPU
enh: add support for App Nap when the app is in an idle state (only supported on OS X Mavericks and later)
enh: add Execution Control panel (Emulation > Show Execution Control), now with frame advance and frame jump controls
enh: auto frame skip is now smoother
enh: further improve execution timing accuracy
enh: improve overall video performance
enh: render video through a 3-stage filtering pipeline, (Video Source)-->(Pixel Scaler)-->(Video Output)
enh: add the following video source filters - Deposterize
enh: add the following video output filters - Bicubic B-Spline, Bicubic Mitchell-Netravali, Lanczos2, Lanczos3
enh: add ability to run all existing pixel scalers on either the CPU or the GPU
enh: add ability to toggle the main and touch display positions (View > Toggle All Displays)
enh: add preliminary support for replay playback and recording
enh: add support for turbo and autohold
enh: add support for the entire suite of slot-2 devices (Emulation > Show SLOT-2 Manager)
enh: add support for using the host machine's audio input device for emulating the NDS microphone (Emulation > Show Microphone Settings)
enh: change the sine wave tone generator's range from 100Hz-5000Hz to 40Hz-4000Hz
enh: reorganize the menu options to more logical locations
enh: greatly improve the File Migration Assistant (now renamed Game Data Migration Assistant) and ROM Info panel with a more modern and space efficient look and feel
enh: miscellaneous user interface improvements
Linux:
bug: fix screen gap bug
bug: workaround for std::bad_alloc exceptions compiler bugs
enh: add experimental AV recording
enh: generally improve main loop throttling and skipping
enh: massive improvements to HUD and menu layout
enh: add window sizing options and sound interpolation options
enh: add Lid button; disallow U+D, L+R; manual option saving
0.9.9 -> 0.9.10 (r4623-r4908)
In this version, we have focused on trying to clean up some complexities in the user experience and emulator internals. Pretty unglamorous stuff, but some games are newly compatible.
General/Core:
enh: break savestate back-compatibility
bug: improve save size autodetection for some games
bug: cpu: fix many basic jit cpu bugs
bug: 3d: tweak softrasterizer edge marking
bug: 3d: fix stale 4x4 texture palettes
bug: fix some GPU sprite blending scenarios
bug: fix bios HLE BitUnPack, UnCompHuffman
enh: modular slot-1 device system, emulate GC bus more faithfully
enh: support NAND slot-1 device
enh: auto-detect appropriate slot-1 and slot-2 device
enh: many revisions to firmware boot process for more authenticity. iQue and FlashME versions function, .dfc rewritten.
enh: support streaming NDS file from disk (like an ISO, to avoid long initial load time)
enh: run .dsv directly on disk, to save long flushing times. should speed backup operations.
enh: spu synch mode and method on commandline
Windows:
bug: fixes to advanscene DB import
bug: save opengl display method filter option
bug: general bugfixes to various screen layout modes
enh: add option to stop non-integer scaling during fullscreen or maximize
enh: improvements to save import dialog
enh: improved memory viewer tool
enh: operate better when run, against our advice, from a zipfile
enh: add slot-1 Nitro Filesystem viewer tool
Cocoa:
bug: fix slot1-R4 path saving
bug: fix bug with mic samples < 16khz
bug: fix bugs and enhancements in multi display windows
bug: fix handling of some joystick analog inputs
enh: save display windows configuration and emulation speed on app exit
0.9.8 -> 0.9.9 (r4228-r4623)
Yes, it's been a while since the last release, but we haven't been completely idle. There's a brand new jit cpu core which yields some impressive speedups!
@@ -17,7 +216,6 @@ Graphics:
bug: 3d: fix some polygon and texture coloring bugs on big-endian systems
Windows:
bug: fixes to advanscene DB import
bug: fix some full screen stretching bugs
enh: add xaudio2 output driver
enh: add opengl display method (as opposed to directdraw), with controllable bilinear filter
@@ -130,8 +328,8 @@ Linux:
enh: cli: better fps limiting (Thomas Jones)
Wx:
bug: some small fixes here and there (Jan Bücken)
enh: lot of code cleanup (Jan Bücken)
0.9.6 -> 0.9.7 (r3493-r3812)
@@ -516,7 +714,7 @@ CPU/MMU:
bug: Fixed Thumb LDMIA (fixes ingame Dead'n'Furious) [shash]
AC_CHECK_LIB(z, gzopen, [], [AC_MSG_ERROR([zlib was not found, we can't go further. Please install it or specify the location where it's installed.])])
dnl - Check for zziplib
AC_CHECK_LIB(zzip, zzip_open, [
LIBS="-lzzip $LIBS"
AC_DEFINE([HAVE_LIBZZIP])
AC_MSG_CHECKING([[whether zzip use void * as second parameter]])
static const bool touchshadow = false; //true; // sorry, it's cool but also distracting and looks cleaner with it off. maybe if it drew line segments between touch points instead of isolated crosses...
static std::vector<TouchInfo> touch(8);
static void TextualInputDisplay() {
// drawing the whole string at once looks ugly
// (because of variable width font and the "shadow" appearing over blank space)
// and can't give us the color-coded effects we want anyway (see drawPad for info)
//this file contains the components used for emulating standard gamecard protocol.
//this largely means the complex boot-up process.
//i think there's no reason why proprietary cards couldn't speak any protocol they wish, as long as they didn't mind being unbootable.
//TODO - could this be refactored into a base class? that's probably more reasonable. but we've gone with this modular mix-in architecture so... not yet.
#ifndef _SLOT1COMP_PROTOCOL_H
#define _SLOT1COMP_PROTOCOL_H
#include "../types.h"
#include "../MMU.h"
class EMUFILE;
enum eSlot1Operation
{
//----------
//RAW mode operations
//before encrypted communications can be established, some values from the rom header must be read.
//this is the only way to read the header, actually, since the only reading commands available to games (after KEY2 mode is set) are
//it seems that etrian odyssey 3 doesnt work unless we mask this to cart size.
//but, a thought: does the internal rom address counter register wrap around? we may be making a mistake by keeping the extra precision
//but there is no test case yet
this->_address &= gameInfo.mask;
//feature of retail carts:
//B7 "Can be used only for addresses 8000h and up, smaller addresses will be silently redirected to address `8000h+(addr AND 1FFh)`"
if (CommonSettings.RetailCardProtection8000)
	if (this->_address < 0x8000)
		this->_address = (0x8000 + (this->_address & 0x1FF));
//1. as a sanity measure for funny-sized roms (homebrew and perhaps truncated retail roms) we need to protect ourselves by returning 0xFF for things still out of range.
//2. this isnt right, unless someone documents otherwise:
//if (address > gameInfo.header.endROMoffset)
// ... the cart hardware doesnt know anything about the rom header. if it has a totally bogus endROMoffset, the cart will probably work just fine. and, the +4 is missing anyway:
//3. this is better: it just allows us to read 0xFF anywhere we dont have rom data. forget what the header says
//note: we allow the reading to proceed anyway, because the readROM method is built to allow jaggedy reads off the end of the rom to support trimmed roms