With this, JitAsm code doesn't have any reliance on the JIT global
variable. This means the core JIT64 code no longer relies on said
global at all. The Jit64 common code, however, still has some offenders.
Notably, EmuCodeBlock and Jit64AsmCommon are the remaining places in the
common code that make use of the global variable.
The AutoUpdate module is a generic update checker mechanism which can be
used by UI backends to trigger an auto-update check as well as the
actual update process.
Currently only configurable through .ini and the Qt implementation is
completely placeholder-y -- blocking the main thread on a network
request on startup, etc.
Ensures that upon construction of a JitBase instance, that all
underlying members within the option and state structs are guaranteed
to be initialized.
This prevents potentially using a member uninitialized in some form.
Also amends the condition that was being checked. Previously it was
checking if the destination register value was 0x80000000, however it's
actually the source register that should be checked.
This macro (that has unfortunately become the de-facto way of
introducing targets) has a lot of disadvantages that outweigh the fact
that you avoid writing two extra lines of CMake script.
- It encourages the use of variables. In a build system the last thing
we want to care about is mutable state that can be avoided.
- It only handles linking in the libraries and nothing else. It's a
laziness macro.
- We should be explicit about what we're doing by introducing the target
first, not last.
This gets the ball rolling by migrating Core off the macro. Note that
this is essentially 1-to-1 unrolling of the macro, therefore we're
still linking in all libraries as public, even though that may not be
necessary.
This can be revisited once everything is off the macro for a quicker
transition period.
Trims the direct usages of the global by making the code go through the
JIT interface (where it should have been going in the first place).
This also removes direct JIT header dependencies from the breakpoints as
well. Now, no code uses the JIT global other than JIT code itself, and
the unit tests.
All of these with the record bit set update condition register 1 with the
contents of the FPSCR's FX, FEX, VX and OX bits.
Helper_UpdateCR1() does exactly that, so we use it here to perform this
for us.
These are only used internally. This also allows us to eliminate some
symbols that get dumped into the exposed Gen namespace.
By extension this also hides the Write[X] functions from OpArg's public
interface. This is only used internally by XEmitter, so they shouldn't
be usable by anything else.
This wouldn't be much of a data reader if it can't access the
read-only data pointer in read-only contexts. Especially if it
can get a writable equivalent in contexts that aren't read-only.
It's questionable to not return a reference to the instance being
assigned to. It's also quite misleading in terms of expected behavior
relative to everything else. This fixes it to make it consistent with
other classes.
Allows the default constructor to be defaulted and ensures the default
values are associated with the member variables directly.
Also corrects a prefixed underscore in the two parameter constructor.
Given that this only contains functions from the VideoBackendBase class,
it makes more sense to move these to the relevant cpp file to keep them
all together.
This is only ever memset to zero and never used again.
This also gets rid of an instance of undefined behavior considering the
draft standard for C++17 (N4659) states at [dcl.type.cv] paragraph 5:
"
The semantics of an access through a volatile glvalue are implementation-defined.
If an attempt is made to access an object defined with a volatile-qualified type
through the use of a non-volatile glvalue, the behavior is undefined.
"
Before this, DolphinQt2 would crash at boot with an assertion error
when using a Windows debug build, at least if the Dolphin GUI
language was set to English.
Depending on which constructor is invoked, m_id or m_compute_program_id
can end up in an uninitialized state. We should ensure that the object
is completely initialized to something deterministic regardless of the
constructor taken.
This prevents Dolphin from crashing when the emulated software crashes.
AFAIK, there is absolutely no performance to enabling this with the
x64 JIT.
Eventually, we should probably just remove bMMU
(https://github.com/dolphin-emu/dolphin/pull/1831). We can't do that
yet because of the ARM JIT.
A very basic hardware test shows that the ARMMSG doesn't change until
IOS replies. (People could have disassembled IOS to verify this too...)
Console:
sending request at 00034640 - ARMMSG 133e0fa0
00000000000000000000000000000010(ack) - ARMMSG 133e0fa0
00000000000000000000000000000100(reply) - ARMMSG 00034640
Dolphin, prior to this fix:
sending request (00034640) - ARMMSG 133e0fa0
00000000000000000000000000000011(ack) - ARMMSG 00034640
00000000000000000000000000000100(reply) - ARMMSG 00034640
Dolphin, after this fix:
sending request at 00034640 - ARMMSG 133e0fa0
00000000000000000000000000000011(ack) - ARMMSG 133e0fa0
00000000000000000000000000000100(reply) - ARMMSG 00034640
(Yes, note that the X1 bit is still set. This is a bug that I will
fix in the next commit.)
The IPC interrupt is triggered when IY1/IY2 is set and Y1/Y2 is written
to even when this results in clearing the bit.
This shouldn't change anything in practice but it's a difference
that Dolphin wasn't taking into account, which made me waste some time
when I was writing a hwtest :/
This adjusts IOS IPC timing to be closer to actual hardware:
* Emulate the IPC interrupt delay. On a real Wii, from the point of
view of the PPC, the IPC interrupt appears to fire about 100 TB ticks
after Y1/Y2 is seen.
* Fix the IPC acknowledgement delay. Dolphin was much, much too fast.
* Fix Device::GetDefaultReply to return more reasonable delays. Again,
Dolphin was way too fast. We now use a more realistic, average reply
time for most requests.
Note: the previous result from https://dolp.in/pr6374 is flawed.
GetTicketViews definitely takes more than 25µs to reply.
The reason the reply delay was so low is because an invalid
parameter was passed to the libogc wrapper, which causes it to
immediately return an error code (-4100).
* Fix the response delay for various replies that come from the kernel:
fd table full, unknown resource manager / device, invalid fd,
unknown IPC command.
Source: https://github.com/leoetlino/hwtests/blob/af320e4/iostest/ipc_timing.cpp
This replaces usages of the non-standard __FUNCTION__ macro with the standard
mandated __func__ identifier.
__FUNCTION__ is a preprocessor definition that is provided as an
extension by compilers. This was the only convenient option to rely on
pre-C++11. However, C++11 and greater mandate the predefined identifier
__func__, which lets us accomplish the same thing.
The difference between the two, however, is that __func__ isn't a
preprocessor macro, it's an actual identifier that exists at function
scope. The C++17 draft standard (N4659) at section [dcl.fct.def.general]
paragraph 8 states:
"
The function-local predefined variable __func__ is defined as if a
definition of the form
static const char __func__[] = "function-name ";
had been provided, where function-name is an implementation-defined
string. It is unspecified whether such
a variable has an address distinct from that of any other object in the
program.
"
Thankfully, we don't do any macro or string concatenation with __FUNCTION__
that can't be modified to use __func__.
Currently, when immediately compile shaders is not enabled, the
ubershaders will be placed before any specialized shaders in the compile
queue in hybrid ubershaders mode. This means that Dolphin could
potentially use the ubershaders for a longer time than it would have if
we blocked startup until all shaders were compiled, leading to a drop in
performance.
9fa2470 changed how the ubershader setting is stored in INIs but didn't
update the Android GUI code to reflect that, so the setting in the
Android GUI has been broken starting with that commit.
Fixes https://bugs.dolphin-emu.org/issues/10947
- In D3D, shaders could be compiled on the main thread, blocking
startup.
- Reduced the latency between a pipeline being requested and used in all
backends in hybrid ubershader mode, when no shader stages were present.
- Fixed a case where async compilation could cause the same UID to be
appended multiple times to the UID cache.
- Fix incorrect number of threads being used when immediately compile
shaders was enabled.
Fixes a crash which could occur in platforms which do not support
buffer_storage, and EFB2RAM is enabled (which indirectly uses the
attributeless buffer).
While the code is namespaced out properly, the files weren't separated
into their own directory. This moves the files so that introducing a general
interface is easier in the future for supporting other architectures.
Lowest hanging fruit I could find with a profiler.
Not sure this stuff actually needs to be done, but assuming it is, why
not do it quickly? 10x faster, goes from 1% CPU to 0.09%.
Yes, this commit is only to blame OSX and Mali. Through the former supports unsynchronized mappings, the latter supports *no* way to stream dynamic data at all. Let's try to make bad news, as they ignore friendly feature requests. Maybe we just need to make more noise...
Some locales use non-breaking spaces as separators, so getting the
encoding right is important. If DolphinWX gets a string that isn't
valid UTF-8, it flat out won't display the string.
This enables shaders to be compiled while the game is starting, instead
of blocking startup. If a shader is needed before it is compiled,
emulation will block.
As these are stored in a map, operator< will become a hot function when
doing lookups, which happen every frame. std::tie generated a rather
large function here with quite a few branches.
We would want to improve the granularity here in the future, but for
now, this should avoid any performance loss from switching to the
VideoCommon shader cache.
This saves us from having to call GetPath when the same file is being
read over and over. (GetPath is more expensive than GetOffset due to
it iterating through parts of the file system and creating strings.)
The original reason I wanted to do this was so that we can replace
the Android-specific code with this in the future, but of course,
just deduplicating between DolphinWX and DolphinQt2 is nice too.
Fixes:
- DolphinQt2 showing the wrong size for split WBFS disc images.
- DolphinQt2 being case sensitive when checking if a file is a DOL/ELF.
- DolphinQt2 not detecting when a Wii banner has become available
after the game list cache was created.
Removes:
- DolphinWX's ability to load PNGs as custom banners. But it was
already rather broken (see https://bugs.dolphin-emu.org/issues/10365
and https://bugs.dolphin-emu.org/issues/10366). The reason I removed
this was because PNG decoding relied on wx code and we don't have any
good non-wx/Qt code for loading PNG files right now (let's not use
SOIL), but we should be able to use libpng directly to implement PNG
loading in the future.
- DolphinQt2's ability to ignore a cached game if the last modified
time differs. We currently don't have a non-wx/Qt way to get the time.
This saves us from having to hardcode strings, and it also gives
us strings in whatever format is appropriate on the current OS
(for instance, IIRC Windows uses Alt+F where other OSes use Alt-F).
This commit changes devices to always return IPCCommandResult rather
than just a return code for Open() and Close() in order to be able
to better emulate reply timing.
In hindsight, I should have considered we would want to emulate
timing when I cleaned up the device interface, but alas.
This rectifies that mistake.
There is code below that assumes the presence of those macros (by #undef'ing them), but none of the included headers provided them.
This fixes a build failure on OpenBSD where the undef'd macros _do_ get picked up later on in a compilation unit (through which include, I don't know), and thus shadow the Common::swap* functions.
tl;dr: This PR speedups dolphin on mobiles with the Mali GPU and ES 3.2
drivers by a factor of 10 by using the method with the biggest overhead.
Please keep care not to buy this shit!
The ARM driver team seems to care very well about their customers. But
bad luck, users and open source developers are *not* their customers. So
even device-independent feature requests are just ignored for *years*:
https://community.arm.com/graphics/f/discussions/4645/gl_ext_buffer_storage-support
The bad point, they neither implement any of the other common ways to
stream dynamic content in unextented GL:
- They just ignore the GL_MAP_UNSYNCHRONIZED_BIT flag
- They don't support on-device buffer updates and just stall with
glBufferSubData
It seems like no benchmark is using any dynamic content - and like no
customer cares about anything but benchmarks, or users...
We have a flag to disable the glBufferSubData way, this PR adds the flag
to also disable the unsychronized mapping way. The second one is
available since their ES 3.2 update, but slow as hell.
So how to continue? The last remaining technical way to stream dynamic
content at all is to alloc a new buffer per draw call with glBufferData.
This is very gross, but still a factor 10 speedup compared to stalling
the GPU. Small tests shows that you can expect another 3-5 times speedup
with EXT_buffer_data, so Mali would be on pair with Adreno here. So if
you have bought such a device unfortunately, please try to make noise on
your vendor forums/support and ask for this extension. If you are going
to buy a new mobile, I'd recormend to avoid *any* mobile with a Mali GPU
in it.
Previously, this could cause a race condition which resulted in the
Vulkan backend attempting to acquire a swap chain image from a now
non-existant surface. By ensuring the backend knows about the surface
before a frame is presented, this race does not happen.
We now differentiate between a resize event and surface change/destroyed
event, reducing the overhead for resizes in the Vulkan backend. It is
also now now safe to change the surface multiple times if the video thread
is lagging behind.
The option is named DisableCopyToVRAM under the Hacks section in
GFX.ini. It is intentionally not exposed to the GUI, as users should not
need to use it under normal circumstances. The main use is debugging
issues in the EFB-to-RAM shaders.
This could cause glReadPixels() calls which assume no buffer is bound
(e.g. CPU EFB access) to fail. The problem was limited to devices which
don't support persistent mapping, as the map path is not otherwise.
Whenever udev_monitor_receive_device() returns a non-null pointer,
the device must be unref'd after use with udev_device_unref().
We previously missed some unref calls for non-evdev devices.
Both libusbhid (system library) and libhidapi (3rd party library)
provide a function called hid_init. Dolphin was being linked to both.
The WiimoteScannerHidapi constructor was calling hid_init without
arguments. libusbhid's hid_init expects one argument (a file path).
It was being called as if it was defined without arguments, which
resulted in a garbage path being passed in, and because of that,
the Qt GUI was failing to launch with the following error:
'dolphin-emu-qt2: @ : No such file or directory'
It's not guaranteed that the eventfd is smaller than the monitor fd,
because fds are not always monotonically allocated. To select()
correctly in all cases, use the max between the monitor fd and eventfd.
The console appears to behave against standard IEEE754 specification
here, in particular around how NaNs are handled. NaNs appear to have no
effect on the result, and are treated the same as positive or negative
infinity, based on the sign bit.
However, when the result would be NaN (inf - inf, or (-inf) - (-inf)),
this results in a completely fogged color, or unfogged color
respectively. We handle this by returning a constant zero for the A
varaible, and positive or negative infinity for C depending on the sign
bits of the A and C registers. This ensures that no NaN value is passed
to the GPU in the first place, and that the result of the fog
calculation cannot be NaN.
- Smplification of graphics backend startup/shutdown.
- Don't send complete message until CPU is ready to execute.
- Remove redundant stop message.
- Remove OSD message with backend name.
Android is allowed to pick any path for the external storage media and
that's why we should always use Environment.getExternalStorageDirectory()
to get the external storage path
onCreateView might not be called after resuming the emulatoin wich will
leed the game to stuck in the middle since mRunWhenSurfaceIsValid will be
true but surfaceChanged will never be called in this case.
If Animator Duration Scale is Off, the Enhancements/Hacks screens were
not visible unless you enable the Animator Duration Scale back. This
make sure screens will be visible regardless of your animation settings.
Modified NativeLibrary to display alerts in AlertDialogs rather than Toast notifications, and allow yes/no options.
Modified MainAndroid to use the new displayAlertMsg, and to return its output.
Otherwise we might get UB if the value we cast is larger than the
max value of the underlying type that the compiled picked for the enum.
I haven't done any extensive check through Dolphin to find cases
of this, I'm just fixing the cases I already know of.
1. Allow users to pick games dircetory from external storage.
2. Better UX experince to distinguish between selecting a directory or
a game. The later is needed when we implement change disk for android.
Tested on a linux Intel Skylake integrated graphics with
blend_func_extended force-disabled, as it's the only platform I have
that doesn't crash with ubershaders and supports fb_fetch
It seems it doesn't like modifying inout variables in place - so instead
use a temporary for ocol0/ocol1 and only write them once at the end of
the shader
The advantage of std::list is that elements can be removed from the
middle efficiently, but we don't actually need that, because the
ordering of the elements doesn't matter for us. We can just replace the
element we want to remove with the last element and then call pop_back.
Replacing list with vector should speed up looping through the elements.
GameTracker's usage of GameFileCache is thread-safe even without
using a mutex. All of its access to GameFileCache happens on the
thread m_load_thread, except for the call to GameFileCache::Load,
which finishes before m_load_thread starts executing.
Starting with 5.0-5504, trying to launch DolphinQt2 from Visual Studio
shows the error message "The operation could not be completed. Undefined
error" instead of launching the exe file. (The exe gets created
correctly, it just doesn't get launched. It's possible to work around
the problem by launching the exe manually outside of Visual Studio, but
then you won't have an attached debugger automatically.) This commit
fixes that by removing headers from DolphinQt2.vcxproj's ClInclude list
that already are in the QtMoc list. (The problem was originally about
LogWidget.h and LogConfigWidget.h, but 5.0-5600 made the problem be
about CheatWarningWidget.h and GeckoCodeWidget.h instead.)
1) Most of the times the native heap is kept around even after the activity
is killed, we can ask the native code if it is still running and resume
the emulation if that is the case.
2) In case the native heap is freed and the emulation can't resume we used
a temporary state to load on the game boot.
I couldn't find a way to test this, if you want to test this schnario,
add this block to EmulationFragment.
public void onDestroy()
{
stopEmulation();
super.onDestroy();
}
onDestroy is only called if the acivity killed by the OS and not be rotation
change whihch in this case will make sure to kill the emulation and start
again when the activiy is re-created.
Note: By "HandleInit" in this commit message, I mean the code that is
in HandleInit in master except the part that launches EmulationActivity.
In other words, I mean the call to SetUserDirectory and the call to
DirectoryInitializationService.startService. Couldn't think of
something more accurate to call that than "HandleInit"...
In master, HandleInit only runs when the main activity is launched.
This is a problem if the app ends up being launched in some other way,
such as resuming EmulationActivity after the app has been killed in
order to reclaim memory. It's important that we run HandleInit, because
otherwise the native code won't know where the User and Sys folders are.
In order to implement this, I'm dropping the ability to set a custom
User folder in an intent. I don't think anyone is using that anyway.
It's not impossible to support it, but I can't see a way to support
it that doesn't involve something ugly like having code for calling
HandleInit in every activity (or at least MainActivity + TvMainActivity
+ EmulationActivity, with more activities potentially needing it in
the future if we expand the usage of native code for e.g. settings).
If we want to support setting a custom user directory, we should
consider another way to do it, such as a setting that's stored in
getFilesDir() or getExternalStorageDirectory(). Intents are intended
to control the behavior of a specific activity, not the whole app.
There are two reasons for this change:
1. It removes many repetitive lines of code.
2. I think it's a good idea to enable the use of old-style section
names even for settings that previously haven't been settable in game
INIs. Mixing the two styles in INIs (using the new style only for new
settings) is not ideal, and people on the forums don't even seem to
know that the new style exists (nobody knew a way to set ubershader
settings per game, for instance). Encouraging everyone to start using
only the new style might work long-term, but it would take take time
and effort to make everyone get used to it. Considering that this commit
*reduces* the amount of code by adding the ability to use old-style
names for more settings, I'd say that adding this ability is worth it.
You will see an empty gc pad settings screen until you open any other
settings screen and change something, at that moment the Dolphin.ini will
be created and the gc pad settings will be loaded successfully the next
time you open the screen.
This fixes the bug by putting the default gc pad settings section even if
the Dolphin.ini file doesn't exist
Trying to force the XSI version by undefining _GNU_SOURCE can lead
to compilation errors on some systems because of headers expecting
that _GNU_SOURCE is defined.
This commit uses define checks to detect which version we have.
I tried making an overloaded function (int and const char*) instead,
but that led to a warning about one of the variants being unused.
This solves the following issues:
1. If user uninstall Dolphin and install it again the resources will
not be copied unless the user manually clear the app cache because we
are enabling the allowBackup flag in the manifest which will make the app
restore the settings saved in the shared prefernces including the flag
assetsCopied. This PR always copy the files everytime you open Dolphin.
2. If the AssetCopyService took long time and you tried to open the settings
screen or start a game the behaviour was not expected or the emulator will
crash, this PR make sure that whatever we add to the DirectoryInitializationService
or how long it will take will still work as expected by blocking both
the settings screen and the emulaion screen to wait untill all resources
needed are copied.
3. Better communication between the DirectoryInitializationService and the
UI screens using brocast messages.
The Time Base Register was added under the BAT registers. TBL and TBU
were ORed together to get one 64-bit value to display. It is labeled TB
The Graphics Quantisation Registers were added under the Segment
Registers. They are Labeled GQR0-GQR7.
All new registers are read only.
GLES doesn't support C-style array initialisers, so stuff like:
Type var[2] = {
VALUE_A,
VALUE_B
};
isn't supported in GLES (it was added in
GL_ARB_shading_language_420pack).
The texture conversion shader used this, so would fail to compile on
GLES.
The PC offset ADRP() path takes a s32 value, but the input offset was
being tested as abs(ptr) < 0xFFFFFFFF. This caused values between
0x80000000 and 0xFFFFFFFF to incorrectly use this path, despite the
offsets not being representable in an s32.
This caused a crash in the VertexLoader on android 8.1 immediate in wind
waker (and possibly all other apps on android 8.1) as the jit and data
sections happened to be loaded 4gb apart in virtual memory, causing some
pointers to hit this
Some homebrew expect exception handlers to be present -- which is
almost always the case on console, since most of the time homebrew are
launched from either a libogc or SDK title) -- and break if they are
not. To fix this, we just need to include default, dummy handlers.
Set HID0, HID4, GPR1 to values that are used by libogc for
initialisation. This makes boots more similar to a launch
from the HBC or another loader, since normally the registers
have already been initialised by the loader.
This fixes a crash in homebrew that assume GPR1 points to a correct
location and attempt to use it before initialising registers.
This only works if GCPadNew.ini and Dolphin.ini files are not already in the dolphin config folder. Any advice would be helpful. I'd really like to tackle the UI and wiring for deadzones, radius, etc.
HLSL does not define roundEven(), only round(). This means that the
output may differ slightly for OpenGL vs Direct3D. However, it ensures
consistency across OpenGL drivers, as round() in GLSL can go either way.
These three instructions use the B field (bits 16-20 of the opcode)
to determine what the operand register is. However, the code was
using the path that uses the C field (bits 21-25).
This amends the code to use the B field (and also fixes the 64-bit PPC
opcodes, because why not?).
Fixes issue 10683.
This fixes the rendering of the scan visor in Metroid Prime 2: Echoes,
as seen in https://fifoci.dolphin-emu.org/dff/mp2-scanner/
The alpha channel was off-by-one on Ivy Bridge due to the rounding
after multiplication with colmat. This commit removes this matrix
altogether in most cases, making them simple GLSL swizzles.
This will generate one shader per copy format. For now, it is the same
shader with the colmat hard coded. So it should already improve the GPU
performance a bit, but a rewrite of the shader generator is suggested.
Half of the patch is done by linkmauve1:
VideoCommon: Reorganise the shader writes.