Decreases total Wii state save time (not counting compression) from
~570ms to ~18ms.
The compiler can't remove this check because of potential aliasing; this
might be fixable (e.g. by making mode const), but there is no reason to
have the code work in such a braindead way in the first place.
- DoVoid now uses memcpy.
- DoArray now uses DoVoid on the whole rather than Doing each element
(would fail for an array of STL structures, but we don't have any of
those).
- Do also now uses DoVoid. (In the previous version, it replicated
DoVoid's code in order to ensure each type gets its own implementation,
which for small types then becomes a simple load/store in any modern
compiler. Now DoVoid is __forceinline, which addresses that issue and
shouldn't make a big difference otherwise - perhaps a few extra copies
of the code inlined into DoArray or whatever.)
(1) Rename ABI_ALL_CALLEE_SAVED to ABI_ALL_CALLER_SAVED, because that's
what it was actually defined as (and used as). Derp.
(2) RegistersInUse is always used for the purpose of saving registers
before calling a C++ function in the middle of a JIT block (without
flushing). There is no need to save callee-saved registers in this
case. Change the name to CallerSavedRegistersInUse and mask with
ABI_ALL_CALLER_SAVED.
Nothing obvious broke when starting up a Melee game. (I added a test
for anything actually being masked out; it happens, but in this
particular case seemed to occur at most a few dozen times per second, so
the actual performance benefit is probably negligible.)
This class loads all the common PP shader configuration options and passes those options through to a inherited class that OpenGL or D3D will have.
Makes it so all the common code for PP shaders is in VideoCommon instead of duplicating the code across each backend.
This was actually never used as far as I can tell. There was no wx event handling done whatsoever for the global ID, So this is basically a dead function.
This moves the Gekko disassembler to Common where it should be. Having it in the Bochs disassembly Externals is incorrect.
Unlike the PowerPC disassembler prior however, this one is updated to have an API that is more fitting for C++. e.g. Not needing to specify a string buffer and size. It does all of this under the hood.
This modifies all the DebuggingInterfaces as necessary to handle this.
On error mmap returns MAP_FAILED(-1) not null.
FreeBSD was checking the return correctly, Linux was not.
This was noticed by triad attempting to run Dolphin under valgrind and not getting a memory space under the 2GB limit(Because -1 wraps around on
unsigned obviously)
Previously using the new "lower 8 bits" registers (SIL, SPL, ...) caused SETcc
to write to other registers (for example, SETcc SIL would generate SETcc DH).
It was only used for Windows XP and lower.
This also bumps the _WIN32_WINNT define in the stdafx precompiled headers to set the minimum version as Windows Vista.
A previous PR changed a whole lot of min/maxes to std::min/std::max
but made a mistake here and used a templated min which cast it's
arguments to unsigned instead of casting return value.
This resulted in glitchy artifacts in bright areas (See issue 7439)
I rewrote the code to use a proper clamping function so it's cleaner
to read.
This shouldn't really be exposed as a public function and should only be called through other Do class functions that take a container type as a parameter.
Sometimes (in particular when using non-typesafe functions) it can be convenient to have a getter method rather than performing a potentially lengthy explicit cast.
This allows for removal of the strcpy calls, also it's technically way more safe, though I doubt we'll ever have a log name larger than 128 characters or a short description larger than 32 characters.
Also moved these assignments into the constructor's initializer list.
If a CPU string was incapable of being found we would return a null pointer, which would crash with strncpy.
Also if we couldn't get a CPU implementer we would call free() to a null pointer.
In addition, detect 64bit ARM running.
MemoryUtil.cpp was incorrectly using the old __x86_64__ define when it should be using _M_X86_64.
It was also using _ARCH_64 when it shouldn't have which was causing an errant PanicAlert to come up in my development.
Some compilers we care about (mostly g++) do not support std::make_unique yet,
but we still want to use it in our codebase to make unique_ptr code more
readable. This commit introduces an implementation derivated from the libc++
code in the Dolphin codebase so we can use it right now everywhere.
Adapted from delroth's pull request.
Set the x87 precision, even on x64. Since we are using x87 instructions
in the JIT now, we can't guarantee that x87 precision will never
influence Dolphin on x64.
The new NOP emitter breaks when called with a negative count. As it
turns out, it did happen when deoptimizing 8 bit MOVs because they are
only 4 bytes long and need no BSWAP.
Fixes issue 6990.
This uses a bit of templating to remove the duplicate code that is the CodeBlocks in each emitter headers.
No actual functionality change in this.
When creating a Fixupbranch we were swapping the BL and B targets.
I think this was found by PPSSPP a while ago, but they never send PRs to merge their changes upstream.
Between C++11 and C++14, volatile types stopped being trivially
copyable. The serializer has no reason to care about this distinction,
so tack on remove_volatile.
The underlying storage type of a bitfield can be any intrinsic integer type,
but also any enumeration.
Custom storage types are supported if the following things are defined on the storage type:
- casting 0 to the storage type
- bit shift operators (in both directions)
- bitwise & operator
- bitwise ~ operator
- std::make_unsigned specialization
Previously he function was misbehaving because of a missing check for
whether an 8-bit operand was a register operand; it would therefore
emit unnecessary REX prefixes, incorrectly assert on 32-bit targets, and
could potentially emit wrong code in rare cases (like a memory to register
operation involving AH.)
Also, some cleanup while I was in the area to make the function easier to
read.
The workaround of using fixed underlying types produces lots of warnings
in GCC because now the bit-fields are too small for the value range used
for conversion semantics.
Breakpoints have one, but memchecks don't, despite being cleared directly in the breakpoint window.
Now DolphinWX should call the interface functions and not the direct functions of the breakpoints or memchecks for clearing.
Our defines were never clear between what meant 64bit or x86_64
This makes a clear cut between bitness and architecture.
This commit also has the side effect of bringing up aarch64 compiling support.
- remove unused variables
- reduce the scope where it makes sense
- correct limits (did you know that strcat()'s last parameter does not
include the \0 that is always added?)
- set some free()'d pointers to NULL
This breaks Linux stdout logging.
This reverts commit 7ac5b1f2f8, reversing
changes made to 9bc14012fc.
Revert "Merge pull request #77 from lioncash/remove-console"
This reverts commit 9bc14012fc, reversing
changes made to b18a33377d.
Conflicts:
Source/Core/Common/LogManager.cpp
Source/Core/DolphinWX/Frame.cpp
Source/Core/DolphinWX/FrameAui.cpp
Source/Core/DolphinWX/LogConfigWindow.cpp
Source/Core/DolphinWX/LogWindow.cpp
Note I do not mean the Logging window, but the console window.
It's literally rarely, if at all used, and offers less advantages over the built-in logging window (ie. it breaks on different locales: http://i.imgur.com/Cs92tQE.png)
This commit should remove all of the console logging.
Floating-point is complicated...
Some background: Denormals are floats that are too close to zero to be
stored in a normalized way (their exponent would need more bits). Since
they are stored unnormalized, they are hard to work with, even in
hardware. That's why both PowerPC and SSE can be configured to operate
in faster but non-standard-conpliant modes in which these numbers are
simply rounded ('flushed') to zero.
Internally, we do the same as the PowerPC CPU and store all floats in
double format. This means that for loading and storing singles we need a
conversion. The PowerPC CPU does this in hardware. We previously did
this using CVTSS2SD/CVTSD2SS. Unfortunately, these instructions are
considered arithmetic and therefore flush denormals to zero if non-IEEE
mode is active. This normally wouldn't be a problem since the next
arithmetic floating-point instruction would do the same anyway but as it
turns out some games actually use floating-point instructions for
copying arbitrary data.
My idea for fixing this problem was to use x87 instructions since the
x87 FPU never supported flush-to-zero and thus doesn't mangle denormals.
However, there is one more problem to deal with: SNaNs are automatically
converted to QNaNs (by setting the most-significant bit of the
fraction). I opted to fix this by manually resetting the QNaN bit of all
values with all-1s exponent.
I give up. Merging the ppc_fp branch has caused issues in numerous games
and I can't find the bug. I'm leaving this merged to enable easy
recompilation for people who would like to play games that benefit from
non-IEEE mode emulation (e.g. Starfox Assault).
MemArena mmaps the emulated memory from a file in order to get the same
mapping at multiple addresses. A file which, formerly, was located at a
static filename: it was unlinked after creation, but the open did not
use O_EXCL, so if two instances started up on the same system at just
the right time, they would get the same memory. Naturally, this caused
extremely mysterious crashes, but only in Netplay, where the game is
automatically started when the client receives a broadcast from the
server, so races are actually quite likely.
And switch to shm_open, because it fits the bill better and avoids any
issues with using /tmp.
bDAZ is now called bFlushToZero to better reflect what it's actually
used for.
I decided not to support any hardware-based flush-to-zero on systems
that don't support this for both inputs _and_ outputs. It makes the code
cleaner and the intersection of CPUs that support SSE2 but not DAZ
should be very small.
- Add support for std::set and std:pair.
- Switch from std::is_pod to std::is_trivially_copyable, to allow for
types that have constructors but trivial copy constructors. Easy,
except there are three different nonstandard versions of it required
on different platforms, in addition to the standard one.
* Currently there is no DEBUGFAST configuration. Defining DEBUGFAST as a preprocessor definition in Base.props (or a global header) enables it for now, pending a better method. This was done to make managing the build harder to screw up. However it may not even be an issue anymore with the new .props usage.
* D3DX11SaveTextureToFile usage is dropped and not replaced.
* If you have $(DXSDK_DIR) in your global property sheets (Microsoft.Cpp.$(PlatformName).user), you need to remove it. The build will error out with a message if it's configured incorrectly.
* If you are on Windows 8 or above, you no longer need the June 2010 DirectX SDK installed to build dolphin. If you are in this situation, it is still required if you want your built binaries to be able to use XAudio2 and XInput on previous Windows versions.
* GLew updated to 1.10.0
* compiler switches added: /volatile:iso, /d2Zi+
* LTCG available via msbuild property: DolphinRelease
* SDL updated to 2.0.0
* All Externals (excl. OpenAL and SDL) are built from source.
* Now uses STL version of std::{mutex,condition_variable,thread}
* Now uses Build as root directory for *all* intermediate files
* Binary directory is populated as post-build msbuild action
* .gitignore is simplified
* UnitTests project is no longer compiled
Note that before pushing those changes, they were initially tested in a branch, and passed the compilation testing. Sorry that I didn't catch this before.
This implements a partial JITIL based off of the JIT64IL. It's enough to run most games, albiet at a slow speed.
Implementing instructions for this IL is really simple since it basically is just enabling based on what is already in JIT64IL, and then enabling each individual IL instruction.
And fix some stuff up. It would probably be good to unify the stack
handling some more rather than having ABI_PushRegistersAndAdjustStack do
part of it and ABI_AlignStack the rest, causing unnecessary subtract
instructions on Linux x86 (only).
As part of that, change SafeLoadToEAX to SafeLoadToReg, and have JitIL
use that, which should fix fastmem on JitIL.
This should also fix a potential stack corruption issue with x86.
Also define _M_* in a common location, and clean up code that these
changes break (including DSPJit files that assume X86 yet are compiled
on ARM for some reason...)
- For GCC, use intrinsics that will work on ARM.
- Add AtomicExchangeAcquire.
- Make Atomic{Load,LoadAcquire,Store,StoreRelease} work for any suitable type.
- Call ABI_AlignStack even on x86-64.
- Have ABI_AlignStack respect the difference in current alignment
between the root JIT function, which has a prolog, and
ProtectFunction thunks, which do not. This was causing many games
to crash on start on OS X. Since this might otherwise mean changing
the stack pointer before every call...
- Have one prolog/epilog function rather than two (one of which
definitely did not do what it was thought to do), and make it
actually work like a normal one, so that the stack frame shows up
properly in the debugger. There should be no performance impact.
Changes a lot of parsing code which previously was not aware of the notion of
key/value, and operated only with raw lines. Now key/value is the default and
lines are handled as raw only if they do not contain =, or they start with $ or
+ (for Gecko/AR compatibility).