Floating-point is complicated...
Some background: Denormals are floats that are too close to zero to be
stored in a normalized way (their exponent would need more bits). Since
they are stored unnormalized, they are hard to work with, even in
hardware. That's why both PowerPC and SSE can be configured to operate
in faster but non-standard-conpliant modes in which these numbers are
simply rounded ('flushed') to zero.
Internally, we do the same as the PowerPC CPU and store all floats in
double format. This means that for loading and storing singles we need a
conversion. The PowerPC CPU does this in hardware. We previously did
this using CVTSS2SD/CVTSD2SS. Unfortunately, these instructions are
considered arithmetic and therefore flush denormals to zero if non-IEEE
mode is active. This normally wouldn't be a problem since the next
arithmetic floating-point instruction would do the same anyway but as it
turns out some games actually use floating-point instructions for
copying arbitrary data.
My idea for fixing this problem was to use x87 instructions since the
x87 FPU never supported flush-to-zero and thus doesn't mangle denormals.
However, there is one more problem to deal with: SNaNs are automatically
converted to QNaNs (by setting the most-significant bit of the
fraction). I opted to fix this by manually resetting the QNaN bit of all
values with all-1s exponent.
I give up. Merging the ppc_fp branch has caused issues in numerous games
and I can't find the bug. I'm leaving this merged to enable easy
recompilation for people who would like to play games that benefit from
non-IEEE mode emulation (e.g. Starfox Assault).
MemArena mmaps the emulated memory from a file in order to get the same
mapping at multiple addresses. A file which, formerly, was located at a
static filename: it was unlinked after creation, but the open did not
use O_EXCL, so if two instances started up on the same system at just
the right time, they would get the same memory. Naturally, this caused
extremely mysterious crashes, but only in Netplay, where the game is
automatically started when the client receives a broadcast from the
server, so races are actually quite likely.
And switch to shm_open, because it fits the bill better and avoids any
issues with using /tmp.
bDAZ is now called bFlushToZero to better reflect what it's actually
used for.
I decided not to support any hardware-based flush-to-zero on systems
that don't support this for both inputs _and_ outputs. It makes the code
cleaner and the intersection of CPUs that support SSE2 but not DAZ
should be very small.
- Add support for std::set and std:pair.
- Switch from std::is_pod to std::is_trivially_copyable, to allow for
types that have constructors but trivial copy constructors. Easy,
except there are three different nonstandard versions of it required
on different platforms, in addition to the standard one.
* Currently there is no DEBUGFAST configuration. Defining DEBUGFAST as a preprocessor definition in Base.props (or a global header) enables it for now, pending a better method. This was done to make managing the build harder to screw up. However it may not even be an issue anymore with the new .props usage.
* D3DX11SaveTextureToFile usage is dropped and not replaced.
* If you have $(DXSDK_DIR) in your global property sheets (Microsoft.Cpp.$(PlatformName).user), you need to remove it. The build will error out with a message if it's configured incorrectly.
* If you are on Windows 8 or above, you no longer need the June 2010 DirectX SDK installed to build dolphin. If you are in this situation, it is still required if you want your built binaries to be able to use XAudio2 and XInput on previous Windows versions.
* GLew updated to 1.10.0
* compiler switches added: /volatile:iso, /d2Zi+
* LTCG available via msbuild property: DolphinRelease
* SDL updated to 2.0.0
* All Externals (excl. OpenAL and SDL) are built from source.
* Now uses STL version of std::{mutex,condition_variable,thread}
* Now uses Build as root directory for *all* intermediate files
* Binary directory is populated as post-build msbuild action
* .gitignore is simplified
* UnitTests project is no longer compiled
Note that before pushing those changes, they were initially tested in a branch, and passed the compilation testing. Sorry that I didn't catch this before.
This implements a partial JITIL based off of the JIT64IL. It's enough to run most games, albiet at a slow speed.
Implementing instructions for this IL is really simple since it basically is just enabling based on what is already in JIT64IL, and then enabling each individual IL instruction.
And fix some stuff up. It would probably be good to unify the stack
handling some more rather than having ABI_PushRegistersAndAdjustStack do
part of it and ABI_AlignStack the rest, causing unnecessary subtract
instructions on Linux x86 (only).
As part of that, change SafeLoadToEAX to SafeLoadToReg, and have JitIL
use that, which should fix fastmem on JitIL.
This should also fix a potential stack corruption issue with x86.
Also define _M_* in a common location, and clean up code that these
changes break (including DSPJit files that assume X86 yet are compiled
on ARM for some reason...)
- For GCC, use intrinsics that will work on ARM.
- Add AtomicExchangeAcquire.
- Make Atomic{Load,LoadAcquire,Store,StoreRelease} work for any suitable type.
- Call ABI_AlignStack even on x86-64.
- Have ABI_AlignStack respect the difference in current alignment
between the root JIT function, which has a prolog, and
ProtectFunction thunks, which do not. This was causing many games
to crash on start on OS X. Since this might otherwise mean changing
the stack pointer before every call...
- Have one prolog/epilog function rather than two (one of which
definitely did not do what it was thought to do), and make it
actually work like a normal one, so that the stack frame shows up
properly in the debugger. There should be no performance impact.
Changes a lot of parsing code which previously was not aware of the notion of
key/value, and operated only with raw lines. Now key/value is the default and
lines are handled as raw only if they do not contain =, or they start with $ or
+ (for Gecko/AR compatibility).
It isn't easily accessible with sigaction or Mach exceptions (well,
requires an additional system call in the latter), and isn't necessary.
(and get rid of the enum, because it's only used once, and the comments
are more expressive than enum names)
MSVC insisted on using a copy assignment where a move was intended and
ought to be used. This would have been caught, because the class in
question inherits from NonCopyable, which declares a move assignment
operator, which is supposed to delete the implicitly declared copy
assignment operator, but of course MSVC didn't do that either, causing a
class that should have been safe to be unsafe.
(Intertwined enough that's it's easier to do in one patch.)
(1) /dev/es did not support state save, which could cause crashes and
incorrect behavior after loading.
(2) NANDContentLoader tried to read all of a title's contents into
memory when it was first opened. Two issues:
- If any contents were missing, it bailed out. However, with DLC,
only some of the contents may be downloaded, as determined by the
permission bits in the ticket. Instead, return an appropriate error
when a content is accessed that doesn't exist on the filesystem
(don't bother checking the permission bits though).
- Everything was loaded into memory - even if it consisted of 3 GB of
songs, which caused Dolphin to lag out for quite a while (and would
fail on 32-bit). Instead, open content on demand.
This is required to be able to move objects that inherit from it.
(Note that this patch also #ifs out the class for the externals that
include it yet are compiled in pre-C++11 mode. It shouldn't matter,
since those externals don't use it.)
It's not enough to check for the CPUID bit to know if AVX is supported since
AVX requires OS support (new set of registers == more registers to be saved
when context switching). If the OS does not support, the cpuid bit will still
be set but using YMM registers will cause an illegal exception fault.