Commit Graph

1860 Commits

Author SHA1 Message Date
Fiora c72a133206 JIT: implement frsqrte
Mostly a straightforward translation of the interpreter code, with a few
tricksy optimizations and fallbacks for rare paths.
2014-09-03 11:21:04 -07:00
Dolphin Bot e1248599eb Merge pull request #868 from FioraAeterna/bmi
x64Emitter: add BMI1/BMI2 support
2014-09-03 19:24:27 +02:00
lioncash f69e6ef16f Common: Remove unnecessary define check in Log2 2014-09-03 13:04:48 -04:00
Fiora 5088a2b4e2 x64Emitter: add BMI1/BMI2 support
TZCNT and LZCNT use a completely different encoding scheme, so they should
probably go in a separate patch.

Also add some tests.
2014-09-03 10:04:01 -07:00
shuffle2 db84a22109 Merge pull request #770 from lioncash/panic
Core: Fix case where a panic alert wouldn't be shown in MemoryUtil.cpp
2014-09-03 00:10:12 -07:00
comex 64575d565a Merge pull request #923 from FioraAeterna/fixcallersave
JIT: Fix caller-save registers on WIN64
2014-09-03 02:27:44 -04:00
shuffle2 532b7bb7da Merge pull request #893 from rohitnirmal/scan-build-fixes
Scan build fixes
2014-09-02 23:15:18 -07:00
Fiora 9e4419e786 x64Emitter: add support for shorter EAX forms of instructions
Should save a few bytes of code size here and there.
2014-09-02 21:52:41 -07:00
Fiora 6655c7775e JIT: Fix callee-save registers on WIN64 2014-09-02 10:56:14 -07:00
Pierre Bourdon e72146d19c x64Emitter: Do not assert-fail on redundant MOVs, instead show an error log 2014-09-02 10:17:32 +02:00
Pierre Bourdon a79ced2fc2 x64Emitter: Make it clear for both SSE to int conv that X64 regs are expected 2014-09-02 09:55:47 +02:00
Pierre Bourdon c428c5999f x64Emitter: UNPCKLPS/HPS are now tested 2014-09-02 09:53:00 +02:00
Pierre Bourdon cc0b048c0b x64Emitter: Support FLD/FSTP with 80 bits operands 2014-09-02 09:52:59 +02:00
Pierre Bourdon f99f302c91 x64Emitter: assert instead of crashing when generating MOVZX with a wrong size 2014-09-02 09:52:04 +02:00
Pierre Bourdon b1738b60fc x64Emitter: Fix MUL with AH/BH/CH/DH registers. 2014-09-02 09:52:04 +02:00
Pierre Bourdon f0e8b1fda8 x64Emitter: Error out on 8 bits CMOV, and emit 16 bits CMOV properly 2014-09-02 09:52:04 +02:00
Pierre Bourdon d4ec9737bd x64Emitter: Assert when using an invalid POP instead of generating an INT3 2014-09-02 09:52:04 +02:00
Pierre Bourdon 9c4daac3a4 x64Emitter: RDTSC now without a typo'd name 2014-09-02 09:52:04 +02:00
Pierre Bourdon 88af225070 x64Emitter: Remove a declared function that is never implemented 2014-09-02 09:52:04 +02:00
Pierre Bourdon 5941653d47 Merge pull request #920 from shuffle2/msvc-gtest
Provide a way to build and run unittests on Windows
2014-09-02 07:40:49 +02:00
Pierre Bourdon 9b10d36a85 Merge pull request #938 from lioncash/statics
Common: Make the LUTs in ColorUtil static
2014-09-02 07:36:17 +02:00
Lioncash 824a0a19f1 Common: Make the LUTs in ColorUtil static 2014-09-02 00:52:13 -04:00
Shawn Hoffman 839cace5ff msvc: get UnitTests compiling
Choose it from VS or pass /p:RunUnitTests=true to msbuild
2014-09-01 21:27:45 -07:00
Shawn Hoffman 266992684d msvc: remove some remnants of SDL and DSound from projects and general cleanup. 2014-09-01 21:27:44 -07:00
Fiora b51aa4fa89 Rename Log2 and add IsPow2 to MathUtils for future use
Also remove unused pow2/pow2f functions.
2014-09-01 20:41:07 -07:00
Lioncash ec9fc6bfc1 Common: Remove unnecessary "using namespace Gen;" from x64emitter 2014-09-01 23:10:56 -04:00
Lioncash ba4934b75e Common: Clean up brace placements 2014-08-30 18:06:35 -04:00
Lioncash 1d706b2311 Get rid of C-style empty function parameter indicators 2014-08-30 15:23:48 -04:00
comex 683191b6c6 Merge pull request #892 from comex/oh-the-abstraction
Optimize PointerWrap.
2014-08-28 17:28:16 -04:00
comex faa2666393 PointerWrap currently checks its mode for every individual byte of everything it 'does', including all of RAM. Make it not do that.
Decreases total Wii state save time (not counting compression) from
~570ms to ~18ms.

The compiler can't remove this check because of potential aliasing; this
might be fixable (e.g. by making mode const), but there is no reason to
have the code work in such a braindead way in the first place.

- DoVoid now uses memcpy.
- DoArray now uses DoVoid on the whole rather than Doing each element
(would fail for an array of STL structures, but we don't have any of
those).
- Do also now uses DoVoid.  (In the previous version, it replicated
DoVoid's code in order to ensure each type gets its own implementation,
which for small types then becomes a simple load/store in any modern
compiler.  Now DoVoid is __forceinline, which addresses that issue and
shouldn't make a big difference otherwise - perhaps a few extra copies
of the code inlined into DoArray or whatever.)
2014-08-28 15:35:19 -04:00
Fiora f9d4ff0d5d x64Emitter: add support for some missing CVT instructions 2014-08-27 20:15:42 -07:00
Rohit Nirmal 4c14ebdf32 Remove pointless initializations. 2014-08-27 20:36:49 -05:00
comex de7294ecc1 Add Flag support to ChunkFile.h 2014-08-26 12:43:39 -04:00
comex d128795594 Merge pull request #862 from comex/registersinuse
Reduce my idiocy in register saving code.
2014-08-24 16:16:32 -04:00
Dolphin Bot f31ebd23bb Merge pull request #864 from FioraAeterna/avx2bmi
Add AVX2/BMI1/BMI2 detection support
2014-08-24 18:55:01 +02:00
Fiora ce6d09ca5d Add AVX2/BMI1/BMI2 detection support
Also clean up the formatting in a bit of the CPU detection code.
2014-08-24 09:14:54 -07:00
comex d19ec35363 Reduce my idiocy in register saving code.
(1) Rename ABI_ALL_CALLEE_SAVED to ABI_ALL_CALLER_SAVED, because that's
what it was actually defined as (and used as).  Derp.

(2) RegistersInUse is always used for the purpose of saving registers
before calling a C++ function in the middle of a JIT block (without
flushing).  There is no need to save callee-saved registers in this
case.  Change the name to CallerSavedRegistersInUse and mask with
ABI_ALL_CALLER_SAVED.

Nothing obvious broke when starting up a Melee game.  (I added a test
for anything actually being masked out; it happens, but in this
particular case seemed to occur at most a few dozen times per second, so
the actual performance benefit is probably negligible.)
2014-08-23 15:46:10 -04:00
Shawn Hoffman 327d35377d windows: remove now-extraneous NOMINMAX and WIN32_LEAN_AND_MEAN #defines from dolphin code.
Wrap dinput.h in a header defining DIRECTINPUT_VERSION instead of repeating it multiple places.
2014-08-23 10:48:48 -07:00
Tillmann Karras 80be585fef x64Emitter: remove redundant "Gen::" 2014-08-20 02:56:07 +02:00
Tillmann Karras a363f4fa3e x64Emitter: make 'packed' parameter a bool 2014-08-20 02:54:30 +02:00
Shawn Hoffman 87c324c55a Add Common::Event::WaitFor(), which has the same semantics as std::condition_variable::wait_for() (with millisecond units). 2014-08-17 21:52:40 -07:00
Shawn Hoffman 375be67158 Add Common/Event.h to the VS project files. 2014-08-17 21:52:40 -07:00
Fiora 802b28daf9 x64Emitter: refactor to support longer opcodes
Also add some new SSE4 opcodes.
2014-08-17 04:48:17 -07:00
Lioncash 4759510f70 Get rid of instances of "using namespace std;" in the project 2014-08-17 02:05:33 -04:00
Lioncash 6ee2267b2d Merge pull request #771 from lioncash/32bit-cruft
Core: Remove leftover Windows 32-bit functions in MemArena.cpp
2014-08-17 01:40:23 -04:00
Lioncash 1b92c68f05 Common: Add Flag.h to the Visual Studio project. 2014-08-16 23:33:28 -04:00
Lioncash d18d3e1f3e Common: Get rid of StdConditionVariable, StdMutex, and StdThread.
All of the compilers we support have support for these now.
2014-08-16 23:33:19 -04:00
shuffle2 2270c3e90a Merge pull request #797 from shuffle2/msvc-pch
Windows: Use a shared precompiled header for dolphin code under Source/
2014-08-16 14:58:28 -07:00
Lioncash cf46ac7dc9 Core: Kill off Host_ShowJitResults
Another host function that can be killed off by simple wx event handling
2014-08-15 15:18:28 -04:00
Shawn Hoffman f1b82a34b2 Windows: Use a shared precompiled header for dolphin code under Source/ 2014-08-14 23:51:13 -07:00
Shawn Hoffman 66fdbdd18d Windows: Give SCMRevGen a configuration for x64 instead of Win32. 2014-08-13 03:57:10 -07:00
Ryan Houdek 6bdc32c54a Add the VideoCommon PostProcessing class.
This class loads all the common PP shader configuration options and passes those options through to a inherited class that OpenGL or D3D will have.
Makes it so all the common code for PP shaders is in VideoCommon instead of duplicating the code across each backend.
2014-08-13 01:05:10 -05:00
Ryan Houdek 3a657fe33c Add TryParseVector to StringUtil.
Adds a nice way to have options be in a vector for results
2014-08-12 23:45:14 -05:00
shuffle2 43010818fa Merge pull request #785 from lioncash/string
Common: Fix AsciiToHex returning true on overflow values
2014-08-12 14:43:02 -07:00
shuffle2 9f4008c5bc Merge pull request #760 from lioncash/swap
Common: Use the OSX equivalent byte-swap functions
2014-08-12 14:31:32 -07:00
shuffle2 b8d126c101 Merge pull request #754 from FioraAeterna/immediateopt
x64Emitter: optimize immediate sizes
2014-08-12 14:19:31 -07:00
Lioncash 5afb9cc5c4 Common: Fix AsciiToHex returning true on overflow values
We should be checking errno against ERANGE.
2014-08-12 02:49:04 -04:00
shuffle2 36af1b518d Merge pull request #777 from shuffle2/vs2013-update3
Use official flag for detailed symbols. Require VS2013 with update 3 or later to build.
2014-08-11 20:12:52 -07:00
Shawn Hoffman d0c3e46c80 Windows: Improve XSaveWorkaround to behave correctly when XSAVE processor feature is enabled, but AVX support isn't available for whatever reason. 2014-08-10 14:50:29 -07:00
Shawn Hoffman 5eccc08c7e Use official flag for detailed symbols. Require VS2013 with update 3 or later to build. 2014-08-10 14:47:29 -07:00
Lioncash eb10899b53 Core: Remove leftover Windows 32-bit functions in MemArena.cpp 2014-08-10 05:35:14 -04:00
Lioncash 9a61cfc650 Core: Actually show MemoryUtil.cpp allocation error messages on Linux 2014-08-10 05:28:00 -04:00
Lioncash be3428cf8e Core: Fix case where a panic alert wouldn't be shown in MemoryUtil.cpp 2014-08-10 04:50:58 -04:00
Lioncash 7bf82f1989 Core: Kill off Host_UpdateLogDisplay()
This was actually never used as far as I can tell. There was no wx event handling done whatsoever for the global ID, So this is basically a dead function.
2014-08-08 19:21:40 -04:00
lioncash 85ace9751e Common: Use the OSX equivalent byte-swap functions 2014-08-08 13:26:26 -04:00
Fiora 75b3e425fd x64Emitter: optimize immediate sizes
A nice alternative than trying to do it throughout the JIT.
2014-08-07 13:07:27 -04:00
Lioncash 0718937237 Common: Introduce the new Gekko disassembler to Common.
This moves the Gekko disassembler to Common where it should be. Having it in the Bochs disassembly Externals is incorrect.

Unlike the PowerPC disassembler prior however, this one is updated to have an API that is more fitting for C++. e.g. Not needing to specify a string buffer and size. It does all of this under the hood.

This modifies all the DebuggingInterfaces as necessary to handle this.
2014-08-04 00:45:07 -04:00
Pierre Bourdon 4c42b38de1 Merge pull request #428 from Sonicadvance1/x86_32-removal
Remove x86_32 support from Dolphin.
2014-08-03 21:17:28 -07:00
Ryan Houdek 0c24e1dcf2 Remove the rest of x86_32 support from Common. 2014-08-03 13:49:46 -05:00
Pierre Bourdon 8b26d7bf1e UnitTests: make it possible to build tests for code that has global dependencies 2014-08-02 09:34:39 -07:00
Lioncash 77c2b6829a Merge pull request #702 from lioncash/netplay-version
Common: State OS instead of 32/64 bit in the netplay lobby
2014-08-01 22:51:30 -04:00
Lioncash 4b32dcbc33 Merge pull request #707 from lioncash/strip
Common: Simplify StripTailDirSlashes
2014-08-01 16:01:37 -04:00
Lioncash 1dc5294629 Common: Simplify StripTailDirSlashes 2014-07-31 22:18:45 -04:00
Lioncash c5188c76b3 Merge pull request #705 from Sonicadvance1/fix-memoryutil-check
Fixes a check for what mmap returns.
2014-07-31 03:06:00 -04:00
Ryan Houdek 33450c80c3 Fixes a check for what mmap returns.
On error mmap returns MAP_FAILED(-1) not null.
FreeBSD was checking the return correctly, Linux was not.
This was noticed by triad attempting to run Dolphin under valgrind and not getting a memory space under the 2GB limit(Because -1 wraps around on
unsigned obviously)
2014-07-31 00:53:00 -05:00
Pierre Bourdon 9b9817f927 x64Emitter: Fix REX encoding for SETcc
Previously using the new "lower 8 bits" registers (SIL, SPL, ...) caused SETcc
to write to other registers (for example, SETcc SIL would generate SETcc DH).
2014-07-30 06:41:29 -07:00
Lioncash 22e9d6977b Common: State OS instead of 32/64 bit in the netplay lobby 2014-07-30 02:04:17 -04:00
Lioncash b03c12764d Really get rid of the MSVC 2005 workaround completely 2014-07-29 21:20:43 -04:00
Lioncash 4fa71dd59e Remove fakepoll.h.
It was only used for Windows XP and lower.

This also bumps the _WIN32_WINNT define in the stdafx precompiled headers to set the minimum version as Windows Vista.
2014-07-26 22:53:40 -04:00
Shawn Hoffman cfc7bb35c2 Windows: Also look for git.exe in the registry (for Git Extensions installs). 2014-07-20 12:33:56 -07:00
Tillmann Karras 6df48ed432 x64Emitter: add CVTTPD2DQ 2014-07-15 23:53:56 +02:00
Lioncash 79eb0f8d0c Merge pull request #623 from phire/sw-xfb-clamping
Fix incorrect clamping in SWRenderer.
2014-07-15 12:04:06 -04:00
Scott Mansell b8695a57da Fix incorrect clamping in SWRenderer.
A previous PR changed a whole lot of min/maxes to std::min/std::max
but made a mistake here and used a templated min which cast it's
arguments to unsigned instead of casting return value.

This resulted in glitchy artifacts in bright areas (See issue 7439)

I rewrote the code to use a proper clamping function so it's cleaner
to read.
2014-07-15 21:15:49 +12:00
Lioncash 8087a89afd Remove some unnecessary defines in Log.h 2014-07-14 15:51:28 -04:00
Tillmann Karras d398e9ef61 Common: add macros for assisting branch prediction 2014-07-14 01:52:14 +02:00
Tillmann Karras 0ccee6c87b Fix warnings unearthed by #579 2014-07-13 02:16:51 +02:00
degasus 7e79806efc remove unused globals
Also change globals into statics which are only used in one file
2014-07-11 16:10:20 +02:00
degasus 81ed17be53 avoid the extern keyword in .cpp files 2014-07-11 16:10:20 +02:00
degasus 22e1aa5bb4 mark all local functions as static 2014-07-11 16:07:23 +02:00
Lioncash 09eb1acc5e Common: Using size_t in PointerWrap's DoContainer apparently causes crashes. Fixes this. 2014-07-06 03:05:27 -04:00
Lioncash b97d2853a7 Common: Make DoContainer within PointerWrap private.
This shouldn't really be exposed as a public function and should only be called through other Do class functions that take a container type as a parameter.
2014-07-05 23:03:43 -04:00
Lioncash bd377b9580 Merge pull request #443 from magumagu/loadstore-cleanup
Loadstore cleanup
2014-06-26 21:32:59 -04:00
Lioncash ca5340ebde Centralize the logging code into its own folder in Common. 2014-06-25 22:11:42 -04:00
Lioncash 177658aed6 Merge pull request #513 from lioncash/vs-x64
Remove the 32-bit config platform from the VS solution file
2014-06-25 21:23:45 -04:00
Lioncash 8b13afbb8e Remove the 32-bit config platform from the VS solution and projects 2014-06-24 22:07:26 -04:00
Lioncash eb3de73ab9 Merge pull request #531 from LPFaint99/memcard
GCI Folder: correctly identify region of sysmenu
2014-06-24 21:18:27 -04:00
Jordan Woyak 516369594f Store ini sections in a std::list (rather than vector) to prevent unexpected pointer invalidation with use of GetOrCreateSection. 2014-06-24 12:37:38 -05:00
LPFaint99 c65a6f5fb0 GCI Folder: correctly identify region of sysmenu 2014-06-23 19:58:27 -07:00
Pierre Bourdon ce74752e91 Common: Add a PCAP writer module 2014-06-22 20:04:46 +02:00
Pierre Bourdon 5dff577339 Merge pull request #500 from lioncash/ini
Use only section-based ini reading.
2014-06-22 17:21:45 +02:00
magumagu 06864e9fee JIT: Clean up float loads and stores.
Less code is good, and this should make future changes to memory handling
easier.
2014-06-20 12:52:39 -07:00
Lioncash f05d3f6e5d Use only section-based ini reading. 2014-06-16 01:31:23 -04:00
magumagu ee0c5bdc20 Try to fix android build. 2014-06-15 15:56:42 -07:00
magumagu d905cbfd5d Don't set DAZ on x86 in non-IEEE mode.
I have no idea why we were using it in the first place; it doesn't match
the behavior of PPC NI flag.
2014-06-15 03:51:51 -07:00
Tony Wasserka 3d6f9ef897 BitField: Add an explicit getter function for retrieving the BitField value.
Sometimes (in particular when using non-typesafe functions) it can be convenient to have a getter method rather than performing a potentially lengthy explicit cast.
2014-06-11 20:58:40 +02:00
Tony Wasserka b3c7f003da BitField: Delete copy assignment to prevent obscure bugs. 2014-06-11 20:58:40 +02:00
Ryan Houdek 6e1d312091 Make it so ARMv7 isn't a generic target.
Rearranges a bit of code so that ARM isn't a generic build anymore. Because it obviously isn't
2014-06-07 20:26:31 -05:00
Ryan Houdek b6db0d0ab8 Merge pull request #457 from Tilka/jcc
x64Emitter: J_CC: use 32 bit offset automatically
2014-06-06 20:53:50 -05:00
Lioncash 3843848ed4 Use std::string in LogContainer's constructor.
This allows for removal of the strcpy calls, also it's technically way more safe, though I doubt we'll ever have a log name larger than 128 characters or a short description larger than 32 characters.

Also moved these assignments into the constructor's initializer list.
2014-06-05 18:50:14 -04:00
Tillmann Karras f8280401f6 x64Emitter: J_CC: use 32 bit offset automatically 2014-06-03 23:08:58 +02:00
Ryan Houdek 3a06907653 Merge pull request #455 from lioncash/arm-cpudetect-fix
Stringify ArmCPUDetect.cpp.
2014-06-02 20:10:39 -05:00
Lioncash 7d7b3d6156 Stringify ArmCPUDetect.cpp. 2014-06-02 21:08:26 -04:00
Ryan Houdek 5d3382fb56 Fix a crash in ARM's CPUDetect on a malformed /proc/cpuinfo.
If a CPU string was incapable of being found we would return a null pointer, which would crash with strncpy.
Also if we couldn't get a CPU implementer we would call free() to a null pointer.
In addition, detect 64bit ARM running.
2014-06-01 23:55:38 -05:00
Ryan Houdek 87e671404a Make MemoryUtil.cpp use the correct x86_64 define.
MemoryUtil.cpp was incorrectly using the old __x86_64__ define when it should be using _M_X86_64.
It was also using _ARCH_64 when it shouldn't have which was causing an errant PanicAlert to come up in my development.
2014-06-01 23:45:44 -05:00
Lioncash 49b0eef393 Remove the min/max functions in CommonFuncs.
The algorithm header has the same functions.
2014-05-29 21:44:41 -04:00
Lioncash eca70d1562 Get rid of the temporary buffer in IniFile's Load function.
std::getline is the string-based equivalent.
2014-05-28 20:26:15 -04:00
Rachel Bryk 80ab567a5b When reading an ini file, if there is an error, check if it is simply because the eof was reached. 2014-05-28 14:06:18 -04:00
Pierre Bourdon 10efd5b8c0 Merge pull request #430 from shuffle2/xsave-workaround-fix
msvc C initializers return int...fix EnableXSaveWorkaround when rax doesn't implicitly equal zero.
2014-05-28 12:25:25 +02:00
Pierre Bourdon ce139622f6 Merge pull request #392 from RachelBryk/error-check-ini-file
Check for errors when reading lines from ini files.
2014-05-28 11:57:00 +02:00
Shawn Hoffman 47a2eb47a0 msvc C initializers return int...fix EnableXSaveWorkaround when rax doesn't implicitly equal zero. 2014-05-28 00:16:33 -07:00
Shawn Hoffman 58bcc3d12a Redo commit 932945d480
This time, make sure the object for disabling XSave support in msvcr can't be dropped by the linker.
2014-05-27 13:41:19 -07:00
Rachel Bryk 0782d106db Check for errors when reading lines from ini files.
Fixes issue 7283.
2014-05-23 03:17:19 -04:00
Lioncash 5f796e919b Move bn.h and ec.h into the correct filter section 2014-05-17 16:47:41 -04:00
Lioncash d0bd4119d1 Use size_t in std::string operations in IniFile.cpp, not int 2014-05-14 20:38:08 -04:00
shuffle2 36720e6822 Merge pull request #351 from Tilka/make_unique
Add a std::make_unique implementation (Common/StdMakeUnique.h)
2014-05-11 01:46:09 -07:00
Tillmann Karras 4400e511c0 fixmeup rename 2014-05-11 10:40:18 +02:00
Shawn Hoffman 700c135386 Revert "x64FPURoundMode: always set x87 precision"
This reverts commit 9de77b7c23.
Setting x87 precision control is only supported on x86 platforms (not ARM or x64).
2014-05-10 20:21:07 -07:00
Tillmann Karras 81d4d7368a Add a std::make_unique implementation
Some compilers we care about (mostly g++) do not support std::make_unique yet,
but we still want to use it in our codebase to make unique_ptr code more
readable. This commit introduces an implementation derivated from the libc++
code in the Dolphin codebase so we can use it right now everywhere.

Adapted from delroth's pull request.
2014-05-06 12:32:03 +02:00
Tillmann Karras 9de77b7c23 x64FPURoundMode: always set x87 precision
Set the x87 precision, even on x64. Since we are using x87 instructions
in the JIT now, we can't guarantee that x87 precision will never
influence Dolphin on x64.
2014-05-01 01:10:00 +02:00
Tillmann Karras ed762a3eda x64FPURoundMode: use fesetround() instead of asm 2014-05-01 01:09:55 +02:00
Tillmann Karras e659f5ac58 JitBackpatch: fix NOP padding
The new NOP emitter breaks when called with a negative count. As it
turns out, it did happen when deoptimizing 8 bit MOVs because they are
only 4 bytes long and need no BSWAP.
2014-04-30 15:26:11 +02:00
Pierre Bourdon aef24d509b Merge pull request #304 from Tilka/nop
Optimize NOPs
2014-04-27 11:52:05 +02:00
Tillmann Karras 886c887e80 Fix Fastmem on CPUs without MOVBE
The problem was that when BSWAP was used, UnsafeWriteRegToReg() returned
the address of that instead of the MOV.
2014-04-25 01:11:52 +02:00
Tillmann Karras acfd9ee76c Add remaining possible uses of MOVBE
Also fixes a missing 'break' statement in DisassembleMov().
2014-04-24 16:36:03 +02:00
Tillmann Karras 957649b7af Optimize NOPs 2014-04-23 21:15:09 +02:00
Tillmann Karras b3c7395a23 Atomic: support clang 3.4+ 2014-04-17 10:39:02 +02:00
Pierre Bourdon cf315a487f Merge pull request #271 from delroth/threading-stuff
Threading improvements: add Common::Flag and improve Common::Event
2014-04-14 23:23:16 +02:00
Pierre Bourdon 7074feacbe Common::Event: Add a faster Windows specific implementation based on the concurrency runtime. 2014-04-14 23:13:15 +02:00
Pierre Bourdon 48bd904028 Common::Event: Implement in terms of Common::Flag to get rid of a volatile and optimize with early returns and atomic swaps 2014-04-14 23:13:15 +02:00
Pierre Bourdon e24cad0780 Common::Flag: Add support for TestAndSet + test by implementing basic spinlocks. 2014-04-14 23:13:15 +02:00
Tony Wasserka ccc04944b2 BitField: Fix alignment issues.
At least one platform (ARM with NEON instructions enabled) generates SIGBUSes if BitField objects aren't aligned properly.
2014-04-14 20:04:44 +02:00
Pierre Bourdon 6bdcbad3e4 Common: Move the Event class to a separate file, and add tests for it. Fix includes everywhere to match this. 2014-04-14 10:54:07 +02:00
Pierre Bourdon f9fb39d383 Common: Add a 'Flag' class that is used to encapsulate a boolean flag manipulated from several threads 2014-04-14 10:54:07 +02:00
Tony Wasserka 12841928df BitField: Optimize generated assembly by forcing inlining. 2014-04-13 13:27:01 +02:00
Pierre Bourdon b2597739ff x64Emitter: Add the MOVBE instruction. 2014-04-11 23:33:21 +02:00
Pierre Bourdon d2de1ddabc CPUDetect: add support for MOVBE detection 2014-04-11 23:29:03 +02:00
Ryan Houdek 87d106d65c Remove dumb CodeBlock duplication in the emitters.
Fixes issue 6990.
This uses a bit of templating to remove the duplicate code that is the CodeBlocks in each emitter headers.
No actual functionality change in this.
2014-04-09 13:53:43 -05:00
Ryan Houdek 93c871522f Fix a bug in the ARMEmitter.
When creating a Fixupbranch we were swapping the BL and B targets.
I think this was found by PPSSPP a while ago, but they never send PRs to merge their changes upstream.
2014-04-06 10:28:41 -04:00
Pierre Bourdon 664c8d30a0 Remove all trailing whitespaces from our codebase. 2014-03-29 11:05:44 +01:00
comex 4d5df0d008 Fix IsTriviallyCopyable for volatile (fixes Mac build).
Between C++11 and C++14, volatile types stopped being trivially
copyable.  The serializer has no reason to care about this distinction,
so tack on remove_volatile.
2014-03-27 23:42:52 -04:00
magumagu 4eab240e25 Compute stack usage correctly in ABI_CallFunctionPC.
(The numbers need to be consistent with the actual usage, or else the stack gets corrupted.)
2014-03-25 20:48:25 -07:00
magumagu e4081b29f9 Use unaligned stores to save XMM regs to stack.
On Win32, the stack isn't aligned, so aligned stores will cause crashes.
2014-03-25 20:46:36 -07:00
Pierre Bourdon 5fc6ce59c3 Merge pull request #210 from magumagu/writerex-fix
Fix OpArg::WriteRex with 8-bit memory operand.
2014-03-26 02:34:44 +01:00
Tony Wasserka 48a1790d81 Common: Add a generic class for accessing bitfields in a fast and endianness-independent way.
The underlying storage type of a bitfield can be any intrinsic integer type,
but also any enumeration.

Custom storage types are supported if the following things are defined on the storage type:
- casting 0 to the storage type
- bit shift operators (in both directions)
- bitwise & operator
- bitwise ~ operator
- std::make_unsigned specialization
2014-03-25 23:33:04 +01:00
magumagu 03292eabc2 Fix OpArg::WriteRex with 8-bit memory operand.
Previously he function was misbehaving because of a missing check for
whether an 8-bit operand was a register operand; it would therefore
emit unnecessary REX prefixes, incorrectly assert on 32-bit targets, and
could potentially emit wrong code in rare cases (like a memory to register
operation involving AH.)

Also, some cleanup while I was in the area to make the function easier to
read.
2014-03-25 14:09:15 -07:00
Tillmann Karras af525266d4 MathUtil: add constructors to IntFloat/IntDouble 2014-03-24 16:14:22 +01:00
Ryan Houdek 3586ab1d4c Fix the Android build when using clang 3.4 2014-03-17 17:56:22 -05:00
Tillmann Karras fa3cc05753 Turn some non-const refs into pointers 2014-03-17 02:55:57 +01:00
Lioncash a82675b7d5 Kill off some usages of c_str.
Also changes some function params, but this is ok.
Some simplifications were also able to be made (ie. killing off strcmps with ==, etc).
2014-03-14 13:51:23 -04:00
Matthew Parlane 31cfc73a09 Fixes spacing for "for", "while", "switch" and "if"
Also moved && and || to ends of lines instead of start.
Fixed misc vertical alignments and some { needed newlining.
2014-03-11 00:35:07 +13:00
Tillmann Karras d802d39281 clang-modernize -use-nullptr
and s/\bNULL\b/nullptr/g for *.cpp/h/mm files not compiled on my machine
2014-03-09 21:14:26 +01:00
Tillmann Karras f28116b7da clang-modernize -add-override 2014-03-09 21:12:01 +01:00
Tillmann Karras c89f04a7c5 clang-modernize -loop-convert
and some manual adjustments
2014-03-09 21:11:59 +01:00
Tillmann Karras 9ef64245fa MathUtil: fix IsQNAN()
The constants were one nibble too short and the lower 51 bits don't
actually have to be zero.
2014-03-09 19:34:58 +01:00
Tillmann Karras d05e205a24 FPURoundMode: revert use of enums in bit-fields
The workaround of using fixed underlying types produces lots of warnings
in GCC because now the bit-fields are too small for the value range used
for conversion semantics.
2014-03-09 15:24:35 +01:00
Pierre Bourdon edba8096bf x64Emitter: Add functions to call a C++ std::function from JITed code 2014-03-08 23:32:43 +01:00
Pierre Bourdon 9869c53859 x64ABI: Add two more CallFunction functions (for additional parameter types). 2014-03-08 23:32:43 +01:00
Pierre Bourdon 6d6abfa61f x64Emitter: Allow const pointers where it makes sense to do so. 2014-03-08 23:32:43 +01:00
Pierre Bourdon 248f5d7f22 Merge pull request #130 from lioncash/breakpoint-clear
Actually make PPCDebugInterface::ClearAllBreakpoints have functionality.
2014-03-07 20:42:52 +01:00
Matthew Parlane 57f2eda130 Fix MAC address reading on Windows. 2014-03-07 21:40:59 +13:00
Shawn Hoffman 932945d480 Implement workaround for Windows versions which do not support XSAVE.
Fixes CRT math routines using FMA instructions from causing illegal instructions.
2014-03-06 14:38:10 -08:00
Shawn Hoffman 8995d299f2 windows: move arch defines to base.props 2014-03-06 14:37:40 -08:00
Lioncash 610a6f9b23 Add ClearAllMemChecks to DebugInterface
Breakpoints have one, but memchecks don't, despite being cleared directly in the breakpoint window.

Now DolphinWX should call the interface functions and not the direct functions of the breakpoints or memchecks for clearing.
2014-03-05 21:50:23 -05:00
Shawn Hoffman 3647dfa711 Allow VS builds to be speedy again. 2014-03-05 11:17:14 -08:00
Shawn Hoffman 7733463e65 commit 1a428de189 introduced a bug by using a signed enum in a bitfield, the value of which is then used in a ldmxcsr instruction. The sign-extension corrupts the value, causing an exception by attempting to load mxcsr with an invalid value. 2014-03-05 10:19:29 -08:00
Ryan Houdek 4f02132f93 Make our architecture defines less stupid.
Our defines were never clear between what meant 64bit or x86_64
This makes a clear cut between bitness and architecture.
This commit also has the side effect of bringing up aarch64 compiling support.
2014-03-04 09:36:59 -06:00
Lioncash 279a8c0148 Change the DebugInterface, PPCDebugInterface, and DSPDebugInterface to use CamelCase names.
This is the standard coding convention in the codebase, so our interfaces should use it too.
2014-03-03 00:39:08 -05:00
Tillmann Karras 7a66a3ded1 ArmEmitter: make it more readable 2014-02-28 12:43:22 +01:00
Tillmann Karras 46e7c0657f Crypto: small cleanup 2014-02-28 12:43:22 +01:00
Tillmann Karras 315a8ba1c0 Various changes suggested by cppcheck
- remove unused variables
- reduce the scope where it makes sense
- correct limits (did you know that strcat()'s last parameter does not
  include the \0 that is always added?)
- set some free()'d pointers to NULL
2014-02-28 12:43:20 +01:00
Tillmann Karras 5f0a8008f4 Convert MemoryUtil.cpp to Unix-style line endings 2014-02-28 12:28:21 +01:00
Tillmann Karras 1a428de189 x64FPURoundMode: move things around a bit 2014-02-28 12:28:21 +01:00
degasus 94da4e1aa2 MathUtil: Change Log2 return value to int
Log2(u64) can't be bigger than 63, so there is no need in forcing a 64 bit value.
So just using a common int seems more natural.
2014-02-26 11:37:28 +01:00
Pierre Bourdon 70f3a069f2 Revert "Merge pull request #83 from lioncash/remove-console"
This breaks Linux stdout logging.

This reverts commit 7ac5b1f2f8, reversing
changes made to 9bc14012fc.

Revert "Merge pull request #77 from lioncash/remove-console"

This reverts commit 9bc14012fc, reversing
changes made to b18a33377d.

Conflicts:
	Source/Core/Common/LogManager.cpp
	Source/Core/DolphinWX/Frame.cpp
	Source/Core/DolphinWX/FrameAui.cpp
	Source/Core/DolphinWX/LogConfigWindow.cpp
	Source/Core/DolphinWX/LogWindow.cpp
2014-02-23 07:48:06 +01:00
Pierre Bourdon 311caef094 Merge pull request #25 from Tilka/ppc_fp
Fix non-IEEE mode
2014-02-23 04:15:37 +01:00
Pierre Bourdon 83b7bb64aa Make Common/ mostly IWYU clean (and fix errors in rest of the project detected by this change). 2014-02-22 23:37:29 +01:00
Lioncash 91286f5021 Fix the Windows build in relation to the recent changes. 2014-02-20 01:01:11 +01:00
Pierre Bourdon 9a8ea53195 Fix LinearDiskCache.h relying on a file not directly included. 2014-02-20 01:01:11 +01:00
Pierre Bourdon 3f9c38d231 Fix more header sorting issues in Common/ (now check-includes clean). 2014-02-20 01:01:10 +01:00
Lioncash 2afe215271 Convert all includes to relative paths. 2014-02-18 02:19:10 -05:00
Lioncash 3fd87a7636 Second and final pass of clearing out tabs. 2014-02-17 02:19:41 -05:00
Lioncash cd8196f5db Turns out Console.cpp and ConsoleListener.cpp were being built into the Linux builds too. Fixes the build. 2014-02-16 20:55:07 -05:00
Lioncash ca7bdf1d5d Remove the embedded Console from the possible logging options.
Note I do not mean the Logging window, but the console window.
It's literally rarely, if at all used, and offers less advantages over the built-in logging window (ie. it breaks on different locales: http://i.imgur.com/Cs92tQE.png)

This commit should remove all of the console logging.
2014-02-16 20:40:33 -05:00
Pierre Bourdon 92f8d93e96 Remove the old MMIO access "interface". 2014-02-16 19:22:40 +01:00
Tillmann Karras 404624bf0b Turn loops into range-based form
and some things suggested by cppcheck and compiler warnings.
2014-02-13 09:05:50 +01:00
Tillmann Karras 2ff794d299 Fix some warnings 2014-02-13 09:02:43 +01:00
Matthew Parlane 88526be3b5 Merge pull request #50 from Parlane/inifile_tidy
Fix IniFile to use string& instead of char*
2014-02-13 19:04:27 +13:00
Matthew Parlane 3fe05e0a9f Fix IniFile to use string& instead of char*
Also removes .c_str() usages where found.
2014-02-13 17:06:30 +13:00
Scott Mansell cf5938c4df x64Emitter: Fix the PSUBQ instruction's opcode 2014-02-12 23:12:17 +01:00
Scott Mansell 1eb8168488 x64Emitter: Add the xmm, xmm form of PSRLQ instruction. 2014-02-12 23:12:16 +01:00
Tillmann Karras 1f34ed2c25 Re-enable non-IEEE mode support 2014-02-12 23:12:16 +01:00
Tillmann Karras db196d8c5b Jit64[IL]: fix float conversions
Floating-point is complicated...

Some background: Denormals are floats that are too close to zero to be
stored in a normalized way (their exponent would need more bits). Since
they are stored unnormalized, they are hard to work with, even in
hardware.  That's why both PowerPC and SSE can be configured to operate
in faster but non-standard-conpliant modes in which these numbers are
simply rounded ('flushed') to zero.

Internally, we do the same as the PowerPC CPU and store all floats in
double format. This means that for loading and storing singles we need a
conversion. The PowerPC CPU does this in hardware. We previously did
this using CVTSS2SD/CVTSD2SS. Unfortunately, these instructions are
considered arithmetic and therefore flush denormals to zero if non-IEEE
mode is active. This normally wouldn't be a problem since the next
arithmetic floating-point instruction would do the same anyway but as it
turns out some games actually use floating-point instructions for
copying arbitrary data.

My idea for fixing this problem was to use x87 instructions since the
x87 FPU never supported flush-to-zero and thus doesn't mangle denormals.
However, there is one more problem to deal with: SNaNs are automatically
converted to QNaNs (by setting the most-significant bit of the
fraction). I opted to fix this by manually resetting the QNaN bit of all
values with all-1s exponent.
2014-02-12 23:12:15 +01:00
Tillmann Karras c25c4a6e20 x64: add support for some x87 instructions 2014-02-12 22:45:01 +01:00
Tillmann Karras 3218f6cca8 x64: drop instructions that don't exist
These instructions don't exist in hardware although I agree that they
would be useful for our purposes ;)
2014-02-11 05:22:53 +01:00
lioncash d2038049f5 Replace all include guard ifdefs with "#pragma once" 2014-02-10 18:07:16 -05:00
Matthew Parlane 32bfcc034f Some tidy up of sprintf to StringFromFormat
Includes a small fix to SetupWiiMemory
2014-02-10 17:25:18 +13:00
Lioncash ebb48d019e Clean up some struct indentations
Also cleaned up the indentations of some variable declarations.
2014-02-09 19:40:11 -05:00
Lioncash 40182a48a5 Cleanup enum indentations. 2014-02-09 16:16:10 -05:00
Pierre Bourdon e59f770ccb Revert "Merge pull request #49 from Parlane/sprintf_tidy"
Change broke the build on Debian stable.

This reverts commit 28755439b3, reversing
changes made to 64e01ec763.
2014-02-09 16:14:13 +01:00
Matthew Parlane ebff7974c3 Some tidy up of sprintf to StringFromFormat 2014-02-08 14:32:48 +13:00
Matthew Parlane 6b980cbf30 Tidy up SetupWiiMemory 2014-02-07 19:00:34 +13:00
Pierre Bourdon 70e2ed320d Revert "Merge pull request #47 from lioncash/remove-stringfromint"
Breaks Android build.

This reverts commit 12d026c544, reversing
changes made to 6d678490f5.
2014-02-07 00:26:33 +01:00
Pierre Bourdon 12d026c544 Merge pull request #47 from lioncash/remove-stringfromint
Remove function StringFromInt from StringUtil.cpp/.h. C++11 has std::to_string for this now.
2014-02-07 00:19:15 +01:00
Lioncash 05742ffd48 Remove function StringFromInt from StringUtil.cpp/.h. C++11 has std::to_string for this now. 2014-02-06 18:08:31 -05:00
Matthew Parlane 70d2592ffb Fix warnings found by StringFromFormat having printf style checking. 2014-02-07 01:38:08 +13:00
Matthew Parlane 09cc7e2ddf Give StringFromFormat a printf format attribute.
It gives StringFromFormat printf style arguments that should be type-checked against a format string.
2014-02-07 01:10:04 +13:00
Lioncash 249b00c469 Change the modified parameter in the Clamp function to be a pointer.
Makes it easier to identify the one being modified.
2014-02-05 04:04:35 -05:00
Lioncash 6b87a0ef20 Introduce a generic clamp function to clean up some similarly duplicated code. 2014-02-04 20:43:07 -05:00
Lioncash 7ebc829b17 Remove a pointless c_str() call in FileUtil.cpp. The function takes a string in it's parameter 2014-02-03 21:31:12 -05:00
Tillmann Karras b34fe2b8f1 x64: fix parameter names of WriteModRM() 2014-01-25 17:36:09 +01:00
Tillmann Karras 21b0252e27 Jit64: disable non-IEEE mode emulation
I give up. Merging the ppc_fp branch has caused issues in numerous games
and I can't find the bug. I'm leaving this merged to enable easy
recompilation for people who would like to play games that benefit from
non-IEEE mode emulation (e.g. Starfox Assault).
2014-01-19 09:36:08 +01:00
Jasper St. Pierre 34692ab826 Remove unnecessary Src/ folders 2013-12-31 14:03:19 -05:00
Jasper St. Pierre 43e618682e Convert all vcxproj files to UNIX line endings 2013-12-31 14:03:18 -05:00
comex 1b617c736c Add a non-tiny warning about CPUs that will silently desync. 2013-12-16 22:41:52 -05:00
Ryan Houdek eb3b933dd0 Remove all instances of OpenCL in the Dolphin Project. A brief history of OpenCL in Dolphin. OpenCL was originally added to the Dolphin codebase 1 month after it was released with OS X Snow Leopard in 2009. OpenCL was one of the largest group projects that Dolphin ever has had. The OpenCL texture decoder was originally aded with version 1.0 of the OpenCL spec; This version didn't have the capability of a OpenCL-OpenGL interop which would allow for uploading textures once and have it decoded directly to a OpenGL texure. This was to be worked out when the OpenCL 1.1 spec was released and allowed the interop. This work has never been done, and no one in the team is willing to work on it for various reasons. OpenCL has had the unreasonable expectation that it increases the performance of video games that require a large amount of EFB copies like NSMBW. In reality, enabling OpenCL just put the graphics card in a higher power mode which increased the game speed. This is due to the unfortunate effect of Dolphin tending to not push GPUs out of their lower frequency power savings modes. Thanks to everyone that had contributed to the OpenCL texture decoder. 2013-12-11 15:15:55 -06:00
comex eaacf10f71 Fix an idiotic race condition when starting games in multiple Dolphin instances at the same time on Unix.
MemArena mmaps the emulated memory from a file in order to get the same
mapping at multiple addresses.  A file which, formerly, was located at a
static filename: it was unlinked after creation, but the open did not
use O_EXCL, so if two instances started up on the same system at just
the right time, they would get the same memory.  Naturally, this caused
extremely mysterious crashes, but only in Netplay, where the game is
automatically started when the client receives a broadcast from the
server, so races are actually quite likely.

And switch to shm_open, because it fits the bill better and avoids any
issues with using /tmp.
2013-12-10 16:20:52 -05:00
Tillmann Karras b863e40677 Merge branch 'ppc_fp' 2013-11-18 19:31:09 +01:00
Jordan Cristiano f96e9e1ae4 warnings and code formatting 2013-11-13 04:03:46 -05:00
Tillmann Karras 288bef2807 x64: add small warning if CPU has SSE2 but not DAZ 2013-11-13 06:26:57 +01:00
Tillmann Karras cd069fdce1 Interpreter: software-based flush-to-zero
bDAZ is now called bFlushToZero to better reflect what it's actually
used for.

I decided not to support any hardware-based flush-to-zero on systems
that don't support this for both inputs _and_ outputs. It makes the code
cleaner and the intersection of CPUs that support SSE2 but not DAZ
should be very small.
2013-11-13 06:24:58 +01:00
Tillmann Karras 466a7afde3 Interpreter: support non-IEEE mode emulation
v2: fix fxsave on visual studio, thx @ rodolfo for this patch
2013-11-13 06:24:57 +01:00
Tillmann Karras ae86850a78 x64: support VEX opcode encoding
and add some AVX instructions
2013-11-13 06:12:23 +01:00
Tillmann Karras 6054129df8 x64: detect FMA support 2013-11-13 04:46:34 +01:00
Tillmann Karras 268bdf19ce Fix format string warnings 2013-11-13 04:01:16 +01:00
Ryan Houdek 7c1ac441f6 Redo 'Fixes GCC 4.9 compilation. It now supplies its own _mm_shuffle_epi8 intrinsic.' This time with support for Windows. 2013-11-12 16:34:56 -06:00
Ryan Houdek 56557c845a [ARM] Fix NEON emitter encodings. 2013-11-12 01:01:54 +00:00
Ryan Houdek d1de336879 [ARM] More NEON emitters. 2013-11-11 01:47:05 +00:00
Jordan Cristiano 3a28afd8d5 Changed thread barrier and event to use a lamba wait predicate instead of a functor. 2013-11-10 04:57:11 -05:00
Ryan Houdek e013a74cdb [ARM] More NEON emitters. 2013-11-10 05:02:32 +00:00
Ryan Houdek 56685c396a [ARM] Fix an issue with the data offset in LoadStore operations. Thanks to PPSSPP. 2013-11-05 13:05:38 +00:00
Matthew Parlane e15f628935 Fix {Read,Write}FileToString.
We should be using binary always.
2013-11-05 00:33:41 +13:00
comex c579637eaf Run code through the advanced tool 'sed' to remove trailing whitespace. 2013-11-03 20:54:05 -05:00
comex 965b32be9c Run code through clang-modernize -loop-convert to create range-based for loops, and manually fix some stuff up. 2013-11-03 20:54:01 -05:00
comex 82729fcc8f Merge remote-tracking branch 'shuffle2/vc12'
Conflicts:
	Source/Core/Common/Common.vcxproj
	Source/Core/Common/Common.vcxproj.filters
2013-10-31 16:51:56 -04:00
comex 4c7bbd96e4 Improve ChunkFile.h:
- Add support for std::set and std:pair.

- Switch from std::is_pod to std::is_trivially_copyable, to allow for
  types that have constructors but trivial copy constructors.  Easy,
  except there are three different nonstandard versions of it required
  on different platforms, in addition to the standard one.
2013-10-31 15:40:53 -04:00
comex 2e983071c5 Add git.bat to the options in make_scmrev.h.js because depot_tools uses it and I'm silly. 2013-10-27 19:51:55 -04:00
Ryan Houdek 8e73e8ae5f Wipe all traces of OpenSSL's AES implementation. Use polarssl instead. 2013-10-27 18:27:07 +00:00
Shawn Hoffman ccd30024b3 Update to VS2013 and a slew of build-related updates. Notes:
* Currently there is no DEBUGFAST configuration. Defining DEBUGFAST as a preprocessor definition in Base.props (or a global header) enables it for now, pending a better method. This was done to make managing the build harder to screw up. However it may not even be an issue anymore with the new .props usage.
* D3DX11SaveTextureToFile usage is dropped and not replaced.
* If you have $(DXSDK_DIR) in your global property sheets (Microsoft.Cpp.$(PlatformName).user), you need to remove it. The build will error out with a message if it's configured incorrectly.
* If you are on Windows 8 or above, you no longer need the June 2010 DirectX SDK installed to build dolphin. If you are in this situation, it is still required if you want your built binaries to be able to use XAudio2 and XInput on previous Windows versions.
* GLew updated to 1.10.0
* compiler switches added: /volatile:iso, /d2Zi+
* LTCG available via msbuild property: DolphinRelease
* SDL updated to 2.0.0
* All Externals (excl. OpenAL and SDL) are built from source.
* Now uses STL version of std::{mutex,condition_variable,thread}
* Now uses Build as root directory for *all* intermediate files
* Binary directory is populated as post-build msbuild action
* .gitignore is simplified
* UnitTests project is no longer compiled
2013-10-26 17:55:38 -07:00
Ryan Houdek 1eba4da21a Revert "Fixes GCC 4.9 compilation. It now supplies its own _mm_shuffle_epi8 intrinsic."
This reverts commit b2c4901b3f.

Breaks Windows build. GCC 4.9 isn't out yet anyway.
2013-10-26 19:21:00 -05:00
Ryan Houdek b2c4901b3f Fixes GCC 4.9 compilation. It now supplies its own _mm_shuffle_epi8 intrinsic. 2013-10-26 19:05:31 -05:00