dolphin

Commit Graph

Author	SHA1	Message	Date
degasus	0d92c8fb89	Jit64: Optimize dcbx	2015-08-24 18:33:23 +02:00
Tillmann Karras	ac84d6d0fa	Jit64: some cache flush changes - dynamically allocate third scratch register instead of forcing ECX - use LEA as 3 operand add if possible - use BT,JC instead of SHR,TEST,JNZ - merge MOV,TEST - use appropriate ABI function (no asm change)	2015-08-24 18:33:23 +02:00
degasus	6f34b27323	Jit64: implement dcbf + dcbi	2015-08-24 18:33:19 +02:00
Markus Wick	0ad6fa8f62	Merge pull request #2903 from lioncash/cast Memmap: Remove pointer casts	2015-08-24 15:42:56 +02:00
Lioncash	abd3b124be	Memmap: Remove pointer casts	2015-08-24 09:07:09 -04:00
flacs	4baf3e10c6	Merge pull request #2902 from Tilka/fpscr Jit64: quickfix for mtfsfx	2015-08-24 13:19:26 +02:00
Tillmann Karras	33eefc2d86	Jit64: quickfix for mtfsfx	2015-08-24 12:12:31 +02:00
Ryan Houdek	d3176fe22a	[AArch64] Implement stfiwx Improves povray performance by ~4%	2015-08-24 01:10:55 -05:00
Ryan Houdek	80fa9af9b1	Merge pull request #2898 from degasus/linking JitArm64: Faster linking of continuous blocks	2015-08-23 18:09:02 -05:00
Markus Wick	8bc311ab3c	Merge pull request #2897 from degasus/arm JitArm64: Implement subfex, subfcx, addex, subfic, divwux, srwx	2015-08-23 23:52:35 +02:00
degasus	7320d519b4	JitArm64: Implement srwx	2015-08-23 23:29:48 +02:00
degasus	4722a69fd0	JitArm64: Implement divwux	2015-08-23 23:29:18 +02:00
degasus	9e4366963c	JitArm64: Implement subfic	2015-08-23 23:29:07 +02:00
degasus	95be17772f	JitArm64: Implement addex	2015-08-23 23:29:02 +02:00
degasus	025e7c835a	JitArm64: Implement subfcx	2015-08-23 23:28:28 +02:00
degasus	550a90e691	JitArm64: Implement subfex	2015-08-23 23:28:24 +02:00
Ryan Houdek	561744819e	[AArch64] Implement fctiwzx Improves the povray benchmark time by 5.6%	2015-08-23 15:35:18 -05:00
Ryan Houdek	4fa23abbe1	[AArch64] Implement MOVI and ORR(imm) in the NEON emitter.	2015-08-23 15:34:53 -05:00
aroulin	0a0e012fab	x64Emitter: add RCPPS and RCPSS SSE instructions	2015-08-23 16:59:27 +02:00
degasus	77a6798094	JitArm64: Faster linking of continuous blocks	2015-08-23 14:44:23 +02:00
Markus Wick	73067b1ef1	Merge pull request #2888 from degasus/jit64 Jit64: Faster linking of continuous blocks	2015-08-23 13:24:15 +02:00
Lioncash	2a1abf8dd6	Merge pull request #2896 from lioncash/using Core: Minor CPU core typedef cleanup	2015-08-22 19:00:23 -04:00
Ryan Houdek	cc3fb7e7b4	Merge pull request #2883 from degasus/master Profiler: Sort output by total time	2015-08-22 17:52:54 -05:00
Markus Wick	8b881a6c34	Merge pull request #2891 from Sonicadvance1/aarch64_implement_crxxx [AArch64] Implement the cr instructions	2015-08-23 00:44:47 +02:00
Lioncash	fdafa5d063	Core: Move includes out of instruction table headers These aren't necessary (and cause unnecessary indirect inclusions).	2015-08-22 14:15:02 -04:00
Lioncash	a248a4d2ce	Jit64/JitIL: Relocate instruction typedefs	2015-08-22 14:15:00 -04:00
Lioncash	c56717e058	Core: Shorten the _interpreterInstruction typedef The class itself already acts as a namespace trailer, so '_interpreter' isn't necessary. This also gets rid of a duplicate typedef in the Interpreter_Tables.	2015-08-22 14:14:49 -04:00
Ryan Houdek	b4e4a4cef4	Disable OpenGL ES 3.1 on all Qualcomm Adreno devices. Their new driver that supports GLES3.1 + AEP has issues with it. At the very least they don't implement all of the geometry shader features fully which causes shader linker issues when we attempt to use them. I don't have a device so I can't fully test, so until I do I'm going to blanket disable the whole thing.	2015-08-22 09:12:19 -05:00
Markus Wick	a39c0910c4	Merge pull request #2893 from Sonicadvance1/aarch64_memory_base_register [AArch64] Use a register as a constant for the memory base.	2015-08-22 15:41:57 +02:00
Ryan Houdek	dba579c52f	[AArch64] Use a register as a constant for the memory base. Removes a /lot/ of redundant movk operations in fastmem loadstores. Improves performance of the povray bench by ~5%	2015-08-22 08:36:34 -05:00
Markus Wick	3f5ff98c1b	Merge pull request #2890 from lioncash/ptr x64Emitter: Remove pointer casts from Write{8,16,32,64} functions	2015-08-22 10:09:28 +02:00
Markus Wick	2d505bc2a6	Merge pull request #2894 from Sonicadvance1/no_more_eaten_canary Fix the shader overrunning our max shader size.	2015-08-22 10:08:14 +02:00
Markus Wick	c2f38f1d16	Merge pull request #2892 from Sonicadvance1/aarch64_frsp [AArch64] Implement frspx	2015-08-22 09:44:14 +02:00
Ryan Houdek	3242e1a617	Fix the shader overrunning our max shader size. The Star Wars games really push the hardware to its limits, which can cause the shaders that are produced to be 18kb or more. Double our maximum shader size to compensate. Fixes issue #8860	2015-08-22 01:01:03 -05:00
Ryan Houdek	ce32b76be3	[AArch64] Implement frspx Improves performance in povray bench by 2%	2015-08-22 00:35:30 -05:00
Ryan Houdek	d74eb0ea58	[AArch64] Fix the bugs in the cr instructions Makes it a bit more efficient in the process.	2015-08-21 23:24:29 -05:00
degasus	e9ade0abe1	JitArm64: implement crXXX	2015-08-21 20:49:08 -05:00
Lioncash	a69755d9ee	x64Emitter: Remove pointer casts from Write{8,16,32,64} functions This also silences quite a few ubsan asserts from firing when the emitter is being used.	2015-08-21 18:09:48 -04:00
flacs	95d958c03d	Merge pull request #2889 from lioncash/interp Interpreter: Use std::isnan instead of IsNAN	2015-08-21 21:43:08 +02:00
Lioncash	caec42135d	MathUtil: Remove IsNAN and IsINF These aren't necessary, since the stdlib provides equivalents.	2015-08-21 15:05:43 -04:00
flacs	bb7f3d1822	Merge pull request #2867 from Tilka/mtspr_hid0 Jit64: implement HID0 case of mtspr	2015-08-21 21:04:35 +02:00
flacs	01aea965ba	Merge pull request #2864 from Tilka/fpscr Jit64: implement FPSCR related instructions	2015-08-21 21:04:20 +02:00
Lioncash	18d658df1f	Interpreter_FloatingPoint: Use std::isnan instead of IsNAN Same thing, except one is part of the stdlib.	2015-08-21 15:04:03 -04:00
degasus	78aa01e06e	Jit64: Faster linking of continuous blocks We compile the blocks as they are executed, so it's common to link them continuously. We end with calling JMP after every block, but often just with a distance of 0. So just emitting NOPs instead also "calls" the next block, but easier for the CPU.	2015-08-21 17:41:53 +02:00
Markus Wick	c325c310d6	Merge pull request #2884 from lioncash/emitter x64Emitter: Minor cleanup	2015-08-21 13:03:51 +02:00
Ryan Houdek	5f628749ff	Merge pull request #2886 from Sonicadvance1/aarch64_faster_lfd [AArch64] Optimize lfd instructions if possible.	2015-08-21 05:38:53 -05:00
Ryan Houdek	df53b37253	[AArch64] Optimize lfd instructions if possible. If we are going to be using lfd, then chances are it is going to be used in double heavy areas of code. If we only need to load the lower register, then we should also not worry about having to insert in to the low 64bits of the guest register. So add a new flag to the backpatching to handle lfd to directly to the destination register. This gives ~3% performance improvement to Povray.	2015-08-21 04:31:54 -05:00
Markus Wick	4f45d71840	Merge pull request #2760 from Sonicadvance1/aarch64_fcmp [AArch64] Implement fcmp{u,o}	2015-08-21 11:03:20 +02:00
Tillmann Karras	39ced2a2d7	AVIDump: fix -Wsign-compare warning Cast the other side of the comparison to avoid a warning with newer ffmpeg/libav versions (cb3591e69738c808d26ba15eb02414fedfcd91cc).	2015-08-21 10:26:35 +02:00
Markus Wick	6cb87a9227	Merge pull request #2837 from Sonicadvance1/aarch64_faster_nonpaired [AArch64] Optimize cases when an FPR is only used for non-paired ops.	2015-08-21 09:51:45 +02:00
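
Commit 78aa01e06e above describes linking blocks that were compiled back to back: the exit jump then has a distance of 0, so NOPs can be emitted in its place. Below is a minimal Python sketch of that idea, not Dolphin's actual emitter code. The function name `link_block` and the fixed 5-byte exit slot are hypothetical; the sketch assumes an x86-64 `JMP rel32` encoding (opcode `E9`) and the standard 5-byte multi-byte NOP.

```python
import struct

JMP_REL32_LEN = 5  # opcode E9 + 4-byte signed displacement
NOP5 = bytes.fromhex("0f1f440000")  # 5-byte NOP: nop dword ptr [rax+rax*1+0]


def link_block(exit_addr: int, target_addr: int) -> bytes:
    """Return the 5 bytes to write at a block's exit slot (hypothetical helper).

    exit_addr is the address of the exit slot, target_addr is the entry
    of the block being linked to.
    """
    # rel32 displacements are relative to the end of the jump instruction.
    rel = target_addr - (exit_addr + JMP_REL32_LEN)
    if rel == 0:
        # The target starts right after the slot, so falling through
        # reaches it without a taken branch.
        return NOP5
    return b"\xe9" + struct.pack("<i", rel)
```

Because both encodings are the same length, the slot can be patched in place whichever case applies.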



18746 Commits
