dolphin

Commit Graph

Author	SHA1	Message	Date
Ryan Houdek	b907576510	[AArch64] Support profiling by cycle counters if they are available to EL0	2015-08-30 10:25:16 -05:00
Ryan Houdek	5110574c1f	Merge pull request #2921 from Sonicadvance1/aarch64_optimize_lmw [AArch64] Optimize lmw.	2015-08-30 10:23:57 -05:00
Anthony Serna	0390bd61df	Fixed introduced compiler warning in Linux	2015-08-29 20:41:59 -07:00
Lioncash	d58550e874	MemcardManager: Minor cleanup of header code	2015-08-29 05:19:51 -04:00
Lioncash	0f3e4c50e1	MemcardManager: Correct class indentation	2015-08-29 05:13:20 -04:00
Lioncash	072150589e	Merge pull request #2924 from lioncash/scope Hash: Narrow define scope	2015-08-29 03:12:18 -04:00
Lioncash	e7c7dcaa1f	Merge pull request #2923 from lioncash/override Jit_Util: Add missing override specifiers	2015-08-29 03:12:11 -04:00
Lioncash	310bb46967	Hash: Narrow define scope	2015-08-29 02:57:35 -04:00
Markus Wick	a16669231a	Merge pull request #2917 from Sonicadvance1/android_fix_sgs6 [Android] Workaround Mali driver issue on the Samsung Galaxy S6.	2015-08-29 08:56:32 +02:00
Lioncash	df19f11cb9	Jit_Util: Add missing override specifiers	2015-08-29 00:30:18 -04:00
Markus Wick	6004ecc521	Merge pull request #2920 from rohit-n/build-pch Fix building with PCH disabled.	2015-08-28 23:08:24 +02:00
Ryan Houdek	8d61706440	[AArch64] Optimize lmw. This instruction is fairly heavily used by Ikaruga to load a bunch of registers from the stack. In particular, at the start of the second stage is a block that takes up ~20% CPU time and includes a usage of lmw to load half of the guest registers. The basic optimization here is changing from a single 32bit LDR to potentially a single 128bit LDR. A single 32bit LDR is fairly slow, so we can optimize in a few ways. If we have four or more registers to load, do a 64bit LDP into two host registers, byteswap, and then move the high 32bits of the host registers into the correct mapped guest register locations. If we have two registers to load, then do a 32bit LDP, which will load two guest registers in a single instruction. Then, if we have only one register left to load, load it as before. This saves quite a few cycles, since the LDR instruction on the Cortex-A57 and A72 takes a few cycles. Each 32bit LDR takes 4 cycles of latency, plus 1 cycle for post-index (which typically happens in parallel). Both the 32bit and 64bit LDP have the same latency. So we are improving latencies and reducing code bloat here.	2015-08-28 14:40:30 -05:00
Ryan Houdek	2c3fa8da28	[AArch64] Fix a bug in the register caches. This is a bug that crops up if BindToRegister() is called multiple times in a row without an R() function call between them. How to reproduce the bug: 1) Have a completely filled cache with no host registers remaining 2) Call BindToRegister() with different guest registers 3) Don't call R() between the BindToRegister() calls. This issue typically wouldn't be seen for a couple of reasons. Typically we have /plenty/ of registers in the cache, and in most cases we only call BindToRegister() once per instruction. On the off chance that it is called multiple times, it wouldn't update the last used counts and would flush the same register as the previous call to it.	2015-08-28 14:36:14 -05:00
Rohit Nirmal	6252d2d71a	Fix building with PCH disabled.	2015-08-28 14:13:28 -05:00
Lioncash	a6bd2fea28	Merge pull request #2919 from lioncash/vec Vec3: Remove a memset call on the this pointer.	2015-08-28 15:05:02 -04:00
Lioncash	e787501528	Vec3: Simplify operator== code	2015-08-28 14:46:40 -04:00
Markus Wick	b11de5bddb	Merge pull request #2918 from lioncash/memcpy DataReader: Get rid of pointer casts	2015-08-28 20:45:15 +02:00
Lioncash	d86d5fae9f	Merge pull request #2909 from aserna3/DollsAndElves Implemented .elf and .dol support in gamelist	2015-08-28 14:28:09 -04:00
Lioncash	bb27f80a65	Vec3: Remove a memset call on the this pointer	2015-08-28 14:10:07 -04:00
Anthony Serna	faedf1bc5c	Implemented .elf and .dol support in gamelist. Fixed a TON of structuring and formatting. Removed README.txt files from themes at MaJoR's request. Added platform icon for ELFs/DOLs.	2015-08-28 11:10:03 -07:00
Ryan Houdek	01db003779	[Android] Workaround Mali driver issue on the Samsung Galaxy S6. Samsung updated the video drivers on the SGS6, which introduced a bug when disabling vsync. Both driver versions are r5p0, but the md5sums of the blobs differ. To work around the issue, make sure to never disable vsync by calling eglSwapInterval. We can't actually determine the driver version on Android yet, so until a driver lands that displays the driver version string in the GL_VERSION string, we will need to keep this workaround enabled at all times, which is a bit annoying. Current Mali drivers return the video driver version in one of the EGL strings you can query. The issue with that is that Android eats all of those strings, so we can't query it.	2015-08-28 09:02:46 -05:00
flacs	d373dd372d	Merge pull request #2913 from Tilka/fix_warning_fix AVIDump: fix -Wsign-compare warning	2015-08-27 23:50:34 +02:00
Lioncash	4fb3a8b78d	DataReader: Get rid of pointer casts	2015-08-27 13:43:04 -04:00
Ryan Houdek	d9b18862f3	Merge pull request #2908 from Sonicadvance1/gles_3_2 Support OpenGL ES 3.2.	2015-08-26 18:19:17 -05:00
Ryan Houdek	447b1b09e3	Support OpenGL ES 3.2. OpenGL ES 3.2 adds a few things we care about supporting in core. In particular: - GL_{ARB,EXT,OES}_draw_elements_base_vertex - KHR_Debug - Sample Shading - GL_{ARB,EXT,OES,NV}_copy_image - Geometry shaders - Geometry shader instancing (If they support GL_{EXT,OES}_geometry_point_size) Nvidia was the first to release an OpenGL ES 3.2 driver, which I used to test this. This also enables GS Instancing on GLES 3.1 hardware if it supports all of the required extensions.	2015-08-26 17:57:51 -05:00
Ryan Houdek	6d25c469cf	Merge pull request #2915 from degasus/arm JitArm64: Implement rlwnmx	2015-08-26 15:52:37 -05:00
Markus Wick	54f882704a	Merge pull request #2914 from JosJuice/fix-volumedirectory Fix VolumeDirectory	2015-08-26 22:12:23 +02:00
degasus	e516d4ef59	JitArm64: Implement rlwnmx	2015-08-26 21:59:10 +02:00
JosJuice	d276d1abbb	Fix VolumeDirectory Fixes the regression from `a225426` and clarifies a related comment.	2015-08-26 19:21:09 +02:00
Markus Wick	3e9dac3910	Merge pull request #2810 from Sonicadvance1/disassembler_improv Have the disassembler show the PC next to host instructions.	2015-08-26 17:01:39 +02:00
flacs	99e88a7af7	Merge pull request #2887 from Tilka/swap Jit64: some byte-swapping changes	2015-08-26 16:43:45 +02:00
flacs	eb6ac641be	Merge pull request #2906 from Tilka/fpscr Jit64: fix bugs in the FPSCR instructions	2015-08-26 16:43:28 +02:00
Tillmann Karras	6ec4bdf862	CoreTiming: remove unused functions	2015-08-26 15:40:15 +02:00
Tillmann Karras	0f4861cac2	CoreTiming: make loops easier to read	2015-08-26 14:53:58 +02:00
Ryan Houdek	ca51f1a4f6	[AArch64] Optimize paired registers being used in double operations. In particular, this optimizes the case where a 32bit float is loaded via lfs and then used in double operations. This happens very often in Gekko based code because the best way to load a 32bit value as a double is lfs, since it automatically turns into a double value. There are a few other implications of this in practice as well. For example, if both of the paired registers are loaded via psq_l and then used in double operations, it would be improved. Also, if we implement a double register, we've got to be careful to make sure we understand if it is in the "lower" register or the full 128bit register.	2015-08-26 05:50:04 -05:00
Markus Wick	5716d18d10	Merge pull request #2910 from Sonicadvance1/aarch64_regcache_fix [AArch64] Fix a bug in the register cache.	2015-08-26 08:31:24 +02:00
Ryan Houdek	4f5f29a0fb	[AArch64] Fix a bug in the register cache. If the register was only a lower pair and it needed the full register, then we need to load the high 64bits, which we weren't doing before.	2015-08-26 01:21:43 -05:00
Markus Wick	43d17cb360	Merge pull request #2904 from Sonicadvance1/aarch64_more_inst [AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.	2015-08-26 07:48:24 +02:00
Tillmann Karras	ee4a12ffe2	Jit64: some byte-swapping changes	2015-08-26 05:41:18 +02:00
flacs	6015e2d812	Merge pull request #2900 from aroulin/x64emitter-rcp x64Emitter: add RCPPS and RCPSS SSE instructions	2015-08-26 05:05:53 +02:00
Ryan Houdek	6729a36d8d	[AArch64] Set BindToRegister's to_load correctly for double FP ops.	2015-08-25 21:29:27 -05:00
Lioncash	db4f692482	GCMemcard: Clean up memcard logging messages.	2015-08-25 21:55:52 -04:00
Tillmann Karras	ee50a2ef28	Jit64: fix bugs in the FPSCR instructions	2015-08-25 23:48:14 +02:00
Markus Wick	bd08c1b01a	Merge pull request #2901 from Sonicadvance1/aarch64_stfiwx [AArch64] Implement stfiwx	2015-08-25 22:47:39 +02:00
Markus Wick	24cb650078	Merge pull request #2663 from degasus/dcbx Jit64: dcbf + dcbi	2015-08-25 12:16:56 +02:00
Ryan Houdek	0666c0750b	[AArch64] Implement fdivx/fdivsx/mfcr/mtcrf. Gets the povray bench to better times than the Wii.	2015-08-24 15:32:19 -05:00
Ryan Houdek	d96be9250c	Merge pull request #2899 from Sonicadvance1/aarch64_fctiwzx [AArch64] Implement fctiwzx	2015-08-24 13:22:27 -05:00
Ryan Houdek	cd03b8baf6	Merge pull request #2895 from Sonicadvance1/qualcomm_workaround_gles31 Disable OpenGL ES 3.1 on all Qualcomm Adreno devices.	2015-08-24 13:22:12 -05:00
degasus	0d92c8fb89	Jit64: Optimize dcbx	2015-08-24 18:33:23 +02:00
Tillmann Karras	ac84d6d0fa	Jit64: some cache flush changes: dynamically allocate third scratch register instead of forcing ECX; use LEA as 3 operand add if possible; use BT,JC instead of SHR,TEST,JNZ; merge MOV,TEST; use appropriate ABI function (no asm change)	2015-08-24 18:33:23 +02:00
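
The load-splitting rule described in commit 8d61706440 ([AArch64] Optimize lmw) can be illustrated with a small planning sketch. This is not Dolphin's code (the JIT itself is C++ and is not reproduced here); the function name `plan_lmw_loads` and the labels `ldp64`, `ldp32`, and `ldr32` are hypothetical, chosen only for illustration. It assumes the standard PowerPC behaviour that lmw loads consecutive registers from rD up to r31, and that groups are taken greedily from the lowest register, which the commit message suggests but does not spell out.

```python
def plan_lmw_loads(first_reg, last_reg=31):
    """Split guest registers first_reg..last_reg into load groups.

    Hypothetical sketch of the rule in the commit message:
      - four or more registers left: one 64-bit LDP (covers 4 guest regs)
      - two or three registers left: one 32-bit LDP (covers 2 guest regs)
      - one register left:           one 32-bit LDR (covers 1 guest reg)
    Returns a list of (kind, registers) tuples.
    """
    plan = []
    reg = first_reg
    while reg <= last_reg:
        remaining = last_reg - reg + 1
        if remaining >= 4:
            kind, count = "ldp64", 4
        elif remaining >= 2:
            kind, count = "ldp32", 2
        else:
            kind, count = "ldr32", 1
        # Record which guest registers this single host load would fill.
        plan.append((kind, list(range(reg, reg + count))))
        reg += count
    return plan


if __name__ == "__main__":
    # lmw r25 loads seven registers (r25..r31).
    for kind, regs in plan_lmw_loads(25):
        print(kind, regs)
```

Under these assumptions, an lmw starting at r25 needs three host loads instead of seven single 32-bit loads, which is the latency and code-size saving the commit message describes.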


13331 Commits