xenia-canary

Commit Graph

Author	SHA1	Message	Date
chrisps	08232de8cc	patch a mistake in NZM calculation for OPCODE_NOT	2022-06-26 09:30:56 -07:00
chss95cs@gmail.com	327cc9eff5	drastically reduce size of final generated code for rlwinm by adding special paths for rotations of 0, masks that discard the rotated bits and using And w/ UINT_MAX instead of truncate/zero extend Add special case to TYPE_INT64's EmitAnd for UINT_MAX mask. Do mov32 to 32 if detected to take advantage of implicit zero xt/reg renaming Add helper function for skipping assignment defs in instr. Add helper function for checking if an opcode is binary value type Add several new optimizations to simplificationpass, plus weak NZM calculation code (better full evaluation of Z/NZ will be done later) . List of optimizations: If a value is anded with a bitmask that it was already masked against, reuse the old value (this cuts out most FPSCR update garbage, although it does cause a local variable to be allocated for the masked FPSCR and it still repeatedly stores the masked value to the context) If masking a value that was or'ed against another check whether our mask only considers bits from one value or another. if so, change the operand to the OR input that actually matters If the only usage of a rotate left's output is an AND against a mask that discards the bits that were rotated in change the opcode to SHIFT_LEFT If masking against all ones, become an assign. If XOR or OR against 0, become an assign (additional FPSCR codegen cleanup) If XOR against all ones, become a NOT Adding a direct CPUID check to x64_emitter for lzcnt, the version of xbyak we are using is skipping checking for lzcnt on all non-intel cpus, meaning we are generating the much slower bitscan path for AMD cpus.	2022-06-25 09:58:13 -07:00
Gliniak	2b3686f0e9	[XAM] Set profile setting 'from' entry accordingly to setting existence	2022-06-24 10:10:52 +02:00
Gliniak	ce3b159683	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-22 21:05:45 +02:00
Triang3l	e9f129f67f	[GPU] Safer and more correct depth bias conversion Float24-as-float32 depth bias is now in the increments of 8, because conversion of the depth to float24 directly in the pixel shaders may destroy the bias qualitatively otherwise if it's too small.	2022-06-22 21:14:40 +03:00
Triang3l	a7885ae1a4	[GPU] Fix CPU-side float24 conversion broken recently	2022-06-22 20:47:44 +03:00
Gliniak	e7a122d943	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-22 12:18:13 +02:00
Triang3l	cbf0476d42	[D3D12] Don't round float24 depth when it's known to be exact	2022-06-22 13:14:38 +03:00
Gliniak	83269315d8	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-22 12:06:42 +02:00
Triang3l	7869b080d3	[D3D12] Truncate depth to float24 in EDRAM range ownership transfers and resolves by default Doesn't ruin the "greater or equal" depth test in subsequent rendering passes if precision is lost, unlike rounding to the nearest	2022-06-22 12:53:09 +03:00
Gliniak	87fd772393	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-21 07:54:44 +02:00
chss95cs@gmail.com	549ee28a93	Some guest function calls can now be resolved and embedded directly in the emitted asm as rel32 calls. Disabled by default, enabled via resolve_rel32_guest_calls detect whether cpu has fast jrcxz, fast loop/loope/loopne much more thorough LoadConstantXMM New cvar elide_e0_check that allows the backend to assume accesses via the SP or TLS register will not cross into 0xe0 range Add x64 codegen for Vector shift uint8 If has fast jrcxz use for some traptrue/breaktrue instructions Use phat nops Add cvar use_fast_dot_product, which uses a four instruction sequence for both dot product instructions which ought to be equivalent. disabled by default.	2022-06-20 15:08:18 -07:00
Triang3l	e2f632f8fa	[D3D12] Use udiv by constant tile size + minor transfer cleanup Drivers compile that to a multiplication and a shift anyway.	2022-06-20 22:39:30 +03:00
Gliniak	a4ff64c465	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-20 21:07:32 +02:00
Triang3l	207e11c8d2	[GPU] Separate range arguments for fixed16 RG and RGBA in GetResolveInfo On Vulkan, when snorm16 is unsupported, these formats may be emulated as float16, which natively can represent a wide range of numbers including -32 to 32 with blending. However, R16G16_SNORM and R16G16B16A16_SNORM are two separate formats, which may have different support on the device.	2022-06-20 12:29:45 +03:00
Triang3l	3b4845511d	[Vulkan] Don't require an explicit uint64_t cast for SetDeviceObjectName	2022-06-20 12:25:52 +03:00
Triang3l	67ff108f53	[Vulkan] Explain why CreateShaderModule takes uint32_t* [ci skip]	2022-06-20 12:22:41 +03:00
Triang3l	b61953374e	[GPU] Make resolve EDRAM binding DS 0 and rename it Ordering the descriptor sets by the change frequency on Vulkan, in increasing order (the opposite of D3D12 root signatures). The EDRAM binding never changes there (always one storage buffer), while the destination buffer binding may become changeable in the future (to split dispatches if exceeding `maxStorageBufferRange`, for example).	2022-06-20 12:15:52 +03:00
Triang3l	9b83d3d0f4	[GPU] XeSL resolve shaders + host depth store width fix	2022-06-19 17:50:21 +03:00
Gliniak	1e369afa3d	[Memory] Allocate system heap memory from bottom of heap last quarter Aka. From 0x30000000	2022-06-17 22:23:39 +02:00
Gliniak	0b183a3582	Merge branch 'chris_cpu_changes' of https://github.com/Gliniak/xenia.git into canary_experimental	2022-06-17 14:04:58 +02:00
chrisps	e4fd015886	Juicy optimization goodness	2022-06-17 14:03:24 +02:00
chss95cs@gmail.com	8a8ff6ae46	Reuse flag results in OPCODE_BRANCH_TRUE codegen if the preceding instruction was a comparison that already set the cpu flags	2022-06-17 11:13:49 +02:00
chss95cs@gmail.com	3675b3860a	Add constant folding for OPCODE_ROTATE_LEFT	2022-06-17 11:12:49 +02:00
chrisps	3ad80810b5	Optimized CONVERT_I64_TO_F64 with neat overflow trick Reduced instruction count from 11 to 8, eliminated a movq stall.	2022-06-17 11:10:48 +02:00
chrisps	9dfbef8acf	Smaller ComputeMemoryAddress/Offset sequence Replace a movzx after setae in both ComputeMemoryAddressOffset and ComputeMemoryAddress with a xor_ of eax prior to the cmp. This reduces the length in bytes of both sequences by 1, and should be a moderate ICache usage reduction thanks to the frequency of these sequences.	2022-06-17 11:10:27 +02:00
Gliniak	c0483f8bee	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-17 10:58:15 +02:00
Triang3l	166be463be	[XeSL] Metal Shading Language definitions	2022-06-16 21:39:16 +03:00
Gliniak	e8aaddf4d5	Merge remote-tracking branch 'GliniakRepo/patchingSystem' into canary_experimental	2022-06-14 17:50:25 +02:00
Gliniak	91f43a374d	Initial support for xex patching	2022-06-12 20:10:07 +02:00
Gliniak	945976a31d	Added Premake Files For PatchingSystem	2022-06-12 19:58:12 +02:00
Triang3l	820b7ba217	[GPU] Fix GetActiveTextureHostSwizzle return type	2022-06-12 18:50:38 +03:00
Gliniak	90d67ac11c	[Kernel] Return X_STATUS_END_OF_FILE for async file read when offset > file_size	2022-06-09 21:36:09 +02:00
Triang3l	78d1eb8bf8	[GPU] TextureCache::GetActiveTextureHostSwizzle	2022-06-09 21:34:21 +03:00
Gliniak	d0175ddf2f	[XAM] Cut handle mask from socket handles, added support for: NetDll_getsockopt Only positive values should be interpreted as valid sockets!	2022-06-08 19:59:15 +02:00
Gliniak	25f3e16baa	[Patcher] Fixed issue with incorrect patches endianness	2022-06-08 19:42:18 +02:00
Gliniak	0de0f40fb5	[XAM] Added stubs for: - NetDll_XNetCreateKey - NetDll_XNetRegisterKey This will allow certain games to run local multiplayer For example PDZ Deathmatch mode	2022-06-07 20:46:47 +02:00
Triang3l	56f72da137	[GPU] More exact PWL texture/RT gamma conversion	2022-06-07 21:26:34 +03:00
Gliniak	916eb1b9bd	[XAM] Scan every controller slot if provided flags contains USER_ANY flag	2022-06-07 15:52:41 +02:00
Margen67	5701823ccf	Log title_name	2022-06-07 09:43:04 +02:00
jgoyvaerts	5296d2e91e	Fix xenia.log file not always being created in the executable folder.	2022-06-07 09:41:52 +02:00
Gliniak	c7da7e1999	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-02 22:19:43 +02:00
Triang3l	55a91afcc7	[D3D12] Don't decompress unaligned BC textures if supported	2022-06-02 22:48:03 +03:00
Triang3l	84fcd5defa	[GPU] Fix resolve destination offset and extent calculation	2022-06-02 21:47:30 +03:00
Triang3l	a9a072bf00	[GPU] Explain why a 32x32x4bpp linear texture takes 2 pages, not 1 [ci skip]	2022-06-01 13:00:23 +03:00
Triang3l	8bd244f277	[GPU] Better explanation for exact texture memory extent calculation [ci skip]	2022-06-01 12:55:16 +03:00
Gliniak	3169aa2ff3	Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental	2022-06-01 08:45:21 +02:00
Triang3l	d1ad10b98c	[GPU] Primitive reset comment typo correction [ci skip]	2022-05-31 23:23:53 +03:00
Triang3l	efd7ef212a	[D3D12] 128 megatexel limit explanation based on the spec [ci skip]	2022-05-31 23:23:10 +03:00
Triang3l	25594c918c	[GPU] Fix tiled texture memory extent calculation	2022-05-31 23:17:33 +03:00
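
The simplification-pass rewrites listed in commit 327cc9eff5 rest on a handful of bitwise identities. The following is a minimal sketch that checks those identities on random 32-bit values. It is not code from the repository: the helper names (`rotl32`, `shl32`, `check_identities`) are hypothetical, and the 32-bit width is an assumption made for illustration.

```python
import random

MASK32 = 0xFFFFFFFF  # assumed 32-bit operand width, for illustration only


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (hypothetical helper)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def shl32(x, n):
    """Shift a 32-bit value left by n bits, discarding overflow."""
    return (x << n) & MASK32


def check_identities(trials=1000, seed=0):
    """Check the identities behind the listed rewrites on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(32)
        n = rng.randrange(1, 32)
        m = rng.getrandbits(32)

        # Rotate left followed by a mask that discards the rotated-in
        # (low n) bits gives the same result as a plain shift left.
        keep_high = m & ~((1 << n) - 1) & MASK32
        assert rotl32(x, n) & keep_high == shl32(x, n) & keep_high

        # Masking again with a mask already applied changes nothing,
        # so the earlier masked value can be reused.
        assert (x & m) & m == x & m

        # AND with all ones, and XOR or OR with 0, are plain assignments.
        assert x & MASK32 == x
        assert x ^ 0 == x
        assert x | 0 == x

        # XOR with all ones is a bitwise NOT.
        assert x ^ MASK32 == ~x & MASK32

        # If the mask only covers bits that can come from one OR input,
        # the other input can be dropped.
        b = rng.getrandbits(32) & ~m & MASK32  # b is zero wherever m is set
        assert (x | b) & m == x & m
    return True
```

Calling `check_identities()` returns `True` when every identity holds for all sampled values. A random check like this only builds confidence in the rewrites; it is not a proof, and it says nothing about how the pass tracks known-zero bits.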

5333 Commits