Add special case to TYPE_INT64's EmitAnd for a UINT_MAX mask. If detected, emit a 32-bit mov instead to take advantage of implicit zero extension and register renaming.
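A minimal sketch of the idea, assuming xbyak (which the x64 backend uses); the registers here are illustrative, not the actual EmitAnd operands:

```cpp
#include <xbyak/xbyak.h>

struct And64UintMaxSketch : Xbyak::CodeGenerator {
  And64UintMaxSketch() {
    // Instead of materializing 0xFFFFFFFF and emitting a 64-bit and, write
    // the low 32 bits with a 32-bit mov: on x64 this implicitly zeroes bits
    // 63:32, and register renaming makes the mov nearly free.
    mov(eax, ecx);  // dest = src & 0xFFFFFFFF
    ret();
  }
};
```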
Add helper function for skipping assignment defs in instr.
Add helper function for checking whether an opcode is a binary value type.
Add several new optimizations to the simplification pass, plus weak NZM (non-zero mask) calculation code (better full evaluation of Z/NZ will be done later).
List of optimizations (a minimal sketch of a few of these rules follows the list):
If a value is ANDed with a bitmask that it was already masked against, reuse the old value (this cuts out most FPSCR update garbage, although it does cause a local variable to be allocated for the masked FPSCR, and it still repeatedly stores the masked value to the context)
If masking a value that was ORed with another, check whether our mask only covers bits from one of the two inputs. If so, change the operand to the OR input that actually matters
If the only usage of a rotate-left's output is an AND against a mask that discards the bits that were rotated in, change the opcode to SHIFT_LEFT
If masking against all ones, become an assign.
If XOR or OR against 0, become an assign (additional FPSCR codegen cleanup)
If XOR against all ones, become a NOT
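A minimal sketch of a few of these rules on a toy value node with a weak NZM estimate attached; the real pass works on Xenia's HIR, and all names here are hypothetical:

```cpp
#include <cstdint>

struct Node {
  enum Op { kOther, kOr } op = kOther;
  Node* lhs = nullptr;
  Node* rhs = nullptr;
  uint64_t nzm = ~0ull;  // weak "non-zero mask": bits that may possibly be set
};

// OR rule: if the AND mask can only see bits from one OR input, use that
// input directly instead of the OR result.
Node* NarrowAndOperand(Node* value, uint64_t mask) {
  if (value->op == Node::kOr) {
    if ((value->rhs->nzm & mask) == 0) return NarrowAndOperand(value->lhs, mask);
    if ((value->lhs->nzm & mask) == 0) return NarrowAndOperand(value->rhs, mask);
  }
  return value;
}

// Constant special cases from the list: XOR/OR with 0 is a plain assign,
// XOR with all ones is a NOT (AND with all ones is likewise an assign).
enum class Rewrite { kNone, kAssign, kNot };
Rewrite ClassifyXorConstant(uint64_t c) {
  if (c == 0) return Rewrite::kAssign;   // x ^ 0 == x
  if (c == ~0ull) return Rewrite::kNot;  // x ^ ~0 == ~x
  return Rewrite::kNone;
}
```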
Add a direct CPUID check to x64_emitter for lzcnt; the version of xbyak we are using skips the lzcnt check on all non-Intel CPUs, meaning we were generating the much slower bitscan path on AMD CPUs.
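A minimal sketch of such a direct check (the helper name is hypothetical, not the actual x64_emitter code); LZCNT support is reported vendor-independently in CPUID leaf 0x80000001, ECX bit 5 (the ABM/LZCNT bit):

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <cpuid.h>
#endif

bool CpuHasLzcnt() {
#if defined(_MSC_VER)
  int regs[4];
  __cpuid(regs, 0x80000001);
  return (static_cast<uint32_t>(regs[2]) & (1u << 5)) != 0;
#else
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx)) {
    return false;
  }
  return (ecx & (1u << 5)) != 0;
#endif
}
```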
Float24-as-float32 depth bias is now applied in increments of 8, because converting the depth to float24 directly in the pixel shaders may otherwise wipe out the bias if it's too small (float24 has 3 fewer mantissa bits than float32, so its representable steps are 8 times coarser).
Resolve guest calls in the emitted asm as rel32 calls. Disabled by default, enabled via
resolve_rel32_guest_calls
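A minimal sketch of the rel32 patching arithmetic this implies (names are hypothetical); an E8 call is 5 bytes, its displacement is relative to the end of the instruction, and the target must be within roughly +/-2 GiB of the call site:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

bool EmitRel32Call(uint8_t* call_site, const uint8_t* target) {
  ptrdiff_t disp = target - (call_site + 5);
  if (disp < INT32_MIN || disp > INT32_MAX) {
    return false;  // out of rel32 range, keep the indirect call path
  }
  int32_t rel32 = static_cast<int32_t>(disp);
  call_site[0] = 0xE8;                    // CALL rel32 opcode
  std::memcpy(call_site + 1, &rel32, 4);  // little-endian displacement
  return true;
}
```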
Detect whether the CPU has fast jrcxz and fast loop/loope/loopne
Much more thorough LoadConstantXMM
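As one plausible illustration of what "more thorough" can cover (an assumption, not the actual LoadConstantXMM code): common constants can be synthesized in registers instead of being loaded from memory, e.g. with xbyak:

```cpp
#include <xbyak/xbyak.h>

struct XmmConstSketch : Xbyak::CodeGenerator {
  XmmConstSketch() {
    vpxor(xmm0, xmm0, xmm0);     // all zeros
    vpcmpeqd(xmm1, xmm1, xmm1);  // all ones
    vpcmpeqd(xmm2, xmm2, xmm2);  // all ones...
    vpsrld(xmm2, xmm2, 1);       // ...shifted right: 0x7FFFFFFF per lane
    ret();
  }
};
```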
New cvar elide_e0_check that allows the backend to assume accesses via
the SP or TLS register will not cross into 0xe0 range
Add x64 codegen for Vector shift uint8
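x64 has no native per-byte shifts, so one common emulation (shown here for the simple case of a single shared shift amount, as an illustration of the technique rather than Xenia's exact codegen) is to shift 16-bit lanes and mask off the bits that crossed a byte boundary:

```cpp
#include <emmintrin.h>  // SSE2

__m128i ShiftLeftU8(__m128i v, int n) {
  __m128i count = _mm_cvtsi32_si128(n);
  __m128i shifted = _mm_sll_epi16(v, count);  // shift 16-bit lanes
  __m128i mask = _mm_set1_epi8(static_cast<char>(0xFF << n));
  return _mm_and_si128(shifted, mask);        // drop bits shifted in from the neighbor byte
}
```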
If the CPU has fast jrcxz, use it for some traptrue/breaktrue instructions
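A hypothetical illustration of the pattern, assuming xbyak (not the actual trap emitters): jrcxz skips the trap call when the condition in rcx is zero, replacing a separate test + jz pair.

```cpp
#include <cstdint>
#include <xbyak/xbyak.h>

struct TrapTrueSketch : Xbyak::CodeGenerator {
  explicit TrapTrueSketch(const void* trap_helper) {
    Xbyak::Label skip;
    // rcx holds the already zero-extended condition value.
    jrcxz(skip);  // condition == 0 -> no trap
    mov(rax, reinterpret_cast<uintptr_t>(trap_helper));
    call(rax);
    L(skip);
    ret();
  }
};
```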
Use phat (multi-byte) nops
Add cvar use_fast_dot_product, which uses a four-instruction sequence
for both dot product instructions that ought to be equivalent. Disabled
by default.
On Vulkan, when snorm16 is unsupported, these formats may be emulated as float16, which can natively represent a wide range of numbers, including -32 to 32 with blending. However, R16G16_SNORM and R16G16B16A16_SNORM are two separate formats, which may have different support on the device.
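A minimal sketch of querying each format separately before falling back to float16 emulation (function and variable names are assumptions, not Xenia's code):

```cpp
#include <vulkan/vulkan.h>

bool SupportsColorBlending(VkPhysicalDevice physical_device, VkFormat format) {
  VkFormatProperties properties;
  vkGetPhysicalDeviceFormatProperties(physical_device, format, &properties);
  VkFormatFeatureFlags required = VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT |
                                  VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT;
  return (properties.optimalTilingFeatures & required) == required;
}

// Usage: check VK_FORMAT_R16G16_SNORM and VK_FORMAT_R16G16B16A16_SNORM
// independently, since one may be supported while the other is not.
```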
Ordering the descriptor sets by change frequency on Vulkan, in increasing order (the opposite of D3D12 root signatures). The EDRAM binding never changes there (it's always one storage buffer), while the destination buffer binding may become changeable in the future (to split dispatches if exceeding `maxStorageBufferRange`, for example).
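A minimal sketch of that ordering when creating the pipeline layout (all handles and names here are hypothetical): set 0 holds the never-rebound EDRAM storage buffer, set 1 the destination buffer that may need rebinding later.

```cpp
#include <vulkan/vulkan.h>

VkPipelineLayout CreateResolveLayout(VkDevice device,
                                     VkDescriptorSetLayout edram_set_layout,
                                     VkDescriptorSetLayout dest_set_layout) {
  // Least frequently changing set first, per the Vulkan convention.
  VkDescriptorSetLayout set_layouts[] = {edram_set_layout, dest_set_layout};
  VkPipelineLayoutCreateInfo info = {};
  info.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
  info.setLayoutCount = 2;
  info.pSetLayouts = set_layouts;
  VkPipelineLayout layout = VK_NULL_HANDLE;
  vkCreatePipelineLayout(device, &info, nullptr, &layout);
  return layout;
}
```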
Replace the movzx after setae in both ComputeMemoryAddressOffset and ComputeMemoryAddress with an xor_ of eax prior to the cmp. This shortens both sequences by 1 byte and should give a moderate ICache usage reduction given how frequently these sequences appear.
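A minimal before/after sketch using xbyak-style calls (registers are illustrative, not the actual memory-address code):

```cpp
#include <xbyak/xbyak.h>

struct SetaeSketch : Xbyak::CodeGenerator {
  SetaeSketch() {
    // Before: setae writes only al, so a 3-byte movzx followed:
    //   cmp(rcx, rdx); setae(al); movzx(eax, al);
    // After: clear eax with a 2-byte xor ahead of the cmp (xor clobbers
    // flags, so it must come first); setae then leaves an already
    // zero-extended result, saving one byte per sequence.
    xor_(eax, eax);
    cmp(rcx, rdx);
    setae(al);
    ret();
  }
};
```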