Commit Graph

  • 3dcbd25e7f Merge pull request #92 from chrisps/canary_experimental chrisps 2022-11-05 11:59:35 -0700
  • c70ae76a69 hopefully switched cxxopts to the main master branch now that the selectany changes are accepted chss95cs@gmail.com 2022-11-05 11:08:04 -0700
  • c1d922eebf Minor decoder optimizations, kernel fixes, cpu backend fixes chss95cs@gmail.com 2022-11-05 10:50:33 -0700
  • ba66373d8c [APU][Janky] Fixed issues with incorrect frames on streamed data Gliniak 2022-11-03 20:56:36 +0100
  • dae508500a [APU] Clear remaining packets skip when we're done with current stream Gliniak 2022-11-03 12:59:34 +0100
  • 4ba14bc35e [APU+HID] Optimizations Margen67 2022-11-03 03:56:13 -0700
  • b23566b823 [APU] Fix incorrect packet frame count when frame ends exactly where packet ends Gliniak 2022-11-03 11:14:37 +0100
  • 259679d53c [APU] Handle exceeding input offset by switching buffer Gliniak 2022-11-02 08:47:36 +0100
  • ff0f3fcc9d Merge pull request #89 from xenia-canary/revert-87-canary_experimental chrisps 2022-11-01 14:46:55 -0700
  • 8186792113 Revert "Minor decoder optimizations, kernel fixes, cpu backend fixes" chrisps 2022-11-01 14:45:36 -0700
  • 781871e2d5 Merge pull request #87 from chrisps/canary_experimental chrisps 2022-11-01 11:49:10 -0700
  • c080e2e17c [APU] Resolved crashes related to out of bound readouts Gliniak 2022-11-01 11:24:01 +0100
  • 778333b1b5 [UI] Fix ClearInput not called in ImGuiDrawer after deferred dialog removal Triang3l 2022-10-31 18:57:54 +0300
  • 06bfd624de fix failed debug build from loops variable assert chss95cs@gmail.com 2022-10-30 12:33:08 -0700
  • 941237027d fix ffmpeg submodule ptr chss95cs@gmail.com 2022-10-30 11:16:05 -0700
  • bff264b5fd Fixed RtlCompareString and RtlCompareStringN, they were very wrong, for CompareString the params are struct ptrs not char ptrs chss95cs@gmail.com 2022-10-30 10:47:09 -0700
    Fixed a ton of clang-cl compiler warnings about unused variables, still many left.
    Fixed a lot of inconsistent override ones too
  • 65b9d93551 Merge branch 'xenia-canary:canary_experimental' into canary_experimental chrisps 2022-10-30 09:05:40 -0700
  • f5cc54bdae Fix building on clang-cl, it did not like the cxxopts selectany changes chss95cs@gmail.com 2022-10-30 09:05:10 -0700
  • 4fc18949a2 Merge branch 'canary_experimental' of https://github.com/chrisps/xenia-canary into canary_experimental chss95cs@gmail.com 2022-10-30 08:55:53 -0700
  • 550d1d0a7c use much faster exp2/cos approximations in ffmpeg, large decrease in cpu usage on my machine on decoder thread chss95cs@gmail.com 2022-10-30 08:48:58 -0700
    properly byteswap r13 for spinlock
    Add PPCOpcodeBits
    stub out broken fpscr updating in ppc_hir_builder. it's just code that repeatedly does nothing right now.
    add note about 0 opcode bytes being executed to ppc_frontend
    Add assert to check that function end is greater than function start, can happen with malformed functions
    Disable prefetch and cachecontrol by default, automatic hardware prefetchers already do the job for the most part
    minor cleanup in simplification_pass, dont loop optimizations, let the pass manager do it for us
    Add experimental "delay_via_maybeyield" cvar, which uses MaybeYield to "emulate" the db16cyc instruction
    Add much faster/simpler way of directly calling guest functions, no longer have to do a byte by byte search through the generated code
    Generate label string ids on the fly
    Fix unused function warnings for prefetch on clang, fix many other clang warnings
    Eliminated majority of CallNativeSafes by replacing them with naive generic code paths.
    ^ Vector rotate left, vector shift left, vector shift right, vector shift arithmetic right, and vector average are included
    These naive paths are implemented small loops that stash the two inputs to the stack and load them in gprs from there, they are not particularly fast but should be an order of magnitude faster than callnativesafe to a host function, which would involve a call, stashing all volatile registers, an indirect call, potentially setting up a stack frame for the arrays that the inputs get stashed to, the actual operations, a return, loading all volatile registers, a return, etc
    Added the fast SHR_V128 path back in
    Implement signed vector average byte, signed vector average word. previously we were emitting no code for them. signed vector average byte appears in many games
    Fix bug with signed vector average 32, we were doing unsigned shift, turning negative values into big positive ones potentially
  • 7d44e638c6 [UI] Resolved issue with next window disappearing after providing input Gliniak 2022-10-28 21:09:57 +0200
  • 5326000f8f [CPU] Increase amount of possible labels used in FinalizationPass Gliniak 2022-10-27 20:59:15 +0200
  • 55877f4c61 [APU] Force buffer swap at the end of stream Gliniak 2022-10-25 17:20:45 +0200
  • 6b11787c93 [APU] Fixed typo that prevented last packet in stream to be processed Gliniak 2022-10-24 21:33:25 +0200
  • fac2a89d0f Disallow offset to be set before header, header size fix, audio channels crashfix Gliniak 2022-10-24 19:40:21 +0200
  • 954cc64e48 [x64] Add AX512 optimization for `OPCODE_SELECT`(F64) Wunkolo 2022-10-23 22:21:25 -0700
  • c5ecb6ff4b [x64] Add AX512 optimization for `OPCODE_SELECT`(V128) Wunkolo 2022-10-23 21:27:36 -0700
  • 43cd387e86 [x64] Add `x64_util.h` Wunkolo 2022-10-23 20:38:14 -0700
  • a37b57ca8d [GPU] Fix tiled mip tail extent calculation Triang3l 2022-10-23 21:19:41 +0300
  • 74f1f6bb6d [Vulkan] Check depthClamp feature Triang3l 2022-10-23 19:01:17 +0300
  • 4add1f67b1 [D3D12] Replace unused shared memory view with a null view Triang3l 2022-10-23 18:09:47 +0300
  • 5fde7c6aa5 [x64] Add AVX512 optimizations for `PERMUTE_V128` Wunkolo 2022-09-05 09:26:26 -0700
  • f207239349 [x64] Add `kX64EmitAVX512VBMI` feature-flag and detection Wunkolo 2022-02-04 22:51:17 -0800
  • d73088e5ca [x64] Add AVX512 optimization for `OPCODE_VECTOR_SUB`(saturated) Wunkolo 2022-09-18 08:10:19 -0700
  • 7c375879bc Merge pull request #85 from chrisps/canary_experimental Radosław Gliński 2022-10-21 14:18:03 +0200
  • 4493d17acc Update hir_builder.cc chrisps 2022-10-20 14:58:27 -0700
  • adc3405537 change else{if} to else if in AndNot chrisps 2022-10-20 14:56:55 -0700
  • 48fea6d9aa Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental Gliniak 2022-10-18 12:19:52 +0200
  • cdb40ddb28 [DXBC] Fix interpolator copying from v# to r# in PS Triang3l 2022-10-18 13:12:20 +0300
  • c873049263 [x64] Add AVX512 optimization for `OPCODE_VECTOR_SUB`(saturated) Wunkolo 2022-09-18 08:10:19 -0700
  • 23ec3bf1b1 [x64] Add AVX512 optimizations for `PERMUTE_V128` Wunkolo 2022-09-05 09:26:26 -0700
  • 84344d7f01 [x64] Add `kX64EmitAVX512VBMI` feature-flag and detection Wunkolo 2022-02-04 22:51:17 -0800
  • e7b76d32b4 [Base] BitStream: Prevent readout beyond buffer Gliniak 2022-10-17 11:50:09 +0200
  • d8b7b3ecec Fix bindless path in d3d12 that i broke in earlier commit (did not affect any users, thats a debug thing) chss95cs@gmail.com 2022-10-16 07:48:43 -0700
    Fix guest code profiler, it previously only worked with function precomp + all code you were about to execute already discovered
    Allow AndNot if type is V128
  • b41e5060da Fix bindless path in d3d12 that i broke in earlier commit (did not affect any users, thats a debug thing) chss95cs@gmail.com 2022-10-16 07:47:27 -0700
    Fix guest code profiler, it previously only worked with function precomp + all code you were about to execute already discovered
    Allow AndNot if type is V128
  • 22e52cbecd Canary can now run on sandy bridge/e and ivy bridge/e chss95cs@gmail.com 2022-10-15 05:14:53 -0700
    Stubbed out OPCODE_AND_NOT, its fallback implementation if bmi1 was not supported was broken. it's difficult to tell where the actual issue is there.
  • 7204532b1c Implement RtlUpcaseUnicodeChar chss95cs@gmail.com 2022-10-15 04:29:13 -0700
  • d7fa8481af Switch cxxopts over to version with selectany while i wait for the selectany change to be merged there chss95cs@gmail.com 2022-10-15 03:49:12 -0700
  • a495709344 Merge branch 'xenia-canary:canary_experimental' into canary_experimental chrisps 2022-10-15 03:07:35 -0700
  • efbeae660c Drastically reduce cpu time wasted by XMADecoderThread spinning, went from 13% of all cpu time to about 0.6% in my tests chss95cs@gmail.com 2022-10-15 03:07:07 -0700
    Commented out lock in WatchMemoryRange, lock is always held by caller
    properly set the value/check the irql for spinlocks in xboxkrnl_threading
  • d262214c1b Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental Gliniak 2022-10-14 20:13:03 +0200
  • e5d5f73875 Update src/xenia/app/emulator_window.cc Nicholas Huelin 2022-10-13 15:02:19 -0400
  • c09b9e7fbf Update src/xenia/app/emulator_window.cc Nicholas Huelin 2022-10-13 15:02:11 -0400
  • 112af805c8 Add screenshot feature SirQuartz 2022-10-09 01:18:49 -0400
  • 07e760612e Add message box when emulator fails to load invalid or corrupted file formats. SirQuartz 2022-10-10 15:06:01 -0400
  • 28b565c0c7 Merge pull request #82 from chrisps/canary_experimental chrisps 2022-10-09 14:07:43 -0700
  • ecf6bfbbdf Stub event query for linux, fix missing semicolon in linux SetEventBoostPriority chss95cs@gmail.com 2022-10-09 12:30:18 -0700
  • 45050b2380 [GPU] Vulkan fragment shader interlock RB and related fixes/cleanup Triang3l 2022-10-09 22:06:41 +0300
  • c923ab78a9 [Config] Added note about internal_display_resolution Gliniak 2022-10-09 12:33:04 +0200
  • 7975ea78d4 [Base] BitStream: Prevent readout beyond buffer Gliniak 2022-10-09 12:24:46 +0200
  • 17b3939bbf Revert "[Base] Changed size of bitstream accessed data (Risky)" Gliniak 2022-10-09 12:18:43 +0200
  • 08d38bdff6 Merge pull request #81 from chrisps/canary_experimental chrisps 2022-10-08 12:04:43 -0700
  • 2dd6f33f4b Fix debug/ui premake too chss95cs@gmail.com 2022-10-08 10:34:50 -0700
  • bcd57f8663 Merge branch 'xenia-canary:canary_experimental' into canary_experimental chrisps 2022-10-08 10:11:30 -0700
  • d8c94b1aee Fix premake filter mistake that broke debug builds (and likely any build other than release) chss95cs@gmail.com 2022-10-08 10:10:36 -0700
  • 8f7f7dc6ad fixed wine crash from use of NtSetEventPriorityBoost chss95cs@gmail.com 2022-10-08 09:55:17 -0700
    add xe::clear_lowest_bit, use it in place of shift-andnot in some bit iteration code
    make is_allocated_ and is_enabled_ volatile in xma_context
    preallocate avpacket buffer in XMAContext::Setup, the reallocations of the buffer in ffmpeg were showing up on profiles
    check is_enabled and is_allocated BEFORE locking an xmacontext. XMA worker was spending most of its time locking and unlocking contexts
    Removed XeDMAC, dma:: namespace. It was a bad idea and I couldn't make it work in the end. Kept vastcpy and moved it to the memory namespace instead
    Made the rest of global_critical_region's members static. They never needed an instance.
    Removed ifdef'ed out code from ring_buffer.h
    Added EventInfo struct to threading, added Event::Query to aid with implementing NtQueryEvent.
    Removed vector from WaitMultiple, instead use a fixed array of 64 handles that we populate. WaitForMultipleObjects cannot handle more than 64 objects.
    Remove XE_MSVC_OPTIMIZE_SMALL() use in x64_sequences, x64 backend is now always size optimized because of premake
    Make global_critical_region_ static constexpr in shared_memory.h to get rid of wasteage of 8 bytes (empty class=1byte, +alignment for next member=8)
    Move trace-related data to the tail of SharedMemory to keep more important data together
    In IssueDraw build an array of fetch constant addresses/sizes, then pre-lock the global lock before doing requestrange for each instead of individually locking within requestrange for each of them
    Consistent access specifier protected for pm4_command_processor_declare
    Devirtualize WriteOneRegisterFromRing.
    Move ExecutePacket and ExecutePrimaryBuffer to pm4_command_buffer_x
    Remove many redundant header inclusions access xenia-gpu
    Minor microoptimization of ExecutePacketType0
  • 50fce8bdb3 Merge pull request #80 from chrisps/canary_experimental chrisps 2022-10-05 04:15:53 -0700
  • bae63b95c5 Update to latest version of cxxopts chss95cs@gmail.com 2022-09-30 06:51:25 -0700
  • b4c175d8a3 Enable SDL_LEAN_AND_MEAN, SDL_RENDER_DISABLED, saves about 500kb in final exe chss95cs@gmail.com 2022-09-29 07:26:38 -0700
  • 7e58a3b320 Fix compiler errors i introduced under clang-cl chss95cs@gmail.com 2022-09-29 07:04:17 -0700
    remove xe_kernel_export_shim_fn field of Export function_data, trampoline is now the only way exports get invoked
    Remove kernelstate argument from string functions in order to conform to the trampoline signature (the argument was unused anyway)
    Constant-evaluated initialization of ppc_opcode_disasm_table, removal of unused std::vector fields
    Constant-evaluated initialization of export tables
    name field on export is just a const char* now, only immutable static strings are ever passed to it
    Remove unused callcount field of export.
    PM4 compare op function extracted
    Globally apply /Oy, /GS-, /Gw on msvc windows
    Remove imgui testwindow code call, it took up like 300 kb
  • 596fafcd13 [GPU] Allow setting vsync interval using numerator/denominator beeanyew 2022-09-25 12:03:28 +0200
  • 00b2222c36 [App/UI] Make menu item index handling a bit more robust beeanyew 2022-09-24 13:51:16 +0200
  • 203267b106 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental Gliniak 2022-09-23 12:23:53 +0200
  • 9ab4db285c [Premake] Update premake-cmake Joel Linn 2022-09-22 12:44:10 +0200
  • 3bfa3b05e1 Lint fix. Rick Gibbed 2022-09-22 06:34:21 -0500
  • 8f85320a83 [Premake] Update premake-cmake Joel Linn 2022-09-22 12:44:10 +0200
  • 3733ef7041 [App/UI/Kernel] Add console region setting and locale menu beeanyew 2022-09-21 15:57:32 +0200
  • 7d970967c4 Merge branch 'master' of https://github.com/xenia-project/xenia into canary_experimental Gliniak 2022-09-20 21:15:12 +0200
  • def00e6ddb Merge pull request #76 from beeanyew/input-system-mutex-revert chrisps 2022-09-18 07:46:02 -0700
  • cd17f1846f [Input System] xe_mutex revert beeanyew 2022-09-18 15:18:29 +0200
  • a29a7436e0 Merge pull request #75 from chrisps/canary_experimental chrisps 2022-09-17 06:43:50 -0700
  • d0acd68369 Merge branch 'xenia-canary:canary_experimental' into canary_experimental chrisps 2022-09-17 07:05:24 -0400
  • eb8154908c atomic cas use prefetchw if available chss95cs@gmail.com 2022-09-17 04:04:53 -0700
    remove useless memorybarrier
    remove double membarrier in wait pm4 cmd
    add int64 cvar
    use int64 cvar for x64 feature mask
    Rework some functions that were frontend bound according to vtune placing some of their code in different noinline functions, profiling after indicating l1 cache misses decreased and perf of func increased
    remove long vpinsrd dep chain code for conversion.h, instead do normal load+bswap or movbe if avail
    Much faster entry table via split_map, code size could be improved though
    GetResolveInfo was very large and had impact on icache, mark callees as noinline + msvc pragma optimize small
    use log2 shifts instead of integer divides in memory
    minor optimizations in PhysicalHeap::EnableAccessCallbacks, the majority of time in the function is spent looping, NOT calling Protect! Someone should optimize this function and rework the algo completely
    remove wonky scheduling log message, it was spammy and unhelpful
    lock count was unnecessary for criticalsection mutex, criticalsection is already a recursive mutex
    brief notes i gotta run
  • addd8c94e5 [x64] Add AVX512 optimization for `OPCODE_VECTOR_ADD`(saturated) Wunkolo 2022-09-09 15:59:16 -0700
  • 6e42db4a85 [x64] Add AVX512 optimization for `OPCODE_VECTOR_ADD`(saturated) Wunkolo 2022-09-09 15:59:16 -0700
  • 9fd684594b [x64] Add AVX512 optimization for `OPCODE_VECTOR_CONVERT_F2I`(unsigned) Wunkolo 2022-09-09 14:16:19 -0700
  • b4224ff3dc Merge pull request #74 from chrisps/canary_experimental chrisps 2022-09-11 18:02:00 -0400
  • 0fd4a2533b Prevent clang-format from moving d3d12_nvapi above the require d3d12 headers chss95cs@gmail.com 2022-09-11 14:35:33 -0700
  • 20638c2e61 use Sleep(0) instead of SwitchToThread, should waste less power and help the os with scheduling. chss95cs@gmail.com 2022-09-11 14:14:48 -0700
    PM4 buffer handling made a virtual member of commandprocessor, place the implementation/declaration into reusable macro files. this is probably the biggest boost here.
    Optimized SET_CONSTANT/ LOAD_CONSTANT pm4 ops based on the register range they start writing at, this was also a nice boost
  • 35baebf14c [x64] Add AVX512 optimization for `OPCODE_VECTOR_CONVERT_F2I`(unsigned) Wunkolo 2022-09-09 14:16:19 -0700
  • 2e29056ac0 Merge bec9c387be into 90fffe1de7 Alexandre Messier 2022-09-06 21:57:04 +0200
  • 90fffe1de7 [PPC] Fix memory assert formatting Wunkolo 2022-09-05 09:54:37 -0700
  • b0cc3db4d8 [x64] Add AVX512 optimization for `NOT_V128` Wunkolo 2022-09-05 08:17:33 -0700
  • 470bdf81d9 [PPC] Fix memory assert formatting Wunkolo 2022-09-05 09:54:37 -0700
  • 1bc1321ace [x64] Add AVX512 optimization for `NOT_V128` Wunkolo 2022-09-05 08:17:33 -0700
  • 9a6dd4cd6f Merge branch 'xenia-canary:canary_experimental' into canary_experimental chrisps 2022-09-05 09:08:46 -0400
  • 0c576877c8 Add constant folding for LVR when 16 aligned, clean up prior commit by removing dead test code for LVR/LVL/STVL/STVR opcodes and legacy hir sequence chss95cs@gmail.com 2022-09-04 11:44:29 -0700
    Delay using mm_pause in KeAcquireSpinLockAtRaisedIrql_entry, a huge amount of time is spent spinning in halo3
  • d372d8d5e3 nasty commit with a bunch of test code left in, will clean up and pr chss95cs@gmail.com 2022-09-04 11:04:41 -0700
  • f62ac9868a Make portable default for new install illusion0001 2022-08-31 12:01:19 -0500
  • 5476d5e422 Merge branch 'xenia-canary:canary_experimental' into canary_experimental chrisps 2022-09-04 14:45:03 -0400