Previously, the OSD neglected to mention any sort of message when the savestate load is failed, the following patch now also prints a message on OSD when detecting such cases of loading an incomplete/corrupt savestate.
This prevents the internal state of the objects from becoming
inconsistent, which causes inflate() to fail with recent zlib versions
(1.2.9 and later).
EE interpeter: remove unused argument
rdd is neither used, nor needed. It appears it was there to pass the _Rd_ word to write to, but the writing was moved to PHMSBH() to have one "if (_Rd_)".
Add a note on undefined behavior
The ring buffer is composed of severals read/write by transaction.
Atomic operations are only required at the start/end of the full
transaction. In the middle, you can use normal variable (optimization
opportunity for the compiler)
Use acquire/release semantics on isBusy and vuCycles to remain 100% safe
(relaxed might be doable but better be safe than sorry)
Use dedicated cache line for atomic variable to avoid any conflict between CPU
The struct is copied in various ring buffer (hot path)
We only need the return status of the function so use a reference instead of
a state variable
Side note: if we align the struct to 16B maybe the compiler can use SSE to copy it.
Warning: it breaks save state compatibility
Hopefully this translates well to slower systems :)
Tekken Tag:
Before: 79-81fps
After: 82-84fps
Front Mission 4 intro (as it pans over the roofs)
Before: 158-159fps
After: 165-166fps
Previously, the seconds variable of the RTC was updated on progressive modes after every 50 Vsyncs, which was obviously wrong. The code has been adjusted to update the RTC with respect to the vertical frequencies of various other video modes.
16B alignment is now useless for nVifBlock (no more SSE)
However update the alignment of bucket to 64B. It will reduce cache miss
probability in the find loop
It avoids memory stalls and greatly reduces the overhead of the dVifUnpack function
Here a vtune summary of this branch (done on SotC init)
dVifUnpack<1> was 14.5% of effective VU thread time
dVifUnpack<1> is now 3.8% of effective VU thread time
I hope it will translate to better fps
Delete() deletes the menu item but keeps the sub menu. Remove() doesn't
delete the menu item.
Also use AppendSubMenu - using Append on a submenu is deprecated.
It allow to compare only 8B in the lookup so SSE could be replaced with general instruction
As a bonus, it allow to compute the hash key with a mov rather than modulo (which was an 'and')
Inline the execution part
Add a num parameter to dVifsetVUptr
Use a local variable for the nVifBlock instead of a global struct state
The goal is to ease future update of the nVifBlock struct
Previous implementation saved the both the chain pointer and the chain size
Rational: size is useful to add new element and to detect the end of the chain
Vif cache is rarely miss. So 'add' is barely called and the end of a chain is
barely reached.
New implementation will add a null cell at the end of the chain. As a
cell contains a x86 pointer, if is null you could conclude that you
reach the end of the chain.
The 'add' function will traverse the chain to get the current size. It is
a cold path besides the chain is often short (< 4).
The 'find' function only need to check the startPtr bytes to detect the end
of the loop.
Note: SizeChain was replaced with a std::array
Safety:
* check remaining space before compilation
* clear hash if recompiler is reset
Perf:
* don't research the hash after a miss
* reduce branching in Unpack/ExecuteUnpack
Note: a potential speed optimization for dVifsetVUptr
Precompute the length and store in the cache. However it need 2B on the
nVifBlock struct. Maybe we can compact cl/wl. Or merge aligned with upkType
(if some bits are useless)
I misses some early return in my first tentative. Now VTune shows me
properly the time in VU recompiler.
Note: It seem some block overlap (likely due to the branching mess). But it is still way better than no data
GS_Packet constructor calls memset which is quite slow and useless as data is overwritten
Vtune overhead of Gif_Unit::Execute goes from 5.8% to 3.0% (EE thread)
Several PSX titles lack a backslash in the elf path, which made the disc
serial contain 'cdrom:', this caused savestate issues in those ganes.
Solves: https://github.com/PCSX2/pcsx2/issues/1692
Previously the boot menu items always displayed "Boot CDVD" regardless of the current source medium, this behavior has been fixed to properly adjust the text when source medium is changed. Now it'll display Boot CDVD/ISO/BIOS with respect to the current source medium.
v2: Some instances of "Iso" have been changed to "ISO" for consistency.
v3: Remove the unnecessary "Reboot" on menu item labels, saves some string translations.
v4: Add a new shortcut key for the primary boot menu item.
There is already a dedicated bind event to handle the gray out of the menu item, so let's just gray it out initially and let the bind event handler do it's thing.
The previous behavior would only gray out the menu item when all the plugins are in a non-active state which didn't seem ideal as the plugins were shutdown only when closing PCSX2 (or) switching plugins.
The purpose is to stop vtune profiling in a predictable way. It allows
to compare multiple runs.
ERET is called every syscall/interrupt return so it is proportional to
the EE program execution.
Cons:
* requires ~180MB of physical memory (virtual memory is the same so it
doesn't impact the 4GB limit)
From steam: 98.81% got at least 2GB of RAM. 83.62% got at least 4GB of RAM.
That being said, it might not really increase RAM requirements as OS could put the
new allocation in the swap.
Pro:
* code is much easier
* remove at least half of the signal listener
* last but not least, it is way easier for profiler/debugger
Previously, the code used a lot of "bitwise AND" to get specific bitfields of the interrupt mask control register, which makes the code look a bit hacky, also it's even more hard for normal people to calculate the value when hexadecimal values are used for the bitwise operations where the register is totally binary. Instead of dealing with all those mess, let's just get the bitfield values from the already implemented nice union of the IMR register. FWIW it also makes the code more readable.
Doesn't fully work yet
* Unknown stack frame
* Outside any known module
Potential root cause:
* Nvidia driver
* VU code as ebp is required for emulation so likely no frame
A FreeBSD 10.3 user (meowthink) reported to me that games were not
working properly on their system. After some investigation, it was
discovered that aio was buggy on their setup. There's also bug reports
for other applications that involve aio too.
Workaround the issue by using a normal read and disabling the use of aio
on FreeBSD 10.3 and earlier. It'll remain enabled on FreeBSD 11.0 in the
hope that the aio issue has since been fixed.
Static size is better aligned but it consumes too much space on the GUI
Besides, if a string (translation) is bigger that the static size it will be cut off.
VU/EE min sized are the same to keep a proper alignment
../pcsx2/gui/MessageBoxes.cpp:62:1: warning: base class ‘class pxActionEvent’ should be explicitly initialized in the copy constructor [-Wextra]
BaseMessageBoxEvent::BaseMessageBoxEvent( const BaseMessageBoxEvent& event )
Previously the video mode was initialized using the info fetched from SetGsCrt Syscall though unfortunately, it doesn't seem to work with PSX games as they don't use the SetGsCrt syscall. At such cases, we get the video mode info from the SMODE2 colorburst to properly maintain the timing as per the video mode. Might help some cases on PSX games where PAL/NTSC video mode was improperly set to a wrong limit instead of it's actual vertical frequency limit.
Newer wxWidgets versions call SetThreadUILanguage() on Windows, which
somehow causes our recursive mutex implementation to take ~1ms when
recursive locking occurs. So when a game boots up and the debugger is
loading the symbol map which can easily have 15000+ symbols, the GUI
locks up for 15+ seconds.
Switching to C++11 recursive mutexes seems to work around the issue. It
should be safe here since there's no direct interaction with the GUI.
Note: There is still a 1-2 second GUI lockup when booting a game on
Windows (it has existed for quite a while, and is more noticeable with
fast boot). It doesn't seem to affect Linux (or maybe it's harder to
detect).
- include a small game exe detection so pcsx2 doesn't believe it's running the bios
- cdrom.cpp has a hack to account for pcsx2's wrong iop dma timing when mixing mdec and cdrom dmas. This should be properly fixed for the benefit of all ps2 / psx software!
- dmasif2 is disabled since pgpu already handles it