This fourth part of my series of patches to get rid of unsafe uses of
GetPointer takes care of the "easy" cases in VideoCommon. Three uses of
GetPointer now remain in Dolphin: VertexLoaderManager, TextureInfo, and
the software renderer's TextureSampler.
For some reason Linux is surprisingly slow at closing file descriptors
of event devices. This commit improves GUI startup times on my computer
by about 1.5 seconds.
When writing the software FMA code, I didn't realize that we can't
overwrite d if d is the same register as one of the inputs and
HandleNaNs is going to be called. This fixes that.
On all platforms, this would result in out of bounds accesses when getting the component sizes (which uses stuff from VertexLoader_Position.h/VertexLoader_TextCoord.h/VertexLoader_Normal.h). On platforms other than x64 and ARM64, this would also be out of bounds accesses when getting function pointers for the non-JIT vertex loader (in VertexLoader_Position.cpp etc.). Usually both of these would get data from other entries in the same multi-dimensional array, but the last few entries would be truly out of bounds. This does mean that an out of bounds function pointer can be called on platforms that don't have a JIT vertex loader, but it is limited to invalid component formats with values 5/6/7 due to the size of the bitfield the formats come from, so it seems unlikely that this could be exploited in practice.
This issue affects a few games; Def Jam: Fight for New York (https://bugs.dolphin-emu.org/issues/12719) and Fifa Street are known to be affected.
I have not done any hardware testing for this PR specifically, though I *think* I previously determined that at least a value of 5 behaves the same as float (4). That's what I implemented in any case. I did previously determine that both Def Jam: Fight for New York and Fifa Street use an invalid normal format, but don't actually have lighting enabled when that normal vector is used, so it doesn't change rendering in practice.
The color component format also has two invalid values, but VertexLoader_Color.h/.cpp do check for those invalid ones and return a default value instead of doing an out of bounds access.
IOS::HLE::IOCtlVRequest::Dump sometimes tries to call GetPointerForRange
with an address of 0 and a size of 0. Address 0 is valid, but we were
mistakenly also trying to check that address 3FFFFFFF is valid, which it
isn't.
Fixes https://bugs.dolphin-emu.org/issues/13514.
Move CheatManager's child widgets into scroll areas to allow making the
window smaller than the default.
In CheatSearchWidget, enable word wrapping for the label describing the
address space and search type to help it fit better inside a narrower
window.
Typically when someone uses GetPointer, it's because they want to read
from a range of memory. GetPointer is unsafe to use for this. While it
does check that the passed-in address is valid, it doesn't know the size
of the range that will be accessed, so it can't check that the end
address is valid. The safer alternative GetPointerForRange should be
used instead.
Note that there is still the problem of many callers not checking for
nullptr.
This is part 2 of a series of changes removing the use of GetPointer
throughout the code base. After this, VideoCommon is the one major part
of Dolphin that remains.
Typically when someone uses GetPointer, it's because they want to read
from a range of memory. GetPointer is unsafe to use for this. While it
does check that the passed-in address is valid, it doesn't know the size
of the range that will be accessed, so it can't check that the end
address is valid. The safer alternative GetPointerForRange should be
used instead.
Note that there is still the problem of many callers not checking for
nullptr.
This is the first part of a series of changes that will remove the usage
of GetPointer in different parts of the code base. This commit gets rid
of every GetPointer call from our IOS code except for a particularly
tricky one in BluetoothEmuDevice.
RCOpArg::ExtractWithByteOffset is only used in one place: a special case
of rlwinmx. ExtractWithByteOffset first stores the value of the
specified register into m_ppc_state (unless it's already there), and
then returns an offset into m_ppc_state. Our use of this function has
two undesirable properties (except in the trivial case `offset == 0`):
1. ExtractWithByteOffset calls StoreFromRegister without going through
any of the usual functions. This violated an assumption I made when
working on my constant propagation PR and led to a hard-to-find bug.
2. If the specified register is in a host register and is dirty,
ExtractWithByteOffset will store its value to m_ppc_state even when
it's unnecessary. In particular, this always happens when rlwinmx
uses the same register as input and output, since rlwinmx always
allocates a host register for the output and marks it as dirty.
Since ExtractWithByteOffset is only used in one place, I figure we might
as well inline it there. This commit does that, and also alters
rlwinmx's logic so the special case code is only triggered when the
input is already in m_ppc_state.
Input in `m_ppc_state`, before (11 bytes):
mov esi, dword ptr [rbp-104]
mov dword ptr [rbp-104], esi
movzx esi, byte ptr [rbp-101]
Input in `m_ppc_state`, after (5 bytes):
movzx esi, byte ptr [rbp-101]
Input in host register, before (8 bytes):
mov dword ptr [rbp-104], esi
movzx esi, byte ptr [rbp-101]
Input in host register, after (3 bytes):
shr edi, 0x18
There were three distinct mechanisms for signaling symbol changes in DolphinQt: `Host::NotifyMapLoaded`, `MenuBar::NotifySymbolsUpdated`, and `CodeViewWidget::SymbolsChanged`. The behavior of these signals has been consolidated into the new `Host::PPCSymbolsUpdated` signal, which can be emitted from anywhere in DolphinQt to properly update symbols everywhere in DolphinQt.
Not sure if the behavior I'm implementing here is what real hardware
does, but since this is a buffer overflow, I'd like to get it fixed
quickly. Hardware verification can happen later.
https://bugs.dolphin-emu.org/issues/13506
Having to look up macros that are defined elsewhere makes the code
harder to reason about. The macros don't remove enough repetition to
justify their existence in my opinion.
With this, I intend to make it clearer that Auto, Force 4:3, Force 16:9
and Custom are really the same thing, just with the aspect ratio of the
simulated TV being selected in a different way. I also extended the
introduction in a way I feel will clarify things but which you are
welcome to bikeshed :)
I was thinking of this during the review of 41b19e262f, but wanted to
put it in a separate PR as to avoid blocking it on bikeshedding.
I'm a bit unsure what to do about the word "analog" in "analog TV". I
felt that repeating it for each of these options would be too
repetitive. I suppose there's a reason why we used the word originally,
but digital TVs do give you basically the same aspect ratio for GC/Wii
games as analog TVs. (Of course, whether it's 4:3-like or 16:9-like
depends on what aspect ratio you set in the TV's settings, but that's
the case for widescreen CRTs too.)
When the divisor is a constant value, we can emit more efficient code.
For powers of two, we can use bit shifts. For other values, we can
instead use a multiplication by magic constant method.
- Example 1 - Division by 16 (power of two)
Before:
mov w24, #0x10 ; =16
udiv w27, w25, w24
After:
lsr w27, w25, #4
- Example 2 - Division by 10 (fast)
Before:
mov w25, #0xa ; =10
udiv w27, w26, w25
After:
mov w27, #0xcccd ; =52429
movk w27, #0xcccc, lsl #16
umull x27, w26, w27
lsr x27, x27, #35
- Example 3 - Division by 127 (slow)
Before:
mov w26, #0x7f ; =127
udiv w27, w27, w26
After:
mov w26, #0x408 ; =1032
movk w26, #0x8102, lsl #16
umaddl x27, w27, w26, x26
lsr x27, x27, #38