Let's talk a bit about this bug. 12nd oldest bug not fixed in Dolphin, it was a
lot of fun to debug and it kept me busy for a while :)
Shoutout to Nintendo for framework.map, without which this could have taken a
lot longer.
Basic debugging using apitrace shows that the heat effect is rendered in an
interesting way:
* An EFB copy texture is created, using the hardware scaler to divide the
texture resolution by two and that way create the blur effect.
* This texture is then warped using indirect texturing: a deformation map is
used to "move" the texture coordinates used to sample the framebuffer copy.
Pixel shader: http://pastie.org/private/25oe1pqn6s0h5yieks1jfw
Interestingly, when looking at apitrace, the deformation texture was only 4x4
pixels... weird. It also does not have any feature that you would expect from a
deformation map. Seeing how the heat effect glitches, this deformation texture
being wrong looks like a good candidate for the problem. Let's see how it's
loaded!
By NOPing random calls to GXSetTevIndirect, we find a call that when removed
breaks the effect completely. The parameters used for this call come from the
results of methods of JPAExTexShapeArc objects. 3 different objects go through
this code path, by breaking each one we can notice that the one "controlling"
the heat effect is the one at 0x81575b98.
Following the path of this object a bit more, we can see that it has a method
called "getIndTexId". When this is called, the returned texture ID is used to
index a map and get a JPATextureArc object stored at 0x81577bec.
Nice feature of JPATextureArc: they have a getName method. For this object, it
returns "AK_kagerouInd01". We can probably use that to see how this texture
should look like, by loading it "manually" from the Wind Waker DVD.
Unfortunately I don't know how to do that. Fortunately @Abahbob got me the
texture I wanted in less than 10min after I asked him on Twitter.
AK_kagerouInd01 is a 32x32 texture that really looks like a deformation map:
http://i.imgur.com/0TfZEVj.png . Fun fact: "kagerou" means "heat haze" in JP.
So apparently we're not using the right texture object when rendering! The
GXTexObj that maps to the JPATextureArc is at offset 0x81577bf0 and points to
data at 0x80ed0460, but we're loading texture data from 0x0039d860 instead.
I started to suspect the BP write that loads the texture parameters "did not
work" somehow. Logged that and yes: nothing gets loaded to texture stage 1! ...
but it turns out this is normal, the deformation map is loaded to texture stage
5 (hardcoded in the DOL). Wait, why is the TextureCache trying to load from
texture stage 1 then?!
Because someone sucked at hex.
Fixes issue 2338.
At the moment, custom textures with:
- invalid mipmap size
- invalid aspect ratio
- non-fractional scaling factors
are allowed. But they can't be loaded fine by the backend, so generate a warning if someone trys to load them.
fixes issue 6898
OpenGL defaults are GL_REPEAT, which is even more unlikely than GL_CLAMP_TO_EDGE.
As I can't test the behavoir of the real hardware, I changed it to how it works before,
but I guess just clip the texture makes more sense.
Some settings that bootmanger reads from game ini can be changed while a game is running, so we don't have to revert these back to what they were when starting the game, unless they were actually changed by the game ini.
Fix signed/unsigned warnings that pauldacheez pointed out.
It didn't make sense. The math was nonsensical. Calibration data was somehow applied twice. I don't even.
This reverts commit 4dad640d5f.
Fixed issue 6702.
SSE do support non-vector instructions, but they _all_ overwrite the dest register
if the src location isn't a register. (wtf?)
So we have to load the src into a temporay register :(
Parsing Gecko codes (in any manner) is much like parsing HTML with regex
- that w̷a̶y̸ l̵i̷e̴s̵ m̴̲a̵͈d̵̝n̵̙ę̵͎̞̼̙̼͔̞͖͎̝s̵̨̬̱͍͓͉̠̯̤͙̝s̷͍̲̲̭̼͍͎͖̤̭̘. Luckily, with the embedded codehandler.bin,
the monstrosity may remain at only one implementation. Anyway, removing
the inserted_asm_codes thing probably speeds up the interpreter a bit.
MemArena mmaps the emulated memory from a file in order to get the same
mapping at multiple addresses. A file which, formerly, was located at a
static filename: it was unlinked after creation, but the open did not
use O_EXCL, so if two instances started up on the same system at just
the right time, they would get the same memory. Naturally, this caused
extremely mysterious crashes, but only in Netplay, where the game is
automatically started when the client receives a broadcast from the
server, so races are actually quite likely.
And switch to shm_open, because it fits the bill better and avoids any
issues with using /tmp.
This flag wasn't cleared at all, so we set our constants dirty every time...
This could fix some performance regressions because of revision 6798a4763e
Real xfb didn't provide any read_stride, so there is a division by zero.
This commit calculates the correct read_stride for real_xfb, so there is also no hack for texture vs xfb needed.
This removes the redundant code and also implements this feature for OSX and Wayland.
But so it's dropped for non-wx builds...
imo DolphinWX still isn't the best place for this, but now it's in the same file as all other hotkeys. Maybe they'll be moved to InputCommon sometimes at once ...
This is done with a pixel buffer object. We still have to stall the GPU, but
we only do it once per efb2ram call.
As the cpu can't access the vram, it has to queue a memcpy for the gpu and
wait for the gpu to finish this copy. We did this for every cache line which
is just stupid. Now we copy the complete texture into a pbo and readback this
at once. So we don't have to wait for lots of round-trip-times.
Also use attributeless rendering. But we need the src rect, so set it by uniform.
If there is a slowdown here (I doubt as the driver likely has a fast path to update uniforms)
then we should check if this rect changes and only then update the uniform.
We use attributeless rendering, so officially we have to bind _any_ VAO.
As the state of this VAO doesn't matter, we don't have to switch it.
Also fix an AMD issue as they don't like to render from an empty VAO.
We neither scale nor render from subimages, so we by using gl_Position, we don't have to generate _any_ vertices for this converting.
Also remove the glTexSubImage optimization as every driver does it when needed. But there are some workflows (eg on APU) where it's better to realloc this texture instead of a second memcpy or stall.
This fixes Real XFB Jaggies in OpenGL on games which use yscaling, such
as most PAL games.
This fixes the last of the "Real XFB Macroblocking" issues for opengl,
see issue #6503
Seems OpenGL ES 3 Requires you must have an lod argument, while Desktop
versions require you must not have a lod argument if you are using a
Sampler2DRect (which doesn't do Mipmapping).
The pal version of SSBB has a 640x568 xfb, which is larger than the efb.
Increase the size of the static textures and put in some runtime checks
to prevent buffer overruns.
Lots of x86 instructions are different on memory vs registers.
So to generate code, we often have to check if a ppc register is already bound to a x86 register.
Revision ddaf29e039 introduced a register
corruption bug (#6825). Since fmrx/MOVSD only modifies ps0 but we save
both ps0 and ps1 in one xmm register, not loading the previous value
when binding to a x64 register trashed ps1.
But hey, a good opportunity to shave off one more instruction ;)