Update to bsnes v033r05? release.

New WIP.

After some more hardware testing, it seems my theory from before was
correct. See the HDMA thread for more info if you care.

With those changes plus a few others, I'm now able to get everything
in my "known troublesome" games list to work properly and with no
flickering:
- Breath of Fire 2 (G)
- Earthworm Jim 2 (U+E)
- Energy Breaker
- Jumbo Osaki no Hole in One
- Mecarobot Golf
- Secret of Mana
- Street Racer
- Super Mario Kart

I still can't get Street Racer to flicker, maybe you guys can?
Hopefully not, such a hard-to-trigger bug will be even harder to
debug.

Image
(ignore the framerate, from a pause/resume screen capture.)

And fucking _hell_ that game is hard.

Note that to get BoF2 (G) to work, I had to modify S-SMP cycle timing
from 32040hz*768 to 32041hz*768. It seems the game is very sensitive
to S-CPU <> S-SMP timing, and the improved HDMA timing was just
unlucky enough to just _barely_ miss the handshake. This was further
compounded by there being no input before the point in question to
vary timing.

It's not really a problem with the game itself -- d4s really pushed
the limits of these two chips to pull off that impressive intro. It
was more that I was hitting an extremely tiny window of time that
caused a deadlock.

This timing change only affects S-CPU <> S-SMP communications (eg
handshakes and such), and not timing inside each individual processor.
Recall that both processors in both regions (NTSC and PAL) have
slightly different timings, and the exact timings vary even on real
hardware, as the crystal clocks used are not perfect.

The NTSC S-SMP has been observed at ~32040hz on an oscilloscope by the
guy at alpha-ii.com, which is faster than the stock speed of ~32000hz.
But we still use stock speeds for the S-CPU because that's all we
have. Changing the S-CPU speed a bit would've fixed this as well.

So yeah, the fix is a bit of a kludge, but it's the best I can do when
the problem is in communication between the two chips.

Keep in mind that the S-SMP clock rates are cached in the config file.
You'll either need to delete it, or reset the values to the default in
the advanced panel. Otherwise the game will hang on first run.

Also, I tightened DMA transfer restrictions even more. A-bus accesses
to $4200-421f and $4016-4017 are now blocked. And I also block these
during HDMA line counter / indirect address fetches (as observed on
hardware.) Further, I was previously allowing invalid B->A transfers
to still write the the MMIO reg specified in A, but ignoring the B-bus
read. This seemed wrong: not being able to access the reg should mean
not being able to access it period, so I swapped that around.
Shouldn't affect any known games, but mentioning it just in case.

> Perfect timing matching isn't needed, the games are broken if they
> can't take a normal sized delay for this.


Mortal Kombat II breaks if you're exactly 6 cycles off from expected
timing (but works if you're more than six cycles off.) Jumbo Osaki was
failing by 20 cycles. Wild Guns fails if off by two cycles. A couple
other games were the same. There are roughly _21 million cycles_ in a
second.

Death Brade and some European racing game break if _uninitialized RAM_
doesn't return the values they like.

Uniracers is quite simply _beyond_ broken.

I wish I could get away with just saying the games themselves were
broken (and they are), but when it runs at least 99% of the time on
hardware, you can't use that as an excuse. Everyone will still call it
an emulation bug :(

> Err, not really. Fixed delay for all operations is as dumb as no
> delay for all operations.


I typically like the idea of emulating as much as we can ("building
blocks" and such), if that means guessing approximate delays, so much
the better. But for the DSP-1, adding any delays is even worse in my
opinion. Why?

First, the delay lengths will no doubt vary depending upon how complex
the transfer is. Second, emulating the delays would force us to
implement the DSP-1 as the dedicated processor that it is: thusly, its
overhead would soar from barely noticeable to nearly as intense as
SuperFX / SA-1 emulation. Third, it may be possible to read partially
computed results before the operations finish. We can't even figure
out the partial computations of mere _unsigned multiplication and
division_ in the S-CPU core, so how the hell would we ever plan to
figure out attitude / altitude calculations?

The only feasible way we're going to get this right is to dump the
program ROM and then emulate the instruction set. Even decapping the
DSP-1 has been no help for that, and even if by some miracle we got
the ROM, we'd have to figure out the instruction set and timing with
no documentation. And all of this to improve emulation of a couple of
lackluster action games. Good luck finding someone willing to do all
that for free, and just to end up getting ~90% of people bitching that
suddenly DSP-1 emulation is as demanding as SFX emulation, yet
provides no visible improvement over existing emulation. And it even
requires another DSP1program.rom file that they didn't need before!

Thus, it's really not worth the effort if our entire model of
emulating the chip is busted in such a manner that we couldn't improve
it more even if we wanted to anyway.

[No archive available]
This commit is contained in:
byuu 2008-08-03 12:08:00 +00:00
parent 0cf16ce784
commit 53e913e225

Diff Content Not Available