mirror of https://github.com/bsnes-emu/bsnes.git
Update to bsnes v031r02? release.
New WIP. Please be sure to test a few games with this one to look for regressions.

I got tired of using bit packing for CPU / SMP register flags, because they do not mask the upper bits properly. In other words, (assume big endian) if you have:

    struct { uint8_t n:1, v:1, m:1, x:1, d:1, i:1, z:1, c:1; } p;

and you set p.m = 7; it will set p.v and p.n as well. It doesn't cast the type to bool. So I rewrote the old template struct trick, but bound it with a reference rather than relying upon union alignment. Looks something like this:

    template<int mask> struct CPUFlag {
      uint8 &data;
      inline operator bool() const { return data & mask; }
      inline CPUFlag& operator=(bool i) { data = (data & ~mask) | (-i & mask); return *this; }
      CPUFlag(uint8 &data_) : data(data_) {}
    };

    class CPURegFlags {
    public:
      uint8 data;
      CPUFlag<0x80> n;
      CPUFlag<0x40> v;
      ...
      CPURegFlags() : data(0), n(data), v(data), m(data), x(data), d(data), i(data), z(data), c(data) {}
    };

Surprisingly, benchmarks show this method is ~2x faster, but flags were never a bottleneck so it won't affect bsnes' speed.

Anyway, with this, I decided to get rid of the confusing and stupid !!() stuff all throughout the CPU and SMP opfn.cpp files. It's no longer needed since the template assignment takes only a boolean argument. Anything not zero becomes one with that. So code such as this:

    uint8 sSMP::op_adc(uint8 x, uint8 y) {
      int16 r = x + y + regs.p.c;
      regs.p.n = !!(r & 0x80);
      regs.p.v = !!(~(x ^ y) & (y ^ (uint8)r) & 0x80);
      regs.p.h = !!((x ^ y ^ (uint8)r) & 0x10);
      regs.p.z = ((uint8)r == 0);
      regs.p.c = (r > 0xff);
      return r;
    }

Now looks like this:

    uint8 sSMP::op_adc(uint8 x, uint8 y) {
      int r = x + y + regs.p.c;
      regs.p.n = r & 0x80;
      regs.p.v = ~(x ^ y) & (x ^ r) & 0x80;
      regs.p.h = (x ^ y ^ r) & 0x10;
      regs.p.z = (uint8)r == 0;
      regs.p.c = r > 0xff;
      return r;
    }

I also took the time to figure out how the hell the overflow stuff worked. Pretty neat stuff.
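For anyone who wants to poke at the reference-bound flag template outside of bsnes, here is a minimal standalone sketch. The SMPFlags and Regs structs, the free-function form of op_adc, and the bit positions picked for h, z and c are assumptions made for illustration; they are not taken from the actual bsnes headers.

```cpp
#include <cstdint>

typedef uint8_t uint8;  // stand-in for the bsnes uint8 type (assumption)

// One flag = one bit of a shared byte, reached through a reference.
template<int mask> struct CPUFlag {
  uint8 &data;
  operator bool() const { return data & mask; }
  // The bool parameter collapses any nonzero value to 1 before it is used,
  // so -i is either 0 or all ones, and only the masked bit can change.
  CPUFlag& operator=(bool i) { data = (data & ~mask) | (-i & mask); return *this; }
  CPUFlag(uint8 &data_) : data(data_) {}
};

// Hypothetical flag register holding only the flags op_adc touches.
struct SMPFlags {
  uint8 data;
  CPUFlag<0x80> n;
  CPUFlag<0x40> v;
  CPUFlag<0x08> h;  // bit position assumed for this sketch
  CPUFlag<0x02> z;  // bit position assumed for this sketch
  CPUFlag<0x01> c;  // bit position assumed for this sketch
  SMPFlags() : data(0), n(data), v(data), h(data), z(data), c(data) {}
};

struct Regs { SMPFlags p; } regs;  // hypothetical register file

// The rewritten add-with-carry from the message, as a free function.
uint8 op_adc(uint8 x, uint8 y) {
  int r = x + y + regs.p.c;
  regs.p.n = r & 0x80;
  regs.p.v = ~(x ^ y) & (x ^ r) & 0x80;
  regs.p.h = (x ^ y ^ r) & 0x10;
  regs.p.z = (uint8)r == 0;
  regs.p.c = r > 0xff;
  return r;
}
```

With all flags cleared, op_adc(0x7f, 0x01) returns 0x80 and leaves regs.p.data at 0xc8 under the layout assumed here (n, v and h set). Assigning regs.p.n = 7 on a cleared register gives 0x80, with no spill into neighbouring bits.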
Essentially, overflow is set when you add/subtract two positive or two negative numbers, and the result ends up with a different sign. Hence, the sign overflowed, so your negative number is now positive, or vice versa. A simple way to simulate it is:

    int result = (int8_t)x + (int8_t)y;
    bool overflow = (result < -128 || result > 127);

But there's no reason to perform signed math, since the result can't be used for anything else, not even any other flags, as the opcode math is always unsigned. So to implement it with this:

    int result = (uint8_t)x + (uint8_t)y;

We just verify that both signs in x and y are the same, and that their sign is different from the result to set overflow, e.g.:

    bool overflow = (x & 0x80) == (y & 0x80) && (x & 0x80) != (result & 0x80);

But that's kind of slow. We can test a single bit for equality and merge the &0x80's by using a XOR table:

    0^0=0, 0^1=1, 1^0=1, 1^1=0

The trick here is that if the two bits are equal, we get 0; if they are not equal, we get 1. So if we want to see if x&0x80 == y&0x80, we can do:

    !((x ^ y) & 0x80);

... or we can simply invert the XOR result so that 1 = equal, 0 = different, e.g.:

    ~(x ^ y) & 0x80;

The latter is nice because it keeps the bit positions intact. Whereas the former reduces to 1 or 0, the latter remains 0x80 or 0x00. This is good for chaining, as I'll demonstrate below. Do the same for the second test and we get:

    bool overflow = ~(x ^ y) & 0x80 && (x ^ result) & 0x80;

We complement the former because we want to verify they are the same; we don't for the latter because we want to verify that they have changed.

Now we can basically use one more trick to combine the two bit masks here. We want to return 1 when overflow is set, so we can look for a pattern that will only return 1 when both the first and second tests pass. An AND table works great here:

    0&0=0, 0&1=0, 1&0=0, 1&1=1

Only if both are true do we end up with 1.
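Since there are only 65,536 possible (x, y) pairs, the forms derived so far can be checked exhaustively against the signed reference. The sketch below does that for plain addition with no carry-in; the function names are made up for this example and do not appear anywhere in bsnes.

```cpp
#include <cstdint>

// Reference: do the add in signed 8-bit terms and check the range.
// (The uint8_t -> int8_t cast assumes two's complement wrapping.)
bool overflow_signed(uint8_t x, uint8_t y) {
  int result = (int8_t)x + (int8_t)y;
  return result < -128 || result > 127;
}

// Explicit sign comparison on the unsigned sum.
bool overflow_compare(uint8_t x, uint8_t y) {
  int result = x + y;
  return (x & 0x80) == (y & 0x80) && (x & 0x80) != (result & 0x80);
}

// XOR form: two separately masked tests joined with &&.
bool overflow_xor(uint8_t x, uint8_t y) {
  int result = x + y;
  return (~(x ^ y) & 0x80) && ((x ^ result) & 0x80);
}

// Counts the input pairs on which any form disagrees with the reference.
int count_mismatches() {
  int mismatches = 0;
  for (int x = 0; x < 256; x++) {
    for (int y = 0; y < 256; y++) {
      bool ref = overflow_signed(x, y);
      if (overflow_compare(x, y) != ref || overflow_xor(x, y) != ref) mismatches++;
    }
  }
  return mismatches;
}
```

count_mismatches() returns 0, so the unsigned forms agree with the signed range check on every input pair.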
So this means we can AND the two results, and then mask the only bit we care about once to get the result, e.g.:

    bool overflow = ~(x ^ y) & (x ^ result) & 0x80;

And there we go, that's where that bizarre math trick comes from.

While doing this, I figured out something that had bugged me in the past. I used to think that for some reason, the S-SMP add overflow test required x^y & y^r, whereas S-CPU add overflow used x^y & x^r. Probably because I read the algorithm from Snes9x's sources or something. But that was flawed: since addition is commutative, it doesn't matter whether the latter is x^result or y^result. Only in subtraction does the order matter, where you must always use x^result to test the initial value every time.

Subtraction switches things up a little. It sets overflow only when the signs of x and y are _different_, and when x and the result are also different, e.g.:

    bool overflow = (x ^ y) & (x ^ result) & 0x80;

Fun stuff, huh? So I was wanting this tested thoroughly, just in case there was a typo or something when updating the opfn.cpp files.

---

That said, I also polished up the UI a bit. Moved "disabled" to the bottom of the speed regulation list, and added key / joypad bindings for "exit emulator", "speed regulation increase / decrease" and "frameskip increase / decrease". I know these key bindings do not update the menubar radiobox positions yet. I'll get that taken care of shortly.

[No archive available]
parent b895f29bed
commit 340d86845a