mirror of https://github.com/bsnes-emu/bsnes.git
Update to bsnes v038r05? release.
New WIP, this one's fairly big as nightlies go.

First, I moved the priority queue to a generic implementation so I can re-use it elsewhere in the future. Took a ~1% speed hit or so by using functors for the callback and using the signed math trick to avoid the need for a normalize() function. Sadly, it gets up to 3% slower if the priorityqueue class code isn't placed right next to the CPU core.

Second, while I failed miserably at using the queues for IRQ / NMI testing, I did come up with a neat compromise. NMI is only tested once per scanline, IRQs only have PPU dot precision (every 4 clocks), the hold time for both is four clock cycles, and scanlines for both NTSC and PAL, even the short colorburst scanline, are always evenly divisible by four. So testing every 2 clock cycles was kind of pointless, as every other test would always be false. Since the delay between the PPU counter and the CPU trigger is 2 clocks for NMI and 10 for IRQ, they even align with each other at an offset of 2. Hence, I can call poll_interrupts() half as often by using if(ppu.hcounter() & 2). I reverse that for the Super Scope / Justifier dot testing and cut their overhead in half as well.

That gives us a nice ~10-15% speedup. It's nowhere near the idealistic ~30-40% for range-tested IRQs, because that approach only actually tests once per scanline (~1364 cycles), whereas this just cuts ~682 tests down to ~341 tests. Still, it's pretty close to half as good while still being super clean and easy. It greatly diminishes the value of a range-based IRQ tester, as that will only offer a ~15-20% speedup now at best. Getting PGO working again is the new lowest-hanging fruit.

I also eked out a tiny bit more speed by adding some previously missed "else" statements in the irq_valid testing part.

With the newfound speed, I gave a tiny bit up (1-2%) to simplify and improve some old edge cases. It's known that IRQs won't trigger on the very last dot of each field. That's due to the way the V and H counters are misaligned, which we can't easily emulate.
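As a rough illustration of the generic priority queue idea from the first item: the sketch below assumes the "signed math trick" means comparing wrapping unsigned timestamps by the sign of their difference, so the counters never need to be normalized. The names TimedQueue and time_reached are hypothetical and not taken from the bsnes source, and the layout is only one possible arrangement, not the actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Wrap-safe "a is at or after b". Unsigned subtraction wraps modulo 2^32, so
// testing the top (sign) bit of the difference orders two timestamps correctly
// as long as they are less than 2^31 ticks apart (an assumption of this sketch).
inline bool time_reached(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>(a - b) < 0x80000000u;
}

// Hypothetical generic queue: Event is any copyable payload, Callback is any
// functor taking an Event. Neither name comes from the bsnes source.
template <typename Event, typename Callback>
class TimedQueue {
  struct Entry {
    std::uint32_t when;  // absolute deadline, allowed to wrap
    Event event;
  };

  // Heap comparator: true when a fires strictly later than b, which keeps the
  // earliest deadline at the front of the heap.
  static bool later(const Entry& a, const Entry& b) {
    return a.when != b.when && time_reached(a.when, b.when);
  }

  Callback callback_;
  std::uint32_t now_;
  std::vector<Entry> heap_;

public:
  explicit TimedQueue(Callback callback, std::uint32_t start = 0)
      : callback_(std::move(callback)), now_(start) {}

  // Schedule event to fire delay ticks from now.
  void enqueue(std::uint32_t delay, Event event) {
    heap_.push_back({static_cast<std::uint32_t>(now_ + delay), std::move(event)});
    std::push_heap(heap_.begin(), heap_.end(), later);
  }

  // Advance time and fire every event whose deadline has been reached.
  void tick(std::uint32_t ticks) {
    now_ += ticks;
    while (!heap_.empty() && time_reached(now_, heap_.front().when)) {
      std::pop_heap(heap_.begin(), heap_.end(), later);
      Event event = std::move(heap_.back().event);
      heap_.pop_back();
      callback_(event);  // popped first, so the callback may enqueue again
    }
  }
};
```

Because the current time is allowed to wrap, nothing ever has to rebase the stored deadlines, which is presumably what makes a normalize() pass unnecessary.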
So before, I had a bunch of cruft to support that last-dot behavior: update_interrupts() was called at the start of each scanline, which would call irq_valid() to run a bunch of tests to make sure the latch positions would actually work on hardware. Writes to $4207-$420a would also call the update_interrupts() proc. I killed all that, and now compute the HTIME position inline in poll_interrupts(), and perform the last dot check there. Since testing is ten clocks behind anyway, we need only check whether VTIME > 0 and ppu.vcounter(6) == 0 (the V counter as of six clocks ago) to know that the IRQ was set for the last dot of any given field.

This gives us two nice perks for free: one, no more need to hard-code scanlines/frame inside the CPU core; and two, the old version was missing an edge case in interlace mode where odd fields would allow an IRQ on the last dot, which was simply because my old irq_valid() test didn't have a third condition for that.

All that said, I'm getting ~157.5fps instead of ~137.5fps now in Zelda 3.

Third, I removed grayscale/sepia/invert from the video settings panel, and stuck them in advanced. Used the new space to add checkboxes for NTSC merge fields and the start in fullscreen thing.

Reference:

//called once every four clock cycles;
//as NMI steps by scanlines (divisible by 4) and IRQ by PPU 4-cycle dots.
//
//ppu.(vh)counter(n) returns the value of said counters n-clocks before current time;
//it is used to emulate hardware communication delay between opcode and interrupt units.
alwaysinline void sCPU::poll_interrupts() {
  //NMI hold
  if(status.nmi_hold) {
    status.nmi_hold = false;
    if(status.nmi_enabled) status.nmi_transition = true;
  }

  //NMI test
  bool nmi_valid = (ppu.vcounter(2) >= (!ppu.overscan() ? 225 : 240));
  if(!status.nmi_valid && nmi_valid) {
    //0->1 edge sensitive transition
    status.nmi_line = true;
    status.nmi_hold = true;  //hold /NMI for four cycles
  } else if(status.nmi_valid && !nmi_valid) {
    //1->0 edge sensitive transition
    status.nmi_line = false;
  }
  status.nmi_valid = nmi_valid;

  //IRQ hold
  status.irq_hold = false;
  if(status.irq_line) {
    if(status.virq_enabled || status.hirq_enabled) status.irq_transition = true;
  }

  //IRQ test (unrolling the duplicate Nirq_enabled tests causes speed hit)
  bool irq_valid = (status.virq_enabled || status.hirq_enabled);
  if(irq_valid) {
    if((status.virq_enabled && ppu.vcounter(10) != (status.virq_pos))
    || (status.hirq_enabled && ppu.hcounter(10) != (status.hirq_pos + 1) * 4)
    || (status.virq_pos && ppu.vcounter(6) == 0)  //IRQs cannot trigger on last dot of field
    ) irq_valid = false;
  }
  if(!status.irq_valid && irq_valid) {
    //0->1 edge sensitive transition
    status.irq_line = true;
    status.irq_hold = true;  //hold /IRQ for four cycles
  }
  status.irq_valid = irq_valid;
}
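The claim that polling only when (ppu.hcounter() & 2) is set loses nothing can be checked with a small brute-force sketch. It is a standalone model, not bsnes code: the helper names are hypothetical, and it assumes that counter events land on 4-clock boundaries and are seen 2 clocks late for NMI and 10 clocks late for IRQ, as described above. The 1364-clock line length comes from the text; the 1360 (short NTSC line) and 1368 (long PAL line) lengths used when trying it out are assumed values.

```cpp
// True when a poll at hcounter could observe an event that happened on a
// 4-clock boundary and reaches the CPU `delay` clocks later. The counter is
// wrapped across the scanline edge, which is why line_clocks must be a
// multiple of 4 for the phase to stay aligned from line to line.
inline bool event_visible(unsigned hcounter, unsigned delay, unsigned line_clocks) {
  unsigned delayed = (hcounter + line_clocks - delay) % line_clocks;
  return delayed % 4 == 0;
}

// The gate from the text: of the polls made every 2 clocks, keep every other one.
inline bool poll_gate(unsigned hcounter) {
  return (hcounter & 2) != 0;
}

// Counts the 2-clock steps on one scanline where an NMI (delay 2) or IRQ
// (delay 10) event could be seen but the gate would skip the poll.
// Zero means the gate loses nothing.
inline unsigned missed_polls(unsigned line_clocks) {
  unsigned missed = 0;
  for (unsigned h = 0; h < line_clocks; h += 2) {
    bool visible = event_visible(h, 2, line_clocks) || event_visible(h, 10, line_clocks);
    if (visible && !poll_gate(h)) ++missed;
  }
  return missed;
}

// Counts the polls the gate lets through on one scanline.
inline unsigned gated_polls(unsigned line_clocks) {
  unsigned polls = 0;
  for (unsigned h = 0; h < line_clocks; h += 2) {
    if (poll_gate(h)) ++polls;
  }
  return polls;
}
```

Under these assumptions, missed_polls() comes out to zero for all three line lengths, and gated_polls(1364) gives 341, matching the ~682 to ~341 reduction mentioned earlier.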
parent 155b4fbfcd
commit 3908890072