2016-07-10 05:28:26 +00:00
|
|
|
#pragma once
|
|
|
|
|
|
|
|
namespace Emulator {
|
|
|
|
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
struct Thread;
|
|
|
|
|
2016-07-10 05:28:26 +00:00
|
|
|
struct Scheduler {
|
|
|
|
enum class Mode : uint {
|
|
|
|
Run,
|
|
|
|
SynchronizeMaster,
|
|
|
|
SynchronizeSlave,
|
|
|
|
};
|
|
|
|
|
|
|
|
enum class Event : uint {
|
|
|
|
Step,
|
|
|
|
Frame,
|
|
|
|
Synchronize,
|
|
|
|
};
|
|
|
|
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
auto active(Thread& thread) const -> bool {
|
|
|
|
return co_active() == thread.handle();
|
|
|
|
}
|
|
|
|
|
|
|
|
auto reset() -> void {
|
|
|
|
threads.reset();
|
|
|
|
}
|
|
|
|
|
|
|
|
auto primary(Thread& thread) -> void {
|
|
|
|
master = _resume = thread.handle();
|
2016-07-10 05:28:26 +00:00
|
|
|
host = co_active();
|
|
|
|
}
|
|
|
|
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
auto append(Thread& thread) -> bool {
|
|
|
|
if(threads.find(&thread)) return false;
|
|
|
|
return threads.append(&thread), true;
|
|
|
|
}
|
|
|
|
|
|
|
|
auto remove(Thread& thread) -> bool {
|
|
|
|
if(auto offset = threads.find(&thread)) return threads.remove(*offset), true;
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2016-07-10 05:28:26 +00:00
|
|
|
auto enter(Mode mode_ = Mode::Run) -> Event {
|
|
|
|
mode = mode_;
|
|
|
|
host = co_active();
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
co_switch(_resume);
|
2016-07-10 05:28:26 +00:00
|
|
|
return event;
|
|
|
|
}
|
|
|
|
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
inline auto resume(Thread& thread) -> void {
|
|
|
|
if(mode != Mode::SynchronizeSlave) co_switch(thread.handle());
|
|
|
|
}
|
|
|
|
|
2016-07-10 05:28:26 +00:00
|
|
|
auto exit(Event event_) -> void {
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
uint64 minimum = ~0ull >> 1;
|
|
|
|
for(auto thread : threads) {
|
|
|
|
if(thread->_clock < minimum) minimum = thread->_clock;
|
|
|
|
}
|
|
|
|
for(auto thread : threads) {
|
|
|
|
thread->_clock -= minimum;
|
|
|
|
}
|
|
|
|
|
2016-07-10 05:28:26 +00:00
|
|
|
event = event_;
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
_resume = co_active();
|
2016-07-10 05:28:26 +00:00
|
|
|
co_switch(host);
|
|
|
|
}
|
|
|
|
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
auto synchronize(Thread& thread) -> void {
|
|
|
|
if(thread.handle() == master) {
|
2016-07-10 05:28:26 +00:00
|
|
|
while(enter(Mode::SynchronizeMaster) != Event::Synchronize);
|
|
|
|
} else {
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
_resume = thread.handle();
|
2016-07-10 05:28:26 +00:00
|
|
|
while(enter(Mode::SynchronizeSlave) != Event::Synchronize);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
auto synchronize() -> void {
|
|
|
|
if(co_active() == master) {
|
|
|
|
if(mode == Mode::SynchronizeMaster) return exit(Event::Synchronize);
|
|
|
|
} else {
|
|
|
|
if(mode == Mode::SynchronizeSlave) return exit(Event::Synchronize);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
private:
|
|
|
|
cothread_t host = nullptr; //program thread (used to exit scheduler)
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
cothread_t _resume = nullptr; //resume thread (used to enter scheduler)
|
2016-07-10 05:28:26 +00:00
|
|
|
cothread_t master = nullptr; //primary thread (used to synchronize components)
|
|
|
|
Mode mode = Mode::Run;
|
|
|
|
Event event = Event::Step;
|
Update to v100r14 release.
byuu says:
(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)
I completely replaced the time management system in higan and overhauled
the scheduler.
Before, processor threads would have "int64 clock"; and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency from clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: eg the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.
The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second.) However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times less out of spec than 1hz per 20
minutes.
The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:
auto SMP::step(uint clocks) -> void {
clock += clocks * (uint64)cpu.frequency;
dsp.clock -= clocks;
if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
}
To this:
auto SMP::step(uint clocks) -> void {
Thread::step(clocks);
synchronize(dsp);
synchronize(cpu);
}
As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU that had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs, now we can just synchronize the active
one to the CPU.
Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.
Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.
Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.
----
Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.
The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.
----
EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 03:56:12 +00:00
|
|
|
vector<Thread*> threads;
|
2016-07-10 05:28:26 +00:00
|
|
|
};
|
|
|
|
|
|
|
|
}
|