2018-08-21 03:17:12 +00:00
#pragma once
//burrows-wheeler transform
#include <nall/suffix-array.hpp>
2019-01-16 00:46:42 +00:00
namespace nall::Encode {
Update to v106r59 release.
byuu says:
Changelog:
- fixed bug in Emulator::Game::Memory::operator bool()
- nall: renamed view<string> back to `string_view`
- nall:: implemented `array_view`
- Game Boy: split cartridge-specific input mappings (rumble,
accelerometer) to their own separate ports
- Game Boy: fixed MBC7 accelerometer x-axis
- icarus: Game Boy, Super Famicom, Mega Drive cores output internal
header game titles to heuristics manifests
- higan, icarus, hiro/gtk: improve viewport geometry configuration;
fixed higan crashing bug with XShm driver
- higan: connect Video::poll(),update() functionality
- hiro, ruby: several compilation / bugfixes, should get the macOS
port compiling again, hopefully [Sintendo]
- ruby/video/xshm: fix crashing bug on window resize
- a bit hacky; it's throwing BadAccess Xlib warnings, but they're
not fatal, so I am catching and ignoring them
- bsnes: removed Application::Windows::onModalChange hook that's no
longer needed [Screwtape]
2018-08-26 06:49:54 +00:00
/*
  A standard suffix array cannot produce a proper Burrows-Wheeler transform, because the BWT sorts rotations rather than suffixes.

  Take the input string "nall". Its rotations are:
    nall
    alln
    llna
    lnal

  If we suffix sort the string and map each suffix to the rotation starting at the same offset, we produce:
    all  => alln
    l    => lnal
    ll   => llna
    nall => nall

  If we sort the rotations themselves, we produce:
    alln
    llna
    lnal
    nall

  Thus, suffix sorting gives us "nlal" as the last column instead of "nall".
  This is because the BWT sorts rotations of the input string, whereas a suffix array sorts its suffixes.

  Adding a 256th character terminator before sorting will not produce the desired result, either.
  A more complicated string such as "mississippi" will sort as "ssmppissiii" with terminator=256,
  and as "ipssmpissii" with terminator=0, alphabet=1..256, whereas we want "pssmipissii".

  Performing a merge sort with a specialized comparison function that wraps suffixes is too slow:
  it needs O(n log n) comparisons, and each comparison may itself examine up to n characters.

  Producing a custom induced sort to handle rotations would be incredibly complicated,
  owing to the recursive nature of induced sorting, among other things.

  So instead, a temporary array is produced that contains the input string twice.
  This is then fed into the suffix array sort, and the suffixes that begin in the second copy are filtered out.
  After this point, the remaining suffixes are sorted in rotation order, and the correct result can be derived.

  The result of this is an O(2n) algorithm (the suffix sort runs over 2n characters instead of n),
  which vastly outperforms the naive comparison sort, but is still far from ideal.
  However, this will have to do until a better solution is devised.

  Although to be fair, BWT is inferior to the bijective BWT anyway, so it may not be worth the effort.
*/
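//A minimal sketch of the doubling approach described in the comment above, in standard C++ rather than nall types.
//Assumptions: bwtByDoubling is a hypothetical name that does not exist in nall, and std::sort over suffix offsets
//stands in for nall's SuffixArray. It is far slower than a real suffix array and only illustrates the filtering step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of nall): returns {last column, row of the original string}.
inline std::pair<std::string, std::size_t> bwtByDoubling(const std::string& input) {
  const std::size_t size = input.size();
  const std::string doubled = input + input;  // the input string, twice

  // Sort every suffix offset of the doubled string (naive stand-in for a suffix array).
  std::vector<std::size_t> suffixes(doubled.size());
  std::iota(suffixes.begin(), suffixes.end(), std::size_t{0});
  std::sort(suffixes.begin(), suffixes.end(), [&](std::size_t a, std::size_t b) {
    return doubled.compare(a, std::string::npos, doubled, b, std::string::npos) < 0;
  });

  std::string last;
  std::size_t root = 0;
  for(std::size_t suffix : suffixes) {
    if(suffix >= size) continue;            // begins in the second copy: not a rotation start
    if(suffix == 0) root = last.size();     // row that holds the unrotated input
    last += input[(suffix + size - 1) % size];  // byte preceding the rotation start, wrapping
  }
  return {last, root};
}
```

//For "nall" this yields the last column "nall" with the original string at row 3,
//and for "mississippi" it yields "pssmipissii", matching the values quoted in the comment above.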
inline auto BWT(array_view<uint8_t> input) -> vector<uint8_t> {
  auto size = input.size();
  vector<uint8_t> output;
  output.reserve(8 + 8 + size);
  //header: 64-bit little-endian input size, followed by a 64-bit placeholder for the root index
  //widen before shifting so that shifts of 32 bits or more stay well-defined if size is a 32-bit type
  for(uint byte : range(8)) output.append(uint64_t(size) >> byte * 8);
  for(uint byte : range(8)) output.append(0x00);
  //duplicate the input so that every rotation appears as a prefix of some suffix
  vector<uint8_t> buffer;
  buffer.reserve(2 * size);
  for(uint offset : range(size)) buffer.append(input[offset]);
  for(uint offset : range(size)) buffer.append(input[offset]);
  auto suffixes = SuffixArray(buffer);
  //keep only the suffixes that begin in the first copy of the input; each one corresponds to a rotation
  vector<int> prefixes;
  prefixes.reserve(size);
  for(uint offset : range(2 * size + 1)) {
    uint suffix = suffixes[offset];
    if(suffix >= size) continue;  //beyond the bounds of the original input string
    prefixes.append(suffix);
  }
  //emit the last column: the byte preceding each rotation's start, wrapping around at offset 0
  uint64_t root = 0;
  for(uint offset : range(size)) {
    uint suffix = prefixes[offset];
    if(suffix == 0) root = offset, suffix = size;  //this row holds the unrotated input
    output.append(input[--suffix]);
  }
  //fill in the root index placeholder that was reserved in the header
  for(uint byte : range(8)) output[8 + byte] = root >> byte * 8;
  return output;
}
}
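//A hedged sketch of how a consumer might read the 16-byte header that BWT() writes above:
//a 64-bit little-endian input size, a 64-bit little-endian root index, then the transformed bytes.
//Assumptions: BwtHeader, readLittleEndian64, and parseBwtHeader are hypothetical names in standard C++;
//they are not part of nall and are not nall's actual decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two header fields.
struct BwtHeader {
  uint64_t size;  // length of the original input
  uint64_t root;  // row of the sorted rotations that holds the original input
};

// Reads eight bytes starting at offset as a little-endian 64-bit value.
inline uint64_t readLittleEndian64(const std::vector<uint8_t>& data, std::size_t offset) {
  uint64_t value = 0;
  for(unsigned byte = 0; byte < 8; byte++) {
    value |= uint64_t(data[offset + byte]) << (byte * 8);  // widen before shifting
  }
  return value;
}

// Returns true only if the header is present and the payload length matches the stored size.
inline bool parseBwtHeader(const std::vector<uint8_t>& data, BwtHeader& header) {
  if(data.size() < 16) return false;
  header.size = readLittleEndian64(data, 0);
  header.root = readLittleEndian64(data, 8);
  return data.size() - 16 == header.size;
}
```

//The payload (the last column of the sorted rotations) begins at byte offset 16.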