Updated Maintaining OOVPA's for HLE function detection (markdown)
parent
9c2d5ca14a
commit
622b3a0af3
|
@ -4,16 +4,43 @@ Cxbx in it's current form, uses HLE (High Level Emulation). This roughly means,
|
|||
So, for HLE to work, Cxbx needs to prevent those hardware accesses. Cxbx does this by scanning for problematic pieces of code and patching them with replacement code that emulates the original behavior.
|
||||
|
||||
#Scanning
|
||||
Cxbx scans the contents of an XBE using so-called OOVPA's. This stands for Optimized (Order,Value)-Pair Array.
|
||||
It's a datastructure that was thought up by Aaron Robinson (also known as Caustik), the initiator of the Cxbx back in 2003.
|
||||
Cxbx scans the contents of an XBE using so-called OOVPA's. OOVPA stands for "Optimized (Offset, Value)-Pair Array".
|
||||
It's a data structure that was thought up by Aaron Robinson (also known as Caustik), who started Cxbx back in 2003.
|
||||
|
||||
Its initial description can be read at http://www.caustik.com/cxbx/download/progress.htm, which says:
|
||||
>In order to efficiently locate a given chunk of assembly code (i.e. a High Level Function), a database of (offset,value) pairs can be used. Offset represents the offset (in bytes) from the start of the function. Value represents the byte value at that location. With this datatype, we can locate the function by hand, and then write down important (offset,value) pairs. This process is time consuming, but very rewarding. Cxbx is able to successfully (and with no false identifications to date) identify High Level Functions inside an arbitrary XBE file. This is due to the fact that, statistically, carefully chosen (offset,value) pairs are capable of uniquely identifying relocatable code. The likelihood of falsely locating a function body is inversely proportional to the number of pairs combined with the rarity of those pairs.
|
||||
|
||||
Each OOVPA describes one unique function which originated from a specific version of a library. Cxbx uses the OOVPA to scan for the location of that function in the XBE. If the OOVPA can be matched to a location in the XBE, that location can be overwritten with a call to a replacement function, called a patch, which emulates this function.
|
||||
Each OOVPA describes one unique function which originated from a specific version of a library. Cxbx uses an OOVPA to scan for the location of that function in an XBE.
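
To make the idea of (offset, value) pairs concrete, here is a minimal sketch in C. The type names `OVPair` and `Oovpa` and the function `oovpa_matches_at` are hypothetical and chosen for illustration only; they are not taken from the Cxbx source, and the real layout may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One (offset, value) pair: the byte expected at `offset`
   bytes from the start of the function. */
typedef struct {
    uint16_t offset;
    uint8_t  value;
} OVPair;

/* Hypothetical OOVPA layout: a counted array of pairs. */
typedef struct {
    size_t        count;
    const OVPair *pairs;
} Oovpa;

/* Returns 1 if every pair matches the bytes starting at `code`,
   else 0. `avail` is the number of readable bytes at `code`. */
static int oovpa_matches_at(const Oovpa *o, const uint8_t *code, size_t avail)
{
    assert(o != NULL && code != NULL);
    for (size_t i = 0; i < o->count; i++) {
        if (o->pairs[i].offset >= avail)
            return 0; /* pair lies outside the readable bytes */
        if (code[o->pairs[i].offset] != o->pairs[i].value)
            return 0; /* first mismatch ends the check */
    }
    return 1;
}
```

Only the listed offsets are compared, so bytes that change when code is relocated (such as embedded addresses) can simply be left out of the pair list.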
|
||||
|
||||
OOVPA's are registered in OOVPATable's. In its current state, Cxbx contains one OOVPATable per version of a library.
|
||||
|
||||
Together with the registration of an OOVPA in a OOVPATable, a patch can be mentioned. Registrations without a patch mention are not patched (obviously) but are still useful for scanning.
|
||||
Scanning for functions using OOVPA's goes roughly like this:
|
||||
|
||||
Apart from OOVPA's, Cxbx contains a set of so-called XRef numbers (short for cross-reference numbers), which are unique ID's to identify a function by. When scanning for functions, Cxbx records the location of each function whose XRef it has determined.
|
||||
Some OOVPA's are defined including an XRef number. When scanning for functions using the OOVPA's, matching locations are found. When there's a match found for an OOVPA with an XRef, the matching location is written to a list, indexed by the XRef number. Once this XRef is set, it's final, meaning that Cxbx will skip any further OOVPA mentioning that same XRef number. Also, some OOVPA's contain a XRef that must be previously detected. During scanning, if this XRef isn't set, the OOVPA is skipped and retried in a later pass. (As scanning is done in passes, repeating until no more XRef's are located.)
|
||||
* Cxbx walks through a list of OOVPA's, and for each of these, the address range is determined and scanned through.
|
||||
* For each location in the address range, all byte offsets mentioned in the OOVPA are read from the executable and checked against the value that should be there according to the OOVPA.
|
||||
* If all checks pass, the location is considered a match for the OOVPA and scanning continues with the next OOVPA.
|
||||
* If one or more values mismatch, it's a miss, and scanning continues through the rest of the address range.
|
||||
* If the entire range is checked without finding a match, the OOVPA (or rather, the function it describes) is considered not present in the executable.
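
The steps above can be sketched as a scan over one address range for a single OOVPA. This is an illustrative sketch, not the Cxbx implementation: `scan_range`, `OVPair`, and the use of plain buffer indices instead of real XBE addresses are all assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t offset;
    uint8_t  value;
} OVPair; /* hypothetical (offset, value) pair */

/* Scans image[lower, upper) for the first location where all pairs
   match. Returns that index, or -1 if the range holds no match. */
static long scan_range(const uint8_t *image, size_t image_size,
                       size_t lower, size_t upper,
                       const OVPair *pairs, size_t count)
{
    assert(image != NULL && pairs != NULL);
    if (upper > image_size)
        upper = image_size;
    for (size_t loc = lower; loc < upper; loc++) {
        size_t i = 0;
        while (i < count) {
            size_t at = loc + pairs[i].offset;
            if (at >= image_size || image[at] != pairs[i].value)
                break;      /* a miss: try the next location */
            i++;
        }
        if (i == count)
            return (long)loc; /* all pairs matched */
    }
    return -1; /* function considered not present in this range */
}
```

A caller would run this once per OOVPA and move on to the next OOVPA as soon as a match is returned.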
|
||||
|
||||
#Patches
|
||||
Together with the registration of an OOVPA in an OOVPATable, a replacement function, called a patch, can be mentioned. Registrations that don't mention a patch are not patched (obviously) but are still useful for scanning. (See the XRef description below).
|
||||
|
||||
If an OOVPA is matched to a location in the XBE, and the OOVPA is registered with a patch, that location is overwritten with a call to the patch, which emulates this function. Thus, the problem of hardware accesses is avoided and emulated instead.
|
||||
|
||||
#XRefs
|
||||
Apart from OOVPA's, Cxbx contains a set of so-called XRef numbers (short for cross-reference numbers), which are unique ID's to identify a function by. Some OOVPA's are defined including one of these XRef numbers.
|
||||
|
||||
When there's a match found for an OOVPA that has an XRef, the matching location is written to a list, indexed by the XRef number. Once there's a location recorded for an XRef, it's final, meaning that Cxbx will skip a scan with any other OOVPA mentioning that same XRef number.
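
The "recorded once, then final" rule can be sketched as a small table indexed by XRef number. The table size, the use of `0` as the "not located yet" marker, and the name `xref_record` are assumptions for this example, not details from the Cxbx source.

```c
#include <assert.h>
#include <stdint.h>

#define XREF_COUNT   8u /* hypothetical number of XRef IDs */
#define XREF_UNKNOWN 0u /* hypothetical "not located yet" marker */

/* One slot per XRef number; zero-initialised, so all start unknown. */
static uint32_t xref_table[XREF_COUNT];

/* Records `location` for `xref` unless a location is already recorded.
   Returns 1 if recorded, 0 if the XRef was already final. */
static int xref_record(unsigned xref, uint32_t location)
{
    assert(xref < XREF_COUNT && location != XREF_UNKNOWN);
    if (xref_table[xref] != XREF_UNKNOWN)
        return 0; /* final: later matches for this XRef are ignored */
    xref_table[xref] = location;
    return 1;
}
```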
|
||||
|
||||
Some OOVPA's contain an XRef that must be checked for, together with the (Offset, Value)-pairs. This check requires the mentioned XRef to be previously recorded. If, during scanning, that XRef isn't set yet, the OOVPA is skipped and retried in a later pass. (Scanning is done in passes, repeating until no more XRef's are located.) If the XRef is set, however, the code must reference the recorded location to be valid. If it doesn't, the OOVPA search continues looking through the executable.
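
The pass-based retry can be sketched as a loop that repeats until a whole pass records nothing new. In this simplified sketch the byte scan itself is replaced by a precomputed `location` field, and `Entry` and `resolve_in_passes` are hypothetical names; it only illustrates the ordering behaviour, not the real scanner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      needs_xref; /* XRef that must already be located, or -1 */
    int      sets_xref;  /* XRef this entry locates, or -1 */
    uint32_t location;   /* stand-in for the result of the byte scan */
} Entry;

/* Repeats passes until one pass locates no new XRef.
   `xrefs` is indexed by XRef number; 0 means "not located yet".
   Returns the number of passes that were run. */
static unsigned resolve_in_passes(const Entry *e, size_t n, uint32_t *xrefs)
{
    unsigned passes = 0;
    int progress;
    assert(e != NULL && xrefs != NULL);
    do {
        progress = 0;
        passes++;
        for (size_t i = 0; i < n; i++) {
            if (e[i].sets_xref < 0 || xrefs[e[i].sets_xref] != 0)
                continue; /* nothing to record, or already final */
            if (e[i].needs_xref >= 0 && xrefs[e[i].needs_xref] == 0)
                continue; /* dependency unknown: retry next pass */
            xrefs[e[i].sets_xref] = e[i].location;
            progress = 1;
        }
    } while (progress);
    return passes;
}
```

With entries listed in the "wrong" order (each depending on the one after it), every pass resolves one more XRef, and a final pass without progress ends the loop.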
|
||||
|
||||
Checking for an XRef means comparing the recorded location to the 4 bytes at the mentioned offset. The bytes are interpreted both as a direct (absolute) reference and as an indirect (address-relative) reference. If either interpretation matches the recorded location, the XRef check holds, and all other (Offset, Value) pairs are checked. If the XRef check fails, the rest of the OOVPA is not checked: it's deemed a miss, and scanning continues with the next address.
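
A minimal sketch of such a check follows. It assumes little-endian 32-bit operands and that the relative form follows the usual x86 `rel32` convention (displacement measured from the byte after the 4-byte operand). The names `read_u32le` and `xref_check`, and the `base` parameter mapping buffer indices to addresses, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian 32-bit value from 4 bytes. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* `base` is the address of image[0]; `loc` is the index of the
   candidate function start; `offset` is where the reference sits.
   Returns 1 if the 4 bytes refer to `recorded`, else 0. */
static int xref_check(const uint8_t *image, size_t size, uint32_t base,
                      size_t loc, size_t offset, uint32_t recorded)
{
    assert(image != NULL);
    if (loc > size || offset > size - loc || size - loc - offset < 4)
        return 0; /* operand would run past the image */
    size_t at = loc + offset;
    uint32_t operand = read_u32le(image + at);
    if (operand == recorded)
        return 1; /* direct (absolute) reference */
    /* Address-relative: target = address after the operand + operand. */
    uint32_t next = base + (uint32_t)at + 4u;
    return (uint32_t)(next + operand) == recorded;
}
```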
|
||||
|
||||
#Maintaining OOVPA's
|
||||
Each OOVPA must be distinct from all other OOVPA's. Otherwise, the same location could be matched to more than one function, which would lead to incorrectly placed patches and, in turn, unpredictable behaviour (most likely crashes).
|
||||
|
||||
An OOVPA is formed by choosing a few offsets in the machine code of a function and writing down the byte values at those offsets, in such a way that no other function matches the same bytes at the same offsets.
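
As a rough aid when writing a new OOVPA, two pair lists can be compared to see whether a single piece of code could satisfy both. The sketch below is a conservative, hypothetical check (`oovpas_can_collide` is not a Cxbx function): it ignores XRef checks and address ranges, and only reports whether the two lists disagree on at least one shared offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t offset;
    uint8_t  value;
} OVPair; /* hypothetical (offset, value) pair */

/* Returns 1 if some byte sequence could satisfy both pair lists,
   i.e. no shared offset demands two different values. Returns 0
   if the lists are guaranteed to never match the same location. */
static int oovpas_can_collide(const OVPair *a, size_t na,
                              const OVPair *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (a[i].offset == b[j].offset && a[i].value != b[j].value)
                return 0; /* contradictory byte: lists are distinct */
    return 1;
}
```

A result of 1 does not prove that a real collision exists in any XBE; it only means the pairs alone do not rule one out, so adding a distinguishing pair is worth considering.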
|
||||
|
||||
The function an OOVPA scans for can differ between library versions. To get reliable emulation, Cxbx needs to contain unique OOVPA definitions that together match all existing versions of a function.
|
||||
|
||||
Sometimes, after a function changed in one version, it changes once more in a later version. In some rare cases, a function might even re-appear in a prior form! In that case, the OOVPA for the re-appearance must not be copied over from the earlier version; an alias must be registered instead. (Aliases are simply `#define function_new_version function_old_version`)
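
As an illustration of such an alias, the sketch below defines one pair list and makes a later version's name refer to it through the preprocessor. The names `ExampleFunc_v1` and `ExampleFunc_v3`, the `OVPair` type, and the byte values are all hypothetical and only show the pattern.

```c
#include <stdint.h>

typedef struct {
    uint16_t offset;
    uint8_t  value;
} OVPair; /* hypothetical (offset, value) pair */

/* OOVPA data written for an earlier library version (made-up bytes). */
static const OVPair ExampleFunc_v1[] = { {0x00, 0x55}, {0x07, 0xC3} };

/* The same function body re-appears in a later version: register an
   alias instead of copying the pairs, so there is one definition. */
#define ExampleFunc_v3 ExampleFunc_v1
```

Since both names expand to the same object, a fix to the pair list applies to every version that aliases it.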
|