Post by gtoal on Oct 23, 2019 22:15:41 GMT -5
Traditional static disassemblers generally don't do a very good job of labeling data areas.
Some of them have trouble even determining what is data and what is code, though a semi-intelligent disassembler will do a tree-walk of code areas from known entry points to find areas of a program that it knows for sure are code. However, indirect jumps and other tricks can mean that some of the code areas don't get discovered. Data areas are generally opaque, but for a good disassembly we need to know at least when a data item is a two-byte value and whether it represents a code or data address.
Some of these issues are important if you're building a static binary translator (or decompiler); others are more important for a disassembly, especially if you need to modify the disassembly and re-assemble the program with changes. Inserting a NOP at the start of the code or an extra byte of data at the start of the data area can completely break a re-assembly if done naively.
On the 6809 it's not actually possible to statically make a complete labelled symbolic disassembly because of the DP register - data addresses can represent different areas dependent on the value of DP. In a binary translator that's not so much of an issue as you're just accessing a numeric offset but in a disassembler you want the name of the location so that if you move the data around and reassemble, everything still works.
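To make the DP problem concrete, here is a minimal C sketch of how a direct-page operand turns into an address. The function name is made up for illustration, and 0xC8 and 0xD0 are used only as example DP values.

```c
#include <assert.h>   /* only needed for quick sanity checks */
#include <stdint.h>

/* Direct-page addressing on the 6809: the 8-bit operand in the instruction
   supplies the low byte of the address and the DP register supplies the
   high byte.  The operand alone therefore does not identify a location. */
static uint16_t direct_address(uint8_t dp, uint8_t operand)
{
    return (uint16_t)(((unsigned)dp << 8) | operand);
}
```

The same operand byte 0x80 gives 0xC880 when DP is 0xC8 and 0xD080 when DP is 0xD0, so a static disassembler that does not know DP cannot pick the right label.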
However there is a way to get all the information that a disassembler or translator needs - something that was not possible a few years ago when computer memory was at a premium.
We can use an emulator to dynamically profile the execution.
What we do is have the emulator keep arrays of information, with an array element for each byte of the ROM (or we might as well do it for each byte of the machine's address space to keep the code simple at the expense of using a lot more RAM - of which we have gigabytes available nowadays so who cares?)
This info is kept in a database for each program being profiled. It is dumped at the end of a run (or incrementally during a run, for safety in case of emulator crashes or forced exits) and the last dump is loaded the next time you run the same program, so that the data accumulates across runs. If possible the data collection should be done by all users of the emulator and shared. If done efficiently, this data collection should not noticeably affect the speed of the emulation, so users can run emulated games, programs etc. as they normally do. That allows widespread collection of data by a user community, not just a one-off by a programmer creating a disassembly.
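A rough sketch of the dump-and-reload cycle in C, assuming the simplest possible "database": one flat file of flag words per program. The array name, function names and file layout are all hypothetical. Merging is done by OR-ing, so a fact seen in any earlier run is never lost.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ADDRESS_SPACE 0x10000u

/* One flag word per byte of the 64K address space (hypothetical layout). */
static uint32_t profile_flags[ADDRESS_SPACE];

/* Merge a previous dump into the in-memory flags.  Returns 0 on success,
   -1 if the file is missing or short (for example on the very first run). */
static int profile_load_merge(const char *path)
{
    static uint32_t saved[ADDRESS_SPACE];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(saved, sizeof saved[0], ADDRESS_SPACE, f);
    fclose(f);
    if (n != ADDRESS_SPACE)
        return -1;
    for (size_t i = 0; i < ADDRESS_SPACE; i++)
        profile_flags[i] |= saved[i];
    return 0;
}

/* Write to a temporary file and then rename it, so a crash in the middle
   of a dump does not destroy the last good copy.  Assumption: rename()
   replaces an existing file, which holds on POSIX systems; on Windows the
   old file would have to be removed first. */
static int profile_save(const char *path)
{
    char tmp[1024];
    int len = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (len < 0 || (size_t)len >= sizeof tmp)
        return -1;
    FILE *f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;
    size_t n = fwrite(profile_flags, sizeof profile_flags[0], ADDRESS_SPACE, f);
    if (fclose(f) != 0 || n != ADDRESS_SPACE) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path) == 0 ? 0 : -1;
}
```

An emulator would call profile_load_merge() once at startup and profile_save() at exit, and perhaps every few minutes as well. Note that this writes the words in the host's native byte order, so files meant for sharing between users would need a defined byte order or a text format.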
The information that needs to be captured on an emulation run includes at least these details: (There may be more needed)
This is an instruction that was executed. (record *all* values of DP seen when executing at this address.)
This is a parameter byte of an instruction that was executed (don't care too much which byte of the parameter, since that is easy to rediscover)
This instruction was explicitly jumped to (bra, jmp, jsr etc)
This instruction was indirectly jumped to (jmp (blah), jmp blah,x)
This instruction was indirectly jumped to (rts) (always needed for binary translator, needed by disassembler if return address was modified)
This instruction was indirectly jumped to (rti) (for modified return address, not for return from real interrupt)
This address was loaded from directly as a single byte
This address was loaded from directly as a double byte
This address was loaded from indirectly as a single byte
This address was loaded from indirectly as a double byte
This address was used as the base of a single byte indexed fetch - save lowest and highest offsets, useful in reconstructing data tables
This address was used as the base of a double byte indexed fetch - save lowest and highest offsets
This address and the following byte contained the values of an indirect jump target address
This address and the following byte contained the values of an indirect data pointer address
Profiling information such as the number of times a code or data address was fetched could be useful too.
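One possible way to hold the items listed above, sketched in C. Every name here (the flag constants, ByteInfo, the note_* functions) is invented for illustration, and the emulator core is assumed to call these hooks with the instruction length, the current DP, and whatever it considers the base and offset of an indexed access. For brevity the 8-bit and 16-bit indexed fetches share one offset range.

```c
#include <assert.h>
#include <stdint.h>

#define ADDRESS_SPACE 0x10000u

enum {                                   /* one bit per fact in the list */
    F_OPCODE        = 1 << 0,  F_OPERAND         = 1 << 1,
    F_JUMP_DIRECT   = 1 << 2,  F_JUMP_INDIRECT   = 1 << 3,
    F_JUMP_RTS      = 1 << 4,  F_JUMP_RTI        = 1 << 5,
    F_LOAD8_DIRECT  = 1 << 6,  F_LOAD16_DIRECT   = 1 << 7,
    F_LOAD8_INDIR   = 1 << 8,  F_LOAD16_INDIR    = 1 << 9,
    F_BASE8         = 1 << 10, F_BASE16          = 1 << 11,
    F_CODE_POINTER  = 1 << 12, F_DATA_POINTER    = 1 << 13
};

typedef struct {
    uint32_t flags;          /* F_* bits seen for this address            */
    uint32_t hits;           /* how often it was executed (saturating)    */
    uint32_t dp_seen[8];     /* 256-bit set: DP values seen when executed */
    int32_t  min_offset;     /* lowest indexed offset from this base      */
    int32_t  max_offset;     /* highest indexed offset from this base     */
} ByteInfo;

static ByteInfo info[ADDRESS_SPACE];

/* Called once per executed instruction of `length` bytes starting at pc. */
static void note_execute(uint16_t pc, unsigned length, uint8_t dp)
{
    assert(length >= 1);
    info[pc].flags |= F_OPCODE;
    if (info[pc].hits != UINT32_MAX)
        info[pc].hits++;
    info[pc].dp_seen[dp >> 5] |= (uint32_t)1 << (dp & 31);
    for (unsigned i = 1; i < length; i++)
        info[(uint16_t)(pc + i)].flags |= F_OPERAND;
}

static int dp_was_seen(uint16_t pc, uint8_t dp)
{
    return (info[pc].dp_seen[dp >> 5] >> (dp & 31)) & 1u;
}

/* Called for an indexed fetch of base+offset; widens the offset range. */
static void note_indexed(uint16_t base, int32_t offset, int is_16bit)
{
    ByteInfo *b = &info[base];
    if (!(b->flags & (F_BASE8 | F_BASE16))) {
        b->min_offset = b->max_offset = offset;   /* first sighting */
    } else {
        if (offset < b->min_offset) b->min_offset = offset;
        if (offset > b->max_offset) b->max_offset = offset;
    }
    b->flags |= is_16bit ? F_BASE16 : F_BASE8;
}

/* Called when the instruction at `target` is reached; `how` is one of the
   F_JUMP_* bits, chosen by the emulator from the instruction that jumped. */
static void note_jump(uint16_t target, uint32_t how)
{
    info[target].flags |= how;
}
```

The load and pointer flags would be set the same way as note_jump() sets the jump flags. At about 48 bytes per address this comes to roughly 3MB for the whole 64K address space.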
For each instruction address (opcode byte), record the value of each register (DP is handled differently, and the CC bits are recorded separately...):
- initialise the cache entry to 0xFEEDBEEF ("not seen yet")
- when the instruction is executed, if the cache is 0xFEEDBEEF, set it to the value of the register
- if the cache is not 0xFEEDBEEF and not the same as the value, set it to 0xDEADBEEF ("varies")
After a run, the cache tells us whether a register is always constant at that point in the code. (We might as well use 4-byte integers in our arrays - faster, easier to work with, and we can afford the RAM. It also means the two marker values can never collide with a real 8- or 16-bit register value.)
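The FEEDBEEF/DEADBEEF scheme could be sketched in C roughly like this. The register list, array and function names are hypothetical; DP and CC are left out because the post treats them separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDRESS_SPACE 0x10000u
#define REG_UNSEEN 0xFEEDBEEFu   /* never executed at this address      */
#define REG_VARIES 0xDEADBEEFu   /* more than one value has been seen   */

enum { REG_A, REG_B, REG_X, REG_Y, REG_U, REG_S, REG_COUNT };

/* 4-byte slots, so neither marker can equal a real 8/16-bit value. */
static uint32_t reg_cache[ADDRESS_SPACE][REG_COUNT];

static void reg_cache_init(void)
{
    for (size_t pc = 0; pc < ADDRESS_SPACE; pc++)
        for (int r = 0; r < REG_COUNT; r++)
            reg_cache[pc][r] = REG_UNSEEN;
}

/* Call for each register just before executing the opcode at pc. */
static void reg_cache_note(uint16_t pc, int reg, uint16_t value)
{
    uint32_t *slot = &reg_cache[pc][reg];
    assert(reg >= 0 && reg < REG_COUNT);
    if (*slot == REG_UNSEEN)
        *slot = value;                 /* first sighting              */
    else if (*slot != REG_VARIES && *slot != value)
        *slot = REG_VARIES;            /* second, different value     */
}

/* Returns 1 and stores the value if the register was always the same
   at pc in every execution seen so far; otherwise returns 0. */
static int reg_is_constant(uint16_t pc, int reg, uint16_t *value)
{
    uint32_t v = reg_cache[pc][reg];
    if (v == REG_UNSEEN || v == REG_VARIES)
        return 0;
    *value = (uint16_t)v;
    return 1;
}
```

A disassembler back end could then use reg_is_constant() to emit a comment such as "A is always $42 here". Bear in mind that "constant" only means constant in the runs profiled so far.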
The register contents info isn't critical to disassembly and not strictly needed in a binary translation (although it *can* be used for optimisation), but if known, it can help a lot with commenting the disassembly.
I'm not asking any of our emulator authors (Vide, VecX, Vectrexy, Mame etc) to add these facilities but if you find yourselves at some point adding some subset of these features for some need of your own, do consider the bigger picture and think about adding all of them at the same time. (I've added this to my own ever-increasing job queue but it is *way* *way* down that stack...)
I'm pretty sure that with an instrumented emulator like this and suitable back-end code, we could do really good disassemblies of all the old Vectrex code in a way that would make it reassemblable, not to mention decompiling to C etc. for retargeting to other architectures.
(Of course I'd like to see something similar in general purpose emulators such as Mame to make retargeting of arcade games simpler too.)
Graham