First of all, note there is a difference between disassembling and decompiling:
- disassembly is basically just the straight machine code, displayed "conveniently" as semi-readable instructions, with some symbol lookups to help with understanding. It's relatively straightforward to implement, but obviously the result is still difficult to understand.
- decompiling is a return to source code, and is a much more difficult task. Done perfectly, it's essentially impossible, because some of the information (such as the names of your variables) cannot be recovered.
The freesci website essentially has documentation of everything you need to disassemble/decompile scripts (subject to the limitations I mentioned above), namely a description of what every assembly instruction does.
Assuming you have the disassembly for a script (SCI Companion does this, and I think Brian had a tool that did it too), these are the challenges:
- You'll need to look at patterns in the branching and jump assembly instructions, and convert them to: if statements, for loops, while statements, switch statements, etc... It's not always obvious what higher level construct they would map to, so this is a bit of a judgement call
- The arithmetic instructions are pretty straightforward to convert to the higher level operators. Pretty much a 1-to-1 mapping.
- You'll need to become familiar with all the assignment (load/store) instructions to know whether it's a script-local, global, function parameter, or function-local variable that's being used.
- The method calls are a little tricky. If I remember correctly, the freesci documentation I mentioned above had a few errors about how the "send" instructions work. But in general, by looking at what gets pushed on the stack prior to the send instruction, you can figure out what parameters are being passed.
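To make the branching point a bit more concrete, here's a rough sketch in Python of the kind of pattern matching involved. It is not taken from SCI Companion, and the instruction format is a made-up toy: the `Ins` tuple, the `classify_branches` function, and the "bnt" (branch if false) and "jmp" mnemonics are placeholders for illustration only, not the real SCI encoding. The idea is to look at the instruction sitting just before a conditional branch's target: a jump back to the condition suggests a while loop, a jump forward past the target suggests an if/else, and anything else is treated as a plain if.

```python
from typing import List, NamedTuple, Optional, Tuple


class Ins(NamedTuple):
    """One line of toy disassembly (hypothetical format, not real SCI)."""
    addr: int
    op: str
    target: Optional[int] = None


def classify_branches(code: List[Ins]) -> List[Tuple[int, str]]:
    """Guess the high-level construct behind each forward conditional branch."""
    index_of = {ins.addr: i for i, ins in enumerate(code)}
    found = []
    for ins in code:
        # Only forward "branch if false" instructions open a construct here.
        if ins.op != "bnt" or ins.target is None or ins.target <= ins.addr:
            continue
        if ins.target not in index_of:
            continue  # branch leaves the code we were given; skip it
        # Instruction that ends the guarded body (just before the target).
        last = code[index_of[ins.target] - 1]
        kind = "if"
        if last.op == "jmp" and last.target is not None:
            if last.target <= ins.addr:
                kind = "while"      # body ends by jumping back to the test
            elif last.target > ins.target:
                kind = "if/else"    # body ends by skipping over an else part
        found.append((ins.addr, kind))
    return found


# Toy listing: a while loop, then an if/else, then a plain if.
SAMPLE = [
    Ins(0, "cmp"), Ins(1, "bnt", 4), Ins(2, "add"), Ins(3, "jmp", 0),
    Ins(4, "cmp"), Ins(5, "bnt", 8), Ins(6, "add"), Ins(7, "jmp", 9),
    Ins(8, "sub"),
    Ins(9, "cmp"), Ins(10, "bnt", 12), Ins(11, "add"),
    Ins(12, "ret"),
]

if __name__ == "__main__":
    for addr, kind in classify_branches(SAMPLE):
        print(addr, kind)
```

Even in this toy version you can see where the judgement call comes in: a "break" at the end of an if body inside a loop is also a forward jump, so this heuristic would label it as an if/else.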
I got reasonably far along in writing a decompiler (the branching stuff was somewhat challenging, but I think I got that working pretty well in the end). I know I quit at some point though, so I must have run into some big remaining hurdles. There is a C++ file for the decompiler in the SCI Companion source code I provided, with a comment at the top saying something like "abandoned attempt at writing a decompiler".
The one time I recall decompiling by hand was to create the Avoid script. Brian left this out of the original template game, even though many SCI games used "avoiders". I remember it being a pretty tedious process to do by hand.