Author Topic: SCI Decompiler?  (Read 31350 times)

Offline troflip

Re: SCI Decompiler?
« Reply #15 on: February 09, 2011, 11:16:26 PM »
First of all, note there is a difference between disassembling and decompiling:
- disassembly is basically just the straight machine code, displayed "conveniently" as semi-readable instructions, with some symbol lookups to help with understanding. It's relatively straightforward to implement, but obviously the result is still difficult to understand.
- decompiling is a return to source code, and is a much more difficult task - well, it's essentially impossible - since some of the information (such as the names of your variables) is impossible to recover.

The freesci website essentially has documentation of everything you need to disassemble/decompile scripts (subject to the limitations I mentioned above). That is, a description of what every assembly instruction does.

Assuming you have the disassembly for a script (SCI Companion does this, and I think Brian had a tool that did it too), these are the challenges:
- You'll need to look at patterns in the branching and jump assembly instructions, and convert them to: if statements, for loops, while statements, switch statements, etc... It's not always obvious what higher level construct they would map to, so this is a bit of a judgement call
- The arithmetic instructions are pretty straightforward to convert to the higher level operators. Pretty much a 1-to-1 mapping.
- You'll need to become familiar with all the assignment (load/store) instructions to know whether it's a script-local, global, function parameter, or function-local variable that's being used.
- The method calls are a little tricky - if I remember correctly, there were a few errors in the freesci documentation I mentioned above about how the "send" instructions worked. But in general, by looking at what gets pushed on the stack prior to the send instruction, you can figure out what parameters are being passed.
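As a rough illustration of the load/store point above, here is a minimal Python sketch that splits a variable-access mnemonic into its parts. It assumes the naming scheme described in the freesci documentation (operation, then accumulator/stack, then variable kind, then an optional "i" for indexed access). The function name and the returned dictionary layout are made up for this example and are not taken from SCI Companion or any other tool.

```python
# Sketch only: assumes mnemonics such as "lag", "sat", "lspi" or "+al" follow
# the pattern <operation><target><scope>[i] described in the freesci docs.

OPERATIONS = {"l": "load", "s": "store", "+": "increment", "-": "decrement"}
TARGETS = {"a": "accumulator", "s": "stack"}
SCOPES = {
    "g": "global",
    "l": "script-local",
    "t": "function-local (temp)",
    "p": "function parameter",
}


def describe_variable_op(mnemonic):
    """Return a description of a variable-access mnemonic, or None otherwise."""
    if len(mnemonic) not in (3, 4):
        return None
    op, target, scope = mnemonic[0], mnemonic[1], mnemonic[2]
    if op not in OPERATIONS or target not in TARGETS or scope not in SCOPES:
        return None
    indexed = len(mnemonic) == 4
    if indexed and mnemonic[3] != "i":
        return None
    return {
        "operation": OPERATIONS[op],
        "via": TARGETS[target],
        "scope": SCOPES[scope],
        "indexed": indexed,
    }


print(describe_variable_op("lag"))    # load a global into the accumulator
print(describe_variable_op("pushi"))  # not a variable access, so None
```

A decompiler would still need the operand (the variable index) to pick a name such as global12 or temp3; the sketch only answers the "which kind of variable" question.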

I got reasonably far along in writing a decompiler (the branching stuff was somewhat challenging, but I think I got that working pretty well in the end). I know I quit at some point though, so I must have run into some big remaining hurdles. There is a C++ file for the decompiler in the SCI Companion source code I provided, with a comment on top saying something like "abandoned attempt at writing a decompiler".

The one time I recall decompiling by hand was to create the Avoid script. Brian left this out of the original template game, even though many SCI games used "avoiders". I remember it being a pretty tedious process to do by hand.
Check out my website: http://icefallgames.com
Groundhog Day Competition

Offline lance.ewing

Re: SCI Decompiler?
« Reply #16 on: February 10, 2011, 12:58:20 PM »
Thanks for that description and also the tip about the decompiler code in SCI Companion. It has occurred to me that a decompiler that might work for the original games may not work 100% out of the box for fan made games. The reason for this is that they are obviously different compilers. The Sierra compiler may choose slightly different VM instructions for the same basic source structures, and vice versa.

Offline troflip

Re: SCI Decompiler?
« Reply #17 on: February 11, 2011, 04:22:14 PM »
Yes, that's true (and even SCI Studio and SCI Companion produce different compiled code). I don't think the differences are that great, but there certainly are some (probably mostly around how loops and conditionals are implemented).  A good decompiler should be able to take any different "style" of compiled code and produce reasonable source code though - it's just more work :-).
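To make the loops-and-conditionals point concrete, here is a small Python sketch of one possible heuristic for guessing the construct behind a bnt (branch if not true). Everything in it is an assumption for illustration: the (address, mnemonic, target) tuples, the addresses, and the rule itself. It is not how SCI Companion or any real decompiler does it, and a different compiler "style" could easily defeat it.

```python
# Hypothetical heuristic: look at the last instruction before the bnt target.
#   jmp back to (or before) the bnt  -> the body loops, so guess "while"
#   jmp past the bnt target          -> the body skips an else part, "if/else"
#   anything else                    -> plain "if"

def classify_branch(instructions, index):
    """instructions: list of (address, mnemonic, target_or_None), by address."""
    address, mnemonic, target = instructions[index]
    if mnemonic != "bnt":
        raise ValueError("expected a bnt instruction")
    # Instructions that form the body the bnt would skip over.
    body = [ins for ins in instructions[index + 1:] if ins[0] < target]
    if not body:
        return "if"
    _, last_mnemonic, last_target = body[-1]
    if last_mnemonic == "jmp":
        if last_target <= address:
            return "while"
        if last_target > target:
            return "if/else"
    return "if"


# Made-up listings, only meant to exercise the three cases.
plain_if = [(0, "lal", None), (2, "bnt", 6), (4, "ldi", None), (6, "ret", None)]
if_else = [(0, "lal", None), (2, "bnt", 8), (4, "ldi", None),
           (6, "jmp", 12), (8, "ldi", None), (12, "ret", None)]
loop = [(0, "lal", None), (2, "bnt", 10), (4, "push0", None),
        (7, "jmp", 0), (10, "ret", None)]

for listing in (plain_if, if_else, loop):
    print(classify_branch(listing, 1))
```

Supporting another compiler's output would mostly mean adding more rules of this kind, which is where the extra work comes in.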

Offline lance.ewing

Re: SCI Decompiler?
« Reply #18 on: March 01, 2011, 01:10:45 PM »
Before I write a line of code, I want to do a lot of research and investigation into SCI and especially the VM instruction set. So I've read over that section in the SCI specs, I've even printed it out and carry it around in my pocket just in case there is some spare time to read over it again and again and again... i.e. get very familiar with the instructions and what they do.

What I've done over the past couple of days is attempt to work out the most commonly used instructions. I did this by running Brian's SCI Disassembler on a few games (KQ4, COC, PQ2) and then, with some grep/sed/cut/uniq/sort magic, I've now got a CSV file with the counts of each instruction across those games. I then loaded that into Excel and with a simple calculation I've now got the percentages for each instruction, i.e. what percentage of all instructions encountered each instruction accounts for.
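For anyone who would rather skip the grep/sed/Excel route, the same counting can be sketched in a few lines of Python. The input format is an assumption: the sketch simply treats one whitespace-separated column of each disassembly line as the mnemonic, which may not match the actual output of Brian's disassembler, so the column parameter would need adjusting.

```python
from collections import Counter


def count_mnemonics(lines, column=0):
    """Count how often each mnemonic appears in the given text column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > column:
            counts[fields[column]] += 1
    return counts


def percentages(counts):
    """Map each mnemonic to its share of all instructions, most common first."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.most_common()}


# Tiny made-up listing, not real disassembler output.
sample = ["pushi 4", "push1", "pushi 2", "send 6"]
for name, share in percentages(count_mnemonics(sample)).items():
    print(f"{name},{share:.1f}")
```

Feeding it the lines of every disassembled script for a game would give the per-game table; merging several Counter objects with + gives the combined one.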

It makes very interesting reading. At the top of the list by a long way is pushi. Next comes push1, then bnt, push0, send, push, ldi, push2, jmp, lofsa. That makes up the top ten. From this we can see that of the top 10, half of them are variations of push. In fact it turns out that nearly 50% of all instructions encountered were a variation of push. The pushi instruction accounts for 25% by itself. Add in the counts for the other push instructions and it is increased to nearly 50%. The top 10 instructions account for just over 70% of all instructions encountered. The top 20 for about 85% or so, and the top 30 for nearly 95%. What this tells me is what instructions I should spend most of my time researching. The others account for very little. Some of them don't even appear to have been used.
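The "top N" and "all push variants" figures above are simple to derive once the counts exist. Below is a minimal Python sketch; the counts in it are invented purely to show the arithmetic and are not the numbers from my spreadsheet, and both function names are hypothetical.

```python
def top_n_share(counts, n):
    """Percentage of all instructions covered by the n most frequent ones."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sorted(counts.values(), reverse=True)[:n]
    return 100.0 * sum(top) / total


def family_share(counts, prefix):
    """Percentage of all instructions whose mnemonic starts with prefix."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    matching = sum(n for name, n in counts.items() if name.startswith(prefix))
    return 100.0 * matching / total


# Invented counts, for illustration only.
example = {"pushi": 5, "push1": 2, "bnt": 2, "send": 1}
print(top_n_share(example, 2))
print(family_share(example, "push"))
```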

Now what I guess I'll do is look at how each of these top 10 (20 and 30) instructions has been used and manually deduce what the associated SCI source would have been. From that I'll hopefully see in what scenarios the SCI compiler used particular instructions.

Offline gumby

Re: SCI Decompiler?
« Reply #19 on: May 11, 2011, 09:42:03 PM »
Actually the better approach was to use UNLZEXE to obtain the original uncompressed SCIV.EXE file. After having done this, I can now clearly see the version number in the file:

0.000.685

So the interpreter packaged with SCI Companion is 0.000.685.
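Once the executable is uncompressed, spotting the version by eye works, but it can also be scripted. Here is a minimal Python sketch that scans raw bytes for anything shaped like d.ddd.ddd. The pattern is an assumption based only on the 0.000.685 example above, the function name is made up, and it will also report unrelated strings that happen to have the same shape.

```python
import re

# Assumed shape of the version string, based on the single example 0.000.685.
VERSION_PATTERN = re.compile(rb"\d\.\d{3}\.\d{3}")


def find_version_strings(data):
    """Return every d.ddd.ddd style string found in a bytes object."""
    return [match.decode("ascii") for match in VERSION_PATTERN.findall(data)]


# Made-up bytes standing in for the contents of an uncompressed executable.
sample = b"\x00\x13junk\x00SCI 0.000.685\x00more\xff"
print(find_version_strings(sample))
```

For a real file the bytes would come from open(path, "rb").read() on the uncompressed SCIV.EXE.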

Just did the same for SCI Studio and it is also 0.000.685.

Maybe earlier versions of SCI Studio used 0.000.572. It seems like the SCI specs and potentially the template game could have been based on that version.


Just got the wonderful 'Oops' error from Companion that indicates that its version is indeed 0.000.685 (in case there was still a question).
In the Great Underground Empire (Zork port in development)
Winter Break 2012 Rope Prop Competition

Offline OmerMor

Re: SCI Decompiler?
« Reply #20 on: July 26, 2011, 10:38:27 AM »
lance, any progress so far?
I'd love to hear about it.

Offline lance.ewing

Re: SCI Decompiler?
« Reply #21 on: July 27, 2011, 01:25:59 AM »
Unfortunately I haven't made all that much progress. When I started to get back into the fanmade AGI/SCI community again, I decided I'd pick one thing so that I didn't spread myself too thinly. So I decided that that would be the Java version of PICEDIT. But then I got fascinated with the whole SCI area, especially regarding the language, and for a few months I didn't do anything on PICEDIT at all. This is because I was thinking about the SCI decompiler and doing investigation into how that would work.

I thought I'd start by putting together a web site to capture all my thoughts and discoveries. The main part of that web site was going to be a table with all of the SCI instructions on the left hand side and then various notes about them on the right hand side, with the right most column being some thoughts about how the original code might have looked. I only got part way through building that table when I realised I was neglecting PICEDIT. My ultimate goal for PICEDIT is to add SCI support, but before adding that support, I wanted to bring it up to speed with the other AGI picture editors. So I put the SCI decompiling investigation on hold while I focussed more on PICEDIT. That is where I'm at at the moment.

You can have a look at the web site that I was starting to put together. A lot of the thoughts on there are things I mentioned in various posts on this forum. This is the URL:

http://www.scriptinterpreter.com

Offline Collector

Re: SCI Decompiler?
« Reply #22 on: July 27, 2011, 04:51:44 AM »
It might be good to track down Jeff Stephenson. I did a little Googling to see if I could find any trace of him. I didn't look too long, but came across this from someone that worked at Sierra:

Quote
My first OO language was SCI--Sierra Creative Interpreter. This language was written by Jeff Stephenson for programming adventure games at Sierra On-Line. My brother-in-law Chris Smith, who got me a job with Sierra in late 1989, sent me some sketchy documentation for SCI. One of the highlights of my programming life was sitting on my bed reading that documentation the day I got it. SCI was a combination of Lisp, C and Smalltalk, but the message-passing was very much Smalltalkesque. I loved the language.

http://mwilden.com/smalltalk/index.htm

Also found this http://marketplace.publicradio.org/standard/display/slideshow.php?ftr_id=60233

At least we know what he looks like, now.
« Last Edit: July 27, 2011, 05:07:00 AM by Collector »
KQII Remake Pic

Offline lance.ewing

Re: SCI Decompiler?
« Reply #23 on: July 28, 2011, 01:17:10 AM »
I did try to track down Jeff back around the start of this year and also found both of the pages that you sent through. I actually spoke to Mark Wilden quite a bit at the time but he unfortunately didn't remember much of the detail of the syntax of the language and he didn't know how to get in contact with Jeff.

I did manage to get one step further towards tracking him down though. I found an almost certain match for him on a social networking web site. I think it was mylife.com. I tried sending him a message through that site but didn't receive a reply back. So that was about as far as I got.

Even if we were to track him down, chances are that he wouldn't be able to remember much, unless of course he has documentation or something still with him. But when you're working in IT, you're not really meant to hang on to code or documentation from a previous employer. So memory is often all that there is to go on. Mark Wilden programmed in SCI for quite some time and yet he struggles to remember much about the syntax. I think that that small code snippet from Police Quest SWAT is possibly more useful than anything former Sierra employees can remember from 20 years ago.

Offline Collector

Re: SCI Decompiler?
« Reply #24 on: July 28, 2011, 04:06:46 AM »
That snippet might help to jar memories. I would not completely dismiss the idea that someone kept code or documentation. After all, Mark Wilden had documentation sent to him, so something did get outside of Oakhurst.

Offline OmerMor

Re: SCI Decompiler?
« Reply #25 on: July 28, 2011, 10:09:19 AM »
BTW - I found that article about Jeff some time ago when I tried to look for information about Avis Durgan - Jeff's wife.
Her name is used as the encryption key in AGI. She is pictured with Jeff in that article.
Maybe you could track Avis down, and get in touch with Jeff through her. Just a thought.
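For the curious, the AGI scheme is, as far as it is generally documented, a plain repeating XOR against the text "Avis Durgan". Here is a minimal Python sketch of that idea; which AGI data is actually encrypted this way is not covered in this thread, and the function name is made up for the example.

```python
from itertools import cycle

KEY = b"Avis Durgan"


def xor_with_key(data, key=KEY):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# XOR is its own inverse, so applying the function twice restores the input.
scrambled = xor_with_key(b"hello world")
print(xor_with_key(scrambled))
```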
« Last Edit: December 08, 2015, 02:03:55 PM by OmerMor »

Offline OmerMor

Re: SCI Decompiler?
« Reply #26 on: July 28, 2011, 10:47:15 AM »

Offline lance.ewing

Re: SCI Decompiler?
« Reply #27 on: July 28, 2011, 12:33:38 PM »
Great detective work. I tried looking on the Wayback Machine 6 months ago when I was trying to find the code and couldn't find it. I remember specifically looking around Brian's web site but couldn't see anything. I was probably looking at the wrong years. I did talk to Brian 6 months ago and he remembered the code but couldn't remember the details of what it was or where it had come from, and he didn't think he had it anymore. So it was there to be found...  I just had to keep digging.

Offline lance.ewing

Re: SCI Decompiler?
« Reply #28 on: July 28, 2011, 12:35:49 PM »
Quote
BTW - I found that article about Jeff some time ago when I tried to look for information about Avis Durgan - Jeff's wife.
Her name is used as the encryption key in AGI. She is pictured with Jeff in that article.
Maybe you could track Avis down, and get in touch with Jeff through her. Just a thought.

Actually it was the Avis Durgan name that led me to mylife.com. There is a Jeff Stephenson with a friend called Avis Durgan registered on mylife.com. They both live in the right part of the US as well, so I'm fairly sure it was them.

Offline OmerMor

Re: SCI Decompiler?
« Reply #29 on: July 29, 2011, 06:14:15 AM »
So the only thing left to do is try to contact Avis as well. Maybe she is more socially involved than Jeff and checks her emails and messages more often.
Here is a facebook page for a woman called Avis Durgan: http://www.facebook.com/profile.php?id=100001115595747
I bet it's her. You should give it a try!

