Author Topic: LEA instruction (Read 12714 times)

lance.ewing · « **on:** April 21, 2015, 02:30:43 PM »

The following page is missing the number of bytes for the two LEA opcodes:

http://wiki.scummvm.org/index.php/SCI/Specifications/SCI_virtual_machine/The_Sierra_PMachine#The_instructions

It appears to suggest that both operands support a B and W:

op 0x5a: lea W type, W index ( bytes)
op 0x5b: lea B type, B index ( bytes)

...but that doesn't make much sense to me. The first operand would never need to be 16 bits.

I'm guessing Lars or Phil would know the answer to this?

troflip · « **Reply #1 on:** April 21, 2015, 02:55:18 PM »

Looks like it's either 3 bytes or 5 bytes (For the "wide" version). The "wide" version does indeed use 2 bytes for the first operand. Maybe just an oversight by Sierra?
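As a rough illustration of where the 3 and 5 come from, here is a minimal sketch. The helper name `leaSize` is made up for this example; it assumes, as the wiki listing suggests, that the low bit of the opcode byte selects byte-sized (0x5b) versus word-sized (0x5a) operands, and that both operands follow that width.

```cpp
#include <cstddef>

// Hypothetical helper, not taken from ScummVM or any SCI tool.
// Assumption: low bit set = byte-sized operands, low bit clear = word-sized.
constexpr std::size_t leaSize(unsigned char opcodeByte) {
    // 1 opcode byte + "type" operand + "index" operand, both at the same width
    return 1 + 2 * ((opcodeByte & 1) ? 1 : 2);
}

static_assert(leaSize(0x5b) == 3, "byte form: 1 + 1 + 1");
static_assert(leaSize(0x5a) == 5, "wide form: 1 + 2 + 2");
```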

Also, the description for lofsa and lofss is incomplete. SCI0 uses a relative offset, as described. But SCI1.1 uses an absolute offset (i.e. the value is used directly, it isn't added to the pc). And "in between" games use either... it depends on the interpreter version. As far as I know that's the only opcode that changed behavior in different SCI versions.
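The two lofsa/lofss behaviours can be sketched roughly as follows. `LofsMode` and `resolveLofs` are hypothetical names, and the sketch assumes the relative form adds the operand to the pc that follows the instruction, using 16-bit arithmetic. How a real interpreter decides which mode applies is not shown.

```cpp
#include <cstdint>

// Hypothetical names for illustration only.
enum class LofsMode { Relative, Absolute };

// pcAfter: assumed to be the address of the byte following the instruction.
constexpr std::uint16_t resolveLofs(LofsMode mode, std::uint16_t pcAfter,
                                    std::uint16_t operand) {
    return mode == LofsMode::Relative
               ? static_cast<std::uint16_t>(pcAfter + operand) // SCI0 style
               : operand;                                      // SCI1.1 style
}
```

With the same operand 0x0020 and a pc of 0x0103, the relative form yields 0x0123 while the absolute form yields 0x0020, which is why a tool that guesses the wrong mode ends up pointing at unrelated data.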

[edit for a 2nd time:] While I'm at this, the ugt?, uge?, ult? and ule? opcodes also set the prev register in the same way that the eq?, ne?, etc... do.

lance.ewing · « **Reply #2 on:** April 22, 2015, 06:00:37 PM »

Yeah, I guess it must have been an oversight by Sierra. It seems like a waste of one byte. I notice that the bit usage for bits 1, 2 and 4 is the same as for those bits in the 0x80-0xFF opcodes, which probably isn't a coincidence.

Has anyone spent any time attempting to guess what instruction sets the various SCI opcodes might have been borrowed from or been influenced by?

What about the opcodes that apparently don't exist? I'm guessing that those slots were used for something at some point in time, probably before SCI was ever used in a game. I guess we'll never know.

lskovlun · « **Reply #3 on:** April 22, 2015, 07:48:24 PM »

The mod opcode changed as well in late SCI0 (with respect to the behavior with negative numbers).
There were other changes of less consequence (early SCI0 treated the &rest opcode slightly differently).
From SCI2 on, some games contain line number information, which might be useful for a decompiler. This is implemented with special opcodes.
In SCI3, the object model is changed so that standard selectors like -info-, superClass and species no longer exist as such, but are implemented with opcodes. I have no idea whether they changed in source form or not.

troflip · « **Reply #4 on:** April 22, 2015, 08:14:00 PM »

Quote from: lance.ewing on April 22, 2015, 06:00:37 PM

Yeah, I guess it must have been an oversight by Sierra. It seems like a waste of one byte.

It does, but I don't think they were concerned with a few bytes here and there, given what I've seen (for example, unused local variables weren't even optimized out by Sierra's compiler - 2 bytes apiece. I've even seen a 50 byte unused local array in a script). And LEA isn't a very common opcode to use (it corresponds to the @ pointer syntax).

Quote from: lance.ewing on April 22, 2015, 06:00:37 PM

Has anyone spent any time attempting to guess what instruction sets the various SCI opcodes might have been borrowed from or been influenced by?

I found a copy of the old "Smalltalk 80" book (from 1989). It has a chapter that describes the bytecodes processed by the smalltalk interpreter. It's very similar to SCI. For instance, there is a send bytecode that specifies the number of arguments being sent, and the arguments are pushed onto the stack prior to the send. It doesn't look like there is an accumulator though, just a stack.
http://www.mirandabanda.org/bluebook/bluebook_chapter28.html

So I suspect it's mainly borrowed from that, maybe mixed with a little more traditional stuff from processors at the time (e.g. the accumulator and prev registers).

Quote from: lance.ewing on April 22, 2015, 06:00:37 PM

What about the opcodes that apparently don't exist? I'm guessing that those slots were used for something at some point in time, probably before SCI was ever used in a game. I guess we'll never know.

I mean, there's room for 128 opcodes. I suspect they just didn't need that many, so some of them are "blank".

lance.ewing · « **Reply #5 on:** April 23, 2015, 12:34:12 PM »

Yeah, I was assuming that that was the case in the last couple of unused opcode numbers, but there are gaps earlier on in the sequence, suggesting that there may have been something in those slots at some point.

troflip · « **Reply #6 on:** May 04, 2015, 01:03:49 PM »

calle is also wrong on the ScummVM wiki page. The wide version is 6 bytes, not 5 (yes, 2 bytes are used for the dispindex, even though it seems unlikely you'd ever have more than 256 exports from a script).
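A small sketch of how the 6 adds up. The operand layout used here (script number and dispindex both width-dependent, followed by a one-byte frame size) is an assumption that goes beyond what is stated in this thread, and `Operand` and `instructionSize` are hypothetical names.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical operand kinds: FixedByte is always 1 byte,
// Sized is 1 byte in the byte form and 2 bytes in the wide form.
enum class Operand { FixedByte, Sized };

inline std::size_t instructionSize(std::initializer_list<Operand> operands,
                                   bool wide) {
    std::size_t size = 1; // the opcode byte itself
    for (Operand op : operands) {
        size += (op == Operand::Sized && wide) ? 2 : 1;
    }
    return size;
}
```

Under that assumed layout, `instructionSize({Operand::Sized, Operand::Sized, Operand::FixedByte}, true)` gives 1 + 2 + 2 + 1 = 6 for the wide calle, and 4 for the byte form.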

lance.ewing · « **Reply #7 on:** May 04, 2015, 04:36:22 PM »

Yeah, you're right about calle. I hadn't noticed that one.

You've reminded me that I meant to continue the discussions about the unused opcodes. Ignoring 0x7e and 0x7f, the others are as follows:

op 0x4c
op 0x4d
op 0x4e
op 0x4f
These opcodes don't exist in SCI.

op 0x52
op 0x53
These opcodes don't exist in SCI.

op 0x5e
op 0x5f
These opcodes don't exist in SCI.

In the case of 0x5e, 0x5f, 0x4c, 0x4d, 0x4e, and 0x4f, they may have deliberately left a gap for expansion, e.g. for future instructions to be grouped with those that they are similar to. I say this because the used opcodes recommence on 0x50 in one case and 0x60 in another.

The gap that we can't say this about is 0x52 and 0x53. Surely there must have been something in this slot at some point. There seems no reason to leave that particular gap.

I'm not entirely convinced that the 0x4c to 0x4f gap is for future expansion either. The self and super instructions at 0x54 to 0x57 are quite similar to the send instruction at 0x4a/0x4b. If the gap is to allow for similar future instructions, it seems strange for the gap to be in the middle of the similar instructions. It feels a lot more like the gap is a result of redundant opcodes being removed. And if this was the case, I wonder what they might have been?
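Since the byte values above come in even/odd pairs, it may help to look at them as 7-bit opcode numbers plus a width flag. The sketch below assumes that split (number in the upper 7 bits, flag in bit 0); `DecodedOpcode` and `splitOpcodeByte` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical decoding of one opcode byte.
struct DecodedOpcode {
    std::uint8_t number; // 7-bit opcode number, 0..127
    bool byteOperands;   // bit 0 of the opcode byte
};

constexpr DecodedOpcode splitOpcodeByte(std::uint8_t opcodeByte) {
    return DecodedOpcode{static_cast<std::uint8_t>(opcodeByte >> 1),
                         (opcodeByte & 1) != 0};
}
```

Seen this way, the eight unused byte values collapse to four unused opcode numbers: 0x4c/0x4d map to 0x26, 0x4e/0x4f to 0x27, 0x52/0x53 to 0x29, and 0x5e/0x5f to 0x2f.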

Collector · « **Reply #8 on:** May 04, 2015, 09:20:14 PM »

Could I ask that if you find anything wrong with the specifications would you make corrections to the Wiki?

troflip · « **Reply #9 on:** May 07, 2015, 09:13:42 PM »

Who is able to update that wiki? Anyone?

I've written the code to generate SCI1.1 script resources, and I'm trying to test it - unsuccessfully. I'm basically just recompiling an unmodified script 100 (the title screen) from SQ5.

Sierra's interpreter crashed with the script, but with ScummVM's debugger I was able to figure out the cause. It turns out there is another error on the pmachine page. The pushSelf opcode must not have the low bit set (at least in SCI1). After making changes to always output the "word" version of opcodes that have no operands (such as pushSelf, or ret), ScummVM happily ran the game. And then I poked around in the ScummVM source code and found this:

Code:

	// Special handling of the op_line opcode
	if (opcode == op_pushSelf) {
		// Compensate for a bug in non-Sierra compilers, which seem to generate
		// pushSelf instructions with the low bit set. This makes the following
		// heuristic fail and leads to endless loops and crashes. Our
		// interpretation of this seems correct, as other SCI tools, like for
		// example SCI Viewer, have issues with these scripts (e.g. script 999
		// in Circus Quest). Fixes bug #3038686.
		if (!(extOpcode & 1) || g_sci->getGameId() == GID_FANMADE) {
			// op_pushSelf: no adjustment necessary
		} else {
			// Debug opcode op_file, skip null-terminated string (file name)
			while (src[offset++]) {}
		}
	}

So yeah, the "low bit" version of pushSelf is actually a different opcode.

Unfortunately even with my fix, Sierra's interpreter crashes...

[edit:]
Ok, I found the problem pretty quickly. It looks like any opcode that doesn't have variable-sized operands must not have the low bit set. So it's not just pushSelf that's weird; other opcodes like send also need the low bit clear (e.g. 0x4a and not 0x4b for send).

I wonder if those opcodes have any use at all, or do they just make the interpreter crash?
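The rule a compiler back end would need can be sketched like this. It is only an illustration of the finding above, with hypothetical names (`encodeOpcodeByte`, `hasSizedOperands`): the flag bit that separates 0x4a from 0x4b (bit 0 of the opcode byte) is only ever set when the instruction actually has operands that come in two widths.

```cpp
#include <cstdint>

// Hypothetical encoder helper, not taken from any existing compiler.
// number:           7-bit opcode number (opcode byte >> 1)
// hasSizedOperands: true if some operand exists in both byte and word widths
// useByteOperands:  true if the byte-sized form is wanted
constexpr std::uint8_t encodeOpcodeByte(std::uint8_t number,
                                        bool hasSizedOperands,
                                        bool useByteOperands) {
    return static_cast<std::uint8_t>(
        (number << 1) | ((hasSizedOperands && useByteOperands) ? 1 : 0));
}
```

For send (opcode number 0x25, treated here as having no width-dependent operand) this always produces 0x4a, while lea (0x2d) can come out as either 0x5a or 0x5b.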

Collector · « **Reply #10 on:** May 07, 2015, 11:33:02 PM »

Anyone that has an account on the Wiki can add or edit it. You should also have permissions to upload files. That was one of the points of having our own Wiki: only the ScummVM team has access to edit the ScummVM Wiki.

SCIprogramming.com
