SCI Programming => SCI Development Tools => Topic started by: ZvikaZ on January 23, 2022, 01:30:43 PM

Title: Is it possible to know selector's type?
Post by: ZvikaZ on January 23, 2022, 01:30:43 PM
Hi.
Is it possible to know the selector's type?
For example, in SQ1VGA, room 103, SCICompanion has decompiled all `lookStr` as strings. How does it know that it's a string?

Is it guessing?
If it's guessing because the selector's value matches a string address in the file - it might confuse a random value that just happens to match some address - does it avoid that?

Or maybe it has some list of string selectors?

Or something else?
Title: Re: Is it possible to know selector's type?
Post by: Kawa on January 23, 2022, 05:44:13 PM
The script resources are subdivided into a few different blocks, such as local variable storage (including initial values), class and instance definitions, actual script code, a pool of string literals, a pool of said specs, and probably one or two other types I forgot.

The decompiler can tell that the lookStr value is a pointer because it's a fairly high value (same way Print can tell a string literal from a text resource tuple), and it can tell that it's a string because said value matches a location in that resource's string literal pool. By the same token, it can recognize said specs. If a value is sufficiently high but matches neither of those, it's probably an integer constant.

In the later versions with the split SCR/HEP resources, all of those things went into the HEP resources except for the actual script code, which went in the SCR resources, while the said spec block was removed.

Certain later versions of Sierra's compiler allowed lackluster type tagging, where a selector was tagged as an int or an id. I'm not sure how much the compiler actually cared.



And you're right to think it might mistake a value for a string! I'm not sure what would happen and would like to find out. I'll get back to you.

Test results
(= test $02BD), where $02BD is also a string literal's location, doesn't break much, because that would emit an LDI opcode (Load Immediate), while assigning a string pointer would use LOFSS (Load OFfSet to Stack), and the decompiler knows what to do here.
However, as I expected, changing one of the test room's cardinal exits to $02BD does confuse the decompiler into thinking it's a string.
HOWEVER HOWEVER, changing it to $02BE, just one byte off, made it an integer again! Apparently the decompiler is smart enough to check if it's the very start of a string. So no such risk.
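
To make the observed behavior concrete, here is a minimal Python sketch of a "matches the very start of a string" check. This is not SCI Companion's actual code: the function names, the sample pool contents, and the pool's base offset are all made up for illustration, and it assumes the pool is simply a run of NUL-terminated strings.

```python
def string_starts(pool: bytes, pool_base: int) -> set[int]:
    """Collect the offsets at which each NUL-terminated string begins.

    pool_base is the (hypothetical) offset of the pool within the script.
    """
    starts = set()
    pos = 0
    while pos < len(pool):
        starts.add(pool_base + pos)
        end = pool.find(b"\0", pos)
        if end == -1:
            break  # unterminated trailing string
        pos = end + 1
    return starts


def classify(value: int, starts: set[int]) -> str:
    """Guess a property's type purely from its value (the heuristic at issue)."""
    return "string" if value in starts else "integer"


# Hypothetical pool placed so that its first string begins at $02BD.
pool = b"A door.\0A wall.\0"
starts = string_starts(pool, 0x02BD)

print(classify(0x02BD, starts))  # string: exact start of a literal
print(classify(0x02BE, starts))  # integer: one byte into the literal
```

With a check like this, only values that land exactly on the first byte of a literal get misread, which is consistent with the $02BD versus $02BE result above.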
Title: Re: Is it possible to know selector's type?
Post by: ZvikaZ on January 24, 2022, 09:41:56 AM
Thanks for the detailed answer and the experiment.
However, the conclusion (IMO) is that there is such a risk, only its probability is low. But I wouldn't call it very low. Considering there are, what, maybe (just guessing) 20 problematic values out of 64K, that's roughly 1 in 3000. In a game with 100 scripts, that's about a 1 in 30 chance of encountering such a problem.
Title: Re: Is it possible to know selector's type?
Post by: Kawa on January 24, 2022, 07:35:36 PM
Twenty very specific problematic values out of 64k, assuming there are twenty string literals in the string pool for the script resource currently being decompiled (noting that it only decompiles one script at a time), which includes class names.

I don't math well, but according to this percentage calculator I found, 20 out of 65535 is only 0.03%.
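
For what it's worth, the two estimates in this exchange can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch under loud assumptions: 20 string literals per script, exactly one unrelated integer property per script that could collide, and that integer being uniformly random over the 16-bit range. Real property values are not uniformly random, so treat the numbers as rough.

```python
string_literals = 20      # assumed number of string starts in one script
value_space = 1 << 16     # number of distinct 16-bit values (65536)
scripts = 100             # assumed number of scripts in the game

# Chance that one random 16-bit value hits the start of some literal.
p_single = string_literals / value_space

# Chance of at least one such hit across all scripts,
# assuming one independent candidate value per script.
p_any = 1 - (1 - p_single) ** scripts

print(f"per value:  {p_single:.4%}")
print(f"per game:   {p_any:.2%}")
```

Under those assumptions the per-value figure comes out at roughly 0.03% and the per-game figure at roughly 3%, so both posts are describing the same arithmetic at different scales.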
Title: Re: Is it possible to know selector's type?
Post by: lskovlun on January 25, 2022, 02:33:34 AM
Quote from: ZvikaZ on January 23, 2022, 01:30:43 PM
"For example, in SQ1VGA, room 103, SCICompanion has decompiled all `lookStr` as strings. How does it know that it's a string?"
Relocations. If that particular word of the script file is relocated, then it's a pointer. In fact, there was a bug about it.
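
As a rough illustration of the idea, here is a small Python sketch. It does not parse a real SCI relocation table; the offsets and values are invented, and it simply assumes the set of relocated word offsets has already been read out of the script resource.

```python
def is_pointer(word_offset: int, relocated_offsets: set[int]) -> bool:
    """A word is a pointer if its own location is listed as relocated."""
    return word_offset in relocated_offsets


# Hypothetical data: offsets of words the relocation table says to fix up.
relocated = {0x0040, 0x0052}

# Two property slots holding the same value, $02BD. Only the slot at
# $0040 is relocated, so only that one is a pointer.
properties = {0x0040: 0x02BD, 0x0046: 0x02BD}

kinds = {
    offset: "pointer" if is_pointer(offset, relocated) else "integer"
    for offset in properties
}
for offset, kind in kinds.items():
    print(f"${offset:04X} = ${properties[offset]:04X} -> {kind}")
```

The point of the sketch is that the decision depends on where the word lives, not on what value it holds, so two identical values can be told apart.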
Title: Re: Is it possible to know selector's type?
Post by: lskovlun on January 25, 2022, 03:28:04 AM
For reference, the bug was mentioned in thread
https://sciprogramming.com/community/index.php?topic=1601.15
and fixed in commit
https://github.com/Kawa-oneechan/SCICompanion/commit/094b0fb73f4f5809d57da274c8ac06a2570bb327

The reason that bug happened is that SCI Companion didn't use relocations for this. The fix doesn't either.