To quote the ScummVM wiki:
There are no real SCI0 translations. This is because this interpreter does not cover special characters like accent grave, accent aigu, umlaut, etc. The SCI0 games that are translated are converted to the hybrid SCI01.
Well, they say that SCI0 translations don't exist for Sierra's original interpreter, but that doesn't mean it's impossible to create an SCI0 translation with a modified interpreter (i.e. ScummVM with small changes).
On the subject of using vocab.900 from QFG2: that's an SCI01 game, while SQ3 is SCI0. So compatibility issues would be expected in that case.
Why is that?
Looking at the ScummVM code (https://github.com/scummvm/scummvm/blob/master/engines/sci/parser/vocabulary.cpp), the only difference between SCI0 and SCI01 is in the file-loading phase (lines 143-164): SCI0 sets the 8th bit (0x80) of the last character to signal the end of a string, while SCI01 uses a regular C-style '\0' terminator.
Once the internal words dictionary has been populated, there is no further difference.
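To make the difference concrete, here is a minimal sketch of how a loader could read word entries in both formats. It is my reading of the ScummVM loader, not a copy of it: the function name parse_words and the sample bytes are made up for illustration, and the entry layout (one prefix-reuse byte, the word text, then 3 bytes packing class and group) is an assumption based on that code.

```python
def parse_words(data, start, sci01):
    """Decode word entries from raw vocab bytes, beginning at offset `start`.

    Assumed entry layout: 1 byte telling how many leading characters to keep
    from the previous word, the remaining characters, then 3 bytes that pack
    a 12-bit class and a 12-bit group.
    """
    words = []
    current = bytearray()  # previous word, reused for prefix compression
    pos = start
    while pos < len(data):
        keep = data[pos]
        pos += 1
        del current[keep:]  # keep only the shared prefix
        while pos < len(data):
            c = data[pos]
            pos += 1
            if sci01:
                if c == 0:  # SCI01: plain '\0' terminator
                    break
                current.append(c)
            else:
                current.append(c & 0x7F)
                if c & 0x80:  # SCI0: high bit marks the last character
                    break
        if pos + 3 > len(data):
            break  # truncated entry, stop parsing
        b0, b1, b2 = data[pos:pos + 3]
        pos += 3
        word_class = (b0 << 4) | (b1 >> 4)
        group = ((b1 & 0x0F) << 8) | b2
        words.append((bytes(current), word_class, group))
    return words


# Two synthetic entries: "look", then "lock" reusing the prefix "lo".
sci0_sample = bytes([0, ord("l"), ord("o"), ord("o"), ord("k") | 0x80, 0x01, 0x00, 0x05,
                     2, ord("c"), ord("k") | 0x80, 0x02, 0x00, 0x06])
sci01_sample = (bytes([0]) + b"look\0" + bytes([0x01, 0x00, 0x05])
                + bytes([2]) + b"ck\0" + bytes([0x02, 0x00, 0x06]))

print(parse_words(sci0_sample, 0, sci01=False))
print(parse_words(sci01_sample, 0, sci01=True))
# both print [(b'look', 16, 5), (b'lock', 32, 6)]
```

Both samples decode to the same list, which is the point: once the terminator is handled, the rest of the entry is identical.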
Indeed, the problem with QFG2 is that its words belong to different groups, and it would be a lot of work to change them to match SQ3's groups.
(I have also checked Sierra's German SQ3, but its vocab groups look more like QFG2's, so that's no help here.)
Therefore, I think it makes perfect sense to just convert the original SQ3 vocab to SCI01 format and use it with ScummVM.
I can write such a tool, but if something already exists, it will save me from reinventing the wheel...
If there isn't such a tool, I need a little help here ;-)
SCI0 vocab is documented at
http://sciwiki.sierrahelp.com//index.php?title=SCI_Specifications:_Chapter_6_-_SCI_in_action#Vocabulary_file_formats
I can see the difference between it and SCI01 in the ScummVM code.
But ScummVM ignores the initial pointer section.
Is there some documentation for that? (I want to make sure that my vocab will be loadable by SCICompanion/SCIStudio)
EDIT
===
It's working
Pasting the Python code here; it might help someone in the future:
# Sierra's SCI vocab format has an "old" version (used by vocab.0) with 7-bit ASCII,
# where the 8th bit marks the end of a string,
# while the "new" version (used by vocab.900) has 8-bit ASCII
#
# for the Hebrew translation, we need the vocab to be in the newer version
import pathlib

INPUT_FILE = r"C:\Zvika\Games\sq3ega.hebrew\sq3ega\vocab.000.orig"
OUTPUT_FILE1 = r"D:\ZVIKA\Sierra\Quest for Glory 2\VOCAB.900"
OUTPUT_FILE2 = r"C:\Zvika\Games\sq3ega.hebrew\sq3ega\VOCAB.000"

in_vocab = list(pathlib.Path(INPUT_FILE).read_bytes())
out_vocab = in_vocab[0:2]  # vocab signature

# vocab.900 starts with 255 16-bit pointers
# they aren't interesting...
out_vocab.extend([0] * (256*2))

# the 3 bytes that follow each word's text are copied unchanged
bytes_until_word_text = 0
for val in in_vocab[(26*2):]:
    if bytes_until_word_text == 0:
        if val < 0x80:
            # ordinary byte inside a word entry
            out_vocab.append(val)
        else:
            # last character of the word: clear the 8th bit, add a '\0' terminator
            out_vocab.append(val - 0x80)
            out_vocab.append(0)
            bytes_until_word_text = 3
    else:
        out_vocab.append(val)
        bytes_until_word_text -= 1

with open(OUTPUT_FILE1, "wb") as out_file:
    out_file.write(bytes(out_vocab))
with open(OUTPUT_FILE2, "wb") as out_file:
    out_file.write(bytes(out_vocab))
To be precise, you need to modify that vocab.900 in the QFG2 directory with SCICompanion, export it as a patch, rename it to vocab.000, and save it in the SQ3 directory.
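For anyone who wants to try the conversion loop without the game files, here is a small sketch that wraps the same state machine in a function and runs it on a few hand-made bytes. The name convert_entries and the sample data are hypothetical; it covers only the word entries, not the signature or the pointer table.

```python
def convert_entries(sci0_entries):
    """Rewrite SCI0-style word entries so that each word ends with '\\0'.

    Same logic as the loop in the script: a byte with the 8th bit set is the
    last character of a word; the 3 bytes after it are copied unchanged.
    """
    out = []
    bytes_until_word_text = 0
    for val in sci0_entries:
        if bytes_until_word_text == 0:
            if val < 0x80:
                out.append(val)
            else:
                out.append(val - 0x80)  # clear the end-of-string bit
                out.append(0)           # C-style terminator
                bytes_until_word_text = 3
        else:
            out.append(val)
            bytes_until_word_text -= 1
    return bytes(out)


# Synthetic entries: "look", then "lock" reusing the first 2 characters.
sample = bytes([0, ord("l"), ord("o"), ord("o"), ord("k") | 0x80, 0x01, 0x00, 0x05,
                2, ord("c"), ord("k") | 0x80, 0x02, 0x00, 0x06])
converted = convert_entries(sample)
print(converted)
# b'\x00look\x00\x01\x00\x05\x02ck\x00\x02\x00\x06'
```

Each word grows by exactly one byte (the added terminator), so the output length is a quick sanity check on a real file too.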