Topic: Should WORDS.TOK have even number of bytes?


ZvikaZ

Should WORDS.TOK have even number of bytes?
« on: May 02, 2021, 08:35:38 AM »
It seems that WinAGI expects WORDS.TOK to have an even number of bytes.
Is that on purpose?
If yes, please update the WORDS.TOK spec in WinAGI's help.
If no, please fix it...

This can be demonstrated by creating a new WORDS.TOK file containing only 'a', 'anyword' and 'rol', and opening it in a hex editor. It has an unexpected zero byte at the end. If that byte is removed and the file is loaded into WinAGI again, WinAGI complains that the file is illegal.
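
For anyone who would rather reproduce this in a script than in a hex editor, here is a rough Python sketch. It assumes the layout from the commonly circulated WORDS.TOK spec: a 52-byte table of big-endian letter offsets, then prefix-compressed entries whose characters are XORed with 0x7F, with 0x80 set on the last character, followed by a two-byte big-endian group number. The name `encode_words_tok` and the group numbers 0, 1 and 9999 are assumptions made for illustration; this is not WinAGI's code.

```python
def encode_words_tok(words):
    """Encode {word: group} into WORDS.TOK bytes (illustrative layout only).

    Assumes non-empty, lowercase ASCII words.
    """
    header = bytearray(52)  # 26 big-endian 16-bit offsets, one per letter
    body = bytearray()
    previous = ""
    for word in sorted(words):
        # The first word of each letter gets its offset stored in the header.
        if word[0] != previous[:1] and "a" <= word[0] <= "z":
            slot = 2 * (ord(word[0]) - ord("a"))
            header[slot:slot + 2] = (52 + len(body)).to_bytes(2, "big")
        # Count the leading characters shared with the previous word.
        shared = 0
        while (shared < min(len(word), len(previous))
               and word[shared] == previous[shared]):
            shared += 1
        encoded = bytearray(ord(c) ^ 0x7F for c in word[shared:])
        encoded[-1] |= 0x80  # flag the last character of the word
        body.append(shared)
        body += encoded
        body += words[word].to_bytes(2, "big")
        previous = word
    body.append(0)  # the trailing zero discussed in this thread
    return bytes(header + body)


data = encode_words_tok({"a": 0, "anyword": 1, "rol": 9999})
print(len(data), data[-1])  # 72 0 under this sketch's layout
```

Under these assumptions the three-word file is 72 bytes including the final zero, so stripping the zero leaves 71 bytes. That may be why this first looked like a parity problem.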



lskovlun

Re: Should WORDS.TOK have even number of bytes?
« Reply #1 on: May 02, 2021, 10:27:10 AM »
Some of the old Sierra games had WORDS.TOK with odd file sizes (KQ3 in particular). If WinAGI couldn't open them, we would have heard about it, right?

ZvikaZ

Re: Should WORDS.TOK have even number of bytes?
« Reply #2 on: May 02, 2021, 10:58:26 AM »
Yeah. You're right.
I just discovered that I was wrong - the point isn't parity.

However, there is still this additional mandatory zero.
Now I think that it always expects a trailing zero. But I'm not 100% convinced that I'm correct...

AGKorson

Re: Should WORDS.TOK have even number of bytes?
« Reply #3 on: May 03, 2021, 12:05:04 PM »
The trailing zero is required. That's how the word search function knows it's reached the end of the file. If you strip that off, the word search function will continue looking at data from the heap that immediately follows the words.tok file (the OBJECT file), thinking it's valid word data; eventually, a null value will be found, which AGI interprets as the end of words. In most cases, this will happen without the player ever even knowing, because there is almost always a zero value (0x00) byte in the OBJECT file's header. That byte will cause the word search to end.

Even in the unlikely event that you have more than 85 inventory items (which means the header WON'T have a zero value byte), it would be even more unlikely that the rest of the OBJECT file header and item data would contain the exact characters needed to create a false word match before a zero value is eventually encountered.
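
To make the run-off behaviour concrete, here is a small Python sketch of a scan that, like the search described above, has no idea how long the file is and stops only on a zero byte. It is a toy model under stated assumptions, not the interpreter's actual routine: the entry layout (prefix count, characters XORed with 0x7F, 0x80 on the last character, big-endian group number) is taken from the commonly circulated spec, `scan_letter_group` is a hypothetical name, and the "following heap" bytes are invented for the demo rather than real OBJECT data.

```python
def scan_letter_group(memory, start):
    """Decode entries from `start` until a zero prefix byte is seen.

    Deliberately performs no length check, to mimic a sentinel-driven search.
    Returns the decoded (word, group) pairs and the position where it stopped.
    """
    found = []
    current = ""
    pos = start
    first = True  # the first entry of a letter group has prefix 0
    while first or memory[pos] != 0:
        first = False
        prefix = memory[pos]
        pos += 1
        chars = []
        while True:
            byte = memory[pos]
            pos += 1
            chars.append(chr((byte & 0x7F) ^ 0x7F))
            if byte & 0x80:  # high bit marks the last character
                break
        current = current[:prefix] + "".join(chars)
        group = int.from_bytes(memory[pos:pos + 2], "big")
        pos += 2
        found.append((current, group))
    return found, pos


rol_entry = bytes([0x00, 0x0D, 0x10, 0x93, 0x27, 0x0F])         # "rol", group 9999
made_up_heap = bytes([0x01, 0x1E, 0x9E, 0x00, 0x05, 0x00])      # invented bytes

print(scan_letter_group(rol_entry + b"\x00" + made_up_heap, 0))  # stops at the terminator
print(scan_letter_group(rol_entry + made_up_heap, 0))            # runs on into the next data
```

With the terminator in place the scan returns only "rol". Without it, the same loop decodes the invented bytes as an extra word before it finally reaches a zero.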

I don't know the exact algorithms, but I would not be surprised if NAGI and ScummVM ignore the trailing null character; they most likely use array indices to keep track of the end of the word list (this is just speculation on my part though).
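
As an illustration of that index-based idea, here is a Python sketch of a loader that is bounded by the buffer length instead of a sentinel. This is not NAGI's or ScummVM's code; `load_words` is a hypothetical name, and the entry layout and the sample bytes (a three-word file with 'a', 'anyword' and 'rol', group numbers 0, 1 and 9999) are assumptions based on the commonly circulated spec.

```python
def load_words(data):
    """Decode every entry after the 52-byte header, bounded by len(data)."""
    words = {}
    current = ""
    pos = 52
    # An entry needs at least 4 bytes: prefix, one character, two group bytes.
    while pos + 4 <= len(data):
        prefix = data[pos]
        pos += 1
        # Find the last character of this entry (the one with the high bit set).
        end = pos
        while end < len(data) and not (data[end] & 0x80):
            end += 1
        if end + 3 > len(data):  # truncated entry: stop instead of reading past the end
            break
        text = "".join(chr((b & 0x7F) ^ 0x7F) for b in data[pos:end + 1])
        current = current[:prefix] + text
        words[current] = int.from_bytes(data[end + 1:end + 3], "big")
        pos = end + 3
    return words


header = bytearray(52)
header[0:2] = b"\x00\x34"    # offset of the 'a' words
header[34:36] = b"\x00\x41"  # offset of the 'r' words
sample = bytes(header) + bytes.fromhex("009e0000 01110608100d9b0001 000d1093270f 00")

print(load_words(sample))       # with the trailing zero
print(load_words(sample[:-1]))  # without it
```

Because the loop is driven by `len(data)`, both calls decode the same three words, which is the kind of tolerance speculated about above.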

So, technically, you could have a WORDS.TOK file without the null character at the end, and it would work 99.99999% of the time. But it's better to have that ending character, so WinAGI enforces it.

ZvikaZ

Re: Should WORDS.TOK have even number of bytes?
« Reply #4 on: May 03, 2021, 02:38:14 PM »
Thanks for the detailed explanation, you've convinced me ;)
Can you add that trailing zero to WinAGI help's description of WORDS.TOK format?

And does anyone here have write permission to http://agi.sierrahelp.com/Documentation/Specifications/8-2-WORDS_TOK.html who can update it there?
Or is there a more up-to-date copy somewhere?

