Apparently there is a preferred, "correct" way of nesting ANDs and ORs.
The "correct" form: ( ( (...) AND (...) ) AND (...) ) AND (...) [I will call it the 'lefty' form]
The "wrong" and potentially bugged nestings:
(...) AND ( (...) AND ( (...) AND (...) ) )
(...) AND ( ( (...) AND (...) ) AND (...) )
( (...) AND (...) ) AND ( (...) AND (...) )
etc.
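Since AND is associative, any chain of ANDs can be rewritten into the 'lefty' form without changing its meaning. A minimal Python sketch of that rewrite follows. The helper name `lefty` is hypothetical and it only produces the script text, it does not call the compiler.

```python
from functools import reduce

def lefty(conditions, op="AND"):
    """Join conditions into the left-nested ('lefty') form.

    Hypothetical helper: it only builds the text of the expression.
    """
    if not conditions:
        raise ValueError("need at least one condition")
    wrapped = [f"( {c} )" for c in conditions]
    # Fold from the left: ( ( a OP b ) OP c ) OP d ...
    return reduce(lambda acc, c: f"( {acc} {op} {c} )", wrapped[1:], wrapped[0])

print("IF " + lefty(["true17 = 255", "true16 = 255", "true15 = 255", "true14 = 255"]))
```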
Mixing ANDs and ORs:
100% wrong: ( #1 ) OR ( ( #2 ) AND ( #3 ) )
potentially working: ( ( #2 ) AND ( #3 ) ) OR ( #1 )
You may ask why. To spoil some of the conclusions:
- I compiled some scripts with "wrong" nestings and noticed that the bracket order of the ANDs and ORs is not respected. Only when compiling scripts in the 'lefty' form did I find a reasonable pattern for the IF_JUMP command.
- Billy Thompson, in his mission scripts for Industrial District, always used the 'lefty' form when nesting ANDs. There may be other reasons for that (it is easier to count the brackets), but I suspect he knew that the compiler is buggy when nesting in other ways.
To answer this question properly, I studied the hex data of IFs, WHILEs, ANDs, ORs, ELSEs and NOTs by compiling many simple scripts and looking at the compiled files in a hex editor.
First, we need to know what the IF_JUMP command is in T.M.'s decompiler.
The possible structure of IF_JUMP in hex data:
EDIT: the content below about IF_JUMP is outdated and does not describe the real behavior, especially if the IF_JUMP is in a WHILE_EXEC block. For my most accurate theory about IF_JUMP, see my last post in this thread: viewtopic.php?p=13711#p13711
1st form: (cpi) 62 00 (npi) (AND) (OR) (ENDIF index)
2nd form: (cpi) 62 00 (aci) (AND: 00 00) (OR: 00 00) or else goto (eci)
where
cpi: current point index
npi: next point index -> the index that is always entered after the command containing this 'npi' finishes
aci: access command index if receives TRUE
eci: exit command index if receives FALSE
The first form occurs when at least one of (AND) or (OR) is non-zero.
If (AND) and (OR) are both 00 00, the second form of IF_JUMP always occurs, and the last index in this case is the exit index (eci).
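The two forms can be told apart mechanically. Here is a minimal Python sketch, assuming each IF_JUMP record is six little-endian 16-bit words as in the layout above. The function name and the dictionary keys are hypothetical, and the interpretation follows the theory in this post, not confirmed gta2.exe behavior.

```python
import struct

IF_JUMP = 0x0062  # bytes "62 00" read as a little-endian 16-bit word

def decode_if_jump(hex_text):
    """Decode one 12-byte IF_JUMP record given as a hex string."""
    cpi, opcode, target, and_flag, or_flag, last = struct.unpack(
        "<6H", bytes.fromhex(hex_text)
    )
    if opcode != IF_JUMP:
        raise ValueError("not an IF_JUMP record")
    if and_flag or or_flag:
        # 1st form: links one boolean check to the next one
        return {"form": 1, "cpi": cpi, "npi": target,
                "and": bool(and_flag), "or": bool(or_flag), "endif": last}
    # 2nd form: enter the block (aci) or leave it (eci)
    return {"form": 2, "cpi": cpi, "aci": target, "eci": last}

print(decode_if_jump("18 00 62 00 17 00 01 00 00 00 20 00"))
print(decode_if_jump("20 00 62 00 1D 00 00 00 00 00 21 00"))
```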
Example of an IF command (don't criticize its senselessness; it was created just to be decompiled):
Code:
DO_NOWT
IF ( ( ( ( true17 = 255 ) AND ( true16 = 255 ) ) AND ( true15 = 255 ) ) AND ( true14 = 255 ) )
CHANGE_POLICE_LEVEL (4)
CHANGE_POLICE_LEVEL (6)
ENDIF
CHANGE_POLICE_LEVEL (3)
LEVELEND
15 00 01 01 16 00 00 00 (cpi: 15 00) DO_NOWT (npi: 16 00)
16 00 5E 00 18 00 00 00 12 00 FF 00 (cpi: 16 00) S_EQUAL_I (npi: 18 00) (WHILE_EXEC: No) (counter index: 12 00) (check value in Hex: FF 00)
17 00 5E 00 1A 00 00 00 11 00 FF 00 (cpi: 17 00) S_EQUAL_I (npi: 1A 00) (WHILE_EXEC: No) (counter index: 11 00) (check value in Hex: FF 00)
18 00 62 00 17 00 01 00 00 00 20 00 (cpi: 18 00) IF_JUMP (npi: 17 00) (AND: Yes) (OR: No) (ENDIF index 20 00)
19 00 5E 00 1C 00 00 00 10 00 FF 00 (cpi: 19 00) S_EQUAL_I (npi: 1C 00) (WHILE_EXEC: No) (counter index: 10 00) (check value in Hex: FF 00)
1A 00 62 00 19 00 01 00 00 00 20 00 (cpi: 1A 00) IF_JUMP (npi: 19 00) (AND: Yes) (OR: No) (ENDIF index 20 00)
1B 00 5E 00 20 00 00 00 0F 00 FF 00 (cpi: 1B 00) S_EQUAL_I (npi: 20 00) (WHILE_EXEC: No) (counter index: 0F 00) (check value in Hex: FF 00)
1C 00 62 00 1B 00 01 00 00 00 20 00 (cpi: 1C 00) IF_JUMP (npi: 1B 00) (AND: Yes) (OR: No) (ENDIF index 20 00)
1D 00 A2 01 1E 00 00 00 00 00 04 00 (cpi: 1D 00) CHANGE_POLICE_LEVEL (npi: 1E 00) (WHILE_EXEC: No) (??) (argument: 4)
1E 00 A2 01 21 00 00 00 40 CA 06 00 (cpi: 1E 00) CHANGE_POLICE_LEVEL (npi: 21 00) (WHILE_EXEC: No) (??) (argument: 6)
20 00 62 00 1D 00 00 00 00 00 21 00 (cpi: 20 00) IF_JUMP (aci: 1D 00) (AND: No) (OR: No) or else (eci: 21 00)
21 00 A2 01 22 00 00 00 B0 C6 03 00 (cpi: 21 00) CHANGE_POLICE_LEVEL (npi: 22 00) (WHILE_EXEC: No) (??) (argument: 3)
22 00 3C 00 FF FF (cpi: 22 00) LEVELEND (Finish)
Note: CHANGE_POLICE_LEVEL = A2 01. I'm not sure about the WHILE_EXEC field, but it will be 01 00 if the command is inside a WHILE_EXEC.
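To work with dumps like the one above, each annotated line can be split into 16-bit words. This is a minimal Python sketch, assuming the line format used in this post (hex bytes first, annotations starting at the first bracket). The opcode table only contains the values seen in this post, and `parse_dump_line` is a hypothetical helper name.

```python
# Opcode values as they appear in this post, read as little-endian words
OPCODES = {
    0x0101: "DO_NOWT",
    0x005E: "S_EQUAL_I",
    0x0062: "IF_JUMP",
    0x01A2: "CHANGE_POLICE_LEVEL",
    0x004D: "GOTO",
    0x0047: "NOT",
    0x003C: "LEVELEND",
}

def parse_dump_line(line):
    """Split one annotated dump line into cpi, command name, npi and the rest."""
    raw = bytes.fromhex(line.split("(", 1)[0].strip())
    if len(raw) % 2 or len(raw) < 6:
        raise ValueError("expected at least three 16-bit words")
    words = [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]
    return {
        "cpi": words[0],
        "name": OPCODES.get(words[1], "UNKNOWN"),
        "npi": words[2],
        "rest": words[3:],
    }

print(parse_dump_line("16 00 5E 00 18 00 00 00 12 00 FF 00 (cpi: 16 00) S_EQUAL_I"))
```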
- How can you understand that IF block from the hex data?
(I will put brackets around some of the 2-byte values so they are easier to read)
In the case of the comparison operator ( counter = [int] ), the block simply starts with it:
[16 00] 5E 00 [18 00] 00 00 [12 00] FF 00 (cpi: 16 00) S_EQUAL_I (npi: 18 00) (WHILE_EXEC: No) (counter index: 12 00) (check value in Hex: FF 00)
Nothing in it indicates that there is an IF or WHILE; it simply appears there.
IMPORTANT: In the first nesting of ANDs, the 'npi' apparently ALWAYS points two indices after the current index (16 00 -> 18 00), and that index SHOULD be an IF_JUMP, which is
[18 00] 62 00 [17 00] 01 00 [00 00] 20 00 (cpi: 18 00) IF_JUMP (npi: 17 00) (AND: Yes) (OR: No) (ENDIF index 20 00)
Now the 'npi' links the previous boolean check to the next one, which is at index 17 00:
[17 00] 5E 00 [1A 00] 00 00 [11 00] FF 00 (cpi: 17 00) S_EQUAL_I (npi: 1A 00) (WHILE_EXEC: No) (counter index: 11 00) (check value in Hex: FF 00)
Now I think gta2.exe calculates the boolean value of ( true17 = 255 ) AND ( true16 = 255 ) and stores it.
The new 'npi' can now point either to the ENDIF index or to another IF_JUMP, which is the case here because there is another boolean check to do:
[1A 00] 62 00 [19 00] 01 00 [00 00] 20 00 (cpi: 1A 00) IF_JUMP (npi: 19 00) (AND: Yes) (OR: No) (ENDIF index 20 00)
The 'npi' must point to the boolean check ( true15 = 255 ), which in this case is
[19 00] 5E 00 [1C 00] 00 00 [10 00] FF 00 (cpi: 19 00) S_EQUAL_I (npi: 1C 00) (WHILE_EXEC: No) (counter index: 10 00) (check value in Hex: FF 00)
The next commands are (in the order they are executed)
[1C 00] 62 00 [1B 00] 01 00 [00 00] 20 00 (cpi: 1C 00) IF_JUMP (npi: 1B 00) (AND: Yes) (OR: No) (ENDIF index 20 00)
[1B 00] 5E 00 [20 00] 00 00 [0F 00] FF 00 (cpi: 1B 00) S_EQUAL_I (npi: 20 00) (WHILE_EXEC: No) (counter index: 0F 00) (check value in Hex: FF 00)
Now the 'npi' points to an IF_JUMP in which AND = 0 and OR = 0, so this is the ENDIF:
[20 00] 62 00 [1D 00] 00 00 [00 00] 21 00 (cpi: 20 00) IF_JUMP (aci: 1D 00) (AND: No) (OR: No) or else (eci: 21 00)
This means that the nesting of ANDs has ended, and gta2.exe will decide whether to enter the IF block or skip it.
If the returning boolean is TRUE, npi = aci = 1D 00.
If the returning boolean is FALSE, npi = eci = 21 00.
Index 1D 00 is the first command in this IF block; index 21 00 is the first command outside it.
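The walk through the condition above can be reproduced with a few lines of Python. This is only a sketch of the theory in this post: it follows each 'npi' until it reaches an IF_JUMP with AND = 0 and OR = 0. The record table is typed in by hand from the example, and the tuple layout and the name `condition_path` are hypothetical.

```python
# Hand-copied from the example dump: index -> record
# S_EQUAL_I: (name, npi); IF_JUMP: (name, target, AND, OR, last index)
RECORDS = {
    0x16: ("S_EQUAL_I", 0x18),
    0x17: ("S_EQUAL_I", 0x1A),
    0x18: ("IF_JUMP", 0x17, 1, 0, 0x20),
    0x19: ("S_EQUAL_I", 0x1C),
    0x1A: ("IF_JUMP", 0x19, 1, 0, 0x20),
    0x1B: ("S_EQUAL_I", 0x20),
    0x1C: ("IF_JUMP", 0x1B, 1, 0, 0x20),
    0x20: ("IF_JUMP", 0x1D, 0, 0, 0x21),
}

def condition_path(records, start):
    """Follow npi links from 'start'; return (visited indices, aci, eci)."""
    path, idx = [], start
    while idx not in path:  # guard against malformed loops
        path.append(idx)
        rec = records[idx]
        if rec[0] == "IF_JUMP" and rec[2] == 0 and rec[3] == 0:
            return path, rec[1], rec[4]  # closing IF_JUMP reached
        idx = rec[1]
    raise ValueError("npi links form a loop")

path, aci, eci = condition_path(RECORDS, 0x16)
print(" -> ".join(f"{i:02X} 00" for i in path))
```

Under these assumptions the printed order is the same one described step by step above, starting at 16 00 and ending at the closing IF_JUMP at 20 00.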
- But how can I know whether there is an ELSE in the IF block?
The easy way to check is to look at the 'eci' index in the ENDIF IF_JUMP. If 'eci' is higher than 'cpi', there is no ELSE in the IF block.
- Example of an IF command with ELSE:
Code:
IF ( true17 = 255 )
CHANGE_POLICE_LEVEL (4)
ELSE
CHANGE_POLICE_LEVEL (6)
ENDIF
CHANGE_POLICE_LEVEL (3)
LEVELEND
In hex data:
15 00 5E 00 19 00 00 00 12 00 FF 00 (cpi: 15 00) S_EQUAL_I (npi: 19 00) (WHILE_EXEC: No) (counter index: 12 00) (check value in Hex: FF 00)
16 00 A2 01 1A 00 00 00 C0 DD 04 00 (cpi: 16 00) CHANGE_POLICE_LEVEL (npi: 1A 00) (WHILE_EXEC: No) (??) (argument: 4)
17 00 A2 01 1A 00 00 00 04 D4 06 00 (cpi: 17 00) CHANGE_POLICE_LEVEL (npi: 1A 00) (WHILE_EXEC: No) (??) (argument: 6)
19 00 62 00 16 00 00 00 00 00 17 00 (cpi: 19 00) IF_JUMP (aci: 16 00) (AND: No) (OR: No) or else (eci: 17 00)
1A 00 A2 01 1B 00 00 00 FE FF 03 00 (cpi: 1A 00) CHANGE_POLICE_LEVEL (npi: 1B 00) (WHILE_EXEC: No) (??) (argument: 3)
1B 00 3C 00 FF FF (cpi: 1B 00) LEVELEND (Finish)
As you can see, 'eci' < 'cpi' in the ENDIF IF_JUMP, and it points exactly to where the ELSE is: after command 16 00 and before command 17 00.
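The ELSE check is a single comparison on the closing IF_JUMP. A minimal Python sketch, assuming the 12-byte record layout described earlier in this post (the function name is hypothetical):

```python
import struct

def closing_if_jump_has_else(hex_text):
    """True if the closing (ENDIF) IF_JUMP record has eci < cpi."""
    cpi, opcode, aci, and_flag, or_flag, eci = struct.unpack(
        "<6H", bytes.fromhex(hex_text)
    )
    if opcode != 0x0062 or and_flag or or_flag:
        raise ValueError("not a closing (ENDIF) IF_JUMP record")
    return eci < cpi

# Closing IF_JUMP of the ELSE example, then of the first example
print(closing_if_jump_has_else("19 00 62 00 16 00 00 00 00 00 17 00"))
print(closing_if_jump_has_else("20 00 62 00 1D 00 00 00 00 00 21 00"))
```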
- How can I tell an IF from a WHILE?
The difference is that a WHILE has a GOTO command at the end, right after the ENDIF. This makes sense, since a WHILE is just an IF with a GOTO at the end.
- Example using WHILE:
Code:
WHILE ( true17 = 255 )
CHANGE_POLICE_LEVEL (4)
CHANGE_POLICE_LEVEL (6)
ENDWHILE
CHANGE_POLICE_LEVEL (3)
LEVELEND
In hex data:
15 00 5E 00 18 00 00 00 12 00 FF 00 (cpi: 15 00) S_EQUAL_I (npi: 18 00) (WHILE_EXEC: No) (counter index) (counter value in Hex)
16 00 A2 01 17 00 00 00 E8 5D 04 00 (cpi: 16 00) CHANGE_POLICE_LEVEL (npi: 17 00) (WHILE_EXEC: No) (??) (argument: 4)
17 00 A2 01 1B 00 00 00 00 00 06 00 (cpi: 17 00) CHANGE_POLICE_LEVEL (npi: 1B 00) (WHILE_EXEC: No) (??) (argument: 6)
18 00 62 00 16 00 00 00 00 00 1C 00 (cpi: 18 00) IF_JUMP (aci: 16 00) (AND: No) (OR: No) or else (eci: 1C 00)
1B 00 4D 00 15 00 00 00 18 00 00 00 (cpi: 1B 00) GOTO (npi: 15 00) (WHILE_EXEC: No) (ENDIF index: 18 00) (??)
1C 00 A2 01 1D 00 00 00 00 00 03 00 (cpi: 1C 00) CHANGE_POLICE_LEVEL (npi: 1D 00) (WHILE_EXEC: No) (??) (argument: 3)
1D 00 3C 00 FF FF (cpi: 1D 00) LEVELEND (Finish)
Look at the code flow if S_EQUAL_I is TRUE:
15 00 -> 18 00 -> 16 00 -> 17 00 -> 1B 00 -> 15 00
Conclusion: to know whether a block is an IF or a WHILE, check whether there is a GOTO command right before the first command outside the block, which is
1C 00 A2 01 1D 00 00 00 00 00 03 00 (cpi: 1C 00) CHANGE_POLICE_LEVEL (npi: 1D 00) (WHILE_EXEC: No) (??) (argument: 3)
I'm not sure, but apparently you can also check whether there is a GOTO command three indices after the ENDIF (18 00 -> 1B 00), which is easier to do.
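The first check (a GOTO right before the first command outside the block) can be sketched in Python as follows. It assumes you already have a table of index -> opcode and the 'eci' of the closing IF_JUMP; the tables are typed in by hand from the examples in this post and `is_while` is a hypothetical name.

```python
GOTO = 0x004D  # bytes "4D 00"

# index -> opcode, hand-copied from the WHILE example
WHILE_EXAMPLE = {0x15: 0x005E, 0x16: 0x01A2, 0x17: 0x01A2, 0x18: 0x0062,
                 0x1B: 0x004D, 0x1C: 0x01A2, 0x1D: 0x003C}
# index -> opcode, tail of the first IF example
IF_EXAMPLE = {0x1D: 0x01A2, 0x1E: 0x01A2, 0x20: 0x0062, 0x21: 0x01A2}

def is_while(opcodes_by_index, eci):
    """True if the last record before 'eci' is a GOTO."""
    earlier = [i for i in opcodes_by_index if i < eci]
    return bool(earlier) and opcodes_by_index[max(earlier)] == GOTO

print(is_while(WHILE_EXAMPLE, 0x1C))  # eci of the WHILE example
print(is_while(IF_EXAMPLE, 0x21))     # eci of the first IF example
```

Using the nearest existing index below 'eci' instead of 'eci' minus one is a deliberate choice, because the dumps in this post skip some index values.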
- What if there is a NOT operator in the boolean checking?
The NOT operator (value 47 00) appears explicitly in the hex data and ALWAYS refers to the previous boolean check:
(cpi) 47 00 (npi) (??)
Example:
15 00 01 01 16 00 00 00 (cpi: 15 00) DO_NOWT (npi: 16 00)
16 00 5E 00 17 00 00 00 12 00 FF 00 (cpi: 16 00) S_EQUAL_I (npi: 17 00) (EXEC: No) (counter index: 12 00) (check value in Hex: FF 00)
17 00 47 00 19 00 00 00 (cpi: 17 00) NOT (npi: 19 00) (??)
18 00 5E 00 1B 00 00 00 11 00 FF 00 (cpi: 18 00) S_EQUAL_I (npi: 1B 00) (EXEC: No) (counter index: 11 00) (check value in Hex: FF 00)
19 00 62 00 18 00 01 00 00 00 1F 00 (cpi: 19 00) IF_JUMP (npi: 18 00) (AND: Yes) (OR: No) (ENDIF index 1F 00)
This is the same as
DO_NOWT
IF/WHILE ( ( NOT ( true17 = 255 ) ) AND ( true16 = 255 ) ...... )
where
'true17' index = 12 00
'true16' index = 11 00
Conclusion: the NOT operator always appears immediately after the boolean check it applies to.
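Based on that conclusion, the negated checks can be found by looking for boolean checks whose 'npi' lands on a NOT record. A minimal Python sketch, with the record table hand-copied from the example above (the tuple layout and function name are hypothetical):

```python
# index -> (command name, npi), from the NOT example
NOT_EXAMPLE = {
    0x16: ("S_EQUAL_I", 0x17),
    0x17: ("NOT", 0x19),
    0x18: ("S_EQUAL_I", 0x1B),
    0x19: ("IF_JUMP", 0x18),
}

def negated_checks(records):
    """Return indices of boolean checks that are followed by a NOT."""
    found = []
    for idx, (name, npi) in sorted(records.items()):
        target = records.get(npi)  # None if the index is not in the table
        if name == "S_EQUAL_I" and target is not None and target[0] == "NOT":
            found.append(idx)
    return found

print([f"{i:02X} 00" for i in negated_checks(NOT_EXAMPLE)])
```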
- Why is the nesting ( #1 ) OR ( ( #2 ) AND ( #3 ) ) 100% wrong?
After compiling this script
Code:
DO_NOWT
IF ( ( true17 = 255 ) OR ( ( true16 = 255 ) AND ( true15 = 255 ) ) )
CHANGE_POLICE_LEVEL (4)
CHANGE_POLICE_LEVEL (6)
ENDIF
CHANGE_POLICE_LEVEL (3)
LEVELEND
I found:
15 00 01 01 16 00 00 00 (cpi: 15 00) DO_NOWT (npi: 16 00)
16 00 5E 00 18 00 00 00 12 00 FF 00 (cpi: 16 00) S_EQUAL_I (npi: 18 00) (EXEC: No) (counter index) (counter value in Hex)
17 00 5E 00 19 00 00 00 11 00 FF 00 (cpi: 17 00) S_EQUAL_I (npi: 19 00) (EXEC: No) (counter index) (counter value in Hex)
18 00 5E 00 1A 00 00 00 10 00 FF 00 (cpi: 18 00) S_EQUAL_I (npi: 1A 00) (EXEC: No) (counter index) (counter value in Hex)
19 00 62 00 1E 00 01 00 00 00 1E 00 (cpi: 19 00) IF_JUMP (npi: 1E 00) (AND: Yes) (OR: No) (ENDIF index 1E 00)
1A 00 62 00 17 00 00 00 01 00 1E 00 (cpi: 1A 00) IF_JUMP (npi: 17 00) (AND: No) (OR: Yes) (ENDIF index 1E 00)
1B 00 A2 01 1C 00 00 00 58 CA 04 00 (cpi: 1B 00) CHANGE_POLICE_LEVEL (npi: 1C 00) (EXEC: No) (??) (argument #1)
1C 00 A2 01 1F 00 00 00 28 CA 06 00 (cpi: 1C 00) CHANGE_POLICE_LEVEL (npi: 1F 00) (EXEC: No) (??) (argument #1)
1E 00 62 00 1B 00 00 00 00 00 1F 00 (cpi: 1E 00) IF_JUMP (aci: 1B 00) (AND: No) (OR: No) or else (eci: 1F 00)
1F 00 A2 01 20 00 00 00 C8 71 03 00 (cpi: 1F 00) CHANGE_POLICE_LEVEL (npi: 20 00) (EXEC: No) (??) (argument #1)
20 00 3C 00 FF FF (cpi: 20 00) LEVELEND (Finish)
Note that the compiled code applies an OR between ( true17 = 255 ) and ( true15 = 255 ), and the first S_EQUAL_I points at another S_EQUAL_I, which is weird and probably not intended.
You may ask why the code skips the second S_EQUAL_I.
Based on this question and on other tests I have done, I concluded that the first IF_JUMP of the nested ANDs and ORs should always be at the 'npi' index of the previous boolean check.
If we assume this, no S_EQUAL_I will ever point at another S_EQUAL_I; every S_EQUAL_I will point at an IF_JUMP.
This explains why the "lefty" form of nesting ANDs and ORs is the "correct" way, given the weird behavior of the miss2.exe compiler.
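That rule can be turned into a simple sanity check for compiled scripts. The Python sketch below flags every S_EQUAL_I whose 'npi' does not land on an IF_JUMP, allowing one NOT in between as in the NOT example. It is based only on the theory in this post, the tables are hand-copied from the dumps above, and `suspicious_checks` is a hypothetical name.

```python
# index -> (command name, npi / first target index)
WRONG = {  # ( #1 ) OR ( ( #2 ) AND ( #3 ) )
    0x16: ("S_EQUAL_I", 0x18), 0x17: ("S_EQUAL_I", 0x19),
    0x18: ("S_EQUAL_I", 0x1A), 0x19: ("IF_JUMP", 0x1E),
    0x1A: ("IF_JUMP", 0x17), 0x1E: ("IF_JUMP", 0x1B),
}
LEFTY = {  # the four-AND 'lefty' example
    0x16: ("S_EQUAL_I", 0x18), 0x17: ("S_EQUAL_I", 0x1A),
    0x18: ("IF_JUMP", 0x17), 0x19: ("S_EQUAL_I", 0x1C),
    0x1A: ("IF_JUMP", 0x19), 0x1B: ("S_EQUAL_I", 0x20),
    0x1C: ("IF_JUMP", 0x1B), 0x20: ("IF_JUMP", 0x1D),
}

def suspicious_checks(records):
    """Indices of S_EQUAL_I records that do not lead to an IF_JUMP."""
    flagged = []
    for idx, (name, npi) in sorted(records.items()):
        if name != "S_EQUAL_I":
            continue
        target = records.get(npi)
        if target is not None and target[0] == "NOT":
            target = records.get(target[1])  # step over a single NOT
        if target is None or target[0] != "IF_JUMP":
            flagged.append(idx)
    return flagged

print([f"{i:02X} 00" for i in suspicious_checks(WRONG)])
print(suspicious_checks(LEFTY))
```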
I hope all this information helps someone finally finish T.M.'s decompiler.