Intro
With the release of LEGO Star Wars: The Skywalker Saga on Steam several days ago, I decided to revisit some old titles in the series. Luckily, I've been doing a lot of mapping lately between various architectures (XBOX360-to-PCIntel, LinuxELF-to-PCIntel, NintendoSwitch-to-PCIntel, etc.) so I got the hang of it. So much that it became a bit like riding the bike. As such, I ended up having a look at the NX (short for Nintendo Switch) version of LEGO: The Incredibles. Why? The retail NX version contains debug symbols.
The tools I've used for this MANUAL mapping:
- le brain
- understanding how the compiler works on each platform
- flow interpretation
- string references
- acknowledging some code can be inlined or not or missing completely
- acknowledging not all of the code in the NX build may be available in the PC build (e.g.: debug code) and vice-versa
- a shit-load more attention and stop signs
Keep reading and you'll see where I ended up with this Even though I didn't plan to.
Mapping
To map NX to PC I had to use an image of the NX version, IDA 7.5 SP3 and a [Link]. And I got this:
From ^ that, I then opened the x64 Steam binary in [Link] and started working, using all of the elements I mentioned. It took me 3-4 days to get to a comfy list like the one below:
Suffices to say I won't be able to provide these tools for you, so please don't PM me asking for a lend.
LEGO Unlock Codes
LEGO games have a feature that lets you type in unlock codes in a certain menu:
If you google for codes for the various games in the series, you will find them easily. Question is where did they come from? Several of them were obtained from developers, several from reverse-engineering, several from filling out SURVEYS and so on.
Here's some examples:
[Link]
Once you enter these CODES in, you can unlock characters or get Studs (in-game currency). Like so:
So I was curious enough to learn what's going on in the function handling the Enter Code menu. And I found that here, in the manually mapped data in my executable:
Hashing Function
Long story short, the developers use the FNV132 hashing algorithm throughout the Nu2 Engine for hashing strings, to get faster look-up results from a high volume of data. All of this while having a low collision rate (meaning the coincidence that 2 different strings return the same hash should be close to 0%, very low). The codes I found online are:
The above codes unlock two characters in the list of 113 possible characters unlocks:
How string hashing works
You may have noticed other unlock codes in my screenshots. Where do those come from? Well.. to begin with..
- the Nu2 Engine uses this part to hash strings, here:
Code: Select all
00000001413FD6A5 | BA C59D1C81 | MOV EDX,811C9DC5 |
00000001413FD6AA | 48:8BCB | MOV RCX,RBX |
00000001413FD6AD | E8 7E0D1CFF | CALL <lego the incredibles_dx11.NuStrHashUpperCaseFNV1(char const*, unsigned int)> |
NuStrHashUpperCaseFNV1:
00000001405BE430 | 4C:8BC1 | MOV R8,RCX |
00000001405BE433 | 48:85C9 | TEST RCX,RCX |
00000001405BE436 | 74 29 | JE lego the incredibles_dx11.1405BE461 |
00000001405BE438 | 0FB601 | MOVZX EAX,BYTE PTR DS:[RCX] |
00000001405BE43B | 84C0 | TEST AL,AL |
00000001405BE43D | 74 22 | JE lego the incredibles_dx11.1405BE461 |
00000001405BE43F | 90 | NOP |
00000001405BE440 | 0FBEC0 | MOVSX EAX,AL |
00000001405BE443 | 8D48 9F | LEA ECX,QWORD PTR DS:[RAX-61] |
00000001405BE446 | 83F9 19 | CMP ECX,19 |
00000001405BE449 | 77 03 | JA lego the incredibles_dx11.1405BE44E |
00000001405BE44B | 83C0 E0 | ADD EAX,FFFFFFE0 |
00000001405BE44E | 69D2 93010001 | IMUL EDX,EDX,1000193 |
00000001405BE454 | 49:FFC0 | INC R8 |
00000001405BE457 | 33D0 | XOR EDX,EAX |
00000001405BE459 | 41:0FB600 | MOVZX EAX,BYTE PTR DS:[R8] |
00000001405BE45D | 84C0 | TEST AL,AL |
00000001405BE45F | 75 DF | JNE lego the incredibles_dx11.1405BE440 |
00000001405BE461 | 8BC2 | MOV EAX,EDX |
00000001405BE463 | C3 | RET |
- strings are passed as arguments and a hash is obtained
- for example, if I enter AAAAAA in the Enter Code menu, the function above returns 0x938E9F77 as value
- this value is then compare internally in a chain of Events and unlocks happen, if the HASHES match
Several other considerations:
- the unlock code can only be 6 characters in length
- the alphabet is A-Z (uppercase) and 0-9, so there's a total of 36 characters
- a person would be nuts if they'd try to BRUTE-FORCE all of the possible combinations of A-Z|0-9 characters, which is 36 by 6
Why am I mentioning all of this, you ask? Because I AM THAT CRAZY PERSON!
The logic
The logic I had is this: OK, so the Engine doesn't statically store all of the strings. Especially the unlock codes. Whenever it needs some hash, they just run the FNV132 function applied to that specific string, then that result (which is a DWORD, 32-bit) is queried in a look-up table. So let's see if I can come up with a way to BRUTE-FORCE all potential combinations of 6 characters given my 36 characters alphabet. That means trying every combination like this:
AAAAAA <- first one
AAAAAB
AAAAAC
AAAAAD
..
AAAAAZ
AAAAA0
..
AAAAA9
AAAABA
..
..
999999 <-- last one
I found a function where the hashes for these codes are checked and they are looking for these:
So the idea is test all of the combinations, get their hashes, then check which one matches those 5 you see above: 0x8D7C719F, 0x7CAA34CA,0x40E8B9BB,0x64DCA50D,0x546CE5F5. For starters.
Getting there
I'm always up for challenges, even if they might seem meaningless. To me they pose value, because from them I LEARN stuff. I was curious to determine if this brute-forcing would work and how long it would take to complete the WHOLE set. In order to get there, I had to perform intermediate steps:
- generate the alphabet I need
Quick google helped: [Link]
To the above I added '0 1 2 3 4 5 6 7 8 9'.Code: Select all
Uppercase Alphabets A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
So my complete alphabet of characters is:
Code: Select all
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9
- print all combinations of a given length with known alphabet
Quick google helped as well: [Link]
I modified the source code to do this instead:
Like I mentioned, I wanted to print all combinations of 6 characters from my alphabet. Running that modified C++ code didn't complete, even if left processing for several hours long. So I had to take it all to my local MSVC2019 and see WTF is going on.Code: Select all
int main() { cout << "First Test" << endl; char set1[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'}; int k = 6; printAllKLength(set1, k, 36); }
If it doesn't work, keep at it
The tests above didn't return a result, but I didn't give up.
So the next step for me was to actually test what's going on with the code in MSVC2019. Does it run, does it hang? Also.. if I am doing this, why not also implement the HASHING FUNCTIONS in the code together with the printing of combinations? That way I'd have both the 6 characters string and its hash displayed on the same line
- how does the FNV132 hash work
Google helped:
There's also a variant of it, FNV1A32:
The Nu2 Engine uses FNV132 (remember "NuStrHashUpperCaseFNV1(char const*, unsigned int)" ?). Furthermore, it always converts lower-case to upper-case.
Then I looked for some C++ implementation of FNV132 online, github. Found only for FNV1A32, here: [Link]. So mad propz to the author:
Code: Select all
constexpr uint32_t fnv1a(const char *s, uint32_t h = 0x811C9DC5)
{
return !*s ? h : fnv1a(s + 1, (h ^ (uint8_t)*s) * 0x01000193);
}
Code: Select all
constexpr uint32_t fnv1(const char *s, uint32_t h = 0x811C9DC5)
{
return !*s ? h : fnv1(s + 1, (h * 0x01000193) ^ (uint8_t)*s);
}
Code: Select all
(((uint8_t)(*s - 0x61) <= 0x19) ? ((uint8_t)*s - 0x20) : (uint8_t)*s))
So all in all, the code, combining everything, looks like this:
Code: Select all
// PrintStringCombinations.cpp : This file contains the 'main' function. Program execution begins and ends there.
//
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
typedef unsigned int uint32;
#define FNV1_32A_INIT 0x811c9dc5
#define FNV_32_PRIME 0x01000193
fstream oFile("test.txt", ios::out);
char buf[255] = { 0 };
constexpr uint32_t __declspec(noinline) StrHashUpperCaseFNV132(const char* s, uint32_t h = 0x811C9DC5) {
return !*s ? h : StrHashUpperCaseFNV132(s + 1, (h * 0x01000193) ^ (((uint8_t)(*s - 0x61) <= 0x19) ? ((uint8_t)*s - 0x20) : (uint8_t)*s));
}
constexpr uint32_t __declspec(noinline) StrHashUpperCaseFNV1A32(const char* s, uint32_t h = 0x811C9DC5) {
return !*s ? h : StrHashUpperCaseFNV1A32(s + 1, (h ^ (((uint8_t)(*s - 0x61) <= 0x19) ? ((uint8_t)*s - 0x20) : (uint8_t)*s)) * 0x01000193);
}
// The main recursive method
// to print all possible
// strings of length k
void __declspec(noinline) printAllKLengthRec(char set[], string prefix, int n, int k)
{
// Base case: k is 0,
// print prefix
if (k == 0)
{
//cout << (prefix) << endl;
//oFile << (prefix) << endl;
snprintf(buf, sizeof(buf), "%6s|%08X", prefix, StrHashUpperCaseFNV132(prefix.c_str()));
oFile << buf << endl;
return;
}
// One by one add all characters
// from set and recursively
// call for k equals to k-1
for (int i = 0; i < n; i++)
{
string newPrefix;
// Next character of input added
newPrefix = prefix + set[i];
// k is decreased, because
// we have added a new character
printAllKLengthRec(set, newPrefix, n, k - 1);
}
}
void __declspec(noinline) printAllKLength(char set[], int k, int n)
{
printAllKLengthRec(set, "", n, k);
}
int __declspec(noinline) main()
{
char set[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
int k = 6;
printAllKLength(set, k, 36);
oFile.close();
system("pause");
}
// Run program: Ctrl + F5 or Debug > Start Without Debugging menu
// Debug program: F5 or Debug > Start Debugging menu
// Tips for Getting Started:
// 1. Use the Solution Explorer window to add/manage files
// 2. Use the Team Explorer window to connect to source control
// 3. Use the Output window to see build output and other messages
// 4. Use the Error List window to view errors
// 5. Go to Project > Add New Item to create new code files, or Project > Add Existing Item to add existing code files to the project
// 6. In the future, to open this project again, go to File > Open > Project and select the .sln file
To compile the above: just open Visual Studio 2019, 'Create a new project', pick 'Console App' and give the project a name. Then just copy-paste the code above and compile it as 'Release' (I compiled it as x64, not x86; your call).
What the program does is to create a file on disk called "test.txt", in which it prints all the combinations of a 36 characters alphabet, in sequence of 6, along with their FNV132 hashes
In the middle of the madness
The above led to the generation of a 34.4 GB .txt file on disk. Completion took almost 4 HOURS and yes, I had to make room on disk in advance, as I didn't know exactly how much I'd needed (though I calculated a roughly ~40 GB). I did want to leave it running, even if it would take 24-48 hours to complete, just to see this insane project come to the conclusion I was looking for.
So:
The file being so big, obviously, you'd not be able to open it with Notepad++ or any text editor. So I used [Link] to do that:
From the above, you can see that the FNV132 hash for 'AAAAAA' is '938E9F77'. If I scroll all the way to the bottom of the file, the last one is '999999' with its hash, '5C7FFC87'.
The practicality of the madness
OK, a 30+ GB file on disk. What's its use now? Remember our 5 hashes? 0x8D7C719F, 0x7CAA34CA,0x40E8B9BB,0x64DCA50D,0x546CE5F5. What happens now if I search for them in my 30-ish GB file? Well..
And the results:
- 8N5GX4|8D7C719F
- O8GXPG|7CAA34CA
- ACUIMG|40E8B9BB
- L948GL|64DCA50D
- 697R3U|546CE5F5
- 7OHKNK|546CE5F5
And what do these UNLOCK CODES do? Well, they give you 10.000 Studs each. Another thing to note is these unlock codes can be used only once per save-game file. So '697R3U' and '7OHKNK' above won't give you 2x10.000, as they both produce the same hash. When the Nu2 Engine processes the unlock, it sets a flag. If you enter '697R3U' and run it, that flag is set. If you then use '7OHKNK', the hash would be the same, so you don't get anything, but you will see an "Unlocked" message flashing in the Enter Code menu
What else can you do with the table/file?
I could send it to [Link] I bet they'd love a dictionary of upper-case A-Z and 0-9, len 6 hashes
Files
The password for both archives below is sunbeam:
x64dbg database with all 834 symbols you can drop into your own \x64dbg\x64\db folder:
[Link]
PrintStringCombinations MSVC2019 project (contains source code, compiled exe and pdb):
[Link]
#eof
P.S.: I saw the unlock codes for LEGO Star Wars have 7 characters now Wonder if the FNV132 function is still used there. If still the same algo, then, like I said, not that many changes in the "new" Engine.