Modding OCTOPATH TRAVELER

If you're one of my Patrons, it's probably no surprise to you that I love dabbling in game modding and seeing how games work. I recently finished playing through Octopath Traveler, and afterwards had a strong desire to see if the game could be modded. In particular, I thought it would be super cool if I could somehow mod a custom boss battle into the game! This seemed like a long shot though, due to a few very big obstacles:



Nonetheless, I gave it a try and in the end succeeded, so I wanted to share an overview of the journey I took. I think it's pretty interesting to explore how you might figure something like this out with only some research, reverse-engineering, and intuition.


Start of the Journey

So step one was to see what I was working with: looking into the game's directory and seeing what files were in there. If you're lucky, some games might just have all the files sitting there in plain sight... raw image files, text files, etc., that you can freely edit to your heart's content. That is rare, and is definitely not the case here. There wasn't much... the only files of interest were the game's EXE file and a single 3 GB compressed file presumably containing all of the game's resources, assets, and data. The file extension of this compressed file was *.pak.


So off to Google to find out if there's any known way to get the data out of a pak file. After a quick search, it turns out Unreal Engine comes with a command line utility called UnrealPak, which is responsible for creating pak files... but can also reverse the process and extract the files back out of a pak. That is, as long as a key (aka: password) was not provided when the pak was created; otherwise you need to know the key to unpack it. Luckily, encrypting a pak with a key supposedly increases loading times, so many developers omit the key to keep their game responsive, and Octopath Traveler's pak was not keyed.
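
For reference, extraction with UnrealPak is usually invoked as the pak path followed by the -Extract switch and an output directory. Here is a minimal Python sketch that wraps that call. The function names and all paths are hypothetical placeholders, and absolute paths tend to be the safer choice because UnrealPak may resolve relative ones against its own directory.

```python
import subprocess


def unrealpak_extract_command(unrealpak_exe, pak_file, out_dir):
    """Build the argument list for extracting a pak with UnrealPak."""
    return [str(unrealpak_exe), str(pak_file), "-Extract", str(out_dir)]


def extract_pak(unrealpak_exe, pak_file, out_dir):
    """Run UnrealPak and raise CalledProcessError if it exits non-zero."""
    subprocess.run(
        unrealpak_extract_command(unrealpak_exe, pak_file, out_dir),
        check=True,
    )
```

This only works for a pak that was not encrypted with a key, which is the situation described above.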


I installed Unreal Engine, ran UnrealPak on the pak file, and progress! I got thousands upon thousands of files exported out of that thing. Only problem, they still weren't in any recognizable format that I knew how to work with. All of the extracted files were proprietary UE4 binary files, most with the extension *.uasset or *.uexp.



Texture Swapping

Of course, off to Google again to research these new filetypes. The results are pretty grim... these types of files are the output that Unreal Engine produces when you compile and cook your project for release. This is a one-way process; there doesn't really seem to be any way to reverse it and get the original files back out. These binary files are not designed to be edited, and people online gave the impression that trying to edit them is a fool's errand. Nor does there appear to be any documentation anywhere about how these files are internally structured.


The one exception is uasset/uexp files which contain Texture or Model data. Someone made a handy tool called UE Viewer that can let you browse and view the contents of those types of cooked assets, and can even export the contents of a texture asset to a PNG file.



Using UE Viewer, I extracted some test textures from Octopath to PNG files on disk. However, this is only half of the battle, as once I make edits to the texture(s), how do I get them back into the game? UE Viewer does not have functionality for the other way around, to take an image file and convert it back into the uasset/uexp format.


Unreal Engine itself can be used for that purpose, so long as the version of UE4 I use matches the version of UE4 that the game was created with. So what version does Octopath use? Within the files extracted from the pak is a *.uproject file, and opening that in a text editor gives me the answer that the game was created with Unreal Engine 4.18.
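
A .uproject file is plain JSON, and the engine version normally sits under the "EngineAssociation" key, so the check can also be scripted. A minimal sketch, where the function name and the file name are hypothetical (for projects built from engine source this field can hold an identifier rather than a version number):

```python
import json
from pathlib import Path


def engine_version(uproject_path):
    """Return the EngineAssociation string from a .uproject file, or None."""
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    text = Path(uproject_path).read_text(encoding="utf-8-sig")
    return json.loads(text).get("EngineAssociation")


if __name__ == "__main__":
    # Hypothetical file name; use whichever .uproject came out of the pak.
    path = Path("Game.uproject")
    if path.exists():
        print(engine_version(path))
```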



With Unreal Engine 4.18 installed, I created a new empty project and moved the edited PNG files into it. Note that for this to work, the directory structure and file names within this new project (for the files I was editing) needed to exactly match what they were in the original Octopath pak file/project. Then I just cooked my new project, and Unreal Engine spit out nice brand new uasset/uexp files from my edited textures for me to use.


Now, the next step would seemingly be to replace the original uasset/uexp files with the new ones, then use UnrealPak to make a new pak file, and replace the original game's pak file with the modded one. That sounds like a giant pain in the ass if the process has to be redone after every single little edit I try to make to the game. Turns out it doesn't need to be that way. While Googling stuff, I stumbled upon a tiny piece of advice mentioned in passing, buried in the middle of some random forum thread... it's the only place I've ever seen it mentioned, and it's so incredibly useful: you can remove the original game's pak file and swap it out for the entirety of the unpacked files, and the game will still run just fine! It will run directly off of the unpacked files, so there's no need to build new pak files while you're developing! You just edit any of the unpacked files, run the game, and your changes are immediately there. Things are looking exciting now.


The Hard Part

So this is cool, I now know how to edit any of the textures in the game. However, if I'm to reach my ambitious dream of making a custom boss battle, that's going to require a hell of a lot more than just swapping images. I need to edit stats, I need to edit movesets, I need to edit encounter data, I need to change names and edit text, etc, etc, etc. Unfortunately, at this point, no amount of Google searching is going to help me with any of that anymore. I'm completely on my own from here.


Here's what I've gathered. Each original asset is broken up into two cooked files: one uasset file and one uexp file. Looking at the file sizes of some of the texture assets I've already messed with, the uasset file is always small, and the uexp file is always much larger. My theory, then, is that the uasset file probably just contains metadata, while the uexp file contains the actual binary data.


Opening the uassets in a hex editor confirms this theory, they seem to primarily consist of metadata regarding variable names & data types, as well as some mapping of file paths/assets to probably the associated area in the uexp file? I can't make sense of how the mapping works, but yeah... it's metadata.



The uexp of one of the texture assets on the other hand? Gibberish as expected... presumably the contents of that uexp is just compressed image data. So uexp is what I'm really after, if I'm looking to mod some of the game's data.


Digging through more of the game files, I see lots of intriguing stuff that seems relevant to my interests and that I wish I could edit: “BattleEncounterData.uexp”, “EnemyGroupData.uexp”, “EnemyDB.uexp”, “GameTextEN.uexp”, etc. So, let's take a look at one of these files... can we make sense of it, and reverse-engineer its structure?



Well, at an initial glance it looks like incomprehensible gibberish, and the entirety of the file looks like this from beginning to end. But if it's not encrypted (and it doesn't appear to be), then there has to be some logical structure to it. So I start by looking for any patterns in the data, trying frequency analysis, anything that might give hints as to what that structure is. And I do spot some kind of repeating pattern. In the image below, I highlight some of the pattern that I'm seeing with color coded marks on the left, and on the right one full instance of the pattern is highlighted:
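
Spotting the repeat by eye works, but the same hunch can be checked mechanically. The sketch below is one possible approach (not the method used in this write-up): for every candidate period, it measures how often a byte equals the byte that many positions later. A table of fixed-length records with constant filler bytes scores highest at the record length. The function names and the 2 to 512 search range are assumptions.

```python
def score_period(data, period):
    """Fraction of bytes that equal the byte `period` positions later."""
    pairs = len(data) - period
    if pairs <= 0:
        return 0.0
    matches = sum(1 for i in range(pairs) if data[i] == data[i + period])
    return matches / pairs


def guess_record_length(data, min_len=2, max_len=512):
    """Return (period, score) for the best-scoring candidate period.

    Ties go to the shortest period, since multiples of the true record
    length score about as well as the length itself.
    """
    best = (0, 0.0)
    for period in range(min_len, min(max_len, len(data) // 2) + 1):
        score = score_period(data, period)
        if score > best[1]:
            best = (period, score)
    return best
```

This is pure Python and slow on big files, so it makes sense to run it on a slice of a few kilobytes taken from the middle of the file. If the result looks suspiciously large, its divisors are worth checking by hand, because a multiple of the real record length can edge ahead by chance.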



Analyzing this pattern more and looking through more of the file, it appears that the pattern repeats all the way to the end of the file, that it repeats after the same number of bytes each time (i.e., the pattern is a fixed length), and that the patterns start at offset 0x0000002D in the file (whatever comes before that might be a file header). So I write myself a Python script to extract blocks of bytes of that fixed pattern length and output them to a text file, one instance of the pattern per line. Viewing the output of my script makes the pattern much more apparent:
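
A minimal sketch of what such an extraction script could look like. This is a reconstruction for illustration, not the original script: the function names are made up, the start offset would be 0x2D as noted above, and the record length has to be whatever fixed length you measured for the file in question.

```python
def split_records(data, start, record_len):
    """Slice `data` into consecutive `record_len`-byte records from `start`.

    A trailing partial record (if any) is dropped.
    """
    body = data[start:]
    count = len(body) // record_len
    return [body[i * record_len:(i + 1) * record_len] for i in range(count)]


def dump_records(uexp_path, out_path, start, record_len):
    """Write one record per line as space-separated hex bytes."""
    with open(uexp_path, "rb") as f:
        data = f.read()
    records = split_records(data, start, record_len)
    with open(out_path, "w") as out:
        for rec in records:
            # bytes.hex(sep) needs Python 3.8 or newer.
            out.write(rec.hex(" ").upper() + "\n")
    return len(records)
```

With every record on its own line, the constant bytes line up into vertical stripes in any monospaced text editor, which is what makes the structure jump out.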



One noticeable thing here is that some bytes in the pattern are the same for every single instance of the pattern throughout the whole file, while other bytes change from one instance to the next. The changing ones are the interesting ones to me, as those must represent the meaningful data to be edited. I create a second Python script to find only the bytes that change, and generate a CSV file with each of those changing bytes as a separate column, labeled with its byte offset within the pattern.
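
A sketch of that second step, again as an illustration rather than the original script. It takes the records as a list of equal-length bytes objects and keeps only the offsets where at least one record differs from the first. The function names and the hex formatting of the cells are assumptions.

```python
import csv


def varying_offsets(records):
    """Offsets within a record whose byte differs in at least one record."""
    if not records:
        return []
    first = records[0]
    return [
        off for off in range(len(first))
        if any(rec[off] != first[off] for rec in records)
    ]


def write_varying_csv(records, out_path):
    """Header row = byte offsets; each later row = one record's varying bytes."""
    offsets = varying_offsets(records)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"0x{off:02X}" for off in offsets])
        for rec in records:
            writer.writerow([f"{rec[off]:02X}" for off in offsets])
    return offsets
```

Keeping the original offsets in the header row matters later: once a column is deciphered, the offset tells you exactly which byte of the record to patch in the uexp file.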



The bolded numbers in the first row are the byte offsets from the start of the pattern. Everything after that is the pattern data, so each row here represents one asset/entity/whatever. For example, in EnemyGroupData.uexp (which is the example in the image above), each row is the data for one pre-defined collection of enemies that you can encounter together in a wild battle or boss battle. In the EnemyDB.uexp file, each row would be the data for one enemy in the game.

Each column in this CSV is some relevant variable that can be modified about those entities. For example, in EnemyGroupData.uexp, one of the columns represents whether you're allowed to flee from the battle. One column represents the formation the enemies appear in (are they positioned in a circle, in a line, in a triangle, etc?). Some of the columns represent which specific enemies appear in that encounter. And so forth.

So the challenge then becomes going from the cryptic jumble of numbers in the image above to deciphering what each column/row actually represents. The more columns/rows you can manage to decipher, the more aspects of the game you can learn how to modify. Solving this has a lot in common with the techniques you would use to solve one of those logic grid puzzles:



As an example, say I'm trying to decipher the data in EnemyDB.uexp. I start by focusing on one enemy; let's call them BAD GUY. I know from playing the game that BAD GUY has 500 HP. So I pull out my calculator, convert 500 to hexadecimal (0x01F4), and search for any instance of that number across adjacent columns in my CSV file. I find that five of the rows contain 0x01F4, and now know that one of those five rows corresponds to BAD GUY. I find some other information from the game... I know that BAD GUY gives 10000 EXP when defeated. So again I convert 10000 to hex (0x2710), and search those five rows for that value. Only one of the five rows contains it, so now I've successfully learned three things: which row corresponds to BAD GUY, which column(s) correspond to HP, and which column(s) correspond to EXP. I can continue this process to learn more things.
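
The same elimination can be scripted against the raw records. The sketch below assumes the values are stored as little-endian integers (so 500 would show up as the bytes F4 01 rather than 01 F4) and that they are 4 bytes wide. Both are assumptions to adjust if nothing matches, and the function names are made up for illustration.

```python
def rows_containing(rows, value, width=4):
    """Find `value` stored as a little-endian integer of `width` bytes.

    Returns (row_index, byte_offset) pairs for every occurrence.
    """
    needle = value.to_bytes(width, "little")
    hits = []
    for idx, row in enumerate(rows):
        pos = row.find(needle)
        while pos != -1:
            hits.append((idx, pos))
            pos = row.find(needle, pos + 1)
    return hits


def candidates(rows, known_values, width=4):
    """Rows that contain every known value (e.g. one enemy's HP and EXP)."""
    result = set(range(len(rows)))
    for value in known_values:
        result &= {idx for idx, _ in rows_containing(rows, value, width)}
    return sorted(result)
```

Calling candidates(rows, [500, 10000]) mirrors the BAD GUY example: each extra known stat shrinks the list of possible rows, and the offsets reported by rows_containing point at the columns that hold those stats.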


Another way I can learn things is by taking some column that I haven't deciphered and blindly changing the values in that column, then going into the game, playing around, and seeing if I find anything noticeable that has changed. This can be a hit-or-miss method, but if you can manage to spot the difference in-game, it can help you figure out some of the more challenging and obscure columns. My preferred way of doing this is to change a huge number of columns at once, and from that there'll usually be at least a few immediately noticeable differences in-game. Then I start reverting the changed columns half a chunk at a time (think an O(log n) binary search kind of approach) and see whether the changes persist or go away, using that to narrow down to the specific columns that caused them.
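
The halving strategy can be written down as a short loop. In this sketch the effect_visible callback stands in for the manual step of launching the game with only the given columns modified and checking whether the change is still there. The names are hypothetical, and it assumes a single column is responsible for the effect being tracked.

```python
def bisect_columns(changed, effect_visible):
    """Narrow `changed` down to one column that triggers the in-game effect.

    `effect_visible(cols)` must answer: with only `cols` modified, is the
    effect still there? Assumes exactly one column is responsible.
    """
    cols = list(changed)
    while len(cols) > 1:
        half = cols[: len(cols) // 2]
        # Keep whichever half still shows the effect.
        cols = half if effect_visible(half) else cols[len(half):]
    return cols[0] if cols else None
```

For 100 modified columns this needs about seven game launches instead of up to a hundred. When several columns each cause their own visible change, the loop can be rerun once per effect.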


You can also get an idea of the column contents just by the type of data you see in that column. All 1's and 0's? Probably some True/False boolean variable. Are the values all big numbers or small numbers? Do the numbers vary a lot or a little? Are all the numbers in that column unique across all rows, or are there some repeat values across rows? Any patterns you notice in the column data? Things like this can help you to maybe make a hypothesis and narrow down what the column might be for.
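
Those questions are easy to answer in bulk with a small summary per column. A minimal sketch, where the function name and the choice of statistics are just one possible take:

```python
def describe_column(values):
    """Summarize a non-empty column of byte values to hint at its meaning."""
    distinct = set(values)
    return {
        "min": min(values),
        "max": max(values),
        "distinct": len(distinct),
        # Only 0 and 1 ever appear: plausibly a True/False flag.
        "looks_boolean": distinct <= {0, 1},
        # No repeats across rows: plausibly an ID or index.
        "all_unique": len(distinct) == len(values),
    }
```

These are only hints for forming a hypothesis; a column flagged as boolean still needs to be confirmed in-game.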


In the end, through a lot of going back and forth with trial and error, I manage to decipher many things, enough to be able to make the custom boss battle that I wanted to make.






References

Here are some handy documents, notes, and sheets I put together while reverse engineering stuff for this mod, which may prove useful for other people continuing to discover how to mod things in the game:



Below is the custom boss battle in action: