Megaman Legends 2 File Format Analysis Thread
Post by kion on Apr 4, 2018 0:09:53 GMT -5
Textures and Methodology
I was planning on working with Megaman Legends 1 more before moving on to Megaman Legends 2, but since I saw that there was interest in MML2 on Twitter, I decided to take a look at it. As I poke through MML2, I'll write what I find here as a reference.
A quick note on how the files are laid out on the PSX version. There is a COMMON folder which has the meshes that make up Megaman, the title screen, and the game over screen. Then there is DAT, which has all of the information for the NPCs and enemies stored in each scene. If you use the Debug Version Of Megaman Legends 2, you can see which scene is loaded and then look up which files are used in that scene from there. One major difference that sticks out between MML1 and MML2 is the lack of variation between scenes. In MML1 there would be several versions of each scene that would load different characters depending on the scenario, whereas MML2 only has a few different scenarios per scene. I don't know if this is because the game is more linear, or because they were able to pack everything they need into one file and then selectively load depending on the resources needed for that given scene.
I haven't been able to get the PC version of the game to run on Windows 10, through Wine, on KVM, or OpenBox. It doesn't seem like much of a loss, as the PSX version seems to have clear headers whereas in the PC version all of the data is mashed together. Depending on progress with the PSX version, it might be nice to get the PC version running for contrast. For the purposes of this thread, assume that information pertains to the PSX version unless stated otherwise.
Originally I intended to skip over textures in favor of looking at the meshes, but I wasn't aware of the existence of the debug version, so I figured that starting with textures might allow me to export them as a reference for which scene is which. In the back of my mind I was also hoping they'd be in the same format as MML1 with a different header. So I created a save state on Yosyonke Pad, and then wrote a small script that took a small part of each file in the DAT folder and compared it with the save state data to see what matched. It found that ST08T.BIN matched. I then noticed that there is a ST08.BIN and a ST08T.BIN, so one might be for the geometry and the other for the textures. I then started to erase data and see what changed, and was able to determine that 0x02 is the magic number for palettes and 0x03 for images.
Offset  | Observed Effect                |
0xe800  | Gate Into Town                 |
0x13800 | House other textures           |
0x1c000 | Outer fence, ladder and stairs |
0x20800 | Docking Tower next to Flutter  |
0x26000 |                                |
0x2c800 |                                |
0x34000 |                                |
0x37800 | Ground Textures                |
0x3c800 |                                |
0x3e000 | Area Map                       |
0x3f800 | Roll                           |
0x42000 |                                |
0x44000 | NPC                            |
0x46000 | Data The Monkey                |
0x47000 | Row 16 column 2                |
0x4b800 | Flutter                        |
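For anyone who wants to repeat the matching step that found ST08T.BIN, here is a rough sketch of the idea, not the actual script I used. The function name, the sample offset and the sample length are all made up for illustration, and it assumes the save state is uncompressed and already loaded as a Node Buffer.

```javascript
// Hypothetical helper: report which candidate files share a chunk of bytes
// with a save state. `files` maps a file name to a Buffer of its contents.
function findMatches(saveState, files, sampleOfs, sampleLen) {
  const matches = [];
  for (const [name, data] of Object.entries(files)) {
    // Skip files too small to take the sample from
    if (data.length < sampleOfs + sampleLen) continue;
    const sample = data.subarray(sampleOfs, sampleOfs + sampleLen);
    // Buffer.indexOf accepts another Buffer as the search value
    const pos = saveState.indexOf(sample);
    if (pos !== -1) matches.push({ name: name, pos: pos });
  }
  return matches;
}
```

A sample that is all zeros or all 0xff will match almost anywhere, so it is worth picking a sample offset that lands inside real data.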
The troubling part is that with Megaman Legends 1, overwriting part of an image file only changed a few pixels, but with Megaman Legends 2, overwriting a few bytes causes the entire texture to become distorted.
Another issue is that there doesn't seem to be a place for the palette. In MML1, the palette was included at the top of each TIM file. But in MML2, that doesn't appear to be the case (or so I thought). Since a palette is made up of several different colors, you'd expect to see several different uint16 values at the top of the file, but what you see instead (in the case of the area map) is 0xff7f0000 followed by a long run of 0xff bytes.
00003e000: 03000000 20800000 03000000 0000f701 .... ...........
00003e010: 10000100 40030001 40000001 00000000 ....@...@.......
00003e020: 00000000 f8000200 00000000 00000000 ................
00003e030: ff7f0000 ffffffff ffffffff ffffffff ................
00003e040: ffffffff ffffffff ffffffff ffffffff ................
So I thought that maybe only the image content was included in the 0x03 files and the palettes were stored in the framebuffer separately. But after erasing all of the palette files, several textures were unaffected, which means the palettes for those textures are likely inside the image files somewhere. To determine the format I'll have to test more options:
- Try compression like LZ77
- Look at the other versions in hopes of more readable formats
- Brute-force dump the 1MB framebuffer for a given scene
- Set a breakpoint in NO$PSX and trace through the framebuffer population
While I don't have a clear idea yet of what the texture format is, the file header is pretty simple to read. Here's a palette file:
000025000: 02000000 20000000 01000000 0000fe00 .... ...........
000025010: 10000100 00000000 00000000 00000000 ................
000025020: 00000000 00000000 00000000 00000000 ................
000025030: 000065a9 45a524a1 049de394 c290f2a5 ..e.E.$.........
000025040: d1a1b09d 6e9d91c6 edb9c794 a6908590 ....n...........
So 0x02 is the magic number for palettes (all values are little-endian, so the bytes 02 00 00 00 read as 0x02). 0x20 is the length of the palette data in bytes. I'm not sure what 0x01 is yet; I can blank out this dword and things will generally still work. 0x0000fe00 is the palette framebuffer x,y location (blank this out and the palette goes away). Then 0x10000100 holds two shorts: 0x10 is the number of colors and 0x01 is the number of palettes. The data starts at 0x30, after the header.
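To make the layout concrete, here is a minimal sketch of reading that header with a Node Buffer. The field names and the choice of 32-bit versus 16-bit reads are my own assumptions based on the dump above, and the meaning of the third dword is still unknown.

```javascript
// Hypothetical parser for the 0x02 (palette) header described above.
function parsePaletteHeader(buf, ofs) {
  return {
    magic: buf.readUInt32LE(ofs + 0x00),    // 0x02 for palettes
    length: buf.readUInt32LE(ofs + 0x04),   // length of the palette data
    unknown: buf.readUInt32LE(ofs + 0x08),  // purpose not yet known
    x: buf.readUInt16LE(ofs + 0x0c),        // framebuffer x
    y: buf.readUInt16LE(ofs + 0x0e),        // framebuffer y
    colors: buf.readUInt16LE(ofs + 0x10),   // colors per palette
    palettes: buf.readUInt16LE(ofs + 0x12), // number of palettes
    dataOfs: ofs + 0x30                     // data follows the 0x30 byte header
  };
}

// First 0x14 bytes of the palette file at 0x25000, copied from the dump
const sample = Buffer.from("0200000020000000010000000000fe0010000100", "hex");
const header = parsePaletteHeader(sample, 0);
console.log(header);
```

As a sanity check, 16 colors at 2 bytes each is 0x20 bytes, which agrees with the length field.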
For images, the header uses the same format and then appends more information on the end. For the area map texture, we have the following header:
00003e000: 03000000 20800000 03000000 0000f701 .... ...........
00003e010: 10000100 40030001 40000001 00000000 ....@...@.......
00003e020: 00000000 f8000200 00000000 00000000 ................
00003e030: ff7f0000 ffffffff ffffffff ffffffff ................
00003e040: ffffffff ffffffff ffffffff ffffffff ................
So 0x03 is the magic number for images. 0x8020 seems to be the expanded length of the image plus an included palette (though 0x8000 bytes seems kind of big). 0x0000f701 is the palette framebuffer x,y. 0x10000100 is 0x10 for 16 colors and 0x01 for one palette. 0x40030001 is the image framebuffer x and y. Then 0x40000001 is 0x40 for the width and 0x100 for the height. It looks like they do something similar to Megaman Legends 1 for the width and height: for 4-bit indexed images the width gets multiplied by 4, which gives a 0x100 x 0x100, or 256x256, texture. Lastly, I'm not sure yet what 0xf8000200 is.
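Here is the same kind of sketch for the image header, again with field names that are only my guesses from the dump. It reads the fields as little-endian values and applies the width-times-4 rule for 4-bit images. The dword 0xf8000200 further down the header is left out because its meaning is unknown.

```javascript
// Hypothetical parser for the 0x03 (image) header described above.
function parseImageHeader(buf, ofs) {
  const width = buf.readUInt16LE(ofs + 0x18);
  return {
    magic: buf.readUInt32LE(ofs + 0x00),    // 0x03 for images
    length: buf.readUInt32LE(ofs + 0x04),   // expanded length (image + palette?)
    unknown: buf.readUInt32LE(ofs + 0x08),  // purpose not yet known
    paletteX: buf.readUInt16LE(ofs + 0x0c), // palette framebuffer x
    paletteY: buf.readUInt16LE(ofs + 0x0e), // palette framebuffer y
    colors: buf.readUInt16LE(ofs + 0x10),
    palettes: buf.readUInt16LE(ofs + 0x12),
    imageX: buf.readUInt16LE(ofs + 0x14),   // image framebuffer x
    imageY: buf.readUInt16LE(ofs + 0x16),   // image framebuffer y
    width: width,                           // in 16-bit framebuffer units
    height: buf.readUInt16LE(ofs + 0x1a),
    pixelWidth: width * 4                   // only valid for 4-bit indexed images
  };
}

// First 0x1c bytes of the area map header at 0x3e000, copied from the dump
const sample = Buffer.from(
  "03000000" + "20800000" + "03000000" + "0000f701" +
  "10000100" + "40030001" + "40000001", "hex");
const header = parseImageHeader(sample, 0);
console.log(header);
```

For what it's worth, a 256x256 image at 4 bits per pixel needs 0x8000 bytes and a 16-color palette needs 0x20 bytes, which adds up to the 0x8020 in the length field.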
So playing around with the header, you can do things like blank out the palette framebuffer x,y to make objects in the environment disappear.
Or swap the image framebuffer x and y, so that Roll's texture gets drawn onto the area map.
Finding out that there is a debug version of Megaman Legends 2 let me easily look up that the first dungeon is ST0F in the data. In ST0F00 there is a file that looks similar to the EBD files in MML1, so I will be looking into that next. For right now I figured I'd document what I managed to parse out for images in terms of magic numbers and headers.
Edit:
An unexpected discovery: I decided to take another look at the PC version files, and while they don't have file headers as clean as the PSX version, the data looks a lot easier to read. On the PC version, 0x2c is the length of the segment, followed by the palette x/y, the number of colors and palettes, and then the palette itself (highlighted). The next segment has 0x800c for the length, followed by the framebuffer x/y, width, height, and then the data. The length actually makes sense: 0x0c is the length of the header, and since the texture is 0x100 by 0x100, or 0x10000 pixels, at two pixels per byte, that gives 0x8000 bytes of data, which matches what the header describes. So if the PC version is easier to read, I might have to do some more testing to figure out how to run it.
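A rough sketch of walking one PC segment, assuming the 0x0c byte header is a 32-bit length followed by four 16-bit values. That split is my guess from the description above rather than something confirmed, and the names are hypothetical.

```javascript
// Hypothetical reader for one PC-version segment. The length field is taken
// to include the 0x0c byte header, as the palette segment (0x2c) suggests.
function readPcSegment(buf, ofs) {
  const length = buf.readUInt32LE(ofs);
  return {
    length: length,
    x: buf.readUInt16LE(ofs + 4),      // framebuffer x
    y: buf.readUInt16LE(ofs + 6),      // framebuffer y
    a: buf.readUInt16LE(ofs + 8),      // colors (palette) or width (image)
    b: buf.readUInt16LE(ofs + 10),     // palettes (palette) or height (image)
    dataOfs: ofs + 0x0c,               // data starts right after the header
    dataLen: length - 0x0c,
    nextOfs: ofs + length              // where the next segment would begin
  };
}

// Synthetic palette segment: 0x0c header plus 0x20 bytes of palette data
const seg = Buffer.alloc(0x2c);
seg.writeUInt32LE(0x2c, 0);
seg.writeUInt16LE(0x0000, 4);
seg.writeUInt16LE(0x01f7, 6);
seg.writeUInt16LE(16, 8);
seg.writeUInt16LE(1, 10);
const info = readPcSegment(seg, 0);
console.log(info.dataLen.toString(16), info.nextOfs.toString(16));
```

With lengths that include the header, stepping from one segment to the next is just a matter of adding the length to the current offset.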
Edit:
Exported two textures from the PC version:
As luck would have it, the PC version packs files one on top of the other, which makes it hard to determine exactly where each piece of data begins and ends. But there are two hints to get started. The first hint is that the PC version appears to use the same framebuffer coordinates (for some reason?) as the PSX game. So by figuring out where a given texture is in the PSX data, I can search for the image or palette framebuffer coordinates and find them in the PC data. The other hint is that lengths in the PC data are very accurately defined, something that doesn't seem to work out very well in the PSX data. That makes it pretty easy to trace through a file until I find the data needed.
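The coordinate search can be sketched like this. It is only an illustration with a made-up function name: it packs x and y as two little-endian shorts and lists every offset where that 4-byte pattern shows up in a Buffer.

```javascript
// Hypothetical helper: find every offset where a framebuffer x,y pair
// (two little-endian uint16 values) appears in the data.
function findCoords(buf, x, y) {
  const needle = Buffer.alloc(4);
  needle.writeUInt16LE(x, 0);
  needle.writeUInt16LE(y, 2);
  const hits = [];
  let pos = buf.indexOf(needle);
  while (pos !== -1) {
    hits.push(pos);
    pos = buf.indexOf(needle, pos + 1);
  }
  return hits;
}

// Synthetic data containing the area map image coordinates (0x340, 0x100) twice
const data = Buffer.from("ffff40030001aaaa40030001", "hex");
console.log(findCoords(data, 0x0340, 0x0100));
```

A 4-byte pattern can turn up by coincidence, so each hit still needs to be checked against the length and size fields around it.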
That aside, it looks like there aren't any tricks in the PC data. You read the palette, you read the image, and all of the pixels are in order, so all you have to do is look up each pixel in the palette to draw the texture. There don't seem to be any weird optimization tricks that make parsing the data particularly difficult.
// ST08T.DAT (PC version)
// Area Map
// fp is assumed to be a Node Buffer holding the contents of ST08T.DAT,
// running somewhere that also provides a DOM (document).

var palette_ofs = 0x4a508;
var palette = new Array(16);

// Read the palette: 16 colors, one uint16 each
for (var i = 0; i < palette.length; i++) {
    palette[i] = fp.readUInt16LE(palette_ofs + (i * 2));
}

var image_ofs = 0x4a534;
var image_body = [];

// Read the image: each byte holds two 4-bit palette indices, low nibble first
for (var i = 0; i < 0x8000; i++) {
    var byte = fp.readUInt8(image_ofs + i);
    image_body.push(palette[(byte & 0xf)]);
    image_body.push(palette[(byte >> 4)]);
}

var canvas = document.createElement("canvas");
canvas.width = 256;
canvas.height = 256;
var ctx = canvas.getContext("2d");

// Convert from 15-bit color (5 bits per channel, red in the low bits)
// to 8 bits per channel and draw pixel by pixel
var ofs = 0;
for (var y = 0; y < canvas.height; y++) {
    for (var x = 0; x < canvas.width; x++) {
        var r = ((image_body[ofs] >> 0x00) & 0x1f) << 3;
        var g = ((image_body[ofs] >> 0x05) & 0x1f) << 3;
        var b = ((image_body[ofs] >> 0x0a) & 0x1f) << 3;
        // A color value of 0 is treated as fully transparent
        var a = image_body[ofs] > 0 ? 1 : 0;
        ctx.fillStyle = "rgba(" + r + "," + g + "," + b + "," + a + ")";
        ctx.fillRect(x, y, 1, 1);
        ofs++;
    }
}
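As a variation on the export above, here is a sketch that does the same 4-bit decode into a flat RGBA byte array, so it can be tried without a canvas. The function name is hypothetical, and it keeps the same assumptions as the code above: low nibble first, red in the low 5 bits, and a color value of 0 treated as transparent.

```javascript
// Hypothetical decoder: 4-bit indexed pixels plus a 16-entry palette of
// 15-bit colors in, RGBA bytes (4 per pixel) out.
function decode4bpp(pixels, palette, width, height) {
  const out = new Uint8ClampedArray(width * height * 4);
  let o = 0;
  for (let i = 0; i < (width * height) / 2; i++) {
    const byte = pixels[i];
    // Each byte holds two palette indices, low nibble first
    for (const index of [byte & 0x0f, byte >> 4]) {
      const c = palette[index];
      out[o++] = (c & 0x1f) << 3;         // red
      out[o++] = ((c >> 5) & 0x1f) << 3;  // green
      out[o++] = ((c >> 10) & 0x1f) << 3; // blue
      out[o++] = c === 0 ? 0 : 255;       // alpha
    }
  }
  return out;
}

// Tiny 4x1 test image: red, green, blue, transparent
const palette = new Array(16).fill(0);
palette[1] = 0x001f;
palette[2] = 0x03e0;
palette[3] = 0x7c00;
const rgba = decode4bpp(Uint8Array.from([0x21, 0x03]), palette, 4, 1);
console.log(Array.from(rgba));
```

In a browser, the returned array can be wrapped with new ImageData(rgba, width, height) and drawn in one call with ctx.putImageData, which avoids one fillRect per pixel.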