SC2Inspector - C#

Synopsis

This project was originally envisioned to be a C# version of SC2Gears. The goal was to provide an extremely detailed look at the SC2 Replay File format.

Reference

MPQ\x1B Format

Serialized Data (most up to date):

0 (ArrayWithKeys) - Base Array
|--- 0 (BinaryData) - Starcraft II Replay 11
|--- 1 (ArrayWithKeys) - Data Array
|    |--- 0 (NumberInVLF) - Unknown
|    |--- 1 (NumberInVLF) - Version: Major
|    |--- 2 (NumberInVLF) - Version: Minor
|    |--- 3 (NumberInVLF) - Version: Patch
|    |--- 4 (NumberInVLF) - Version: Build
|    `--- 5 (NumberInVLF) - Unknown, might be Revision? Seems close to Build
|--- 2 (NumberInVLF) - Unknown
`--- 3 (NumberInVLF) - Game Length (in 1/16 seconds)

Extended information, all correct except for where noted at the end.

Attribute         Location (d)  Hex Value    Interpretation
Id                x0-x3         4D 50 51 1B  MPQ\x1B
UserDataMaxSize   x4-x7         00 02 00 00  512
HeaderOffset      x8-x11        00 04 00 00  1024
UserDataSize      x12-x15       3C 00 00 00  60
DataType          x16           05           Indicates the upcoming data is of type Array with keys
NumberOfElements  x17           08           Indicates 4 elements in array (VLF)
Index             x18           00           Sets index to 0
DataType          x19           02           Indicates the upcoming data is of type Binary Data
NumberOfElements  x20           2C           Indicates 22 elements in the upcoming array
StarCraftII       x21-x42       53 74 61 72 43 72 61 66 74 20 49 49 20 72 65 70 6C 61 79 1B 31 31  A run of ASCII bytes that decodes to “StarCraft II replay\x1B11”
Index             x43           02           Sets index to 1
DataType          x44           05           Indicates the upcoming data is of type Array with keys
NumberOfElements  x45           0C           Indicates 6 elements in array (VLF)
Index             x46           00           Sets index to 0
DataType          x47           09           Indicates the upcoming data is of type VLF
Version           x48           02           Major Version
Index             x49           02           Sets index to 1
DataType          x50           09           Indicates the upcoming data is of type VLF
Version           x51           02           Minor Version
Index             x52           04           Sets index to 2
DataType          x53           09           Indicates the upcoming data is of type VLF
Version           x54           00           Patch Version
Index             x55           06           Sets index to 3
DataType          x56           09           Indicates the upcoming data is of type VLF
Version           x57           00           Revision Version
Index             x58           08           Sets index to 4
DataType          x59           09           Indicates the upcoming data is of type VLF
Version           x60           EA           Build Version
Unknown           x61-x75       FB 01 0A 09 DA F0 01 04 09 04 06 09 FE 9E 05 FB  I'm unsure after this part. Nothing seems to add up correctly.
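
The byte walk in the table can be expressed as a small recursive reader. The following is only a sketch under stated assumptions: the class and method names (SerializedDataSketch, ReadVlf, ReadValue, Parse) are hypothetical and not part of SC2Inspector, and only the three type tags that appear in the table (02, 05, 09) are handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not SC2Inspector code: a minimal serialized-data reader
// covering only the type tags seen in the table (02, 05 and 09).
public static class SerializedDataSketch
{
    // VLF integer: 7 payload bits per byte, lowest group first. A set high bit
    // means another byte follows. Bit 0 of the assembled value is the sign.
    public static long ReadVlf(BinaryReader reader)
    {
        long raw = 0;
        int shift = 0;
        while (true)
        {
            byte b = reader.ReadByte();
            raw |= (long)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        long magnitude = raw >> 1;
        return (raw & 1) != 0 ? -magnitude : magnitude;
    }

    // Reads one type-tagged value and returns a byte[], a long, or a keyed dictionary.
    public static object ReadValue(BinaryReader reader)
    {
        byte type = reader.ReadByte();
        switch (type)
        {
            case 0x02: // Binary Data: VLF length, then that many raw bytes
                return reader.ReadBytes((int)ReadVlf(reader));
            case 0x05: // Array with keys: VLF count, then (VLF key, tagged value) pairs
                int count = (int)ReadVlf(reader);
                var map = new Dictionary<long, object>();
                for (int i = 0; i < count; i++)
                {
                    long key = ReadVlf(reader);
                    map[key] = ReadValue(reader);
                }
                return map;
            case 0x09: // Number in VLF
                return ReadVlf(reader);
            default:
                throw new NotSupportedException("Unhandled type tag 0x" + type.ToString("X2"));
        }
    }

    // Convenience wrapper for parsing an in-memory buffer.
    public static object Parse(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return ReadValue(reader);
        }
    }
}
```

Run over the 60 user-data bytes listed in the table (x16 onward, leaving out the final FB, which appears to fall outside UserDataSize), this sketch should consume the buffer exactly. It reads EA FB 01 as a single three-byte VLF, giving 16117 for key 4 of the data array and 15405 for key 5, then 2 and 42943 for keys 2 and 3 of the base array. That suggests the row marked Unknown is the tail of the build number followed by the remaining keys.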

replay.details Format

0 (ArrayWithKeys) - Base array
|--- 0 (SimpleArray) - Array of player structs
|    `--- x (ArrayWithKeys) - Player struct
|         |--- 0 (BinaryData) - Player name
|         |--- 1 (ArrayWithKeys) - Probably some further details
|         |    |--- 0 (NumberInVLF) - Unknown
|         |    |--- 1 (NumberOfFourBytes) - Unknown
|         |    |--- 2 (NumberInVLF) - Unknown
|         |    `--- 4 (NumberInVLF) - RealID
|         |--- 2 (BinaryData) - Localized race name
|         |--- 3 (ArrayWithKeys) - Array of player color values
|         |    |--- 0 (NumberInVLF) - Alpha
|         |    |--- 1 (NumberInVLF) - Red
|         |    |--- 2 (NumberInVLF) - Green
|         |    `--- 3 (NumberInVLF) - Blue
|         |--- 4 (NumberInVLF) - Unknown
|         |--- 5 (NumberInVLF) - Unknown
|         |--- 6 (NumberInVLF) - Handicap
|         |--- 7 (NumberInVLF) - Team
|         `--- 8 (NumberInVLF) - Unknown
|--- 1 (BinaryData) - Localized map name
|--- 2 (BinaryData) - Unknown
|--- 3 (ArrayWithKeys) - Array containing map preview file names
|    `--- 0 (BinaryData) - Map preview file name
|--- 4 (NumberOfOneByte) - Unknown
|--- 5 (NumberInVLF) - Save time of the replay
|--- 6 (NumberInVLF) - Unknown
|--- 7 (BinaryData) - Unknown
|--- 8 (BinaryData) - Unknown
|--- 9 (BinaryData) - Unknown
|--- 10 (SimpleArray) - Likely something about the map file
|    |--- 0 (BinaryData) - Unknown
|    `--- 1 (BinaryData) - Unknown
|--- 11 (NumberOfOneByte) - Unknown
|--- 12 (NumberInVLF) - Unknown
`--- 13 (NumberInVLF) - Unknown

replay.attributes.events Sample Data

0x0BBF
    0x1 - Part
    0x2 - Part
0x01F4
    0x1 - Humn
    0x2 - Humn
0x0BB9
    0x1 - Zerg
    0x2 - Terr
0x07DC
    0x1 - T3
    0x2 - T1
0x07E2
    0x1 - T3
    0x2 - T4 
0x07D7
    0x1 - T1
    0x2 - T2
0x0BBB
    0x1 -  100
    0x2 -  100
0x07D6
    0x1 - T3
    0x2 - T4
0x07D2
    0x1 - T2
    0x2 - T1
0x0BBA
    0x1 - tc04
    0x2 - tc07
0x07D8
    0x1 - T1
    0x2 - T2
0x07D4
    0x1 - T1
    0x2 - T2
0x07D3
    0x1 - T2
    0x2 - T2
0x0BBC
    0x1 - Medi
    0x2 - Medi
0x07DB
    0x1 - T1
    0x2 - T2
0x07D5
    0x1 - T1
    0x2 - T2
0x03E8
    0x10 - Dflt
0x0BC0
    0x1 - Obs
    0x2 - Obs
0x0BC2
    0x10 - yes
0x07E1
    0x1 - T3
    0x2 - T4
0x07D0
    0x10 - t2

Project Log

1/2/2012 @ 05:26

Ok so I've been working on this most of the day. I've finished the InitData file as well as the AttributesEvents file! It looks like I'm 100% done with the game metadata about the players. Here's all the information I have: SC2Inspector ReplayDetails Locals

The attributes.events file has kind of an interesting format. This one bit of code pretty much does all the work:

// Attribute count; each record that follows is 13 bytes (4 + 4 + 1 + 4).
uint NumAttribs = BinaryReader.ReadUInt32();
if (NumAttribs == 0) {
	throw new Exception("Zero attributes.");
}
uint AttribHeader;
uint AttribId;
int PlayerId;
string AttribVal;
for (uint i = 0; i < NumAttribs; i++) {
	// The header is read to stay aligned with the record layout; it is not used.
	AttribHeader = BinaryReader.ReadUInt32();
	AttribId = BinaryReader.ReadUInt32();
	PlayerId = BinaryReader.ReadByte();
	// The 4-byte value is stored reversed and NUL-padded.
	AttribVal = Conversion.ReverseString(Encoding.Default.GetString(BinaryReader.ReadBytes(4))).Replace("\0", String.Empty);
	if (!AttribDict.ContainsKey(AttribId)) {
		AttribDict.Add(AttribId, new Dictionary<int, string>());
	}
	// Outer key: attribute id. Inner key: player id.
	AttribDict[AttribId].Add(PlayerId, AttribVal);
}

I run that code after I read four bytes from the beginning. It splits everything out into what can be used as a multidimensional associative array. Here's some sample data: replay.attributes.events Sample Data. There's some information in there which I'm not sure about either. Either way this segment is done. I FINALLY think that tomorrow I can go ahead and start on replay.events!
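
To make the "multidimensional associative array" idea concrete, here is a small sketch of decoding one attribute value and looking it up by attribute id and player id. The names AttributeSketch, DecodeValue and Lookup are hypothetical and do not come from SC2Inspector, and ASCII is assumed for the four value bytes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not SC2Inspector code.
public static class AttributeSketch
{
    // Values are stored as four bytes in reverse order, padded with NULs
    // (assumption: the bytes are plain ASCII).
    public static string DecodeValue(byte[] raw)
    {
        byte[] copy = (byte[])raw.Clone();
        Array.Reverse(copy);
        return Encoding.ASCII.GetString(copy).Replace("\0", string.Empty);
    }

    // Returns the value for (attribId, playerId), or null when either key is missing.
    public static string Lookup(Dictionary<uint, Dictionary<int, string>> attribs,
                                uint attribId, int playerId)
    {
        Dictionary<int, string> perPlayer;
        string value;
        if (attribs.TryGetValue(attribId, out perPlayer) &&
            perPlayer.TryGetValue(playerId, out value))
        {
            return value;
        }
        return null;
    }
}
```

With the sample data, a lookup of attribute 0x0BB9 for player 1 would be expected to return "Zerg". Using TryGetValue instead of the indexer avoids an exception when a replay lacks an attribute.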

Time to update this table:

Filename                  Purpose
replay.details            Contains sometimes inaccurate (?) data about players including their name, RealId, Race, Map name, save time, etc.
replay.initData           Contains information about who is playing (names), as well as the Realm, Map hash, and some sort of account identifier.
replay.attributes.events  Contains detailed information about the players, their race, difficulty, color, team info, game speed, etc.
replay.game.events        Actions
replay.message.events     Chat, Ping
replay.smartcam.events    Presumably player cameras
replay.sync.events        Presumably consistency checks

Committed r7.

12/31/2011 @ 03:12

Ok so I've been slacking on my documentation. I have fully parsed all of the documented fields in the replay.details file. Here is what the output looks like: ReplayDetails Locals Window

Wow ok, so after a 45-minute battle of trying to get that file to upload with the new Dokuwiki install, we're back on track. I ran into an issue with ParseVLFNumber() where it wouldn't spit out (what I expected to be) a VERY large number representing the timestamp of the replay. I went around and around and decided that the issue was that ParseVLFNumber() (modified from sc2replay-csharp) was using ints for everything. Obviously this number was too big for an int, so I modified ParseVLFNumber() to use longs for everything. I don't expect to see a number bigger than a long, but who knows. That allowed me to pull out the date successfully. I had some issues determining whether the timestamp was UTC or the user's local time. I discovered the timestamp is UTC and that there is an additional timezone field which tells how much of a UTC offset the recorder had.
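
For illustration, a long-based VLF reader might look like the sketch below. This is not the modified ParseVLFNumber() itself: the names VlfSketch and ParseVlfInt64 are hypothetical, and it uses shifts instead of Math.Pow. An int tops out at 2,147,483,647, so any VLF value that needs five or more bytes cannot be assembled in one.

```csharp
using System;
using System.IO;

// Hypothetical helper, not the project's ParseVLFNumber().
public static class VlfSketch
{
    public static long ParseVlfInt64(BinaryReader reader)
    {
        long raw = 0;
        int shift = 0;
        while (true)
        {
            byte b = reader.ReadByte();
            // Low 7 bits are payload; the first byte is the least significant group.
            raw |= (long)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // clear high bit marks the last byte
            shift += 7;
            // Nine groups of 7 bits fill the 63 value bits of a long.
            if (shift > 56) throw new InvalidDataException("VLF value does not fit in a long.");
        }
        // Bit 0 is the sign flag; the remaining bits are the magnitude.
        long magnitude = raw >> 1;
        return (raw & 1) != 0 ? -magnitude : magnitude;
    }

    // Convenience overload for an in-memory buffer.
    public static long ParseVlfInt64(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return ParseVlfInt64(reader);
        }
    }
}
```

The guard on shift is a design choice for this sketch: it fails loudly on a malformed or oversized value instead of silently wrapping, which is what made the int version hard to diagnose.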

I also modified InspectorViewModel and ReplayViewModel to allow for multiple replays to be loaded. I'm very happy with how fast the program is (not that the files are that big, but it is complex) at the moment. I've finished replay.details. I think I need to move on to InitData next, but I'm not sure.

Because I went back and modified the MPQ\x1B UserData retrieval, I was able to successfully extract the actual game playtime. The value is given in sixteenths of a second. I do not know where this value comes from. I do know that the current version of Sc2gears (8.8) is displaying incorrect game lengths. It seems to be treating each unit as ~1/22 of a second. I verified this by viewing a replay in the StarCraft II client itself. The game time is 20:02 and Sc2gears reports it as 14:28. Weird.
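
The arithmetic behind the ~1/22 estimate can be checked with a short sketch. The names GameLengthSketch, ToTimeSpan and ImpliedUnitsPerSecond are hypothetical, and the raw value 19232 is an assumption worked backwards from the 20:02 shown in the client (1202 s × 16), not a number read from the file.

```csharp
using System;

// Hypothetical helper, not SC2Inspector code.
public static class GameLengthSketch
{
    // The header stores game length in sixteenths of a second.
    public const double UnitsPerSecond = 16.0;

    public static TimeSpan ToTimeSpan(long rawLength)
    {
        return TimeSpan.FromSeconds(rawLength / UnitsPerSecond);
    }

    // How many raw units per second a displayed duration implies.
    public static double ImpliedUnitsPerSecond(long rawLength, TimeSpan displayed)
    {
        return rawLength / displayed.TotalSeconds;
    }
}
```

Under that assumption, 19232 units convert to 20:02 at 16 per second, while the 14:28 (868 s) shown by Sc2gears implies 19232 / 868 ≈ 22.16 units per second.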

Committed r6.

12/31/2011 @ 00:16

I've taken some time to update the wiki page. I've added the references section and gone back and converted the old MPQ\x1B information from raw byte “queries” into the serialized data parser. Everything is MUCH cleaner now.

12/30/2011 @ 20:22

I've been successful so far in decrypting the replay.details file. I spent some time and developed a serialized data parser. I've been able to piece together the format of the replay.details file: replay.details File Format

12/30/2011 @ 18:03

I've finally gotten to the point where I've read and enumerated all of the files in the MPQ. I think I did a really good job laying out my classes. I have a MPQArchive class which looks like this (displaying listfile contents): MPQArchive Locals Window

Now I have to determine what exactly is in each replay.* file.

Filename                  Purpose
replay.details            Basic metadata
replay.initData           Unknown
replay.attributes.events  Unknown
replay.game.events        Actions
replay.message.events     Chat, Ping
replay.smartcam.events    Presumably player cameras
replay.sync.events        Presumably consistency checks

12/30/2011 @ 00:39

Ok well I took a short break to hang out with my roommates but I've finally been able to retrieve the raw uncompressed data for each file! I'm fairly sure this means that the next step is to start parsing the actual SC2 data! Right here I'd like to give a big shout out to Foole, the author of “mpqtool” (http://code.google.com/p/mpqtool/). This stuff is very complex and I don't understand a lot about how to decrypt the hash tables and such. The code from his tool has helped me IMMENSELY. Looking at his copyright it looks like most of his code is based off StormLib by Ladislav Zezula. Thanks to both of you!

I've spent some time getting my comments up to date. I've also begun using SharpZipLib (http://www.icsharpcode.net/opensource/sharpziplib/) to do the BZip2/GZip decompression. I went into the project wanting to do everything myself and I think I've done well so far. Writing a decompressor for BZip2/GZip would be a project in itself. I think it's good that I've done the MPQ stuff myself. I can make changes should Blizzard change the format of the SC2Replay file.

Once again I'm looking over my code and it doesn't seem like it's TOO complicated. I've gotten very good at manipulating bytes and streams. Actually, last night I discovered the Hexadecimal display for Locals/Watches, which has been immensely helpful for looking over the variable values and comparing them to the actual file's data. I have been using “(listfile)” as the file to test my decryption and everything else on. The file is very short (only a few hundred bytes). I extracted the file with Ladik's MPQ Editor and looked at the data in a hex editor. I compared the data to the RawContents variable that is supposed to have the decompressed contents of the file. They matched! I then changed the file I'd been retrieving to the one with all the data: “replay.game.events”. I was very excited to see that the length of the byte[] matched the size of the file and that the first several bytes match as well as the last several bytes. I think that's enough to call this portion of the project done. The only things I expect to have to deal with in MPQ files are maybe playing with “(listfile)” or “(attributes)”.

I'm going to take a short break then see what I can decipher from the other replay.game.* files. Committed r5.

12/29/2011 @ 20:44

Well I've been working on this for about an hour and I decided to redo the block stuff. Now both the BlockTable and the HashTable are Hashtables with Hash and Block objects stored within. This makes it easier to resolve the HashTable entries to their BlockTable entries. I've been checking all my numbers against “Ladik's MPQ Editor” (amazing tool) and everything is coming out perfectly. The next step is to read the actual file information. I'm tempted to make a MPQFile class but I think I'll store all the data within MPQBlock and keep everything in one place (this already has file flags, compressed size, location, etc). Onward! Committed r3.

12/29/2011 @ 05:57

Finally making some promising progress! I believe I last left off where I had just finished parsing MPQ\x1B and the User Data. I've since made quite a bit of progress parsing MPQ\x1A (forever after known as “The MPQ File”). I was able to successfully parse the Header from the MPQ file, which told me where and how long the Hash Table and Block Tables were. I was then able to decipher the Hash Table and Block Tables. I was THEN successfully able to take a filename (for instance “(listfile)”), hash it, compare the hash to the hash table and find its entry. I believe I can then use the BlockIndex in the Hash Table and look it up in the Block Table to find the file's location, which can then be “cut out” and decrypted or decompressed.

I've used more code than I'd like from other people, but most of it is bit-shifting magic that is beyond me, mainly the Hash Table encryption/decryption. I think I'm finally getting close to the actual SC2 stuff. I used MPQExtractor to take a look at some of the replay.whatever files and they're in their own internal format, which should be a daunting task to figure out. I just added a bunch of useful resources in case I can't find them again.

I feel like I should be writing more, but when I look over the code it's pretty simple in hindsight. It's simply using BinaryReader to read a bunch of numbers from a binary file. The only complex parts are the decryption parts which I find very confusing. I think I'll write a BROAD bit of pseudo-code about how this thing works:

Open SC2Replay file
Read first 4 bytes
If first 4 bytes are MPQ\x1B Then
	Read User Data from MPQ1B header (Version, etc), most importantly the Header Offset (usually 1024)
End If
Advance Byte Buffer to Header Offset (1024)
Read next 4 bytes
If next 4 bytes are MPQ\x1A Then
	Read information about MPQ1A header (ArchiveSize, MPQVersion, HashTable & BlockTable Size & Position, etc)
End If
Advance Byte Buffer to beginning of HashTable
Read next (HashTableSize * 16) Bytes
Decrypt HashTable
Read the following over and over (16 bytes total) from the decrypted stream: Name1, Name2, Locale, BlockIndex
Store the above in a C# HashTable so we can easily access it
Advance Byte Buffer to beginning of BlockTable
Read next (BlockTableSize * 16) Bytes
Decrypt BlockTable
Read the following over and over (16 bytes total) from the decrypted stream: FileOffset, (compute FilePos), Compressed Size, FileSize, Flags
Store the above in a C# List so we can easily access it
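
The first steps of the outline above (check the MPQ\x1B magic, then pull out the Header Offset) can be sketched as follows. The class name UserDataHeaderSketch is hypothetical and not from SC2Inspector. The field names and sizes follow the MPQ\x1B table in the Reference section, and the sketch relies on BinaryReader reading integers as little-endian.

```csharp
using System;
using System.IO;

// Hypothetical helper, not SC2Inspector code: reads the 16-byte MPQ\x1B prefix.
public sealed class UserDataHeaderSketch
{
    public uint UserDataMaxSize;
    public uint HeaderOffset;   // where the MPQ\x1A archive header starts
    public uint UserDataSize;   // length of the serialized user data that follows

    public static UserDataHeaderSketch Read(BinaryReader reader)
    {
        byte[] magic = reader.ReadBytes(4);
        if (magic.Length != 4 || magic[0] != 0x4D || magic[1] != 0x50 ||
            magic[2] != 0x51 || magic[3] != 0x1B)
        {
            throw new InvalidDataException("Stream does not start with MPQ\\x1B.");
        }
        // Object initializers run in the order written, matching the file layout.
        return new UserDataHeaderSketch
        {
            UserDataMaxSize = reader.ReadUInt32(),
            HeaderOffset = reader.ReadUInt32(),
            UserDataSize = reader.ReadUInt32()
        };
    }
}
```

After this call the reader sits at the start of the user data. Seeking the underlying stream to HeaderOffset would then position it at the MPQ\x1A header.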

In order for the Hashtable to be super optimized, the filename is converted to a number (hash), which is then stored. The hash table entry for that hash records which BlockIndex the file's block data resides in. The BlockIndex then tells us where the data for the filename is in the file. It's also important to note that there are apparently several different Hash Types.

From what I can determine:

Hash Type  Purpose
0x000      Hashes an Index
0x100      Hashes Name1
0x200      Hashes Name2
0x300      Hashes Table Data

I'm not 100% sure where these Hash Types factor into the encryption. There appear to be two seeds and then the Hash Type is the offset for the hash. I don't know what this means. See SC2Inspector.MPQLogic.MPQUtilities.HashString(string input, int offset).
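
As a possible reading of the "offset", the sketch below follows the widely documented MPQ string hash from the StormLib lineage, where the Hash Type selects a 0x100-entry slice of a 0x500-entry lookup table. This is an assumption about what HashString does internally, not a copy of the project's code. The names MpqHashSketch, MpqHashEntrySketch and FindBlockIndex are hypothetical, and file names are assumed to be ASCII.

```csharp
using System;

// Hypothetical types, not SC2Inspector code.
public struct MpqHashEntrySketch
{
    public uint Name1;
    public uint Name2;
    public uint BlockIndex;
}

public static class MpqHashSketch
{
    private static readonly uint[] CryptTable = BuildCryptTable();

    // Builds the 0x500-entry table shared by hashing and table decryption.
    private static uint[] BuildCryptTable()
    {
        var table = new uint[0x500];
        uint seed = 0x00100001;
        for (uint index1 = 0; index1 < 0x100; index1++)
        {
            uint index2 = index1;
            for (int i = 0; i < 5; i++, index2 += 0x100)
            {
                seed = (seed * 125 + 3) % 0x2AAAAB;
                uint high = (seed & 0xFFFF) << 16;
                seed = (seed * 125 + 3) % 0x2AAAAB;
                uint low = seed & 0xFFFF;
                table[index2] = high | low;
            }
        }
        return table;
    }

    // hashType is 0x000, 0x100, 0x200 or 0x300 and offsets into the table.
    public static uint HashString(string input, uint hashType)
    {
        uint seed1 = 0x7FED7FED;
        uint seed2 = 0xEEEEEEEE;
        unchecked // the algorithm relies on 32-bit wraparound
        {
            foreach (char c in input.ToUpperInvariant())
            {
                uint ch = (byte)c; // assumes ASCII file names
                seed1 = CryptTable[hashType + ch] ^ (seed1 + seed2);
                seed2 = ch + seed1 + seed2 + (seed2 << 5) + 3;
            }
        }
        return seed1;
    }

    // Resolves a file name to its BlockIndex, or -1 when absent.
    // Assumes the hash table length is a power of two.
    public static long FindBlockIndex(MpqHashEntrySketch[] hashTable, string fileName)
    {
        uint mask = (uint)hashTable.Length - 1;
        uint start = HashString(fileName, 0x000) & mask; // starting slot
        uint name1 = HashString(fileName, 0x100);        // first verification hash
        uint name2 = HashString(fileName, 0x200);        // second verification hash
        for (uint step = 0; step < hashTable.Length; step++)
        {
            MpqHashEntrySketch entry = hashTable[(start + step) & mask];
            if (entry.BlockIndex == 0xFFFFFFFF) break; // never-used slot ends the probe
            if (entry.Name1 == name1 && entry.Name2 == name2) return entry.BlockIndex;
        }
        return -1;
    }
}
```

In this scheme type 0x000 picks the starting slot, 0x100 and 0x200 produce the Name1 and Name2 values compared against each entry, and 0x300 derives the keys used to decrypt the tables by hashing the literal names "(hash table)" and "(block table)".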

It seems like I've done an incredible amount of work in just one day. I've been at this for over 12 hours it looks like. Oh well, I'm off to bed since it's 6am. I _REALLY_ hope I resume this project tomorrow. Not even going to close all my VS/Chrome windows.

Committed r2.

12/28/2011 @ 23:46

Ok, so I've gotten a decent amount done. I was looking at code for an MPQ parser and it looks like the SC2Replay files have a slightly different format. They start with MPQ\x1B then 1024 bytes into the file have another MPQ\x1A which actually starts the normal MPQ file. MPQ1B seems to be a StarCraft II only option for displaying additional metadata without having to read the file.

Using the following data taken from a random SC2Replay:

I was able to determine the following: MPQ\x1B Format

VLF represents a “Variable Length Format” integer.

Additionally, SC2Replay files have a quirk concerning the way integers are stored. An integer consists of a variable number of bytes in Big Endian order. When parsing an integer, the first i.e. most significant bit of a byte indicates that the succeeding byte is counted towards the integer's value. After parsing all bytes of a number, the least significant bit of the result indicates the sign. Extract this bit and shift the number's value to the right by one. If the bit is set, change the sign to negative, otherwise leave it positive.
Source: http://trac.erichseifert.de/warp/wiki/SC2ReplayFormat#VariableLengthFormat

I've taken the following code from a C# SC2Replay client to do this VLF for me:

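private static int ParseVLFNumber(BinaryReader reader) {
	var bytes = 0;       // index of the byte being read (each carries 7 payload bits)
	var first = true;
	var number = 0;      // int arithmetic: values needing more than 31 bits overflow
	var multiplier = 1;  // becomes -1 when the sign bit is set
	while (true) {
		var i = reader.ReadByte();
		// Low 7 bits are payload; the first byte read is the least significant.
		number += (i & 0x7F) * (int)Math.Pow(2, bytes * 7);
		if (first) {
			// Bit 0 of the first byte is the sign flag; clear it so the final divide is exact.
			if ((number & 1) != 0) {
				multiplier = -1;
				number--;
			}
			first = false;
		}
		// A clear high bit marks the last byte.
		if ((i & 0x80) == 0) {
			break;
		}
		bytes++;
	}
	// Drop the sign bit and apply the sign.
	return (number / 2) * multiplier;
}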

This took almost five hours to decipher with a LOT of help from various sources around the internet. I can't find it anymore but I thought I read somewhere that the game length was supposed to be in the header, but this could be incorrect. I would think the game length would be with the game recording date, players, colors, etc.

Next step is to work on extracting the different files from the ACTUAL MPQ (MPQ\x1A) then parse through the details there.

12/28/2011 @ 18:31

Decided to start on this project. I've added the ViewModelBase and made some changes to App.xaml and App.xaml.cs. Committed r1.

Resources