I am playing with the binary format of .NET assemblies, described here.
.NET metadata is stored inside regular Windows executable PE file. I explored PE file format in detail when I was working on the GenTestAsm project. There is one striking difference between PE and CLR:
PE is optimized for speed, CLR is optimized for size |
I am not sure what exactly is the reason for this. By the time they developed the CLR metadata format, disk space was not a big issue anymore. Maybe, they wanted to be able to transfer CLR files over the Internet uncompressed, or something.
In PE, all tables have fixed or almost fixed format. It is easy to find any required spot in the file without complex calculations. All addresses in the PE file are RVAs, or relative virtual addresses, which are trivially translated to memory locations when the file is loaded in memory.
In CLR metadata things are very different. Take a look at the structure of the "#~"
stream that is the central structure for metadata information (¶24.2.6). The stream is structured like this:
Stream Header |
Array of N 4-byte integers, indicating the number of rows in each present table |
Sequence of N tables |
“Tables” are lists of managed types, methods, properties, and other, more obscure things. They are truly relational tables with fixed number of columns and identical-sized rows. Each table has a number associated with it, e.g. “Methods” table has number 6. There may be up to 64 tables, although only 38 are currently defined. Not all of them must be present. Typical executable contains a dozen or so tables.
The problems start with the array of table sizes. Only actually present tables are included in the array. A 64-bit mask in the header tells us which tables are present, and which are not. So, to get a size of the methods table, which is number 6, we need to check the mask. If bit number 6 is not set, there is no methods table. If bit number 6 is set, we need to check how many of the bits 0-5 are set. This may be any number from 0 to 6. We then use this number as an index into the table sizes array.
This calculation is not trivial, and it is an order of magnitude slower that just getting element number 6 from the row sizes array. Not storing empty tables in the sizes array saves us a whopping 200 bytes in a typical executable.
But this is only the beginning. There is no direct indication of how many bytes are in each table’s row. So, even if we know that table #0 has 1 row and table #1 has 15 rows, we cannot say where exactly table #2 begins. To calculate where a particular table begins, we need to calculate row widths of all preceding tables. This means, we need to know the structure of these tables. For the methods table (which is #6) we need to know the structure of tables 0, 1, 2, 3, 4, and 5. The trouble is, tables #3 and #5 are not defined yet. A current executable will not include these, but if they ever get defined, our code reading the methods table will break!
Furthermore, while all rows in a particular table have the same width, this width may depend on a number of factors. There are three types of columns in CLR metadata table:
- Constant width column
- Single-table index
- Multi-table index
Constant width columns are easy – their width is fixed. Singe-table index may be 2 or 4 bytes, depending on the number of rows in the table it references. Multi-table index may reference one of several tables: each particular index has a specific list. Multi-table index contains table selector bits and table index bits. The size of multi-table selector fields depends on the maximum number of rows in the tables it may reference. If 16 bits is enough to code table selector and maximum possible index value in the referenced tables, then multi-table index will be 16 bits (2 bytes). Otherwise it will be 32 bits (4 bytes). The specification does not say what to do if table selector plus table index do not fit in 32 bits. Probably, they hope it can never happen.
To summarize: in order to find the beginning of the methods table, we need to:
- Determine which tables are present.
- Read number of rows in each present table.
- For each column in tables preceding the method table:
- Determine column type
- If the column is single-table or multi-table index, calculate its size based on the number of rows in referenced table(s)
- Add column widths to obtain table row widths
- Calculate total size in bytes of all preceding tables and add these sizes up
This definitely allows to save some space in the file, but it slows down the look up a lot. Now I understand why reflection APIs are so slow…