Comparing .NET XML Serializers: Part One
Introduction
I was working on a large .NET application (.NET 3.5 SP1), and we began to argue about which XML serializer is better for saving application state: user preferences, open windows, queries, and the like. The power of serializers and the quality of the resulting XML are best investigated on actual samples. I have built a small engine that runs each serializer on a set of samples, captures the resulting XML or exception, and saves them in an XML output file. The results of this work are below.
Upcoming Parts
This part contains mostly pure facts and very little analysis. I started to write a long text with insights and recommendations, which grew longer and longer, until I realized I was having a bad case of scope creep. Thus, I decided to apply the agile approach and release the first part without further delay.
XML Serializers Used
Starting with .NET 3.0, there are at least two other serializers available for general use besides XmlSerializer:
Class Name | .NET version | Component |
---|---|---|
XmlSerializer | 1.0 | General |
XamlWriter / XamlReader | 3.0 | WPF |
DataContractSerializer | 3.0* | WCF |
* In .NET 3.5 SP1 Microsoft made a significant change in the behavior of DataContractSerializer.
Qualities Researched
We were interested primarily in these things:
- Power: what can be serialized and what cannot.
- Versioning: how difficult it is to change serialized classes.
- Elegance: how pleasant the resulting XML is to look at (yes, this is subjective).
Serialization Samples
The WCF serializer (DataContractSerializer) is run in two modes: the "ref" mode that allows for cycles in the object graph, and the regular mode that does not permit cycles. The following table summarizes the results for .NET 3.5 SP1. The version matters, because the XAML serializer has much better generics support in .NET 4.0.
Samples used to be here. They were lost when the web site moved between servers. I am working on restoring the samples.
Brief Analysis of the Serialization Samples
XML Serializer
Good:
By decorating your classes and properties with custom attributes, you can achieve remarkable control over the generated XML.
XmlSerializer is available in all versions of .NET and is familiar to most developers.
There is decent versioning support: you can achieve backward compatibility by using old XML attribute names, and when everything else fails, you can do custom serialization with IXmlSerializable.
Null properties are not written to the document by default.
Bad:
Out of the box, XmlSerializer cannot serialize dictionaries, generic or otherwise, but one can use an XML-serializable dictionary class.
By default, XmlSerializer emits elements for properties of built-in types. They must be decorated with the [XmlAttribute] custom attribute to convert them to attributes and produce shorter XML.
Many built-in classes such as Color are either serialized as a long list of properties (WPF Color) or cannot be serialized at all (System.Drawing.Color).
Complex types cannot be serialized to attributes.
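To make the attribute-related points above concrete, here is a minimal sketch. The WindowSettings class and its XML names are hypothetical and are not taken from my original samples. It shows [XmlAttribute] turning simple properties into attributes, an old XML name kept for a renamed property, and a null property being left out of the output.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical settings class, invented for illustration only.
[XmlRoot("window")]
public class WindowSettings
{
    // Written as attributes instead of child elements.
    [XmlAttribute("width")]
    public int Width { get; set; }

    [XmlAttribute("height")]
    public int Height { get; set; }

    // The property is called Title in code, but keeps the older XML name
    // "caption", so documents written by an earlier version still load.
    [XmlElement("caption")]
    public string Title { get; set; }

    // Left null in Main, so no <Comment> element is written.
    public string Comment { get; set; }
}

public static class XmlSerializerDemo
{
    public static string Serialize(WindowSettings settings)
    {
        var serializer = new XmlSerializer(typeof(WindowSettings));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, settings);
            return writer.ToString();
        }
    }

    public static WindowSettings Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(WindowSettings));
        using (var reader = new StringReader(xml))
        {
            return (WindowSettings)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var settings = new WindowSettings { Width = 800, Height = 600, Title = "Queries" };
        Console.WriteLine(Serialize(settings));
    }
}
```

The printed document should carry width and height as attributes on the root element and a single caption child element. Making a dictionary or a Color value fit this scheme would still need a wrapper type or IXmlSerializable.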
XAML Serializer
Good:
Properties are serialized as attributes by default. This applies even to complex types if they have a suitable converter. This leads to more elegant XML.
Bad:
Generic collections are serialized poorly. Sometimes XAML serializer will produce XML it cannot read. Most of this is fixed in .NET 4.0.
Versioning support is poor. When people talk about versioning XAML, they mean writing a XAML document in such a way that old readers would understand it. The task of reading a document produced by an old writer is not addressed. If a class does not have a property mentioned in the document, this causes an exception. There is no way to rename the property or change the manner in which it is (de)serialized. Custom serialization is not supported.
XAML serializer tightly couples XML with exact .NET types. It is impossible to rename a class or put it in a different CLR namespace without breaking backward compatibility.
Null properties are written to the document unless decorated with the [DefaultValue(null)] attribute.
DataContractSerializer
Good:
This is the most powerful serializer of all. It can even serialize object graphs with cycles (in "ref" mode).
There is significant control over names of elements and their namespaces, albeit not as strong as with XmlSerializer, but enough for versioning purposes.
If you implement IExtensibleDataObject, the deserializer will put all "extra" fields in the catch-all ExtensionData property. This helps versioning a whole lot.
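The two strongest points above, cycles and ExtensionData, can be sketched as follows. All type names here (Node, PrefsV1, PrefsV2) are hypothetical. I also assume that cycles are enabled through [DataContract(IsReference = true)], which exists since .NET 3.5 SP1; the "ref" mode in my test runs may have been switched on differently, for example through the preserveObjectReferences constructor argument.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// IsReference = true makes the serializer emit id/ref markers,
// so the parent <-> child cycle survives a round trip.
[DataContract(IsReference = true, Namespace = "")]
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Parent { get; set; }
    [DataMember] public Node Child { get; set; }
}

// "Old" version of a contract: knows only Name, but keeps unknown members.
[DataContract(Name = "Prefs", Namespace = "")]
public class PrefsV1 : IExtensibleDataObject
{
    [DataMember] public string Name { get; set; }
    public ExtensionDataObject ExtensionData { get; set; }
}

// "New" version of the same contract with one extra member.
[DataContract(Name = "Prefs", Namespace = "")]
public class PrefsV2
{
    [DataMember] public string Name { get; set; }
    [DataMember] public string Theme { get; set; }
}

public static class DataContractDemo
{
    public static string Write<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static T Read<T>(string xml)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (T)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        // Cycle: parent points to child and child points back to parent.
        var parent = new Node { Name = "parent" };
        parent.Child = new Node { Name = "child", Parent = parent };
        var copy = Read<Node>(Write(parent));
        Console.WriteLine(ReferenceEquals(copy, copy.Child.Parent));

        // A new writer produces Theme; the old reader does not know it,
        // yet writes it back out unchanged thanks to ExtensionData.
        var newXml = Write(new PrefsV2 { Name = "main", Theme = "dark" });
        var oldObject = Read<PrefsV1>(newXml);
        Console.WriteLine(Write(oldObject));
    }
}
```

The second half is the scenario that matters for application state: an older build opens a file saved by a newer build and saves it again without losing the fields it does not understand.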
Bad:
There are significant differences in behavior depending on the way the type under serialization is decorated. For [DataContract], only fields and properties marked with [DataMember] are serialized. For [Serializable], all fields are serialized by default.
Starting from .NET 3.5 SP1, non-decorated types can be serialized as well, but there is very little control over this serialization. Also, one can neither disable this behavior nor choose to use an older version of the serializer. If your application can run on .NET 3.5 or .NET 3.5 SP1, getting consistent results may be a little difficult.
The XML produced is verbose and sometimes peppered with namespace declarations. Also, the serializer makes elements, not attributes, out of properties.
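The differences between the three decoration styles are easier to see side by side. The sketch below uses three hypothetical classes with the same two fields. It assumes .NET 3.5 SP1 or later, because the non-decorated PlainState type is rejected by the original .NET 3.0 and 3.5 serializer.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Opt-in: only members marked with [DataMember] are written.
[DataContract(Namespace = "")]
public class OptInState
{
    [DataMember] public string Query = "select 1";
    public string Scratch = "not serialized";
}

// [Serializable]: every field is written, private ones included.
[Serializable]
public class AllFieldsState
{
    public string Query = "select 1";
    private string scratch = "serialized too";
}

// No decoration at all (3.5 SP1+): public fields and public
// read/write properties are written, private members are skipped.
public class PlainState
{
    public string Query = "select 1";
    private string scratch = "skipped";
}

public static class DecorationDemo
{
    public static string Write(object value)
    {
        var serializer = new DataContractSerializer(value.GetType());
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        Console.WriteLine(Write(new OptInState()));
        Console.WriteLine(Write(new AllFieldsState()));
        Console.WriteLine(Write(new PlainState()));
    }
}
```

I used plain fields rather than auto-properties on purpose: with [Serializable], the compiler-generated backing fields of auto-properties are what gets serialized, and their mangled names make the XML even less pleasant.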
Serialization Engine
The serialization engine that produced the above table (with some XSLT) turned out to be quite an interesting program. You can download the source here: XmlSerializersTest.zip (45K). The application produces a large XML file with sample source and serialization results from all serializers. This file can then be transformed into a table via XSLT.
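For readers who do not want to download the archive, the core idea can be sketched in a few dozen lines. This is not the code from XmlSerializersTest.zip: the class name, the element names of the report, and the choice of samples are all made up for illustration, and the XAML serializer is left out so that the sketch does not depend on the WPF assemblies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml.Linq;
using System.Xml.Serialization;

public static class SerializerComparison
{
    // Display name -> function that turns an object into XML text.
    static readonly Dictionary<string, Func<object, string>> Serializers =
        new Dictionary<string, Func<object, string>>
        {
            { "XmlSerializer", WithXmlSerializer },
            { "DataContractSerializer", WithDataContractSerializer }
        };

    static string WithXmlSerializer(object sample)
    {
        var serializer = new XmlSerializer(sample.GetType());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, sample);
            return writer.ToString();
        }
    }

    static string WithDataContractSerializer(object sample)
    {
        var serializer = new DataContractSerializer(sample.GetType());
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, sample);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    // Runs every serializer on every sample. A failure is recorded
    // in the report instead of aborting the whole run.
    public static XElement Run(IEnumerable<object> samples)
    {
        var report = new XElement("results");
        foreach (var sample in samples)
        {
            var sampleNode = new XElement("sample",
                new XAttribute("type", sample.GetType().FullName));
            foreach (var entry in Serializers)
            {
                var resultNode = new XElement("result",
                    new XAttribute("serializer", entry.Key));
                try
                {
                    resultNode.Add(new XElement("xml", entry.Value(sample)));
                }
                catch (Exception ex)
                {
                    resultNode.Add(new XElement("error", ex.GetType().Name));
                }
                sampleNode.Add(resultNode);
            }
            report.Add(sampleNode);
        }
        return report;
    }

    public static void Main()
    {
        var samples = new object[]
        {
            new List<int> { 1, 2, 3 },
            new Dictionary<string, int> { { "a", 1 } }
        };
        // report.Save(path) would write the same content to a file for XSLT.
        Console.WriteLine(Run(samples));
    }
}
```

With these two samples, the dictionary entry should show an error for XmlSerializer and regular XML for DataContractSerializer, which matches the dictionary limitation noted earlier.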
Conclusion
Based on this brief analysis, I think it is clear that XAML serializer is not suitable for serializing application state, mostly due to versioning issues and limitations on generics (prior to .NET 4.0).
Whether you use XmlSerializer or DataContractSerializer depends on a number of factors and personal preferences. We will look at those in further detail in the next parts.