Comparing .NET XML Serializers: Part One

Introduction

I was working on a large .NET application (.NET 3.5 SP1), and we began to argue about which XML serializer is better for saving application state: user preferences, open windows, queries, and the like. The power of a serializer and the quality of the XML it produces are best investigated on actual samples. I built a small engine that runs each serializer on a set of samples, captures the resulting XML or exception, and saves them in an XML output file. The results of this work are below.

Upcoming Parts

This part contains mostly pure facts and very little analysis. I started to write a long text with insights and recommendations, which grew longer and longer, until I realized I had a bad case of scope creep. Thus, I decided to apply the agile approach and release the first part without further delay.

XML Serializers Used

Starting with .NET 3.0 there are at least two other serializers available for general use besides XmlSerializer:

Class Name                 .NET version   Component
XmlSerializer              1.0            General
XamlWriter / XamlReader    3.0            WPF
DataContractSerializer     3.0*           WCF

* In .NET 3.5 SP1, Microsoft made a significant change in the behavior of DataContractSerializer.

Qualities Researched

We were interested primarily in these things:

  • Power: what can be serialized and what cannot.
  • Versioning: how difficult it is to change serialized classes.
  • Elegance: how pleasant the resulting XML is to look at (yes, this is subjective).

Serialization Samples

The WCF serializer (DataContractSerializer) is run in two modes: the "ref" mode, which allows cycles in the object graph, and the regular mode, which does not. The following table summarizes the results for .NET 3.5 SP1. The version matters, because the XAML serializer has much better generics support in .NET 4.0.

Samples used to be here. They were lost when the web site moved between servers. I am working on restoring the samples.
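
The two WCF modes mentioned above can be illustrated with a minimal sketch. This is not one of the original samples: the Node class and the RefModeDemo helper are hypothetical names made up for illustration. I am also assuming that the "ref" mode corresponds to the preserveObjectReferences constructor argument of DataContractSerializer, as found in the .NET Framework 3.x overload used here.

```csharp
using System;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical sample type: a linked node that can point back at itself.
[DataContract]
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Next { get; set; }
}

public static class RefModeDemo
{
    // Serializes the graph; preserveReferences = true selects the assumed "ref" mode.
    public static string Serialize(Node root, bool preserveReferences)
    {
        var serializer = new DataContractSerializer(
            typeof(Node),
            new Type[0],         // knownTypes
            int.MaxValue,        // maxItemsInObjectGraph
            false,               // ignoreExtensionDataObject
            preserveReferences,  // preserveObjectReferences
            null);               // dataContractSurrogate

        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb))
        {
            serializer.WriteObject(writer, root);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var a = new Node { Name = "a" };
        var b = new Node { Name = "b", Next = a };
        a.Next = b; // a -> b -> a: a cycle

        Console.WriteLine(Serialize(a, true));

        try
        {
            Console.WriteLine(Serialize(a, false));
        }
        catch (SerializationException ex)
        {
            // The regular mode rejects graphs with cycles.
            Console.WriteLine("Regular mode failed: " + ex.Message);
        }
    }
}
```

In the "ref" mode each object is written once and later occurrences are emitted as references to it, which is what makes the cycle representable.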

Brief Analysis of the Serialization Samples

XML Serializer

Good:

  • By decorating your classes and properties with custom attributes you can achieve remarkable control over generated XML.

  • XmlSerializer is available in all versions of .NET and is familiar to most developers.

  • There is decent versioning support: you can achieve backward compatibility by keeping old XML attribute names, and when all else fails, you can do custom serialization with IXmlSerializable.

  • Null properties are not written to the document by default.
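
The points above can be sketched as follows. UserPreferences and its members are hypothetical names chosen for illustration, not classes from our application. The sketch shows a property written as an XML attribute, a renamed property that keeps its old element name for backward compatibility, and a null property that is left out of the document.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("preferences")]
public class UserPreferences
{
    // Written as an attribute instead of a child element.
    [XmlAttribute("theme")]
    public string Theme { get; set; }

    // Renamed in code, but still stored under the old XML name
    // so that documents written by older versions keep loading.
    [XmlElement("FontSize")]
    public int EditorFontSize { get; set; }

    // Left null in Main below, so no element is written for it.
    public string LastQuery { get; set; }
}

public static class XmlSerializerDemo
{
    public static string Serialize(UserPreferences prefs)
    {
        var serializer = new XmlSerializer(typeof(UserPreferences));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, prefs);
            return writer.ToString();
        }
    }

    public static UserPreferences Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(UserPreferences));
        using (var reader = new StringReader(xml))
        {
            return (UserPreferences)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var prefs = new UserPreferences { Theme = "dark", EditorFontSize = 12 };
        string xml = Serialize(prefs);
        Console.WriteLine(xml);

        UserPreferences copy = Deserialize(xml);
        Console.WriteLine(copy.EditorFontSize);
    }
}
```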

Bad:

XAML Serializer

Good:

  • Properties are serialized as attributes by default. This applies even to complex types if they have a suitable type converter. This leads to more elegant XML.

Bad:

  • Generic collections are serialized poorly. Sometimes the XAML serializer will produce XML it cannot read. Most of this is fixed in .NET 4.0.

  • Versioning support is poor. When people talk about versioning XAML, they mean writing a XAML document in such a way that old readers would understand it. The task of reading a document produced by an old writer is not addressed. If the document mentions a property that the class does not have, deserialization throws an exception. There is no way to rename a property or to change the manner in which it is (de)serialized. Custom serialization is not supported.

  • The XAML serializer tightly couples the XML to exact .NET types. It is impossible to rename a class or move it to a different CLR namespace without breaking backward compatibility.

  • Null properties are written to the document, unless decorated with [DefaultValue(null)] attribute.
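
For reference, here is a minimal sketch of driving the XAML serializer from code. WindowState and XamlDemo are hypothetical names, and the project is assumed to reference the WPF assemblies (PresentationFramework, PresentationCore, WindowsBase, plus System.Xaml on .NET 4.0). Query is left null without decoration and Comment is left null with [DefaultValue(null)], so the output lets you compare the two cases described above.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Windows.Markup;
using System.Xml;

namespace StateSamples
{
    // Hypothetical sample type; the XAML serializer needs a public
    // parameterless constructor and public read/write properties.
    public class WindowState
    {
        public string Title { get; set; }
        public int Width { get; set; }

        // Null and undecorated.
        public string Query { get; set; }

        // Null, but decorated so that null counts as the default value.
        [DefaultValue(null)]
        public string Comment { get; set; }
    }

    public static class XamlDemo
    {
        public static string Save(object value)
        {
            return XamlWriter.Save(value);
        }

        public static object Load(string xaml)
        {
            using (var reader = XmlReader.Create(new StringReader(xaml)))
            {
                return XamlReader.Load(reader);
            }
        }

        [STAThread]
        public static void Main()
        {
            var state = new WindowState { Title = "Results", Width = 640 };

            string xaml = Save(state);
            Console.WriteLine(xaml);

            // The document names the CLR namespace and assembly of WindowState,
            // which is the tight coupling to .NET types mentioned above.
            var copy = (WindowState)Load(xaml);
            Console.WriteLine(copy.Title);
        }
    }
}
```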

DataContractSerializer

Good:

  • This is the most powerful serializer of all. It can even serialize object graphs with cycles (in "ref" mode).

  • There is significant control over names of elements and their namespaces, albeit not as strong as with XmlSerializer, but enough for versioning purposes.

  • If you implement IExtensibleDataObject, the deserializer will put all "extra" fields in the catch-all ExtensionData property. This helps versioning a whole lot.
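
The ExtensionData round trip can be sketched like this. SettingsV1 and SettingsV2 are hypothetical stand-ins for an old and a new version of the same class, and the contract name and namespace are made up for the example. The old version reads a document written by the new one, keeps the field it does not know about in ExtensionData, and writes it back out when serialized again.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// "Old" version of the contract: knows only Theme, but keeps unknown data.
[DataContract(Name = "Settings", Namespace = "urn:example:settings")]
public class SettingsV1 : IExtensibleDataObject
{
    [DataMember] public string Theme { get; set; }

    // Catch-all for members present in the XML but unknown to this version.
    public ExtensionDataObject ExtensionData { get; set; }
}

// "New" version of the same contract: adds FontSize.
[DataContract(Name = "Settings", Namespace = "urn:example:settings")]
public class SettingsV2
{
    [DataMember] public string Theme { get; set; }
    [DataMember] public int FontSize { get; set; }
}

public static class ExtensionDataDemo
{
    public static string Write<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb))
        {
            serializer.WriteObject(writer, value);
        }
        return sb.ToString();
    }

    public static T Read<T>(string xml)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return (T)serializer.ReadObject(reader);
        }
    }

    public static void Main()
    {
        string newXml = Write(new SettingsV2 { Theme = "dark", FontSize = 14 });

        // The old reader does not know FontSize; it ends up in ExtensionData.
        SettingsV1 old = Read<SettingsV1>(newXml);
        old.Theme = "light";

        // Writing the old object back preserves FontSize for the new reader.
        SettingsV2 roundTripped = Read<SettingsV2>(Write(old));
        Console.WriteLine(roundTripped.Theme + " " + roundTripped.FontSize);
    }
}
```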

Bad:

  • There are significant differences in behavior depending on the way the type under serialization is decorated. For [DataContract] only fields and properties marked with [DataMember] are serialized. For [Serializable] all fields are serialized by default.

  • Starting with .NET 3.5 SP1, non-decorated types can be serialized as well, but there is very little control over this serialization. Also, one can neither disable this behavior nor choose to use an older version of the serializer. If your application can run on either .NET 3.5 or .NET 3.5 SP1, getting consistent results may be a little difficult.

  • The XML produced is verbose and sometimes peppered with namespace declarations. Also, the serializer makes elements, not attributes, out of properties.

Serialization Engine

The serialization engine that produced the above table (with some XSLT) turned out to be quite an interesting program. You can download the source here: XmlSerializersTest.zip (45K). The application produces a large XML file with the sample source and the serialization results from all serializers. This file can then be transformed into a table via XSLT.
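
To give a feel for the approach without the download, here is a minimal sketch of such an engine. It is not the code from XmlSerializersTest.zip: the Sample class, the EngineSketch helper, and the element names in the report are all made up for illustration. Only XmlSerializer and DataContractSerializer are wired in, to avoid a dependency on WPF. The undecorated Sample type relies on the .NET 3.5 SP1 behavior of DataContractSerializer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical sample type.
public class Sample
{
    public string Name { get; set; }
    public List<int> Values { get; set; }
}

public static class EngineSketch
{
    static string WithXmlSerializer(object sample)
    {
        using (var writer = new StringWriter())
        {
            new XmlSerializer(sample.GetType()).Serialize(writer, sample);
            return writer.ToString();
        }
    }

    static string WithDataContractSerializer(object sample)
    {
        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb))
        {
            new DataContractSerializer(sample.GetType()).WriteObject(writer, sample);
        }
        return sb.ToString();
    }

    // Runs every serializer on every sample and records either the XML
    // or the exception in a single XML report.
    public static string Run(IEnumerable<object> samples)
    {
        var serializers = new Dictionary<string, Func<object, string>>
        {
            { "XmlSerializer", WithXmlSerializer },
            { "DataContractSerializer", WithDataContractSerializer }
        };

        var output = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true };
        using (var report = XmlWriter.Create(output, settings))
        {
            report.WriteStartElement("results");
            foreach (object sample in samples)
            {
                report.WriteStartElement("sample");
                report.WriteAttributeString("type", sample.GetType().FullName);
                foreach (var entry in serializers)
                {
                    report.WriteStartElement("run");
                    report.WriteAttributeString("serializer", entry.Key);
                    try
                    {
                        string xml = entry.Value(sample);
                        report.WriteElementString("xml", xml);
                    }
                    catch (Exception ex)
                    {
                        report.WriteElementString("exception", ex.GetType().Name + ": " + ex.Message);
                    }
                    report.WriteEndElement(); // run
                }
                report.WriteEndElement(); // sample
            }
            report.WriteEndElement(); // results
        }
        return output.ToString();
    }

    public static void Main()
    {
        var samples = new object[]
        {
            new Sample { Name = "first", Values = new List<int> { 1, 2, 3 } },
            // XmlSerializer rejects dictionaries, so this sample exercises the exception path.
            new Dictionary<string, int> { { "a", 1 } }
        };
        Console.WriteLine(Run(samples));
    }
}
```

Capturing exceptions as data rather than letting them abort the run is the important design choice here: a serializer that fails on a sample is as much a result as one that succeeds.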

Conclusion

Based on this brief analysis, I think it is clear that the XAML serializer is not suitable for serializing application state, mostly due to versioning issues and limitations on generics (prior to .NET 4.0).

Whether you use XmlSerializer or DataContractSerializer depends on a number of factors and personal preferences. We will look at those in further detail in the next parts.