API Building Philosophy: .NET vs. Java

Introduction
Parsing an XML string
.NET version
Java version
Initial analysis
Flexibility vs. complexity
Separation of concerns
Extending functionality: classes vs. interfaces
The uncertainty principle
Fragmented documentation
Summary

Introduction

The purpose of this article is to demonstrate some philosophical differences between the .NET and Java approaches to building APIs, using XML parsing as an example. See "Is .NET really a Java clone?" for more .NET vs. Java comparisons.

In a nutshell, .NET tends to use concrete predefined implementations, while Java relies on interfaces and pluggability. We will demonstrate how this plays out for XML parsing and then discuss the pros and cons of each approach.

Parsing an XML String

The task at hand is quite simple: take a string, build a W3C DOM model out of it, and print the name of the root element (or whatever else you might want to do with the DOM model).

.NET Version

// C#
using System;
using System.Xml;

namespace ParseXml
{
    class Program
    {
        static void Main(string[] args)
        {
            string xml = "<root />";
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);
            Console.WriteLine(doc.DocumentElement.Name);
        }
    }
}

This code is relatively straightforward: create a concrete DOM object (XmlDocument) that doubles as a parser, feed it the string, and use the results.

Java Version

// Java
package com.ikriv.parsexml;

import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Program {

    public static void main(String[] args) 
    {
        try
        {
            String xml = "<root />";
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Reader reader = new StringReader(xml);
            InputSource inputSource = new InputSource(reader);
            Document doc = builder.parse(inputSource);
            System.out.println(doc.getDocumentElement().getNodeName());
        }
        catch (Exception ex)
        {
            System.out.println(ex.getMessage());
        }
    }
}

The Java version is a little more involved. First we obtain a document builder factory, which gives us a DocumentBuilder, which knows how to parse XML documents and yields a Document object.

Unfortunately, DocumentBuilder cannot parse XML text held in a string directly. A parse(String) overload exists, but it treats its argument as a URI to load, not as the document itself. For in-memory text it wants either an InputStream or an InputSource. We then have to dive into the intriguing world of the Java I/O library and figure out the difference between streams, sources, readers, and the like. The working combination is an InputSource that takes a StringReader that takes a string.

Of course, we could chain several calls in one expression:

// Java
Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

This makes the code a little shorter, but hardly more readable. Last but not least, we must not forget to make the factory XML namespace aware. Otherwise, if our document uses namespaces, it will not be parsed correctly. The .NET implementation is namespace aware by default.
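To make the namespace point concrete, here is a minimal sketch that parses the same prefixed document twice, once with the factory's default setting and once with namespace awareness switched on. The class name, method name, and the urn:example namespace are made up for illustration, and the behavior noted in the comments assumes the parser bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; only the JAXP and DOM calls are standard API.
final class NamespaceAwarenessDemo {

    // Parses the text and returns the namespace URI recorded for the root element.
    static String rootNamespace(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getNamespaceURI();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse: " + xml, ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<p:root xmlns:p=\"urn:example\" />";
        // Without namespace awareness the element carries no namespace URI (null).
        System.out.println(rootNamespace(xml, false));
        // With namespace awareness the declared URI is reported: urn:example
        System.out.println(rootNamespace(xml, true));
    }
}
```

The document parses in both cases; what differs is whether the resulting DOM knows which namespace the root element belongs to.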

Initial Analysis

The .NET version is all about simplicity. All classes involved are concrete, and it gets the job done with just two method calls. The trouble is, it firmly ties you to the party line, i.e. Microsoft's XML parser. If you want to switch to another parser (e.g. Xerces.Net), you will have to change a lot of code. If your application is large enough, this becomes impractical. So, you are stuck with whatever Microsoft gave you.

The Java implementation is all about interfaces and flexibility. The concrete implementation of the document builder factory is obtained by the static DocumentBuilderFactory.newInstance() method via complex lookup rules. This gives you several ways to provide an alternative XML parser implementation. If you don't, you get a "platform default" implementation.
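A quick way to see the lookup at work is to ask the factory which concrete class it actually is. The sketch below is illustrative only (the class and method names are invented). It also prints the javax.xml.parsers.DocumentBuilderFactory system property, which is one of the places the lookup consults and which is normally unset.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class showing which implementation the lookup selected.
final class FactoryLookupDemo {

    // Returns the name of the concrete class hiding behind the abstract factory type.
    static String factoryImplementationName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // Setting this property (e.g. with -D on the command line) is one way
        // to plug in a different parser; "null" here means it was not set.
        System.out.println(System.getProperty("javax.xml.parsers.DocumentBuilderFactory"));
        // The exact class name printed depends on the platform and classpath.
        System.out.println(factoryImplementationName());
    }
}
```

The calling code never mentions the concrete class, which is exactly what makes the parser replaceable.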

Another aspect of the Java implementation is minimalism: there is no overload of DocumentBuilder.parse() that takes XML text as a string (the one String overload expects a URI). Instead, you must find a way to convert the string to either an InputSource or an InputStream. This is a no-brainer for an experienced Java programmer, but it is quite confusing for a beginner, who just looks in awe at several generations of weirdly named Java I/O classes (why is an input stream not an input source?) and scratches their head.
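In practice, the conversion only needs to be written once. A small helper along the following lines gives Java callers a one-call experience similar to LoadXml. This is a sketch, not part of any standard library: the XmlStrings class and its parse method are hypothetical names, and wrapping every failure in an unchecked exception is just one possible design choice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical convenience wrapper around the JAXP calls shown earlier.
final class XmlStrings {
    private XmlStrings() {}

    // Builds a namespace-aware DOM document from XML text held in memory.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            // Covers parser configuration, I/O, and well-formedness errors alike.
            throw new IllegalArgumentException("Could not parse XML string", ex);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root />");
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "root"
    }
}
```

The flexibility is still there underneath: the helper goes through the same factory lookup, so swapping the parser does not require touching its callers.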

Flexibility vs Complexity

One obvious conclusion from the above is that interfaces give you flexibility at the expense of more complex code. Java favors flexibility; .NET favors simple code. This is a general trend that manifests itself in other APIs, not only in XML parsing.

Separation of Concerns

Another thing that .NET sacrifices for simplicity is separation of concerns. The XmlDocument class plays at least two roles: it is both a parser and a node container. This is one role too many. Java, on the other hand, has DocumentBuilder (parser) and Document (node container). The concerns are nicely separated, but there is one more interface to learn.

Extending Functionality: Classes vs Interfaces

Interfaces decouple clients from the implementation, but by themselves they are completely inflexible. Removing methods will break the interface's clients; adding methods will break the implementors. Modifying existing methods will break both. Even changing pre- or post-conditions on a method may break existing code. If backwards compatibility is taken seriously, interfaces must remain immutable once released. This may eventually lead to trouble. Consider this initial design:

interface IFoo
{
    void DoSomething();
}

interface IBar
{
    IList<IFoo> GetFoos();
}

Now, imagine we want to add something to IFoo. Since IFoo cannot be changed, we create an IFoo2:

interface IFoo2 : IFoo
{
    void DoSomethingElse();
}

Now we have trouble with IBar: it returns IFoos, not IFoo2s. So we must create:

interface IBar2 : IBar
{
    IList<IFoo2> GetFoos2();
}

If we have a large framework of related interfaces, adding one method like DoSomethingElse() may cause a ripple effect and force us to create lots of new interfaces. This is exactly what happened with COM interfaces at times, and this is why interfaces fell out of favor with some folks at Microsoft.
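To see why the original IBar is not good enough once IFoo2 exists, consider what a client is reduced to when it only has the old container interface. The sketch below is a Java rendering of the C# interfaces above (Foo, Foo2, Bar, and the helper methods are illustrative names): the client has to probe every element and downcast, which is precisely the kind of code that IBar2 is created to avoid.

```java
import java.util.Arrays;
import java.util.List;

// Java rendering of the C# interfaces above; all names are illustrative.
final class RippleDemo {
    interface Foo { String doSomething(); }
    interface Foo2 extends Foo { String doSomethingElse(); }
    interface Bar { List<Foo> getFoos(); }

    // A Bar whose list mixes an old-style Foo with a newer Foo2.
    static Bar sampleBar() {
        Foo oldFoo = () -> "old";
        Foo2 newFoo = new Foo2() {
            @Override public String doSomething() { return "new"; }
            @Override public String doSomethingElse() { return "extra"; }
        };
        return () -> Arrays.<Foo>asList(oldFoo, newFoo);
    }

    // With only Bar available, the client must test each element and downcast
    // before it can reach the newer method. Returns how many calls succeeded.
    static int callExtras(Bar bar) {
        int called = 0;
        for (Foo foo : bar.getFoos()) {
            if (foo instanceof Foo2) {
                ((Foo2) foo).doSomethingElse();
                called++;
            }
        }
        return called;
    }

    public static void main(String[] args) {
        System.out.println(callExtras(sampleBar())); // prints 1
    }
}
```

Nothing in the type of getFoos() tells the client which elements support the new method, so the knowledge moves from the compiler to runtime checks.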

Java avoids interface maintenance hell by relaxing the rules and occasionally allowing changes that break implementors. This opens a window for adding new methods, at least to interfaces that are not normally implemented by clients.

Concrete classes create fewer maintenance problems. By definition, they are not implemented by clients, so new methods can be added to them freely. In C# you don't even have to worry about accidentally creating a method with the same name as in some client's derived class: the language sorts it out and things continue to work properly.

The Uncertainty Principle

Any API documentation has some degree of uncertainty. Very few APIs are documented to the extent that externally observable behavior is 100% defined. There are usually many "what happens if" questions that can be answered only by conducting an experiment. After all, people who write documentation are only human. Furthermore, some cases are left explicitly undefined: the documentation just says that they are "implementation specific".

In the presence of multiple implementations this becomes a problem. Different implementors may have made different assumptions in the corner cases, so experimenting with one implementation is not enough. Worse yet, unless your program is explicitly locked to a particular implementation, you may get a random ("platform specific") implementation that may not support all the features you need, not to mention bugs.

The bottom line is, if you want to run your code against multiple implementations, you must test against every one of them, and you may still fail when running with an unexpected implementation. If your code uses multiple APIs, the number of possible implementation combinations grows exponentially.
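The arithmetic behind that growth is simple: the number of configurations is the product of the number of implementations available for each API. The following sketch works through it with invented numbers; the class name and the counts are purely illustrative.

```java
// Hypothetical helper that counts implementation combinations.
final class CombinationCount {

    // Each argument is the number of interchangeable implementations of one API.
    // The result is how many distinct configurations a full test matrix covers.
    static long combinations(int... implementationsPerApi) {
        long total = 1;
        for (int count : implementationsPerApi) {
            total = Math.multiplyExact(total, (long) count);
        }
        return total;
    }

    public static void main(String[] args) {
        // Four pluggable APIs with three implementations each: 3 * 3 * 3 * 3.
        System.out.println(combinations(3, 3, 3, 3)); // prints 81
    }
}
```

With the same three implementations per API, every additional pluggable API triples the matrix.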

Fragmented Documentation

Separation of interface and implementation is great, but this leads to fragmentation of documentation. Overall behavior of an implementation is defined in two places: general API documentation, and "implementation specific" notes relevant only to this particular implementation. The latter is often incomplete or outright missing. This makes things hard to find.

Concrete classes do not have this problem: all the documentation, good or bad, is by definition in one place, and usually written by a single author.

Summary

Whether to use interfaces or classes in APIs is a big debate. Java libraries tend to rely on interfaces, while .NET APIs, at least those from Microsoft, use classes more heavily. Note that this is only a matter of philosophy: either platform can support both designs. The table below summarizes the properties of interfaces vs. classes:

Property                        Interfaces  Classes
Flexibility                     Yes         No
Ease of use                     Worse       Better
Certainty of behavior           No          Yes
All documentation in one place  No          Yes
Evolution options               Worse       Better

This may seem like an easy win for classes, but things are not that simple. When you really need that alternative implementation, the little "yes" in the "flexibility" row easily outweighs all the bad stuff in the other rows. The bottom line is that there is no silver bullet, and this is why we have the diversity.