Is .NET Really a Java Clone?

Introduction
Fundamental Differences
Java is portable, .NET is not
.NET is multilanguage, Java is not
Java is all about interfaces, .NET is not
Interoperability
Hosting
.NET is self hosted by default
Java uses external hosting
Versioning
Security
Events (callbacks)
Value types and boxing
Properties
Operators
Conclusion

Introduction

When two or more good programmers meet, a philosophical discussion is virtually inevitable. During such discussions I frequently hear the opinion that .NET is merely Microsoft's clone of Java, implying that the differences, if any, are cosmetic and insignificant.

In this article I summarize my view of the subject. I am not trying to prove that one platform is better than the other. I have programmed on both, but I am much more experienced with .NET. I am trying to be as objective as possible, but I do not claim to know the absolute truth.


Similarities

On the surface, .NET and Java look very similar indeed. They both are object oriented platforms that use a virtual machine (JVM↔CLR) that executes architecture independent instructions (byte codes↔IL).

Both come with a rich standard library. The libraries are not identical, but similar in core capabilities.

Both platforms allow dynamic loading of external code, reflection, custom annotations↔attributes, generics, etc.

In other words, from 30,000 feet .NET and Java are practically twins. This makes many people wonder why anybody would need .NET in the first place. After all, Java predates .NET by many years and it can run on Windows just fine. Why reinvent the wheel?

Fundamental Differences

Java Is Portable, .NET Is Not

.NET runs only on Windows while Java runs on virtually any OS. This is more of an ideological than a technical limitation. Microsoft could make .NET portable if they wanted to. The Mono project (http://www.mono-project.com/), a .NET implementation for Linux, is living proof of that. Naturally, Microsoft wants to promote Windows, so it has little interest in supporting .NET on other platforms.

Focus on Windows is not necessarily pure evil. It allows introducing useful platform-specific features without guilt: P/Invoke, COM interop, Windows Forms, WPF, etc.

.NET Is Multi-Language, Java Is Not

This difference is also more ideological than technical. Microsoft positions .NET as a multi-language platform. Out of the box, .NET supports several languages, and third party compilers exist for a couple of dozen more. Microsoft encourages these efforts and even comes up with experimental languages of its own, most notably F#.

Technically, nothing prevents compiling other languages into Java byte code. Indeed, such compilers do exist^[1]. However, multi-language support was not an initial design goal of the JVM, and executing languages other than Java on the JVM is not mainstream.

Since .NET is multi-language, comparing Java and .NET is sometimes difficult. E.g. there is no such thing as ".NET syntax". When language-specific information is needed, we will use C# as a representative of the .NET side.

Java Is about Interfaces, .NET Is Not

Many API specifications in Java are defined in terms of interfaces. There usually is a single concrete factory class that gives you the first interface reference, and from then on you deal only with interfaces.

The .NET class library is all about classes, not interfaces. Extensive use of interfaces is officially discouraged ^[2] ^[3].

Both approaches have advantages and disadvantages. Interfaces allow switching to another implementation without rewriting lots of code, but they are harder to use, make documentation fragmented (API vs. concrete implementation details), and versioning difficult. Concrete classes are easier to use and leave less uncertainty about what exactly will happen, but they are set in stone and make switching to another implementation next to impossible. I discuss this in detail in the "API Building Philosophy: .NET vs. Java".
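The Java style described above can be sketched with the standard XML API (JAXP). This is only an illustration, not a recommendation: the class name InterfaceStyleDemo and the element name "order" are made up for the example. The factory is the single concrete entry point; everything it hands back is used through the org.w3c.dom interfaces.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative class name; not part of any library.
final class InterfaceStyleDemo {

    // Builds a one-element document. Only the factory is a named entry point;
    // Document and Element are interfaces implemented by whatever parser is installed.
    static Document buildDocument(String rootName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement(rootName);
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            // Wrapped so that callers do not have to deal with a checked exception.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildDocument("order");
        System.out.println(doc.getDocumentElement().getTagName());
        // The implementing class depends on the parser that the factory selected.
        System.out.println(doc.getClass().getName());
    }
}
```

The calling code never names the implementing class, which is what makes swapping the parser possible, and also why the documentation for the behavior you actually get lives elsewhere.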

Interoperability

Interoperability with existing code received significant attention in .NET^[5]. For a Windows developer, .NET offers much more convenient interoperability options than Java.

Calling C

Java offers two methods to call external C code. Standard JNI^[4] can only call code that adheres to a specific signature and naming convention and includes a custom generated header file. The shared stubs method^[6] can call any code, but the call syntax is quite complicated, and the performance is questionable. The shared stubs method is not part of the standard Java library: the supporting code must be built into your application, and it includes some C and even assembly. The bottom line is: in Java you cannot call an existing external function using the natural syntax result = externalFunction(param1, param2);

The .NET answer to this is P/Invoke, which offers a relatively elegant solution for the simple case. Calling an existing external function with a simple signature is straightforward. Anders Hejlsberg, the author of C#, discusses the topic here.

Calling COM

Since Microsoft COM is a platform-specific feature, Java ignores it. Calling COM from Java requires a third-party solution such as com4j or similar. I am not sure whether exposing a Java component as a COM service is even possible. Microsoft JVM had this feature, but it was discontinued in 2001^[7].

Hosting

Both .NET and Java are executed using virtual machines. A virtual machine is native code that must be run (or hosted) in some process. Java and .NET have slightly different approaches to hosting.

.NET: Self-Hosting by Default

A .NET application compiles into a native executable (.EXE) that for the user is indistinguishable from any other kind of Windows application. When invoked, the executable automatically starts the CLR and begins executing .NET code.

On the one hand, self-hosting is a Good Thing. Users don't need to worry about whether the application is .NET, VB6, or C++. All nasty implementation details are hidden, as they should be. On the other hand, the parameters of the default hosting are fixed and there is no way to change them.

.NET also offers an elaborate hosting API^[14] that allows executing .NET code from a native application. Using this API, you can customize many options and provide your own hooks for things like loading assemblies, etc. The API is complex, but still usable by mere mortals.

Java: External Hosting

Java has its own code packaging format (.jar), which requires external hosting. JARs cannot be directly executed by Windows, UNIX or a command shell. Even for "executable" jars, users must specify a command line like java -jar myjar.jar.

External hosting makes Java applications different from native applications. Usually it is not a problem: one can always write a one-line script that invokes java with the right parameters. Occasionally, however, the inability to package the code in one file may be an issue. The JDK does not support packaging JARs into an EXE. There are commercial third party tools that claim to do that.

One positive consequence of the external hosting is that you can tweak certain parameters with ease^[15]. E.g. you can change the class path of a third party application by executing java -cp custompath -jar 3rdparty.jar, or you can change default XML parser implementation by running java -Djavax.xml.parsers.DocumentBuilderFactory=my.factory.class -jar 3rdparty.jar. It is quite difficult to achieve comparable effect in .NET.
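To make the -D mechanism above concrete, here is a minimal sketch of how a value passed on the launcher command line reaches the code. The property name demo.greeting and the class name LauncherPropertyDemo are invented for this illustration; only System.getProperty and the standard java.class.path property come from the platform.

```java
// Illustrative class; "demo.greeting" is a made-up property name.
final class LauncherPropertyDemo {

    // Returns the value supplied with -Ddemo.greeting=..., or a default if none was given.
    static String greeting() {
        return System.getProperty("demo.greeting", "hello");
    }

    public static void main(String[] args) {
        System.out.println(greeting());
        // The class path chosen by the launcher (-cp or CLASSPATH) is visible the same way.
        System.out.println(System.getProperty("java.class.path"));
    }
}
```

Running it as java -Ddemo.greeting=hi LauncherPropertyDemo changes the first line of output without touching the code, which is the kind of tweak the launcher makes easy.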

Versioning

Versioning in Java

Java does not assign versions to classes or JARs. When a class is needed, its code is looked up by the class loader. The class loader searches for the class in the locations specified by the CLASSPATH environment variable, which is typically a list of JARs.

Versioning is achieved by giving the JARs version-specific names, such as xerces-1.0.3.jar, and specifying the class path accordingly. To the best of my knowledge, side-by-side execution of different versions is not supported. E.g. if one part of your application wants Xerces 1.0.3 and the other wants Xerces 2.4.1, you are out of luck.

Java is, therefore, prone to the classic "DLL hell" problem. Specifying the class path for a complex application may be quite an interesting and amusing endeavor (guess how I know :-)).

Versioning in .NET

In .NET each class reference actually has two parts: class name and assembly name. The assembly name includes not only the name string, but also assembly version, and optionally a public key token, a digital signature that protects the assembly from modification by third parties^[8]. Assembly names with a public key token are called strong names.

Thus, the application will always attempt to load the assembly version it was compiled against. If different parts of the application request different versions of the assembly, it is not a problem: .NET supports loading two versions of the same assembly side-by-side.

This sounds good in theory, but in practice it creates some issues. Let's suppose our application requests assembly Foo, version 1.2.3.0. Imagine that this version has a bug, which is fixed in Foo, version 1.2.4.0. If we simply replace the old version of Foo with the new one, the application will stop working with the error "The located assembly's manifest definition does not match the assembly reference.".

If we make both versions of Foo available (e.g. by putting them in the GAC), the application will pick up the old version. To fix the problem, we need to set up an assembly redirect or a publisher policy, i.e. put some ugly XML in some config files, and do it for every machine on which the application is deployed.

In order to survive side-by-side execution, the assembly must not have global state (e.g. singletons). Otherwise, things may get ugly. Also, classes from different versions of the same assembly are not compatible, which results in weird errors like "cannot cast Foo to Foo".

Errors resulting from the accidental side-by-side execution are cryptic and difficult to debug.

The .NET assembly loading machinery is complex and brittle^[9]. .NET has a fixed set of directories where it will look for dependent assemblies: generally either under the application directory or in the GAC. There are no built-in means for an enterprise-level assembly repository, where approved assemblies reside on the network. This is not a problem with Java: just set the CLASSPATH to point to the network drive.

Security

Java and .NET security are similar in principle: code is assigned different security privileges depending on where it came from and who signed it^[10] ^[11]. One important difference is that in Java local apps run without security by default. In .NET all applications are subject to security, but local applications execute in "full trust" mode, which is close (albeit not equivalent) to no security.

Another characteristic difference is that in Java the SecurityManager class is pluggable via System.setSecurityManager(), and can be replaced by a user implementation. In .NET security algorithms are hard coded in system code. The .NET security system is somewhat more involved, and it comes with a set of graphical tools to edit the security settings. It also allows specifying security declaratively via attributes.

Events (Callbacks)

Being an object oriented language, Java defines callbacks as interfaces, e.g. MouseListener. If I want to process an event, I must create a special listener class. This class must implement all interface methods: I cannot handle mouseClicked without also handling mouseExited, even if all I care about is clicks. To alleviate this problem Java has the MouseAdapter class: "The methods in this class are empty. This class exists as convenience for creating listener objects.". If I were this class, I would feel kinda empty :-) Anonymous classes ease the situation somewhat, but it is still a lot of typing:

// Java
button.addMouseListener(
    new MouseAdapter() {
        @Override
        public void mouseClicked(MouseEvent event) { /* handle click */ }
    });

.NET offers a more elegant solution. It has the concept of a delegate, which is in fact an "object + method" pair (or, rather, a linked list of such pairs^[12]). Anders Hejlsberg discusses the design decisions behind delegates in his interview ^[5]. .NET also has a built-in implementation of the observer pattern. Callbacks such as mouse click are declared with the event keyword, and clients can subscribe/unsubscribe to the callback by supplying a handler delegate.

Anonymous delegates take it even further. Unlike Java's anonymous classes, which can capture only final local variables, they are full closures: they can read and modify variables from the enclosing scope.

// C#
string s = "Captured string";
button.Click += delegate(object sender, EventArgs args) { Trace.WriteLine(s); };
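For comparison, here is a rough Java counterpart, offered as a sketch rather than a definitive statement of what Java can do. A Java anonymous class may read a local variable of the enclosing method as long as that variable is final, but it cannot reassign it. Runnable stands in for a GUI listener so that the example runs without a window; the class name CaptureDemo is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative class; Runnable is used in place of a real event listener.
final class CaptureDemo {

    // Creates a handler that captures a local variable, invokes it once,
    // and returns what the handler recorded.
    static List<String> run() {
        final List<String> log = new ArrayList<String>();
        final String s = "Captured string";
        Runnable handler = new Runnable() {
            @Override
            public void run() {
                log.add(s);        // reading a captured final local is allowed
                // s = "other";    // would not compile: a captured local cannot be reassigned
            }
        };
        handler.run();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```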

Generics

In both Java and .NET generics were an afterthought, an extension of an already existing language. However, Java and .NET took somewhat different routes in implementing generics. In Java, generics are a purely compile-time feature. Neither the virtual machine nor the reflection API knows the type arguments of a generic object at run time. This allowed preserving backwards compatibility at the expense of type safety. An old, non-generic-aware JVM can run new Java code with generics. On the other hand, once compiled, List<MyClass> is just List, and at run time it will accept objects of any type.

The code below generates a number of compiler warnings, but it runs without exception:

// Java
List<String> stringList = new ArrayList<String>();
List objList = stringList;
objList.add(Integer.valueOf(42)); // no exception: at run time this is just a List
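A self-contained version of the same experiment, as a sketch with invented class and method names, shows where the problem eventually surfaces. The add succeeds, the two parameterizations share one run-time class, and the failure appears only later, when an element is read back as a String.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative class demonstrating type erasure.
final class ErasureDemo {

    // Puts an Integer into a List<String> through a raw reference.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<String> pollute() {
        List<String> stringList = new ArrayList<String>();
        List objList = stringList;          // raw type: the compiler warns, the JVM does not object
        objList.add(Integer.valueOf(42));   // succeeds at run time
        return stringList;
    }

    // After erasure both parameterizations are the same class.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    // Reading the element as a String triggers the compiler-inserted cast.
    static boolean failsOnRead() {
        try {
            String first = pollute().get(0);
            return first == null;           // not reached for the polluted list
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(failsOnRead());
    }
}
```

The error is reported far from the line that caused it, which is the practical cost of the compile-time-only approach.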

In .NET generics are a run-time feature. The virtual machine and the reflection API are fully aware of generics: IList<MyClass> is distinct from IList<OtherClass> and from legacy IList. The .NET equivalent of the Java code above compiles (note the explicit cast to IList), but won't run:

// C#
IList<string> stringList = new List<string>();
IList objList = (IList)stringList;
objList.Add(42); // throws: 42 is not a string

Unfortunately, introducing generics required changing the assembly file format. The older (.NET 1.1) runtime cannot execute code compiled for .NET 2.0, even if generics are not used. Fortunately, the .NET 2.0 runtime can execute code compiled for 1.1.

This lack of compatibility created tremendous headaches for developers and administrators: it could take several months to roll out .NET 2.0 in a big enterprise, and developers could not release .NET 2.0 applications before the new version of the framework was installed on all customer machines.

However, once the painful transition period was over, .NET developers ended up in a better situation than Java developers, because now they have generics that are type-safe at runtime, and compatibility problems are a thing of the past.

Value Types and Boxing

In Java the only value types (i.e. types copied by value) are built-ins, such as int. All user types are copied by reference. In .NET users can create their own value types (known as structs). This addition really complicates the language, but does give a performance benefit. It also makes List<int> possible, something Java cannot have.

Another difference concerns boxing, i.e. conversion between built-in types and the corresponding wrapper types (int↔Integer). Before Java 5 it had to be spelled out explicitly; since Java 5 the compiler does it automatically (autoboxing). In C# boxing has always been implicit, which is usually good, but sometimes it leads to surprising results.
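One reason the distinction between a built-in type and its wrapper matters in Java: == applied to two wrapper objects compares references, not values. The sketch below uses explicit Integer.valueOf calls so that every boxing step is visible; the class and method names are made up for the illustration.

```java
// Illustrative class showing reference vs. value comparison of wrapper objects.
final class BoxingDemo {

    // Integer.valueOf is documented to cache values in the range -128..127,
    // so both calls return the same object and == reports true.
    static boolean smallValuesSameObject() {
        Integer a = Integer.valueOf(127);
        Integer b = Integer.valueOf(127);
        return a == b;
    }

    // equals() compares the wrapped values regardless of object identity.
    static boolean largeValuesEqual() {
        Integer a = Integer.valueOf(1000);
        Integer b = Integer.valueOf(1000);
        return a.equals(b);
    }

    // Reference comparison outside the guaranteed cache range. The result is
    // typically false but depends on JVM cache settings, so do not rely on it.
    static boolean largeValuesSameObject() {
        Integer a = Integer.valueOf(1000);
        Integer b = Integer.valueOf(1000);
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(smallValuesSameObject());
        System.out.println(largeValuesEqual());
        System.out.println(largeValuesSameObject());
    }
}
```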

Properties

Java has a concept of properties, but they don't have direct language support. A property is implemented via getXXX() and setXXX() methods. This results in more typing and occasional spelling errors. Various tools operate in terms of properties, but for that to work, your get and set methods must be properly named and must have proper signatures:

// Java
class Square {
    double _side;
    
    public double getSide() { return _side; }
    public void setSide(double value) { _side = value; }
    
    public double getArea() { return _side*_side; }
    public void setArea(double value) { _side = Math.sqrt(value); }
}
...
mySquare.setArea(43.8);

In .NET, properties are first class citizens:

// C#
class Square {
    double _side;
    
    public double Side 
    { 
        get { return _side; } 
        set { _side = value; } 
    }
    
    public double Area
    {
        get { return _side*_side; }
        set { _side = Math.Sqrt(value); }
    }
}
...
mySquare.Area = 43.8;

However, properties can be easily abused. E.g. property assignment looks like a simple operation, and it is easy to forget that it may have significant side effects. I encountered at least one bug in Windows Forms code where the programmer did something like window.X = foo; window.Y = bar; failing to realize that the first assignment is actually a function call that may (and occasionally does) completely rearrange the window due to docking, etc.

Operator Overloading

Java does not have operator overloading. This causes certain headaches, e.g. you compare strings by doing ugly stuff like "MyString".equals(s) (never s.equals("MyString"), because s may be null!). If you wrote s == "MyString", it would compare references rather than text, which is not right. Worse yet, it may occasionally work, and blow up later, when you least expect it.
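The "occasionally works" behavior can be reproduced in a few lines. This is a sketch with invented class and method names: string literals are shared (interned), so comparing one literal with another by reference happens to succeed, while a string built at run time is a different object with the same text.

```java
// Illustrative class contrasting reference and value comparison of strings.
final class StringEqualityDemo {

    // Reference comparison: true only if s is the very same object as the literal.
    static boolean referenceEquals(String s) {
        return s == "MyString";
    }

    // Value comparison that is also safe when s is null.
    static boolean safeEquals(String s) {
        return "MyString".equals(s);
    }

    public static void main(String[] args) {
        String literal = "MyString";             // same interned object as the literal above
        String built = new String("MyString");   // a distinct object with equal text

        System.out.println(referenceEquals(literal)); // works by accident
        System.out.println(referenceEquals(built));   // fails although the text matches
        System.out.println(safeEquals(built));
        System.out.println(safeEquals(null));
    }
}
```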

In C# equality for strings is overloaded, and you can use the natural syntax s=="MyString" without fear. You can also overload other operators, which occasionally comes in handy. E.g. I used operator overloading for handling time periods, where you can write (ThisPeriod & ThatPeriod) | OtherPeriod.

Operator overloading has, of course, its problems. C++ gave operator overloading an especially bad name, since it allowed overloading operators as free-standing functions. In C++, an operator involving classes C and D could be defined outside of C or D, and thus may be quite hard to locate.

C# does not have free-standing functions, so this problem is gone. However, even in C# the meaning of the operator depends on the static (compile-time) type of the variable. If you assign your object to a variable of the base type, the meaning of the operators may (and probably will) suddenly change.^[13]

// C#
string s1 = "aaa";
object obj1 = s1;
if (obj1 == "aaa") // reference equality, not text comparison!
{
   ...
}

Conclusion

While similar in principle, .NET is by no means a Java twin.

Java and .NET are significantly different in philosophy, deployment considerations, maintenance strategy and implementation details. Java locks you into a single language and is less friendly to legacy code, but is usually more flexible in terms of "pluggability" and alternative implementations of things. .NET offers multiple languages and great interoperability on Windows, but certain solutions are locked to the "party line" and cannot be changed.

Which platform to choose depends largely on your environment, existing applications, developers' skills and what you want to do. E.g. it would make little sense to choose .NET for a web-based solution on UNIX, or to choose Java to write a desktop app for Windows.

In the distributed world, it does not have to be a black-and-white decision. It is definitely possible to have a heterogeneous system where some parts are written in Java and others in .NET, although such systems come with their own set of problems.

References

^[1] Wikipedia: List of JVM Languages.
^[2] Krzysztof Cwalina, Brad Abrams. Framework Design Guidelines, p.81.
^[3] Discussion of "Do favor classes over interfaces" in my blog.
^[4] JNI example from Sun.
^[5] A Conversation with Anders Hejlsberg by Bill Venners with Bruce Eckel.
^[6] JNI Shared Stubs example.
^[7] Wikipedia: Microsoft Java Virtual Machine.
^[8] Strong name makes it impossible for third parties to modify the assembly. Only the original author can do so. However, unlike an SSL certificate, strong name does not guarantee the identity of the author. I.e., from the assembly alone there is no way to know whether strong name 123456 belongs to Microsoft, or to EvilHackers, Inc.
^[9] Suzanne Cook. Choosing a Binding Context.
^[10] Overview of Java Security.
^[11] Understanding .NET Code Access Security.
^[12] C# delegates and events in depth.
^[13] Equality in C#.
^[14] CLR inside out: CLR Hosting APIs.
^[15] Java application launcher command line.

Ivan Krivyakov
