Overview of .NET Collections
.NET collection interfaces are defined in the namespace System.Collections.
Interfaces
IEnumerator - A Forward Const Iterator
interface IEnumerator { object Current {get;} bool MoveNext(); void Reset(); }
IEnumerator defines an enumerator, which is an analog of forward
const iterator from the STL world. "Forward" means enumerator can move in only
one direction - forward. "Const" means that it cannot change the underlying
collection. Current property is read-only.
Objects that implement IEnumerator are usually returned from IEnumerable.GetEnumerator().
By convention, new enumerator points to a location "just before" the beginning
of the collection. You should call MoveNext() to move the
enumerator to the first element of the collection. MoveNext() will
return false if collection is empty.
IEnumerable - An Enumerable Collection With Read-Only Forward-Only Access
interface IEnumerable
{
IEnumerator GetEnumerator();
}
IEnumerable is implemented by virtually all .NET collections. It
defines a single method - GetEnumerator(). This method returns an
enumerator for the collection. Any object that implements IEnumerable
can be used with the foreach operator.
ICollection - Synchronizable Collection With Count
interface ICollection : IEnumerable { int Count {get;} bool IsSynchronized {get;} object SyncRoot {get;} void CopyTo( Array array, int Index ); }
ICollection is a little bit mixed-up interface. It tries to deal
with several things at once.
First, it defines a collection with Count. For some
implementations of IEnumerable number of elements may not be
known in advance. For instance, imagine an IEnumerable
whose enumerator reads items from a communication stream such as pipe or socket.
In most cases, however, number of element is known, and thus ICollection
provides extension of IEnumerable interface with the
Count property.
Second, it defines IsSynchronized and SyncRoot properties
that deal with multi-threaded access to the collection. Note, that IEnumerable
and IEnumerator totally ignore this issue. Enumeration is proclaimed an
inherently non thread-safe operation. If collection gets modified by other threads, its enumerators
are invalidated and throw InvalidOperationException if used.
If IsSynchronized is true, synchronization
is built-in into collection operations, and collection can be used safely from multiply threads.
When we say "operations" we mean things like Add() or Remove(),
not defined by ICollection interface.
If IsSynchronized is false, it means
that collection operations are not thread-safe, and user must lock the collection instance,
or use some other synchronization mechanism if he wants to work with the collection from
several threads. .NET documentation does not define whether multithreaded reads are safe for all
collections if IsSynchronized is false.
SyncRoot is appearantly supposed to return an unsynchronized version
of given collection. SyncRootTest.cs sample supports this point
of view for ArrayList. Description of SyncRoot in the .NET help
is not absolutely clear about this.
CopyTo method copies a collection to specified array. It seems to be
largely superfluous, since the same operation can be performed in terms of other public
methods.
IList - Modifiable Collection with Integer Index
interface IList : ICollection { bool IsFixedSize { get; } bool IsReadOnly { get; } object this[ int Index ] {get; set;} int Add( object value ); void Clear(); bool Contains( object value ); int IndexOf( object value ); void Insert( int Index, object value ); void Remove( object value ); void RemoveAt( int Index ); }
The name IList is somewhat misleading. Suffice is to say that class Array
implements IList, but class SortedList does not. Because of the integer index,
IList behaves more like an array rather than a list. Its interface resembles that of
vector from STL. Unlike STL, .NET does not provide operation complexity guarantees for the methods
of IList. In particular, it is not formally guaranteed that all IList implementation
will perform indexed access in constant time, or even in amortized constant time.
IDictionary - Associative Container
interface IDictionary : ICollection { bool IsFixedSize { get; } bool IsReadOnly { get; } object this[ object key ] {get; set;} ICollection Keys{ get; } ICollection Values{ get; } void Add( object key, object value ); void Clear(); bool Contains( object value ); new IDictionaryEnumerator GetEnumerator();1 void Remove( object key ); }
1C# does not support covariant return types. Thus, ICollection.GetEnumerator
returning IEnumerator and IDictionary.GetEnumerator returning IDictionaryEnumerator
are two different, unrelated methods.
IDictionary defines an associative container that stores key and value pairs. The interface is relatively
straightforward. IDictionaryEnumerator is a helper interface that derives from IEnumerator.
Classes
System.Array : IList - Fixed Size Array
This class is an abstract base class for all built-in C# arrays. It represents an array of fixed
size and implements IList interface. This class is interesting, because
it is probably the only class in .NET that is both abstract and sealed.
Thus, you can neither instantiate it, nor derive from it. Built-in C# arrays derive from it implicitly.
You as a mortal user cannot declare your class abstract sealed - this causes
a compilation error.
ArrayList : IList - Dynamic Array
This class is a dynamic array, similar to STL vector or
MFC CArray. It implements the IList interface.
The differences between ArrayList and vector
are that
ArrayListis heterogenous - it can contain anyobject. The downside of this is thatArrayListindexer (operator []) returnsobject, and you need to use explicit casts to convert it to a real type.vectoron the other hand, is always a homogenous array.- Complexity of
ArrayListoperations (such as insertion/deletion) is not specified.
In addition, ArrayList can work as an adaptor for
arbitrary IList. Such adapters are created via static
method ArrayList.Adapter(). This allows to use methods
normally not available for IList, such as RemoveRange(),
LastIndexOf() or Sort(). Sort() may seem like an
attractive feature, but don't get your expectations too high. In .NET 1.1 it copies
contents of the wrapped IList to a temporary array, sorts that array,
and they copies the items back. It's clearly not a very efficient solution for large arrays.
ArrayList internals
With the help of reflector tool, we find that
ArrayListusesArrayto store items internally (unless it acts as adapter).- Array capacity is automatically doubled when more room is needed for elements.
- Array capacity is never automatically reduced when removing elements.
Add()takes amortized constant time.Insert()in the beginning of the array takes linear time.Sort()usesQuickSort(), which is quadratic in the worst case.
BitArray : ICollection - Array of Bits
This class implements a fixed-sized array of Boolean values, or bits.
Note that it implements merely an ICollection - not a
full IList. Once constructed, the size of the bit array
never changes. Besides, it does not implement IList methods
such as IndexOf(), Contains(), etc.
From the other hand it implements a number of additional, Boolean-specific
methods such as And(), Or(), and Xor()
HashTable : IDictionary
Hash table is an unordered collection of key-value pairs.
Key objects must correctly implement Equals() and GetHashCode.
Hashtable It is very similar to MFC CMap. Hashtable
provides more flexibility than CMap - it allows
you to specify your own hash code provider, and your own load factor.
SortedList : IDictionary
Sorted list is a sorted collection of key-value pairs. Besides implementing IDictionary
interface, it also allows access to its items by integer index via GetByIndex()
and SetByIndex() methods. Complexity of SortedList
operations is not clear. It looks like it is based on array, and thus adding elements may take linear time,
while by-index access takes constant time. Thorough experiments or disassembly are required
to clarify this issue.
Queue : ICollection
Queue implements standard collection operations plus
Enqueue(), Dequeue() and Peek()
methods. Using reflection we can see that internally it uses standard array (Object[]) -
much like an ArrayList.
Stack : ICollection
Stack implements standard collection operations plus
Push(), Pop() and Peek()
methods. For some reason complexity is specified for these methods.
Pop() is an O(1) operation, Push() is an O(1)
unless current stack capacity is reached, when it is O(n). If stack capacity is then
increased by the factor of 2 (which is not specified), than Push()
takes amortized constant time.
CollectionBase and ReadOnlyCollectionBase
These abstract classes implement IList. They
represent a base for type-safe collections that manipulates concrete types
instead of generic object. Internally CollecationBase
is a wrapper around ArrayList. Potentially type-unsafe calls such as
Insert() are implemented in two steps. First virtual method
OnInsert() is called. This method should check
the validity of the inserted object (e.g. whether it belongs to the collection
type). If OnInsert() does not throw, CollectionBase
proceeds with calling Insert() on the underlying ArrayList.
DictionaryBase
DictionaryBase is an abstract base for typesafe dictionaries.
The idea is similar to CollectionBase. Internally DictionaryBase
relies on a HashTable.
System.Collections.Specialized.StringCollection : IList
Implements type-safe collection of strings. Appearantly, it does not derive
from CollectionBase. Well, Microsoft is not really expected to follow
their own recommendations.
System.Collections.Specialized.StringDitionary: IEnumerable
Implements string-to-object map. Interestingly enough, this class neither derives
from DictionaryBase, nor implements IDictionary.
System.Collections.Specialized.NameValueCollection: ICollection
Implements string-to-string map sorted by key.
System.Collections.Specialized.ListDictionary: IDictionary
Implements a dictionary based on a linked list.
System.Collections.Specialized.HybridDictionary: IDictionary
Implements a dictionary based on a linked list for small sizes (below 10), and on hash table for bigger sizes.