Overview of .NET Collections
.NET collection interfaces are defined in the namespace System.Collections
.
Interfaces
IEnumerator - A Forward Const Iterator
interface IEnumerator { object Current {get;} bool MoveNext(); void Reset(); }
IEnumerator
defines an enumerator, which is an analog of forward
const iterator from the STL world. "Forward" means enumerator can move in only
one direction - forward. "Const" means that it cannot change the underlying
collection. Current
property is read-only.
Objects that implement IEnumerator
are usually returned from IEnumerable.GetEnumerator()
.
By convention, new enumerator points to a location "just before" the beginning
of the collection. You should call MoveNext()
to move the
enumerator to the first element of the collection. MoveNext()
will
return false if collection is empty.
IEnumerable - An Enumerable Collection With Read-Only Forward-Only Access
interface IEnumerable
{
IEnumerator GetEnumerator();
}
IEnumerable
is implemented by virtually all .NET collections. It
defines a single method - GetEnumerator()
. This method returns an
enumerator for the collection. Any object that implements IEnumerable
can be used with the foreach operator.
ICollection - Synchronizable Collection With Count
interface ICollection : IEnumerable { int Count {get;} bool IsSynchronized {get;} object SyncRoot {get;} void CopyTo( Array array, int Index ); }
ICollection
is a little bit mixed-up interface. It tries to deal
with several things at once.
First, it defines a collection with Count
. For some
implementations of IEnumerable
number of elements may not be
known in advance. For instance, imagine an IEnumerable
whose enumerator reads items from a communication stream such as pipe or socket.
In most cases, however, number of element is known, and thus ICollection
provides extension of IEnumerable
interface with the
Count
property.
Second, it defines IsSynchronized
and SyncRoot
properties
that deal with multi-threaded access to the collection. Note, that IEnumerable
and IEnumerator
totally ignore this issue. Enumeration is proclaimed an
inherently non thread-safe operation. If collection gets modified by other threads, its enumerators
are invalidated and throw InvalidOperationException
if used.
If IsSynchronized
is true, synchronization
is built-in into collection operations, and collection can be used safely from multiply threads.
When we say "operations" we mean things like Add()
or Remove()
,
not defined by ICollection
interface.
If IsSynchronized
is false, it means
that collection operations are not thread-safe, and user must lock the collection instance,
or use some other synchronization mechanism if he wants to work with the collection from
several threads. .NET documentation does not define whether multithreaded reads are safe for all
collections if IsSynchronized
is false.
SyncRoot
is appearantly supposed to return an unsynchronized version
of given collection. SyncRootTest.cs sample supports this point
of view for ArrayList
. Description of SyncRoot
in the .NET help
is not absolutely clear about this.
CopyTo
method copies a collection to specified array. It seems to be
largely superfluous, since the same operation can be performed in terms of other public
methods.
IList - Modifiable Collection with Integer Index
interface IList : ICollection { bool IsFixedSize { get; } bool IsReadOnly { get; } object this[ int Index ] {get; set;} int Add( object value ); void Clear(); bool Contains( object value ); int IndexOf( object value ); void Insert( int Index, object value ); void Remove( object value ); void RemoveAt( int Index ); }
The name IList
is somewhat misleading. Suffice is to say that class Array
implements IList
, but class SortedList
does not. Because of the integer index,
IList
behaves more like an array rather than a list. Its interface resembles that of
vector
from STL. Unlike STL, .NET does not provide operation complexity guarantees for the methods
of IList
. In particular, it is not formally guaranteed that all IList
implementation
will perform indexed access in constant time, or even in amortized constant time.
IDictionary - Associative Container
interface IDictionary : ICollection { bool IsFixedSize { get; } bool IsReadOnly { get; } object this[ object key ] {get; set;} ICollection Keys{ get; } ICollection Values{ get; } void Add( object key, object value ); void Clear(); bool Contains( object value ); new IDictionaryEnumerator GetEnumerator();1 void Remove( object key ); }
1C# does not support covariant return types. Thus, ICollection.GetEnumerator
returning IEnumerator
and IDictionary.GetEnumerator
returning IDictionaryEnumerator
are two different, unrelated methods.
IDictionary
defines an associative container that stores key and value pairs. The interface is relatively
straightforward. IDictionaryEnumerator
is a helper interface that derives from IEnumerator
.
Classes
System.Array : IList - Fixed Size Array
This class is an abstract base class for all built-in C# arrays. It represents an array of fixed
size and implements IList
interface. This class is interesting, because
it is probably the only class in .NET that is both abstract
and sealed
.
Thus, you can neither instantiate it, nor derive from it. Built-in C# arrays derive from it implicitly.
You as a mortal user cannot declare your class abstract sealed
- this causes
a compilation error.
ArrayList : IList - Dynamic Array
This class is a dynamic array, similar to STL vector
or
MFC CArray
. It implements the IList
interface.
The differences between ArrayList
and vector
are that
ArrayList
is heterogenous - it can contain anyobject
. The downside of this is thatArrayList
indexer (operator []
) returnsobject
, and you need to use explicit casts to convert it to a real type.vector
on the other hand, is always a homogenous array.- Complexity of
ArrayList
operations (such as insertion/deletion) is not specified.
In addition, ArrayList
can work as an adaptor for
arbitrary IList
. Such adapters are created via static
method ArrayList.Adapter()
. This allows to use methods
normally not available for IList
, such as RemoveRange()
,
LastIndexOf()
or Sort()
. Sort()
may seem like an
attractive feature, but don't get your expectations too high. In .NET 1.1 it copies
contents of the wrapped IList
to a temporary array, sorts that array,
and they copies the items back. It's clearly not a very efficient solution for large arrays.
ArrayList internals
With the help of reflector tool, we find that
ArrayList
usesArray
to store items internally (unless it acts as adapter).- Array capacity is automatically doubled when more room is needed for elements.
- Array capacity is never automatically reduced when removing elements.
Add()
takes amortized constant time.Insert()
in the beginning of the array takes linear time.Sort()
usesQuickSort()
, which is quadratic in the worst case.
BitArray : ICollection - Array of Bits
This class implements a fixed-sized array of Boolean values, or bits.
Note that it implements merely an ICollection
- not a
full IList
. Once constructed, the size of the bit array
never changes. Besides, it does not implement IList
methods
such as IndexOf()
, Contains()
, etc.
From the other hand it implements a number of additional, Boolean-specific
methods such as And()
, Or()
, and Xor()
HashTable : IDictionary
Hash table is an unordered collection of key-value pairs.
Key objects must correctly implement Equals()
and GetHashCode
.
Hashtable
It is very similar to MFC CMap
. Hashtable
provides more flexibility than CMap
- it allows
you to specify your own hash code provider, and your own load factor.
SortedList : IDictionary
Sorted list is a sorted collection of key-value pairs. Besides implementing IDictionary
interface, it also allows access to its items by integer index via GetByIndex()
and SetByIndex()
methods. Complexity of SortedList
operations is not clear. It looks like it is based on array, and thus adding elements may take linear time,
while by-index access takes constant time. Thorough experiments or disassembly are required
to clarify this issue.
Queue : ICollection
Queue
implements standard collection operations plus
Enqueue()
, Dequeue()
and Peek()
methods. Using reflection we can see that internally it uses standard array (Object[]
) -
much like an ArrayList
.
Stack : ICollection
Stack
implements standard collection operations plus
Push()
, Pop()
and Peek()
methods. For some reason complexity is specified for these methods.
Pop()
is an O(1) operation, Push()
is an O(1)
unless current stack capacity is reached, when it is O(n). If stack capacity is then
increased by the factor of 2 (which is not specified), than Push()
takes amortized constant time.
CollectionBase and ReadOnlyCollectionBase
These abstract classes implement IList
. They
represent a base for type-safe collections that manipulates concrete types
instead of generic object
. Internally CollecationBase
is a wrapper around ArrayList
. Potentially type-unsafe calls such as
Insert()
are implemented in two steps. First virtual method
OnInsert()
is called. This method should check
the validity of the inserted object (e.g. whether it belongs to the collection
type). If OnInsert()
does not throw, CollectionBase
proceeds with calling Insert()
on the underlying ArrayList
.
DictionaryBase
DictionaryBase
is an abstract base for typesafe dictionaries.
The idea is similar to CollectionBase
. Internally DictionaryBase
relies on a HashTable
.
System.Collections.Specialized.StringCollection : IList
Implements type-safe collection of strings. Appearantly, it does not derive
from CollectionBase
. Well, Microsoft is not really expected to follow
their own recommendations.
System.Collections.Specialized.StringDitionary: IEnumerable
Implements string-to-object map. Interestingly enough, this class neither derives
from DictionaryBase
, nor implements IDictionary
.
System.Collections.Specialized.NameValueCollection: ICollection
Implements string-to-string map sorted by key.
System.Collections.Specialized.ListDictionary: IDictionary
Implements a dictionary based on a linked list.
System.Collections.Specialized.HybridDictionary: IDictionary
Implements a dictionary based on a linked list for small sizes (below 10), and on hash table for bigger sizes.