Unfortunately, .NET framework collections lack support for an ordered set of
objects, similar to STL's set, or Java's HashSet and
TreeSet. SortedList comes close, but there are several
problems with it:
SortedList
requires to store dummy "value" for every element of the set, which clutters
the program.
SortedList.Add takes linear time
(you can examine it using Lutz Roeder's
reflector).
SortedList.Contains does not return contained object.List stores strings
with case-insensitive sorting, and List.Contains("test") is true,
we don't know whether contained string is "test", "Test", or "TEST".
SortedList
does not support multi-sets.
SortedList does not support operations such as union and
intersection.www.ikriv.com.Collections.zip contains source code (C#, VS.NET 2003), signed binaries, and documentation in .CHM format. See also readme.html inside the archive.
We attempted to create containers that behave similarly to STL's set
and multiset. In particular, we support:
A number of existing set implementations can be found on the Internet:
Extensions.Set
by Richard Bothne, Jim Showalter
Iesi.Collections.*Set
by Jason Smith
Lambda.Collections.Generic.Set<T>
by Rudiger Klaehn
SetCollection.Set
by Seth Peck
Most of these implementations are ultimately based either on Hashtable
or on a regular array. SortedList is also based on array. None of
them is based on a balanced tree structure.
In particular, Extensions.Set and Iesi.Collections.HashedSet
are based on Hashtable.
Lambda.Collections.Generic.Set
is based on .NET 2.0 generic Dictionary, which in fact is also hash table.
Iesi.Collections.SortedSet and SetCollection.Set are
based on array.
Iesi.Collections.ListSet stands out, because it is based on linked
list, but this generally performs similar to or worse than array.
Hash table has very fast insertion and removal. Ideally, these take constant time on average, but in real life it may be worsened by non-uniform hash function and excessive collisions. On the other hand, hash table has some significant downsides:
Hash table is not sorted: there is no efficient way to iterate through hash table elements in sorted order: from small to large, or vice versa.
Hash Table Requries Good Hash Function: Strings have a hash (allegedly good) hash function provided by .NET framework. Most other objects must invent their own, and creating a good hash function is not easy. Bad hash function may result in severe performance degradation, down to linear inserts, finds and removes.
Hash Function Must Be Synchronized with Key Comparison: It is imperative that if two keys compare equal, they have equal hash codes. Otherwise, hash table will not work properly. E.g., if we use case-insensitive comparison for string keys, we must use case-insensitive hash code provider. This requirement is easy to overlook and hard to verify. It makes putting non-trivial objects into hash table difficult and error-prone.
Multiple Equivalent Keys Are Not Allowed Hash table is not designed to hold several keys that compare equal. Therefore, it cannot support multisets.
Arrays are usually kept sorted, which allows for logarithmic search time. However, arrays take linear time to add or remove an element. Arrays can support multisets, although none of provided implementation does so.
None of the above implementations returns contained object as a result of search
operation. They just return a Boolean that indicates whether an equivalent key
has been found in the set. This is unsatisfactory for keys with non-trivial
comparison, e.g. case insensitive strings. For instance, if we searched for
"test" and the search returned true, there is no way to know
(quickly) whether the set actually contains "test", or "Test", or "TEST", or,
perhaps "TeSt".
Here is a small feature comparison chart for these implementations and www.ikriv.com.Collections.*Set:
| Feature | Extensions | Iesi/SortedSet | Iesi/HashedSet | Lambda | SetCollection | www.ikriv.com |
|---|---|---|---|---|---|---|
| Sorted | no | yes | no | no | yes | yes |
| Add() complexity | const | n | const | const | n*logn | logn |
| Remove() complexity | const | n | const | const | n2 | logn |
| Find/Contains complexity | const | logn | const | const | logn | logn |
| Find/Contains returns object | no | no | no | no | no | yes |
| Union/Intersection/etc. | yes | yes | yes | yes | yes | yes |
| Multisets | no | no | no | no | no | yes |
| Supports .NET 1.x | yes | yes | yes | no | yes | yes |
| Tested with nUnit | no | no | no | no | no | yes |
www.ikriv.com.Collections library is Copyright (c) 2005, Ivan
Krivyakov.
Permission is hereby granted, free of charge, to any person obtaining a copy of library binaries, source code, documentation andother accompanying files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Please kindly send your comments, bug reports, feature requests and other thoughts to this e-mail address.