F# Enums and Discriminated Unions
Discriminated Unions
Degenerate Discriminated Unions
Enums
Annoyances
Discriminated Unions
In F# (and OCaml) a discriminated union is a type that can have one or more
"branches", somewhat similar to the VARIANT type of the COM days. Each branch is a record
type that may contain zero or more fields. E.g. the following code
    type Shape =
        | Point
        | Circle of double
        | Square of double
        | Rectangle of System.Drawing.Point
        | ArbitraryLine of System.Drawing.Point array
defines a (floating) shape that can be a variety of different types.
Internally, F# creates an inner class for each "branch" of the union,
and all these inner classes are derived from Shape:
    // approximate reflector output
    class Shape
    {
        public class _Point : Shape {}
        public class _Circle : Shape { public double Value; }
        public class _Square : Shape { public double Value; }
        public class _Rectangle : Shape { public Point Value; }
        public class _ArbitraryLine : Shape { public System.Drawing.Point[] Value; }

        public static Shape Circle(double x) { return new _Circle... }
    }
Discriminated unions come in very handy with pattern matching.
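To make the pattern matching point concrete, here is a minimal sketch. SimpleShape and area are hypothetical names invented for this illustration; the type is a cut-down version of Shape that leaves out the System.Drawing branches so that the snippet runs on its own.

```fsharp
// Hypothetical, simplified variant of Shape (no System.Drawing dependency).
type SimpleShape =
    | Point
    | Circle of double   // radius
    | Square of double   // side length

// Each pattern picks a branch and binds its field to a name in one step.
let area shape =
    match shape with
    | Point -> 0.0
    | Circle radius -> System.Math.PI * radius * radius
    | Square side -> side * side

printfn "%f" (area (Square 2.0))
```

If a branch is left out of the match, the compiler reports an incomplete match warning, which is a large part of why the two features work so well together.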
One of the problems of the discriminated unions is that by default they don't
have a suitable ToString(), which in particular makes testing
difficult. Diagnostic messages like expected: _Rectangle, actual: _Rectangle
don't really help.
Fortunately, a ToString() implementation may be added by hand:
    type Shape =
        | Point
        | Circle of double
        ...
        override self.ToString() =
            match self with
            | Point -> "Point"
            | Circle(x) -> String.Format("Circle({0})", x)
            ...
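For reference, a complete version of a hand-written ToString() might look like the sketch below. It uses a hypothetical three-branch SimpleShape rather than the full Shape, so that every branch can be spelled out without guessing at the elided ones.

```fsharp
open System

// Hypothetical, simplified variant of Shape with every branch listed.
type SimpleShape =
    | Point
    | Circle of double
    | Square of double
    // One match arm per branch; a new branch needs a new arm here.
    override self.ToString() =
        match self with
        | Point -> "Point"
        | Circle(x) -> String.Format("Circle({0})", x)
        | Square(x) -> String.Format("Square({0})", x)

let circle = Circle 1.5
let point = Point
printfn "%s / %s" (circle.ToString()) (point.ToString())
```

The number formatting inside the parentheses follows the current culture, so the exact text for Circle 1.5 may differ between machines.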
The branches of a discriminated union may be accessed either by short name, e.g. Rectangle,
or by long name, e.g. Shape.Rectangle. In case of ambiguity, the compiler
does not warn you, and appears to choose the most recent suitable definition:
    type Pen =
        | Ball
        | Point

    let x = Point        // Pen.Point
    let y = Shape.Point  // Shape.Point
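One possible way to avoid the silent shadowing, sketched here as a suggestion rather than as something the compiler does for you, is the RequireQualifiedAccess attribute from the F# core library. The Shape below is a shortened stand-in for the one defined earlier.

```fsharp
// Shortened stand-in for the Shape type from the beginning of the article.
type Shape =
    | Point
    | Circle of double

// With this attribute the branches of Pen are only reachable as Pen.Ball and Pen.Point.
[<RequireQualifiedAccess>]
type Pen =
    | Ball
    | Point

let x = Point       // Shape.Point: the branches of Pen no longer shadow it
let y = Pen.Point   // Pen.Point
```

The price is that every use of Pen's branches, including in patterns, has to carry the type name.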
Degenerate Discriminated Unions
A degenerate discriminated union is a discriminated union where none of the branches has any fields. Note that this is not an official F# definition; I just use it for the purpose of this article. A degenerate union may serve as an enum-like object:
type Color = | Red | Green | Blue
The F# compiler appears to have a special optimization for this case.
No derived classes _Red, _Green, and _Blue will be created. Instead class Color
will have three static members for red, green, and blue:
    // approximate reflector output
    class Color
    {
        public static Color _red;
        public static Color _green;
        public static Color _blue;

        public static Color Red { get { return _red; } }
    }
This optimization is not just an implementation detail. Since all the "guts" of the generated classes are public, this affects interoperability of your F# object with other languages.
There are a number of problems with using degenerate discriminated unions as enums that may or may not be important:
- Discriminated union branches are comparable via > and <, but "next" or "previous" are not defined. In particular it is not possible to write
      for color in Red..Blue do...
- No conversion to integer
- No combining values together: although you can happily apply [<Flags>] to it, it does not mean anything
- No default ToString() implementation as mentioned above
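The first point in the list above can be sketched as follows. The comparison part works as is; allColors is a hypothetical hand-maintained list, which is roughly what one ends up writing in place of a Red..Blue range.

```fsharp
type Color =
    | Red
    | Green
    | Blue

// Comparison follows the order in which the branches are declared.
let redBeforeBlue = Red < Blue
let largest = List.max [Green; Blue; Red]

// There is no Red..Blue range, so the list of branches is kept by hand (hypothetical helper).
let allColors = [Red; Green; Blue]

for color in allColors do
    printfn "%b" (color < Blue)
```

Like a hand-written ToString(), such a list has to be updated whenever a branch is added or removed.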
Enums
Enums are defined very similarly to degenerate discriminated unions, but each "branch" is assigned an integer value:
type ColorEnum = | Red = 1 | Green = 2 | Blue = 3
You must explicitly assign values to all branches. You may assign the same value to multiple branches. No warning is issued in this case.
The enums are translated into native .NET enums as expected:
    // approximate reflector output
    enum ColorEnum { Red = 1, Green = 2, Blue = 3 }
There are a number of important distinctions between enums and discriminated unions:
- Enum values must be specified by the fully qualified name: ColorEnum.Red.
- Enums may be (explicitly) converted to and from integers and used in "for" loops.
- Enum values may be combined using the ||| operator (the [<Flags>] attribute is currently not checked).
- Enums have a default ToString() implementation that returns the symbolic name, e.g. "Red".
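The points above can be tried out with a short sketch. It uses the standard int and enum conversion functions; the loop goes through the underlying integers, which is only one possible way to iterate and assumes the values are contiguous.

```fsharp
type ColorEnum =
    | Red = 1
    | Green = 2
    | Blue = 3

let asInt = int ColorEnum.Green                    // explicit conversion to integer
let fromInt = enum<ColorEnum> 3                    // explicit conversion from integer
let combined = ColorEnum.Red ||| ColorEnum.Green   // bitwise OR of the underlying values
let name = ColorEnum.Red.ToString()                // symbolic name

// Looping over the enum by way of its integer values.
for i in int ColorEnum.Red .. int ColorEnum.Blue do
    let color = enum<ColorEnum> i
    printfn "%d = %s" i (color.ToString())
```

Note that with these particular values Red ||| Green has the underlying value 3, which is the same as Blue, so values meant to be combined are better assigned distinct powers of two.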
Annoyances
- The principle of least astonishment is violated: in other languages you can leave out the integer values and it gives you a default-ordered enum from zero up. In F# it gives you a completely different object.
- The syntactic distinction between degenerate unions and enums is quite subtle. The consequences, however, are significant, for both external interface and internal usage.
- You must explicitly assign integral values to all enum constants, even if you don't care about them.
- There is no warning if you assign the same value to multiple constants by mistake. This may easily happen when adding or removing a value from a large enum.
- You must write a manual ToString() implementation if you choose to use discriminated unions. This implementation is tedious to write and may easily get out of sync with the actual branches.
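As an illustration of the duplicate-value annoyance, the sketch below defines a hypothetical Status enum with a mistyped value and a hypothetical helper, hasDuplicateValues, that detects the problem at run time through System.Enum.GetValues. It is a workaround one might put into a unit test, not something the compiler provides.

```fsharp
// Hypothetical enum: Paused was meant to be 3 but was mistyped as 2.
type Status =
    | Active = 1
    | Pending = 2
    | Paused = 2

// The two constants are indistinguishable at run time.
let indistinguishable = (Status.Pending = Status.Paused)

// Hypothetical helper: true if any underlying value occurs more than once.
let hasDuplicateValues (enumType: System.Type) =
    let values =
        [ for v in System.Enum.GetValues(enumType) -> System.Convert.ToInt64(v) ]
    List.length (List.distinct values) <> List.length values

printfn "%b" (hasDuplicateValues typeof<Status>)
```

The helper assumes the underlying values fit into a 64-bit signed integer, which holds for the default int-based enums shown in this article.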