Network Challenged Streams
Executive Summary
- Don't use StreamReader to read text from a socket. If you do, you open yourself to random hangs.
- You can use XmlTextReader to read from a socket, but you must be careful, or you open yourself to (predictable) hangs.
Downloads
Download test source code and binaries (25K).
Background
I was writing a piece of software that implements a request/response mechanism over a TCP socket. The protocol stipulates that both requests and responses are XML fragments.
I ended up with a combination of XmlTextReader, a StreamReader and a NetworkStream, and it worked all right until it started to hang on long messages. The client would behave as if it had received only part of the XML message and wait for the rest, while I could clearly see that the entire message had been delivered.
I started digging to the bottom of this, and found that the root cause of my headaches was the weird (not to say buggy) behavior of the StreamReader class when it points to a network stream.
Streams and TextReaders
In .NET Framework, classes derived from Stream represent streams of binary data. They operate in terms of bytes. Classes derived from TextReader represent text input and operate in terms of characters. The class that converts a binary Stream to text input is called StreamReader. Its job is to read the underlying binary stream and convert incoming bytes into characters using the specified encoding.
For some encodings this process is straightforward. ASCII encoding converts exactly one byte to exactly one character. UTF-16 encoding converts every two bytes to a character (surrogate pairs aside). Other encodings, such as the popular UTF-8, use a variable number of bytes per character. This creates certain complications. For example, if the binary input is broken into chunks of equal size, a multi-byte character may begin in one chunk and end in the next.
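Bytes (UTF-8):
Chunk 1 | Chunk 1 | Chunk 1 | Chunk 1 | Chunk 2 | Chunk 2 | Chunk 2 | Chunk 2 |
---|---|---|---|---|---|---|---|
97 | 97 | 97 | 194 | 163 | 98 | 98 | 98 |
Characters:
Chunk 1 | Chunk 1 | Chunk 1 | Chunks 1 and 2 | Chunk 2 | Chunk 2 | Chunk 2 |
---|---|---|---|---|---|---|
a | a | a | £ | b | b | b |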
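To see why the split character matters, here is a minimal sketch that decodes the two chunks from the table in two ways. The class and method names are mine and purely illustrative. Decoding each chunk on its own loses the pound sign, while a single System.Text.Decoder instance remembers the dangling byte 194 between calls and produces the right text.

```csharp
using System;
using System.Text;

public static class SplitCharacterDemo
{
    // Decodes every chunk independently. A character that straddles
    // two chunks cannot be reassembled and is replaced with U+FFFD.
    public static string DecodeStateless(params byte[][] chunks)
    {
        var sb = new StringBuilder();
        foreach (var chunk in chunks)
            sb.Append(Encoding.UTF8.GetString(chunk));
        return sb.ToString();
    }

    // Decodes all chunks with one Decoder, which keeps incomplete
    // trailing bytes and completes them on the next call.
    public static string DecodeStateful(params byte[][] chunks)
    {
        var decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        foreach (var chunk in chunks)
        {
            var chars = new char[Encoding.UTF8.GetMaxCharCount(chunk.Length)];
            int count = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, count);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var chunk1 = new byte[] { 97, 97, 97, 194 };
        var chunk2 = new byte[] { 163, 98, 98, 98 };
        Console.WriteLine(DecodeStateless(chunk1, chunk2)); // pound sign is lost
        Console.WriteLine(DecodeStateful(chunk1, chunk2));  // pound sign survives
    }
}
```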
Another important feature of network streams is that "no more data" and "end of stream" events are clearly distinct. The server may stop sending data, wait for 10 minutes and then close the connection. Or it may wait for 10 minutes and send more data. Until the connection is closed, there is no way to know whether more data will arrive or not. Contrast this with file streams, where both "no more data" and "end of stream" are reached simultaneously at the end of the file.
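The difference between the two events can be observed on a loopback connection. The following is only a sketch: the class name, the three-byte payload and the use of an OS-assigned port are my assumptions for illustration and are not part of the test code. While the connection is open and drained, DataAvailable is false and another Read() would block. Once the server closes its end, Read() returns 0.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class EndOfStreamDemo
{
    // Returns "<DataAvailable while open>,<Read result after close>".
    public static string Run()
    {
        // Port 0 asks the OS for any free port; illustration only.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var client = new TcpClient())
            {
                client.Connect(IPAddress.Loopback, port);
                NetworkStream clientStream = client.GetStream();
                var buffer = new byte[16];
                bool dataAvailableWhileOpen;

                using (var serverSide = listener.AcceptTcpClient())
                {
                    serverSide.GetStream().Write(new byte[] { 1, 2, 3 }, 0, 3);

                    // Drain the three bytes the server sent.
                    int total = 0;
                    while (total < 3)
                        total += clientStream.Read(buffer, total, buffer.Length - total);

                    // "No more data": nothing is buffered, but the connection
                    // is still open. Another Read() here would block.
                    dataAvailableWhileOpen = clientStream.DataAvailable;
                } // disposing serverSide closes the server end

                // "End of stream": Read() returns 0 once the peer has closed.
                int readAfterClose = clientStream.Read(buffer, 0, buffer.Length);
                return string.Format("{0},{1}", dataAvailableWhileOpen, readAfterClose);
            }
        }
        finally
        {
            listener.Stop();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```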
Problems with StreamReader over a NetworkStream
StreamReader can accept any stream as its data source. Unfortunately, StreamReader does not behave well when working with a NetworkStream.
StreamReader.Read() will always request more data from the underlying network stream, even if there is some text left in the read buffer.
This may not sound like something important, but let's consider the following scenario. The server sends the client a message that is 552 bytes long. The message ends with the characters "</message>". Then the server stops transmitting and waits for the client to acknowledge the message. Suppose the client reads the message 512 characters at a time:
```csharp
using System.IO;
using System.Net.Sockets;

private static void ReadText(NetworkStream stream)
{
    var buffer = new char[512];
    using (var reader = new StreamReader(stream))
    {
        while (true)
        {
            // Request up to 512 characters; 0 means end of stream.
            var chars = reader.Read(buffer, 0, buffer.Length);
            if (chars == 0) return;

            var s = new string(buffer, 0, chars);
            if (s.EndsWith("</message>"))
            {
                // send acknowledgement
            }
        }
    }
}
```
This simplistic client ignores the fact that the character sequence "</message>" may be split between two chunks of input. Suppose the messages are padded so that this never happens.
Unfortunately, the code above does not work: the client hangs. The following sequence of events leads to the hang:
- The client requests up to 512 characters from the StreamReader.
- StreamReader requests up to 1024 bytes from the network stream. 1024 bytes is the stream reader's default buffer size.
- StreamReader returns the first 512 characters of the message to the client. The remainder of the message is stored in the stream reader's buffer.
- The client again requests up to 512 characters from the StreamReader.
- The StreamReader does not return the remainder of the message to the client. Instead, it requests an additional 1024 bytes from the network stream. The server is waiting for the client to respond, so it won't send any more data. The client is waiting for the StreamReader to retrieve the remainder of the message, and the StreamReader is waiting for more data from the server. The system enters a deadlock state and hangs.
You can observe this scenario in practice using server scenario 1 and client scenario 2 in my test code: see "Running the tests".
The hang could have been prevented if the StreamReader.Read() call returned immediately when at least some data is available, just like NetworkStream.Read() does. Unfortunately, this is not the way StreamReader.Read() works. This makes StreamReader unsuitable for request-response style network communication.
One may suggest that the problem would go away if the client always requests exactly 1024 characters from the stream reader. This way the client would exhaust the stream reader's buffer on every read, and no leftover characters would cause the hang. This solution "almost" works. Keep in mind that the client requests 1024 characters, and the stream reader's buffer is 1024 bytes. If all characters were exactly one byte, this solution would indeed work. However, if the stream can contain multi-byte (international) characters, we may end up in a situation where 1024 characters are not exactly 1024 bytes. Thus, we lose the buffer alignment, there are leftover characters in the buffer, and the client hangs. This is demonstrated by server scenario 2 and client scenario 3.
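The mismatch is easy to check with a little arithmetic. The sketch below (the class and method names are illustrative) takes 1023 letters 'a' followed by one pound sign, the same kind of payload that server scenario 2 uses, and compares the character count with the UTF-8 byte count.

```csharp
using System;
using System.Text;

public static class BufferAlignmentDemo
{
    // Number of UTF-8 bytes in excess of the number of characters.
    public static int ExtraBytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text) - text.Length;
    }

    public static void Main()
    {
        // 1023 single-byte characters followed by one two-byte character (U+00A3).
        string text = new string('a', 1023) + "\u00A3";
        Console.WriteLine("Characters:  {0}", text.Length);                       // 1024
        Console.WriteLine("UTF-8 bytes: {0}", Encoding.UTF8.GetByteCount(text));  // 1025
        Console.WriteLine("Extra bytes: {0}", ExtraBytes(text));                  // 1
    }
}
```

One extra byte is enough: 1024 characters no longer fit into a 1024-byte buffer, so something is always left over.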
Reading XML from NetworkStream
As you remember, in my real-life scenario the server and the client exchange XML messages. Naturally, I wanted to use .NET's built-in XML parser to parse those messages. Fortunately, this can be done without running into the StreamReader bugs, but there are plenty of caveats.
XmlDocument.Load
XmlDocument.Load(Stream) seems like a good way to load an XML document from a socket, but it will not return until the input stream is closed. This makes XmlDocument.Load() unsuitable for the request-response scenario.
As strange as it seems, XmlDocument.Load() has legitimate reasons not to return until the end of the stream. Even after the root element is closed, an XML document may have comments, processing instructions and whitespace. Thus, there is no way to know whether we have read the whole document until the input stream is closed.
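A small sketch illustrates the point (the class and method names are mine). It loads a document from a string and reports the type of the last top-level node. A comment that follows the closing tag of the root element is still part of the document, so a parser that stops at the root end tag would miss it.

```csharp
using System;
using System.Xml;

public static class TrailingContentDemo
{
    // Loads the XML text and returns the type of the last top-level node.
    public static XmlNodeType LastNodeType(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.LastChild.NodeType;
    }

    public static void Main()
    {
        // The root element is closed, but the document is not over yet.
        Console.WriteLine(LastNodeType("<message>aaa</message><!-- trailing comment -->"));
        Console.WriteLine(LastNodeType("<message>aaa</message>"));
    }
}
```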
XmlTextReader Class
To quote MSDN, XmlTextReader "provides fast, non-cached, forward-only access to XML data". It can work with either a binary Stream or a TextReader. Amazingly, when XmlTextReader is supplied with a binary stream, it does not create an intermediate text reader. Instead, it calls the binary stream directly and converts bytes to characters using its own mechanisms. In other words, XmlTextReader has two branches of code: one that works with text input and the other that works directly with binary input, bypassing TextReader parsing mechanisms.
The binary parser built into XmlTextReader does not suffer from the hanging bug that affects StreamReader. This allows us to parse XML from a network stream, provided we feed the network stream directly to XmlTextReader. If we create an intermediate StreamReader, the StreamReader hang can come back, as client scenario 7 shows in the test results below.
Loading Root Element into an XmlDocument
XmlTextReader is a great class, but it parses the document one node at a time, which may be too low-level. If we want to retrieve the entire root element of the incoming XML message, we can use the ReadSubtree() method as follows:
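```csharp
using System;
using System.Net.Sockets;
using System.Xml;

private void ReadXmlSubtree(NetworkStream stream)
{
    using (var xmlReader = new XmlTextReader(stream))
    {
        while (xmlReader.Read())
        {
            // Skip the XML declaration and anything else that precedes the root element.
            if (xmlReader.NodeType != XmlNodeType.Element) continue;

            // The subtree reader covers the root element and nothing after it.
            using (var subTreeReader = xmlReader.ReadSubtree())
            {
                var doc = new XmlDocument();
                doc.Load(subTreeReader);
                Console.WriteLine("Received XML document: {0}", doc.OuterXml);
                return;
            }
        }
    }
}
```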
The subTreeReader stops when it reads the end element tag, and therefore does not hang forever. This method is illustrated by client scenario 6.
Summary of Test Results
I developed 4 server and 7 client scenarios that re-create different situations that may occur on the wire. The protocol for all scenarios is as follows:
- The client connects to the server.
- The client sends the scenario number to the server.
- The server produces valid XML whose exact content depends on the scenario.
- The server sends this XML to the client, either in one piece or in chunks.
- The client tries to process the XML on its end. Different client scenarios use different processing techniques.
- The client sends an ACK byte to the server.
- The connection is closed on both ends.
Note that in some scenarios client processing hangs. In this case the client must be forcibly terminated, and the server never receives the ACK byte.
Server Scenarios
There are four server scenarios that are defined below. In every scenario the server sends valid XML to the client:
1. Send the text '<?xml version='1.0'?><message>aaa...aaa</message>', all in one transmission. There are 512 characters 'a' in the message.
2. Send the text '<?xml version='1.0' encoding='utf-8'?><message>', then wait 100 ms and send the text 'aaa...aaa£</message>' with 1023 characters 'a'. The message encoding is UTF-8. All characters except for the pound sign £ are represented by one byte on the wire. The pound sign is represented by two bytes.
3. Same as #2, but with 4095 characters 'a'.
4. Send the text '<?xml version='1.0' encoding='utf-8'?><message>', then wait 100 ms and send the text 'aaa...aaa</message>' with 4048 characters 'a'. The message encoding is UTF-8. All characters are represented by one byte on the wire.
Client scenarios and their interaction with the server scenarios are shown in the table below:
Client scenario | Server scenario 1 | Server scenario 2 | Server scenario 3 | Server scenario 4 |
---|---|---|---|---|
1. Read as binary, 512 bytes at a time. | Works | Works | Works | Works |
2. Read as text, 512 characters at a time. | Hangs | Hangs | Hangs | Hangs |
3. Read as text, 1024 characters at a time. | Works | Hangs | Hangs | Works |
4. Use XmlDocument.Load() directly from the NetworkStream. | Hangs | Hangs | Hangs | Hangs |
5. Use XmlDocument.Load() from a TextReader that points to the NetworkStream. | Hangs | Hangs | Hangs | Hangs |
6. Use XmlReader.ReadSubtree() directly from the NetworkStream. | Works | Works | Works | Works |
7. Use XmlReader.ReadSubtree() from a TextReader that points to the NetworkStream. | Works | Works | Works | Hangs |
Running the Tests
To run the server, open the command line, navigate to the directory with NetworkChallengedStreams.exe and issue the following command:
start NetworkChallengedStreams.exe /server
To run the client, issue the following command:
NetworkChallengedStreams.exe serverScenario clientScenario
where serverScenario is the server scenario number (1-4) and clientScenario is the client scenario number (1-7).
Conclusion
- Don't ever use StreamReader with NetworkStream. If you have to read text from a socket, read the bytes and decode them yourself. Keep in mind that a multi-byte character may straddle a chunk boundary.
- If your input is XML, you don't need to decode characters yourself. You can parse it using the ReadSubtree() method.
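For the first point, here is one way the "read the bytes and decode them yourself" approach might look. This is a sketch and not code from the test project: the MessageReader class, the ReadMessage method and the terminator parameter are my own names, and I assume UTF-8 text with a single message per request. It relies on NetworkStream.Read() returning as soon as some data is available, uses a stateful Decoder so that a character split between chunks is decoded correctly, and checks the terminator against the accumulated text so that a terminator split between chunks is still found.

```csharp
using System;
using System.IO;
using System.Text;

public static class MessageReader
{
    // Reads until the decoded text ends with the terminator or the stream ends.
    // Returns everything decoded so far.
    public static string ReadMessage(Stream stream, string terminator)
    {
        var bytes = new byte[512];
        var chars = new char[Encoding.UTF8.GetMaxCharCount(bytes.Length)];
        var decoder = Encoding.UTF8.GetDecoder(); // keeps partial characters between calls
        var text = new StringBuilder();

        while (true)
        {
            // On a NetworkStream this returns as soon as some data is available;
            // 0 means the peer closed the connection.
            int byteCount = stream.Read(bytes, 0, bytes.Length);
            if (byteCount == 0) break;

            int charCount = decoder.GetChars(bytes, 0, byteCount, chars, 0);
            text.Append(chars, 0, charCount);

            if (EndsWith(text, terminator)) break;
        }
        return text.ToString();
    }

    private static bool EndsWith(StringBuilder text, string suffix)
    {
        if (text.Length < suffix.Length) return false;
        return text.ToString(text.Length - suffix.Length, suffix.Length) == suffix;
    }

    public static void Main()
    {
        // A MemoryStream stands in for the NetworkStream here.
        byte[] data = Encoding.UTF8.GetBytes("<message>aaa\u00A3</message>");
        Console.WriteLine(ReadMessage(new MemoryStream(data), "</message>"));
    }
}
```

With a real socket you would pass the NetworkStream instead of the MemoryStream and send the acknowledgement once ReadMessage returns.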
Feedback
Questions? Comments?
Drop me a line
Copyright (c) Ivan Krivyakov. Last updated: October 28, 2013