The Effect Of Back-Pressure When Piping Data Into Multiple Writable Streams In Node.js
As a learning exercise, I wanted to try creating a small static file server in Node.js. I figured this was small enough, yet complex enough, to help me learn the fundamentals of Node.js. During one of my refactoring efforts, I noticed that some larger files were only being "partially served" to the browser. After much debugging, I realized that one of the output streams wasn't draining. This one "slow" stream was, in turn, preventing the HTTP response from completing.
In Node.js, it's safe to pipe a single readable stream into multiple destinations. However, when it comes to streams, the workflow is only as fast as its slowest stream. In my particular situation, I had a through-stream that wasn't draining into any destination. As such, the internal buffers of the through-stream filled up and created back-pressure. The .pipe() operation on the source stream detected this back-pressure and paused the flow of data. This pause, in turn, stopped the HTTP response stream from completing.
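To make this concrete, here's a rough sketch of the handshake that .pipe() manages for each destination. This is not the actual Node.js implementation, just the general shape of the contract: write() returns false when a destination's internal buffer hits its highWaterMark, and the destination emits "drain" once that buffer has been flushed.

// ROUGH SKETCH (not the real implementation): the essence of the flow-control
// that .pipe() performs for each source / destination pairing.
function naivePipe( source, destination ) {

	source.on(
		"data",
		function handleData( chunk ) {

			// write() returns false when the destination's internal buffer has
			// reached its highWaterMark. That's back-pressure; stop reading.
			if ( ! destination.write( chunk ) ) {

				source.pause();

			}

		}
	);

	// Once the destination has flushed its internal buffer, it emits "drain"
	// and the source can start flowing again.
	destination.on(
		"drain",
		function handleDrain() {

			source.resume();

		}
	);

	source.on(
		"end",
		function handleEnd() {

			destination.end();

		}
	);

}

Each .pipe() call wires up its own set of handlers like these. So, when one source is piped into multiple destinations, any single slow destination can pause the shared source for everyone.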
To see this in action, I've boiled the workflow down into its simplest form. In the following code, I'm creating an HTTP server that attempts to pipe a file into both a PassThrough stream and the HTTP response stream:
// Require node modules.
var http = require( "http" );
var fileSystem = require( "fs" );
var stream = require( "stream" );

// Create an instance of our http server.
var httpServer = http.createServer(
	function handleRequest( request, response ) {

		// Create our file-input stream.
		var hankhill = fileSystem.createReadStream( "./hank-hill.png" );

		// Pipe it into a simple transform stream.
		// --
		// NOTE: Our transform stream is not being piped into anything else. As such,
		// its internal buffers MAY fill up (depending on the highWaterMark setting and
		// the size of the file being streamed), creating back-pressure and subsequently
		// pausing the file-input stream.
		hankhill.pipe( new stream.PassThrough() );

		// ALSO pipe it into the response stream.
		hankhill.pipe( response );

	}
);

httpServer.listen( 8080 );
console.log( "Server running on port 8080" );
Notice that the PassThrough stream isn't piping into another destination. Since the file being streamed in this case (178KB) is larger than the highWaterMark of the PassThrough stream (16KB by default), the buffers fill up, the source is paused, and only part of the file makes it to the HTTP response.
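If you want to watch this happen, you can subscribe to the source stream's "pause" and "end" events. This logging is my own debugging addition, not part of the original demo; the "pause" event fires whenever .pipe() pauses the source in response to back-pressure:

// Inside handleRequest(), after the .pipe() calls.
hankhill.on(
	"pause",
	function handlePause() {

		console.log( "Source paused - a destination is exerting back-pressure." );

	}
);

// In the broken version, this never fires; the source stays paused forever.
hankhill.on(
	"end",
	function handleEnd() {

		console.log( "Source fully consumed." );

	}
);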
To fix this, all we need to do is pipe the PassThrough stream into another destination so that its buffers can drain.
NOTE: In the following code, I could have bypassed the PassThrough stream altogether and piped directly into the createWriteStream() stream; however, I wanted to keep the two blocks of code in alignment.
// Require node modules.
var http = require( "http" );
var fileSystem = require( "fs" );
var stream = require( "stream" );

// Create an instance of our http server.
var httpServer = http.createServer(
	function handleRequest( request, response ) {

		// Create our file-input stream.
		var hankhill = fileSystem.createReadStream( "./hank-hill.png" );

		// Pipe it into a simple transform stream.
		// --
		// NOTE: This time, our passthrough / transform stream is piping into a file-
		// output stream. As such, the passthrough buffers can drain and the transform
		// stream doesn't have to pause the file-input stream (at least not indefinitely).
		// --
		// CAUTION: .pipe() returns the destination stream. So, to keep the PassThrough
		// in the middle of the chain, we pipe step-by-step rather than nesting calls.
		hankhill
			.pipe( new stream.PassThrough() )
			.pipe( fileSystem.createWriteStream( "./copy.png" ) )
		;

		// ALSO pipe it into the response stream.
		hankhill.pipe( response );

	}
);

httpServer.listen( 8080 );
console.log( "Server running on port 8080" );
Now, with all .pipe() destinations being drained in a timely manner, the HTTP response is able to complete and the file is fully served to the browser.
As a final thought, you can see the effect of back-pressure by changing the highWaterMark of the PassThrough stream. The PassThrough stream only exerts back-pressure when its internal buffers are full. So, if we increase the size of the internal buffer, we can buffer more data without creating back-pressure. And, if we never create back-pressure, we never have to pause the source of data.
In this case, the image being served was 178KB. So, I'll instantiate the PassThrough stream with a highWaterMark of 200KB. This should allow the entire file to be buffered internally, which will prevent back-pressure:
// Require node modules.
var http = require( "http" );
var fileSystem = require( "fs" );
var stream = require( "stream" );

// Create an instance of our http server.
var httpServer = http.createServer(
	function handleRequest( request, response ) {

		// Create our file-input stream.
		var hankhill = fileSystem.createReadStream( "./hank-hill.png" );

		// Pipe it into a simple transform stream.
		// --
		// NOTE: Our transform stream is not being piped into anything else. However,
		// this time, the highWaterMark is sufficiently high to buffer the entire file
		// before causing back-pressure. As such, the source stream never needs to be
		// paused.
		hankhill.pipe(
			new stream.PassThrough({
				highWaterMark: ( 200 * 1024 ) // 200KB, comfortably above the 178KB file.
			})
		);

		// ALSO pipe it into the response stream.
		hankhill.pipe( response );

	}
);

httpServer.listen( 8080 );
console.log( "Server running on port 8080" );
Increasing the highWaterMark isn't a "solution". I'm just using this approach to demonstrate the effect of back-pressure from the "slowest" stream in the workflow. Ultimately, the stream was doing exactly what it was supposed to do. But, when you're new to streams (and Node.js), like I am, the interactions between streams are not always obvious.
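As an aside, if you genuinely don't need the output of a transform stream, one way to keep its buffers draining is to pipe it into a writable "sink" that simply discards every chunk. This is just a sketch of the idea (using the simplified stream construction), not something from my original workflow:

// A minimal "discard" sink: it acknowledges every chunk without storing it,
// so anything piped into it can always drain.
var devNull = new stream.Writable({
	write: function( chunk, encoding, done ) {

		done();

	}
});

// Now, the PassThrough stream always has somewhere to drain to.
hankhill
	.pipe( new stream.PassThrough() )
	.pipe( devNull )
;

Alternatively, calling .resume() on the PassThrough stream would switch it into flowing mode, letting it consume and discard its own data without any destination at all.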
Reader Comments
@All,
This came up as part of an experiment in learning about Node.js by building a static file server:
www.bennadel.com/blog/2818-learning-node-js-building-a-static-file-server.htm
I found that my files weren't streaming if one of the destination streams wasn't getting drained properly.
A very nice practical example of back-pressure; this helped me a lot in understanding the effect of back-pressure on streams. Thanks.