How to use Node.js Streams (And how not to!)

October 30th, 2019

When I first started to understand Node.js streams, I thought they were pretty amazing. I love JavaScript Promises, but they only resolve to one result. Streams, however, can provide a constant stream of data, as you might expect!

Functional Reactive Programming is all the rage these days. Libraries like MobX, RxJS and Highland.js make it easy to structure your front-end application as data flowing in one direction down through a chain of pipes.

You can pipe a stream to another stream so that the output of the first becomes the input to the next. Sounds like a really neat way to structure an application, right?

I've already rewritten a lot of my JavaScript code to use Promises. Are streams the next step in the evolution? Is it time to rewrite all our applications to use Node streams? (Spoiler: NO!)

Unix pipes are the best

I love working with pipes in Linux (or Unix). It's really nice to be able to take a text file, pipe that into a command, pipe the output to another command, and pipe the output from that into a final text file.

Here's an example of using the power of pipes on the command line. It takes a text file with a list of words, sorts the list, counts how many times each word appears, then sorts the counts to show the top 5 words:

$ cat words.txt | sort | uniq -c | sort -nr | head -n5

It's not important for you to understand these commands, just understand that data is coming in to each command as "Standard Input" (or stdin), and the result is coming out as "Standard Output" (or stdout). The output of each command becomes the input to the next command. It's a chain of pipes.

So can we use Node.js in the middle of this chain of pipes? Of course we can! And Node streams are the best way to do that.

Going down the pipe

Node.js streams are a great way to work with a massive set of data, more data than could possibly fit into memory. You can read a line of data from stdin, process that data, then write it to stdout.

For example, how would we make a Node CLI application that capitalizes text? Seems simple enough. Let's start with an application that just takes stdin and pipes directly to stdout. This code does almost nothing (similar to the cat unix command):

process.stdin.pipe(process.stdout);

Now we can start using our Node.js application in the middle of our pipeline:

$ cat words.txt | node capitalize.js | sort | uniq -c | sort -nr | head -n5

Pretty simple, right? Well, we're not doing anything useful yet. So how do we capitalize each line before we output it?

npm to the rescue

Creating our own Node streams is a bit of a pain, so there are some good libraries on npm to make this a lot easier. (I used to heavily use a package called event-stream, until a hacker snuck some code into it to steal bitcoins!)

First, we'll use the split package, which is a stream that splits an input into lines, so that we can work with the data one line at a time. If we don't do this, we might end up with multiple lines, or partial lines, or even partial Unicode characters! It's a lot safer to use split and be sure we are working with a single, complete line of text each time.
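
To get a feel for what a line splitter has to do, here's a minimal sketch. It is not how the split package is actually implemented, and createLineBuffer is a made-up name; it only illustrates holding on to a partial line until the rest of it arrives.

```javascript
// Hypothetical helper (not the split package's real code):
// buffers incoming text and calls onLine once per complete line.
function createLineBuffer(onLine) {
    let remainder = '';

    return {
        push(chunk) {
            const pieces = (remainder + chunk).split('\n');
            // the last piece has no newline after it yet, so keep it for later
            remainder = pieces.pop();
            pieces.forEach(piece => onLine(piece));
        },
        end() {
            // flush whatever is left over when the input ends
            if (remainder !== '') {
                onLine(remainder);
            }
            remainder = '';
        }
    };
}

// Chunks can arrive split anywhere, even in the middle of a word
const lines = [];
const lineBuffer = createLineBuffer(line => lines.push(line));
lineBuffer.push('hello wor');
lineBuffer.push('ld\ngood');
lineBuffer.push('bye\n');
lineBuffer.end();

console.log(lines); // [ 'hello world', 'goodbye' ]
```

Even though the input arrived in three awkward pieces, the callback only ever sees whole lines.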

We can also use a package called through which lets us easily create a stream to process data. We can receive data from an input stream, manipulate the data, and pipe it to an output stream.

const split = require('split');
const through = require('through');

process.stdin
    .pipe(split())
    .pipe(
        through(function(line) {
            this.emit('data', line.toUpperCase());
        })
    )
    .pipe(process.stdout);

There is a bug in the code above, because the newline characters are stripped out by split, and we never add them back in. No problem, we can create as many reusable streams as we want, to split our code up.

const through = require('through');
const split = require('split');

function capitalize() {
    return through(function(data) {
        this.emit('data', data.toUpperCase());
    });
}

function join() {
    return through(function(data) {
        this.emit('data', data + '\n');
    });
}

process.stdin
    .pipe(split())
    .pipe(capitalize())
    .pipe(join())
    .pipe(process.stdout);

Isn't that lovely? Well, I used to think so. There's something satisfying about having the main flow of your application expressed through a list of chained pipes. You can pretty easily imagine your data coming in from stdin, being split into lines, capitalized, joined back into lines, and streamed to stdout.

Down the pipe, into the sewer

For a few years, I was really swept up in the idea of using streams to structure my code. Borrowing from some Functional Reactive Programming concepts, it can seem elegant to have data flowing through your application, from input to output. But does it really simplify your code? Or is it just an illusion? Do we really benefit from having all our business logic tied up in stream boilerplate?

It's worse than it looks too. What if we emit an error in the middle of our pipeline? Can we just catch the error by adding an error listener to the bottom of the pipeline?

process.stdin
    .pipe(split())
    .pipe(capitalize())
    .pipe(join())
    .pipe(process.stdout)
    .on('error', e => console.error(e)); // this only listens to process.stdout!

Nope! It won't work because errors don't propagate down the pipe. It's nothing like Promises, where you can chain .then calls and add a .catch at the end to catch all the errors in between. No, you have to add an error handler after each .pipe to be sure:

process.stdin
    .pipe(split())
    .on('error', e => console.error(e))
    .pipe(capitalize())
    .on('error', e => console.error(e))
    .pipe(join())
    .on('error', e => console.error(e))
    .pipe(process.stdout)
    .on('error', e => console.error(e));

Yikes! If you forget to do this, you could end up with an "Unhandled stream error in pipe." with no stack trace. Good luck trying to debug that in production!
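
For what it's worth, Node 10 and later include a pipeline() function in the built-in stream module that reports an error from any stage to a single callback. Here's a minimal sketch of that alternative. The names upperCaseStream and run are made up for illustration, and the sketch skips the line splitting from the earlier examples, so treat it as a starting point rather than a drop-in replacement.

```javascript
const { pipeline, Transform } = require('stream');

// Hypothetical helper: a Transform stream that upper-cases each chunk.
// It works chunk by chunk, not line by line.
function upperCaseStream() {
    return new Transform({
        transform(chunk, encoding, callback) {
            // first argument is an error (none here), second is the output
            callback(null, chunk.toString().toUpperCase());
        }
    });
}

// Hypothetical wrapper: wires input -> transform -> output.
// pipeline() calls done once, with an error if any stage failed.
function run(input, output, done) {
    pipeline(input, upperCaseStream(), output, done);
}
```

In a CLI tool, run might be called with process.stdin and process.stdout, with a callback that logs the error if there is one. That keeps the error handling in one place, although the business logic is still wrapped in a stream.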

Conclusions and recommendations

I used to love streams but I've had a change of heart recently. Now, my advice is to use data and error listeners instead of through streams, and write to the output instead of piping. Try to keep the number of streams to a minimum, ideally just an input stream and an output stream.

Here's a different way we can write the same example from above, but without all the hassle:

const split = require('split');
const input = process.stdin.pipe(split());
const output = process.stdout;

function capitalize(line) {
    return line.toUpperCase();
}

input.on('data', line => {
    output.write(capitalize(line));
    output.write('\n');
});

input.on('error', e => console.error(e));

Notice I'm still piping to the split library, because that's straightforward. But after that, I'm using a listener to the data event of the input to receive data. Then, I'm using write() to send the result to the stdout output.

And notice that my capitalize() function no longer has anything to do with streams. That means I can easily reuse it in other places where I don't want to use streams, and that's a really good thing!

I still think Node streams are interesting but they are not the future of JavaScript. If used carefully, you can make pretty powerful command-line tools with Node.js. Just be careful not to overdo it!

The simplest Svelte component is an empty file

August 4th, 2019

I discovered something while refactoring my Svelte code that blew my mind: A Svelte component can be an empty file. How many other component frameworks can say that?

This was very useful during refactoring, because I could just create a placeholder file for the new component, import it and start using it:

<script>
import Empty from './empty.svelte';
</script>

<Empty/>

Sure, it doesn't do anything, but it doesn't break either.

I think this is very symbolic of what makes Svelte so groundbreaking and powerful. Let's dig deeper and see what it can teach us about Svelte.

A Svelte component is a file

With Svelte, components and files have a one-to-one relationship. Every file is a component, and files can't have more than one component. This is generally a "best practice" when using most component frameworks. Perhaps this practice comes from the practice of having each class in a separate file in languages like Java or C++.

By enforcing this practice, Svelte can make some assumptions that simplify your code. That brings me to the next observation.

No boilerplate, just make a new file

In most component frameworks, you need to write some code to define your component. With React, the simplest component is an empty function. In other frameworks, you need to import a library and call a special function to define and create your component. With Svelte, you just create a new .svelte file.

The Svelte compiler will take each file and generate a component for it automatically. And that brings us to another important observation.

You don't need Svelte to use a Svelte component

In order to mount a React component, you need to import react-dom. Using a Vue component requires the Vue library. An Angular application absolutely requires loading the Angular framework.

Svelte, on the other hand, is a compiler. In a way, Svelte is more like a programming language than a library. When you're programming in JavaScript, you don't need to import something to use a for loop. Similarly, you don't need to import anything in your Svelte code to use Svelte's template syntax. Your Svelte files get compiled into JavaScript and CSS. It's a very different approach.

You might guess that an empty file would compile into an empty JavaScript file, but every Svelte component comes with an API that allows you to use it outside of Svelte and mount it into the DOM. Here's what it looks like to use a compiled Svelte component:

import Empty from './empty.js';

const empty = new Empty({
  target: document.body,
  props: {
      // if we had some, they'd go here
  }
});

If we compile our empty component and bundle it with Svelte internals, it ends up being 2,080 bytes uncompressed, and 1,043 bytes gzipped. So the overhead for using Svelte ends up being only a kilobyte. Compare that to other frameworks that require 10x or 100x that many bytes just to mount an empty component!

Svelte is a new paradigm

At first glance, being able to use an empty file as a component seems like a silly, impractical gimmick. But looking deeper, I think it teaches us a lot about how Svelte differs from most if not all JavaScript component frameworks that came before it.

I imagine it will inspire other framework developers to take a similar approach and reap some of the same benefits. This is the kind of shift in thinking that changes things permanently. Svelte is not just a new framework, but a complete paradigm shift.

Svelte is the most beautiful web framework I've ever seen

June 28th, 2019

I first heard about Svelte a year ago, when Rich Harris presented it at JSConf EU 2018. The demo gods were a bit harsh on him, but it didn't matter to me, because I was so impressed by his philosophy and ideas that I was already sold. I knew he'd work out the kinks, go through a few major versions, and Svelte would be mature enough in no time.

I kind of forgot about Svelte after that, until last week, when I read Rich Harris' blog post Why I don't use web components. It reminded me how simple and beautiful Svelte's syntax is, and I decided it was time to give it some serious consideration.

First, I played a little bit with the Svelte REPL, and got a sense for how it works. Then I decided to try building a Tic Tac Toe game live on Twitch and YouTube. Even though I'm a total noob when it comes to Svelte, and I'd barely read the docs, it only took me about half an hour to get a Tic Tac Toe game working. After that, I explored some different Svelte features, trying to move the game state into a non-Svelte module, and discovered a few anti-patterns in the process.

At the end, I was completely blown away when I discovered that the production build had only 2,418 bytes of JavaScript..! That's 2.4kb including the Svelte runtime!!!

How does Svelte do it? Because Svelte is a compiler. It only includes the bare minimum of JavaScript necessary to get the job done. It turns the HTML templates you write into extremely simple DOM scripting. It transpiles the JavaScript you write so that your simple variable assignments trigger a re-render. It generates JavaScript classes to represent your .svelte files and wires everything up for you, so the only boilerplate you really need is a <script> tag and a <style> tag.

If you're interested in seeing the Tic Tac Toe game I built, you can clone it from GitHub and spin it up with npm install and npm start.

Otherwise, I highly recommend you check out the official Svelte Tutorial and try it out for yourself!

Formatting dates with JavaScript

April 19th, 2019

There are a number of popular JavaScript date formatting libraries out there, such as Moment.js, Luxon and date-fns. Those libraries are very powerful and useful, allowing you to request various date formats using a special syntax, but they also come with many more features than most people will ever use. And that means your users are probably downloading more JavaScript than they need to.

In this article, I'm going to show you how to use the basic built-in Date object to format dates without any third-party library. In other words, we'll be formatting dates with pure vanilla JavaScript.

Feel free to copy and paste the solutions below to use as a starting point in your own code bases. I'll demonstrate how to generate a number of common format types, but you may need to modify the solutions below a little bit to format dates and times to be exactly the way you want.

What about the Internationalization API?

Before I start, I should mention that there is some formatting functionality built into JavaScript dates, using the Internationalization API.

Using the Internationalization API, you can format dates according to a specific locale, which means formatting according to the customs of the user's location and language. If you're not picky about how dates and times will be displayed, this can work well in many cases, but it depends on each user's operating system, and which locales are installed on their devices. In other words, it can be hard to predict what the format will look like in any given browser.
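
For comparison, here's a minimal sketch of what using the Internationalization API can look like. The 'en-US' locale and the option values are just one possible choice, and the exact text you get back can vary between environments, which is the point made above.

```javascript
// A fixed date keeps the example reproducible: April 19th, 2019 (UTC)
const date = new Date(Date.UTC(2019, 3, 19));

// The locale and options here are assumptions for illustration
const formatter = new Intl.DateTimeFormat('en-US', {
    weekday: 'long',
    year: 'numeric',
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC'
});

const formatted = formatter.format(date);

console.log(formatted); // something like "Friday, April 19, 2019"
```

Notice that you choose which pieces to include, but not exactly how they are arranged or punctuated.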

If you want to format dates in some specific way and have full control over what is being displayed, please read on.

Date methods

Pretty much all the information we need can be provided by a few built-in methods on the date object:

const date = new Date(); // current time & date

date.getFullYear(); // Year
date.getMonth(); // Month of the year 0-11 (0 = January)
date.getDate(); // Date of the month, 1-31
date.getDay(); // Day of the week, 0-6 (0 = Sunday)
date.getHours(); // Hours, 0-23
date.getMinutes(); // Minutes, 0-59
date.getSeconds(); // Seconds, 0-59

Now you may have noticed that all these methods return numbers. But how are you supposed to get words out of them, like "Thursday" or "November"? And what if you want your month or date number to start with a zero? No problem, we can use JavaScript!

Years

Get the full year

Getting the year out of the Date object is really easy, but it's a four-digit year by default:

date.getFullYear(); // returns 2019

What if you want only two-digits?

There is a getYear() function in the Date object as well, and sometimes I accidentally use that instead of getFullYear(). However, it's more or less useless. In 2019, it returns 119. Why?? Because there is a Y2K bug baked into JavaScript, even though JavaScript was designed in 1995! In those first five years of JavaScript, people could call getYear() for a two-digit year, and simply add 1900 to get a four-digit year. And I guess that still works, because 1900 + 119 = 2019!
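
Here's that arithmetic as a quick sketch, using a fixed date in 2019 so the numbers match the ones above (the variable names are just for illustration):

```javascript
const date = new Date(2019, 5, 15); // June 15th, 2019

const legacyYear = date.getYear(); // the number of years since 1900
const fullYear = date.getFullYear();

console.log(legacyYear); // 119
console.log(1900 + legacyYear === fullYear); // true
```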

Since the getYear() function has been broken since the year 2000, I recommend getting a two-digit year using this approach instead:

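function getTwoDigitYear(date) {
    const year = date.getFullYear() % 100; // 19 for 2019, but 5 for 2005

    // add a 0 to the start if necessary, so 2005 becomes "05"
    return year < 10 ? `0${year}` : year.toString();
}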

Months

Display the month as a two-digit number

The getMonth() function of the Date object returns a number between 0 and 11. That has got to be one of the biggest surprises when working with dates in JavaScript. It also mostly makes this method useless without writing more code. Let's see how to do that.

function getTwoDigitMonth(date) {
    // add one to month to make it 1-12 instead of 0-11
    const month = date.getMonth() + 1;

    if (month < 10) {
        // add a 0 to the start if necessary
        return `0${month}`;
    } else {
        // for 10, 11 and 12, just return the month
        return month.toString();
    }
}
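
If your environment supports String.prototype.padStart (part of ES2017), the same zero-padding can be written more compactly. This is only a sketch of an alternative, and padTwoDigits is a made-up helper name:

```javascript
// Hypothetical helper: turns 4 into "04" and leaves 12 as "12"
function padTwoDigits(number) {
    return String(number).padStart(2, '0');
}

function getTwoDigitMonth(date) {
    // add one to month to make it 1-12 instead of 0-11
    return padTwoDigits(date.getMonth() + 1);
}

console.log(getTwoDigitMonth(new Date(2019, 3, 19))); // "04"
```

A helper like this could also be reused anywhere else a two-digit number is needed.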

Display the month as a string

If we want to display the month as a string of text like "February" or "Mar", then we need to use a JavaScript array with all the months. This is where the getMonth() method returning a number between 0 and 11 comes in handy: arrays start counting at 0 as well, so the result can be used directly as an index!

function getMonthName(date) {
    const months = [
        'January',
        'February',
        'March',
        'April',
        'May',
        'June',
        'July',
        'August',
        'September',
        'October',
        'November',
        'December'
    ];

    return months[date.getMonth()];
}

If you want to use a short form of the month, or just a single character, or another language, you can easily adapt the code above to change the contents of the array with whatever you prefer to use.
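
For example, a short-form version might look like this. The three-letter abbreviations and the function name getShortMonthName are just one possible choice:

```javascript
function getShortMonthName(date) {
    // same idea as before, only the contents of the array changed
    const shortMonths = [
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    ];

    return shortMonths[date.getMonth()];
}

console.log(getShortMonthName(new Date(2019, 1, 15))); // "Feb"
```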

Days of the week

If you're going to be displaying the day of the week, you'll probably want to be displaying some text. You can use the same approach that we used for formatting months above. Basically, you just need to define an array of text and access it using the getDay() result as an index.

function getWeekDayName(date) {
    // make sure you start with Sunday
    const weekDays = [
        'Sunday',
        'Monday',
        'Tuesday',
        'Wednesday',
        'Thursday',
        'Friday',
        'Saturday'
    ];

    return weekDays[date.getDay()];
}

Again, if you want to return different text, like 'Sun' or 'S', just replace the entries in the array with whatever you prefer.

Day of the month

The day of the month is pretty straightforward. You can just call getDate() to get the number, and you don't have to add one to it or anything.

Display the day of the month as a two-digit number

For some date formats, you may want to get a two-digit version of the date number. That's simple enough:

function getTwoDigitDayOfTheMonth(date) {
    const dayOfTheMonth = date.getDate();

    if (dayOfTheMonth < 10) {
        // add a 0 to the start if necessary
        return `0${dayOfTheMonth}`;
    } else {
        // for 10 or greater, just return the dayOfTheMonth
        return dayOfTheMonth.toString();
    }
}

Display an ordinal with the day of the month

If you want a fancy day of the month with an ordinal after it, like 1st, 2nd, 3rd, 4th, 21st, etc., you can easily figure that out with a switch statement:

function getDayOfTheMonthWithOrdinal(date) {
    const dayOfTheMonth = date.getDate();
    const ordinal = getOrdinal(dayOfTheMonth);

    return `${dayOfTheMonth}${ordinal}`;
}

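function getOrdinal(number) {
    // 11, 12 and 13 are exceptions: 11th, 12th, 13th
    const lastTwoDigits = number % 100;

    if (lastTwoDigits >= 11 && lastTwoDigits <= 13) {
        return 'th';
    }

    // using the % modulo operator to get the last digit of the number
    const lastDigitOfNumber = number % 10;

    switch (lastDigitOfNumber) {
        case 1:
            return 'st';
        case 2:
            return 'nd';
        case 3:
            return 'rd';
        default:
            return 'th';
    }
}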

Times

You can apply the techniques we used above with times as well, depending on what you need. Let's say we want to display a time format in 12-hour time with "am" or "pm", like "9:45pm". Here's how:

function formatTwelveHourTime(date) {
    // call functions below to get the pieces we need
    const hour = getHourInTwelveHourClock(date);
    const minute = getTwoDigitMinute(date);
    const meridiem = getMeridiem(date);

    // put it all together
    return `${hour}:${minute}${meridiem}`;
}

function getHourInTwelveHourClock(date) {
    const hour = date.getHours();

    if (hour === 0 || hour === 12) {
        return 12;
    }

    // otherwise, return a number between 1-11
    return hour % 12;
}

function getTwoDigitMinute(date) {
    const minute = date.getMinutes();

    if (minute < 10) {
        // add a 0 to the start if necessary
        return `0${minute}`;
    } else {
        // for 10 or greater, just return the minute
        return minute.toString();
    }
}

function getMeridiem(date) {
    const hour = date.getHours();

    if (hour < 12) {
        return 'am';
    } else {
        return 'pm';
    }
}

Bringing it all together

We've covered how to generate all the different pieces of various date formats. How about putting all the different pieces together?

You can use any method you like, but I suggest an approach like the following, using a template string.

function shortDateFormat(date) {
    // use the built-in function here
    const year = date.getFullYear();

    // use the functions we wrote above
    const month = getTwoDigitMonth(date);
    const dayOfTheMonth = getTwoDigitDayOfTheMonth(date);

    // put it all together, e.g. "YYYY-MM-DD"
    return `${year}-${month}-${dayOfTheMonth}`;
}

You can create as many formatting functions as you have formats to generate. Here's another example:

function longDateTimeFormat(date) {
    const weekDayName = getWeekDayName(date);
    const monthName = getMonthName(date); 
    const dayOfTheMonth = getDayOfTheMonthWithOrdinal(date);
    const year = date.getFullYear();
    const time = formatTwelveHourTime(date);

    // put it together, e.g. "Friday, April 19th, 2019 at 9:45pm"
    return `${weekDayName}, ${monthName} ${dayOfTheMonth}, ${year} at ${time}`;
}

Conclusion

I hope I've provided you with all the tools and techniques you might need to format dates and times. You can apply many of these techniques in other ways too, like formatting currencies. More importantly, with a little bit of custom JavaScript, you can avoid having additional dependencies for your project and downloads for your users.
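
As a rough sketch of what that could look like for currencies, here's the same zero-padding trick applied to an amount stored in cents. The function name formatDollars, the dollar sign, and the assumption that the amount is a non-negative whole number of cents are all just for illustration:

```javascript
// Hypothetical helper: formats a non-negative integer number of cents
function formatDollars(totalCents) {
    const dollars = Math.floor(totalCents / 100);
    const cents = totalCents % 100;

    // add a 0 to the start of the cents if necessary
    const twoDigitCents = cents < 10 ? `0${cents}` : cents.toString();

    // put it all together
    return `$${dollars}.${twoDigitCents}`;
}

console.log(formatDollars(1905)); // "$19.05"
```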

Is this thing on?

March 26th, 2019

Can you believe it has been four years since I last posted here, and eight years since I really wrote blog articles here? Is anybody still reading this? If so, I'm thinking about writing articles again. If not, well, I will probably write them anyway!

In the meantime, I've been streaming on Twitch for the past year. I stream at random once or twice a week, a few hours in the afternoon EST. There is an archive of videos if you want to watch past streams.

Hope to see you there! (If there is anyone out there??!)
