A Light Introduction to D3.js, with LEGOs
Written by Steven Braun
What is D3.js?
D3.js, which stands for Data-Driven Documents, is a JavaScript library that has become the de facto standard for creating interactive and static data visualizations on the web. Popularized by websites like the New York Times, D3 has an influence on the internet that is hard to avoid.
What are LEGOs?
Invented by Ole Kirk Christiansen, LEGO is a line of plastic interlocking blocks that can be used to construct figurines, models, and more. By piecing together more and more blocks, you can build more elaborate constructions.
Why are you putting them together?
Why not? It turns out that LEGO is a useful medium for understanding how D3.js works — by simulating how shapes are rendered by D3 on an SVG (scalable vector graphics) canvas.
This light introduction to D3.js assumes you have a working knowledge of JavaScript, CSS, HTML, and a little bit of SVG. It also assumes you have a basic understanding of what the DOM (Document Object Model) is and how web pages are constructed.
Materials
This tutorial was originally intended to work with physical LEGOs, so if you have access to those, you will need the following:
  • 16x16 (5") or larger baseplate
  • Several 2x2, 2x3, 2x4, and/or 2x6 rectangular blocks
  • Several 1x1 round (single) blocks
  • Three 1x4 or two 1x6 blocks (to create axes)
However, if you don't have access to physical LEGOs, that's perfectly fine! This tutorial includes interactive applets that simulate the work with LEGOs we will do.
There are also digital materials that accompany this tutorial:
With the materials ready, let's get started with a simple example: making a scatter plot.
Exercise
Using the blocks provided below, create a basic scatter plot with an x-axis, y-axis, and at least 6 points. For the axes, use the 1x4 rectangular blocks, and for the data points, use the round blocks. As you do this, come up with a list of commands, dispatched from the origin (0,0) at the bottom left of the baseplate. What kinds of commands are needed? Be verbose and specific, including steps from selecting your blocks to attaching them to the baseplate. Record your data in the table on the LEGO canvas worksheet linked above.
What kinds of commands did you come up with? At a minimum, these must include translational commands such as move x spaces and move y spaces. They also include a command to draw shape at current position. With your LEGOs (real or virtual), you may have come up with something like what is displayed below:
To produce this model in the specified coordinate space, with the origin of dispatch at the bottom left corner of the baseplate, we need to rely on the kinds of commands listed above. But wait! There's one more we didn't consider: a select command. Before we translate our position across the canvas, we must first select the kind of block we want to use. To start off, we might use the following order of commands and blocks:
In total, for the sample scatter plot displayed above, we need the following set of commands (with the assumption that we have already called upon and drawn our baseplate):
  1. Select round block, move 4 x-spaces, move 5 y-spaces, and append round block
  2. Select round block, move 6 x-spaces, move 6 y-spaces, and append round block
  3. Select round block, move 8 x-spaces, move 9 y-spaces, and append round block
  4. Select round block, move 10 x-spaces, move 8 y-spaces, and append round block
  5. Select round block, move 12 x-spaces, move 10 y-spaces, and append round block
  6. Select round block, move 14 x-spaces, move 13 y-spaces, and append round block
When we work in D3, we typically follow an order of commands that parallels the kinds of commands you imagined while creating your LEGO chart. As with the LEGOs, order and sequence are important; certain actions must occur before others for the end result to look the way it should (e.g., you must move x spaces before you draw shape at current position). In D3, this sequencing is achieved by chaining methods, and there is an analogous method for each command listed above. As a preview, we might translate our commands above into the following pseudocode, which can then be translated into D3 methods.
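As a rough illustration of how chaining keeps commands in sequence, here is a minimal sketch in plain JavaScript. The LegoPlate class and its select, moveX, moveY, and append methods are hypothetical names invented for this example; they are not part of D3. Each method returns the plate itself, which is what allows the next command to be chained on.

```javascript
// Hypothetical helper for illustration only; not part of D3.
function LegoPlate() {
  this.blocks = [];     // blocks already attached to the baseplate
  this.current = null;  // block currently being positioned
}

// Select the kind of block to place, starting at the origin (0,0).
LegoPlate.prototype.select = function (kind) {
  this.current = { kind: kind, x: 0, y: 0 };
  return this; // returning `this` is what makes chaining possible
};

// Move the selected block a number of spaces along the x-axis.
LegoPlate.prototype.moveX = function (spaces) {
  this.current.x += spaces;
  return this;
};

// Move the selected block a number of spaces along the y-axis.
LegoPlate.prototype.moveY = function (spaces) {
  this.current.y += spaces;
  return this;
};

// Attach the selected block to the baseplate at its current position.
LegoPlate.prototype.append = function () {
  this.blocks.push(this.current);
  this.current = null;
  return this;
};

// The first two commands from the list above, written as chains.
var plate = new LegoPlate();
plate.select("round").moveX(4).moveY(5).append();
plate.select("round").moveX(6).moveY(6).append();

console.log(plate.blocks);
```

Note that calling moveX before select would fail here, which mirrors the point that the order of commands matters.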
Moving in SVG and coordinate space
Creating shapes with D3 works much the same way as moving LEGO blocks on a baseplate, with one twist. In LEGO space, we specify that the coordinate axis has an origin (0,0) at the bottom left corner of the canvas, with x-values increasing left to right and y-values increasing bottom to top, as illustrated below:
In an SVG canvas, however, the origin (0,0) is positioned elsewhere: in the top left corner. X-values increase from left to right, but y-values increase from top to bottom instead of bottom to top:
So when we work with D3, we must deploy our positioning commands with respect to this new origin. With our commands in tow, we are now ready to create our first D3 chart: a scatter plot.
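To make the change of origin concrete, the sketch below converts a position measured from the bottom left (LEGO space) into one measured from the top left (SVG-style space). The function name legoToSvg and the plate height of 16 spaces are assumptions chosen for this example, based on the 16x16 baseplate in the materials list.

```javascript
// Convert a LEGO-space position (origin at bottom left) into an
// SVG-style position (origin at top left).
// `plateHeight` is the number of spaces along the y-axis of the baseplate.
function legoToSvg(point, plateHeight) {
  return {
    x: point.x,                // x still increases from left to right
    y: plateHeight - point.y   // y is now measured from the top edge
  };
}

var legoPoint = { x: 4, y: 5 };           // first point from the LEGO commands
var svgPoint = legoToSvg(legoPoint, 16);  // assuming a 16x16 baseplate

console.log(svgPoint); // x stays 4, y becomes 11
```

Only the y-value changes: a point 5 spaces up from the bottom of a 16-space plate sits 11 spaces down from the top.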
Assembling a D3 script
For our exercises here, D3 acts as an intermediary between our LEGO construction and the SVG canvas. Put another way, D3 is the machinery that transforms or translates our operations in LEGO space into operations in SVG space. As we noted above, every operation we carry out in LEGO space has an analogous operation in SVG, and D3 is the physical API that gives us access to those connections.
We can think of a chart or graph created in D3 as a modular construction built up from chunks of code. At a minimum, any D3 script that is used to create a chart, such as a bar chart or scatter plot, must have 5 basic components generally in this order:
  1. Generate SVG. Define the width, height, and margins of your SVG canvas and then generate that canvas within the page.
  2. Define data. Structure data as an array of elements or objects.
  3. Define scales. Specify how to transform values in the data into pixel dimensions on the screen.
  4. Draw axes. Render the x-axis and y-axis from the scales.
  5. Bind and draw data. Assign data to elements on the screen and generate shapes on the SVG canvas.
These components mirror our LEGO commands, with a couple of additions. While before we were simply selecting blocks, positioning them, and appending them to the canvas, now we must do these things on top of some D3 archetypes, including scales and axes.
Assembling a scatter plot
To create a scatter plot in D3, we can parse out our modular components with the code below, which represents a complete chart.
Generate SVG
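// Dimensions of the SVG canvas, in pixels
var width = 500;
var height = 500;

// Space reserved around the plotting area for the axes
var margin = {top: 25, left: 25, right: 25, bottom: 25};

// Append an <svg> element to the page body and size it
var svg = d3.select("body")
  .append("svg")
  .attr("width", width)
  .attr("height", height);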
Define data
// Each object is one data point with an x-value and a y-value
var data = [
  {x: 2, y: 3},
  {x: 4, y: 4},
  {x: 6, y: 7},
  {x: 8, y: 6},
  {x: 10, y: 8},
  {x: 12, y: 11}
];
Define scales
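// Map x-values in the data to horizontal pixel positions
var xScale = d3.scaleLinear()
  .domain([1, 12])
  .range([margin.left, width - margin.right]);

// Map y-values to vertical pixel positions; the range is flipped
// because SVG y-values increase from top to bottom
var yScale = d3.scaleLinear()
  .domain([1, 12])
  .range([height - margin.bottom, margin.top]);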
Draw axes
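// Draw the x-axis along the bottom edge of the plotting area
var xAxis = svg.append("g")
  .attr("transform", "translate(0," + (height - margin.bottom) + ")")
  .call(d3.axisBottom().scale(xScale));

// Draw the y-axis along the left edge of the plotting area
var yAxis = svg.append("g")
  .attr("transform", "translate(" + margin.left + ",0)")
  .call(d3.axisLeft().scale(yScale));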
Bind and draw data
// Bind each data point to a circle and position it with the scales
var circles = svg.selectAll("circle")
  .data(data)
  .enter()
  .append("circle")
    .attr("cx", function(d) { return xScale(d.x); })
    .attr("cy", function(d) { return yScale(d.y); })
    .attr("r", 6);
The code above produces the following scatter plot — no bells, no whistles, just circles plotted against an x-axis and a y-axis.
In the code above, some elements are highlighted red. These specify elements in the code that are recyclable; if you leave the rest of the code unaltered, you can create an infinity of different scatter plots by altering the data and the accompanying scale domains.
What's going on?
The scatter plot displayed above is a direct translation of our LEGO chart, with LEGO-based commands replaced by D3 methods. Where we once told the mover to travel a certain number of x- and y-spaces, we now specify positions as attributes of the shapes we append to the SVG canvas. For the most part, there is a one-to-one correspondence between our LEGO functions and these methods, with the order reversed: where we once moved x-spaces and then y-spaces before appending our shapes, we now first append our shape and then specify its position.
You might notice something slightly odd going on with our data, however. Recall that in our LEGO commands, all operations are dispatched from an absolute origin at the lower left corner of the baseplate. In our SVG canvas, the absolute origin is now in the top left, but we also have a new relative origin at the corner created by the intersection of our "axes." It is from here that the positions of data points are determined, not from the absolute origin.
When we create our scales in D3, what we are doing is asking D3 to translate positions in LEGO space into pixel positions on the screen. Whereas LEGO positions are discrete, pixel dimensions are continuous.
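To see what such a translation involves, here is a simplified stand-in for a linear scale written in plain JavaScript. The linearScale helper is a hypothetical function for illustration, not D3's own code, but it uses the same domain and range as the scatter plot above.

```javascript
// Simplified stand-in for a linear scale (illustration only, not D3's code).
// `domain` is the span of data values; `range` is the span of pixel positions.
function linearScale(domain, range) {
  return function (value) {
    // How far along the domain the value sits, as a fraction
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    // The same fraction of the way along the pixel range
    return range[0] + t * (range[1] - range[0]);
  };
}

var width = 500;
var height = 500;
var margin = { top: 25, left: 25, right: 25, bottom: 25 };

var xScale = linearScale([1, 12], [margin.left, width - margin.right]);
var yScale = linearScale([1, 12], [height - margin.bottom, margin.top]);

console.log(xScale(1));   // left edge of the plotting area: 25
console.log(xScale(12));  // right edge of the plotting area: 475
console.log(xScale(6.5)); // halfway between: 250
console.log(yScale(1));   // bottom of the plotting area: 475
console.log(yScale(12));  // top of the plotting area: 25
```

A LEGO block can only sit at whole-number positions, but the scale happily accepts 6.5 and returns a pixel position halfway across the plotting area.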
Binding data
You may be familiar with Mike Bostock's oft-cited tutorial on D3 selections titled "Three Little Circles." If not, take a moment to read it; we won't be covering the mechanics of enter, update, and exit selections here.
However, it's worth thinking through what LEGOs can tell us about how these selections operate. In our LEGO example, we can think of data-binding as an additional interlocking LEGO block added on top of the blocks that we add to our SVG "canvas" (baseplate). Beneath our LEGO blocks, there's actually an invisible layer we haven't considered yet — the DOM, or "Document Object Model." The DOM dictates the structural hierarchy of the web page as a whole, including elements inside of the SVG (which itself is a DOM element).
In a canonical select-and-append procedure in D3, there are 4 basic methods (tasks):
  1. svg.selectAll("circle")
  2. .data(data)
  3. .enter()
  4. .append("circle");
In step 1, we select all the circle elements inside the SVG. But when we first call this method, there are no circles, so the selection is empty. In step 2, we bind data with the method .data(). This data binding compares the data against the empty selection and sets up a placeholder for every unit of data that has no circle yet, one placeholder per unit of data. In step 3, .enter() hands us those placeholders, each of which remembers the SVG element as its parent. In step 4, we append a circle for each placeholder, so every new circle becomes a child of the SVG element and carries its unit of data. It is in step 4 where D3 moves from bookkeeping about the DOM to manipulations directly in SVG space.
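The four steps can be imitated in plain JavaScript. The sketch below is a deliberately simplified model, not D3's implementation: the enterPlaceholders function, the placeholder objects, and the plain-object "circles" are all hypothetical stand-ins, and data is matched to elements by position only.

```javascript
// Simplified model of steps 1-4 (illustration only, not D3's code).
// `existingElements` stands for the circles already in the SVG;
// `data` is the array we want to bind.
function enterPlaceholders(existingElements, data) {
  var placeholders = [];
  // Every unit of data beyond the existing elements gets a placeholder
  for (var i = existingElements.length; i < data.length; i++) {
    placeholders.push({ datum: data[i], parent: "svg" });
  }
  return placeholders;
}

var data = [{ x: 2, y: 3 }, { x: 4, y: 4 }, { x: 6, y: 7 }];

// Step 1: the selection of circles is empty
var existingCircles = [];

// Steps 2 and 3: bind the data and collect one placeholder per unit of data
var placeholders = enterPlaceholders(existingCircles, data);

// Step 4: create a circle for each placeholder, carrying its unit of data
var circles = placeholders.map(function (placeholder) {
  return { tag: "circle", parent: placeholder.parent, datum: placeholder.datum };
});

console.log(circles.length); // one circle per unit of data
```

If two circles already existed, the same call would produce only one placeholder, which is why re-running the code on a populated chart does not duplicate shapes.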
There's an important reality to consider now. When we work with our LEGOs, the LEGO blocks and the DOM are actually one and the same. Each block we append to the LEGO baseplate (SVG canvas) is both a shape and the DOM element representing that shape; each LEGO block is a child element of the baseplate (SVG container) to which it is attached. Thus, when we bind data to our LEGO blocks, we are consequently binding data to the DOM elements which represent them.
Recall earlier that we said D3 was the "physical API" operating between our LEGO blocks and the SVG canvas. When we use D3 in web pages, it functions as an API that manipulates the DOM. Since we've now made the claim that our LEGO space is actually just an abstraction of the DOM, our metaphor has come full circle.
Let's apply what we've learned to a slightly more complex example: a basic bar chart.
Exercise
Using the blocks provided below, this time create a basic bar chart with an x-axis, y-axis, and bars. As you do this, come up with a list of commands, dispatched from the origin (0,0) at the bottom left of the baseplate. What kinds of commands are needed? Be verbose!
What kinds of commands did you come up with? As in the previous exercise, these might include commands to move x spaces or move y spaces, but this time, some more commands may be needed. Since we are creating rectangles, we also need to specify dimensions. How wide should our rectangles be? How tall should they be? Perhaps the bar chart you created looks like this:
These kinds of questions are likewise answered in SVG and D3. When we append a rectangle to our SVG canvas, we must now tell D3 to make it a certain size. This is accomplished through chaining attribute methods.
Assembling a bar chart
We can now assemble a basic bar chart, utilizing again the modular components outlined above.
Generate SVG
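// Dimensions of the SVG canvas, in pixels
var width = 500;
var height = 500;

// Space reserved around the plotting area for the axes
var margin = {top: 25, left: 25, right: 25, bottom: 25};

// Append an <svg> element to the page body and size it
var svg = d3.select("body")
  .append("svg")
  .attr("width", width)
  .attr("height", height);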
Define data
// Each object is one bar: a category (x) and a value (y)
var data = [
  {x: 'A', y: 3},
  {x: 'B', y: 5},
  {x: 'C', y: 2},
  {x: 'D', y: 8}
];
Define scales
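// Divide the horizontal pixel range into one band per category
var xScale = d3.scaleBand()
  .domain(['A', 'B', 'C', 'D'])
  .rangeRound([margin.left, width - margin.right])
  .paddingInner(0.5)
  .paddingOuter(0.5);

// Map y-values to vertical pixel positions (flipped for SVG space)
var yScale = d3.scaleLinear()
  .domain([1, 12])
  .range([height - margin.bottom, margin.top]);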
Draw axes
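// Draw the x-axis along the bottom edge of the plotting area
var xAxis = svg.append("g")
  .attr("transform", "translate(0," + (height - margin.bottom) + ")")
  .call(d3.axisBottom().scale(xScale));

// Draw the y-axis along the left edge of the plotting area
var yAxis = svg.append("g")
  .attr("transform", "translate(" + margin.left + ",0)")
  .call(d3.axisLeft().scale(yScale));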
Bind and draw data
// Every bar is as wide as one band of the x-scale
var barWidth = xScale.bandwidth();

var bars = svg.selectAll("rect")
  .data(data)
  .enter()
  .append("rect")
    // Top left corner of each bar
    .attr("x", function(d) { return xScale(d.x); })
    .attr("y", function(d) { return yScale(d.y); })
    .attr("width", barWidth)
    // Distance from the top of the bar down to the x-axis
    .attr("height", function(d) {
      return height - margin.bottom - yScale(d.y);
    });
There are a few notable differences from the scatter plot we created previously. The first is that we are now working with ordinal data in the x-axis domain (x-values of "A," "B," "C," and "D"), so we must use an ordinal scale rather than a linear one to translate x-values into pixel positions on the screen (here, scaleBand accomplishes this task well for us). This kind of scale divides a range of pixel positions into evenly sized bands, with optional inner and outer padding values providing spacing between them. To retrieve the width of each band generated by our scale, we call xScale.bandwidth() and use the returned value as the width of each of our rectangles.
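The arithmetic behind dividing a range into bands can be sketched in plain JavaScript. The bandScale helper below is a hypothetical simplification written for this example: it ignores rounding and alignment options, and it treats the padding values as fractions of one step (a band plus the gap after it), which is an assumption about how the padding is applied. It should not be read as D3's own implementation.

```javascript
// Simplified stand-in for a band scale (no rounding; illustration only).
function bandScale(domain, range, paddingInner, paddingOuter) {
  var n = domain.length;
  // One "step" is a band plus the gap that follows it
  var step = (range[1] - range[0]) / (n - paddingInner + 2 * paddingOuter);
  // The first band starts after the outer padding
  var start = range[0] + paddingOuter * step;

  // Calling the scale returns the left edge of a category's band
  var scale = function (key) {
    return start + step * domain.indexOf(key);
  };
  // The band itself is the step minus the inner padding
  scale.bandwidth = function () {
    return step * (1 - paddingInner);
  };
  return scale;
}

// Same domain, range, and padding values as the bar chart above
var xScale = bandScale(['A', 'B', 'C', 'D'], [25, 475], 0.5, 0.5);

console.log(xScale('A'));         // 75
console.log(xScale('D'));         // 375
console.log(xScale.bandwidth());  // 50
```

With these settings, the 450 available pixels split into steps of 100 pixels, and each bar takes up half of its step.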
There's also something peculiar going on with the height attribute declared for each bar in the chart. Why wouldn't the height simply be equal to the value returned by the yScale() function? Recall that our yScale simply transforms our LEGO positions into pixel positions on the page. Every value returned by yScale is actually the origin (starting point) for drawing each rectangle in our chart, and rectangles, like SVG coordinate space itself, are drawn starting from their top left corner. We must therefore derive the height of each rectangle from this value and the dimensions of our SVG: the height is the distance between the rectangle's origin and the x-axis, which is positioned margin.bottom pixels from the bottom edge of the SVG.
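A worked example for a single bar may help. The sketch below recomputes the top edge and height of bar "D" in plain JavaScript; the hand-written yScale function is a stand-in that mirrors the domain and range used above, and the names barTop and barHeight are introduced only for this illustration.

```javascript
var height = 500;
var margin = { top: 25, left: 25, right: 25, bottom: 25 };

// Pixel position of the x-axis, measured from the top of the SVG
var axisY = height - margin.bottom;

// Stand-in for the linear y-scale: domain [1, 12], range [axisY, margin.top]
function yScale(value) {
  var t = (value - 1) / (12 - 1);
  return axisY + t * (margin.top - axisY);
}

var datum = { x: 'D', y: 8 };

// Where the rectangle starts drawing (its top edge)
var barTop = yScale(datum.y);

// How far it must extend downward to reach the x-axis
var barHeight = height - margin.bottom - yScale(datum.y);

console.log(barTop);             // roughly 188.6
console.log(barHeight);          // roughly 286.4
console.log(barTop + barHeight); // lands on the x-axis
```

Using yScale(d.y) directly as the height would instead give the tallest values the shortest bars, since larger data values map to smaller pixel positions.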
The above code consequently produces the following bar chart, again with no bells or whistles:
And there we have it — we've drawn a basic bar chart based on our LEGO blocks.
The important idea to recognize here is that the concepts we've developed generalize to many other kinds of charts you can create with D3. By thinking about the SVG as a canvas we move across, much as we move LEGO blocks across a baseplate, we can conceive of any basic chart as a sequence of commands that include translations across SVG space.
Wrapping up
To recap, we made a few essential claims through this tutorial that are worth reiterating here. First, we made the claim that we can simulate operations in D3 with the use of LEGO blocks. In fact, LEGO blocks provide an ideal medium for understanding how D3 manipulates SVG elements on the page; by forcing us to think in terms of sequences of commands and order of operations, LEGO blocks teach us that the order in which we do things in both LEGO and SVG space matters tremendously. Second, we made the claim that we could translate operations in LEGO space directly into operations (methods) in D3. This made it possible to assume a one-to-one correspondence between LEGO blocks and SVG elements. Finally, we saw that LEGO blocks also serve as a useful medium for understanding one more essential component of working with D3: the DOM (Document Object Model). In LEGO space, each block is both a shape we append to an SVG canvas and the DOM element representing it. Thus, when we perform operations across a LEGO baseplate, we are simultaneously manipulating the DOM structure beneath it.
Collectively, these claims lead to an important assumption in D3 — the assumption of modularity. A simple D3 script for a scatter plot or bar chart can be constructed from a set of 5 basic modular components.
That's it! There are many tutorials across the internet that teach the basics of D3, and this is just one more to add to that pool. However, hopefully it gives you a new perspective on how D3 works, with a little LEGO fun thrown in along the way.