
darrenjaworski

Tutorial I

Pie chart, bar chart, and line chart

In this series of tutorials, I'm going to produce popular, web-friendly journalism charts. There are many visualization tools to help you create charts and maps, and Google Docs can also help you get started in data visualization. These tutorials are for journalists who want more control over their visualizations than off-the-shelf tools offer, or who want a way to get started in D3. I will try to provide as much detail as possible, walking through each graphic step by step, from data preparation to final version. They are written to help an average journalist add a chart or map to accompany their work. I will therefore make them as replicable and reusable as possible, while keeping them as easy to understand as possible.

These tutorials use D3, a JavaScript library that these tutorials use to render visualizations as Scalable Vector Graphics (SVG). They will not require extensive JavaScript or programming knowledge, as they use minimal code, and every customizable facet will be explained in detail.

Tutorial I: Pie chart, bar chart, and line chart.

Tutorial II: Choropleth map.

Tutorial III: Points on a map.

Tutorial IV: Slightly more advanced pie, bar and line charts.

Tutorial V: Slightly more advanced choropleth map.

Tutorial VI: Network graph.


Data preparation

For the pie, bar, and line charts we will be using data from a CSV file. These examples use yearly tornado statistics from NOAA's National Storm Events Database. The data is formatted simply: the first column has the heading 'year', with each year written as 4 digits, and the second column has the heading 'total', with the total for each year listed below it.

Generally, for a bar or line chart each column corresponds to an axis. In this case, 'year' will be the x axis, and 'total' will be the y axis.

For a pie chart, each row will be a slice of the pie. In this case we will have 10 slices, one for each year's tornado total.
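
To make the rows and columns concrete, here is a minimal sketch in plain JavaScript of how a two-column CSV becomes one object per row, which is roughly the shape d3.csv hands to your code. The parseSimpleCsv helper is hypothetical, and the totals are made-up placeholders, not NOAA figures.

```javascript
// Hypothetical sample in the same shape as data.csv.
// The totals are placeholders for illustration, not real NOAA counts.
var sampleCsv = "year,total\n2001,100\n2002,150\n2003,125";

// Simplified stand-in for a CSV parser: no quoting or escaping support.
function parseSimpleCsv(text) {
  var lines = text.trim().split("\n");
  var headers = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    var cells = line.split(",");
    var row = {};
    headers.forEach(function (header, i) {
      row[header] = cells[i];
    });
    return row;
  });
}

var rows = parseSimpleCsv(sampleCsv);
console.log(rows.length); // one object per data row
console.log(rows[0]);     // values are still strings at this point
```

Each row object has a 'year' and a 'total' property, named after the column headings.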


HTML

These charts will be rendered in HTML, but only a basic understanding of markup is required. All HTML relies on opening and closing tags, ids and classes. For comprehensive HTML and CSS instruction, I recommend buying "Learning Web Design" and visiting W3schools.

<!DOCTYPE html>
<meta charset="utf-8">
<style>
/*both chart CSS*/
body {
  font: 10px sans-serif;
}

.axis path,
.axis line {
  fill: none;
  stroke: #000;
  shape-rendering: crispEdges;
}

.x.axis path {
  display: none;
}

/*Bar chart CSS*/
.bar {
  fill: #3B8686;
}

.bar text	{
	color: #000000;
}

/*Line chart CSS*/
.line {
  fill: none;
  stroke: #3B8686;
  stroke-width: 1.5px;
}
</style>
<body>
<script src="http://d3js.org/d3.v3.min.js"></script>
<script src="pie.js"></script>
<script src="bar.js"></script>
<script src="line.js"></script>
</body>
</html>

As you can see, I separated the three script files, which render the charts, so that you can pull them apart and use them how you see fit. The style property values for the various selectors go in the space between the style tags. You have to reference the D3 JavaScript library: src="http://d3js.org/d3.v3.min.js". In future tutorials you will be calling complementary libraries to assist in specific techniques, but for these relatively simple charts you'll only need that one library reference.

We're ready to get dirty.


pie.js

We first need to define the size and radius of the pie chart. The variables piewidth and pieheight define the width and height respectively. Because Math.min(piewidth, pieheight) returns the smaller of the two values, the circle will always fit inside the drawing area, even if the width and height differ. The radius will be half of that smaller value. Use these properties to define the size of your pie chart.

var piewidth = 600,
    pieheight = 600,
    radius = Math.min(piewidth, pieheight) / 2;
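
As a quick check of that arithmetic, here is a small sketch in plain JavaScript. The pieRadius helper is hypothetical and only wraps the same Math.min expression.

```javascript
// Hypothetical helper: the radius is half of the smaller dimension,
// so the circle always fits inside the drawing area.
function pieRadius(width, height) {
  return Math.min(width, height) / 2;
}

console.log(pieRadius(600, 600)); // square area, as in this tutorial
console.log(pieRadius(800, 400)); // wide area: the height is the limit
```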

Next we define the color range. For this data, I used a green color palette ranging from #CFF09E to #588A70. You can define as many range colors as appropriate. For predefined color scales, try the D3 categorical colors. To use one of those scales, replace " d3.scale.ordinal().range(["#000000", ...]); " with " d3.scale.category10(); " or any of the other options.

var color = d3.scale.ordinal()
    .range(["#CFF09E", "#A8DBA8", "#79BD9A", "#3B8686", "#0B486B", "#588A70"]);
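
One thing worth knowing: this palette has 6 colors but the chart has 10 slices. As far as I understand the ordinal scale, it hands out colors in order and starts over when it runs out, so some slices share a color. The sketch below imitates that behavior in plain JavaScript. The makeOrdinalScale helper is a simplified stand-in, not D3's implementation, and the years are hypothetical.

```javascript
// Simplified stand-in for an ordinal color scale (not D3's implementation):
// each new key gets the next color, and colors repeat once the list runs out.
function makeOrdinalScale(range) {
  var seen = [];
  return function (key) {
    var k = String(key);
    var index = seen.indexOf(k);
    if (index === -1) {
      seen.push(k);
      index = seen.length - 1;
    }
    return range[index % range.length];
  };
}

var palette = ["#CFF09E", "#A8DBA8", "#79BD9A", "#3B8686", "#0B486B", "#588A70"];
var colorFor = makeOrdinalScale(palette);

// Ten hypothetical years, matching the ten slices in this tutorial.
var years = [];
for (var y = 2001; y <= 2010; y++) years.push(y);
var assigned = years.map(colorFor);
console.log(assigned[0] === assigned[6]); // the seventh slice reuses the first color
```

If you want every slice to have its own color, list at least as many colors as you have rows.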

In the next lines of code we define the arc segments and the pie layout, and place the " piesvg " within the body of the HTML file.

var arc = d3.svg.arc()
    .outerRadius(radius - 10)
    .innerRadius(0);

var pie = d3.layout.pie()
    .sort(null)
    .value(function(d) { return d.total; });

var piesvg = d3.select("body").append("svg")
    .attr("width", piewidth)
    .attr("height", pieheight)
  .append("g")
    .attr("transform", "translate(" + piewidth / 2 + "," + pieheight / 2 + ")");
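
If you are curious what the pie layout works out for you, the idea is that each total gets a share of the full circle in proportion to its size. Here is a sketch in plain JavaScript. The pieAngles helper is a simplified stand-in, not D3's implementation, and the totals are placeholders. It keeps the rows in their original order, which is my understanding of what " .sort(null) " asks for.

```javascript
// Simplified stand-in for a pie layout (not D3's implementation):
// each value gets a share of the full circle proportional to its size.
function pieAngles(values) {
  var sum = values.reduce(function (a, b) { return a + b; }, 0);
  var start = 0;
  return values.map(function (value) {
    var slice = {
      value: value,
      startAngle: start,
      endAngle: start + (value / sum) * 2 * Math.PI
    };
    start = slice.endAngle;
    return slice;
  });
}

// Placeholder totals, not real tornado counts.
var slices = pieAngles([100, 300, 400, 200]);
slices.forEach(function (s) {
  console.log(s.value, s.startAngle.toFixed(2), s.endAngle.toFixed(2));
});
```

Angles are in radians, so the last slice ends at 2 * Math.PI, a full turn.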

Next, we load the data and use a forEach loop to go through the rows of our data.csv file, converting each total from text to a number. The lines of note here are those that use the data to set the colors and the text labels. " .style("fill", function(d) { return color(d.data.year); }); " accesses the year column of data to color the categories according to our predefined colors. " .text(function(d) { return d.data.year; }); " accesses the year column of data and places the label text on the pie chart.

d3.csv("data.csv", function(error, data) {

  data.forEach(function(d) {
    d.total = +d.total;
  });

  var g = piesvg.selectAll(".arc")
      .data(pie(data))
      .enter().append("g")
      .attr("class", "arc");

  g.append("path")
      .attr("d", arc)
      .style("fill", function(d) { return color(d.data.year); });

  g.append("text")
      .attr("transform", function(d) { return "translate(" + arc.centroid(d) + ")"; })
      .attr("dy", ".35em")
      .style("text-anchor", "middle")
      .text(function(d) { return d.data.year; });

});
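
The label position comes from " arc.centroid(d) ", which picks a point in the middle of each slice. The sketch below shows my understanding of that calculation in plain JavaScript. The sliceCentroid helper is a simplified stand-in, not D3's code.

```javascript
// Simplified stand-in for the centroid of a pie slice (not D3's code).
// Angles are in radians, measured clockwise from 12 o'clock,
// and SVG's y axis points down, so "up" is negative y.
function sliceCentroid(innerRadius, outerRadius, startAngle, endAngle) {
  var r = (innerRadius + outerRadius) / 2;
  var a = (startAngle + endAngle) / 2 - Math.PI / 2;
  return [Math.cos(a) * r, Math.sin(a) * r];
}

// A quarter slice starting at 12 o'clock, with this tutorial's radii
// (innerRadius 0, outerRadius 300 - 10).
var point = sliceCentroid(0, 290, 0, Math.PI / 2);
console.log(point); // a point up and to the right of the center
```

Because the inner radius is 0, the label sits halfway between the center and the outer edge.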

That is a simple, elegant pie chart.


bar.js

Just as with the pie chart, first define some spacing parameters: 4 margin parameters, plus a width and height. Margins leave room for the labels and axes.

var barmargin = {top: 20, right: 20, bottom: 30, left: 40},
    barwidth = 700 - barmargin.left - barmargin.right,
    barheight = 500 - barmargin.top - barmargin.bottom;
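
Worked through with the numbers above, the drawing area that is left for the bars looks like this. The innerSize helper is hypothetical and only repeats the same subtraction.

```javascript
// The margin arithmetic from above, wrapped in a hypothetical helper.
function innerSize(outerWidth, outerHeight, margin) {
  return {
    width: outerWidth - margin.left - margin.right,
    height: outerHeight - margin.top - margin.bottom
  };
}

var margin = {top: 20, right: 20, bottom: 30, left: 40};
var inner = innerSize(700, 500, margin);
console.log(inner.width, inner.height); // 640 450
```

The svg element itself is still 700 by 500. The margins only shrink the area the bars are drawn in.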

The next 5 blocks of code define the scales and axes, then create the " barsvg " and place it within the body of the HTML.

var barx = d3.scale.ordinal()
    .rangeRoundBands([0, barwidth], .1);

var bary = d3.scale.linear()
    .range([barheight, 0]);

var barxAxis = d3.svg.axis()
    .scale(barx)
    .orient("bottom");

var baryAxis = d3.svg.axis()
    .scale(bary)
    .orient("left");

var barsvg = d3.select("body").append("svg")
    .attr("width", barwidth + barmargin.left + barmargin.right)
    .attr("height", barheight + barmargin.top + barmargin.bottom)
  .append("g")
    .attr("transform", "translate(" + barmargin.left + "," + barmargin.top + ")");

Next we access the data in a similar manner, using d3.csv and a forEach loop.

d3.csv("data.csv", function(error, data) {

  data.forEach(function(d) {
    d.total = +d.total;
    d.year = +d.year;
  });
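
The plus sign in front of d.total and d.year matters because values read from a CSV file arrive as text. A short sketch in plain JavaScript, using a hypothetical row with placeholder values, shows what goes wrong without it.

```javascript
// Values from a CSV arrive as strings. A hypothetical row:
var row = {year: "2004", total: "150"}; // placeholder values

// With strings, "+" concatenates and comparisons go character by character.
console.log(row.total + 1);  // "1501"
console.log("90" > "150");   // true, because "9" sorts after "1"

// The unary plus converts each string to a number first.
row.total = +row.total;
row.year = +row.year;
console.log(row.total + 1);  // 151
console.log(90 > row.total); // false
```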

We then define the x and y domains of the bar chart, using the data.map() and d3.max() functions. Notice that barx.domain refers to the d.year value and bary.domain refers to the d.total value.

  barx.domain(data.map(function(d) { return d.year; }));
  bary.domain([0, d3.max(data, function(d) { return d.total; })]);
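
Once barx knows its domain, it divides the width into one band per year. The sketch below approximates that arithmetic in plain JavaScript for 10 years across the 640-pixel inner width. The roundBands helper is my simplified approximation, not D3's code, so the exact pixel values D3 produces may differ.

```javascript
// Simplified approximation of rounded band placement (not D3's code).
// "count" bands share the span from 0 to "width", with "padding" as the
// fraction of each step left empty between neighbouring bars.
function roundBands(count, width, padding) {
  var step = Math.floor(width / (count - padding + 2 * padding));
  var offset = Math.round((width - (count - padding) * step) / 2);
  var positions = [];
  for (var i = 0; i < count; i++) {
    positions.push(offset + i * step);
  }
  return {positions: positions, bandWidth: Math.round(step * (1 - padding))};
}

// Ten years across the 640-pixel inner width, with this tutorial's .1 padding.
var bands = roundBands(10, 640, 0.1);
console.log(bands.bandWidth);    // width of every bar, in pixels
console.log(bands.positions[0]); // x of the first bar
console.log(bands.positions[9]); // x of the last bar
```

A larger padding value gives thinner bars with wider gaps between them.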

The remainder of the code creates both axes and encodes the data as rectangles at the appropriate height and year locations. The key line to observe is " .text("Tornado count"); ", which sets the text shown on the y axis. The label is rotated 90 degrees (to read vertically) and given some space from the axis line.

To render the rectangles themselves, the key lines to observe are " attr("x" ", " attr("y" " and " attr("height" ". These all encode rectangle attributes. X is the horizontal position of the year, y is the top edge of the rectangle, and height is the height of the rectangle. X is computed from the year; y and height are both computed from the total.

  barsvg.append("g")
      .attr("class", "x axis")
      .attr("transform", "translate(0," + barheight + ")")
      .call(barxAxis);

  barsvg.append("g")
      .attr("class", "y axis")
      .call(baryAxis)
    .append("text")
      .attr("transform", "rotate(-90)")
      .attr("y", 6)
      .attr("dy", ".71em")
      .style("text-anchor", "end")
      .text("Tornado count");

  barsvg.selectAll(".bar")
      .data(data)
    .enter().append("rect")
      .attr("class", "bar")
      .attr("x", function(d) { return barx(d.year); })
      .attr("width", barx.rangeBand())
      .attr("y", function(d) { return bary(d.total); })
      .attr("height", function(d) { return barheight - bary(d.total); });

});
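
The y and height attributes can look backwards at first, because SVG measures y from the top of the drawing downward. Here is a sketch in plain JavaScript of the same calculation. The makeLinearScale and barGeometry helpers are simplified stand-ins, not D3's implementation, and the maximum of 200 is a placeholder.

```javascript
// Simplified stand-in for a linear scale (not D3's implementation).
function makeLinearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var chartHeight = 450; // inner height: 500 minus the top and bottom margins
var yScale = makeLinearScale([0, 200], [chartHeight, 0]); // 200 is a placeholder maximum

// SVG measures y from the top, so a bar starts at yScale(total)
// and extends down to the baseline at chartHeight.
function barGeometry(total) {
  var top = yScale(total);
  return {y: top, height: chartHeight - top};
}

console.log(barGeometry(200)); // tallest bar: starts at the top, full height
console.log(barGeometry(50));  // shorter bar: starts lower, a quarter of the height
```

This is why the range is written as [barheight, 0] rather than [0, barheight]: a total of zero maps to the bottom of the chart.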

line.js

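The line chart code is remarkably similar to the bar chart. I will highlight the differences and leave the rest of the code walkthrough to the similar elements above. The setup not shown here (linemargin, linewidth, lineheight, the linex and liney scales, and the linexAxis and lineyAxis axes) follows the same pattern as the bar chart.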

For this line chart you parse the date from the data.csv file. For more information on this specific date time formatting please visit the D3 wiki.

If you look in the data file data.csv, the year is stored in the four-digit XXXX format. Because of this I used the matching time format, %Y.

var parseDate = d3.time.format("%Y").parse;
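
To show what parsing a year amounts to, here is a sketch in plain JavaScript. The parseYear helper is a simplified stand-in, not D3's parser. It assumes, as I understand the %Y format to behave, that a bare year becomes January 1 of that year in local time.

```javascript
// Simplified stand-in for parsing a four-digit year (not D3's parser).
// Returns a Date for January 1 of that year in local time, or null
// when the input is not exactly four digits.
function parseYear(text) {
  if (!/^\d{4}$/.test(text)) return null;
  var date = new Date(0, 0, 1); // January 1, local time
  date.setFullYear(+text);
  return date;
}

var parsed = parseYear("2010");
console.log(parsed.getFullYear()); // 2010
console.log(parsed.getMonth());    // 0, meaning January
console.log(parseYear("20x0"));    // null
```

Turning the year into a date lets the x axis treat it as a point in time rather than a label.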

We then use the d3.svg.line() to create the appropriate line from the data. X is d.year, and y is d.total.

var line = d3.svg.line()
    .x(function(d) { return linex(d.year); })
    .y(function(d) { return liney(d.total); });
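
Behind the scenes, the line generator turns the data into the text that goes in the path's "d" attribute. A sketch of the straight-segment case in plain JavaScript follows. The linePath helper is a simplified stand-in, not D3's code, and the pixel positions are placeholders.

```javascript
// Simplified stand-in for a straight-segment line generator (not D3's code).
// It turns an array of [x, y] pixel positions into an SVG path string.
function linePath(points) {
  return points
    .map(function (p, i) {
      return (i === 0 ? "M" : "L") + p[0] + "," + p[1];
    })
    .join("");
}

// Placeholder pixel positions, as if linex and liney had already been applied.
var pathData = linePath([[0, 300], [100, 120], [200, 210]]);
console.log(pathData); // M0,300L100,120L200,210
```

"M" moves the pen to the first point and each "L" draws a straight line to the next one.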

We use analogous techniques to access the data and render the axes and line. Observe again how we wrote a label on the y axis.

var linesvg = d3.select("body").append("svg")
    .attr("width", linewidth + linemargin.left + linemargin.right)
    .attr("height", lineheight + linemargin.top + linemargin.bottom)
  .append("g")
    .attr("transform", "translate(" + linemargin.left + "," + linemargin.top + ")");

d3.csv("data.csv", function(error, data) {
  data.forEach(function(d) {
    d.total = +d.total;
    d.year = parseDate(d.year);
  });

  linex.domain(d3.extent(data, function(d) { return d.year; }));
  liney.domain(d3.extent(data, function(d) { return d.total; }));

  linesvg.append("g")
      .attr("class", "x axis")
      .attr("transform", "translate(0," + lineheight + ")")
      .call(linexAxis);

  linesvg.append("g")
      .attr("class", "y axis")
      .call(lineyAxis)
    .append("text")
      .attr("transform", "rotate(-90)")
      .attr("y", 6)
      .attr("dy", ".71em")
      .style("text-anchor", "end")
      .text("Tornado count");

  linesvg.append("path")
      .datum(data)
      .attr("class", "line")
      .attr("d", line);
});
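
One difference from the bar chart is easy to miss: the y domain here uses d3.extent, so the axis runs from the smallest total to the largest instead of starting at zero. The sketch below compares the two in plain JavaScript. The extent helper is a simplified stand-in, and the totals are placeholders.

```javascript
// Simplified stand-in for finding the smallest and largest value.
function extent(values) {
  return [Math.min.apply(null, values), Math.max.apply(null, values)];
}

var totals = [120, 95, 180, 150]; // placeholder totals, not real counts

var lineDomain = extent(totals);                   // what the line chart uses
var barDomain = [0, Math.max.apply(null, totals)]; // what the bar chart uses

console.log(lineDomain); // [95, 180]: the lowest point sits on the x axis
console.log(barDomain);  // [0, 180]: the axis starts at zero
```

If you would rather have the line chart start at zero too, you could give liney the same kind of domain as bary.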

A couple of things to keep in mind from this tutorial. I routinely write the JavaScript within script tags in the HTML file. I split these files up for a separation of concerns, so that you can extract only the code that you need.

I added the CSS styling to the HTML and separated the relevant CSS by which lines affect which graphic.