What's French for Data Visualisation

This is a repost from the ONS’s digital blog.

Conferences are a great way to learn and view the wider field of your profession. Often in data visualisation, you feel you’re working in an area so specialised that no one else does anything similar to you. Imagine my surprise when 350 other data visualisation experts and practitioners turned up in Paris for the OpenVis 2018 conference.

To quote Lynn Cherny, OpenVisConf programme co-chair, “OpenVis is a top-tier conference about ‘open source data visualization’ tools and techniques (“openvis”)”.

The conference was inspiring, full of high-quality talks from people leading the field and from a mixture of academics, journalists and industry types. I also got to ride up and down the Seine in a party boat in the Parisian drizzle.

My takeaways can be grouped into the following four areas.

Inspiring

There was so much inspiration. There were technical showcases of web technology, from visualising dinosaurs in 3D to drawing a billion stars. There were explanations of the analytical side of things, using machine learning to train neural nets or classify drawings. There were also breakdowns of design processes.

One idea that I thought could be applied to visualising ONS data is t-SNE clustering. t-SNE is a machine learning algorithm for visualising data with lots of dimensions. Ian Johnson showed what this technique could do on the quick draw dataset (a dataset of people drawing objects). Previous attempts at characterising this dataset focused on the average (How long does it take to draw a cat?) but there is an argument that it’s more interesting to show the distribution rather than focus on summary statistics.

The t-SNE algorithm visualises groups that are similar but doesn’t specify what attribute it is matching them on. It could be any feature (eyes, ears, shapes, strokes) or a combination. This method could be applied to a number of our statistics where we create groupings, census being the most obvious one but also well-being, households, earnings and other surveys.

Changing the way you think

There were talks that forced me to consider how we do things. Can we bring aspects of gaming into data visualisation? How can we learn not to fall for fallacies, and how does the brain process information?

Steven Franconeri looked at how the brain processes visual information either quickly or slowly. Processing is quick for shape recognition or feature distribution (mean, outliers, trends or clusters), but slow for comparing properties of objects.

Try spotting the odd ones out in these pictures.

Image of blue and red bars

Source: Steve Franconeri on Twitter

We can apply these insights to make our visualisations more understandable.

Reinforcement that you’re doing the right thing

Sometimes it’s good to know that the best people out there are doing what you do too. We have put user-centred design at the heart of what we do, as have many others.

One talk was about disagreements between two of the top data vis editors at the New York Times, Amanda Cox and Kevin Quealy. It was great to see the honest conversations that go on behind making visualisations. Disagreements are part of the process: design choices have to be made, they are subjective, and even the best disagree.

New connections

There was lots to take away from the diverse presentations and the range of other attendees. Talking to people from design agencies and big tech companies through to freelancers, I found we had common challenges, and we discussed how we overcome them.

I’ll be sharing more of what I’ve learnt with the data visualisation team, and from there it will feed into our future work.

How to make responsive d3.js interactives

What happens to our visualisations on a mobile is important. To reach people with our content, we need to make it syndicatable; to make it syndicatable we need it to work on mobile.

We do a few things to make our interactives work on mobiles.

Breakpoints

We use three breakpoints for our interactives (roughly mobile, tablet and desktop). The interactives behave differently at different widths, and we may show or hide parts depending on whether there’s space.

We sense the width of the body or a div on the page where we want to put stuff using d3.select("body").style("width") which gives us something like "800px".

We just need the 800 part so we can add parseInt() in front so it’s now parseInt(d3.select("body").style("width")).

Now that we know the width of our browser, we can use it to set parts of our interactives. Some of the things we change with the width are:

  • the height of the svg is calculated with an aspect ratio and the width
  • margins
  • tick formats, going from 2008 on desktop to ‘08 on mobile
  • the number of ticks
  • which annotations to show and where they appear.
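A minimal sketch of how the sensed width might drive those settings. The breakpoint values (500 and 800 pixels), the 0.6 aspect ratio, the tick counts and the function names parsePx, breakpointFor, formatYear and settingsFor are all assumptions made for illustration, not the values used in the ONS interactives.

```javascript
// Hypothetical breakpoints in pixels; each project would pick its own.
var BREAKPOINTS = { mobile: 500, tablet: 800 };

// Turn a CSS length such as "800px" into the number 800.
function parsePx(cssLength) {
  return parseInt(cssLength, 10);
}

// Classify a width as "mobile", "tablet" or "desktop".
function breakpointFor(width) {
  if (width < BREAKPOINTS.mobile) { return "mobile"; }
  if (width < BREAKPOINTS.tablet) { return "tablet"; }
  return "desktop";
}

// Shorten a year to two digits with an apostrophe on narrow screens.
function formatYear(year, size) {
  return size === "mobile" ? "'" + String(year).slice(-2) : String(year);
}

// Work out the width-dependent settings for a chart.
function settingsFor(width) {
  var size = breakpointFor(width);
  return {
    size: size,
    height: Math.round(width * 0.6),      // assumed aspect ratio
    ticks: size === "mobile" ? 4 : 10,    // fewer ticks when space is tight
    showAnnotations: size !== "mobile"
  };
}

// In the browser the width would come from d3, as described above:
//   var width = parsePx(d3.select("body").style("width"));
var width = parsePx("800px");
console.log(settingsFor(width), formatYear(2008, "mobile"));
```

Keeping the decisions in small functions like these means the drawing code only has to ask for the settings, whatever the width happens to be.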

One thing to note is that the width is only sensed when that code is run, which is normally when the page is loaded. The interactive may load great, but if the browser is resized it may no longer fit. This is where pym comes in.

pym.js

pym.js is a JavaScript library developed by NPR. Our interactives are built as standalone pages and embedded as iframes. They are embedded on the ONS site, but they can be embedded anywhere. Using pym makes our embedded iframes behave responsively.

Normally with an iframe you specify the height and width. The content of the page will change depending on the width: for example, text may wrap, and then the height changes too. With pym, the height of the child page is detected and the iframe’s height is adjusted automatically to match. This means the size of the iframe can change responsively.

How pym works

pym works as a parent-child relationship. The parent page is the place where the interactive will be embedded in an iframe. In our case, the parent is the main ONS site. The interactive, which is a standalone page in itself, is the child.

On the parent page, set up pym with

new pym.Parent(parent_id, child_url);

This tells pym to put the iframe in a div with the id parent_id and to load child_url as the child page. That’s it for the parent. Most of the work is done on the child page.

On the child page, our interactive, we need to tell pym this is a child page. This is done with pymChild = new pym.Child();.

To make the content resize when the browser changes, we need to structure our code in a certain way. We make a function that contains everything to draw the interactive. This function is called drawGraphic. We tell pym to fire drawGraphic every time the browser is resized by adding a renderCallback when we set up pym. Now we instead start pym with pymChild = new pym.Child({ renderCallback: drawGraphic});.

As part of the drawGraphic function we can set pym to tell the parent page how tall the child page is. This sets the iframe’s height on the parent page to fit the child’s height. This is done with pymChild.sendHeight();. This is normally left until the end of drawGraphic once everything has been drawn.
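Put together, the child page script might look something like the sketch below. It assumes d3.js and pym.js have already been loaded by script tags and that the page contains a div with the id "graphic"; that id, the 0.6 aspect ratio and the helper heightFor are made up for the example.

```javascript
// Sketch of the script on a child page. Assumes d3.js and pym.js are loaded
// by <script> tags and the page has <div id="graphic"></div>.
// The id "graphic" and the 0.6 aspect ratio are hypothetical choices.
var pymChild = null;

// Height of the chart for a given width.
function heightFor(width) {
  return Math.round(width * 0.6);
}

// Everything needed to draw the interactive lives in this one function,
// so it can be run again whenever the width changes.
function drawGraphic() {
  var graphic = d3.select("#graphic");

  // Remove whatever was drawn last time before drawing again.
  graphic.selectAll("*").remove();

  var width = parseInt(graphic.style("width"), 10);
  graphic.append("svg")
    .attr("width", width)
    .attr("height", heightFor(width));

  // ...axes, bars, lines and annotations would be drawn here...

  // Last step: tell the parent how tall the page now is.
  // pymChild may still be null if this runs while pym is being set up,
  // hence the check.
  if (pymChild) {
    pymChild.sendHeight();
  }
}

// Start pym only where the libraries are present (that is, in the browser).
if (typeof pym !== "undefined" && typeof d3 !== "undefined") {
  pymChild = new pym.Child({ renderCallback: drawGraphic });
}
```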

Resizing the page destroys the interactive, calls the function drawGraphic and everything is redrawn. Because d3.js is so fast, it seems like the graph is changing as you change the size of the browser. However, if you have delays or transitions, they will be reset on resizing.

When we are developing our interactives we are working on the child page. Issues with the iframe resizing may not appear so it’s important to test with a dummy parent page before publishing.

Other organisations like the BBC already use pym.js, which makes it easier for them to lift our interactives into their sites.

Bootstrap Grid

Finally, we use the Bootstrap grid to make content blocks flow under each other depending on the width of the browser. We can also hide and show certain elements for mobile.

Postscript

On a side note, I came across another way to make responsive SVGs in the d3js Slack group. It uses a viewBox to scale the SVG. This is useful for interactives that have transitions or delays.
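A rough sketch of the viewBox idea, not the exact code from that discussion. The chart is drawn once in a fixed coordinate system and the browser scales it, so nothing has to be redrawn on resize. The function name responsiveSvgAttrs, the 800 by 480 design size and the "graphic" div id are assumptions for the example.

```javascript
// Build the attributes for an SVG that scales with its container.
// The drawing uses a fixed coordinate system (designWidth x designHeight);
// the browser stretches or shrinks it to the width the SVG is given.
function responsiveSvgAttrs(designWidth, designHeight) {
  return {
    viewBox: "0 0 " + designWidth + " " + designHeight,
    preserveAspectRatio: "xMinYMin meet",
    width: "100%"
  };
}

var attrs = responsiveSvgAttrs(800, 480);
console.log(attrs.viewBox);

// In the browser, with d3 loaded and a <div id="graphic"> on the page,
// the attributes would be applied to a new SVG like this.
if (typeof d3 !== "undefined") {
  var svg = d3.select("#graphic").append("svg");
  Object.keys(attrs).forEach(function (name) {
    svg.attr(name, attrs[name]);
  });
}
```

The trade-off is that everything scales together, including text and line widths, so labels that are comfortable on desktop can become small on a phone.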

Using the ONS chart templates

In my last post I talked about the structure of my team and how we work. This post goes into more about how we use templates for charts and interactives.

There are many charts we commonly use. Rather than remake these from scratch each time, we made templates. This allows us to make charts quickly and include features commonly used, for example annotations.

The ONS chart templates are designed to work well on mobile, be lightweight on the user’s bandwidth and be embeddable using pym.js.

Each template is driven by a config file that specifies certain variables that are needed. Using these templates, it should be possible to just add in your data and adjust the config to quickly produce charts.
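To give a feel for what "driven by a config file" means, here is a small sketch. Every key, value and function name in it (aspectRatio, xAxisTicks, annotations, readConfig, chartSize) is hypothetical; the real templates define their own keys, so check each template's config.json rather than relying on these.

```javascript
// Stand-in for the text of a config.json file. All keys and values are
// hypothetical and only illustrate the idea.
var configText = '{ "aspectRatio": 0.5, "xAxisTicks": 5, ' +
  '"annotations": [ { "x": 2008, "text": "Example note" } ] }';

// Read the config and fill in defaults for anything it leaves out.
function readConfig(text) {
  var parsed = JSON.parse(text);
  return {
    aspectRatio: parsed.aspectRatio || 0.6,
    xAxisTicks: parsed.xAxisTicks || 10,
    annotations: parsed.annotations || []
  };
}

// Combine the config with the sensed width to get the chart's dimensions.
function chartSize(config, width) {
  return { width: width, height: Math.round(width * config.aspectRatio) };
}

var config = readConfig(configText);
console.log(chartSize(config, 600)); // { width: 600, height: 300 }
```

The drawing code never hard-codes these numbers, which is what lets the same template be reused for a new chart by editing only the data and the config.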

Getting your hands dirty

Time to make a chart. You’ve got a story, you know what statistical relationships you want to show.

Step 1. Download the templates

Probably the simplest way is to clone the Simple Charts repository from GitHub or download it as a zip file. It contains templates for a variety of different charts. You could also fork the repo and use GitHub Pages.

Step 2. Prep your data

In each folder, there will be a data file, normally called data.csv. You want to get your data to match it as closely as possible, including the column headings. Some templates read the headings automatically, but others refer to the column headings by name in the code, so you will need to keep them the same.

Step 3. Edit the config.json

Each chart has a config file that contains frequently changed variables such as annotations, aspect ratio, number of ticks on the axis etc. Read the wiki pages to see how each variable changes that particular template.

Step 4. Preview the file

You can now see your finished visualisation in a browser. Firefox will run the code off local files, but most other browsers (Chrome, IE, Safari) won’t. In that case, you will either have to run an HTTP server in the directory of your file or use GitHub Pages.
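If the preview comes up blank, one thing worth checking is whether the column headings in data.csv still match the ones the template expects (see the data prep step). The sketch below is one way to do that check. The function name missingHeadings and the headings "date", "value" and "amount" are made up for the example, and the parsing is deliberately naive: it does not handle quoted headings that contain commas.

```javascript
// Return the expected column headings that are missing from a CSV string.
// Naive parsing: only the first line is read and it is split on commas.
function missingHeadings(csvText, expected) {
  var firstLine = csvText.split(/\r?\n/)[0];
  var headings = firstLine.split(",").map(function (heading) {
    return heading.trim();
  });
  return expected.filter(function (name) {
    return headings.indexOf(name) === -1;
  });
}

// Hypothetical data file whose second column was renamed by mistake.
var csv = "date,amount\n2008,1.2\n2009,1.4\n";
console.log(missingHeadings(csv, ["date", "value"])); // [ 'value' ]
```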

Now you’ve made your chart with the ONS templates, you can stop worrying about how your graph is going to behave on mobile and you can work on all the other important bits - annotations, title, the article etc.

The Digital Content Team behind Visual.ONS

In my mind, this post was meant to talk about how to use the ONS chart template, but to do that properly I need to explain how the digital content team is set up and the purpose of our work because this drives how our templates have been developed.

Our goals

The principles of the ONS strategy are laid out in the document “Better Statistics, Better Decisions”. It describes many of the principles that guide our work. These include

  1. Inform decision making
  2. Support democratic debate
  3. Improve communication
  4. Challenge misuse of statistics

As a result, there is an increased focus on working to ONS’ “enquiring citizen” persona. In the same way that research is putting much focus and effort into communicating with the public, the ONS as a statistics body is trying to do the same. Our work aims to communicate statistics in a way that supports these principles. The area of work that most closely corresponds to that is data journalism: using data and statistics to tell stories that matter to people.

ONS data are published in a great many Excel files. With each release, commentary and analysis are published, but these are often technical documents with enough detail for the expert user. If you are an expert, you’ll be familiar with the website and you’ll be able to find the data and understand everything about it. If you’re an “information forager”, you might persist with different search terms and eventually find your way to what you think might be the right data, so you download it and see. You will probably search and download a few times until you find what you really want.

The vast majority would turn away at just opening a spreadsheet. We are failing these people. And if we really want to improve the world we need to make sure these people have the information they need to have important conversations. This is where the digital content team comes in.

How we work

The team is multidisciplinary and is made up of journalists, designers and developers.

Digital Content Team

The journalistic skills are important for transforming information about statistics into something that is accurate to the data yet understandable to the average citizen. The design and coding skills are about making something that presents the data in the most readable and useful way using the best bits of digital: the ability to reach a large number of people, in a way that is quick to produce and can be interactive.

Generating an idea

Since we are the ONS, we have access to the vast troves of data that the organisation collects. Each department is in charge of statistics that relate to certain aspects of life and society, e.g. births, deaths, marriages, GDP, migration. These departments (or “business areas”, as we refer to them) are the experts in their data and are responsible for how their data is represented. Sometimes they will have found something interesting in their research and feel that bringing it to people’s attention will benefit them, so they look to us to help them. Or they’ll have a big release coming out and we know there’ll be some interesting stories to tell there. Sometimes we are digging around in the published data and find something interesting, so we approach the business area to get their opinion on what we’ve found.

As the ONS is a large organisation, visibility is an issue, so getting to know people and having regular meetings are essential. People need to understand what you’re trying to do and why, so they can support you and commit enough resources.

Exploring the data

Once you’ve found a good idea, it’s time to really home in on what the headline is going to be. What’s going to make someone stop, read your article and listen to what you have to say? This takes time, as sometimes an idea seems great but when you look into it, there’s more to it. You need to analyse the data. This step uses tools like Excel and Python, or charting tools like Plotly, Tableau, RAWGraphs or Datawrapper. It’s often best to get feedback on your idea from all sides, including editorial and the business area, to hone its focus, check its viability and ensure it’s rigorous.

Commission

Once your idea is firm, it is written up as a proforma and sent to an internal group, the Editorial and Comms Group (ECG), made up of deputy directors for each business area. The proforma contains the outline of what you want to say, how you’re going to say it, the risks, external factors and other angles. The group looks over the idea to ensure it’s aimed at the right level with the right focus and that any concerns are addressed.

Development

The article is written and any charts or interactives are developed. Charts often use our templates, adjusted if necessary to fit the story. Our templates have come out of what has worked in the past and use the most common elements of interactives. The development of the interactives and the article happens with regular feedback from editorial, design and UX as well as the business area. This is the part where the discipline of data visualisation becomes important, as choosing the right chart to tell the story is paramount. I would suggest the FT’s Visual Vocabulary as a starting point. Not forgotten are the digital standards we are working towards, including working on all devices and accessibility.

Publish

Once it’s all finished, the article is published. Often we are looking for organisations to syndicate the content, so elements are made to be embedded into other people’s systems. We track how the content goes down with partners and through different distribution channels.

Why it works for us

So now you know how the team works we can talk about why the templates work for us. But I’ll leave that to another post.

How to embed ONS interactives

At the end of some posts on Visual.ONS there is some code showing how to use an iframe to embed the interactives elsewhere. This is to encourage syndication, for example by news organisations who might want to rewrite the words around an interactive to suit their style or readership.

Using an iframe to embed

In this article about house price by area it says “To embed the floorplan in your site use the following code:”

<iframe width="100%" height="1200px" src="https://www.ons.gov.uk/visualisations/dvc434/floorplan/index.html" scrolling="no" frameborder="0"></iframe>

If you copied this into your website or CMS, this would make an iframe which acts like a window where the view is the ONS interactive. The window view is set with the width="100%" and height="1200px". This will create a window that fills 100% of the width of where you put it and a height that’s set manually. The width is sensed when the page is loaded and set to fill the whole width. The height should be set to something that doesn’t cut off the bottom of the interactive.

This works fine in most circumstances. But if you change the size of your window, for example by rotating your mobile or resizing your browser, the width that was set when the page loaded no longer corresponds to the width of your browser.

Making your embed responsive

We use a JavaScript library called pym.js to make our interactives responsive. The basic idea is that on resizing, the interactives are redrawn to fit the new iframe. If the width of the interactive is small, like a mobile screen, the interactive is designed to behave differently.

To embed a responsive graphic, you need to use pym.js on the site you’re embedding on. This is quite simple to do if you can add scripts to your page. This website is referred to as the parent. The page you’re embedding is called the child.

The example on the pym.js page says to use code like this:

<div id="example"></div>
<script type="text/javascript" src="https://pym.nprapps.org/pym.v1.min.js"></script>
<script>

var pymParent = new pym.Parent('example', 'https://www.ons.gov.uk/visualisations/dvc434/floorplan/index.html', {});

</script>

Code walkthrough

Let’s talk through what’s going on. First create a div and give it the id "example".

Next, load the pym.js script from the NPR website: <script type="text/javascript" src="https://pym.nprapps.org/pym.v1.min.js"></script>.

Open another <script>, make a variable and then use a function to make this page a parent for pym.js: var pymParent = new pym.Parent(.

Use the div id to say where to put it, in this case the 'example' div.

Choose what will be the child page. This is going to be the interactive we’ve chosen, and we get the page from the embed code in the article: 'https://www.ons.gov.uk/visualisations/dvc434/floorplan/index.html'.

Then add one more bit to say we’re not using any of the optional extras, {}, and finally close everything with );</script>.

Hopefully that made sense. Now let’s see it in action.

Responsive embed

Non-responsive embed

Try resizing your browser or rotating your phone and compare this non-responsive version where the boxes don’t resize.