Reflections on one year at the ONS

It’s been just over one year since I joined the Office for National Statistics to work in the data visualisation team. As with most jobs it took a while to find my feet, but I no longer stumble when I explain my job. I help communicate the data the ONS collects about the UK economy and society in a way that’s more understandable. Moving away from long reports and Excel sheets, our multidisciplinary team uses plain English and visuals to improve communication. By improving how we talk about stats, we hope to raise the level of debate in society about important issues.

When I took the job, it felt like jumping fields. But there is a lot of overlap with my previous roles in user research, parliament communications, public engagement and science communication. You need to know your audience, work to get yourself where they are, and give them a meaningful engagement. I am doing more programming and more maths now, though, which suits me.

From amateur to professional

I started doing interactive data visualisations as a bit of a hobby. I set myself weekend projects putting data into existing visualisations as a way to learn to code with d3.js. I read other people’s blog posts about issues in data vis, for example using colour wisely or the best way to represent data truthfully.

But now I’m a practitioner I’ve learnt so much more. I’ve improved my coding and can create things from scratch rather than just repurpose examples. I understand more web technologies. I am more familiar with principles of data vis so I can talk about why things should be a certain way. I’ve done more writing and have more experience of integrating storytelling into projects. I’ve managed more projects simultaneously than ever before. I’m a manager for the first time which is a big learning experience but it’s rewarding to see someone flourish and grow.

Being in a team of data vis specialists means we talk a lot about data vis. We also get approached by colleagues in the ONS about the best way to represent a particular story in a dataset. Talking critically about data vis and learning to articulate what makes a good data vis only really happens when you have other knowledgeable people around. Having talked to other government analysts, I think our team is in a unique position, with so many data vis specialists in one place.

What’s coming up?

On a personal level, there are several areas I want to develop. I want to improve my data wrangling, my story generation and my project management. I want to use some of my user research skills to feed into the evidence behind what we do. There is also the team challenge of integrating the learnings from our visual.ons prototype website into the way the organisation works, and perhaps even more widely than the ONS.

When sitting down to write this blog post, I realised that although I learnt how to code through my visualisation experiments, I only learnt how to talk about data vis by having other people around me. I wonder whether other analysts aren’t able to have these conversations because they don’t have data vis people around them. If we created a friendly space for these conversations to happen, would this help graphical literacy? I feel there’s an appetite, as we often have people on our data vis courses from other public bodies.

So if you’re interested in starting something, let’s talk.

What's French for Data Visualisation

This is a repost from the ONS’s digital blog.

Conferences are a great way to learn and view the wider field of your profession. Often in data visualisation, you feel you’re working in an area so specialised that no one else does anything similar to you. Imagine my surprise when 350 other data visualisation experts and practitioners turned up in Paris for the OpenVis 2018 conference.

To quote Lynn Cherny, OpenVisConf programme co-chair, “OpenVis is a top-tier conference about ‘open source data visualization’ tools and techniques (“openvis”)”.

The conference was inspiring, full of high-quality talks from people leading the field and from a mixture of academics, journalists and industry types. I also got to ride up and down the Seine in a party boat in the Parisian drizzle.

My takeaways can be grouped into the following four areas.

Inspiring

There was so much inspiration. There were technical showcases of web technology to visualise dinosaurs in 3D or how to handle drawing a billion stars. There were explanations of the analytical side of things, using machine learning to train neural nets or classify drawings. There were also breakdowns of design processes.

One idea that I thought could be applied to visualising ONS data is t-SNE clustering. t-SNE is a machine learning algorithm for visualising data with lots of dimensions. Ian Johnson showed what this technique could do on the Quick, Draw! dataset (a dataset of people drawing objects). Previous attempts at characterising this dataset focused on the average (how long does it take to draw a cat?), but there is an argument that it’s more interesting to show the distribution rather than focus on summary statistics.

The t-SNE algorithm visualises groups that are similar but doesn’t specify what attribute it is matching them on. It could be any feature (eyes, ears, shapes, strokes) or a combination. This method could be applied to a number of our statistics where we create groupings, census being the most obvious one but also well-being, households, earnings and other surveys.

Changing the way you think

There were talks that forced me to consider how we do things. Can we bring aspects of gaming into data visualisation? How can we learn not to fall for fallacies, and how does the brain process information?

Steven Franconeri looked at how the brain can process visual information either quickly or slowly. It works quickly for shape recognition or feature distributions (mean, outliers, trends or clusters), but slowly for comparing properties of objects.

Try spotting the odd ones out in this picture.

Image of blue and red bars

Source: Steve Franconeri on Twitter

We can apply these insights to make our visualisations more understandable.

Reinforcement that you’re doing the right thing

Sometimes it’s good to know that the best people out there are also doing what you do. We have put user-centred design at the heart of what we do, as do many others.

One talk was about disagreements between two of the top data vis editors at the New York Times, Amanda Cox and Kevin Quealy. It was great to see the honest conversations that go on behind making visualisations. Disagreements are part of the process: there will be design choices to make, these are subjective, and even the best disagree.

New connections

There was lots to take away from the diverse presentations and the range of other attendees. Talking to people from design agencies, big tech companies and freelancers, I found we had common challenges, and we discussed how we overcome them.

I’ll be sharing more of what I’ve learnt with the data visualisation team and, from there, feeding it into our future work.

How to make responsive d3.js interactives

What happens to our visualisations on a mobile is important. To reach people with our content, we need to make it syndicatable; to make it syndicatable we need it to work on mobile.

We do a few things to make our interactives work on mobiles.

Breakpoints

We use three breakpoints for our interactives (roughly mobile, tablet, desktop). The interactives behave differently at different widths, or we may choose to show or hide parts depending on whether there’s space.

We sense the width of the body or a div on the page where we want to put stuff using d3.select("body").style("width") which gives us something like "800px".

We just need the 800 part, so we wrap the call in parseInt(), giving parseInt(d3.select("body").style("width")).
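
As a minimal sketch of the step above, the two calls can be wrapped in small helpers. The names pxToNumber and senseWidth are made up for illustration, and d3 is passed in as an argument (d3Lib) only so the helper can be tried outside a browser.

```javascript
// Convert a CSS length such as "800px" into the number 800.
// parseInt stops at the first non-digit, so the "px" suffix is dropped.
function pxToNumber(cssLength) {
  return parseInt(cssLength, 10);
}

// Read the current width of an element using d3.
// d3Lib is the d3 object (assumed to be loaded on the page) and
// selector is any CSS selector, for example "body" or "#graphic".
function senseWidth(d3Lib, selector) {
  return pxToNumber(d3Lib.select(selector).style("width"));
}

// In the browser this would be called as:
//   var width = senseWidth(d3, "body");
```

Passing the radix 10 to parseInt is a small safety habit rather than something the width string requires.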

Now we know the width of our browser, we can use it to set parts of our interactives. Some of the things we change with respect to the width are:

  • the height of the SVG, which is calculated from the width and an aspect ratio
  • margins
  • tick formats, going from 2008 on desktop to ‘08 on mobile
  • the number of ticks
  • which annotations to show and where they appear.
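
The list above could be sketched roughly as follows. The breakpoint widths, aspect ratios, margins and tick counts are all hypothetical values chosen for illustration, not the ones our interactives use, and the function names are invented for this example.

```javascript
// Hypothetical breakpoint widths in pixels; the real values are a design choice.
var BREAKPOINTS = { mobile: 510, tablet: 768 };

// Classify a width as "sm" (mobile), "md" (tablet) or "lg" (desktop).
function sizeFor(width) {
  if (width < BREAKPOINTS.mobile) { return "sm"; }
  if (width < BREAKPOINTS.tablet) { return "md"; }
  return "lg";
}

// Illustrative settings per size: aspect ratio [width, height], margins, tick count.
var SETTINGS = {
  sm: { aspect: [4, 3], margin: { top: 20, right: 10, bottom: 30, left: 35 }, ticks: 4 },
  md: { aspect: [16, 9], margin: { top: 20, right: 20, bottom: 40, left: 50 }, ticks: 6 },
  lg: { aspect: [16, 9], margin: { top: 30, right: 30, bottom: 40, left: 60 }, ticks: 10 }
};

// Shorten a four-digit year on mobile: 2008 becomes '08.
function formatYear(year, size) {
  return size === "sm" ? "'" + String(year).slice(-2) : String(year);
}

// Work out the drawing area for a given container width.
// The height comes from the inner width and the aspect ratio.
function layoutFor(width) {
  var size = sizeFor(width);
  var s = SETTINGS[size];
  var innerWidth = width - s.margin.left - s.margin.right;
  var innerHeight = Math.ceil(innerWidth * s.aspect[1] / s.aspect[0]);
  return { size: size, margin: s.margin, ticks: s.ticks, width: innerWidth, height: innerHeight };
}
```

Keeping the settings in one object per size means the drawing code only has to ask for the layout once and then read values from it.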

One thing to note is that the width is only sensed when that code is run, which is normally when the page is loaded. The interactive may load great, but if the browser is resized it may no longer fit. This is where pym comes in.

pym.js

pym.js is a JavaScript library developed by NPR. Our interactives are built as standalone pages and embedded as iframes. They are embedded on the ONS site, but they can be embedded anywhere. Using pym makes our embedded iframes behave responsively.

Normally with an iframe you specify the height and width. But the content of the page changes with the width: text may wrap, for example, and then the height changes too. With pym, the height of the content is detected and the iframe is adjusted automatically. This means the size of the iframe can change responsively.

How pym works

pym works as a parent-child relationship. The parent page is the place where the interactive will be embedded in an iframe. In our case, the parent is the main ONS site. The interactive, which is a standalone page in itself, is the child.

On the parent page, set up pym with

new pym.Parent(parent_id, child_url);

This tells pym to put the iframe in a div with the id parent_id and the child page is child_url. That’s it for the parent. Most of the work is done on the child page.

On the child page, our interactive, we need to tell pym this is a child page. This is done with pymChild = new pym.Child();.

To make the content resize when the browser changes, we need to structure our code in a certain way. We make a function that contains everything to draw the interactive. This function is called drawGraphic. We tell pym to fire drawGraphic every time the browser is resized by adding a renderCallback when we set up pym. Now we instead start pym with pymChild = new pym.Child({ renderCallback: drawGraphic});.

As part of the drawGraphic function we can set pym to tell the parent page how tall the child page is. This sets the iframe’s height on the parent page to fit the child’s height. This is done with pymChild.sendHeight();. This is normally left until the end of drawGraphic once everything has been drawn.
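
Put together, the child page might be structured something like this. It is a sketch, not our production code: initChild and the draw argument are invented for illustration, and pym is passed in as pymLib only so the structure can be tried without a browser. As far as I understand it, pym hands the parent’s width to the callback, but the drawing function is free to ignore that and sense the width itself.

```javascript
// Set up the child page.
// pymLib is the pym object (assumed to be loaded on the page).
// draw is a function that clears the old chart and redraws it.
function initChild(pymLib, draw) {
  var pymChild = null;

  function drawGraphic(width) {
    draw(width); // everything needed to draw the interactive goes in here

    // Left until the end, once everything has been drawn.
    // The guard covers the first draw, which can happen while
    // pymChild is still being created.
    if (pymChild) {
      pymChild.sendHeight();
    }
  }

  // pym fires drawGraphic whenever the parent reports a new width.
  pymChild = new pymLib.Child({ renderCallback: drawGraphic });
  return pymChild;
}

// In the browser this would be called as:
//   var pymChild = initChild(pym, function () { /* d3 drawing code */ });
```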

Resizing the page destroys the interactive, calls the function drawGraphic and everything is redrawn. Because d3.js is so fast, it seems like the graph is changing as you change the size of the browser. However, if you have delays or transitions, resizing will reset them.

When we are developing our interactives we are working on the child page. Issues with the iframe resizing may not appear so it’s important to test with a dummy parent page before publishing.

Other organisations like the BBC already use pym.js, so it makes it easier for them to lift our interactives into their sites.

Bootstrap Grid

Finally, we use the Bootstrap grid to make content blocks flow under each other depending on the width of the browser. We can also hide and show certain elements for mobile.

Postscript

On a side note, I came across another way to make responsive SVGs from the d3js Slack group. It uses a viewBox to scale the SVG. This is useful for interactives that have transitions or delays, as nothing has to be redrawn on resize.
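
One common way to do this, which may not be exactly what was shared in the Slack group, is to draw at a fixed internal size and let the browser scale it. In this sketch the function names and the 600 by 400 size are made up for illustration; viewBox and preserveAspectRatio are standard SVG attributes.

```javascript
// Build the attributes that let the browser scale an SVG to its container.
// width and height are the drawing's internal coordinates, not screen pixels.
function viewBoxAttrs(width, height) {
  return {
    viewBox: "0 0 " + width + " " + height,
    preserveAspectRatio: "xMidYMid meet"
  };
}

// Apply the attributes to a d3 selection of an <svg> element.
// The SVG then fills the width of its container and keeps its proportions.
function makeResponsive(svg, width, height) {
  var attrs = viewBoxAttrs(width, height);
  return svg
    .attr("viewBox", attrs.viewBox)
    .attr("preserveAspectRatio", attrs.preserveAspectRatio)
    .style("width", "100%")
    .style("height", "auto");
}

// In the browser this would be called as:
//   makeResponsive(d3.select("#graphic").append("svg"), 600, 400);
```

The trade-off is that text and stroke widths scale with the chart, so labels can become small on mobile.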

Using the ONS chart templates

In my last post I talked about the structure of my team and how we work. This post goes into more detail about how we use templates for charts and interactives.

There are many charts we commonly use. Rather than remake these from scratch each time, we made templates. This allows us to make charts quickly and include features commonly used, for example annotations.

The ONS chart templates are designed to work well on mobile, to be lightweight on the user’s bandwidth and to be embeddable using pym.js.

Each template is driven by a config file that specifies the variables that are needed. Using these templates, it should be possible to just add in your data and adjust the config to quickly produce charts.

Getting your hands dirty

Time to make a chart. You’ve got a story and you know what statistical relationships you want to show.

Step 1. Download the templates

Probably the simplest way is to clone the Simple Charts repository from GitHub or download it as a zip file. This contains templates for a variety of different charts. You could also fork the repo and use GitHub Pages.

Step 2. Prep your data

In each folder, there will be a data file, normally called data.csv. You want to get your data to match it as closely as possible, including the column headings. Some templates read the headings automatically, but others refer to the column headings by name in the code, so you will need to keep them the same.

Step 3. Edit the config.json

Each chart has a config file that contains frequently changed variables such as annotations, aspect ratio, number of ticks on the axes etc. Read the wiki pages to see how each variable changes that particular template.

Step 4. Preview the file

You can now see your finished visualisation using a browser. Firefox will run the code in the browser off local files but most browsers won’t (Chrome, IE, Safari). In that case, you will either have to run an HTTP server in the directory of your file, or use GitHub Pages.
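
If you have Node.js installed (an assumption, any static file server will do), a very small server for previewing might look like the sketch below. The function names, the list of file types and the port 8000 are choices made for this example only, and it is meant for local previewing, not for hosting.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// File types a chart template is likely to contain.
const MIME = {
  ".html": "text/html",
  ".js": "application/javascript",
  ".css": "text/css",
  ".json": "application/json",
  ".csv": "text/csv",
  ".svg": "image/svg+xml"
};

function contentType(filePath) {
  return MIME[path.extname(filePath).toLowerCase()] || "application/octet-stream";
}

// Map a request URL to a file inside rootDir.
// Returns null if the URL is malformed or points outside rootDir.
function resolveRequest(rootDir, urlPath) {
  let clean;
  try {
    clean = decodeURIComponent(urlPath.split("?")[0]);
  } catch (err) {
    return null;
  }
  if (clean.endsWith("/")) { clean += "index.html"; }
  const target = path.join(rootDir, clean);
  const rel = path.relative(rootDir, target);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    return null;
  }
  return target;
}

function createStaticServer(rootDir) {
  return http.createServer(function (req, res) {
    const filePath = resolveRequest(rootDir, req.url);
    if (!filePath) { res.writeHead(403); res.end("Forbidden"); return; }
    fs.readFile(filePath, function (err, data) {
      if (err) { res.writeHead(404); res.end("Not found"); return; }
      res.writeHead(200, { "Content-Type": contentType(filePath) });
      res.end(data);
    });
  });
}

// To preview, run this from the chart's folder and open http://localhost:8000/
//   createStaticServer(process.cwd()).listen(8000);
```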

Now you’ve made your chart with the ONS templates, you can stop worrying about how your graph is going to behave on mobile and you can work on all the other important bits - annotations, title, the article etc.

The Digital Content Team behind Visual.ONS

In my mind, this post was meant to talk about how to use the ONS chart template, but to do that properly I need to explain how the digital content team is set up and the purpose of our work because this drives how our templates have been developed.

Our goals

The ONS strategy is laid out in the document “Better Statistics, Better Decisions”, which describes many of the principles that guide our work. These include:

  1. Inform decision making
  2. Support democratic debate
  3. Improve communication
  4. Challenge misuse of statistics

As a result, there is an increased focus on working to ONS’ “enquiring citizen” persona. In the same way that research is putting much focus and effort into communicating with the public, the ONS as a statistics body is trying to do the same. Our work aims to communicate statistics in a way that supports these principles. And the area of work that most closely matches that is data journalism: using data and statistics to tell stories that matter to people.

ONS data are published in many, many Excel files. With each release, a commentary and analysis is published, but these are often technical documents with enough detail for the expert user. If you are an expert, you’ll be familiar with the website, and you’ll be able to find the data and understand everything about it. If you’re an “information forager”, you might persist with different search terms and eventually find your way to what you think might be the right data, so you download it and see. You will probably search and download a few times until you find what you really want.

The vast majority would turn away at just opening a spreadsheet. We are failing these people. And if we really want to improve the world we need to make sure these people have the information they need to have important conversations. This is where the digital content team comes in.

How we work

The team is multidisciplinary and is made up of journalists, designers and developers.

Digital Content Team

The journalistic skills are important for transforming information about statistics into something that is accurate to the data yet understandable to the average citizen. The design and coding skills are about making something that presents the data in the most readable and useful way using the best bits of digital: the ability to reach a large number of people, in a way that is quick to produce and can be interactive.

Generating an idea

Since we are the ONS, we have access to the vast troves of data that the organisation collects. Each department is in charge of statistics that relate to certain aspects of life and society, e.g. births, deaths, marriages, GDP, migration. These departments (or “business areas”, as we refer to them) are the experts in their data and are responsible for how their data is represented. Sometimes they will have found something interesting in their research and feel that bringing it to people’s attention will benefit people, so they look to us to help them. Or they’ll have a big release coming out and we know there’ll be some interesting stories to tell there. Sometimes we are digging around in the published data and find something interesting, so we approach the business area to get their opinion on what we’ve found.

As the ONS is a large organisation, visibility is an issue, so getting to know people and having regular meetings are essential. People need to understand what you’re trying to do and why, so they can support you and commit enough resources.

Exploring the data

Once you’ve found a good idea, it’s time to really home in on what the headline is going to be. What’s going to make someone stop, read your article and listen to what you have to say? This takes time, as sometimes an idea seems great but, when you look into it, there’s more to it. You need to analyse the data. This step uses tools like Excel and Python, or charting tools like Plotly, Tableau, RAWGraphs or Datawrapper. It’s often best to get feedback on your idea from all sides, including editorial and the business area, to hone its focus, check its viability and ensure it’s rigorous.

Commission

Once your idea is firm, it is written up as a proforma and sent to an internal group, the Editorial and Comms Group (ECG), made up of deputy directors for each business area. The proforma contains the outline of what you want to say, how you’re going to say it, the risks, external factors and other angles. The group looks over the idea to ensure it’s aimed at the right level with the right focus and that any concerns are addressed.

Development

The article is written and any charts or interactives are developed. Often charts use our templates and, if necessary, are adjusted to fit the story. Our templates have come out of what has worked in the past and use the most common elements of interactives. The development of the interactives and the article happens with regular feedback from editorial, design and UX as well as the business area. This is the part where the discipline of data visualisation becomes important, as choosing the right chart to tell the story is paramount. I would suggest the FT’s Visual Vocabulary as a starting point. Not forgotten are the digital standards we are working towards, including working on all devices and accessibility.

Publish

Once it’s all finished, the article is published. Often we are looking for organisations to syndicate the content, so elements are made to be embedded into other people’s systems. We track how the content goes down with partners and through different distribution channels.

Why it works for us

So now you know how the team works we can talk about why the templates work for us. But I’ll leave that to another post.