Approaches to Teaching Data for Non-Profits

Recently the National Neighborhood Indicators Partnership and Microsoft Civic Technology Engagement Group launched a project to expand training on data and technology to improve communities.  I’m pleased they’ve included Data Therapy as one of the resources they highlight to help you think about building your data culture.  Check out their training guide and their catalog of resources!


On a related note, if you are someone who does a lot of training and capacity building, or an organization that wants to be doing that, check out the podcast and recording of a conversation about enabling learning with School of Data.

What Would Mulder Do?

The semester has started again at MIT, which means I’m teaching a new iteration of my Data Storytelling Studio course.  One of our first sessions focuses on learning to ask questions of your data… and this year that was a great chance to use the new WTFcsv tool I created with Catherine D’Ignazio.

The vast majority of the students decided to work with our fun UFO sample data.  They came up with some amazing questions to ask, with a lot of ideas about connecting it to other datasets.  A few focused in on potential correlations with sci-fi shows on TV (perhaps inspired by the recent reboot of the X Files).

One topic I reflected on with students at the close of the activity was that the majority of their questions, and the language they used to describe them, came from a point of view that doubted the legitimacy of these UFO sightings.  They wanted to “explain” the “real” reason for what people saw.  They assumed that people were merely imagining that what they saw was aliens, since of course that couldn’t be true.

Now, with UFO sightings this isn’t especially offensive.  However, with datasets about more serious topics, it’s important to remember that we should approach them from an empathetic point of view.  If we want to understand data reported by people, we need to have empathy for where the data reporter is coming from, despite any biases or pre-existing notions we might have about the legitimacy of what they say happened.

This isn’t to say that we shouldn’t be skeptical of data; by all means we should be!  However, if we only wear our skeptical hat we miss a whole variety of possible questions we could be asking our dataset.

So, when it comes to UFO sightings, be sure to wonder “What would Mulder do?” 🙂

Talking Visualization Literacy at RDFViz

Just yesterday I was in a room of amazing friends, new and old, talking about what responsible data visualization might be.  Organized by the Engine Room as part of their series of Responsible Data Forums (RDF), this #RDFViz event brought together 30 data scientists, community activists, designers, artists and visualization experts to tease apart a plan of action for creating norms for a responsible practice of data visualization.

Here’s a write-up of how we tackled that in the small group I led about what that means when building visual literacy.

Building Literacy for Responsible Visualization

I’ve written a bunch about data literacy and the variety of ways I try to build it with community groups, but we received strict instructions to focus this conversation on visualization.  That was hard!  So we started off by making sure we understood the audiences we were talking about – people who make visualizations and people who see/read them.  So many ways to think about this… so many questions we could address… we were lost for a bit about where to even start!

We decided to pick four guiding questions to propose to ourselves and all of you, and then answer them by sketching out quick suggestions for things that might help.

  • How can visual literacy for data be measured?
  • How can existing resources for data visualization reach the growing number of non-technical data visualization producers?
  • How can we teach readers to look at data visualization more critically?
  • How can we help data visualization producers to design more appropriately for their audiences?

A difficult set of questions, but our group of four dove into them unafraid!  Here’s a quick run-down on each.  For the record, I only worked on two of these, so I hope I do justice to the other two I didn’t directly dig into.

Measuring Visual Literacy


This is a tricky task, fraught with cultural assumptions.  We began by defining it down to the dominant visual form for representing data – namely classic charts and graphs.  This simplified the question a little, but of course buys into power dynamics and all that stuff that comes along with it.

Our idea was to create an interactive survey/game that asks people to read and reason about visualizations.  Of course this draws on a lot of existing research into visual- and data-literacy, but in that body of work we don’t have an agreed-upon set of questions to assess this.  So we came up with the following topics, and example questions as a thing to think about.

  1. Can you read it?  This topic tried to address the question of basic visual comprehension of classic charting.  The example question would show something like a bar chart and ask “What is the highest value?”.
  2. What would you do? This topic digs into making reasoned judgements about personal decisions based on information shown in a visual form.  The example question is a line chart showing vaccination rates going down over time and measles cases going up, asking “Would you vaccinate your children?”.
  3. What can you tell? Another topic to address is making judgements about whether data shows a pattern or not.  The example question would show a statement like “Police kill women more than men – true or false?” and the answers could be “true”, “false” and “can’t tell”.
  4. What’s the message? More complex combinations of charts and graphs are often trying to deliver a message to the reader.  Here we could show a small infographic that documents corruption somewhere.  Then we’d ask “What is the message on this graphic?” with possible answers of “corruption is rampant”, “corruption happens” and “public funds are too high”.

These are just four topics, and we know there are more.  We’re excited about this survey, and hope to find time and funds to review existing surveys that assess various types of literacies so we can build a good tool to help people measure these types of literacies in various communities!

Choosing the Right Visualization for Your Audience

We have a vast and growing array of visualization techniques available to us, but few guidelines on how to use them appropriately for different audiences.  This is problematic, and a responsible version of data visualization should respect where an audience is coming from and their visual literacy.  With that in mind, we propose to create a library of case studies where each one creates different visualizations from the same dataset, making the same argument, for different audiences.

For example, we sketched out ways to argue that police violence is endemic in the US, based on a theoretical dataset that captures all police-related killings.  For a low visual literacy individual (maybe a 10-year-old kid) you could start by showing the face of one victim, and then zoom out to a grid of all the victims to show the scale of the problem while still humanizing it.  For the medium literacy audience (those who watch the evening news each night on TV), you could show a line chart of killings by year.  For a high literacy audience (reading the New York Times) you could do an interactive map that shows killings around the reader’s location as they compare to nation-wide trends.

You could imagine a library of many of these, which we think would help people think about what is appropriate for various audiences.  I’m excited to give this to students in my Data Storytelling Studio course as an assignment!

Learning to Read A Data Visualization

Our idea here was to create a quick how-to guide that lists things you should ask when reading a data visualization.  Imagine a listicle called “15 Things to Check in any Data Visualization”!  The problem here is that people aren’t being introduced to the critical techniques for reading visualization, to identify when one is being irresponsible.

Some things that might be on this list include:

  • Is the data source identified?
  • Are the axes labelled correctly?
  • What is the level of aggregation?

This list could expose some of the common techniques for creating misleading visualizations.  Next steps?  We’d like to crowdsource the completion of the list to make sure we don’t miss any important ideas.

Helping Non-Experts Learn to Make Data Visualizations

This is a huge problem.  The hype around data visualization continues to grow, and more and more tools are being created to help non-experts make them.  Unfortunately, the materials we use to help these newcomers enter the field haven’t kept pace with the huge rise in interest!

We proposed to address this by better defining what these new audiences need to know.  They include:

  • human rights organizations
  • community groups
  • social movements

And more!  A brief brainstorm resulted in this list of things they are trying to learn:

  • how to select the right data to visualize?
  • what types of charts are best suited to understand what types of data?
  • what cultural assumptions are reflected in what types of dataviz?
  • how do design decisions (e.g. color) affect how readers will understand your data visualization?

This is just a preliminary list of course.

Rounding it Up

Problem solved!

Just kidding… we have a lot of work to do if we want to build a responsible approach to literacies about data visualization. These four suggestions from our small working group at the RDFViz event are just that – suggestions. However, the space to approach this from a responsible point of view, along with the conversations and disagreements, was invaluable!

 

Many thanks to the organizers and funders, including our facilitator Mushon Zer-Aviv, our organizers at the Engine Room, our hosts at ThoughtWorks, Data & Society and Data-Pop Alliance, and our sponsors at Open Society Foundations and Tableau Foundation.  This is cross-posted to the MIT Center for Civic Media website.

Paper on Designing Tools for Learners

On an academic note, I just published a paper for the Data Literacy workshop at the WebSci 2015 conference.  Catherine D’Ignazio and I wrote up our approach to building data tools for learners, not users.  Here’s the abstract, and you can read the full paper too.

Data-centric thinking is rapidly becoming vital to the way we work, communicate and understand in the 21st century. This has led to a proliferation of tools for novices that help them operate on data to clean, process, aggregate, and visualize it. Unfortunately, these tools have been designed to support users rather than learners that are trying to develop strong data literacy. This paper outlines a basic definition of data literacy and uses it to analyze the tools in this space. Based on this analysis, we propose a set of pedagogical design principles to guide the development of tools and activities that help learners build data literacy. We outline a rationale for these tools to be strongly focused, well guided, very inviting, and highly expandable. Based on these principles, we offer an example of a tool and accompanying activity that we created. Reviewing the tool as a case study, we outline design decisions that align it with our pedagogy. Discussing the activity that we led in academic classroom settings with undergraduate and graduate students, we show how the sketches students created while using the tool reflect their adeptness with key data literacy skills based on our definition. With these early results in mind, we suggest that to better support the growing number of people learning to read and speak with data, tool designers and educators must design from the start with these strong pedagogical principles in mind.

Data Storytelling Studio – Final Projects

I recently wrapped up my first semester-long course at MIT, called the Data Storytelling Studio.  Students posted all their work on the course blog, but I wanted to share short summaries of their wonderful final projects!  All but one focused on the topic of food security.

Somerville Resources

Tuyen Bui, Hayley Song, and Deborah Chen worked with partners in Somerville, MA to create a short video about the challenge of, and community response to, food insecurity among local youth.  They shot video with local programs and included “pop up” data about the problems.  The goal was to raise awareness about the problem and solutions to drive people to volunteer with the partners featured. Watch their movie, or read more about the Somerville Resources video.


SnapSim

Danielle Man, Edwin Zhang, Harihar Subramanyam & Tami Forrester explored food pricing data, nutrition data, and SNAP benefit data in the hopes of building empathy with those enrolled in SNAP.  They created an interactive text-based game that puts you in the role of a single parent on SNAP shopping for food for themself and their two children.  Play the game and see how you fare making hard decisions about what to buy for your family on a tight budget.  Read more about their SnapSim project.


SNAP Judgements

Mary Delaney and Stephen Suen worked with demographic data about SNAP participants, food nutrition data, and housing data.  They wanted to build empathy and understanding among college students for the difficult trade-offs those in SNAP have to make between health, happiness, and financial security.  Mary and Stephen created a text-based game where you take on the persona of a SNAP participant and are forced to make decisions over time about when to buy food and what to buy to feed your family. Play their game now, or read more about their SNAP Judgements project.


Drought Debunkers

Val Healy, Nolan Essigmann and Ceri Riley explored data about drought and water use in the United States.  Their goal was to tell a story to young college students about how individual conservation choices are largely symbolic in terms of environmental impact, and urge them to work on collective solutions that focus on agricultural and industrial water usage.  They created a web-scrolling infographic to tell their story. Read more about the Drought Debunkers project.

Art Crayon Toolkit

Laura Perovich & Desi Gonzalez looked at color use in famous paintings. Their goal was to build engagement with children around visual elements of art and spark their interests in the arts by connecting in novel ways.  They created a wonderful set of custom crayons that matched the color distributions in various paintings, and an activity book they play-tested with a small set of children.  Read more about their Art Crayon Toolkit.

The Data Storytelling Studio

I’ve been radio silent for the last half year for two reasons.  Firstly, we had a new baby!  Secondly, I’ve been planning and am now teaching a semester-long course at MIT for undergraduates and graduate students.  I’ve called this course the Data Storytelling Studio.  You can follow the course blog at http://cms631.datatherapy.org.  I’ll continue to blog here, but less frequently this semester.

I prefer not to share cute baby pictures online, but am happy to share pictures from the course!  I’ve sketched it out with my colleague Catherine D’Ignazio, assistant professor at Emerson College.  She is teaching a version tailored for journalists there, while I teach a diverse audience of MIT students (the course is offered by the Comparative Media Studies / Writing program).  The course isn’t a programming or data science course; the focus is more on process, tools, and creative presentation.

I’ll be leading the students through an arc of five modules:

  1. Introduction – we begin by setting context and designing and painting a Data Mural together
  2. Finding and Analyzing Data
  3. Cleaning Data and Finding Stories
  4. Presenting Your Story
  5. Final Project

I’ve focused on the topic of food security for this semester, so most of the projects and assignments will focus on that.  In fact, our mural tells a story about Food For Free, a local organization that runs food rescue and other programming.  As you can see, we’re off to a great start!
