Launching the Data Culture Project

Learning to work with data is like learning a new language — immersing yourself in the culture is the best way to do it. For some individuals, this means jumping into tools like Excel, Tableau, programming, or R Studio. But what does this mean for a group of people that work together? We often talk about data literacy as if it’s an individual capacity, but what about data literacy for a community? How does an organization learn how to work with data?

About a year ago we (Rahul Bhargava and Catherine D’Ignazio) found that more and more users of our DataBasic.io suite of tools and activities were asking this question — online and in workshops. In response, with support from the Stanford Center on Philanthropy and Civil Society, we’ve worked together with 25 organizations to create the Data Culture Project. We’re happy to launch it publicly today! Visit datacultureproject.org to learn more.

Update: Join our webinar on April 12th to learn more!

The Data Culture Project is a hands-on learning program to kickstart a data culture within your organization. We provide facilitation videos to help you run creative introductions to get people across your organization talking to each other — from IT to marketing to programs to evaluation. These are not boring spreadsheet trainings! Try running our fun activities — one per month works as a brown bag lunch to focus people on a common learning goal. For example, “Sketching a Story” brings people together around basic concepts of quantitative text analysis and visual storytelling. “Asking Good Questions” introduces principles of exploratory data analysis in a fun environment. What’s more, you can use the sample data that we provide, or you can integrate your organization’s data as the topic of conversation and learning.
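
For a rough sense of what the quantitative text analysis behind an activity like “Sketching a Story” involves, here is a minimal Python sketch that counts the most frequent words in a snippet of text. It is an illustration only, not the code behind our tools; the function name `top_words`, the tiny stop-word list, and the sample sentence are all made up for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real tools use longer ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def top_words(text, n=3):
    """Return the n most common words in text, ignoring case and stop words."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Made-up sample text, just to show the shape of the output.
sample = "Data is a story. The story of the data is the story of people."
print(top_words(sample, 2))  # [('story', 3), ('data', 2)]
```

A short list of counts like this is the kind of raw material a group can sketch a story around.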

Developing Together

We built DataBasic.io to help individuals build their data literacy in more creative ways. We’ve baked in design principles that focused on learners (read our paper), argued to tool designers that their web-based tools are in fact informal learning spaces (watch our talk video), documented how our activities are particularly well suited to data literacy learners (read another paper), and focused them on building a data mindset (read our opinion piece).

These activities and tools were designed and iterated on with interested users (with support from the Knight Foundation). We develop all our tools based on the problems organizations bring to us. Our latest grant was a partnership with Tech Networks of Boston, who brought years of experience working with organizations to develop their capacity and skills in a variety of ways. With them, we prototyped a first set of videos for the WordCounter “Sketch a Story” activity and tried it out in a local workshop with some of their partners and clients.

Trying Out a Model — the Data Culture Pilot

Based on how that went, we recruited 25 organizations from around the world to help us build the Data Culture Project. Non-profits, newsrooms, libraries, and community groups were included in this cohort, and we created a network to help us guide our prototyping. Over the last 6 months, each group ran 3 activities within their organizations as brown-bag lunches.

It was wonderful to have collaborators that were willing to try out some half-baked things! After each workshop, they shared how it went on a group mailing list. Then each month we hosted an online chat to get feedback and share insights and common points from the feedback.

Even in these prototype sessions, the participants shared some wonderful insights. Here are just a few:

  • “It did lead to a pretty significant rethink for the communications director for what is coming out in the spring.”
  • “I hear back from participants regularly about how much they enjoyed the activities and wondering what comes next.”
  • “As they were working through their data sets, they kept coming up with more questions it made them wonder about and more things to consider about those questions.”
  • “They can relate everything back to their own situations / data / organizations.”

We were heartened and excited to see that our design partners were able to see impacts already!

How to Join the Community

We are launching the Data Culture Project today. Here’s how to make the best use of the project and the community:

  • Read about why you don’t need a data scientist; you need a data culture to understand why data literacy needs to be understood as a community capacity, in addition to an individual capacity.
  • Run one or more of the activities listed on the Data Culture Project home page. We found in the pilot that running one per month (and providing pizza) can work to bring people together.
  • Remix and modify the activity to work for you and tell us about it! At the bottom of each activity page, you’ll see a “Learn With Others” comment box where you can tell others what worked for you (à la Internet food recipe sites).
  • Join our mailing list to connect with others working on creative approaches to building capacity in their organizations (and be the first to hear about new activities and projects).

We are grateful to the Stanford Center on Philanthropy and Civil Society for supporting the development of the Data Culture Project. The Data Culture Project is headed by Rahul Bhargava and Catherine D’Ignazio, undertaken as a collaboration between the MIT Center for Civic Media and the Engagement Lab@Emerson College, and with the assistance of Becky Michelson (project manager) and Jon Elbaz (research assistant).

Approaches to Teaching Data for Non-Profits

Recently the National Neighborhood Indicators Partnership and the Microsoft Civic Technology Engagement Group launched a project to expand training on data and technology to improve communities.  I’m pleased they’ve included Data Therapy as one of the resources they highlight to help you think about building your data culture.  Check out their training guide and their catalog of resources!

On a related note, if you are someone who does a lot of training and capacity building, or an organization that wants to be doing that, check out the podcast and recording of a conversation about enabling learning with School of Data.

What Would Mulder Do?

The semester has started again at MIT, which means I’m teaching a new iteration of my Data Storytelling Studio course.  One of our first sessions focuses on learning to ask questions of your data… and this year that was a great chance to use the new WTFcsv tool I created with Catherine D’Ignazio.

The vast majority of the students decided to work with our fun UFO sample data.  They came up with some amazing questions to ask, with a lot of ideas about connecting it to other datasets.  A few focused in on potential correlations with sci-fi shows on TV (perhaps inspired by the recent reboot of The X-Files).

One topic I reflected on with students at the close of the activity was that the majority of their questions, and the language they used to describe them, came from a point of view that doubted the legitimacy of these UFO sightings.  They wanted to “explain” the “real” reason for what people saw.  They assumed that people were only imagining that what they saw was aliens, which of course couldn’t be true.

Now, with UFO sightings this isn’t especially offensive.  However, with datasets about more serious topics, it’s important to remember that we should approach them from an empathetic point of view.  If we want to understand data reported by people, we need to have empathy for where the data reporter is coming from, despite any biases or pre-existing notions we might have about the legitimacy of what they say happened.

This isn’t to say that we shouldn’t be skeptical of data; by all means we should be!  However, if we only wear our skeptical hat we miss a whole variety of possible questions we could be asking our dataset.

So, when it comes to UFO sightings, be sure to wonder “What would Mulder do?” 🙂

Talking Visualization Literacy at RDFViz

Just yesterday I was in a room of amazing friends, new and old, talking about what responsible data visualization might be.  Organized by the Engine Room as part of their series of Responsible Data Forums (RDF), this #RDFViz event brought together 30 data scientists, community activists, designers, artists and visualization experts to tease apart a plan of action for creating norms for a responsible practice of data visualization.

Here’s a write up of how we tackled that in the small group I led about what that means when building visual literacy.

Building Literacy for Responsible Visualization

I’ve written a bunch about data literacy and the variety of ways I try to build it with community groups, but we received strict instructions to focus this conversation on visualization.  That was hard!  So we started off by making sure we understood the audiences we were talking about – people who make visualizations and people who see/read them.  So many ways to think about this… so many questions we could address… we were lost for a bit about where to even start!

We decided to pick four guiding questions to propose to ourselves and all of you, and then answer them by sketching out quick suggestions for things that might help.

  • How can visual literacy for data be measured?
  • How can existing resources for data visualization reach the growing number of non-technical data visualization producers?
  • How can we teach readers to look at data visualization more critically?
  • How can we help data visualization producers to design more appropriately for their audiences?

A difficult set of questions, but our group of four dove into them unafraid!  Here’s a quick run-down on each.  For the record, I only worked on two of these, so I hope I do justice to the other two I didn’t directly dig into.

Measuring Visual Literacy

This is a tricky task, fraught with cultural assumptions.  We began by defining it down to the dominant visual form for representing data – namely classic charts and graphs.  This simplified the question a little, but of course buys into power dynamics and all that stuff that comes along with it.

Our idea was to create an interactive survey/game that asks people to read and reason about visualizations.  Of course this draws on a lot of existing research into visual- and data-literacy, but in that body of work we don’t have an agreed-upon set of questions to assess this.  So we came up with the following topics, and example questions as a thing to think about.

  1. Can you read it?  This topic tried to address the question of basic visual comprehension of classic charting.  The example question would show something like a bar chart and ask “What is the highest value?”.
  2. What would you do? This topic digs into making reasoned judgements about personal decisions based on information shown in a visual form.  The example question is a line chart showing vaccination rates over time going down and people getting measles going up, asking “Would you vaccinate your children?”.
  3. What can you tell? Another topic to address is making judgements about whether data shows a pattern or not.  The example question would show a statement like “Police kill women more than men – true or false?” and the answers could be “true”, “false” and “can’t tell”.
  4. What’s the message? More complex combinations of charts and graphs are often trying to deliver a message to the reader.  Here we could show a small infographic that documents corruption somewhere.  Then we’d ask “What is the message on this graphic?” with possible answers of “corruption is rampant”, “corruption happens” and “public funds are too high”.

These are just four topics, and we know there are more.  We’re excited about this survey, and hope to find time and funds to review existing surveys that assess various types of literacies so we can build a good tool to help people measure these types of literacies in various communities!
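
To make the survey idea a bit more concrete, here is a minimal Python sketch of how the four topics and their example questions could be stored and scored. Everything in it is an assumption for illustration: the `Question` class, the `score` function, the answer options for the bar chart, and the one answer key. We did not settle on answer keys, and a judgement question like “Would you vaccinate your children?” has no single right answer, so questions without a key are recorded but not scored.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Question:
    topic: str
    prompt: str
    options: List[str]
    answer: Optional[str] = None  # None means "record the response, don't score it"

# Hypothetical question bank. The bar chart values and the key "40" are placeholders;
# a real key depends on the chart actually shown to the respondent.
QUESTIONS = [
    Question("Can you read it?", "What is the highest value?", ["10", "25", "40"], "40"),
    Question("What would you do?", "Would you vaccinate your children?", ["yes", "no"]),
    Question("What can you tell?", "Police kill women more than men: true or false?",
             ["true", "false", "can't tell"]),
    Question("What's the message?", "What is the message on this graphic?",
             ["corruption is rampant", "corruption happens", "public funds are too high"]),
]

def score(questions, responses):
    """Return (number correct, number scored) for questions that have an answer key."""
    scored = [(q, r) for q, r in zip(questions, responses) if q.answer is not None]
    correct = sum(1 for q, r in scored if r == q.answer)
    return correct, len(scored)

print(score(QUESTIONS, ["40", "yes", "true", "corruption happens"]))  # (1, 1)
```

Keeping the answer key optional is one way to let comprehension questions and judgement questions live in the same survey.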

Choosing the Right Visualization for Your Audience

We have a vast and growing array of visualization techniques available to us, but few guidelines on how to use them appropriately for different audiences.  This is problematic, and a responsible version of data visualization should respect where an audience is coming from and their visual literacy.  With that in mind, we propose to create a library of case studies where each one creates different visualizations from the same dataset, making the same argument, for different audiences.

For example, we sketched out ways to argue that police violence is endemic in the US, based on a theoretical dataset that captures all police-related killings.  For a low visual literacy individual (maybe a 10-year-old kid) you could start by showing a face of one victim, and then zoom out to a grid of all the victims to show the scale of the problem while still humanizing it. For the medium literacy audience (those that watch the evening news each night on TV), you could show a line chart of killings by year.  For a high literacy audience (reading the New York Times) you could do an interactive map that shows killings around the reader’s location as they compare to nation-wide trends.

You could imagine a library of many of these, which we think would help people think about what is appropriate for various audiences.  I’m excited to give this to students in my Data Storytelling Studio course as an assignment!

Learning to Read A Data Visualization

Our idea here was to create a quick how-to guide that lists things you should ask when reading a data visualization.  Imagine a listicle called “15 Things to Check in any Data Visualization”!  The problem here is that people aren’t being introduced to the critical techniques for reading visualizations that would let them identify when one is being irresponsible.

Some things that might be on this list include:

  • Is the data source identified?
  • Are the axes labelled correctly?
  • What is the level of aggregation?

This list could expose some of the common techniques for creating misleading visualizations.  Next steps?  We’d like to crowdsource the completion of the list to make sure we don’t miss any important ideas.

Helping Non-Experts Learn to Make Data Visualizations

This is a huge problem.  The hype around data visualization continues to grow, and more and more tools are being created to help non-experts make them.  Unfortunately, the materials we use to help these newcomers enter the field haven’t kept pace with the huge rise in interest!

We proposed to address this by better defining what these new audiences need to know.  They include:

  • human rights organizations
  • community groups
  • social movements

And more!  A brief brainstorm resulted in this list of things they are trying to learn:

  • how to select the right data to visualize?
  • what types of charts are best suited to understand what types of data?
  • what cultural assumptions are reflected in what types of dataviz?
  • how do design decisions (e.g. color) affect how readers will understand your data visualization?

This is just a preliminary list of course.

Rounding it Up

Problem solved!

Just kidding… we have a lot of work to do if we want to build a responsible approach to literacies about data visualization. These four suggestions from our small working group at the RDFViz event are just that – suggestions. However, the space to approach this from a responsible point of view, and the conversations and disagreements were invaluable!

 

Many thanks to the organizers and funders, including our facilitator Mushon Zer-Aviv, our organizers at the Engine Room, our hosts at ThoughtWorks, Data & Society and Data-Pop Alliance, and our sponsors at Open Society Foundations and Tableau Foundation.  This is cross-posted to the MIT Center for Civic Media website.

Paper on Designing Tools for Learners

On an academic note, I just published a paper for the Data Literacy workshop at the WebSci 2015 conference.  Catherine D’Ignazio and I wrote up our approach to building data tools for learners, not users.  Here’s the abstract, and you can read the full paper too.

Data-centric thinking is rapidly becoming vital to the way we work, communicate and understand in the 21st century. This has led to a proliferation of tools for novices that help them operate on data to clean, process, aggregate, and visualize it. Unfortunately, these tools have been designed to support users rather than learners that are trying to develop strong data literacy. This paper outlines a basic definition of data literacy and uses it to analyze the tools in this space. Based on this analysis, we propose a set of pedagogical design principles to guide the development of tools and activities that help learners build data literacy. We outline a rationale for these tools to be strongly focused, well guided, very inviting, and highly expandable. Based on these principles, we offer an example of a tool and accompanying activity that we created. Reviewing the tool as a case study, we outline design decisions that align it with our pedagogy. Discussing the activity that we led in academic classroom settings with undergraduate and graduate students, we show how the sketches students created while using the tool reflect their adeptness with key data literacy skills based on our definition. With these early results in mind, we suggest that to better support the growing number of people learning to read and speak with data, tool designers and educators must design from the start with these strong pedagogical principles in mind.

Data Storytelling Studio – Final Projects

I recently wrapped up my first semester-long course at MIT, called the Data Storytelling Studio.  Students posted all their work on the course blog, but I wanted to share short summaries of their wonderful final projects!  All but one focused on the topic of food security.

Somerville Resources

Tuyen Bui, Hayley Song, and Deborah Chen worked with partners in Somerville, MA to create a short video about the challenge of, and community response to, food insecurity among local youth.  They shot video with local programs and included “pop up” data about the problems.  The goal was to raise awareness about the problem and solutions to drive people to volunteer with the partners featured. Watch their movie, or read more about the Somerville Resources video.

SnapSim

Danielle Man, Edwin Zhang, Harihar Subramanyam & Tami Forrester explored food pricing data, nutrition data, and SNAP benefit data in the hopes of building empathy with those enrolled in SNAP.  They created an interactive text-based game that puts you in the role of a single parent on SNAP shopping for food for themselves and their two children.  Play the game and see how you fare making hard decisions about what to buy for your family on a tight budget.  Read more about their SnapSim project.

SNAP Judgements

Mary Delaney and Stephen Suen worked with demographic data about SNAP participants, food nutrition data, and housing data.  They wanted to build empathy and understanding among college students for the difficult trade-offs those in SNAP have to make between health, happiness, and financial security.  Mary and Stephen created a text-based game where you take on the persona of a SNAP participant and are forced to make decisions over time about when to buy food and what to buy to feed your family. Play their game now, or read more about their SNAP Judgements project.

Drought Debunkers

Val Healy, Nolan Essigmann and Ceri Riley explored data about drought and water use in the United States.  Their goal was to tell a story to young college students about how individual conservation choices are largely symbolic in terms of environmental impact, and urge them to work on collective solutions that focus on agricultural and industrial water usage.  They created a web-scrolling infographic to tell their story. Read more about the Drought Debunkers project.

Art Crayon Toolkit

Laura Perovich & Desi Gonzalez looked at color use in famous paintings. Their goal was to build engagement with children around visual elements of art and spark their interests in the arts by connecting in novel ways.  They created a wonderful set of custom crayons that matched the color distributions in various paintings, and an activity book they play-tested with a small set of children.  Read more about their Art Crayon Toolkit.

The Data Storytelling Studio

I’ve been radio silent for the last half year for two reasons.  Firstly, we had a new baby!  Secondly, I’ve been planning and am now teaching a semester-long course at MIT for undergraduates and graduate students.  I’ve called this course the Data Storytelling Studio.  You can follow the course blog at http://cms631.datatherapy.org.  I’ll continue to blog here, but less frequently this semester.

I prefer not to share cute baby pictures online, but am happy to share pictures from the course!  I’ve sketched it out with my colleague Catherine D’Ignazio, assistant professor at Emerson College.  She is teaching a version tailored for journalists there, while I teach a diverse audience of MIT students (the course is offered by the Comparative Media Studies / Writing program).  The course isn’t a programming or data science course; the focus is more on process, tools, and creative presentation.

I’ll be leading the students through an arc of five modules:

  1. Introduction – we begin by setting context and designing and painting a Data Mural together
  2. Finding and Analyzing Data
  3. Cleaning Data and Finding Stories
  4. Presenting Your Story
  5. Final Project

I’ve focused on the topic of food security for this semester, so most of the projects and assignments will focus on that.  In fact, our mural tells a story about Food For Free, a local organization that runs food rescue and other programming.  As you can see, we’re off to a great start!
