UN Data Forum – Data Literacy: What, Why and How? (liveblog)

This is a liveblog written by Rahul Bhargava at the 2017 UN World Data Forum.  This serves as a summary of what the speakers spoke about, not an exact recording.  With that in mind, any errors or omissions are likely my fault, not the speakers’.

This panel has four speakers on the topic of data literacy, with an emphasis on front-line, practical things.

Empowering Future Users through Data Literacy – Professor Delia North

Prof North is Dean and Head of Math, Statistics and Computer Science at the University of KwaZulu-Natal in Durban.  She wants to spread the message of empowering people (a theme for this session).  Prof North, who has been teaching for over 30 years, works on curriculum design for school-level teacher training.  She has a passion for statistics and youth, at the national level in addition to within her university.

The need to maintain a competitive economy drives the need for statistical literacy, from basic operations to the PhD level.  All citizens need basic statistical literacy, for basic citizenship; it is best to accomplish this while they are in school. Professionals need competence to use statistics effectively in the workplace. Specialists need to continually improve their practice.  Universities tend to think everyone is on the path to becoming a mathematical statistician, but this is an old-fashioned approach.  This isn’t developing them as “consumers” of statistics.

Statistics is often introduced as “hidden” inside of mathematics, so this is what people in South Africa think about.  That doesn’t identify it as a job opportunity to learners. In addition, statisticians are poor at marketing their discipline. It is viewed as difficult, boring and confusing.  There is a shortage of skills, and an overestimation of ability.  The best statisticians go to industry, so universities are left understaffed.  There are “too few enablers” of statistical literacy.

Data used to be scarce, but now it is everywhere.  This requires a rethink of the way we introduce statistics. This involves bringing in more data, and teaching with new methods.  Students need to be actively involved with working with large datasets.  This is an opportunity, not a threat. The questions we ask on our assessments are calculator-driven, not focused on analytical thinking.

Data literacy is an essential part of statistical literacy.  Decisions based on data should be part of the statistical literacy training. Statistics should be taught as applied mathematics, within another discipline.  For example, they collected rubbish with children and had them track the amount and graph it. You can’t keep it trapped in mathematics classes.  You have to make learning these concepts fun!  Engaging workshops can radically change how empowered a group of teachers feels to introduce statistics.  They want to learn new teaching methods.  You have to teach them at the beginning to introduce things in the right way.

Empowering Users in Situ – Dr. Sati Naidu 

Dr. Naidu is Executive Manager for Stakeholder Relations at Statistics South Africa.  Stats SA has moved away from selling the data to helping people use the data for making evidence-based decisions. In 1996 South Africa did its first census. The first CD they produced cost 100,000 USD.  Now data collection is scattered across all the departments.  That should all be available on one platform to drive decision making.  They set up CRUISE, to merge a course for statistics, GIS, planning, and economics all together.  Dr. Naidu attended this course and learned much about a geographic approach to statistics.  Mapping can reveal patterns that are otherwise hidden by traditional analytical means.  This is demonstrated with a powerful set of maps that show the incidence of HIV/AIDS over time across Africa.

Now Stats SA uses GIS to create a platform that combines geometry, shapefiles, and more. This lets them create thematic maps very easily. They offer trainings on these tools throughout South Africa.

Another example is looking at piped water over time, to see an increase.  With the map you can see which areas improved, and look for patterns in those with low or high services.  You can run hotspot analysis to look at unemployment data. You can do geospatial analysis to look for outliers and then look for causes.

When data is non-stationary you can’t just use traditional statistical analysis. For instance new houses are much more expensive than old houses in most of Cape Town. But in one area, new houses are very cheap because of the location.  So in one part of town there is a positive correlation, and in another there is a negative one.  You can find this with geographically weighted regression (GWR), while it would be hidden in a traditional regression.
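To make the housing example concrete, here is a minimal sketch of the idea on made-up data.  It is not Dr. Naidu's analysis and uses no Stats SA data: the synthetic "city", the slopes, the bandwidth, and the helper names `make_synthetic_city` and `gwr_local_coefs` are all my own assumptions for illustration.  It fits one ordinary regression across the whole city, then a simplified geographically weighted regression (a distance-weighted least-squares fit centred on a chosen location) in each half of town.

```python
import numpy as np

def make_synthetic_city(n=400, seed=0):
    """Build a hypothetical 1 x 1 city (synthetic data, not real prices).

    In the west half newer houses cost more; in the east half they cost less.
    """
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0, 1, size=(n, 2))   # (east, north) position of each house
    newness = rng.uniform(0, 1, size=n)       # 0 = old house, 1 = new house
    west = coords[:, 0] < 0.5
    slope = np.where(west, 2.0, -2.0)         # opposite relationships by area
    price = 5.0 + slope * (newness - 0.5) + rng.normal(0, 0.1, size=n)
    return coords, newness, price

def gwr_local_coefs(coords, x, y, at, bandwidth):
    """Fit y ~ intercept + slope * x by weighted least squares around `at`.

    Each observation is weighted by a Gaussian kernel of its distance to `at`,
    so nearby houses dominate the local fit.
    """
    dist = np.linalg.norm(coords - at, axis=1)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
    design = np.column_stack([np.ones_like(x), x])
    root_w = np.sqrt(weights)
    # Scaling rows by sqrt(weight) turns weighted least squares into ordinary least squares.
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return beta  # [intercept, slope]

coords, newness, price = make_synthetic_city()

# One regression for the whole city: the opposite local trends largely cancel out.
global_slope = np.polyfit(newness, price, 1)[0]

# Local fits centred in the west and east halves of town.
west_slope = gwr_local_coefs(coords, newness, price, np.array([0.25, 0.5]), 0.15)[1]
east_slope = gwr_local_coefs(coords, newness, price, np.array([0.75, 0.5]), 0.15)[1]

print("global slope:", round(global_slope, 2))
print("local slope (west):", round(west_slope, 2))
print("local slope (east):", round(east_slope, 2))
```

In this toy setup the single global slope should land near zero while the two local slopes have opposite signs, which is the pattern the speaker describes.  A real GWR analysis would also need to choose the bandwidth carefully and fit at many locations, which this sketch skips.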

Stats SA has all the official data.  Now they want to engage with private providers to make their data available.  We need to change from Big Data to Open Data, to go from its size to how it is used.

Data Literacy for Capacity Building – Dr. Blandina Kilama

Dr. Kilama works for REPOA on poverty research in Tanzania. REPOA is a think tank in Tanzania that undertakes policy research.  She also teaches statistics part-time, and will share some of her learnings from there.

The stakeholders vary from policy makers, to academia, to media, to CSOs.  In Tanzania, agriculture is a major sector.  This matters because politicians and others often conflate things like employment and productivity when talking about growth. Most African countries are seeing growth from productivity, not from labor.  For instance, agriculture, industry and services contribute roughly equally in terms of the economy.  However, more than 70% of the labour force works in agriculture.

This capacity causes problems sometimes.  For instance REPOA produced some poverty maps that were used by policy makers, leading to reactions of surprise and accusations.  Spatial analysis helped them explain this better, by showing how districts next to cities experienced growth, while districts next to refugee camps showed a lack of growth.

For media, REPOA builds in flexibility. They do half-day trainings, and make topics relevant for their current work.  These fit the media workers’ schedules, between their morning check-ins and afternoon deadlines.

The challenges include weak numerical literacy, a shift in policies, and a lack of time. In Tanzania there is a common saying “we are all scared of numbers.”  This attitude is a real social challenge to conquer; the stakeholders have a deep fear of numbers. Policies need to shift to include the idea that people providing the data are protected, and experience benefits from it.

Data and Statistics: the sciences, the literacies and collaboration – Professor Helen MacGillivray

Dr. MacGillivray is a high-level mathematical statistician, and heavily involved with teacher training. She works in Australia, but is the incoming President of the International Statistical Institute.  This is a big topic, and the challenges reflect that.

In Australia, the people involved in teaching are the ones thinking about what is data literacy, and what is data science. There are valuable lessons in the decades of work on building statistical literacy.  These include work within the other disciplines.  Some tidbits include the idea that descriptions are better than definitions, and that discussion is essential, but diagrammatic representations are not.

Statistical literacy focuses on understanding and consuming information, and on interpreting and thinking critically about it. This differs at grade levels. The curriculum has an aim of helping you look behind the data, ask why it is presented, and what questions can be asked.

With data literacy there aren’t many definitions around. The ones that exist vary. Some split this between information literacy and data management.

Why is this important?  It is for everyone to the extent appropriate for their level of education, training, and work. This is very contextual, so it requires constant learning.

How do you do this?  Models at the governmental level are actually decades old.  The emphasis is on the problems, the plan, getting the data, analyzing, and then discussions and interpretation.  Dr. MacGillivray, in her workshops with teachers, encourages them to not think about just the problem and the answer.  This work is much wider than that.  At the professional level, current approaches lead statisticians to think that they should NOT be involved with the collection of data; that somehow that gets their hands dirty.  They think it is a waste of a statistician’s valuable time.  Nothing could be further from the truth.

In terms of penetration, there is lots of practice, but current teaching methods are still buried in old practices. They need to use complex, many-variabled datasets.  This leads to impediments for data literacy and data science.  Instead of a misplaced focus on calculation as in statistical literacy education, in data science education there is a misplaced focus on coding.


Q & A

How about grassroots data literacy – what school do I send my students to?  Can students analyze air quality?  Part of data literacy is knowing data is important for decision making.

Prof North responds that the sourcing of data is critical: what it is, where it came from, and why it was collected. Now we try to use household data that is from the world of the student.  You can use larger datasets, but still from the world of the student.

In terms of data availability, is there a way to assess the data literacy levels of different countries? How can we do better outreach?

Dr. Naidu responds that, in terms of dissemination, Stats SA now takes the data to the people.  They have huge publicity campaigns to argue for collection, and then take the results back to the people.

The SDGs combine social, economic, and environmental measurements. The average person on the street, who is the target for behavior change, needs to understand the links between the three.  Where does scientific literacy come into this?

Prof MacGillivray reminds us that this is an old question, because these literacies operate within context in other fields.  We have to work with other disciplines and their educations.  Prof North adds that at her university they implemented practices that try to involve the other disciplines.  So if a student came in for help from another department, they involved the supervisor.  Dr. Kilama adds that in her country collecting the environmental data is the challenge they face.

What about using data literacy as a means to protect people from fake statistics?  Visualization can make bad statistics very acceptable.  We need to educate people about how to differentiate between good data and good-looking data.

This is the focus of the critical approaches.

What about adaptability for developing countries, places where connectivity is quite low?  Can we use radio for this?

This is our perspective from the Netherlands, so we don’t have good approaches already. Perhaps other people in the room do.