I’m happy to announce that we received a grant from the Knight Foundation to work with Catherine D’Ignazio (from the Emerson Engagement Lab) on a new suite of tools called DataBasic! Expect to see more here as we build it out for data literacy learners over the fall. Follow our progress over on DataBasic.io.
We propose to create a suite of focused and simple tools for journalists, data journalism classrooms and community advocacy groups. Though there are numerous data analysis and visualization tools for novices, we have identified some significant gaps through prior research. DataBasic is designed to fill these gaps for people who do not know how to code, and to provide a low barrier to further learning about data analysis for storytelling.
In the first iteration of this project we will build three tools, develop three training activities and run one workshop with journalists and students for feedback. The three tools are:
(1) WTFcsv: A web application that takes a CSV file as input and returns a summary of the fields, their data types, their ranges, and basic descriptive statistics. It is a prettier version of R’s “summary” command and helps at the outset of the data analysis process.
(2) WordCounter: A basic word counting tool that takes unstructured text as input and returns word frequencies, bigrams (two-word phrases) and trigrams (three-word phrases).
(3) TuffyDuff: A tool that runs TF-IDF algorithms on two or more corpora to compare them, surfacing the words that are frequent in one corpus but uncommon in the others.
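To give a feel for what each of the three tools computes, here is a rough Python sketch. It is not DataBasic’s code: the function names (`summarize_csv`, `ngram_counts`, `tf_idf`) are made up for illustration, the type detection is deliberately crude (a column is a "number" only if every non-empty value parses as a float), and the TF-IDF shown is just one common variant (relative term frequency times the natural log of number of corpora over number of corpora containing the word).

```python
import csv
import io
import math
import re
from collections import Counter


def summarize_csv(text):
    """WTFcsv-style summary: guess each column's type and describe it."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    summary = {}
    for field in reader.fieldnames or []:
        values = [row[field] for row in rows if row[field] not in (None, "")]
        if not values:
            summary[field] = {"type": "empty"}
            continue
        try:
            nums = [float(v) for v in values]
            summary[field] = {
                "type": "number",
                "min": min(nums),
                "max": max(nums),
                "mean": sum(nums) / len(nums),
            }
        except ValueError:
            # At least one value is not numeric, so treat the column as text.
            summary[field] = {"type": "text", "unique": len(set(values))}
    return summary


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n=1):
    """WordCounter-style counts: n=1 words, n=2 bigrams, n=3 trigrams."""
    words = tokenize(text)
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def tf_idf(corpora):
    """TuffyDuff-style comparison. corpora maps a name to its text."""
    counts = {name: Counter(tokenize(text)) for name, text in corpora.items()}
    n_corpora = len(counts)
    # Number of corpora in which each word appears at least once.
    doc_freq = Counter(word for c in counts.values() for word in c)
    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            word: (k / total) * math.log(n_corpora / doc_freq[word])
            for word, k in c.items()
        }
    return scores


if __name__ == "__main__":
    print(summarize_csv("name,age\nAda,36\nGrace,45\n"))
    print(ngram_counts("the cat sat on the cat", n=2).most_common(3))
    print(tf_idf({"a": "apple banana", "b": "apple cherry"}))
```

Note that with this variant a word appearing in every corpus scores zero, which is the point: only words that set one corpus apart from the others rise to the top.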