What is DataBasic All About?

DataBasic is a suite of focused and simple tools and activities for journalists, data journalism classrooms and community advocacy groups.  We’re happy to announce that we’ve received funding from the Knight Foundation to build and test DataBasic over the next 6 months!


What is DataBasic?

Although there are numerous data analysis and visualization tools for novices, our prior research identified some significant gaps. DataBasic is designed to fill these gaps for people who do not know how to code, and to provide a low barrier to further learning about data analysis for storytelling.

In the first iteration of this project we will build three tools, develop three training activities, and run one workshop with journalists and students to gather feedback. The three tools are:

  • WTFcsv: A web application that takes a CSV file as input and returns a summary of the fields, their data types, their ranges, and basic descriptive statistics. This is a prettier version of R’s “summary” command and helps at the outset of the data analysis process.
  • WordCounter: A basic word counting tool that takes unstructured text as input and returns word frequencies, bigrams (two-word phrases), and trigrams (three-word phrases).
  • SameDiff: A tool that gives you various ways to compare two text documents, to see how they are similar and/or different.
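
To make the three ideas above concrete, here is a rough Python sketch of the kind of computation each tool describes. It is not DataBasic's code: the function names (summarize_csv, ngram_counts, compare_texts), the simple letters-only tokenizer, and the use of word overlap as the comparison measure are all assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter


def summarize_csv(text):
    """WTFcsv-style idea (hypothetical): guess each column's type and basic stats."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    summary = {}
    for field in reader.fieldnames or []:
        # Ignore blank cells and cells missing from short rows.
        values = [row[field] for row in rows if row[field] not in ("", None)]
        if not values:
            summary[field] = {"type": "empty"}
            continue
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # At least one value is not numeric, so treat the column as text.
            summary[field] = {"type": "text", "distinct": len(set(values))}
        else:
            summary[field] = {
                "type": "number",
                "min": min(numbers),
                "max": max(numbers),
                "mean": sum(numbers) / len(numbers),
            }
    return summary


def tokenize(text):
    """Assumed tokenizer: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n=1):
    """WordCounter-style idea: n=1 counts words, n=2 bigrams, n=3 trigrams."""
    words = tokenize(text)
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def compare_texts(first, second):
    """SameDiff-style idea (one of many possible measures): word overlap."""
    a, b = set(tokenize(first)), set(tokenize(second))
    union = a | b
    return {
        "shared": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
        # Jaccard similarity: shared vocabulary divided by total vocabulary.
        "jaccard": len(a & b) / len(union) if union else 0.0,
    }
```

For example, ngram_counts(text, 2).most_common(10) would list the ten most frequent two-word phrases in a document, and compare_texts(speech_a, speech_b)["only_first"] would list the words that appear only in the first of two texts (speech_a and speech_b are placeholder variable names).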

More importantly, we’ll be providing an introductory video and simple training activities for each tool, as a way to scaffold learning about data analysis while doing it. These activities will include fun datasets to start off with, and will introduce vocabulary terms and the algorithms at work behind the scenes. We strongly believe in building tools for learners, and we will be putting that belief into practice in these tools and activities.

Who is Building This?

Catherine D’Ignazio is an Assistant Professor of Data Visualization and Civic Media at Emerson College and a Fellow at the Engagement Lab. She has a background in software development, media analysis and the arts and currently teaches journalism students.

Rahul Bhargava is a Research Scientist at the MIT Center for Civic Media. He works in quantitative media analysis and leads data literacy workshops for students and community groups.

Is it Ready Yet?

We are still developing the first prototypes so we can try them out with folks. Expect to see more updates here as we build them out over the fall.