Learning with the Knight Foundation

DataBasic’s development was generously supported by a grant from the Knight Foundation’s Prototype Fund.  The key hypothesis they funded us to test? That building online data tools for learners is different from building them for users. While the output is available here at databasic.io, sadly our grant has ended 🙁  We wanted to take this opportunity to share some of our experience from the close-out event Knight just hosted with Maya Design.

Wrap-Up and Reflection

The event began with some scaffolded reflection on the 6-month grant cycle.  We spoke with other grantees in small groups about goals, pivots, successes, failures, learnings, and next steps.  Many sticky notes were used!


One highlight of the event was the amazing set of short 5-minute talks describing the wide variety of projects.  Here’s Rahul speaking about what we’ve learned so far:

The key here was the focus on learnings.  All these small projects Knight funded create an aggregate picture of learnings from prototyping, a methodology we deeply believe in.


With prototypes built, how do you get folks to use them? Our approach with DataBasic involved our audiences in the development of the tools, via multiple workshops, so we had investment from them already.  The focus of this Knight event was to explore the answer to this question more, to help grantees refocus on outreach and marketing.

The folks from Maya led us in an exploration of stakeholders, their needs, and the benefits we provide.  While we had done much of that thinking before, formalizing it on paper was incredibly helpful for spotting holes in the outreach we are doing post-launch.

Then they led some great hands-on exercises to help develop elevator pitches from a template.  Here’s one we wrote about why DataBasic is useful for community activists:

For community activists, who need friendlier tools that help them learn how to work with data, DataBasic.io is a suite of online tools for working with data that provide playful and focused, on- and off-screen activities unlike other online tools for finding stories in data, because we have designed and built them with and for learners.

These pitches are a great way to clearly describe what your project brings to its intended audiences, even if you have already thought about it a ton.

Next Steps

As we’ve written before, this type of work is more like a marathon than a sprint.  While we are excited about the enthusiastic response after our launch, we realize that we have to pay close attention to feedback from our various audiences to keep iterating.  We of course hope DataBasic’s tools and activities are useful, but we also want the project to evolve into a model for how to build data-centric tools for the huge population of learners starting to work with data.  Many thanks to Knight for helping us start down this road, and we look forward to getting their, and your, feedback!


One Week of DataBasic.io

We launched DataBasic.io just one week ago and would love to share some of how our first week went.  More than 4000 people tried out DataBasic.io in just our first week 🙂

Some Press

First up, we got some great press online, from Matt Carroll’s review to the MIT Media Lab to the UN in Jakarta!

That really helped drive some attention and traffic, and spread the message to folks that we otherwise aren’t connected to!

Some Stats

As we mentioned earlier, more than 4000 people visited the site.  That’s great, but we really care about what they did when they were there.  Here are some stats we were excited about:

  • More than 1000 people watched our introductory videos.  This is great, because that’s one of the ways we are trying to make this type of stuff fun and not-intimidating.  Watch them all on our DataBasic Vimeo channel.
  • More than 1/3 of the people who came to one of our tools actually used it to analyze some data.  In the world of web analytics this counts as one of our main “conversion” rates – how many people did the thing we wanted them to do.  This is really high! Most people looked at the sample data about the Titanic survivors.
  • Almost 50% of visitors rolled over a technical term and read a popup defining what it means.  We are hyper-focused on people as learners, and it is great to see this learning feature being used so much.
  • Over 200 people downloaded our activity guides.  One of our core goals is to make these fun activities easier for other people to run, so this is great news to us.  Check out the activity guides for WordCounter, SameDiff, and WTFcsv.

That’s just a quick summary of the impact and use we are seeing.  Definitely a success so far!

What’s Next?

These types of projects are much more like marathons than sprints, so as people start to use our DataBasic tools in the real world we look forward to learning more from their feedback.  For instance, we’ve already had a number of suggestions, and an offer to translate the tools into Hungarian.  We’ve also secured funding to add another tool to the suite. Let us know how these tools work for you, and send in any ideas you have about them.

Announcing DataBasic.io!

DataBasic.io is live!!!  After 6 months of planning, building, trying things out with folks, and rebuilding, we’re open to the public 🙂

We’ve got three tools for you to start playing with – WTFcsv, WordCounter, and SameDiff.  Pop on over to https://databasic.io and give them a try.  Right now we’re supporting Spanish and English, and the site is accessible to visually impaired users via screen-reading software.

Don’t forget to watch the short intro videos on each homepage, and check out the activity guides.

Big thanks to the Emerson Engagement Lab team for helping us get these done – specifically Jay and Jordan!  And of course this wouldn’t have been possible without the support of our sponsor, the amazing Knight Foundation.

Beta Workshop Success

We just held our second, BETA, workshop to test out the DataBasic tools and activities.  Our invitation brought a wonderfully diverse set of journalists, students, community organizers, educators, and folks that work in the arts!  We had to limit attendance to 40 people, simply due to physical limitations in the room.


We gathered another round of invaluable feedback, documented some more initial uses, brainstormed potential applications, and had a bunch of fun!  Here are two quick drawings participants made, comparing the lyrics that various musicians use with WordCounter and SameDiff:


BETA Workshop coming up on 12/8

We’re hosting a BETA workshop on our DataBasic suite of tools on Tues 12/8, from 6-8pm. This workshop is designed for journalists, educators and community organizations that are just starting to work with data. Register on our Eventbrite page.


Opening Shot

The tools focus on understanding what is in a CSV file, and also starting to analyze large sets of text data in quantitative ways. We introduce each with a fun, hands-on activity, so it isn’t just staring at screens all evening 🙂 Read the invite for more about why you might want to attend. We’d appreciate your help testing these tools out and want to get more feedback from real folks before we launch them publicly!

(Plus free dinner!)


Video Shoot

Sometimes online tools for working with data can be confusing and overwhelming when you first visit them.  One way we can try to address this is by having short, friendly introductory videos to tell you why you might want to use each tool.  We wrote some scripts, found some clothes that match the logos, and started shooting video intros for each of the three tools.

They are in post-production now, but you’ll be able to watch them soon on the homepage of each tool in our suite!  Here are some photos to whet your appetite.


Trying to look casual is hard!


Haven’t had to do this much memorizing since grade school


DataBasic’s First Workshop


On November 8th, 2015, we ran our first pilot workshop of the DataBasic suite of tools at the MIT Media Lab. We hosted around 10-12 people, mostly trusted friends who we could rely on to be honest but kind about the inevitable bugs and shortcomings of the tools at this early stage. For this event, we first outlined our high-level design goal: Design tools that support data literacy learners, not just folks who already know what they are doing with data. We also introduced them to the principles behind DataBasic and noted that our target audiences are journalists, educators, community organizations and students.

We then introduced our three tools, one at a time, with activities that we have designed to teach the tools in a fun way. For example, in WTFcsv – a tool that provides column-by-column descriptions of .csv files – learners chose to develop data-driven questions around UFO sightings. Did particular cities have higher per-capita sightings of UFOs? Why did so many people see UFOs in the form of “fireballs”? If we combined this data with weather data, would some interesting patterns surface?


If Kanye West and Elvis Presley had a song-baby, the lyrics would probably look like this.

WordCounter and SameDiff teach basic principles of quantitative text analysis. In the activities for these tools, learners worked with sample data from musicians’ lyrics. They used crayons and simple drawings to illustrate patterns from individual artists’ lyrics as well as results from comparative analysis (SameDiff). The above image presents a sample song that might have been written by Kanye West and Elvis Presley together.

Overall, we learned a great deal from our learners. They had excellent ideas to make the tools more fun, approachable and instructive for new users. We also learned where we had over-complicated things and needed to go back and simplify. And, of course, we had lots of small fixes and feature suggestions that we are working on for our public launch in late December. Stay tuned!

Small fixes and features - suggestions from an astute group



DataBasic’s Guiding Principles


These are the design principles that we used to build DataBasic. They come from our paper Designing Tools and Activities for Data Literacy Learners. In order to support learners (rather than users who already know what they are doing), we say that tools should be:

  1. Focused
    1. Centered around one user activity
    2. Doesn’t have too many options
    3. Newbies can do something meaningful quickly
  2. Guided
    1. Has sample data baked in
    2. Can be run from home page
    3. Clear, contextual documentation to get new users started
  3. Inviting
    1. Has a sense of humor!
    2. Described in non-technical language
    3. Can be used in a real context
  4. Expandable
    1. Built for novices
    2. Includes information about how it works (not a black box)
    3. Places itself in a pipeline of analysis


What is DataBasic All About?

DataBasic is a suite of focused and simple tools and activities for journalists, data journalism classrooms and community advocacy groups.  We’re happy to announce that we’ve received funding from the Knight Foundation to build and test DataBasic over the next 6 months!


What is DataBasic?

Though there are numerous data analysis and visualization tools for novices, we have identified some significant gaps through prior research. DataBasic is designed to fill these gaps for people who do not know how to code, and to provide a low barrier to further learning about data analysis for storytelling.

In the first iteration of this project we will build three tools, develop three training activities and run one workshop with journalists and students for feedback. The three tools include:

  • WTFcsv: A web application that takes a CSV file as input and returns a summary of the fields, their data type, their range, and basic descriptive statistics. This is a prettier version of R’s “summary” command and helps at the outset of the data analysis process.
  • WordCounter: A basic word counting tool that takes unstructured text as input and returns word frequency, bigrams (two-word phrases) and trigrams (three-word phrases).
  • SameDiff: A tool that gives you various ways to compare two text documents, to see how they are similar and/or different.
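
To make those three descriptions a bit more concrete, here is a rough Python sketch of the kind of analysis each tool performs. This is not the DataBasic code: the function names (summarize_csv, count_ngrams, compare_texts), the simple tokenizing rule, and the tiny sample inputs are all made up for illustration, and the real tools do considerably more.

```python
import csv
import io
import re
from collections import Counter


def summarize_csv(text):
    """WTFcsv-style idea (hypothetical helper): describe each column of a CSV."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for column in (rows[0].keys() if rows else []):
        # Ignore blank or missing cells.
        values = [row[column] for row in rows if row[column]]
        if not values:
            summary[column] = {"type": "empty"}
            continue
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # At least one value is not numeric, so treat the column as text.
            summary[column] = {
                "type": "text",
                "most_common": Counter(values).most_common(3),
            }
        else:
            summary[column] = {
                "type": "number",
                "min": min(numbers),
                "max": max(numbers),
                "mean": sum(numbers) / len(numbers),
            }
    return summary


def count_ngrams(text, n=1):
    """WordCounter-style idea: count words (n=1), bigrams (n=2), trigrams (n=3)."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def compare_texts(text_a, text_b):
    """SameDiff-style idea: which words are shared, and which are unique to each text."""
    words_a = set(count_ngrams(text_a))
    words_b = set(count_ngrams(text_b))
    return {
        "shared": sorted(words_a & words_b),
        "only_a": sorted(words_a - words_b),
        "only_b": sorted(words_b - words_a),
    }


# Made-up sample inputs, just to show how the helpers are called.
print(summarize_csv("name,age\nAda,36\nGrace,45\n"))
print(count_ngrams("the cat and the hat", n=2).most_common(2))
print(compare_texts("love me tender", "love lockdown"))
```

The sketch uses only the Python standard library so it is easy to try out. Comparing sets of words is the simplest possible notion of "similar and/or different"; a fuller comparison would also weigh how often each word appears.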

More importantly, we’ll be providing an introductory video and simple training activities for each tool as a way to scaffold learning about data analysis at the same time as doing it. These activities will include fun datasets to start off with, and introduce vocabulary terms and the algorithms at work behind the scenes.  We strongly believe in building tools for learners, and will be putting those ideas into practice on these tools and activities.

Who is Building This?

Catherine D’Ignazio is an Assistant Professor of Data Visualization and Civic Media at Emerson College and a Fellow at the Engagement Lab. She has a background in software development, media analysis and the arts and currently teaches journalism students.

Rahul Bhargava is a Research Scientist at the MIT Center for Civic Media. He works in quantitative media analysis and leads data literacy workshops for students and community groups.

Is it Ready Yet?

We are still developing the first prototypes so we can try them out with folks. Expect to see more updates here as we build them out over the fall.