Designing Tools for Learners (Not Users)

We (Catherine and Rahul) just co-authored an article in the Journal of Community Informatics called Design Principles, Tools and Activities for Data Literacy Learners. In it, we make the case that most tools that help people work with data prioritize flashy visualizations and outputs rather than scaffolding a learning process. This makes the process of data analysis feel like a black box, especially for people from non-technical backgrounds. We pose the question: what would it be like if we designed tools for learners rather than users? We offer four qualities that a tool designed for learners should aspire to: focused, guided, inviting, and expandable. We go on to talk about DataBasic as a case study. Here are the four qualities:

A focused tool strives to do one thing well.  These tools are easily learnable and relatively constrained.  Focused tools do not provide many types of options, and thus can provide a low entry point for data literacy learners.  They create a small playground that is rich enough for the learner to play within, but not so rich that they get lost.

A guided tool is introduced with strong activities to get the learner started.  Blank-slate websites and software packages require novice users to imagine usage scenarios.  Guided tools combat this by introducing themselves with an activity that holds the learner’s hand as they get started.  These tools might immediately present an on-ramp for learners via example data and example outputs.

An inviting tool is introduced in a way that is appealing to the learner. This might involve using data on a topic that is relevant or meaningful to them, or simply using humor and playfulness to invite the learner to experiment.  Inviting tools make conscious decisions about visual design, user interface and copywriting to offer a consistent, appealing, and non-intimidating invitation to the learner. Inviting activities use familiar materials to produce playful outputs that attract interest and excitement from learners.

An expandable tool is appropriate for the learner’s abilities, but also offers them paths to deeper learning (perhaps by leaving the tool and graduating to more complicated tools).  Such tools overcome a single-minded focus by including call-outs and capabilities that give the learner an opportunity and pathway to learn more about how the tool works.  Expandable tools recognize that they are steps along the path to building stronger data literacy for the learner, and help bridge from previous work to next steps.

Check out the full paper here. It is part of a special issue on data literacy published by the Journal of Community Informatics.


Big Data and Development at the MIT Media Lab

We conducted a workshop as part of the Data-Pop Alliance’s Global Professional Training Program on Big Data and Development at the MIT Media Lab. Data-Pop’s program focuses on building capacity for working with data for global professionals who are involved in development work and policymaking. You can read more about their approach here. In attendance were around 30 folks from universities, the civil sector, and government from a variety of countries, including Colombia, Senegal, France and the US.

Our workshop was titled Big, small, and popular data: engaging communities with data. First, we did a group critique of an infographic about global food production. This followed the structure of DataTherapy’s Activity: Critique a Gallery of Visualizations where we explored the story’s message, the audience and the visual techniques they used to tell a data-driven story. We then showed a basic process for working with data and gave some examples of how you can build in stakeholder participation at every stage of the process. The GoBoston2030 public engagement process run by the City of Boston for their transportation master planning process is a great example of this in government. They did community data analysis and interpretation events in order to make meaning out of thousands of qualitative data stories that they collected from citizens.

Finally, we presented the basic design goals of DataBasic, and participants worked in groups to tell a story from quantitative text data using WordCounter and to ask questions of a spreadsheet using WTFcsv. Groups came up with compelling ways to tell stories about their data in less than ten minutes. We had circle diagrams, sophisticated Simpsons-style cartooning and compelling concepts. We followed the workshop with Q & A about how to take simple, participatory methods back to their contexts.


Our round-up of DataBasic workshops and demos from Spring 2016

This has been an eventful spring for DataBasic! After launching in January to great success, we have been traveling to classes, conferences and workshops to help different groups of people learn about working with data.

Catherine led four workshops for graduate and undergraduate Journalism students at Emerson College. Journalism students learned how to work with qualitative and quantitative datasets from open data portals and start telling stories. Rahul led a workshop for his Data Storytelling Studio class at MIT. Students studying art and civic engagement at Emerson College used DataBasic to analyze citizen ideas for the future of transportation in Boston, which they later presented to guests from the City of Boston and the State of Massachusetts.

Rahul led a workshop for a coalition of organizations that work with youth in the arts sector.  It was an exciting chance to share how the DataBasic approach can help arts organizations think about telling stories of their impact with the rich qualitative and quantitative data they have.  The group loved WTFcsv’s visual approach to finding stories.

The Institute for Infinitely Small Things, a public art group, used DataBasic as part of their project Campaign Limericks, where they worked with students and community members to create limericks out of the top phrases spoken by presidential candidates. Want to check out our corpus of candidates’ speeches? There are over 100K words for Trump, Clinton, Cruz and Bernie. Catherine and the Institute later created an art installation and data visualization of four limericks at the Harvard Center for American Political Studies.

Four large limerick visualizations created from DataBasic analysis are up at the Harvard Center for American Political Studies in Cambridge through August 2016.

In April, Rahul shared DataBasic as a tool for participatory data analysis at TICTeC 2016 in Barcelona. Organized by mySociety, the conference showcased many different technologies and methods for evaluating the impact of civic technologies for people from 20+ countries.


Catherine spoke about Data & Community Engagement at the White House and demo’d DataBasic to 150+ people in law enforcement and technology at the White House’s celebration of one year of the Police Data Initiative. More than 53 police departments across the country have signed on to opening up their data in the next year. The event was inspiring and showcased law enforcement departments like Dallas and Orlando that are at the cutting edge of transparency and community engagement.

Also, Catherine got to take a photograph with US CTO Megan Smith, which was kind of awesome.

And in early May, Catherine ran a Data Storytelling 101 workshop for journalists on the education beat at the Education Writers Association conference. We worked with data from the Chicago Public School system on student suspensions and started asking questions about race and school ratings in conjunction with suspensions. We also spent a good portion of our time talking about cleaning and merging data.

The spring wrapped up with a workshop for 50 municipal government workers participating in the CityAccelerator project organized by Living Cities and facilitated by Eric Gordon and the Engagement Lab in New Orleans. Teams from Seattle, Albuquerque, Baltimore, Atlanta and New Orleans analyzed citizen comments with WordCounter and explored spreadsheets related to their accelerator projects with WTFcsv. We also brainstormed other datasets internal to their organizations that they might use with DataBasic.


We are thrilled with the reception up to this point and have learned a lot from our participants’ ideas about how they can use DataBasic in the context of journalism, media literacy, the arts, community engagement and local government.

DataBasic’s First Workshop


On November 8th, 2015, we ran our first pilot workshop of the DataBasic suite of tools at the MIT Media Lab. We hosted around 10-12 people, mostly trusted friends who we could rely on to be honest but kind about the inevitable bugs and shortcomings of the tools at this early stage. For this event, we first outlined our high-level design goal: Design tools that support data literacy learners, not just folks who already know what they are doing with data. We also introduced them to the principles behind DataBasic and noted that our target audiences are journalists, educators, community organizations and students.

We then introduced our three tools, one at a time, with activities that we have designed to teach the tools in a fun way. For example, in WTFcsv – a tool that provides column-by-column descriptions of .csv files – learners chose to develop data-driven questions around UFO sightings. Did particular cities have higher per-capita UFO sightings? Why did so many people see UFOs in the form of “fireballs”? If we combined this data with weather data, would some interesting patterns surface?
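
For readers curious about what a "column-by-column description" can look like under the hood, here is a minimal sketch in Python. It is not how WTFcsv is implemented; the function name describe_columns and the sample rows are made up for illustration. It simply counts, for each column, how many cells are filled, how many distinct values appear, and which values are most common.

```python
import csv
import io
from collections import Counter


def describe_columns(csv_text, top_n=3):
    """Summarize each column of a CSV string (hypothetical helper, not WTFcsv's code)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts_by_column = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name in counts_by_column:
            # Short rows yield None for missing cells; treat those as empty.
            value = (row.get(name) or "").strip()
            counts_by_column[name][value] += 1

    summary = {}
    for name, counts in counts_by_column.items():
        filled = [(v, n) for v, n in counts.most_common() if v != ""]
        summary[name] = {
            "filled": sum(n for _, n in filled),   # non-empty cells
            "distinct": len(filled),               # distinct non-empty values
            "most_common": filled[:top_n],         # top values with counts
        }
    return summary


# Made-up rows, only to show the shape of the output.
SAMPLE = """city,shape
Boston,fireball
Boston,disk
Phoenix,fireball
Phoenix,fireball
"""

summary = describe_columns(SAMPLE)
for column, info in summary.items():
    print(column, info)
```

A summary like this is only a starting point for questions. Seeing that one value dominates a column is what prompts a learner to ask why.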


If Kanye West and Elvis Presley had a song-baby, the lyrics would probably look like this.

WordCounter and SameDiff teach basic principles of quantitative text analysis. In the activities for these tools, learners worked with sample data from musicians’ lyrics. They used crayons and simple drawings to illustrate patterns from individual artists’ lyrics as well as results from comparative analysis (SameDiff). The above image presents a sample song which would be written by Kanye West and Elvis Presley.
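
The basic idea behind this kind of text analysis can be sketched in a few lines of Python. This is an illustrative sketch only, not the code behind WordCounter or SameDiff; the function names and the two short texts are invented for the example. It counts how often each word appears in a text, then compares two texts to find the words they share and the words unique to each.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase words in a text (a simplified take on word counting)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def compare(text_a, text_b):
    """Return words shared by both texts and words unique to each."""
    a, b = word_counts(text_a), word_counts(text_b)
    shared = sorted(set(a) & set(b))
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    return shared, only_a, only_b


# Invented snippets standing in for two artists' lyrics.
TEXT_A = "dance with me tonight dance all night"
TEXT_B = "we run all night we run the town"

shared, only_a, only_b = compare(TEXT_A, TEXT_B)
print("Top words in A:", word_counts(TEXT_A).most_common(2))
print("Shared:", shared)
print("Only in A:", only_a)
print("Only in B:", only_b)
```

Real tools typically do more than this sketch, such as ignoring very common words, but the counting and comparing steps are the core of the activity.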

Overall, we learned a great deal from our learners. They had excellent ideas to make the tools more fun, approachable and instructive for new users. We also learned where we had over-complicated things and needed to go back and simplify. And, of course, we had lots of small fixes and feature suggestions that we are working on for our public launch in late December. Stay tuned!

Small fixes and features - suggestions from an astute group


DataBasic’s Guiding Principles

These are the design principles that we used to build DataBasic. They come from our paper Designing Tools and Activities for Data Literacy Learners. In order to support learners (rather than users who already know what they are doing), we say that tools should be:

  1. Focused
    1. Centered around one user activity
    2. Doesn’t have too many options
    3. Newbies can do something meaningful quickly
  2. Guided
    1. Has sample data baked in
    2. Can be run from home page
    3. Clear, contextual documentation to get new users started
  3. Inviting
    1. Has a sense of humor!
    2. Described in non-technical language
    3. Can be used in a real context
  4. Expandable
    1. Built for novices
    2. Includes information about how it works (not a black box)
    3. Places itself in a pipeline of analysis