We launched DataBasic.io just one week ago and would love to share how it went. More than 4000 people tried out DataBasic.io in that first week 🙂
First up, we got some great press online, from Matt Carroll’s review to the MIT Media Lab to the UN in Jakarta!
That really helped drive some attention and traffic, and spread the message to folks that we otherwise aren’t connected to!
As we mentioned earlier, more than 4000 people visited the site. That’s great, but we really care about what they did when they were there. Here are some stats we were excited about:
- More than 1000 people watched our introductory videos. This is great, because that’s one of the ways we are trying to make this type of stuff fun and not-intimidating. Watch them all on our DataBasic Vimeo channel.
- More than 1/3 of the people who came to one of our tools actually used it to analyze some data. In the world of web analytics this counts as one of our main “conversion” rates – how many people did the thing we wanted them to do. This is really high! Most people looked at the sample data about the Titanic survivors.
- Almost 50% of visitors rolled over a technical term and read a popup defining what it means. We are hyper focused on people as learners, and it is great to see this learning feature being used so much.
- Over 200 people downloaded our activity guides. One of our core goals is to make these fun activities easier for other people to run, so this is great news to us. Check out the activity guides for WordCounter, SameDiff, and WTFcsv.
That’s just a quick summary of the impact and use we are seeing. Definitely a success so far!
These types of projects are much more like marathons than sprints, so as people start to use our DataBasic tools in the real world we look forward to learning more from their feedback. For instance, we’ve already had a number of suggestions, and an offer to translate the tools into Hungarian. We’ve also secured funding to add another tool to the suite. Let us know how these tools work for you, and send in any ideas you have about them.
DataBasic.io is live!!! After 6 months of planning, building, trying things out with folks, and rebuilding, we’re open to the public 🙂
We’ve got three tools for you to start playing with – WTFcsv, WordCounter, and SameDiff. Pop on over to https://databasic.io and give them a try. Right now we’re supporting Spanish and English, and the site is accessible to visually impaired people via screen-reading software.
Don’t forget to watch the short intro videos on each homepage, and check out the activity guides.
Big thanks to the Emerson Engagement Lab team for helping us get these done – specifically Jay and Jordan! And of course this wouldn’t have been possible without the support of our sponsor, the amazing Knight Foundation.
Remember those pictures from the video shoot that we shared recently? Well the editing is all done! Here are three videos that will be featured on the various homepages. These will be launching soon, so stay tuned…
Welcome to DataBasic
We just held our second, BETA, workshop to test out the DataBasic tools and activities. Our invitation brought a wonderfully diverse set of journalists, students, community organizers, educators, and folks that work in the arts! We had to limit attendance to 40 people, simply due to physical limitations in the room.
We gathered another round of invaluable feedback, documented some more initial uses, brainstormed potential applications, and had a bunch of fun! Here are two quick drawings participants made, comparing the lyrics that various musicians use with WordCounter and SameDiff:
We’re hosting a BETA workshop on our DataBasic suite of tools on Tues 12/8, from 6-8pm. This workshop is designed for journalists, educators and community organizations that are just starting to work with data. Register on our Eventbrite page:
The tools focus on understanding what is in a CSV file, and also starting to analyze large sets of text data in quantitative ways. We introduce each with a fun, hands-on activity, so it isn’t just staring at screens all evening 🙂 Read the invite for more about why you might want to attend. We’d appreciate your help testing these tools out and want to get more feedback from real folks before we launch them publicly!
(Plus free dinner!)
Sometimes online tools for working with data can be confusing and overwhelming when you first visit them. One way we can try to address this is by having short, friendly introductory videos to tell you why you might want to use each tool. We wrote some scripts, found some clothes that match the logos, and started shooting video intros for each of the three tools.
They are in post-production now, but you’ll be able to watch them soon on the homepage of each tool in our suite! Here are some photos to whet your appetite.
Trying to look casual is hard!
Haven’t had to do this much memorizing since grade school
On November 8th, 2015, we ran our first pilot workshop of the DataBasic suite of tools at the MIT Media Lab. We hosted around 10-12 people, mostly trusted friends who we could rely on to be honest but kind about the inevitable bugs and shortcomings of the tools at this early stage. For this event, we first outlined our high-level design goal: Design tools that support data literacy learners, not just folks who already know what they are doing with data. We also introduced them to the principles behind DataBasic and noted that our target audiences are journalists, educators, community organizations and students.
We then introduced our three tools, one at a time, with activities that we have designed to teach the tools in a fun way. For example, in WTFcsv – a tool that provides column-by-column descriptions of .csv files – learners chose to develop data-driven questions around UFO sightings. Did particular cities have higher per-capita UFO sightings? Why did so many people see UFOs in the form of “fireballs”? If we combined this data with weather data, would some interesting patterns surface?
If Kanye West and Elvis Presley had a song-baby, the lyrics would probably look like this.
WordCounter and SameDiff teach basic principles of quantitative text analysis. In the activities for these tools, learners worked with sample data from musicians’ lyrics. They used crayons and simple drawings to illustrate patterns from individual artists’ lyrics as well as results from comparative analysis (SameDiff). The above image presents a sample song that might be written by Kanye West and Elvis Presley together.
Overall, we learned a great deal from our learners. They had excellent ideas to make the tools more fun, approachable and instructive for new users. We also learned where we had over-complicated things and needed to go back and simplify. And, of course, we had lots of small fixes and feature suggestions that we are working on for our public launch in late December. Stay tuned!
Small fixes and features – suggestions from an astute group
DataBasic’s Guiding Principles
These are the design principles that we used to build DataBasic. They come from our paper Designing Tools and Activities for Data Literacy Learners. In order to support learners (rather than users who already know what they are doing), we say that tools should be:
- Centered around one user activity
- Doesn’t have too many options
- Newbies can do something meaningful quickly
- Has sample data baked in
- Can be run from home page
- Clear, contextual documentation to get new users started
- Has a sense of humor!
- Described in non-technical language
- Can be used in a real context
- Built for novices
- Includes information about how it works (not a black box)
- Places itself in a pipeline of analysis
DataBasic is a suite of focused and simple tools and activities for journalists, data journalism classrooms and community advocacy groups. We’re happy to announce that we’ve received funding from the Knight Foundation to build and test DataBasic over the next 6 months!
What is DataBasic?
Though there are numerous data analysis and visualization tools for novices, there are some significant gaps that we have identified through prior research. DataBasic is designed to fill these gaps for people who do not know how to code, and to provide a low barrier to further learning about data analysis for storytelling.
In the first iteration of this project we will build three tools, develop three training activities and run one workshop with journalists and students for feedback. The three tools include:
- WTFcsv: A web application that takes as input a CSV file and returns a summary of the fields, their data type, their range, and basic descriptive statistics. This is a prettier version of R’s “summary” command and aids at the outset of the data analysis process.
- WordCounter: A basic word counting tool that takes unstructured text as input and returns word frequency, bigrams (two-word phrases) and trigrams (three-word phrases)
- SameDiff: A tool that gives you various ways to compare two text documents, to see how they are similar and/or different.
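To make the text-analysis ideas above concrete, here is a minimal Python sketch of word, bigram, and trigram counting, plus a very simple two-document comparison. It is not the DataBasic code: the function names (`tokenize`, `ngram_counts`, `compare`), the tokenizing rule, and the sample lyrics are all assumptions made for illustration, and the real tools may count and compare quite differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes.

    Assumption: a deliberately crude rule chosen for this sketch.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(words, n):
    """Count every run of n consecutive words (n=1 gives plain word counts)."""
    runs = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(run) for run in runs)


def compare(words_a, words_b):
    """Split the vocabulary into: in both texts, only in A, only in B."""
    a, b = set(words_a), set(words_b)
    return a & b, a - b, b - a


# Hypothetical sample text, not one of the DataBasic sample datasets.
first = tokenize("Love me tender, love me sweet, never let me go")
second = tokenize("Let me love you, let me go")

word_freq = ngram_counts(first, 1)
bigrams = ngram_counts(first, 2)
trigrams = ngram_counts(first, 3)
shared, only_first, only_second = compare(first, second)

print(word_freq.most_common(3))
print(bigrams.most_common(3))
print(sorted(shared))
```

Comparing only the sets of words ignores how often each word appears, so this shows overlap in vocabulary rather than similarity in emphasis.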
More importantly, we’ll be providing an introductory video and simple training activities for each tool as a way to scaffold learning about data analysis at the same time as doing it. These activities will include fun datasets to start off with, and introduce vocabulary terms and the algorithms at work behind the scenes. We strongly believe in building tools for learners, and will be putting those ideas into practice on these tools and activities.
Who is Building This?
Catherine D’Ignazio is an Assistant Professor of Data Visualization and Civic Media at Emerson College and a Fellow at the Engagement Lab. She has a background in software development, media analysis and the arts and currently teaches journalism students.
Rahul Bhargava is a Research Scientist at the MIT Center for Civic Media. He works in quantitative media analysis and leads data literacy workshops for students and community groups.
Is it Ready Yet?
We are still developing the first prototypes so we can try them out with folks. Expect to see more updates here as we build them out over the fall.