Big data just seems to get bigger all the time, but that doesn’t mean it gets any less messy. Even large, carefully cultivated government datasets suffer from irregularities like acronyms, open response items, and misused categories. Steadfast librarians have the patience for such inaccuracies, but undergraduate students are often unprepared for the realities of the big data they crave. Teaching data cleaning and collaboration can not only help students better understand and use large datasets but also illustrate the importance of library-cultivated data, which often has fewer of these problems than datasets found on the open web. At a high level, library data and open datasets may seem comparable, but when we give students the tools to slog through the data on their own, the small things start to add up.
datasets, academic librarianship, openrefine, google refine
Stonebraker, Ilana, "Good Library Data Made Better With Technology! Using OpenRefine and Google Fusion Tables in Academic Business Libraries Instruction" (2015). Libraries Faculty and Staff Scholarship and Research. Paper 118.