Cleaning Data with OpenRefine


Part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected and formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis itself.

OpenRefine (formerly Google Refine) is a powerful free and open-source tool for working with messy data: cleaning it and transforming it from one format into another.

This lesson will teach you to use OpenRefine to clean and format data effectively while automatically tracking any changes that you make. Many people comment that this tool saves them months of work they would otherwise spend making these edits by hand.

  • It is important to know what you did to your data. Additionally, journals, granting agencies, and other institutions are requiring documentation of the steps you took when working with your data. With OpenRefine, you can capture all actions applied to your raw data and share them with your publication as supplemental material.
  • All actions are easily reversed in OpenRefine.
  • OpenRefine always works on a copy of your data and does not modify your original dataset. When you save your work, it is written to a new file.
  • Data cleaning steps often need to be repeated across multiple files. OpenRefine keeps track of all of your actions and allows them to be applied to different datasets.

OpenRefine download
OpenRefine install documentation

Registration is required

Lesson:

Date:
Friday, October 15, 2021
Time:
2:00pm - 4:00pm
Campus:
Bizzell Memorial Library
Categories:
OU Libraries Event
Registration has closed.