Intro Data Skills Part 1: Data Organization in Spreadsheets and Data Cleaning with OpenRefine

Part 1 of a 3-part series.

  • Data organization using spreadsheets: Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so that is where many research projects start. We typically organize data in spreadsheets in the way that we as humans want to work with it. However, computers require data to be organized in particular ways. To use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way computers need it. Since this is where most research projects start, this is where we want to start too!
    • In this lesson, you will learn:
      • Good data entry practices - formatting data tables in spreadsheets
      • How to avoid common formatting mistakes
      • Approaches for handling dates in spreadsheets
      • Basic quality control and data manipulation in spreadsheets
      • Exporting data from spreadsheets
  • Data cleaning using OpenRefine: Part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected or formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis. OpenRefine (formerly Google Refine) is a powerful, free, open-source tool for working with messy data: cleaning it and transforming it from one format into another.

The target audience is learners who have little to no prior computational experience, and the instructors put a priority on creating a friendly environment to empower researchers and enable data-driven discovery. Even those with some experience will benefit, as the goal is to teach not only how to do analyses, but how to manage the process to make it as automated and reproducible as possible.

Space is limited and will likely fill quickly. A waiting list will be maintained if all of the spots fill up.

Related LibGuide: All of Us at UB by Jocelyn Swick-Jemison

Date:
Tuesday, September 24, 2024
Time:
12:00pm - 3:00pm
Time Zone:
Eastern Time - US & Canada
Online:
This is an online event. Event URL will be sent via registration email.
Registration has closed.

Event Organizer

Natalia Estrada

 

Digital Scholarship Librarian

716-645-1338
