OpenRefine

An introduction to this free, open source tool for working with "messy" data.

Filtering Text for Easier Data Exploration

You can make a large dataset more manageable by filtering it by text. In the sample below, suppose you want to look only at rows with "Electric Power" in the Sector column. To do this, click the dropdown arrow at the top of the Sector column and choose "Text filter".

                    

After you select this, a text box appears in the Facet/Filter panel on the left-hand side of the OpenRefine interface.

When typing in this text box, you can indicate whether the search should be case-sensitive. In addition, you can search using a regular expression. Regular expressions are sequences of characters (including symbols) that describe a pattern, which lets you search longer text and documents for anything that fits that pattern rather than for one exact string.
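To get a feel for how these two options change what matches, here is a minimal Python sketch. It is not OpenRefine code: the `text_filter` function and the sample Sector values are made up for illustration, and OpenRefine uses Java's regular expression syntax, which differs from Python's in some details.

```python
import re

# Hypothetical Sector values, made up for illustration.
sectors = [
    "Electric Power",
    "electric power",
    "Electric Utilities",
    "Transportation",
    "Industrial",
]

def text_filter(values, query, case_sensitive=False, regex=False):
    """Return the values that contain the query (hypothetical helper)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # Without the regex option, treat the query as literal text.
    pattern = query if regex else re.escape(query)
    compiled = re.compile(pattern, flags)
    return [value for value in values if compiled.search(value)]

# Plain text, ignoring case: both spellings of "electric power" match.
plain = text_filter(sectors, "electric power")

# Plain text, case-sensitive: only the exact capitalization matches.
strict = text_filter(sectors, "Electric Power", case_sensitive=True)

# Regular expression: "electric" followed by either "power" or "utilities".
by_regex = text_filter(sectors, r"^electric\s+(power|utilities)$", regex=True)

print(plain)
print(strict)
print(by_regex)
```

The regular expression here catches "Electric Utilities" as well, which a plain search for "electric power" would miss.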

When you type "electric power" in the text box, the data in the center of the interface changes to display only the rows whose Sector cell contains the phrase "electric power." Unless you turn on the case-sensitive option, this also matches "Electric Power."
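The effect on the dataset can be sketched in Python as follows. This is only an illustration of the idea, not how OpenRefine works internally: the rows, the Year column, and the `filter_rows` helper are all hypothetical. The point is that a filter on one column hides or shows whole rows, and the other columns in each matching row stay intact.

```python
# Hypothetical rows, made up for illustration.
rows = [
    {"Sector": "Electric Power", "Year": 2020},
    {"Sector": "Transportation", "Year": 2020},
    {"Sector": "electric power", "Year": 2021},
    {"Sector": "Industrial", "Year": 2021},
]

def filter_rows(rows, column, query):
    """Keep rows whose value in `column` contains `query`, ignoring case."""
    needle = query.lower()
    return [row for row in rows if needle in str(row[column]).lower()]

matching = filter_rows(rows, "Sector", "electric power")

for row in matching:
    print(row["Sector"], row["Year"])
```

The rows that do not match are only left out of the result; the original `rows` list is unchanged, in the same way that removing a filter in OpenRefine brings the full dataset back into view.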