Data exploration

After an initial data inspection, we are ready for further analysis. At this point, it is useful to go back to our design questions and cross-check them against the data.

  • What might you want to learn about yourself from this data set?
  • What could one learn about any other person based on it?
  • Who is going to use it, and how? With which tools? For what reasons?

◕ Use pivot tables to explore data

Let’s assume that we are interested in the relationship between two variables, e.g. the article topic and the reading completion rate, or the reader’s location and their type of device.

First, count the values in a selected column.

  • Select [ ▾ Data ] » Pivot table
  • With Ctrl+A select the values to be analysed (in this case the whole table)
  • In Rows [ Add ] first_variable e.g. location
  • In Values [ Add ] id and [ Summarize by ] COUNTA
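Outside a spreadsheet, the same count can be sketched in a few lines of Python. This is a minimal illustration, not part of the original workflow: the sample rows and the helper name `count_by` are made up, and the column names `id`, `location` and `device` are assumed to match the examples above.

```python
from collections import Counter

# Hypothetical sample rows; the column names (id, location, device) are assumptions.
rows = [
    {"id": 1, "location": "Berlin", "device": "mobile"},
    {"id": 2, "location": "Berlin", "device": "desktop"},
    {"id": 3, "location": "Warsaw", "device": "mobile"},
    {"id": 4, "location": "Berlin", "device": "mobile"},
]


def count_by(rows, key):
    """Count rows per value of `key`, skipping rows with an empty id (similar to COUNTA on id)."""
    return Counter(
        row[key] for row in rows if row.get("id") not in (None, "")
    )


location_counts = count_by(rows, "location")

# Print the values from the most to the least frequent.
for location, n in location_counts.most_common():
    print(location, n)
```

Each distinct value of the chosen column becomes one row of the result, just like the Rows setting in the pivot table editor.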

Now you can extend it with another variable.

  • In Columns [ Add ] second_variable e.g. device
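Adding a second variable turns the simple count into a cross-tabulation. The sketch below shows one possible way to build such a table in plain Python; the sample rows and the helper name `cross_tab` are hypothetical, and the column names are assumed from the examples above.

```python
from collections import Counter

# Hypothetical sample rows; the column names (id, location, device) are assumptions.
rows = [
    {"id": 1, "location": "Berlin", "device": "mobile"},
    {"id": 2, "location": "Berlin", "device": "desktop"},
    {"id": 3, "location": "Warsaw", "device": "mobile"},
    {"id": 4, "location": "Berlin", "device": "mobile"},
]


def cross_tab(rows, row_key, col_key):
    """Count rows for every combination of row_key and col_key values."""
    counts = Counter((row[row_key], row[col_key]) for row in rows)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Combinations that never occur are filled with 0, like empty pivot cells.
    return {
        r: {c: counts.get((r, c), 0) for c in col_labels}
        for r in row_labels
    }


table = cross_tab(rows, "location", "device")

for location, by_device in table.items():
    print(location, by_device)
```

The outer keys play the role of the pivot table Rows and the inner keys the role of its Columns.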

You could also filter out rows that you don’t want to include in your analysis.

  • Use Filters [ Add ] any_variable to filter out blank values (if any)
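Filtering out blank values can be sketched the same way. The example below is an illustration under assumptions: the sample rows and the helper names `is_blank` and `drop_blanks` are made up, and "blank" is taken to mean a missing value, an empty string, or a string containing only whitespace.

```python
# Hypothetical sample rows; two of them have a blank location.
rows = [
    {"id": 1, "location": "Berlin", "device": "mobile"},
    {"id": 2, "location": "", "device": "desktop"},
    {"id": 3, "location": "Warsaw", "device": "mobile"},
    {"id": 4, "location": None, "device": "tablet"},
    {"id": 5, "location": "   ", "device": "mobile"},
]


def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def drop_blanks(rows, key):
    """Keep only the rows whose `key` column is not blank."""
    return [row for row in rows if not is_blank(row.get(key))]


clean_rows = drop_blanks(rows, "location")

print(len(rows), "rows before filtering,", len(clean_rows), "after")
```

Filtering before counting keeps blank cells from showing up as a category of their own in the result.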

Need more detailed instructions? Check out this tutorial.

◕ Count the most common words

Unlike numerical data, text may require more specialised tools. However, the simplest task, counting the words that appear in your data set (e.g. in news titles), is easy to perform.

  • Go to databasic.io and copy-paste the column that you wish to inspect.
  • Check out Voyant Tools to perform a bit more advanced text exploration.

It’s not a precise analytical method, but it gives you a quick overview of the topics in your text data.
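If you prefer to count words yourself, the idea fits in a short Python sketch. It is not how databasic.io or Voyant Tools work internally; the sample titles, the stop-word list and the helper name `count_words` are all made up for illustration, and the simple pattern used here only recognises unaccented Latin letters.

```python
import re
from collections import Counter

# Hypothetical news titles standing in for a column copied from the data set.
titles = [
    "City council approves new budget",
    "New park opens in the city centre",
    "Budget cuts hit city schools",
]

# A tiny, hand-picked list of words to ignore (an assumption, extend as needed).
STOP_WORDS = {"the", "in", "a", "of", "and", "to"}


def count_words(texts, stop_words=frozenset()):
    """Count lower-cased words across all texts, ignoring the stop words."""
    counts = Counter()
    for text in texts:
        # ASCII-only pattern: letters, optionally followed by an apostrophe part.
        words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


word_counts = count_words(titles, STOP_WORDS)

# Show the three most frequent words with their counts.
print(word_counts.most_common(3))
```

Lower-casing the text before counting makes "City" and "city" count as the same word, which is usually what you want for a quick look.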