Data inspection & CSV file

A quick file inspection before you start working with the data is a good habit for any analyst. It will help you get a sense of what’s inside the data, discover missing records or spot other errors that may have occurred while writing data to the file.

◕ Get familiar with the CSV file format

Let’s first look at the way our data is stored. This is a good time to get to know the CSV file, which is one of the basic data formats. You can use either your Newslogging file or this example.

  • Go to [ ▾ File ] » Download » Comma separated values
  • Save the CSV file to your computer

You may wonder why anyone keeps data in a plain text file when more sophisticated tools such as spreadsheets or databases are available. The problem is that those tools store data in formats that other programs cannot always read. For this reason, small data sets are often kept in ordinary text files, such as CSV, so that any computer can read them quickly without installing additional applications.

In a nutshell, a CSV file is simply a data table stored as plain text, with values separated (most often) by commas instead of the column boundaries you see in an Excel table.

col_1,col_2,col_3
val,val,val
val,val,val

col_1 col_2 col_3
val val val
val val val
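
To see how a program turns the comma-separated text back into a table, here is a minimal Python sketch using the standard csv module. The sample text mirrors the placeholder table above; the helper name parse_csv_text is hypothetical and only serves the illustration.

```python
import csv
import io

# Placeholder data matching the example table above.
SAMPLE = "col_1,col_2,col_3\nval,val,val\nval,val,val\n"


def parse_csv_text(text):
    """Split CSV text into a header row and a list of data rows."""
    # newline="" leaves line endings untouched so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""))
    rows = list(reader)
    return rows[0], rows[1:]


header, rows = parse_csv_text(SAMPLE)
print(header)     # ['col_1', 'col_2', 'col_3']
print(len(rows))  # 2
```

Using the csv module rather than splitting each line on commas matters as soon as a value itself contains a comma: such values are wrapped in double quotes in the file, and the module keeps them in one piece.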

◕ Inspect your data set

If you don’t have more advanced software at hand, you can use Google Sheets or one of the online tools.

  • In Google Sheets, go to [ ▾ Data ] » Column stats
  • Upload the CSV file to WTFcsv to see what’s going on inside
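
If you prefer to check the file yourself, a rough equivalent of such column statistics can be sketched in a few lines of Python. This is only an illustration, not what Google Sheets or WTFcsv actually do internally; the function name column_stats, the sample values and the file name in the comment are all made up for the example.

```python
import csv
import io
from collections import Counter


def column_stats(csv_file):
    """Count filled, empty and distinct values per column.

    csv_file is an open text file whose first row holds the column names.
    """
    reader = csv.DictReader(csv_file)
    names = reader.fieldnames or []
    filled = Counter()
    empty = Counter()
    distinct = {name: set() for name in names}
    for row in reader:
        for name in names:
            # Rows shorter than the header yield None for missing cells.
            value = (row.get(name) or "").strip()
            if value:
                filled[name] += 1
                distinct[name].add(value)
            else:
                empty[name] += 1
    return {
        name: {
            "filled": filled[name],
            "empty": empty[name],
            "unique": len(distinct[name]),
        }
        for name in names
    }


# Made-up sample with one missing value in the "source" column.
sample = "title,source\nFirst story,Daily\nSecond story,\n"
stats = column_stats(io.StringIO(sample, newline=""))
for name, summary in stats.items():
    print(name, summary)

# For a real file (hypothetical name), open it the same way:
# with open("newslogging.csv", newline="", encoding="utf-8") as f:
#     stats = column_stats(f)
```

A column with many empty cells, or with fewer unique values than you expected, is usually the first hint of missing records or of errors made while the file was written.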

PRO TIP: a quick Google search will turn up plenty of useful utilities for working with text files, e.g. for converting between different data formats.
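
As one example of such a format switch, the sketch below converts CSV text to JSON with nothing but Python's standard library. It assumes the first row holds the column names; the function name csv_to_json and the sample text are hypothetical.

```python
import csv
import io
import json


def csv_to_json(text):
    """Convert CSV text (first row = column names) to a JSON string."""
    # Each data row becomes a dictionary keyed by column name.
    rows = list(csv.DictReader(io.StringIO(text, newline="")))
    return json.dumps(rows, indent=2)


converted = csv_to_json("col_1,col_2\nval,val\n")
print(converted)
```

Note that every value comes out as a string, because a CSV file carries no information about data types.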