Joining data across tables #
So far we have focused on a single data table, but in reality more complex information structures are stored in databases consisting of many tables.
See below an example from the most recent Medialab project about Kościuszki street in Katowice.
We have separate tables dedicated to buildings and people. Let’s assume that we want to combine them in order to complement the information about the buildings with the names of their architects.
This is where table joins come in handy.
◕ See example #
buildings
id | address | year | architect |
---|---|---|---|
KAM001 | Kościuszki 2 | 1898 | OSO001 |
KAM002 | Kościuszki 14 | 1894 | OSO003 |
people
id | first_name | last_name |
---|---|---|
OSO001 | Edmund | Trossin |
OSO002 | Jacob | Preuss |
OSO003 | Oswald | Winkler |
If we want to attach an architect name to each of the building, we need to join these tables using a common key – in this case, the person’s id. The result would look like this.
buildings
and people
join
id | address | year | first_name | last_name |
---|---|---|---|---|
KAM001 | Kościuszki 2 | 1898 | Edmund | Trossin |
KAM002 | Kościuszki 14 | 1894 | Oswald | Winkler |
◕ Join data #
Usually such joins are made using SQL language or other data processing tools. However, simple operations can also be successfully performed in a plain spreadsheet. Let’s practice.
- Use
=VLOOKUP
function to recode values in selected columns (e.g.yes/no
to1/0
) - Create a dictionary in a separate sheet
- Use this function:
=VLOOKUP(search_key, dictionary!A1:B2, 2, false)
- Then check out:
=ARRAYFORMULA(IFERROR(VLOOKUP(N2:N,dictionary!A1:B2, 2, false), ))