Join and Merge are two operations for combining data from several files. When merging, you combine several files that share the same structure into a single listing.
When joining, you combine several files that have different data structures but share at least one common field. This common field is used as a key to match the records and generate a single listing that combines the fields from all the files.
Merge and Join examples
Merge example
Let's say you have a File A:
firstName | lastName | jobTitle
Jean | Doe | CEO
Morgan | Stanlet | Marketing Officer
And a File B:
firstName | lastName | jobTitle
Joe | Kanigan | Finance
Marie | Filman | Dev
Then the result of a merge operation would be:
firstName | lastName | jobTitle
Jean | Doe | CEO
Morgan | Stanlet | Marketing Officer
Joe | Kanigan | Finance
Marie | Filman | Dev
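The merge above can be sketched in a few lines of Python. This is only a minimal illustration, assuming each file has already been loaded into a list of dictionaries; the names file_a, file_b and merge are hypothetical and not part of any particular tool.

```python
# Hypothetical in-memory stand-ins for File A and File B (same columns).
file_a = [
    {"firstName": "Jean", "lastName": "Doe", "jobTitle": "CEO"},
    {"firstName": "Morgan", "lastName": "Stanlet", "jobTitle": "Marketing Officer"},
]
file_b = [
    {"firstName": "Joe", "lastName": "Kanigan", "jobTitle": "Finance"},
    {"firstName": "Marie", "lastName": "Filman", "jobTitle": "Dev"},
]


def merge(*tables):
    """Stack the rows of several tables that share the same columns."""
    merged = []
    expected = None  # column names of the first row seen
    for table in tables:
        for row in table:
            columns = list(row)
            if expected is None:
                expected = columns
            elif columns != expected:
                # A merge only makes sense when the structures match.
                raise ValueError(f"unexpected columns: {columns}")
            merged.append(dict(row))
    return merged


merged = merge(file_a, file_b)
print(" | ".join(merged[0]))  # header line
for row in merged:
    print(" | ".join(row.values()))
```

The column check is a design choice for this sketch: it refuses files with a different structure instead of silently producing rows with missing fields.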
Join example
Now, let's look at the join operation. When performing a join, your files can have different structures but must have a common identifier that will be used to combine the data together.
We have a File C:
email | jobTitle
jean.doe@gmail.com | CEO
morgan@gmail.com | Marketing Officer
We want to join it with a File D:
email | City
jean.doe@gmail.com | Paris
morgan@gmail.com | New York
The result of the join operation using the email identifier is:
email | jobTitle | City
jean.doe@gmail.com | CEO | Paris
morgan@gmail.com | Marketing Officer | New York
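The join above can be sketched in Python as well. This is a rough illustration under a few assumptions: the files are already loaded as lists of dictionaries, each email appears at most once in File D, and only records present in both files are kept (an inner join, which the example does not specify). The names file_c, file_d and join are hypothetical.

```python
# Hypothetical in-memory stand-ins for File C and File D.
file_c = [
    {"email": "jean.doe@gmail.com", "jobTitle": "CEO"},
    {"email": "morgan@gmail.com", "jobTitle": "Marketing Officer"},
]
file_d = [
    {"email": "jean.doe@gmail.com", "City": "Paris"},
    {"email": "morgan@gmail.com", "City": "New York"},
]


def join(left, right, key):
    """Combine rows from two tables that share the same value for `key`."""
    # Index the right table by key for fast lookup.
    # Assumption: the key is unique in `right`; a later duplicate would win.
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:  # assumed inner join: skip unmatched rows
            joined.append({**row, **match})
    return joined


joined = join(file_c, file_d, key="email")
print(" | ".join(joined[0]))  # header line
for row in joined:
    print(" | ".join(row.values()))
```

Indexing one file by the key first means each record in the other file is matched with a single dictionary lookup rather than a scan of the whole file.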
A very common data manipulation task is to bring two or more sets of data together based on a common key.