Friday, 22 June 2012

10. Table Joins in SQL (a must)

All of the queries up until this point have been useful with the exception of one major limitation - that is, you've been selecting from only one table at a time with your SELECT statement. It is time to introduce you to one of the most beneficial features of SQL & relational database systems - the "Join". To put it simply, the "Join" makes relational database systems "relational".

Joins allow you to link data from two or more tables together into a single query result--from one single SELECT statement.

A "Join" can be recognized in a SQL SELECT statement if it has more than one table after the FROM keyword.

For example:

SELECT "list-of-columns"

FROM table1,table2

WHERE "search-condition(s)"

Joins can be explained easier by demonstrating what would happen if you worked with one table only, and didn't have the ability to use "joins". This single table database is also sometimes referred to as a "flat table". Let's say you have a one-table database that is used to keep track of all of your customers and what they purchase from your store:


Everytime a new row is inserted into the table, all columns will be be updated, thus resulting in unnecessary "redundant data". For example, every time Wolfgang Schultz purchases something, the following rows will be inserted into the table:

idfirstlastaddress            citystatezipdateitemprice
10982WolfgangSchultz300 N. 1st AveYumaAZ85002032299snowboard45.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002082899snow shovel35.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002091199gloves15.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002100999lantern35.00
10982WolfgangSchultz300 N. 1st AveYumaAZ85002022900tent85.00

An ideal database would have two tables:
  1. One for keeping track of your customers
  2. And the other to keep track of what they purchase:
"Customer_info" table:

"Purchases" table:

Now, whenever a purchase is made from a repeating customer, the 2nd table, "Purchases" only needs to be updated! We've just eliminated useless redundant data, that is, we've just normalized this database!


Data Normalization is a technique of database design that is used to get the tables in your database into at least the third normal form (3NF). Basically, this means that you want to eliminate the redundancy of non-key data when constructing your tables. Each table should only have columns that depend on the primary key.

Notice how each of the tables have a common "cusomer_number" column. This column, which contains the unique customer number will be used to JOIN the two tables. Using the two new tables, let's say you would like to select the customer's name, and items they've purchased. Here is an example of a join statement to accomplish this:

SELECT customer_info.firstname, customer_info.lastname, purchases.item

FROM customer_info, purchases

WHERE customer_info.customer_number = purchases.customer_number;

This particular "Join" is known as an "Inner Join" or "Equijoin". This is the most common type of "Join" that you will see or use.
Notice that each of the colums are always preceeded with the table name and a period. This isn't always required, however, it IS good practice so that you wont confuse which colums go with what tables. It is required if the name column names are the same between the two tables. I recommend preceeding all of your columns with the table names when using joins.

Note: The syntax described above will work with most Database Systems -including the one with this tutorial. However, in the event that this doesn't work with yours, please check your specific database documentation.

Although the above will probably work, here is the ANSI SQL-92 syntax specification for an Inner Join using the preceding statement above that you might want to try:

SELECT customer_info.firstname, customer_info.lastname, purchases.item

FROM customer_info INNER JOIN purchases

ON customer_info.customer_number = purchases.customer_number;

Another example:

SELECT employee_info.employeeid, employee_info.lastname, employee_sales.comission

FROM employee_info, employee_sales

WHERE employee_info.employeeid = employee_sales.employeeid;

This statement will select the employeeid, lastname (from the employee_info table), and the comission value (from the employee_sales table) for all of the rows where the employeeid in the employee_info table matches the employeeid in the employee_sales table.

No comments:

Post a Comment