Saturday, January 14, 2012

The Anatomy of a SQL Query

 By Kirk Harrington, Partner, Statistical Analyst Effectiveness Group (SAEG)

A useful programming code to understand as an analyst is SQL.  Many interfaces to data ask you to write SQL to query and extract data.  There are different types of SQL which are used on different platforms (a topic not discussed in this blog), however the general format is similar across all formats.

The below represent very simple diagrams that to represent the 'anatomy' of a SQL query (three basic structures shown for you).  If you understand the basic structure, you know how to build a query and how to break it down and troubleshoot it for problems.

In addition to this blog, SAEG can make available to you Basic, Intermediate, and Intermediate II SQL lessons that anyone can self teach themselves.  Please just SAEG at enduranalyst@yahoo.com and we can send you those via google docs.  If you would like a human-taught lesson, SAEG can arrange that for a small fee.   In my mind, you don't need a special class or expensive lessons from a vendor to learn SQL.  Its pretty easy to learn and can give you an edge as an analyst if you know it.

The Anatomy of a SQL Query, 3 typical structures

1) Just a simple select, one table

SELECT
<field(s)> or<*>

FROM
<table(s)>


WHERE
<filter statement if desired>

GROUP BY <used in only certain cases, i.e. when doing a sum or just grouping>
ORDER BY <used when you want to 'sort' by a given field>
 
2)  Structure using an 'explicit' join, two tables

SELECT
<field(s)> or <*>

FROM
<table(s)>

<name your joins, i.e. LEFT OUTER JOIN, INNER JOIN, FULL JOIN>


ON <field on first table = field on join to table>

WHERE
<filter statement if desired>

GROUP BY <used in only certain cases, i.e. when doing a sum or just grouping>
ORDER BY <used when you want to 'sort' by a given field>

3)  Structure using an 'implict' join (no naming of joins required), two tables

SELECT
<field(s)> or <*>

FROM
<table(s)>

WHERE
<field on first table = field on join to table>

AND
<filter statement if desired>

GROUP BY <used in only certain cases, i.e. when doing a sum or just grouping>

ORDER BY <used when you want to 'sort' by a given field>~~~~

Other notes:

The * tells the query to bring back every field that is on the given table you select

When joining two or more tables, you use these things called 'aliases' that 'identify' a given field and table where its from.  For more on that, please ask for our SQL lessons to learn more.  Many thanks!

Examples:

Explicit join

Select
id_table.id_code,
id_table.user_name,
volumes_table.volume_produced,
volumes_table.geo_location

from
id_table

left outer join

volumes_table

on id_table.id_code=volumes.id_code
where geo_location='Cleveland'

Implicit join

Select
id_table.id_code,
id_table.user_name,
volumes_table.volume_produced,
volumes_table.geo_location

from

id_code, geo_location
where id_table.id_code=volumes_table.id_code and volumes_table.geo_location='Cleveland'

Notes:
In the above examples, it is assumed that there is an id_code on the volumes table.  Also, notice how the table names are used as a prefix to the fields called for in the joins.  I can use aliases as well instead of the table names (if I wanted to use less code), for example:

Explicit with aliases

Select
a.id_code,
a.user_name,
b.volume_produced,
b.geo_location

from
id_table a

left outer join

volumes_table b

on a.id_code=b.id_code
where geo_location='Cleveland'

Implicit join with aliases

Select
a.id_code,
a.user_name,
b.volume_produced,
b.geo_location

from

id_code a, geo_location b
where a.id_code=b.id_code and a.geo_location='Cleveland'

An Effective Analyst thought...

When you're talking to someone about your analysis, how you ask if someone is understanding something can make you come off in either an 'I-know-more-than-you' or 'I-am-equal-to-you' attitude.  Here's a SAEG suggestion to help with this....

Instead of "Do you understand?", try "Am I making sense?"

No comments:

Post a Comment