In my previous article I discussed the fundamentals of SQL and provided an overview of SELECT, the first of the four main SQL commands. In this article, I will cover some beginner/intermediate topics:
- Sub-queries
- Joins
- Table Creation
Sub-queries, awesomeness in a ( )
What if you want to grab data from a select query but you only want that data to be "selected" if an existing condition exists? You could go and write multiple conditions on the WHERE clause, but that would then require parsing through a whole set of data twice or maybe even thrice!
A better option is to limit your data-set through a sub-query. A sub-query is a SELECT statement inside a select statement. An example of this would be to select data from your table if the data in another table meets a certain condition. Doing this through a join can be confusing and difficult. It is much easier to code a sub-query to parse and then feed the data into your main query.
You can nest a sub-query, under the SELECT, FROM, or WHERE command.
A sample sub-query is below:
SELECT Zone_Temp // Selecting the column for the query.
FROM hvac h // from the HVAC tabled aliased as h
// Begin the sub-query
(SELECT space_id // Selecting the column for the sub-query.
FROM spaces //from the spaces table
WHERE space_id = "Patient Room") sq1 // defining the scope of the sub-query and defining the alias.
WHERE sq1.space_id = h.space_id; //applying the "where" filter based on the space_id of the two tables
Joins, for those of us smart enough to separate our Tables
3NF, BCNF, AHHHH.... I still remember my relational database class. EVERYTHING MUST BE IN 3NF. Why? Well you want to avoid integrity errors with over-rights and "stuff" like that. The thing is for those people who preach third normal form (3NF) do they really understand how much of a pain in the butt it is? I mean "supposedly" you are a super sophisticated DBA if your databases are in 3NF and if you can achieve 4NF or BCNF then your über...
The problem with this level of normalization is it makes queries difficult for the average person to write. If you want to take a Room and compare it's temperature with the outside air temp and the occupancy schedule then you need to run three JOINS in a query, but before we get to far down that rabbit hole, we need to describe what a JOIN is.
A JOIN for all intents and purposes is a method that can be used to, drumroll please...., join tables. This my friends is why Primary and Foreign Keys are so critically important in your tables. If I wanted to join the occupancy schedule onto the room schedule then I could write something like
SELECT r.Room_Temp, Occupancy_Start //Selecting the columns
FROM ROOMS r // From the Rooms Table aliased as r
JOIN OCCUPANCY o // Joined to the Occupancy Table aliased as o
ON r.room_id = o.room_id // on the room_id of both the room and occupancy tables when they equal one another.
JOIN // Selecting the variables to JOIN in the sub-query below
(SELECT OA_T //Selecting the columns
FROM OUTDOOR_Variables //From the Outdoor_variables Table
WHERE OA_T > 80) o // Where the OA_T is greater than 80 and aliasing the sub-query as o
ON o.BLDG = r.BLDG // Joining the output of the sub-query on the BLDG column
GROUP BY r.Room_Temp; //Grouping the results.
Now you can see what we did in the comments above. You can see how this query would be a pain for the person with basic SQL skills. So which evil do you choose? Do you normalize your tables to deter anomalies? or do you simply take your chances with the normalization so you can write queries easily. It really comes down to the velocity and volume of your data. If the velocity (speed) and volume (amount) of your data are high then normalization makes sense. However if velocity and volume are low then normalization may not makes sense because you can easily see any anomalies that take place.
Table Creation, for the slightly GUI phobic folks...
Sure there are reasons to use text editors and command-line syntax. According to several folks out there it gives you a level of control and flexibility you don't get in the GUI. I've heard this rumor, but much like Bigfoot or Chupacabra I have't seen the proof in real-life. I accept however, that some of my readers, may be slightly GUI phobic. Well, my friends, that's ok. You're at the right spot! I will show you, using my superior copy and paste skills combined with a semi-legible SQL coding skill set, how to create tables.
The first thing you need to create a table, is.... A database! Yes, what a novel idea that you would need a database to store a table. Now, that I have solved world hunger, I will move on to coding a table, assuming that you my readers know how to create a database.
The syntax for creating a table is as follows:
CREATE TABLE [table_name]
(
column1_name data_type any_conditions,
column2_name data_type any_conditions,
column3_name data_type any_conditions,
column4_name data_type any_conditions,
Constraints [constraint type],
)
Let's create a table following this syntax. In this case, imagine I just pulled my utility data into my environment via a web service and I want to pull this data into my SQL cluster so I can JOIN the two databases and produce queries.
The first thing I need to do is to create a table in which I will populate my utility data. To do this I will create a table called UTILITY_DATA
CREATE TABLE METER_DATA
(
Meter_Id NUMBER(5) not_null unique,
Meter_Desc VARCHAR2(30) not_null,
Meter_Value NUMBER(10) not_null,
CONSTRAINT Meter_Data_PK
PRIMARY KEY Meter_Id,
)
CREATE TABLE UTILITY_DATA
(
Utility_Account NUMBER(7) not_null unique,
Meter_Id NUMBER(5) not_null unique,
Utility_Type VARCHAR2(8) not_null,
Utility_Qty NUMBER(10) not_null,
Utility_Cost NUMBER(10, $99,999,999.00) not null,
CONSTRAINT Utility_Data_PK
PRIMARY KEY Utility_Account,
CONSTRAINT utility_data_fk_meter_data
FOREIGN KEY meter_id REFERENCES meter_data(meter_id),
)
Now, that we have created out table, we can use the other skills we learned in the beginning of this article to achieve our goal of data collection and review from both our BAS System and our Utility Bill System.
SELECT u.Utility_Account, m.meter_value, u.Utility_Cost
FROM Utility_Data u
(SELECT meter_id
FROM Meter_Data
WHERE meter_value > 0) m
ON u.meter_id = m.meter_id
WHERE u.utility_cost > 0
GROUP BY u.utility_account;
In the above query, I create a query that shows the utility account, it's associated meter value, and the account's utility cost. Using a sub-query I limit the results to meter_value's greater than 0 and I then limit that whole data set based on utility_cost's that are greater then 0. I then group these amounts by utility_account.
Conclusion
This is my second article in the SQL series. I am going to write one, maybe two, more articles around SQL before I move off this topic. My hope is that these articles will help you understand how SQL can be applied within your BAS environment and will enable you to produce actionable insight from your BAS data.
How are you using SQL in your environment to create queries that are producing quantifiable business outcomes?
What more would you like to know about SQL?
Let me know in the comments below!