SQL and PL/SQL by Example
A number of our established “powerful” programming language features, even beloved ones, could very well turn out to belong rather to “the problem set” than to “the solution set.”
-Dutch computer scientist Edsger Dijkstra, advocate of structured programming and winner of the 1972 Turing Award, in “Correctness Concerns and, Among Other Things, Why They Are Resented”
All database activity is conducted in SQL, and therefore database administrators need to be intimately
familiar with it. Figure 2-1 illustrates that the greatest potential for performance improvement usually lies within the software application, not within the database where the application stores its data or within the physical infrastructure where the database is housed.
A common example of performance problems caused by applications is that the applications aren’t
designed to take advantage of indexing schemes. As an example of a performance problem that you might encounter in SQL, consider the query in Listing 2-1, which retrieves the names of employees with first_name “Steven” and last_name “King”. An index exists on the first_name and last_name columns of the employees table, but the query can’t use the index because it uses the upper function in the restriction. Because Oracle Database can’t use the index, it has to retrieve and examine every single record in the employees table.
Listing 2-1. An Inefficient SQL Query
SELECT first_name, last_name
FROM employees
WHERE upper(first_name) = 'STEVEN'
AND upper(last_name) = 'KING';
Another source of performance problems is the temptation to use the Oracle Database engine as a
calculator to perform arithmetic calculations and string manipulations. This is a highly inefficient use of SQL and is better done using math and string libraries instead. However, it is usually fine to issue a simple query against dual such as in Listing 2-2 that executes just one time.
Listing 2-2. Finding the Day of the Week
select to_char(sysdate, 'DAY') from dual;
An equally important reason why database administrators need to understand SQL is that all database
administration activities such as database maintenance and user management are also conducted in SQL. It should come as no surprise, therefore, that the Oracle Database 12c SQL Language Reference is almost 2,000 pages—compare this with the 20 or so pages in this chapter. Fortunately, you can go online and search the SQL manual or download the electronic version free of charge.3 You can purchase a printed copy if you like having a mountain of paper on your desk.
A third reason why database administrators need to understand SQL is that it has deficiencies that must be guarded against. These deficiencies include redundancy, problems introduced by nullable data items, and the absence of prohibitions on duplicate data records. I return to this subject later in this chapter.
TRUE STORY
One day, an angry software developer demanded that we find out why Oracle was not responding to “simple queries.” We found that he had submitted a query that indiscriminately joined seven tables. His query was of the form SELECT COUNT(*) ... FROM Table#1, Table#2, Table#3, Table#4, Table#5, Table#6, Table#7.4 The number of rows produced by such a query equals the product of the number of rows in each table; if each specified table contained 1,000 rows, the query would produce 100 trillion rows.
When we asked the software developer why he hadn’t specified any joining criteria, he said that he first wanted to determine whether Oracle could handle a “simple” query before submitting a “complex” query!
It happened to me—it could happen to you!
Railroad Diagrams
SQL statements have many optional clauses, so the Oracle reference manuals use railroad diagrams as a visual aid. Figure 2-2 is an example of a railroad diagram for a hypothetical ROW command.
Start at the left of the diagram, and take any path the diagram allows you to take. Along the way,
you encounter mandatory words and clauses as well as optional words and clauses and even subdiagrams. The diagram indicates that the ROW command can take the forms indicated in Listing 2-3.
Listing 2-3. Examples of ROW Commands
ROW YOUR BOAT
ROW, ROW YOUR BOAT
ROW, ROW, ROW YOUR CORACLE
ROW, ROW, ROW, ROW YOUR PIROGUE
ROW, ROW, ROW, ROW, ROW YOUR LARGE CORACLE
Let’s look at some railroad diagrams for a simplified version of the SELECT statement. The SELECT
statement starts with the keyword SELECT and is used to retrieve data from a relational database. What you see here is a very small subset of the complete syntax, but it’s powerful enough for many purposes.
The diagrams in Figures 2-3 through 2-8 show that the keyword SELECT is followed a number of clauses: the mandatory SELECT_list and FROM_clause and the optional WHERE_clause, GROUP_BY_clause, and
ORDER_BY_clause. Separate railroad diagrams are also shown for each of the clauses.
As an example, let’s generate a report of the total salary in each department, sorted by descending order of average salary. Let’s assemble the necessary clauses.
The SELECT_list clause is a comma-separated list of column names or expressions in which you’re
interested. Here’s what you need. The upper function produces the uppercase version of a character string, and the avg function produces the average of values in a group:
SELECT upper(department_name),
AVG(salary)
The WHERE_clause portion of the statement is a list of tables that must be joined. You need the departments table and the employees table. You give each table a short alias that will come in handy in other clauses:
FROM departments d,
employees e
The optional GROUP_BY_clause portion of the statement tells Oracle Database that you apply aggregate functions such as avg to groups of records. In this case, you need to group by upper(department_name):
GROUP BY upper(department_name)
Finally, the ORDER_BY_clause portion of the statement specifies how you want the report sorted. In this case, you need to sort by descending values of AVG(salary):
ORDER BY AVG(salary) DESC;
Let’s put this all together:
SELECT upper(department_name),
AVG(salary)
FROM departments d,
employees e
WHERE e.department_id = d.department_id
GROUP BY upper(department_name)
ORDER BY AVG(salary) DESC;
And here is the result:
UPPER(DEPARTMENT_NAME) AVG(SALARY)
------------------------------ -----------
EXECUTIVE 19333.3333
ACCOUNTING 10150
PUBLIC RELATIONS 10000
MARKETING 9500
SALES 8955.88235
FINANCE 8600
HUMAN RESOURCES 6500
IT 5760
ADMINISTRATION 4400
PURCHASING 4150
SHIPPING 3475.55556
Notice the formatting of the results. Formatting instructions aren’t part of the SQL query; the formatting is left to the application program—SQL*Plus in this case—that sent the SQL query to the Oracle Database engine for processing.
Now take a minute and check out the complete syntax of the SELECT statement in the SQL reference
manual. Notice the subquery diagram in particular and how it references itself; just before the ORDER BY clause, you can optionally specify a UNION, INTERSECT, or MINUS clause followed by another subquery. Also observe that the subquery diagram is referenced by other diagrams. For example:
• If you check the Subquery Factoring Clause diagram, you see that it too refers to the subquery diagram.
• If you check the Table Reference subdiagram, you see that a table reference can include a subquery. This kind of subquery is called an inline view.
• Subqueries that produce a single value can be used any place where a single value is indicated. This kind of subquery is called a scalar subquery. For example, scalar subqueries can be used in a SELECT list and in a WHERE clause.
No comments:
Post a Comment