Embedded SQL Programming
This is the third in a series of seven tutorials that you can use to help prepare
for the DB2 UDB V8.1 Family Application Development Certification exam
(Exam 703). The material in this tutorial primarily covers the objectives in
Section 3 of the exam, entitled "Embedded SQL programming." You can view
these objectives at: http://www.ibm.com/certify/tests/obj703.shtml.
You do not need a copy of DB2 Universal Database to complete this tutorial.
However, you can download a free trial version of IBM DB2 Universal Database
Enterprise Edition for reference.
Although not all materials discussed in the Family Fundamentals tutorial series
are required to understand the concepts described in this tutorial, you should
have a basic knowledge of:
° DB2 instances
° Databases
° Database objects
° DB2 security
This tutorial is one of the tools that can help you prepare for Exam 703. You
should also take advantage of the Resources identified at the end of this
tutorial for more information.
Likewise, the DB2 Database Manager cannot work directly with high-level
programming language variables. Instead, it must use special variables known
as host variables to move data between an application and a database. (We will
take a closer look at host variables in Declaring host variables.) Host
variables look like any other high-level programming language variable. To set
them apart, they must be defined within a special section known as a declare
section. Also, in order for the SQL precompiler to distinguish host variables from
other text in an SQL statement, all references to host variables must be
preceded by a colon (:).
Static SQL
A static SQL statement is an SQL statement that can be hardcoded in an
application program at development time because information about its
structure and the objects (i.e., tables, columns, and data types) it is intended to
interact with is known in advance. Since the details of a static SQL statement
are known at development time, the work of analyzing the statement and
selecting the optimum data access plan to use to execute the statement is
performed by the DB2 optimizer as part of the development process. Because
their operational form is stored in the database (as a package) and does not
have to be generated at application run time, static SQL statements execute
quickly.
The downside to this approach is that all static SQL statements must be
prepared (in other words, their access plans must be generated and stored in
the database) before they can be executed. Furthermore, static SQL statements
cannot be altered at run time, and each application that uses static SQL must
bind its operational package(s) to every database with which the application will
interact. Additionally, because static SQL applications require prior knowledge
of database objects, changes made to those objects after an application has
been developed can produce undesirable results.
Dynamic SQL
Although static SQL statements are relatively easy to incorporate into an
application, their use is somewhat limited because their format must be known
in advance. Dynamic SQL statements, on the other hand, are much more
flexible because they can be constructed at application run time; information
about a dynamic SQL statement's structure and the objects with which it plans
to interact does not have to be known in advance. Furthermore, because
dynamic SQL statements do not have a precoded, fixed format, the data
object(s) they reference can change each time the statement is executed.
Even though dynamic SQL statements are generally more flexible than static
SQL statements, they are usually more complicated to incorporate into an
application. And because the work of analyzing the statement to select the best
data access plan is performed at application run time (again, by the DB2
optimizer), dynamic SQL statements can take longer to execute than their static
SQL counterparts. (Since dynamic SQL statements can take advantage of the
database statistics available at application run time, there are some cases in
which a dynamic SQL statement will execute faster than an equivalent static
SQL statement, but those are the exception and not the norm.)
Generally, dynamic SQL statements are well suited for applications that interact
with a rapidly changing database or that allow users to define and execute
ad-hoc queries.
Host variables that transfer data to a database are known as input host
variables, while host variables that receive data from a database are known as
output host variables. Regardless of whether a host variable is used for input or
output, its attributes must be appropriate for the context in which it is used.
Therefore, you must define host variables in such a way that their data types
and lengths are compatible with the data types and lengths of the columns they
are intended to work with.
When deciding on the appropriate data type to assign to a host variable, you
should obtain information about the column or special register that the variable
will be associated with and refer to the conversion charts found in the IBM DB2
Universal Database Application Development Guide: Programming Client
Applications documentation (see Resources).
Also, keep in mind that each host variable used in an application must be
assigned a unique name. Duplicate names in the same file are not allowed,
even when the host variables are defined in different declare sections.
A tool known as the Declaration Generator can be used to generate host variable
declarations for the columns of a given table in a database. This tool creates
embedded SQL declaration source code files, which can easily be inserted into
C/C++, Java language, COBOL, and FORTRAN applications. For more
information about this utility, refer to the db2dclgen command in the DB2 UDB
Command Reference product documentation.
...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
char EmployeeNo[7];
char LastName[16];
EXEC SQL END DECLARE SECTION;
...
Again, in order to understand how indicator variables are used, it helps to look
at an example embedded SQL source code fragment. The following code,
written in the C programming language, shows one example of how indicator
variables are defined and used:
...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
char EmployeeNo[7];
double Salary; // Salary - Used If SalaryNI Is
// Zero Or Positive ( >= 0 )
short SalaryNI; // Salary NULL Indicator - Used
// To Determine If Salary
// Value Should Be NULL
EXEC SQL END DECLARE SECTION;
...
Indicator variables can also be used to send null values to a database when an
insert or update operation is performed. When processing INSERT and UPDATE
SQL statements, the DB2 Database Manager examines the value of any
indicator variable provided first. If it contains a negative value, the DB2
Database Manager assigns a null value to the appropriate column, provided null
values are allowed. (If the indicator variable is set to zero or contains a positive
number, or if no indicator variable is used, the DB2 Database Manager assigns
the value stored in the corresponding host variable to the appropriate column
instead.) Thus, the code used in a C/C++ source code file to assign a null value
to a column in a table would look something like:
ValueInd = -1;
EXEC SQL INSERT INTO TAB1 VALUES (:Value :ValueInd);
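The indicator rules just described can be mimicked in plain C without a database. The following is only a sketch of the decision logic, not DB2 code: the function names sends_null and format_salary are hypothetical, and no DB2 API is called. It assumes the convention stated above, that a negative indicator means NULL and zero or a positive number means the host variable's value is used.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: mirrors the rule applied on INSERT/UPDATE.
   Returns 1 if the column would receive NULL, 0 if it would receive
   the value stored in the host variable. */
int sends_null(short indicator)
{
    return indicator < 0;
}

/* Hypothetical helper: mirrors what an application does after a fetch.
   Writes the salary, or "Unknown" when the indicator says the column
   was NULL. Returns the number of characters written (per snprintf). */
int format_salary(char *buf, size_t size, double salary, short salary_ind)
{
    assert(buf != NULL && size > 0);

    if (salary_ind < 0)
        return snprintf(buf, size, "Unknown");

    return snprintf(buf, size, "%.2f", salary);
}
```

In a real embedded SQL program the same test (is the indicator negative?) would be applied to the indicator variable paired with each nullable column.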
The SQLCA data structure contains a collection of elements that are updated by
the DB2 Database Manager each time an SQL statement or a DB2
administrative API function is executed. In order for the DB2 Database Manager
to populate this data structure, it must exist. Therefore, any application that
contains embedded SQL or that calls one or more administrative APIs must
define at least one SQLCA data structure variable. In fact, such an application
will not compile successfully if an SQLCA data structure variable does not exist.
The following table lists the elements that make up an SQLCA data structure
variable.
And now let's look at the elements of the sqlca.sqlwarn array:
Two types of SQLVAR variables are used: base SQLVARs and secondary
SQLVARs. Base SQLVARs contain basic information (such as data type code,
length attribute, column name, host variable address, and indicator variable
address) for result data set columns or host variables. The elements that make
up a base SQLVAR data structure variable are shown in the following table.
On the other hand, secondary SQLVARs contain either the distinct data type
name for distinct data types or the length attribute of the column or host variable
and a pointer to the buffer that contains the actual length of the data for LOB
data types. Secondary SQLVAR entries are only present if the number of
SQLVAR entries is doubled because LOBs or distinct data types are used. If
locators or file reference variables are used to represent LOB data types,
secondary SQLVAR entries are not used.
The information stored in an SQLDA data structure variable, along with the
information stored in any corresponding SQLVAR variables, may be placed
there manually (using the appropriate programming language statements), or
can be generated automatically by executing the DESCRIBE SQL statement.
Both an SQLCA data structure variable and an SQLDA data structure variable
can be created by embedding the appropriate form of the INCLUDE SQL
statement (INCLUDE SQLCA and INCLUDE SQLDA, respectively) within an
embedded SQL source code file.
° Prepare and execute: This approach separates the preparation of the SQL
statement from its actual execution and is typically used when an SQL
statement is to be executed repeatedly. This method is also used when an
application needs advance information about the columns that will exist in
the result data set produced when a SELECT SQL statement is executed.
The SQL statements PREPARE and EXECUTE are used to process dynamic
SQL statements in this manner.
° Execute immediately: This approach combines the preparation and the
execution of an SQL statement into a single step and is typically used when
an SQL statement is to be executed only once. This method is also used
when the application does not need additional information about the result
data set that will be produced, if any, when the SQL statement is executed.
The SQL statement EXECUTE IMMEDIATE is used to process dynamic SQL
statements in this manner.
Dynamic SQL statements that are prepared and executed (using either method)
at run time are not allowed to contain references to host variables. Statements
processed with PREPARE and EXECUTE can, however, contain parameter markers
in place of constants and/or expressions; statement strings passed to EXECUTE
IMMEDIATE cannot.
Parameter markers are represented by the question mark (?) character. They
indicate where in the SQL statement the current value of one or more host
variables or elements of an SQLDA data structure variable are to be substituted
when the statement is executed. (Parameter markers are typically used where a
host variable would be referenced if the SQL statement being executed were
static.) Two types of parameter markers are available: typed and untyped.
A typed parameter marker is one that is specified along with its target data type.
Typed parameter markers have this general form:
CAST(? AS DataType)
This notation does not imply that a function is called, but rather it promises that
the data type of the value replacing the parameter marker at application run
time will either be the data type specified or a data type that can be converted to
the data type specified. For example, consider the following SQL statement:
Here, the value for the LASTNAME column is provided at application run time,
and the data type of that value will be either VARCHAR(12) or a data type that
can be converted to VARCHAR(12).
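Because each parameter marker must eventually be matched with one value, it can be handy to know how many markers a statement string contains before executing it. The following plain C sketch counts them. The function name count_parameter_markers is hypothetical, and the parsing is a simplification: it skips question marks inside single-quoted string literals but does not understand comments or double-quoted identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts '?' parameter markers in an SQL statement
   string, ignoring any that appear inside single-quoted literals.
   A doubled quote ('') inside a literal toggles the flag twice, so the
   scanner correctly remains inside the literal. */
int count_parameter_markers(const char *stmt)
{
    int count = 0;
    int in_literal = 0;
    const char *p;

    assert(stmt != NULL);

    for (p = stmt; *p != '\0'; p++) {
        if (*p == '\'')
            in_literal = !in_literal;
        else if (*p == '?' && !in_literal)
            count++;
    }
    return count;
}
```

An application could compare the returned count with the number of host variables it intends to supply and refuse to run the statement if the two differ.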
When parameter markers are used in embedded SQL applications, values that
are to be substituted for parameter markers placed in an SQL statement must
be provided as additional parameters (in a USING clause) to the EXECUTE SQL
statement that is used to execute the prepared statement. The following
example, written in the C programming language, illustrates how actual values
would be provided for parameter markers that have been coded in a simple
UPDATE SQL statement:
...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
char SQLStmt[80];
char JobType[10];
EXEC SQL END DECLARE SECTION;
...
1. Declare (define) a cursor along with its type (read-only or updatable), and
associate it with the desired query (SELECT or VALUES SQL statement).
This is done by executing the DECLARE CURSOR statement.
2. Open the cursor. This will cause the corresponding query to be executed
and a result data set to be produced. This is done by executing the OPEN
statement.
3. Retrieve (fetch) each row in the result data set, one by one, until an
end-of-data condition occurs. Each time a row is retrieved from the result
data set, the cursor is automatically moved to the next row. This is done by
repeatedly executing the FETCH statement; host variables or an SQLDA
data structure variable are used in conjunction with a FETCH statement to
extract a row of data from a result data set.
4. If appropriate, modify or delete the current row, but only if the cursor is an
updatable cursor. This is done by executing the UPDATE statement or the
DELETE statement.
5. Close the cursor. This action will cause the result data set that was produced
when the corresponding query was executed to be deleted. This is done by
executing the CLOSE statement.
Now that we have seen the steps that must be performed in order to use a
cursor, let's examine how these steps are coded in an application. The following
example, written in the C programming language, illustrates how a cursor would
be used to retrieve the results of a SELECT SQL statement:
...
// Declare The SQL Host Memory Variables
EXEC SQL BEGIN DECLARE SECTION;
char EmployeeNo[7];
char LastName[16];
EXEC SQL END DECLARE SECTION;
...
// Declare A Cursor
EXEC SQL DECLARE C1 CURSOR FOR
SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE JOB = 'DESIGNER';
If you know in advance that only one row of data will be produced in response
to a query, there are two other ways to copy the contents of that row to host
variables within an application program: by executing either the SELECT INTO
statement or the VALUES INTO statement.
Like the SELECT INTO statement, the VALUES INTO statement can be used to
retrieve the data associated with a single record and copy it to one or more host
variables. And, like the SELECT INTO statement, when the VALUES INTO
statement is executed, all data retrieved is stored in a result data set. If this
result data set contains only one record, the first value in that record is copied to
the first host variable specified, the second value is copied to the second host
variable specified, and so on. However, the VALUES INTO statement cannot be
used to construct complex queries in the same way that the SELECT INTO
statement can.
Again, if the result data set produced when the VALUES INTO statement is
executed contains more than one record, the operation will fail and an error will
be generated. (If the result data set produced is empty, a NOT FOUND warning
will be generated.)
Managing transactions
A transaction (also known as a unit of work) is a sequence of one or more SQL
operations grouped together as a single unit, usually within an application
process. Such a unit is called atomic because it is indivisible -- either all of its
work is carried out or none of its work is carried out. A given transaction can
consist of any number of SQL operations, from a single operation to many
hundreds or even thousands, depending upon what is considered a single step
within your business logic.
#include <stdio.h>
#include <string.h>
#include <sql.h>
int main()
{
// Include The SQLCA Data Structure Variable
EXEC SQL INCLUDE SQLCA;
printf("%lf\n", Salary);
else
printf("Unknown\n");
}
}
#include <stdio.h>
#include <string.h>
#include <sql.h>
int main()
{
// Include The SQLCA Data Structure Variable
EXEC SQL INCLUDE SQLCA;
As you might imagine, checking the SQL return code after each SQL statement
is executed can add overhead to an application, especially when an
application contains a large number of SQL statements. However, because
every SQL statement coded in an embedded SQL application source code file
must be processed by the SQL precompiler, it is possible to have the
precompiler automatically generate the source code needed to check SQL
return codes. This is accomplished by embedding one or more forms of the
WHENEVER SQL statement into a source code file.
The WHENEVER statement tells the precompiler to generate source code that
evaluates SQL return codes and branches to a specified label whenever an
error, warning, or out-of-data condition occurs. (If the WHENEVER statement is
not used, the default behavior is to ignore SQL return codes and continue
processing as if no problems have been encountered.) Four forms of the
WHENEVER statement are available, one for each of the three different types of
error/warning conditions for which the WHENEVER statement can be used to
check, and one to turn error checking off:
A source code file can contain any combination of these four forms of the
WHENEVER statement, and the order in which the first three forms appear is
insignificant. However, once any form of the WHENEVER statement is used, the
SQL return codes of all subsequent SQL statements executed will be evaluated
and processed accordingly until the application ends or until another WHENEVER
statement alters this behavior.
...
// Include The SQLCA Data Structure Variable
EXEC SQL INCLUDE SQLCA;
EXIT:
Unfortunately, the code that is generated when the WHENEVER SQL statement
is used relies on GO TO branching instead of call/return interfaces to transfer
control to the appropriate error handling section of an embedded SQL
application. As a result, when control is passed to the source code that is used
to process errors and warnings, the application has no way of knowing where
control came from, nor does it have any way of knowing where it should return
control to after the error or warning has been properly handled. For this reason,
about the only thing an application can do when control is passed to a
WHENEVER statement error handling label is to display the error code generated,
roll back the current transaction, and return control to the operating system.
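To make the GO TO limitation concrete, here is a plain C sketch of the kind of check-and-branch logic described above. It is a conceptual illustration only, not the code the SQL precompiler actually generates. The mock_sqlca structure, the OUTCOME_ names, and both functions are hypothetical stand-ins; the only DB2 conventions assumed are that a negative sqlcode is an error, +100 means no data was found, and other positive values are warnings. (The real SQLWARNING condition also looks at the sqlwarn array, which is left out here.)

```c
#include <assert.h>
#include <stdio.h>

/* Mock stand-in for the real SQLCA; only sqlcode is modeled. */
struct mock_sqlca { long sqlcode; };

enum outcome { OUTCOME_OK, OUTCOME_NOT_FOUND, OUTCOME_WARNING, OUTCOME_ERROR };

/* Maps an SQL return code to the condition a WHENEVER form would test. */
enum outcome classify_sqlcode(long sqlcode)
{
    if (sqlcode < 0)    return OUTCOME_ERROR;
    if (sqlcode == 100) return OUTCOME_NOT_FOUND;
    if (sqlcode > 0)    return OUTCOME_WARNING;
    return OUTCOME_OK;
}

/* Simulates running 'count' statements whose return codes are given in
   'codes', with an error check after each one in the style of
   WHENEVER SQLERROR GOTO. Returns 0 if no statement failed, otherwise
   the failing sqlcode. */
long run_unit_of_work(const long *codes, int count)
{
    struct mock_sqlca sqlca = { 0 };
    int i;

    assert(codes != NULL || count == 0);

    for (i = 0; i < count; i++) {
        sqlca.sqlcode = codes[i];              /* "execute" statement i */
        if (classify_sqlcode(sqlca.sqlcode) == OUTCOME_ERROR)
            goto on_sql_error;                 /* branch, never a call */
    }
    return 0;                                  /* a COMMIT would go here */

on_sql_error:
    /* The handler sees only the SQLCA. Nothing tells it which statement
       branched here or where to resume, so it reports and gives up. */
    fprintf(stderr, "SQL error %ld; rolling back\n", sqlca.sqlcode);
    return sqlca.sqlcode;                      /* a ROLLBACK would go here */
}
```

An application that needs to recover and continue after specific errors is usually better served by testing sqlca.sqlcode explicitly after the statements it cares about.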
Earlier, we saw that the SQL Communications Area (SQLCA) data structure
contains a collection of elements that are updated by the DB2 Database
Manager each time an SQL statement is executed and that one element of that
structure, the sqlcode element, is assigned a value that indicates the success
or failure of the SQL statement executed. The value that gets assigned to the
sqlcode element is actually a coded number. A special administrative API can
be used to translate the coded number into a meaningful description that can
then be displayed to the user. This API is known as the Get Error Message API.
The basic syntax used to call it from a high-level programming source code file
is as follows for C/C++ applications:
And here's the syntax for other high-level programming language applications:
° pBuffer: Identifies a location in memory where the Get Error Message API
is to store any message text retrieved.
° sBufferSize: Identifies the size, in bytes, of the memory storage buffer to
which any message text retrieved should be written.
° sLineWidth: Identifies the maximum number of characters that one line of
Each time the Get Error Message API is called, the value stored in the
sqlcode element of the SQLCA data structure variable provided is used to
locate and retrieve appropriate error message text from a message file that is
shipped with DB2 UDB. The following example, written in the C programming
language, illustrates how the Get Error Message API would typically be used to
obtain and display the message associated with any SQL return code
generated:
...
// Include The SQLCA Data Structure Variable
EXEC SQL INCLUDE SQLCA;
As you can see in this example, when the Get Error Message API is called, it
returns a value that indicates whether or not it executed successfully. In this
case, the return code produced is checked. If an error did occur, a message is
returned to the user explaining why the API failed. If the API was successful, the
message retrieved is returned to the user instead.
SQLSTATEs
DB2 UDB (as well as other relational database products) uses a set of error
message codes known as SQLSTATEs to provide supplementary diagnostic
information for warnings and errors. SQLSTATEs are alphanumeric strings that
are five characters (bytes) in length and have the format ccsss, where cc
indicates the error message class and sss indicates the error message
subclass. Like SQL return code values, SQLSTATE values are written to an
element (the sqlstate element) of an SQLCA data structure variable used
each time an SQL statement is executed. And just as the Get Error Message
API can be used to convert any SQL return code value generated into a
meaningful description, another API -- the Get SQLSTATE Message API -- can
be used to convert an SQLSTATE value into a meaningful description as well.
By including either (or both) of these APIs in your embedded SQL applications,
you can always return meaningful information to the end user whenever error
and/or warning conditions occur.
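The ccsss layout is easy to take apart in code. The sketch below splits a five-character SQLSTATE into its class and subclass. The function name split_sqlstate is hypothetical and nothing here calls DB2. It reads exactly five bytes from its input, which suits the sqlstate element of the SQLCA on the assumption that the element is a five-byte array with no terminating null.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: splits a five-byte SQLSTATE (format ccsss) into a
   two-character class and a three-character subclass, each returned as a
   null-terminated string. Only the first five bytes of 'sqlstate' are
   read. Returns 0 on success, or -1 if any of those bytes is not
   alphanumeric. */
int split_sqlstate(const char *sqlstate, char cls[3], char subclass[4])
{
    int i;

    assert(sqlstate != NULL && cls != NULL && subclass != NULL);

    for (i = 0; i < 5; i++) {
        if (!isalnum((unsigned char)sqlstate[i]))
            return -1;
    }

    memcpy(cls, sqlstate, 2);
    cls[2] = '\0';
    memcpy(subclass, sqlstate + 2, 3);
    subclass[3] = '\0';
    return 0;
}
```

For instance, the SQLSTATE 02000 splits into class 02 and subclass 000, so an application could branch on the class alone to treat every "no data" state the same way.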
The following illustration outlines the basic embedded SQL source code
file-to-executable application conversion process when deferred binding is
used. (We'll discuss deferred binding in more detail in Creating and binding
packages.)
themselves are commented out and DB2-specific function calls are stored in
their place.) At the same time, a corresponding package that contains (among
other things) the access plans that are to be used to process each static SQL
statement embedded in the source code file is also produced. (Access plans
contain optimized information that the DB2 Database Manager uses to execute
SQL statements. Access plans for static SQL statements are produced at
precompile time, while access plans for dynamic SQL statements are produced
at application run time.) Packages produced by the SQL precompiler can be
stored in the database being used by the precompiler as they are generated, or
they can be written to an external bind file and bound to any valid DB2 UDB
database later (the process of storing this package in the appropriate database
is known as binding). By default, packages are automatically bound to the
database used for precompiling during the precompile process. Unless
otherwise specified, the SQL precompiler is also responsible for verifying that all
database objects (such as tables and columns) that have been referenced in
static SQL statements actually exist, and that all application data types used are
compatible with their database counterparts. (That's why you need a database
connection in order to use the SQL precompiler.)
Once a source code file containing embedded SQL statements has been
processed by the SQL precompiler, the high-level programming language
source code file that is produced -- and any other source code files used -- must
be compiled by a high-level programming language compiler. This compiler is
responsible for converting source code files into object modules that the linker
can use to create an executable program.
When all of the source code files needed to build an application have been
compiled successfully, the resulting object modules can be provided as input to
the linker. The linker combines object modules, high-level programming
language libraries, and DB2 UDB libraries to produce an executable application.
In most cases, this executable application exists as an executable file.
However, it can also exist as a shared library or a dynamic-link library (DLL) that
is loaded and executed by other executable applications.
complete the binding process at a later point in time, using a tool known as the
DB2 Binder (or simply the Binder). This is referred to as deferred binding, and is
preferable if you want to:
° Defer binding until you have an application program that compiles and links
successfully.
° Create a package under a different schema or under multiple schemas.
° Run an application against a database using different options (isolation level,
Explain on/off, etc.). By deferring the bind process, you can dynamically
change things like the isolation level used without having to rebuild the
application.
° Run an application against several different databases. By deferring the bind
process, you can build your program once and bind it to any number of
appropriate databases. Otherwise, you will have to rebuild the entire
application each time you want to run it against a new database.
° Run an application against a database that has been duplicated on several
different machines. By deferring the bind process, you can dynamically
create your application database on each machine, and then bind your
program to the newly created database (possibly as part of your application's
installation process).
Section 6. Conclusion
Summary
This tutorial introduced you to embedded SQL programming and walked you
through the basic steps used to construct an embedded SQL application. At this
point, you should know the difference between static SQL and dynamic SQL,
and you should know how both types of SQL statements can be embedded in a
high-level programming language source code file.
You should know how to declare and use host and indicator variables to move
data between an application and a database, and you should be able to analyze
the contents of an SQLCA data structure variable to determine whether an
embedded SQL statement executed as expected. Furthermore, you should
know how to establish a database connection, how to retrieve and process any
results produced, and how to terminate transactions.
Finally, you should be familiar with the steps used to convert a source code file
containing embedded SQL statements into an executable application.
Resources
For more information on DB2 Universal Database application development:
For more information on the DB2 UDB V8.1 Family Application Development
Certification exam (Exam 703):
° DB2 Universal Database v8.1 Certification Exam 703 Study Guide, Sanders,
Roger E., International Business Machines Corporation, 2004.
° DB2 Universal Database v8 Application Development Certification Guide,
Martineau, David and others, International Business Machines Corporation,
2003.
As mentioned earlier, this tutorial is just one tutorial in a series of seven to help
you prepare for the DB2 UDB V8.1 Family Application Development
Certification exam (Exam 703). The complete list of all tutorials in this series is
provided below:
Before you take the certification exam (DB2 UDB V8.1 Application
Development, Exam 703) for which this tutorial was created to help you
prepare, you should have already taken and passed the DB2 V8.1 Family
Fundamentals certification exam (Exam 700). Use the DB2 V8.1 Family
Fundamentals certification prep tutorial series to prepare for that exam. A set of
six tutorials covers the following topics:
° DB2 planning
° DB2 security
° Accessing DB2 UDB data
° Working with DB2 UDB data
° Working with DB2 UDB objects
° Data concurrency
Use the DB2 V8.1 Database Administration certification prep tutorial series to
prepare for the DB2 UDB V8.1 for Linux, UNIX and Windows Database
Administration certification exam (Exam 701). A set of six tutorials covers the
following topics:
° Server management
° Data placement
° Database access
° Monitoring DB2 activity
° DB2 utilities
° Backup and recovery
Colophon
This tutorial was written entirely in XML, using the developerWorks Toot-O-Matic tutorial
generator. The open source Toot-O-Matic tool is an XSLT stylesheet and several XSLT
extension functions that convert an XML file into a number of HTML pages, a zip file, JPEG
heading graphics, and two PDF files. Our ability to generate multiple text and binary formats
from a single source file illustrates the power and flexibility of XML. (It also saves our
production team a great deal of time and effort.)