Dynamic SQL in a Dynamic World -

The Sequel

Introduction

Last year, I presented on the topic of dynamic SQL.  I never expected that:

·         this feature would be completely unfamiliar to so many people.

·         there would be so many ways of misusing it.

Due to length limitations, this paper will not include syntax discussions (it takes about 30 pages in my book PL/SQL for Dummies to introduce all of the core concepts). Instead, it presents some real-world cases and examples. The goal is to explain what really happens when you fire an EXECUTE IMMEDIATE command.

1. Security Issues

A currently hot topic in the Oracle world is that of database security. From the point of view of dynamic SQL, there are two questions to address:

1.        What privileges do you need to use dynamic SQL?

2.        How can you guard against misuse of dynamic SQL?

 

A. Running Dynamic SQL

The following code shows how to create a user with DBA privileges and a few basic routines:

create user odtug identified by odtug

default tablespace users

temporary tablespace temp;

 

grant dba to odtug;

 

create or replace procedure odtug.p_makeTable (i_name_tx varchar2) is

    v_sql_tx varchar2(256):= 'create table ' ||i_name_tx||' (a_tx varchar2(256))';

begin

  dbms_output.put_line('Fired:'||v_sql_tx);

  execute immediate v_sql_tx;

end;

 

create or replace function odtug.f_getCount_nr (i_user_tx varchar2, i_table_tx varchar2) return number

is

    v_out_nr number;

    v_sql_tx varchar2(256):='select count(*) from ' ||i_user_tx||'.'||i_table_tx;

begin

  dbms_output.put_line('Fired:'||v_sql_tx);

  execute immediate v_sql_tx into v_out_nr;

  return v_out_nr;

end;

 

Next, you can connect as ODTUG and try to create a table using dynamic SQL and get a row count from SCOTT.EMP:

SQL> exec p_makeTable('t_odtug');

Fired:create table t_odtug (a_tx varchar2(256))

BEGIN p_makeTable('t_odtug'); END;

 

*

ERROR at line 1:

ORA-01031: insufficient privileges

ORA-06512: at "ODTUG.P_MAKETABLE", line 5

ORA-06512: at line 1

 

SQL> select f_getCount_nr('SCOTT','EMP') from dual;

Fired:select count(*) from SCOTT.EMP

select f_getCount_nr('SCOTT','EMP') from dual

       *

ERROR at line 1:

ORA-00942: table or view does not exist

ORA-06512: at "ODTUG.F_GETCOUNT_NR", line 7

SQL>

 

Strangely enough, both calls failed with errors that suggest some kind of privilege problem. But the user has the DBA role. What can the developer do? Take the same statements and fire them manually (expecting some kind of Oracle error):

 

SQL> create table t_odtug (a_tx varchar2(256));

Table created.

 

SQL> select count(*) from SCOTT.EMP;

  COUNT(*)

----------

        14

SQL>

 

Surprise! A direct call works, while the same code wrapped in dynamic SQL fails. When this happens, many developers and DBAs declare dynamic SQL unusable and spread that message to everyone else in their companies. This is unfortunate, since one extra step would make everything work, as the Oracle manuals explain. Inside definer's-rights stored program units, roles are disabled, so all privileges (both system and object) required by code implemented using dynamic SQL must be granted explicitly to the owner; they cannot come from a role. To test this, connect as any valid administrative user and add the following grants to the ODTUG user:

 

SQL> grant create table to odtug;

Grant succeeded.

SQL> grant select on scott.emp to odtug;

Grant succeeded.

SQL>

 

Now rerunning the initial procedure and function provides completely different results. Everything starts to work exactly as expected:

 

SQL> exec p_makeTable('t_odtug01');

Fired:create table t_odtug01 (a_tx varchar2(256))

PL/SQL procedure successfully completed.

 

SQL> select f_getCount_nr('SCOTT','EMP') from dual;

Fired:select count(*) from SCOTT.EMP

F_GETCOUNT_NR('SCOTT','EMP')

----------------------------

                          14

SQL>

 

To summarize, if you are planning to use dynamic SQL in production code, first check with your DBA team about how much effort it would take to grant the required privileges explicitly. Otherwise, there may be a lot of confusion during the various stages of your application development project.

 

B. Running Your Dynamic SQL

The question of code injection (SQL injection) has been extensively discussed throughout the Oracle literature. Any dynamic SQL operation requires two kinds of input to be passed: structural elements (tables, columns, SQL clauses) and/or the data itself. This means that security should be interpreted in the following ways:

·         You can only pass allowed structural elements.

·         You cannot pass structural elements instead of data.

 

To satisfy these requirements, follow just two rules to keep your system safe:

1. If the end user's input is simply data (e.g. column values), these values must be passed to the dynamic SQL using bind variables. Since bind variables cannot affect the structure of the query, whatever users type will not cause any problems.

2. If the end user's input will influence the structure of the code, the available options should be based only on the repository elements without the ability to alter the repository itself. That is the sole prerogative of administrators or power users. End users should only select from a previously populated list of functions.

 

The first rule is significantly easier to illustrate. The way in which Oracle processes bind variables automatically prevents their misuse. Even when passing something like “null OR 1=1” instead of a last name, Oracle just looks for the last name “null OR 1=1” instead of returning the whole table, because the input is interpreted as a string and not as part of the query, as shown here:

SQL> declare

  2    v_tx varchar2(256):='null OR 1=1';

  3    v_count_nr number:=0;

  4  begin

  5    execute immediate 'select count(*) from emp where ename = :1'

  6    into v_count_nr using v_tx ;

  7    dbms_output.put_line('Bind: '||v_count_nr);

  8 

  9    execute immediate 'select count(*) from emp where ename = '||

 10    v_tx into v_count_nr;

 11    dbms_output.put_line('Inject: '||v_count_nr);   

 12  end;

 13  /

Bind: 0

Inject: 14

PL/SQL procedure successfully completed.

SQL>

 

The repository concept requires more explanation. A number of SQL experts were asked to help implement a very advanced business rules-based system. The problem was not the rules engine itself (even though it was very complicated and involved hundreds of millions of records), but the whole deployment cycle. The smallest change to the front-end required a significant effort from multiple resources. Another area of concern was that end users were not exactly sure what modules were needed in addition to the rules engine; the requirement amounted to roughly one extra small screen every few hours.

 

In four man-days, a somewhat limited but generic module was built to satisfy the following requirements:

1.        User requests are represented (wrapped if needed) as functions with:

                a. A generic name (header of the pop-up screen)

                b. Up to 5 parameters, each including:

                        i. Header

                        ii. Mandatory/not mandatory identification

                        iii. Data type (NUMBER/DATE/TEXT/LOV)

                        iv. Optional conversion expression (e.g. default date format in the UI since everything on the web is text-based)

                        v. Value list name (for LOV datatypes)

                c. Return CLOB

2.        The response is already pre-formatted as HTML so the front-end just needs to present the output on the screen.

3.        All definitions are stored in a repository table accessible only by administrators and not visible to the end users.

 

The final order of actions was as follows:

1.        The end user sees the list of possible options (display names from the repository tables) and selects one.

2.        The front-end utility reads the repository and builds a pop-up screen on-the-fly with appropriate input fields and mandatory indicators. If the data type of the input field is LOV, the utility requests the generic LOV mechanism from the appropriate list of ID/display pairs.

3.        The user enters whatever is needed and presses “SUBMIT.” The front-end calls a special wrapper procedure.

4.        The wrapper procedure builds a real function call, passes the required parameters, and returns whatever result is generated.

5.        The front-end displays the result.

 

Now everybody was happy since it only took about five minutes from the moment any new procedure was ready to the moment when it was accessible from the front-end.

From a code injection point of view, everything was completely safe as well. Although the core function uses dynamic SQL, all of the structural elements are declared in the repository table. This means that one side of security protection comes down to a standard audit of repository activities (insert/update/delete). On the other side, end users can only tell the system something like “execute routine #N” (where N is selected from the value list), but they have no control over the transformation of #N into SCOTT.DO_SOMETHING. This separation of roles between the person who defines what can be executed and the person who actually uses the functionality is the most important aspect of the repository-based approach. It allows the system to be flexible without creating a security breach.
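The repository approach described above can be reduced to a few lines of code. The following is a minimal sketch, not the production module: the table T_ROUTINE_REPO, the wrapper name, and the single-parameter signature are all hypothetical simplifications (the real system supported up to five typed parameters).

```sql
-- Hypothetical repository table: writable only by administrators.
create table t_routine_repo (
    id_nr       number primary key,
    display_tx  varchar2(256),   -- name shown to the end user
    function_tx varchar2(256)    -- fully qualified function to call
);

-- Wrapper: the end user supplies only a repository ID and data values.
create or replace function f_runRoutine_cl
    (i_id_nr number, i_param1_tx varchar2)
return clob is
    v_function_tx t_routine_repo.function_tx%type;
    v_out_cl      clob;
begin
    -- The structural element (the function name) comes from the
    -- repository, never from user input; the value is a bind variable.
    select function_tx
      into v_function_tx
      from t_routine_repo
     where id_nr = i_id_nr;

    execute immediate
        'begin :1 := '||v_function_tx||'(:2); end;'
        using out v_out_cl, in i_param1_tx;
    return v_out_cl;
end;
```

Because users can pass only an ID and data values, they never control the transformation of that ID into an actual schema.function reference.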

 

2. Object Dependencies

Another problem area, especially for beginners, is object dependency. Since dynamic SQL is executed at runtime, there are some pros and cons associated with its use.

A. Con #1: No dependency path to follow

Oracle does not resolve strings for you, so the USER_DEPENDENCIES data dictionary view is useless for dynamic SQL. As of this writing, the only effective solution is the following two-step process:

1.        Populate repositories with required information, structured similarly to that of the Oracle data dictionary.

2.        Generate from the repository using a straightforward, transparent mechanism (so it is clear what becomes what).

 

Now the whole dependency problem is reduced to comparing the Oracle data dictionary with your own. This part is easy. You also get a direct syntax check of the generated code: since there is one and only one way of transforming the repository into code, everything either works or it doesn’t.
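That comparison can be as simple as a set operation against the data dictionary. A minimal sketch, assuming a hypothetical repository table T_REPO_TABLE that lists the tables your generated code references (names stored in uppercase, matching the dictionary):

```sql
-- Tables referenced in the repository but missing from the schema:
select table_tx
  from t_repo_table
minus
select table_name
  from user_tables;
```

Any row returned by this query is a dependency that dynamic SQL would only reveal at runtime.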

 

B. Con #2: No way to determine exactly what will be executed

Since many parameters are either user-entered or constructed on the fly, it is very difficult to predict possible syntax errors without actually encountering them. This problem is much more difficult to solve, but the concept of “samplers” looks somewhat promising (if a repository-based approach is not for you):

·         Generate all possible (or as many as possible) permutations of the code to be executed and create PL/SQL modules with that code. This approach can help to identify any initial code problems.

·         Record all dependencies and keep a simple module that references the same set of objects.

·         If the sample becomes invalid, this is an indication of code problems.
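A sampler can be as simple as a throwaway procedure that makes static references to the same objects as your dynamic code. The following sketch assumes a hypothetical monthly table T_SALES_200801:

```sql
-- Sampler: references the same objects as the dynamic code.
-- If T_SALES_200801 is dropped or altered incompatibly, this
-- procedure goes INVALID in USER_OBJECTS, flagging the problem
-- before the dynamic SQL fails at runtime.
create or replace procedure p_sampler is
    v_count_nr number;
begin
    select count(*) into v_count_nr from t_sales_200801;
end;
```

A nightly check such as `select object_name from user_objects where status = 'INVALID'` then surfaces broken samplers before end users encounter the corresponding runtime errors.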

 

C.  Pro #1: Reference objects that may not be in the database (yet)

If you have a separate table (and load procedure) for each month, it is easier to create a universal caller than to write a huge CASE statement. An example is shown here:

 

create or replace procedure p_runMonthly(i_dt date)

is

    v_sql_tx varchar2(256):='begin p_run_'||to_char(i_dt,'YYYYMM')||'; end;';

begin

    execute immediate v_sql_tx;

end;

 

D. Pro #2: “Back door” to resolve logical dead loops or hide existing dependencies

Sometimes having dependencies is not a good thing for the following reasons:

1.        Code generators – If you wrap a call to the generated modules into dynamic SQL, you can refresh these modules without invalidating all dependencies. This can be critical when using JDBC connections to the database, since they are significantly less forgiving (in comparison to Net8 calls).

2.        Remote objects – Since your maintenance cycle may not be the same as that of other teams involved in your project, there is a chance that you could reference objects via database links while they are inaccessible or invalid. In that case, the only safe way out is to wrap all remote calls into dynamic SQL.

3.        Logical loops – Even though it is very difficult, you could create a case where a chain of object references forms a cycle. In that case, the only solution is to make one step of the chain a dynamic SQL module. An example from a sales/contacts system is shown here:

o        Original design

-          There is a materialized view MV_PEOPLE based on the package function VRAP$PKG.F_PEOPLE_TT that returns an object collection.

-          There is a materialized view MV_PEOPLE_ASSIGN that is based on the package function VRAP$PKG.F_PEOPLEASSIGN_TT that returns an object collection. Inside of this function, a materialized view MV_PEOPLE is used to get the current phone numbers of all assigned representatives.

-          There is a module, called nightly via a job, that fires DBMS_MVIEW.REFRESH for both materialized views in order.

o        Problem

-          The job fails with “ORA-04068: Existing state of packages has been discarded,” because a refresh of the first materialized view is explicitly used in the core function of the second one.

o        Solution

- In VRAP$PKG.F_PEOPLEASSIGN_TT, convert all SQL queries against MV_PEOPLE to dynamic SQL.

 

3. Bulk Operations

Dynamic SQL can be extremely handy in conjunction with the major performance booster of PL/SQL code: bulk operations on collections. The FORALL operator allows you to fire a set of dynamic DML statements at once, but only if you place a parameter in the USING clause. Unfortunately, this does not work for structural elements built on the fly, as shown here:

 

forall i in dept.first..dept.last

 execute immediate 'delete from emp where deptno=:1' using dept(i); -- works

 -- execute immediate 'drop table t_'||dept(i); -- doesn’t work

 

But the major advantage of this approach is the following statement type:

EXECUTE IMMEDIATE <string> BULK COLLECT INTO <collection>

 

Even though dynamic SQL does not support PL/SQL-only datatypes as bind variables, you can use RECORD types as the output of a dynamic query. This feature allows you to build a query of any complexity on the fly and retrieve its results for further processing in the fastest possible way.
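For example, a collection of PL/SQL records can serve directly as the fetch target. A minimal sketch against the standard SCOTT.DEPT table (the record and collection type names here are illustrative):

```sql
declare
    type dept_rec is record (
        deptno number,
        dname  varchar2(14));
    type dept_tt is table of dept_rec;
    v_dept_tt dept_tt;
begin
    -- The query text is built at runtime, but the results land in a
    -- collection of PL/SQL records in a single bulk fetch.
    execute immediate 'select deptno, dname from dept'
        bulk collect into v_dept_tt;
    dbms_output.put_line('Fetched: '||v_dept_tt.count||' rows');
end;
```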

All user-defined SQL datatypes are perfectly valid in all cases, but there is one syntax issue that many people forget: the object constructor must appear INSIDE of the query, as shown here:

 

create or replace function f_getlov_nt

(i_table_tx varchar2,i_id_tx varchar2,i_display_tx varchar2,i_order_tx varchar2)

return lov_nt is

    v_out_nt lov_nt := lov_nt();

begin

    execute immediate

      'select lov_oty('

                  ||i_id_tx||','||i_display_tx||')'||

      ' from '||i_table_tx||

      ' order by '||i_order_tx

    bulk collect into v_out_nt;   

    return v_out_nt;

end;

 

The explanation is simple. Since you are loading the results into a single collection variable, the query must return a collection of homogeneous objects, not a set of columns.
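For completeness, the SQL types assumed by f_getlov_nt would look something like the following. These definitions are a reconstruction for illustration; only the names lov_oty and lov_nt come from the example above.

```sql
-- These must be SQL types (created at schema level, not declared in
-- PL/SQL), so that the dynamic query can invoke the constructor.
create or replace type lov_oty as object (
    id_tx      varchar2(256),
    display_tx varchar2(256));
/
create or replace type lov_nt as table of lov_oty;
/
```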

 

4. Transformation of Cursors for DBMS_SQL

Until Oracle version 11g, when discussing dynamic SQL, there was a very thick line between Native Dynamic SQL and the DBMS_SQL package. In this latest version of the DBMS, Oracle declared “functional completeness” and complete cross-compatibility of these implementations, with the following results:

·         Both support more than 32K of text.

·         Both can work with user-defined datatypes.

 

But there was also a new extension to the DBMS_SQL package that had been expected for many years: a way to transform a REF CURSOR into a DBMS_SQL cursor number and vice versa. Last year, I demonstrated a routine that would take a random SQL statement passed as text and return a CLOB with all retrieved rows. The only problem was that technically the same query had to be parsed twice: once to describe the columns via DBMS_SQL and a second time to get the data back. In 11g, the same functionality can be achieved with only one explicit parse, as shown here:

 

create or replace function f_getSqlResults (i_sql varchar2,i_fetch_limit_nr number:=64000)

  return clob

is

  v_out_cl   CLOB;

  v_header varchar2(4000);

  v_column varchar2(32000);

 

  v_sql_tx varchar2(32000);

  v_cur     integer := dbms_sql.open_cursor;

  v_cols_nr number := 0;

  v_cols_tt dbms_sql.desc_tab;

  v_exec_nr integer;

  v_ref_cur SYS_REFCURSOR;

 

  pragma autonomous_transaction;

BEGIN

  DBMS_SQL.parse (v_cur, i_sql, DBMS_SQL.native);

  DBMS_SQL.describe_columns (v_cur, v_cols_nr, v_cols_tt); 

  v_exec_nr:=dbms_sql.execute(v_cur);

  v_ref_cur:=dbms_sql.to_refcursor(v_cur);

 

  FOR i IN 1 .. v_cols_nr LOOP

    if i = 1 then

      v_column:=' v_row_tx := ''"''||DynResults(i).'||v_cols_tt(i).col_name||'||''"'' ';

      v_header := '"'||NVL (v_cols_tt (i).col_name, ' ')||'"';

    else

      v_column:=v_column||'||chr(9)||''"''||DynResults(i).'||

           v_cols_tt (i).col_name||'||''"'' ';

      v_header := v_header||chr(9)||'"'||NVL (v_cols_tt(i).col_name, ' ')||'"';

    end if;

  END LOOP;

 

  v_header := v_header||chr(10);

  v_column:=v_column||'||chr(10);'||chr(10);

 

  v_sql_tx :=

    'DECLARE '||

    '   v_out_cl       CLOB; '||

    '   v_buffer_tx      VARCHAR2(32000); '||

    '   v_buffer_length_nr number:=0; '||

    '   v_row_tx   VARCHAR2(32000); '||

    '   cursor c1 is '||

            i_sql||';'||

    '   TYPE DynamicTable IS TABLE OF c1%rowtype INDEX BY BINARY_INTEGER; '||

    '   DynResults   DynamicTable; '||

    '   PROCEDURE p_addline (pi_tx VARCHAR2) IS '||

    '      v_add_nr number:=NVL (LENGTH (pi_tx), 0);'||

    '   BEGIN '||

    '     if v_buffer_length_nr+v_add_nr > 32000 then '||

    '       DBMS_LOB.writeappend (v_out_cl, v_buffer_length_nr,  v_buffer_tx);'||

    '       v_buffer_length_nr:=v_add_nr; '||

    '       v_buffer_tx:=pi_tx; '||

    '     else '||

    '       v_buffer_length_nr:=v_buffer_length_nr+v_add_nr; '||

    '       v_buffer_tx:=v_buffer_tx||pi_tx; '||

    '     end if; '||

    '   END; '||

    'BEGIN '||

    '  DBMS_LOB.createtemporary (v_out_cl, TRUE, DBMS_LOB.CALL);'||

    '  p_addline(:1); '||

    '  fetch :2 bulk collect into DynResults limit :3;'||

    '  FOR i IN 1 .. DynResults.COUNT LOOP '||

          v_column ||

    '      p_addline(v_row_tx); '||

    '   END LOOP;'||

    '   dbms_output.put_line(v_buffer_length_nr);'||

    '   DBMS_LOB.writeappend (v_out_cl, v_buffer_length_nr,  v_buffer_tx);'||

    '   :4 := v_out_cl;'||

    'END;';

 

    EXECUTE IMMEDIATE v_sql_tx

    USING    IN  v_header, in v_ref_cur, IN  i_fetch_limit_nr, OUT  v_out_cl;

    close v_ref_cur;

   

    return v_out_cl;

EXCEPTION

   WHEN OTHERS

   THEN

      if v_ref_cur%isopen then close v_ref_cur; end if;

      return '"<Invalid Query>"';

END;

 

As shown in this example, the DBMS_SQL cursor is transformed into a REF CURSOR, which is a valid datatype to pass into a dynamic anonymous PL/SQL block. To make the transformation, the original cursor has to be executed; simply parsing it is not enough.

This feature can be extremely useful for anyone working with undefined data structures or with heavy usage of REF CURSOR datatypes, because it is now possible to get a detailed description of a cursor without knowing where and how it was created, as shown here:

 

create or replace procedure p_explainCursor (io_ref_cur IN OUT SYS_REFCURSOR)

is

    v_cur     integer;

    v_cols_nr number := 0;

    v_cols_tt dbms_sql.desc_tab;   

begin

    v_cur:=dbms_sql.to_cursor_number(io_ref_cur);

    DBMS_SQL.describe_columns (v_cur, v_cols_nr, v_cols_tt);         

    for i in 1 .. v_cols_nr loop

        dbms_output.put_line(v_cols_tt (i).col_name);

    end loop;      

    io_ref_cur:=dbms_sql.to_refcursor(v_cur);  

end;

 

This simple example illustrates the point:

SQL> declare

  2    v_tx varchar2(256):='select * from dept';

  3    v_cur SYS_REFCURSOR;

  4  begin

  5    open v_cur for v_tx;

  6    p_explainCursor(v_cur);

  7    close v_cur;

  8  end;

  9  /

DEPTNO

DNAME

LOC

PL/SQL procedure successfully completed.

SQL>

 

5. Performance and Resource Utilization

Another reason why many people refuse to try dynamic SQL is that you cannot avoid parsing. It is important to distinguish three completely different situations:

1.        EXECUTE IMMEDIATE without bind variables – N hard parses for N calls

2.        EXECUTE IMMEDIATE with bind variables – 1 hard parse + N-1 soft parses for N calls

3.        DBMS_SQL with separation of the parsing and execution stages – 1 hard parse only for each distinct statement
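The difference between the first two situations is easy to demonstrate. In the following sketch, the concatenated version produces a distinct SQL text (and a hard parse) on every iteration, while the bind version reuses a single cursor:

```sql
begin
    for i in 1..100 loop
        -- 100 distinct statements => 100 hard parses
        execute immediate 'delete from emp where empno='||i;

        -- 1 statement => 1 hard parse + 99 soft parses
        execute immediate 'delete from emp where empno=:1' using i;
    end loop;
end;
```

Comparing the two in V$SQL shows one shared cursor for the bind version against a hundred single-use cursors for the concatenated one.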

 

The last case requires some non-trivial code, but it can be a real performance booster for the whole system, as shown in the following example:

·         The problem:

o        Users upload CSV-files

o        All files use very specific templates

-         Name of file defines type

-         Column headers map directly to table columns (if header is not registered, it will be ignored)

-         1 row of file = 1 logical group (1..N real rows) – for example, one row could include both sale and cancellation records

-         Group-level validation – rules are applicable to the whole set of rows rather than each insert

·         The solution:

o        “Universal CSV-loader” - build all inserts on the fly

 

The original implementation of the solution was straightforward, but with real volumes of data (10K+ rows in a single file), the DBAs started to complain that the CPU workload was too high. Also, the overall performance of the module was significantly worse than estimated (and worsened as file sizes increased). The answer was to implement the single-parse mechanism. Since the module itself is too large to reproduce, this example shows only the core snippets.

The first step was to prepare the inserts. Each insert type has its own DBMS_SQL cursor (already parsed), stored in the associative array for future access as shown here:

declare

    type integer_tt is table of integer index by pls_integer;

    v_cur_tt integer_tt;

    ...

begin

    ...

    for r in v_groupRow_tt.first..v_groupRow_tt.last loop

        v_cur_tt(r):=DBMS_SQL.OPEN_CURSOR;

        for c in c_cols(v_mapRows_tt(r)) loop

            for i in v_header_tt.first..v_header_tt.last loop

                if v_header_tt(i).text=c.name_tx then

                    v_col_tt(i):=c;

                    v_col_tx:=v_col_tx||','||v_col_tt(i).viewcol_tx;

                    v_val_tx:=v_val_tx||',:'||v_col_tt(i).viewcol_tx;

                end if;

            end loop;

        end loop;

        v_sql_tx:='insert into '||v_map_rec.view_tx||

                    '('||ltrim(v_col_tx,',')||') values('||ltrim(v_val_tx,',')||')';

        DBMS_SQL.PARSE(v_cur_tt(r),v_sql_tx,DBMS_SQL.NATIVE);

    end loop;

 

Now it is only necessary to spin through all of the rows of the uploaded file, find the appropriate row type (from the array), bind the variables, and fire the INSERT statement as shown here:

 

    for i in 2..v_row_tt.count loop

        for r in v_groupRow_tt.first..v_groupRow_tt.last loop                   

            for c in v_col_tt.first..v_col_tt.last loop

                if v_col_tt(c).id  = v_mapRows_tt(r) then

                    DBMS_SQL.BIND_VARIABLE(v_cur_tt(r),

                        ':'||v_col_tt(c).viewcol_tx,

                        v_data_tt(c).text);

                end if;

            end loop;

            v_nr:=dbms_sql.execute(v_cur_tt(r));                       

        end loop;

    end loop;

The overall performance improvement for a file of 60,000 rows was about 50 times. For smaller datasets the gain was lower, but it was always significant.

Conclusions

The whole topic of dynamic SQL is too extensive to cover in one paper. The purpose of this paper is to encourage IT professionals to extend the set of tools they use to solve different kinds of problems. A tool that supports unforeseen changes and unpredicted requirements should always be welcome. However, you should consider complex solutions only when simpler ones do not work. Even with the best possible technology available, there is no substitute for good analysis and a real understanding of what you are trying to develop.

 

About the Author

Michael Rosenblum is a Development DBA at Dulcian, Inc. He is responsible for system tuning and application architecture. He supports Dulcian developers by writing complex PL/SQL routines and researching new features. Mr. Rosenblum is the co-author of PL/SQL for Dummies (Wiley Press, 2006). Michael is a frequent presenter at various regional and national Oracle user group conferences. In his native Ukraine, he received the scholarship of the President of Ukraine, a Masters Degree in Information Systems, and a Diploma with Honors from the Kiev National University of Economics.