DBA tip – Easy way to handle single quotes in awk

It is no wonder that I cannot live without awk even a single (working) day in my life. Such is the power of awk (or AWK). It is simply the most powerful utility that I use on *IX (Linux/AIX) systems. awk makes my life easy when dealing with daily DBA tasks. I learned awk almost 10 years ago and I admit that it was tough to get used to its syntax. However, once I knew few basics, it was fun.

Here is an example of a task that most of us could easily relate to. awk handles with ease such tasks for us. “Generate a list of something (think database objects as an example)  and take the same action (viz., runstats/reorg) on each of them.”

I plan on writing intro-level awk blog posts in future. This post is not a basic introduction to awk and I assume you are atleast familiar with awk’s fields and its basic syntax. In this post, we will look at into working with quotes in awk. Quotes in awk are special characters and need to be dealt with caution and care. I will illustrate this with an example.

Sample Problem:

Suppose you have a file with schema in the 1st field and table name in the 2nd field. To keep things simple, assume that they are separated by ‘space’. Here is a sample:

SCHEMA1 TABLE1
SCHEMA2 TABLE2
SCHEMA3 TABLE3
SCHEMA4 TABLE4
SCHEMA5 TABLE5

Here is an exercise for this blog post that we will work on:

“For all the tables in the file, print schema, table and primary key columns”.

SQL to list columns in a Primary Key:

Let us take a quick detour and look at how primary key columns could be extracted from the catalog view syscat.indexes. Below is the SQL that can give this information for us one table at a time. Note that I am only trimming each field for better readability of the output.

$db2 "select char(tabschema, 20), char(tabname, 60), char(colnames,200) from syscat.indexes where tabschema='SCHEMA1' and tabname='TABLE1' and uniquerule = 'P'"

Assuming the schema/table list in a file, we could pass each line to awk and let awk print the SQL for each schema/table combination. Note that in SQL, we need ‘quotes’ around schema/table names.. Our flat file does not have quotes in it. Most of the discussion in this post would be handling quotes in awk.

Combining awk and SQL:

We will first ‘read’ (‘cat’) the file and then pipe the contents to awk. As part of awk command, for each set of schema and table, we will print a SQL that when executed will give us schema, table and it’s primary key.

Note that $1 is the 1st field in the file (schema in this case) and $2 is the 2nd field in the file (table in this case).

$cat list | awk '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = "$1" and tabname = "$2 " and UNIQUERULE = 'P';"}'

select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = SCHEMA1 and tabname = TABLE1 and UNIQUERULE = P;
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = SCHEMA2 and tabname = TABLE2 and UNIQUERULE = P; 

select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = SCHEMA3 and tabname = TABLE3 and UNIQUERULE = P; 

select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = SCHEMA4 and tabname = TABLE4 and UNIQUERULE = P; 

select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = SCHEMA5 and tabname = TABLE5 and UNIQUERULE = P; 

We are almost there except that in SQL, we need to enclose object names (type CHAR) in single quotes.

1) Is it as simple as enclosing each field ($1, $2 etc) in single quotes? This does not work.

Note: In the following examples, I am removing UNIQUERULE = ‘P’ clause for brevity.

$cat list | awk '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = "'$1'" and tabname = "'$2'";"}'

select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema =  and tabname = ;
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema =  and tabname = ;
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema =  and tabname = ;
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema =  and tabname = ;
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema =  and tabname = ;

2) How about escaping the single quotes (around the field numbers) with a ‘\’. This does not work either and results in a bad syntax.

$cat list | awk '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema = "\'$1\'" and tabname = "\'$2\'";"}'
>

Solution – awk’s Assignment Operator:

In awk, there is more than one way to skin the cat. Of all the options I looked at, the easiest for me was to use awk’s ‘Assignment’ operator/option (-v). Also, this worked in both ‘bash’ and ‘ksh’ shells.

Below is an extract from ‘man’ page for awk:

-v Assignment
Assigns a value to a variable for the awk command’s programming language. The Assignment parameter is in the form of Name = Value. The Name portion specifies the name of the variable and can be any combination of underscores, digits, and alphabetic characters, but it must start with either an alphabetic character or an underscore. The Value portion is also composed of underscores, digits, and alphabetic characters, and is treated as if it were preceded and followed by a ” (double-quotation character, similar to a string value). If the Value portion is numeric, the variable will also be assigned the numeric value.

Here is an example that uses ‘Assignment’ option (-v) in awk. We assign single quote to ‘x’ using -v option.

Read this as: awk -v x=doublequote singlequote doublequote
Also, note that the field numbers ($1, $2) have been preceded and followed by our assignment variable ‘x’ (without quotes)

$cat list | awk -v x="'" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="x$1x" and tabname =" x$2x"  and uniquerule = "x"P"x}'

select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema='SCHEMA1' and tabname ='TABLE1'  and uniquerule = 'P'
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema='SCHEMA2' and tabname ='TABLE2'  and uniquerule = 'P'
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema='SCHEMA3' and tabname ='TABLE3'  and uniquerule = 'P'
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema='SCHEMA4' and tabname ='TABLE4'  and uniquerule = 'P'
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema='SCHEMA5' and tabname ='TABLE5'  and uniquerule = 'P'

What if you want to enclose schema name and/or table name in ‘double quotes’? Would enclosing a double quote between double quotes work? Actually, you get an error when you do that.

$cat list | awk -v x=""" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames,200) from syscat.indexes where tabschema="x$1x" and tabname="x$2x";"}'
bash: syntax error near unexpected token `('

How to get around this?

Option 1:

Enclose double quotes with a single quote (read it as singlequote doublequote singlequote)

Note: In the following examples, I am removing UNIQUERULE = ‘P’ clause for brevity.

$cat list | awk -v x='"' '{print "select char(tabschema, 20), char(tabname, 60), char(colnames,200) from syscat.indexes where tabschema="x$1x" and tabname="x$2x";"}'select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA1" and tabname ="TABLE1";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA2" and tabname ="TABLE2";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA3" and tabname ="TABLE3";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA4" and tabname ="TABLE4";

Option 2:

Escape the new double quotes with a (back slash \) starting and ending with double quotes as we did earlier.

Here is how you do it.

$cat list | awk -v x="\"" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="x$1x" and tabname =" x$2x";"}'
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA1" and tabname ="TABLE1";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA2" and tabname ="TABLE2";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA3" and tabname ="TABLE3";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA4" and tabname ="TABLE4";
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="SCHEMA5" and tabname ="TABLE5";

So, this also means we just found a new way to handle single quotes as well. Just ‘escape’ single quote with a back slash just like we did for double quotes.

$cat list | awk -v x="\'" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="x$1x" and tabname =" x$2x";"}'

awk, SQL and DB2

We can either re-direct the above command’s output to a SQL file and then run it as ‘db2 -tvf sqlfile ‘ making it a 2-step approach or pipe the output (from awk) to db2 and then run it directly.

Approach 1:
$cat list | awk -v x="\'" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="x$1x" and tabname =" x$2x";"}' > sqlfile
$db2 -tvf sqlfile | tee -a logfile

Approach 2:
In this approach, we will remove the trailing ‘;’ to parse the output directly to DB2.

$cat list | awk -v x="\'" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="x$1x" and tabname =" x$2x}' | db2 -v | tee -a logfile
Advertisements

2 thoughts on “DBA tip – Easy way to handle single quotes in awk

  1. Pingback: DB2 LUW – Handling database objects with mixed case and special characters in their names | db2talk

  2. Pingback: A primer in awk for DB2 LUW DBAs | db2talk

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s