A primer in awk for DB2 LUW DBAs

Although awk and awkward start with the same three letters, awk by no means is awkward. Well, from the syntax of it (and I agree, sometimes, from its learning curve), it could be appear to be so. In this blog post, I share how awk’s most basic features could be powerful tools to get things done faster.

I use awk in many of my UNIX scripts that I write to automate routine tasks. In addition to this, I routinely write awk one-liners to save time. Many times, these one-liners are throw-away in that the same exact awk command might not be used again.

What is awk?

From Wikipedia, “AWK was created at Bell Labs in the 1970s,and its name is derived from the family names of its authors – Alfred Aho, Peter Weinberger, and Brian Kernighan”.
“AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.”

Fields and delimiters in awk:

A delimiter is a character or a set of characters that separate two fields. As awk mainly deals with text processing, delimiter for such text is important to be understood. By default, awk’s delimiter is ‘space’ or ‘tab’ (essentially white space). However, there are options within awk to explicitly define a delimiter for text that is being dealt with. If unexpected results are popping up, make sure the delimiter is being handled right.

In the below file ‘file1’, the delimiter is ‘space’. Each ‘field’ is separated by ‘space’. Note that the delimiter is the same for all the fields and through out the file. Dealing with multiple delimiters is possible in awk but is an advanced topic.

$ cat file1
SCHEMA1pr  TAB
SCHEMA2asdff MEDIUMTABLE
SCHEMA3adf  LONG_TABLE_NAME_THAT_HAS_A_LONG_NAME
S1 T1
S1ABC T1AB

Printing text – awk’s most basic function:

In awk,
$1 = 1st Field
$2 = 2nd Field
Note that $0 = the entire line

If nth field needs to be printed, just pass ‘$n’ as an input to the print function.

$ cat file1 | awk '{print $1}'
SCHEMA1pr
SCHEMA2asdff
SCHEMA3adf
S1
S1ABC

What if you want to print the output in a more formatted way? awk’s printf comes to resuce to do this. If you are familiar with printf statement in ‘C’, awk’s printf statement would be very familiar.

awk’s Print Vs. printf:

I have to admit that only recently I became aware of the power of ‘printf’ command in awk. While I was able to get most of my work done using ‘print’, ‘printf’ is one step further than ‘print’ in terms of printing a nice-looking output. ‘Printf’ lets us dictate by how much the output fields are to be separated.

$ cat t1 | awk '{printf "%-20s %-20s\n", $1,$2}'
SCHEMA1pr            TAB
SCHEMA2asdff         MEDIUMTABLE
SCHEMA3adf           LONG_TABLE_NAME_THAT_HAS_A_LONG_NAME
S1                   T1
S1ABC                T1AB

In the example above, awk does the following to format the output:
1) Left justify the text because we used ‘-‘; the text would have been right-justified if there was no ‘-‘
2) Reserve at least 20 spaces because we specified %20s

What if we used %10 instead of %20? Here is how the output would look like. When the 1st field has more than 10 characters, it encroaches and shifts the 2nd field to the right.

$ cat t1 | awk '{printf "%-10s %-20s\n", $1,$2}'
SCHEMA1pr  TAB
SCHEMA2asdff MEDIUMTABLE
SCHEMA3adf LONG_TABLE_NAME_THAT_HAS_A_LONG_NAME
S1         T1
S1ABC      T1AB

An important point to note is that while ‘print’ adds a next-line character after each line of output text, ‘printf’ does not do that. An explicit ‘\n’ is required if output lines need to be on different lines. This feature of ‘printf’ lets us print separate lines (from input) into one line.

Use cases for DBA

There could be hundreds if not thousands of ways in which DBA could take advantage of awk to simplify mundane/repetitive tasks. The best value-proposition from awk that I have found is that awk greatly reduced the amount of typing (and hence saves time) that I have to do.

Following are few cases that demonstrate how awk could be used in DBA’s work life to save time, essentially exemplifying the power of awk. Note that awk’s command needs to be modified to suit your exact needs and I am presenting examples to showcase awk’s capabilities.

1) Granting privileges to large number of tables based on a matching pattern :

Input: A static list of database objects and/or a dynamic list that could be generated from DB2’s system catalogs.
Suppose there are 100s of tables spread across multiple schemas and you would like to grant / revoke privileges. Instead of taking the painful approach of manually typing the table name at a time, we will first generate the list (as input for awk) and then unleash awk to work on the list.

Here is how it is done:

Generate a list of objects and write a file (alternately, this output could be sent to a pipe and then this list could be passed on to awk); Pass the list as input to awk; The 3rd and final step is to run the SQL statement that has grant statements.

$db2 –x “select char(tabschema, 20), char(tabname, 70) 
from syscat.tables where type = ‘T’ 
and tabschema like ‘%DV%’ 
with ur” > tables.list

$ cat tables.list | awk ‘{print “grant select, insert, 
update, delete on “$1”.”$2” to 
role developer;”} > grant.sql

 ### There would be a similar statement for revoke

$db2 –tvf grant.sql | tee grant.sql.log

The advantage of doing it in multiple steps is that it is easy to keep track of what objects got modified and/or we could use such a list for other tasks.

Could we write a one-liner for the above sequence of steps? Absolutely.
Note that I am only splitting the code into multiple-lines for better readability.

$db2 –x “select char(tabschema, 20), char(tabname, 70) 
from syscat.tables 
where type = ‘T’ and tabschema like ‘%DV%’ 
with ur” 

| awk ‘{print “grant select, insert, update, delete 
on “$1”.”$2” to role developer”} 
| db2 –v | tee –a grant.oneliner.log

Note that there is no ‘;’ after ‘developer’. This is because ‘;’ is not required as a delimiter in DB2 prompt mode.

2) When DDL is given, grant privileges to large number of tables :

Input: A DDL file

Note: This example demonstrates usage of awk’s field separator.

Imagine creating 100s of tables/views (example: cross platform database migration) and then granting privileges on such objects. For granting privileges, awk comes in handy in that we could take action on a large number of prepared lists of objects without handling each object separately.
For this, we will first read the ‘ddl’ file to create object list and then send that as input for awk. This is demonstrated in the example below. Note that we are using awk’s ‘Field Seperator’ operator ‘-F’. With –F ‘.’, we are asking awk to treat ‘.’ as field separator.

$cat $ddl | grep -i "create table" | awk '{print $3}' 
| awk -F '.' '{print "grant select, insert, update, delete on
 "$1" "$2” to role developer;”}' > grant.sql
$db2 –tvf grant.sql | tee grant.sql.log

3) Given object list, print catalog data :

Input: An object list is input for awk

Note: This example demonstrates how quotes are handled in awk

We will first ‘read’ (‘cat’) the file and then pipe the contents to awk. As part of awk command, for each set of schema and table, we will print an SQL that when executed will give us schema, table and its primary key.

Note that the WHERE clause in the SQL needs to have the fields enclosed in quotes. To accomplish this, we will use awk’s special assignment operator ‘-v’. With this operator, we ask awk to treat ‘x’ as a single quote. Such special handling is needed as single quote is a special character in awk and needs to be handled separately. I blogged about handling quotes in awk in a separate blog post on db2talk.

$cat list | awk -v x="'" '{print "select char(tabschema, 20), 
char(tabname, 60), char(colnames, 200) from syscat.indexes 
where tabschema="x$1x" and tabname =" x$2x}' > pk.sql

$db2 -tvf pk.sql | tee pk.sql.log

4) Alter large number of index/data tablespaces for tables in a schema

Note: This example demonstrates how awk’s loops come handy.

Note that we used awk’s built-in variable ‘NF’ which means ‘Number of Fields’. In the example below, we iterate over a list of fields in each line and take an action on each field (in this case, print the field (print $i)).

$db2 -x "select char(tbspace,20), char(index_tbspace,20) 
from syscat.tables where tabschema = ‘SCHEMA1’” > tbspaces.list

$cat tbspaces.list | awk '{ for(i=1; i<=NF; i++) print $i}' | tee list2
DVTBSP
DVTBSP_IX
QATBSP
QATBSP_IX 

$ cat list2 | awk '{print "alter tablespace "$1" autoresize yes;"}' > autoresize.sql

$ db2 -tvf autoresize.sql | tee autoresize.sql.log
alter tablespace DV_TBSP autoresize yes
DB20000I The SQL command completed successfully.

alter tablespace DV_TBSP_IX autoresize yes
DB20000I The SQL command completed successfully.

alter tablespace QA_TBSP autoresize yes
DB20000I The SQL command completed successfully.

alter tablespace QA_TBSP_IX autoresize yes
DB20000I The SQL command completed successfully.

Could we accomplish with a one-liner? Absolutely? Here it is:

$ db2 -x "select char(tbspace,20), char(index_tbspace,20) 
from syscat.tables where tabschema = ‘SCHEMA1'" 
| awk '{ for(i=1; i<=NF; i++) print $i}' 
| awk '{print "alter tablespace "$0" autoresize yes"}'
| db2 -v | tee -a alter.resize.oneliner.log

Conclusion:

In this blog post, we have looked at how awk could be used to simplify DBA’s tasks. This post is by no means comprehensive. If I have not covered any topic in awk that you routinely use, I would be eager to know and learn. I appreciate you sharing such ideas. Thank you for reading.

Advertisements

One thought on “A primer in awk for DB2 LUW DBAs

  1. Pingback: Creating a DB2 LUW (Linux, Unix and Windows) Sandbox database | db2talk

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s