Using awk to format output from select statement in DB2 on AIX/Linux

In this blog post, I will share with you a simple way but effective way to format the output from a select statement at the command prompt.

Consider a table that that has 3 columns, all of varchar data type.

  $ db2 "describe table db2inst1.test"
                                Data type                     Column
Column name                     schema    Data type name      Length     Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
COL1                            SYSIBM    VARCHAR                    100     0 Yes
COL2                            SYSIBM    VARCHAR                    120     0 Yes
COL3                            SYSIBM    VARCHAR                    180     0 Yes
  3 record(s) selected.

Below is how the output would for a simple select * statement.

select query output

By default, the output from an SQL statement is hard to read at a command prompt.
This is because by design, CLI does not format the output. Wouldn’t it be nice if the output was formatted to make it easy to read for the user.
We could format the output of a select query by using functions like substr or char or trim These functions help to trim trailing white spaces. While any of these do the job, they tend to make the SQL long. Sometimes, it is tiring to read SQLs with so many functions all over the place. Below are couple of examples of formatting the output using functions.

select using char function

select using substr

As you see from the above examples, it is not too difficult to format the output of a query by using functions. However, if you have 20+ columns in your SQL statement it takes tiring to write these functions for every column and the SQL itself gets verbose.

So, what is the alternative? Is there one? It just happens that awk can help us in this matter. If you are familiar with C’s output format specifiers, awk’s printf should look familiar.

For example,

awk '{printf "%-20s %-20s\n", $1,$2}'

would print the 1st field ($1) and 2nd field ($2) left justified by 20 spaces. I covered this in great detail in one of my previous blogs. Here is the URL if you would like to check it out. https://db2talk.com/2015/08/31/a-primer-in-awk-for-db2-luw-dbas/

By now, you would have figured out that our plan is to rely on awk’s printf statement to format the output from a select statement.


db2 "select * from TEST" | awk '{printf "%-20s  %-20s %s\n", $1, $2, $3}'

COL1                  COL2                 COL3
----------------------------------------------------------------------------------------------------  ------------------------------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2005-05-23            00:00:00             SPEDWD1
2                     POC_USER             HR
MK01871               SPEDWD1              HR
MI027                 SPEDWD1              HR
320249                SPEDWD1              HR
2011-08-08            00:00:00             SPEDWD1
HB08533               SPEDWD1              HR
CB561                 POC_USER             HR
5                     SPEDWD1              HR
HB07693               SPEDWD1              HR

10                    record(s)            selected.

We kind of achieved what wanted in that the output looks good. But, we have a fundamental problem. The output itself is not right. The value “2005-05-23 00:00:00” has been split into 2 parts and is being shown in 2 different columns which is not right. Same is the case with the value “2011-08-08 00:00:00”

We also have the last line “10 record(s) selected.” formatted in a really weird way.

Why are these happening? It happens that awk’s default field delimiter is space (single space) and “2005-05-23 00:00:00” “2011-08-08 00:00:00” have space between the date and time values. awk splits these values into 2 separate fields which is not what we wanted.

There is a way we could fix this. The solution is to tell awk to treat 2 spaces (instead of 1 space) as the field delimiter. Note that we can take this approach because there are no 2 spaces within a single column in our test data.

How do we tell awk to treat 2 spaces as the field delimitter? We rely a special variable called ‘FS’ which stands (as you might have guessed) for Field Separator.

Here is how we do it.


db2 "select * from TEST" | awk '{printf "%-20s  %-20s %s\n", $1, $2, $3}' FS=' {2,}'
COL1                  COL2                 COL3
---------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2005-05-23 00:00:00   SPEDWD1              HR
2                     POC_USER             HR
MK01871               SPEDWD1              HR
MI027                 SPEDWD1              HR
320249                SPEDWD1              HR
2011-08-08 00:00:00   SPEDWD1              HR
HB08533               SPEDWD1              HR
CB561                 POC_USER             HR
5                     SPEDWD1              HR
HB07693               SPEDWD1              HR

                      10 record(s) selected.

By adding FS=‘ {2,}’ , we tell awk to treat any ‘space’ (here the separating character is space) must be repeated at least 2 times. We do not specify anything after “,’ because we wanted the outer bound of number of spaces space to be infinite.

If we wanted the space to be restricted only 6 number of times, we would have had:
FS=‘ {2,6}’. For at least 3 spaces, it would be FS=’ {3,}’ and so on.. Hope you get the idea.

In a future blog post, I will share how we could write an awk script to avoid remembering this syntax of format specifiers. With an awk script, we will simply write the format specifiers into an awk script once and re-use multiple times. It is all about becoming efficient at what we do!

Advertisements

2 thoughts on “Using awk to format output from select statement in DB2 on AIX/Linux

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s