DB2 10.5 — Catalog data for Expression-Based Indexes

One of the new features in DB2 10.5 is ‘Expression-Based Indexes’. This gives ability to create index on data that is not stored in a table. For expression-based indexes, the result of the expression is stored in the index.

Usually, when an expression (example: UPPER(LASTNAME)) is applied on a column, DB2 does not use the index although the column LASTNAME is indexed. To lower the cost of such queries, one approach was to store the value of the expression (UPPER(LASTNAME)) as an additional column and indexing such a column. This approach meant that the SQL (sometimes) had to be modified to use the generated column instead of the main column.
The main value proposition of expression-based indexes is that the SQL need not be changed. In other words, SQL could be written to have the ‘LASTNAME’ column; When DB2 detects UPPER(LASTNAME) in the ‘where’ clause, DB2 would choose the newly created expression-based when appropriate.
I started to look into expression-based indexes when a UNIX script that I wrote (for DB2 9.7) to calculate ‘Index Cardinality’ was not very useful in DB2 10.5. This is because of change in (system) catalog data in DB2 10.5. In this blog post, we will look into this topic.

Prior to DB2 10.5

If we want to look at which columns are indexed, we could use the SQL below. Note that I have Index Cardinality information as well. Usually, I would want to look at this information as well when I am researching a performance problem related to DML operations.


db2 "select char(indname, 20) as indname, 
char(colnames,20) as colnames, 
fullkeycard, 
b.card as tbcard, 
decimal(float((a.fullkeycard)/float(b.card+1))*100,5,2) as ratio, 
A.lastused 
from syscat.indexes A inner join syscat.tables B 
on A.tabschema=B.tabschema and 
A.tabname=B.tabname 
where A.tabschema = 'DB2INST1' 
and A.tabname = 'TABLE1' 
with ur"

INDNAME              COLNAMES    FULLKEYCARD  TBCARD   RATIO   LASTUSED
-------------------- -------------------- --------------------- -------
PK_DB2INST1_TABLE1   +USERID      198492      198492   99.99 01/01/0001
I1_FIRSTNAME         +FIRSTNAME   23952       198492   12.06 01/01/0001
I1_LASTNAMENAME      +LASTNAME    67368       198492   33.93 01/01/0001

  3 record(s) selected.

Starting DB2 10.5

As discussed, DB2 10.5 has a new feature that allows indexes to be created on an expression.
We will create couple of expression-based indexes and later we will attempt to use the above SQL to extract cardinality data for indexes.


db2 "create index i1_firstname on db2inst1.table1(firstname)"
DB20000I  The SQL command completed successfully.

db2 "create index i1_lastnamename on db2inst1.table1(lastname)"
DB20000I  The SQL command completed successfully.

db2 "create index i1_lower_firstname on db2inst1.table1 (lower(firstname))"
DB20000I  The SQL command completed successfully.

db2 "create index i1_lower_lastname on db2inst1.table1 (lower(lastname))"
DB20000I  The SQL command completed successfully.

db2 "runstats on table DB2INST1.TABLE1 with distribution on all columns and detailed indexes all allow write access"
DB20000I  The RUNSTATS command completed successfully.


db2 "select char(indname, 20) as indname, 
char(colnames,20) as colnames, 
fullkeycard, b.card as tbcard, 
decimal(float((a.fullkeycard)/float(b.card+1))*100,5,2) as ratio, 
A.lastused 
from syscat.indexes A inner join syscat.tables B 
on A.tabschema=B.tabschema and 
A.tabname=B.tabname 
where A.tabschema = 'DB2INST1' 
and A.tabname = 'TABLE1' 
with ur"

INDNAME           COLNAMES      FULLKEYCARD  TBCARD   RATIO   LASTUSED
-------------------- --------------------------------- ------------ ----------
PK_DB2INST1_TABLE1   +USERID       198492    198492   99.99 08/11/2015
I1_FIRSTNAME         +FIRSTNAME     23952    198492   12.06 08/11/2015
I1_LASTNAME          +LASTNAME      67368    198492   33.93 01/01/0001
I1_LOWER_FIRSTNAME   +K00           1466     198492   10.81 08/11/2015
I1_LOWER_LASTNAME    +K00           60375     198492   30.41 08/11/2015

  5 record(s) selected.

While most of the above output looks good, column names for newly created ‘Expression-Based’ indexes seem to be incorrect. They are being shown as +K00 which is not very useful as we want to know column names for these indexes.

SYSCAT.INDEXES Catalog View in DB2 10.5

When I researched into this, I noticed this statement in IBM Knowledge Center for DB2 10.5

 
Column Name Data Type Nullable Description
INDSCHEMA VARCHAR (128) Schema name of the index.
INDNAME VARCHAR (128) Unqualified name of the index.
OWNER VARCHAR (128) Authorization ID of the owner of the index.
…………..
COLNAMES VARCHAR (640) This column is no longer used and will be removed in the next release. Use SYSCAT.INDEXCOLUSE for this information.

Querying SYSCAT.INDEXCOLUSE for our test table, we see the following:


db2 "SELECT CHAR(INDNAME, 60) AS INDNAME, 
CHAR(COLNAME, 20) AS COLNAME, 
TEXT 
FROM SYSCAT.INDEXCOLUSE 
WHERE
INDNAME LIKE '%DB2INST1%'  WITH UR"

INDNAME                        COLNAME         TEXT
 -------------------  ------------------------ ---------------
 PK_DB2INST1_TABLE1             USERID          NULL
 I1_DB2INST1_FIRSTNAME          FIRSTNAME       NULL
 I1_DB2INST1_LASTNAME           LASTNAME        NULL
 I1_DB2INST1_LOWER_FIRSTNAME    K00             LOWER(FIRSTNAME)
 I1_DB2INST1_LOWER_LASTNAME     K00             LOWER(LASTNAME)

DB2 10.5 — SQL to extract Column names and Index Cardinality

The following SQL is a modification of the SQL that was used (for DB2 versions prior to DB2 10.5) to query DB2 catalogs to list indexes, their cardinality along with column names needs to be modified to work for expression-based indexes. System object SYSCAT.INDEXCOLUSE is added.


db2 "SELECT CHAR(IND.INDNAME, 30) AS INDNAME, 
CHAR(COALESCE (USE.TEXT, IND.COLNAMES),20) AS COLNAMES, 
FULLKEYCARD, 
TABS.card as tbcard, 
decimal(float((ind.fullkeycard)/float(tabs.card+1))*100,5,2) as ratio, 
IND.LASTUSED 
FROM SYSCAT.INDEXCOLUSE USE INNER JOIN 
SYSCAT.INDEXES IND 
ON IND.INDNAME = USE.INDNAME 

INNER JOIN SYSCAT.TABLES TABS 
ON TABS.TABNAME = IND.TABNAME 
WHERE TABS.TABNAME = 'TABLE1' 
AND 
TABS.TABSCHEMA = 'DB2INST1' WITH UR"

INDNAME              COLNAMES      FULLKEYCARD    TBCARD   RATIO   LASTUSED
------------------------------ ------------------ -------- -------- ----------
PK_DB2INST1_TABLE1  +USERID            198492     198492   99.99 08/11/2015
I1_FIRSTNAME        +FIRSTNAME         23952      198492   12.06 08/11/2015
I1_LASTNAME         +LASTNAME          67368      198492   33.93 01/01/0001
I1_LOWER_FIRSTNAME  LOWER(FIRSTNAME)  21466     198492   10.81 08/11/2015
I1_LOWER_LASTNAME   LOWER(LASTNAME)   60375     198492   30.41 08/11/2015

  5 record(s) selected.
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s