Gather Stats on Function based filter condition in oracle

For one of performance improvement, identify root cause for bad execution plan was incorrect cardinality estimation on function based condition on a table column.

As part of recommendation, we just needed to get estimated cardinality improve on table filter conditions.

Let walk through an example to understand it better.
CREATE TABLE FUNC_BASE_STATS AS
SELECT * FROM DBA_OBJECTS;

EXEC DBMS_STATS.GATHER_TABLE_STATS('DMAHTO','FUNC_BASE_STATS');

SELECT NUM_ROWS FROM DBA_TABLES WHERE TABLE_NAME = 'FUNC_BASE_STATS';
Output :: 89980

SELECT NUM_DISTINCT FROM DBA_TAB_COLUMNS WHERE TABLE_NAME = 'FUNC_BASE_STATS'
AND COLUMN_NAME = 'OBJECT_NAME';
NDV on column OBJECT_NAME := 54396

Now we will apply UPPER function on “OBJECT_NAME” and take a look on estimated cardinality.
EXPLAIN PLAN FOR
SELECT * FROM FUNC_BASE_STATS
WHERE UPPER(OBJECT_NAME) LIKE 'TEST';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

Plan hash value: 1875800198
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 900 | 103K| 291 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| FUNC_BASE_STATS | 900 | 103K| 291 (1)| 00:00:04 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(UPPER("OBJECT_NAME")='TEST')

Optimizer estimate 900 rows would get projected as part of function filter condition.
As Function based condition is present, oracle estimation on same might not be near to actual rows.

Lets check total distinct values for function expression.

SELECT COUNT( DISTINCT UPPER(OBJECT_NAME)) FROM FUNC_BASE_STATS;
NDV for UPPER(OBJECT_NAME) := 54773

So ideally estimation for any condition similar to “UPPER(OBJECT_NAME) LIKE ‘TEST'”, should be
89980(NUM_ROWS)/54773(NDV) := 1.64 approxi.

Current estimated cardinality is far offset then actual rows.
To improve cardinality estimations, we will opt for creation of extended stats on function expression.

Using extended stats utility functions, we will get new columns added within table as hidden.

SELECT DBMS_STATS.CREATE_EXTENDED_STATS('DMAHTO','FUNC_BASE_STATS','(UPPER(OBJECT_NAME))')
FROM DUAL;
Output :: SYS_STUUZF$GSUU#20JQDJBSQ5DK_Y

SELECT TABLE_NAME , COLUMN_NAME , HIDDEN_COLUMN FROM DBA_TAB_COLS WHERE TABLE_NAME = 'FUNC_BASE_STATS'
AND COLUMN_NAME = 'SYS_STUUZF$GSUU#20JQDJBSQ5DK_Y' ;

TABLE_NAME COLUMN_NAME HIDDEN_COLUMN
--------------- ------------------------------ -------------
FUNC_BASE_STATS SYS_STUUZF$GSUU#20JQDJBSQ5DK_Y YES

Now let gather stats on new hidden column for it to be effective.
EXEC DBMS_STATS.GATHER_TABLE_STATS('DMAHTO','FUNC_BASE_STATS', METHOD_OPT => 'FOR ALL HIDDEN COLUMNS SIZE 1');

Lets check new underlying stats fpr newly extended stats created column.

SELECT NUM_DISTINCT FROM DBA_TAB_COLS WHERE TABLE_NAME = 'FUNC_BASE_STATS'
AND COLUMN_NAME = 'SYS_STUUZF$GSUU#20JQDJBSQ5DK_Y' ;
Output := 53988

Now NDV for Function based condition is more near to actual function processed data.
EXPLAIN PLAN FOR
SELECT * FROM FUNC_BASE_STATS
WHERE UPPER(OBJECT_NAME) LIKE 'TEST';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

PLAN HASH VALUE: 1875800198

-------------------------------------------------------------------------------------
| ID | OPERATION | NAME | ROWS | BYTES | COST (%CPU)| TIME |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 246 | 291 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| FUNC_BASE_STATS | 2 | 246 | 291 (1)| 00:00:04 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(UPPER("OBJECT_NAME")='TEST')

By creating extended stats on function based expression and gathering stats for hidden columns.
We were able to achieve correct cardinality estimation for function based conditions.

About Deepak Mahto

Having 8+ Years of relevant/exciting/fruitful/challenging/learning in Oracle Technology. Currently working as Oracle Performance Consultant in an MNC. Hold Expertise in Performance Engineering, SQL Tuning, Database Tuning, SQL, PLSQL Development/Design Considerations and Oracle Internals. I enjoy reading on Oracle internals, posting some of my finding or observations and playing football.
This entry was posted in Oracle Internal and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s