APPROX_COUNT_DISTINCT in Oracle 12c.

Most of time during performance analysis to understand Stale stats, we usually need to check estimated NDV (Number of Distinct values).
If data-set is huge , it take considerable time to project desired output using traditional COUNT(DISTINCT) appraoch.
For any such analysis we usually need to understand approximate or estimated distribution of columns values to understand SELECTIVITY AND CARDINALITY on any functional predicates.

With Oracle 12c (12.1.0.2), we have a function “APPROX_COUNT_DISTINCT” , which is claim to faster then tradition COUNT(DISTINCT <>) approach to get an idea on NDV.

Let try to understand same with field work!

--Creating a table to 1000,5000,10000 NDV for some columns.
CREATE TABLE TEST_APROX_CNT_DIS AS
(SELECT ROWNUM AS COL1 ,
MOD(ROWNUM ,1000) COL2 , --NDV 1000
MOD(ROWNUM ,5000) COL3 , --NDV 5000
MOD(ROWNUM ,10000) COL4 --NDV 10000
FROM DUAL
CONNECT BY LEVEL < 1000000
);

For different distinct values, we will compare elapsed time with traditional approach.
APPROX_COUNT_DISTINCT, as name implies wont give exact NDV for a column.
It will calculate an approximate values to get an understanding of column selectivity.

Below is compassion of values projected for different distinct values.
COUNT(DISTINCT COL2) APPROX_COUNT_DISTINCT(COL2)
1000                       1012

COUNT(DISTINCT COL3) APPROX_COUNT_DISTINCT(COL3)
5000                                       4835

COUNT(DISTINCT COL4) APPROX_COUNT_DISTINCT(COL4)
10000                                    10050

From performance point of view, APPROX_COUNT_DISTINCT was always faster then usual COUNT (DISTINCT ) approach for checking column NDV.

SELECT COUNT(DISTINCT COL2) FROM TEST_APROX_CNT_DIS;
Elapsed: 00:00:00.091

SELECT APPROX_COUNT_DISTINCT(COL2) FROM TEST_APROX_CNT_DIS;
Elapsed: 00:00:00.055

SELECT COUNT(DISTINCT COL3) FROM TEST_APROX_CNT_DIS;
Elapsed: 00:00:00.105

SELECT APPROX_COUNT_DISTINCT(COL3) FROM TEST_APROX_CNT_DIS;
Elapsed: 00:00:00.059

SELECT COUNT(DISTINCT COL4) FROM TEST_APROX_CNT_DIS;
Elapsed: 00:00:00.105

SELECT APPROX_COUNT_DISTINCT(COL4) FROM TEST_APROX_CNT_DIS;
Elapsed: 00:00:00.055

Let try to compare underlying execution plan for both approach.

COUNT - DISTINCT
------------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | VIEW | VW_DAG_0 | 1000 |
| 3 | HASH GROUP BY | | 1000 |
| 4 | TABLE ACCESS FULL| TEST_APROX_CNT_DIS | 999K|
------------------------------------------------------------

APPROX_COUNT_DISTINCT
------------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE APPROX| | 1 |
| 2 | TABLE ACCESS FULL | TEST_APROX_CNT_DIS | 999K|
------------------------------------------------------------

With APPROX_COUNT_DISTINCT we got an new Aggregation operation in execution plan “SORT AGGREGATE APPROX”.
As compare to tradition approach, we would need to perform GROUP BY and then AGGREGATION operation to serve COUNT and DISTINCT result.

If for any analysis, we just need an estimate of NDV with acceptable variation from actual values.
APPROX_COUNT_DISTINCT is the function to be used.

About Deepak Mahto

Having 8+ Years of relevant/exciting/fruitful/challenging/learning in Oracle Technology. Currently working as Oracle Performance Consultant in an MNC. Hold Expertise in Performance Engineering, SQL Tuning, Database Tuning, SQL, PLSQL Development/Design Considerations and Oracle Internals. I enjoy reading on Oracle internals, posting some of my finding or observations and playing football.
This entry was posted in Oracle 12c New Features, Oracle Internal, Oracle Performance and tagged , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s