Does Distinct operation in oracle return sort data?

When started my learning with Oracle on Pre 10.2 version, I personally use to rely on distinct to do ascending sorting.At that point was not aware of internal concepts, implicit sorting and distinct correlate as per an underscore/internal parameter (post 10 R2).

Let start with basic questions, does below SQL would return sorted Data.
SQL> select distinct col1 from tab_test;

As of now most of us would say NO but again it DEPENDS!

Lets runs the SQL…..
SQL> desc tab_test;
Name Null? Type
------- -------- ---------
COL1 NUMBER

SQL> select distinct col1 from tab_test;

COL1
----------
1
22
25
30
34
...........
82
92
98

100 rows selected.

So we can clearly see, Data return is not sorted but surely distinct.Over here… An internal oracle parameter come into the picture “_gby_hash_aggregation_enabled”!!

Parameter introduced in oracle 10.2.0.1 version.It imposed new implementation for grouping/aggregation and distinct by Efficient Hash Scheme.By default, _gby_hash_aggregation_enabled parameter is set as TRUE.

Let check underlying plan for the above run of SQL.
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------
SQL_ID 8wk25ca7fqazh, child number 0
-------------------------------------
select distinct col1 from tab_test

Plan hash value: 533710857
-----------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | HASH UNIQUE | | 100 |
| 2 | TABLE ACCESS FULL| TAB_TEST | 100 |
-----------------------------------------------

With these new parameter, in the plan we have “HASH UNIQUE”!!.
It makes use of hashing schema and guarantee unique data, but no sorting!.

Now lets try to alter “_gby_hash_aggregation_enabled” parameter as FALSE.

SQL> ALTER SESSION SET "_gby_hash_aggregation_enabled" = False;
Session altered.

SQL> select distinct col1 from tab_test;
COL1
----------
1
2
3
4
...
98
99
100

100 rows selected.

So now it is returning sorted data.! 🙂
It is purely because of the underlying execution plan when parameter is set as FALSE.

PLAN_TABLE_OUTPUT

SQL_ID 8wk25ca7fqazh, child number 1
-------------------------------------
select distinct col1 from tab_test
Plan hash value: 2068745886
-----------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | SORT UNIQUE | | 100 |
| 2 | TABLE ACCESS FULL| TAB_TEST | 100 |
-----------------------------------------------

It returns sorted unique data, because of “SORT UNIQUE” Operations.
Pre Oracle version 10.2.0.1, same was default behaviour. (Distinct –> Sorting)

Internal “_gby_hash_aggregation_enabled” Drives the implementation algorithm (Sort or Hash).
So next time when you use Distinct and require sorted data, do make sure we use ORDER BY Clause.
Instead of relying on internal params or Oracle version (Pre 10.2.0.1)!

Now let use both ORDER BY and DISTINCT and run the SQL with _gby_hash_aggregation_enabled as TRUE and check plan.

SQL>select distinct col1 from tab_test order by col1;

Let check plan.
Plan hash value: 2068745886
-------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |
-------------------------------------------------------
| 0 | SELECT STATEMENT | | | |
| 1 | SORT UNIQUE | | 100 | 300 |
| 2 | TABLE ACCESS FULL| TAB_TEST | 100 | 300 |
-------------------------------------------------------
(same plan applicable for order by col desc also)

Though _gby_hash_aggregation_enabled was set as default (TRUE), here our friend Optimizer has used SORT UNIQUE, which will perform distinct and sorting both!!

So internally Optimizer switch between “hash unique” and “sort unique” on what fits best!!

All above context are best as per my knowledge, but always verify and validate anything before blindly trusting!! 🙂

About Deepak Mahto

Having 8+ Years of relevant/exciting/fruitful/challenging/learning in Oracle Technology. Currently working as Oracle Performance Consultant in an MNC. Hold Expertise in Performance Engineering, SQL Tuning, Database Tuning, SQL, PLSQL Development/Design Considerations and Oracle Internals. I enjoy reading on Oracle internals, posting some of my finding or observations and playing football.
This entry was posted in Oracle Internal and tagged , , , , , , . Bookmark the permalink.

3 Responses to Does Distinct operation in oracle return sort data?

  1. gmesquitax says:

    Thank you for sharing. Just found about that today, after a lot of trouble identifying performance issues with a database that moved to Amazon RDS. By the way, RDS does not support underscore parameteres, so only apparent solution for databases inside RDS is query rewrite avoiding DISTINCT clause, or hint : /*+ NO_USE_HASH_AGGREGATION *

    Like

  2. Nice Blog, thanks for sharing this kind of information.

    Like

  3. Pingback: Analytical Function order of processing within SQL. | Oracle Insight and Performance concepts

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s