Home » SQL & PL/SQL » SQL & PL/SQL » The cost of FILTER operations (DB 19.x)
The cost of FILTER operations [message #680557] Thu, 21 May 2020 04:47 Go to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
A FILTER operation with more than one child is iterative, and therefore a Bad Thing. Example:
orclz> select ename from emp where deptno in (select deptno from dept where dname='ACCOUNTING') or ename in (select ename from bonus);

ENAME
----------
CLARK
KING
MILLER

orclz> select * from table(dbms_xplan.display_cursor(format=>'allstats last cost'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  7rhqqypqsp1uq, child number 0
-------------------------------------
select ename from emp where deptno in (select deptno from dept where
dname='ACCOUNTING') or ename in (select ename from bonus)

Plan hash value: 1445856646

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |        |     6 (100)|      3 |00:00:00.01 |      91 |
|*  1 |  FILTER                      |         |      1 |        |            |      3 |00:00:00.01 |      91 |
|   2 |   TABLE ACCESS FULL          | EMP     |      1 |     14 |     3   (0)|     14 |00:00:00.01 |       8 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| DEPT    |      3 |      1 |     1   (0)|      1 |00:00:00.01 |       6 |
|*  4 |    INDEX UNIQUE SCAN         | PK_DEPT |      3 |      1 |     0   (0)|      3 |00:00:00.01 |       3 |
|*  5 |   TABLE ACCESS FULL          | BONUS   |     11 |      1 |     2   (0)|      0 |00:00:00.01 |      77 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(( IS NOT NULL OR  IS NOT NULL))
   3 - filter("DNAME"='ACCOUNTING')
   4 - access("DEPTNO"=:B1)
   5 - filter("ENAME"=:B1)


26 rows selected.

orclz>
In principle, every row returned by the first child will call the other chidren. I say "in principle" because there are some optimizations. In this case, the first filtering subquery is called only 3 times because there are only 3 distinct values of dept_id in emp and the CBO caches the result for each one. The second filtering subquery is called only for the 11 emp rows that have not already been selected by the first filter.

First question: How is the cost calculated? I would have expected it to be 3 + (3*1) plus (11*2) = 28, assuming that the stat's are good enough for the CBO to guess the number of iterations. It seems to be assuming no iterations at all.
Second: approaches for tuning. An obvious one is a compound query, such as
orclz> select ename from emp join dept using(deptno) where dname='ACCOUNTING'
  2  union
  3  select ename from emp join bonus using(ename);

ENAME
----------
CLARK
KING
MILLER

orclz> select * from table(dbms_xplan.display_cursor(format=>'allstats last cost'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  74b0dzp192gsp, child number 0
-------------------------------------
select ename from emp join dept using(deptno) where dname='ACCOUNTING'
union select ename from emp join bonus using(ename)

Plan hash value: 2067608535

----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |      1 |        |     9 (100)|      3 |00:00:00.01 |      14 |       |       |          |
|   1 |  SORT UNIQUE                   |           |      1 |      6 |     9  (23)|      3 |00:00:00.01 |      14 |  2048 |  2048 | 2048  (0)|
|   2 |   UNION-ALL                    |           |      1 |        |            |      3 |00:00:00.01 |      14 |       |       |          |
|   3 |    MERGE JOIN                  |           |      1 |      5 |     6  (17)|      3 |00:00:00.01 |       8 |       |       |          |
|*  4 |     TABLE ACCESS BY INDEX ROWID| DEPT      |      1 |      1 |     2   (0)|      1 |00:00:00.01 |       2 |       |       |          |
|   5 |      INDEX FULL SCAN           | PK_DEPT   |      1 |      4 |     1   (0)|      4 |00:00:00.01 |       1 |       |       |          |
|*  6 |     SORT JOIN                  |           |      1 |     14 |     4  (25)|      3 |00:00:00.01 |       6 |  2048 |  2048 | 2048  (0)|
|   7 |      TABLE ACCESS FULL         | EMP       |      1 |     14 |     3   (0)|     14 |00:00:00.01 |       6 |       |       |          |
|   8 |    NESTED LOOPS SEMI           |           |      1 |      1 |     2   (0)|      0 |00:00:00.01 |       6 |       |       |          |
|   9 |     TABLE ACCESS FULL          | BONUS     |      1 |      1 |     2   (0)|      0 |00:00:00.01 |       6 |       |       |          |
|* 10 |     INDEX RANGE SCAN           | EMP_ENAME |      0 |     14 |     0   (0)|      0 |00:00:00.01 |       0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("DEPT"."DNAME"='ACCOUNTING')
   6 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
       filter("EMP"."DEPTNO"="DEPT"."DEPTNO")
  10 - access("EMP"."ENAME"="BONUS"."ENAME")


31 rows selected.

orclz>
The cost arithmetic makes perfect sense for this one. Any better options?

(This is not a real problem, just general research)
Re: The cost of FILTER operations [message #680558 is a reply to message #680557] Thu, 21 May 2020 05:46 Go to previous message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
First question: How is the cost calculated?
A trace 10053 (level 1) will tell you how the optimizer computes the plan.

Note that the optimizer will automatically change the OR to a UNION [ALL] if its computation shows it has a lower cost.

Previous Topic: driving_site hint
Next Topic: JSON Column for A Single Element with Potentially Multiple Values?
Goto Forum:
  


Current Time: Thu Mar 28 05:03:59 CDT 2024