Performance Tuning Based On Index Computer Science Essay

Published Date: 02 Nov 2017

Name: Shailee Dipak Mehta Matriculation Number: A0096013M

Lab2 deals with performance tuning based on index organized tables and clusters.

Lab Setup:

$> cd $HOME/lab

$> cp ~cs5226/lab/lab2.tar .

$> tar xvfp lab2.tar

$> cd lab2

$> sh rename.sh

rename.sh is a shell script that changes the 28th line of init.ora in the correct directory and changes the YOUR_SID variable in this file with the current account id which will be my matriculation number without the last character here.

To start sql client, first command given below was run. Oracle database instance can be created by the second command mentioned below.

$> sqlplus / as sysdba

SQL> startup pfile=init.ora

setup.sql file is run after this which creates a new user named lab2user1 after deleting any such user if any. It grants the user privileges for unlimited space allowance or quota in the tablespace and also rights to connect and create resources in the schema.

SQL> @setup

SQL> set linesize 300;

The set linesize command sets the total number of characters that can be displayed on SQL*Plus screen in one line. Here it is set to 300 characters per line for proper output display after running the set autotrace command.

Question1: Index-organized tables

q1create.sql creates table lab2user1.item. item_id is made the primary key of this table.

SQL> @q1create

Above created table is populated with 10000 rows by q1load.sql.

SQL> @q1load

Set autotrace command is used to determine the estimated cost of the query written in q1query.sql. This command automatically generates a report on the execution path used by SQL optimizer and the statement execution statistics. Successful execution of all SQL DML statements is reported in this report. Various options are available to use with set autotrace command. Here we are using set autotrace on explain command which reports only the optimizer execution path so that we can show the cost estimated for running this query.

Basically, the report generated by this command is divided into 3 parts namely the userâ€™s query output, the optimizer execution path and the SQL statement execution STATISTICS of the execution. We just want to determine the cost estimated for the query and hence set autotrace explain would suffice. We can also use set autotrace on and set autotrace traceonly commands where the former includes all the three parts in the report while the latter omits the userâ€™s output and reports the optimizer execution path and execution STATISTICS. The output by command set autotrace traceonly is shown below.

SQL> set autotrace on explain

SQL> @q1query

301 rows selected.

Execution Plan

---------------------------------------------------------------------------------------------------

Plan hash value: 3354656151

----------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time

------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 301 | 12943 | 9 (0) | 00:00:01 |

|* 1 | TABLE ACCESS FULL| ITEM | 301 | 12943 | 9 (0) | 00:00:01 |

-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------------------------

1 - filter("ITEM_ID">=200 AND "ITEM_ID"<=500)

Note

---------------------------------------------------------------------

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

59 consistent gets

\

0 physical reads

0 redo size

9979 bytes sent via SQL*Net to client

685 bytes received via SQL*Net from client

22 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

301 rows processed

As shown in the above output obtained and explanation given above, the userâ€™s query output is suppressed when traceonly command is used. The output also shows that the cost estimated for running the query q1query.sql is 9.

Autotrace command can be disabled by set autotrace off command .

SQL> set autotrace off

The estimated cost of running q1query.sql can be reduced if we rebuild the table lab2user1.item as an index-organized table. Following steps are followed to rebuild table lab2user1.item table.

Drop table command drops the existing table

SQL> drop table lab2user1.item;

Table dropped.

Second command recreates the table lab2user1.sql as an index-organized table.

SQL> create table lab2user1.item (

item_id number,

item_name varchar2(30),

price number,

CONSTRAINT pk_lab2user1_item PRIMARY KEY (item_id))

ORGANIZATION INDEX

TABLESPACE USERS;

Table created.

/*********************************************************************/

Here as shown above, a table can be rebuilt by adding a line-ORGANIZATION INDEX in the create table command and also naming the index created on the primary key made on item_id.

After this again the table is populated with 10000 rows by running q1load.sql.

SQL> @q1load

Again set autotrace command is used to record the estimated cost after rebuilding the table. Query q1query.sql is then run. Following results are obtained:

SQL> set autotrace traceonly

SQL> @q1query

301 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 667199629

-------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

--------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 301 | 12943 | 2 (0) | 00:00:01 |

|* 1 | INDEX RANGE SCAN| PK_LAB2USER1_ITEM | 301 | 12943 | 2 (0) | 00:00:01 |

--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

----------------------------------------------------------------------

1 - access("ITEM_ID">=200 AND "ITEM_ID"<=500)

Note

--------------------------------------------------------

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

24 consistent gets

0 physical reads

0 redo size

9979 bytes sent via SQL*Net to client

684 bytes received via SQL*Net from client

22 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

301 rows processed

/*********************************************************************/

Normal tables which are not index organized are stored in the database schema as heap-organized tables and so their data are also stored in an unordered heap manner. Creating a table as an index organized table changes its storage organization in the database. Such a table is organized as a variant of a B-tree. All rows in an index-organized table are arranged in a B-tree index structure and sorted by primary key of the table. Thus, this makes a key-based access to such table very fast as compared to normal tables. Range searches and match queries are the most benefited queries from this organization and hence here q1query.sql when run on index-organized lan2user1.item, which has a range query, incurs less cost (2 in this case) as compared to that run on unorganized lab2user1item.This is also because q1query.sql is the range query on the primary key item_id which makes it real fast.

Question2: Clusters

Following script created the table lab2user1.person and lab2user1.person_details with id as primary key of table lab2user1.person and pid as primary key of lab2user1.person_detail. It also populates both the tables with 10000 and 25000 rows respectively. Additionally this script also drops these tables if any of them exist previously and also deletes a cluster named lab2user1.personnel and an index named lab2user1.personnel_index if any exist otherwise gives an error.

SQL> @q2setup

Table created.

PL/SQL procedure successfully completed.

Table created.

PL/SQL procedure successfully completed.

After creating these tables, we need to estimate the cost of running q2query1.sql and q2query2.sql queries on these tables. For this, following steps are followed:

SQL> set autotrace traceonly

SQL> @q2query1

495 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 836964621

------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 100 | 11700 | 52 (4) | 00:00:01 |

|* 1 | HASH JOIN | | 100 | 11700 | 52 (4) | 00:00:01 |

|* 2 | TABLE ACCESS FULL| PERSON | 99 | 3168 | 9 (0) | 00:00:01 |

|* 3 | TABLE ACCESS FULL| PERSON_DETAIL | 106 | 9010 | 42 (3) | 00:00:01 |

------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("PERSON"."ID"="PERSON_DETAIL"."ID")

2 - filter("PERSON"."ID"<100)

3 - filter("PERSON_DETAIL"."ID"<100)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

212 consistent gets

0 physical reads

0 redo size

28399 bytes sent via SQL*Net to client

817 bytes received via SQL*Net from client

34 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

495 rows processed

As shown in the above output obtained the cost estimated for running the query q2query1.sql is 52 for the select statement while 52,9 and 42 are the cost estimations for the HASH JOIN and the range queries in the query.

Similarly the cost of execution for running the query q2query2.sql is determined as follows. This query involves only one table i.e. lab2user1.person.

SQL> set autotrace traceonly

SQL> @q2query2

99 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3595854160

-----------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 99 | 3168 | 9 (0 ) | 00:00:01 |

|* 1 | TABLE ACCESS FULL| PERSON | 99 | 3168 | 9 (0) | 00:00:01 |

-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("PERSON"."ID"<100)

Statistics

----------------------------------------------------------

8 recursive calls

0 db block gets

87 consistent gets

0 physical reads

0 redo size

3349 bytes sent via SQL*Net to client

531 bytes received via SQL*Net from client

8 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

99 rows processed

Autotrace command can be disabled by set autotrace off command .

SQL> set autotrace off

Using Clusters.

Clusters are a group of tables formed to fulfill the purpose of improving the performance of the queries which involves frequent join between two or more tables. Tables in same clusters are stored in the same data block by Oracle and hence the required entries from two or more tables are present in the same data block thus mitigating the cost of additional i/o operation required if they would have been in different data blocks.

Tables which involves referential integrity constraints and are frequently accessed together using joins are most likely to be benefited by an index cluster where one key column as known as cluster key ( say common column in case of tables related via referential integrity constraints ) is used as a key to group multiple rows in a same cluster.

In q2query1.sql tables lab2user1.person and lab2user1.person_detail are accessed using a join on a common column called id which is actually a foriengn key(referential constraint) in table lab2user1.person_detail with lab2user1.person as its parent table. Hence it is advisable here to create a cluster named lab2user1.personnel with id as the cluster key and index on the cluster named lab2user1.personnel_index.

Also to add the two tables in this newly created cluster, tables are first dropped and then added which is shown below.

Changes are made to cluster-setup.sql as shown below.

-- ---------------------------------------------------

-- insert your create cluster statements begin here

-- ---------------------------------------------------

CREATE CLUSTER lab2user1.personnel

(id int)

SIZE 512

STORAGE (initial 100K next 50K);

create index lab2user1.personnel_index on cluster lab2user1.personnel;

drop table lab2user1.person;

drop table lab2user1.person_detail;

create table lab2user1.person (

id int,

name varchar2(30),

gender varchar2(1),

primary key (id))

CLUSTER lab2user1.personnel(id);

create table lab2user1.person_detail (

pid int,

address varchar2(100),

phone varchar2(10),

id int,

primary key (pid))

CLUSTER lab2user1.personnel(id);

-- ---------------------------------------------------

-- insert your create cluster statements end here

-- ---------------------------------------------------

As shown here, create cluster command creates a cluster. Create index command creates the index on the index and the next two commands creates the two tables and add them to the newly created cluster.

After creating the cluster and adding the tables to the cluster, again the set autotrace traceonly command is used to estimate the cost of execution of queries q2query1.sql and q2query2.sql now.

SQL> set autotrace traceonly

SQL> @q2query1

495 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 2722660595

---------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

----------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 8 | 936 | 14 (0) |

| 1 | NESTED LOOPS | | 8 | 936 | 14 (0) |

| 2 | TABLE ACCESS BY INDEX ROWID| PERSON | 99 | 3168 | 10 (0) |

|* 3 | INDEX RANGE SCAN | SYS_C004563 | 99 | | 2 (0) |

| 4 | TABLE ACCESS CLUSTER | PERSON_DETAIL | 1 | 85 | 1 (0) |

|* 5 | INDEX UNIQUE SCAN | PERSONNEL_INDEX | 1 | | 0 (0) |

----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - access("PERSON"."ID"<100)

5 - access("PERSON"."ID"="PERSON_DETAIL"."ID")

filter("PERSON_DETAIL"."ID"<100)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

340 consistent gets

0 physical reads

0 redo size

24079 bytes sent via SQL*Net to client

816 bytes received via SQL*Net from client

34 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

495 rows processed

816 bytes received via SQL*Net from client

34 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

495 rows processed

As shown above the cost of execution reduces to a great extent for q2query1.sql.

SQL> set autotrace traceonly

SQL> @q2query2

99 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1338246414

-----------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-----------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 99 | 3168 | 10 (0) | 00:00:01 |

| 1 | TABLE ACCESS BY INDEX ROWID| PERSON | 99 | 3168 | 10 (0) | 00:00:01 |

|* 2 | INDEX RANGE SCAN | SYS_C004565 | 99 | | 2 (0) | 00:00:01 |

------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("PERSON"."ID"<100)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

93 recursive calls

1 db block gets

121 consistent gets

0 physical reads

256 redo size

5649 bytes sent via SQL*Net to client

530 bytes received via SQL*Net from client

8 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

99 rows processed

816 bytes received via SQL*Net from client

34 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

495 rows processed

Similary when we run the query q2query2.sql after the cluster formation, we get the following results.

Surprisingly, the cost was found to have increased in this case. This is assumed to be owing to the fact that the cluster lab2user1.personnel includes both the tables. However, in q2query2.sql, only lab2user1.person is being accessed with a range query on its primary key i.e id.

Another version of clusters called Hash clusters was also tried and results for both q2quer1.sql and q2query2.sql obtained along with the steps to be followed for creating a hash cluster are as follows:

-- ---------------------------------------------------

-- insert your create cluster statements begin here

-- ---------------------------------------------------

CREATE CLUSTER lab2user1.personnel

(id int)

SIZE 512 hashkeys 100

STORAGE (initial 100K next 50K);

drop table lab2user1.person;

drop table lab2user1.person_detail;

create table lab2user1.person (

id int,

name varchar2(30),

gender varchar2(1),

primary key (id))

CLUSTER lab2user1.personnel(id);

create table lab2user1.person_detail (

pid int,

address varchar2(100),

phone varchar2(10),

id int,

primary key (pid))

CLUSTER lab2user1.personnel(id);

-- ---------------------------------------------------

-- insert your create cluster statements end here

-- --------------------------------------------------

Results for q2query1.sql after this hash clustering is as shown below:

SQL> set autotrace traceonly

SQL> @q2query1

495 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 304469016

--------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

---------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 102 | 11934 | 11 (0) |

| 1 | NESTED LOOPS | | 102 | 11934 | 11 (0) |

| 2 | TABLE ACCESS BY INDEX ROWID| PERSON | 99 | 3168 | 11 (0) |

|* 3 | INDEX RANGE SCAN | SYS_C004574 | 99 | | 2 (0) |

|* 4 | TABLE ACCESS HASH | PERSON_DETAIL | 1 | 85 | |

----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - access("PERSON"."ID"<100)

4 - access("PERSON"."ID"="PERSON_DETAIL"."ID")

filter("PERSON_DETAIL"."ID"<100)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

197 recursive calls

1 db block gets

2753 consistent gets

0 physical reads

256 redo size

24079 bytes sent via SQL*Net to client

816 bytes received via SQL*Net from client

34 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

495 rows processed

Hash clusters are clusters that store the rows with the same hash key values together.Here hashkeys clause as mentioned above specifies to create a hash cluster with the number of hash values allowed being 100 for the cluster. The cost of execution of q2query1.sql is further reduced as shown above.

Similarly output for q2query2.sql after Hash clustering is as shown below:

SQL> set autotrace traceonly

SQL> @q2query2

99 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 709164903

----------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-----------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 99 | 3168 | 10 (0) | 00:00:01 |

| 1 | TABLE ACCESS BY INDEX ROWID| PERSON | 99 | 3168 | 10 (0) | 00:00:01 |

|* 2 | INDEX RANGE SCAN | SYS_C004577 | 99 | | 2 (0) | 00:00:01 |

-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("PERSON"."ID"<100)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

93 recursive calls

1 db block gets

184 consistent gets

0 physical reads

256 redo size

3669 bytes sent via SQL*Net to client

530 bytes received via SQL*Net from client

8 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

99 rows processed

Autotrace command can be disabled by set autotrace off command .

SQL> set autotrace off

Shutdown Oracle Instance:

Oracle instance must be shutdown in the normal circumstances and disconnect from Oracle using the following commands respectively:

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> quit

Our Service Portfolio

Want To Place An Order Quickly?

Then shoot us a message on Whatsapp, WeChat or Gmail. We are available 24/7 to assist you.

Do not panic, you are at the right place

Visit Our essay writting help page to get all the details and guidence on availing our assiatance service.

Get 20% Discount, Now
£19 £14/ Per Page
14 days delivery time

Our writting assistance service is undoubtedly one of the most affordable writting assistance services and we have highly qualified professionls to help you with your work. So what are you waiting for, click below to order now.

Get An Instant Quote

ORDER TODAY!

Our experts are ready to assist you, call us to get a free quote or order now to get succeed in your academics writing.

Get a Free Quote Order Now