Partitioning tables is a powerful feature in Oracle Database that can significantly enhance performance and manageability for large datasets. By dividing a table into smaller, more manageable pieces, you can improve query performance, simplify maintenance tasks, and better manage data lifecycle. In this blog post, we’ll explore the upsides and downsides of partitioned tables, and provide a hands-on lab to help you get started with table partitioning in Oracle Database.
What are Partitioned Tables?
Partitioned tables are tables that are divided into smaller, more manageable segments called partitions. Each partition can be managed and accessed independently, but together they form a single logical table. Oracle Database supports several types of partitioning methods, including range, list, and hash partitioning.
Partitioning allows DBAs to split large tables into more manageable sub-tables, which improves database performance, data availability, and manageability.
Hands-On Lab: Implementing Partitioned Tables
In this lab, we’ll demonstrate the benefits of partitioning by creating two versions of the same 'sales'
table. One table will be a standard non-partitioned table 'sales-np'
, while the other, named 'sales'
, will use range partitioning based on a sales_date
colum . We will insert one million rows into each table, perform queries, and observe the impact of partition pruning on query performance.
Let’s start by creating both tables. It worth noticing that we are allocating each partition into a different tablespace. That will be useful for our next post on backing up partitions.
-- Create the non-partitioned table. CREATE TABLE sales_np ( sale_id NUMBER PRIMARY KEY, product_id NUMBER, customer_id NUMBER, sales_date DATE, amount NUMBER ); -- Create tablespaces for partitions. CREATE TABLESPACE ts21 DATAFILE './tss21.dbf' SIZE 200M AUTOEXTEND on; CREATE TABLESPACE ts22 DATAFILE './tss22.dbf' SIZE 200M AUTOEXTEND on; CREATE TABLESPACE ts23 DATAFILE './tss23.dbf' SIZE 200M AUTOEXTEND on; CREATE TABLESPACE ts24 DATAFILE './tss24.dbf' SIZE 200M AUTOEXTEND on; CREATE TABLESPACE tsxx DATAFILE './tssxx.dbf' SIZE 200M AUTOEXTEND on; -- Create the partitioned table allocating each partition to its tablespace. CREATE TABLE sales ( sale_id NUMBER PRIMARY KEY, product_id NUMBER, customer_id NUMBER, sales_date DATE, amount NUMBER ) PARTITION BY RANGE (sales_date) ( PARTITION p21 VALUES LESS THAN (TO_DATE('2021-01-01', 'YYYY-MM-DD')) TABLESPACE ts21, PARTITION p22 VALUES LESS THAN (TO_DATE('2022-01-01', 'YYYY-MM-DD')) TABLESPACE ts22, PARTITION p23 VALUES LESS THAN (TO_DATE('2023-01-01', 'YYYY-MM-DD')) TABLESPACE ts23, PARTITION p24 VALUES LESS THAN (TO_DATE('2024-01-01', 'YYYY-MM-DD')) TABLESPACE ts24, PARTITION pxx VALUES LESS THAN (MAXVALUE) TABLESPACE tsxx );
At any time, we can add a new partition to handle data for future years.
ALTER TABLE sales ADD PARTITION p24 VALUES LESS THAN (TO_DATE('2025-01-01', 'YYYY-MM-DD'));
Now, we use the following PL/SQL block to insert one million rows into each table. This example uses random values for demonstration purposes.
CREATE SEQUENCE sales_seq START WITH 1 INCREMENT BY 1 NOCACHE NOCYCLE; DECLARE v_sale_id NUMBER; v_product_id NUMBER; v_customer_id NUMBER; v_sales_date DATE; v_amount NUMBER; v_start_date DATE := TO_DATE('2020-01-01', 'YYYY-MM-DD'); v_end_date DATE := SYSDATE; v_total_rows NUMBER := 1000000; BEGIN FOR i IN 1..v_total_rows LOOP v_sale_id := sales_seq.NEXTVAL; v_product_id := trunc(DBMS_RANDOM.value(1, 1000)); v_customer_id := trunc(DBMS_RANDOM.value(1, 10000)); v_sales_date := TO_DATE(TRUNC(DBMS_RANDOM.VALUE(TO_CHAR(v_start_date,'J') ,TO_CHAR(v_end_date,'J'))),'J'); v_amount := ROUND(DBMS_RANDOM.value(1, 1000), 2); INSERT /* APPEND */ INTO sales (sale_id, product_id, customer_id, sales_date, amount) VALUES (v_sale_id, v_product_id, v_customer_id, v_sales_date, v_amount); INSERT /* APPEND */ INTO sales_np (sale_id, product_id, customer_id, sales_date, amount) VALUES (v_sale_id, v_product_id, v_customer_id, v_sales_date, v_amount); END LOOP; COMMIT; END; /
The purpose of the /* APPEND */
hint in the INSERT statements is to instruct the database to use direct-path insert operations, which can improve performance by bypassing the traditional insert method that involves writing to the table’s buffer cache. Instead, the direct-path insert writes directly to the data files, allowing for faster data loading, especially in bulk insert operations. This hint is particularly useful in data warehousing environments or situations where large volumes of data need to be inserted quickly.
Oracle’s SQL Tuning Guide: Influencing the Optimizer
About direct path load
Direct path load always loads above the high water mark, since it is formatting and writing blocks directly to disk – it cannot reuse any existing space.
You cannot query a table after direct pathing into it until you commit.
It’s time to see partition pruning in action running the EXPLAIN PLAN
command:
SQL> explain plan for select sum(amount) total, count(*) count from sales where sales_date between to_date('2022-01-01', 'YYYY-MM-DD') and to_date('2022-21-31', 'YYYY-MM-DD'); Explained. SQL> select * from table(dbms_xplan.display); ---------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 22 | 820 (1)| 00:00:01 | | | | 1 | SORT AGGREGATE | | 1 | 22 | | | | | |* 2 | FILTER | | | | | | | | | 3 | PARTITION RANGE ITERATOR| | 1 | 22 | 820 (1)| 00:00:01 | 3 | KEY | |* 4 | TABLE ACCESS FULL | SALES | 1 | 22 | 820 (1)| 00:00:01 | 3 | KEY | ---------------------------------------------------------------------------------------------------- SQL> explain plan for select sum(amount) total, count(*) count from sales_np where sales_date between to_date('2022-01-01', 'YYYY-MM-DD') and to_date('2022-21-31', 'YYYY-MM-DD'); Explained. SQL> select * from table(dbms_xplan.display); -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 22 | 1161 (1)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 22 | | | |* 2 | FILTER | | | | | | |* 3 | TABLE ACCESS FULL| SALES_NP | 1 | 22 | 1161 (1)| 00:00:01 | --------------------------------------------------------------------------------
The key differences between the partitioned and non-partitioned table explain plans are evident in the cost and the operations performed. The partitioned table has a lower cost (820) compared to the non-partitioned table (1161), indicating better performance due to partition pruning. The partitioned table uses a PARTITION RANGE ITERATOR
operation, which efficiently scans only the relevant partitions based on the sales_date
filter condition, reducing the amount of data processed. In contrast, the non-partitioned table performs a full table scan, accessing all rows in the SALES_NP
table, which is less efficient. These differences highlight the performance benefits of partitioning, particularly for large datasets.
Licensing caveat
There is a license caveat to be aware of. Oracle Partitioning is an additional cost option available only for Enterprise Edition databases, meaning it is not included in the standard licensing fees and requires a separate purchase. Before implementing partitioning in your database environment, it is crucial to check your current license rights and ensure that you are authorized for this option to avoid any compliance issues or unexpected costs. Properly understanding and managing your licensing agreements can prevent potential legal and financial repercussions.
To wrap up
Partitioned tables offer significant advantages in terms of performance, manageability, and data life-cycle management. However, they also introduce complexity and require careful planning. By following this hands-on lab, you should have a good understanding of how to implement and manage partitioned tables in Oracle Database. Experiment with different partitioning strategies to find the best fit for your data and workload requirements.
Additional Resources
Was this post helpful? Consider subscribing and leaving a comment.