Logical Partitioning of a Database: A New Way of Thinking about Enhancing Performance

Dr. Paul Dorsey

Dulcian, Inc.

May 3, 2002

 

 

Overview

A number of months ago, I was asked to review a technology company’s product strategy. The company, InfoCruiser, was developing a main memory database that could run either with Oracle as the core engine or with their own proprietary engine.

 

Their idea was to replicate a portion of the database in a memory structure and use a dedicated DBMS to support queries to that portion of the database. At the time, I was not terribly impressed by the idea. Why not simply pin tables into memory or place them on a RAM disk? After all, the cost of RAM disks has recently dropped significantly. With the advent of Oracle 9i’s Real Application Cluster (RAC) ability to partition workload across multiple machines, does InfoCruiser’s approach make sense?

 

I took the opportunity to review the company’s product, dbCruiser, and observed some good performance statistics from their system tests. I had to ask myself: Did InfoCruiser have a great idea? Is partitioning the database based on its usage patterns something worth doing?

 

At its root, the idea behind the dbCruiser technology is system partitioning, which is the core idea behind many performance tuning strategies. The available partitioning strategies currently include:

·         Pinning tables into memory

·         Striping tables across disks

·         Placing database objects into separate tablespaces on separate disk drives

·         Oracle 9i’s RAC technology

 

None of these techniques can be tied very closely to how tables are used in the system. However, there is a partitioning based strategy that is based on system usage. We have been using it for years.  We call it “data warehousing.”

 

Data Warehousing as a Performance Tuning Strategy

Data warehousing is not often thought of as a performance tuning strategy; but it is a classic partitioning strategy. Using a data warehouse, a subset of the OLTP database is massaged, denormalized and placed into a separate database. Some, if not all, of the reporting activity is offloaded to the warehouse.

 

In data warehouses, a copy is made of a large portion (perhaps all) of the system’s data and placed on a separate database server.  This removes the reporting activity from the main system as well as allowing for a warehouse that is tuned to specifically support the reporting requirements without having to worry about any OLTP considerations.

A traditional warehouse structure is shown in Figure 1. As you can see from the diagram, applications are partitioned between the core system and the warehouse.

 

 

 

Figure 1: Data Warehouse Architecture

 

 

Advantages

It is well-established that data warehousing leads to improvements in overall system performance. Not only do reports run much faster because they are running on a database explicitly tuned for this purpose, but also because the workload has been partitioned. All of the transaction processing takes place in the core database, while all of the reporting takes place in the data warehouse. Using this strategy means that two independent machines are responsible for the entire system’s activity.

 

By creating a data warehouse, overall system performance is improved. The warehouse structure is designed explicitly to support the reports that will be run against it. Enforcement of business rules is not important as the data is being loaded from a clean source.  Data elements that do not need to be queried are not included, and the system parameters are set to optimize the query performance.

 

Disadvantages

Using a data warehouse to improve performance does not improve performance/usability without costs:

  1. There is the additional overhead of periodic refreshing of the data warehouse.
  2. The data warehouse usually contains stale data. It is only as current as its last refresh.
  3. The data warehouse is a separate system that must be designed and maintained independently from the OLTP.  It is usually created using different indexing, and structure from the core OLTP system.
  4. Since the structure is independent from the OLTP, applications have to be written explicitly to run against the warehouse or the OLTP.  Even though largely the same information resides in both structures, applications rarely take advantage of this fact. Any applications to be run that pull data from both the core database and data warehouse run the risk of working with inconsistent data.

 

Expanding the data warehousing strategy for use in OLTP systems

The database architecture of most applications relies on a single database system to meet both read (queries) and write (updates, inserts, deletes) requirements.  This means that the DBA must tune the system to balance the needs of both kinds of transactions.  Just as an example, consider indexing. For an OLTP system with a heavy insert, update and delete load, every index added means an additional index that must be maintained. The more indexes there are, the slower the inserts, updates and deletes will execute.  To tune a system for a high transaction volume, you want to minimize the number of indexes.  However, for a “query mostly” system, just the opposite is true.  More indexes mean more options for tuning your query.  The more indexes, the better. 

 

The role of the DBA in tuning the system is to balance these two conflicting types of tuning.  Therefore, the DBA is always making tradeoffs in tuning a system. The system must be tuned to balance the needs of inserts, updates and deletes against the needs of query access. Frequently, it is very difficult to tune a system to support both kinds of activities. As data volume increases, the complexity of system increases, and as the number of users increases, every system will experience performance degradation.

 

The idea behind using a data warehouse can also be applied to performance tuning of an OLTP system using data partitioning.

 

Data Partitioning

Not all tables are the same. Tables fall into several categories:

a)      Reference tables that are small, and not volatile (e.g. Status and Type tables)

b)      Volatile, core data tables that are very large and highly volatile (e.g. Sales, Purchase Order)

c)      Access tables that are medium in size, relatively non-volatile but are highly accessed (e.g. product catalog)

 

The access table portion of the system is relatively small. This presents an opportunity to partition the system by examining usage patterns. Is there something different that can be done with the less volatile, less frequently accessed tables?

 

The optimal solution is to divide the database into two areas:

·         Access table area (lower volatility, high level-access)

·         Volatile table area (higher volatility, lower level access)

 

Relevant reference tables can be duplicated and placed in memory in each area.

 

For the volatile table area, Oracle technology is perfectly suited to handle this using traditional performance tuning techniques.

 

Take the tables that are heavily queried (but not heavily updated) and place them in a separate database. Utilities can be written to automatically decide what tables to copy into this structure to minimize the maintenance overhead.

 

This “query mostly” database would be tuned differently, probably using RAM disks and an optimized temporary tablespace.

 

Then all of the queries can be automatically intercepted and a decision made about which database to use to execute the queries. This would be done either by writing a low level routine or by taking advantage of Oracle’s materialized views to force the queries into the new system.

 

This environment would achieve the same kind of performance benefits as a data warehousing approach. The queries would run much faster since they are located on a dedicated machine and the queries have been offloaded  from the main OLTP system so the rest of the DML will run more quickly.

 

The proposed architecture is shown below in Fig 2. 

 

 

Figure 2: Partitioned Database Architecture

 

 

Implementation

This idea of partitioning of the query data can be implemented in various ways.

 

  1. Place data on faster media -- The access area could be placed onto a RAM disk, a very fast disk drive, or pinned into memory where they can be more rapidly retrieved. This alternative does not involve placing the data in a separate database but is included here for comparison purposes.
  2. Support access in a separate Oracle instance – Specifically tune the DBMS to support query access.  Place tables in as fast a media as possible.
  3. Support access area with a memory based DBMS -- Use a special database engine written specifically to support the access area residing in main memory.

 

Each strategy will be discussed in turn.

Strategy 1: Place data on faster media

The first strategy is a simple solution both from a technical and DBA perspective. It involves identifying tables that would best benefit by being in a high-performance area and moving them to that faster medium. Using this strategy eliminates the concern about placing a table in the wrong area since the cost of placing a table incorrectly is very low. The only disadvantage to this solution is that only modest performance gains can be achieved with this approach.

 

In most cases, half of all of the CPU cycles in a system are spent in the access area (based on examination of several internal systems).  If a typical system is assumed, then half of the database activity could be moved to a dedicated access area.

 

Using a Strategy 1 implementation, moving the access area into main memory would double the performance of the access area SQL.  There would be no effect on the volatile area since the same processor would still be supporting both areas.  This would lead to approximately a 30% improvement in throughput.

 

Note that the user perception of these performance benefits would be less dramatic because network traffic is unaffected.

Strategy 2: Partition the  system using multiple Oracle databases

In the second strategy, all access area SQL is handled with in its own database or a dedicated computer. This strategy is obviously a much more aggressive approach to the problem.

 

There are two possible methods to support this strategy:

 

a)      Partition the databases - Place the volatile area in one instance and the access area in a second instance. The databases communicate using some type of transfer protocol or database links so that the two areas can access each other for queries that span the two systems. This is a very difficult strategy to implement.

 

b)      Store the entire database physically in one place and replicate the access area in a second instance. The problem of keeping the access partition synchronized remains; however, the low volatility of the access area makes this a straightforward problem to solve.

 

This second strategy has several advantages:

·         Queries to the access area will run significantly faster than they would in a traditional system. Since the access area is smaller, it can probably reside entirely in memory (either on a RAM disk or in the SGA). As this is a “query mostly” database, it can be tuned to optimize query access.

·         Offloading a significant percentage of the work in the volatile area will increase throughput there as well.

 

Implementation of this strategy would require preprocessing of all SQL entering the system to intercept and reroute it to either the access or volatile areas. Returned results must be synchronized.

 

An efficient method of keeping the two copies of the access area in synch with each other is critical. Depending upon the application, either periodic refreshes of the access area or two-face commits would be necessary.  Materialized views may be used to support this strategy.

 

Using a Strategy 2 implementation results in at least the same performance improvements for the access area SQL as in Strategy 1.  However, removing all of the access area SQL from the volatile area would decrease the load on the first server by 50%.  Some overhead will be necessary for query routing, but this is not significant.  Depending upon how the load balancing works out, this should increase overall system performance by at least 50%.

Strategy 3: Support the access area with a memory based DBMS

This strategy is comprised of the Strategy 2 approach with the addition of a separate non-Oracle database to support the access area. The idea is that Oracle is a well developed product for OLTP systems, but it is not specifically designed to operate optimally in a main memory only environment.  There have been tests of products that have attempted to outperform Oracle in the main memory, query-only environment.  Indications are that such a system could outperform Oracle by about 10 to 1. 

 

The advantage is that the access area would be able to run very quickly.  The disadvantage is that this technology is still evolving and is not ready for production usage as of this writing.

 

Using Strategy 3, the highest benefits can be realized. As mentioned before, main memory databases can outperform an Oracle database by a factor of 10 to 1 on query performance. This will effectively drop the cost of query execution to near zero.  Hence the performance improvement in the entire system will be 75% or higher.

 

 

Comparison of this approach to other OLTP Scalability Solutions

Of course, we can always resort to the easiest tuning solution of all: “Throw a bigger, faster computer at the problem.”  In organizations with infinite money, this is a viable alternative.

 

Within Oracle, the database activity can be partitioned using a parallel query option and Real Application Clustering (RAC), where the workload is spread across various processors or machines.  The impact of these strategies is unknown.  

 

All of these techniques involve significant costs, whether for hardware, software or expertise to create the necessary structures and setup. What is needed is a way to provide substantial performance improvements without the need for professional expertise or large hardware/software expenditures.

 

With the proposed architecture, the idea is to take into account how the tables (objects) in the system are used as the basis for creating database subsets. Not all tables are used in the same way and handling these tables differently can drastically improve overall system performance.

 

Issues in applying the architecture in OLTP

Separating query traffic from transactions in an OLTP system is a challenging problem.  Proper implementation requires the following:

1. Data Partitioning

It is important to identify the correct tables to place in the “query mostly” partition.  Ideally, this should be automated by a utility that identifies good candidate tables/views.

 

2. Core Database tuning

If most of the query traffic is removed from the database, then standard tuning logic will have to be modified.  The activities in the database will have a much lower percentage of queries and a much higher percentage of Inserts, Updates and Deletes. This will greatly influence the appropriate tuning strategies.

 

3. “Query Mostly” Database Engine Optimization

The query database will be tuned like a normal data warehouse. The only difference will be that the queries experienced by the database would be unusual from a warehouse perspective. But that should not greatly affect the tuning strategy.

4. Query Routing

Queries must be routed to the appropriate part of the system.  This can either be done using a custom query router or, if materialized views are used as the implementation structure, then Oracle can be tricked into acting as the query router.

5. Data Synchronization

The query-mostly area will need to be kept in synch with the core database.  There may be some ability to use stale data, but each system will have a different tolerance for stale data. 

 

Conclusion

System performance may be significantly enhanced through partitioning the tables in the system based upon usage patterns. There is an opportunity to move tables that are heavily queried but not heavily updated to a “query-mostly” database. This partitioning results in a dramatic improvement in overall system performance.