DB2 Magazine: Face Time With Paul Zikopoulos

Face Time With Martin Hubel

DB2 Magazine reader: I have an application that queries and writes to 11 normalized tables. While performance testing, I noticed that when I put all of the tablespaces used by these tables in the same 32K buffer pool, the application ran much faster than when I allocated separate buffer pools. Each table is in its own table space. Why is it faster, and do you recommend setting up tables/tablespaces this way?

Hubel: Your findings are similar to many experiences I've had and heard about. It would appear that the buffer pool search algorithm favors a single buffer pool over multiple buffer pools.

IBM now recommends a single DB2 for Linux, Unix, Windows buffer pool for query workloads. For transaction workloads, IBM still recommends separating tables from indexes. However, this approach seems contrary to your experience.

It would also be appropriate in DB2 v.8.1 to consider using block-based buffers by allocating NUMBLOCKPAGES and BLOCKSIZE for the buffer pool. This will allow separation of synchronous and asynchronous I/O with the buffer pool.

DB2 Magazine reader: Is there a way in DB2 to specify that data in a column should be stored in a compressed format? We have a lot of XML CLOB columns that aren't accessed much. It would be great to have DB2 compress these automatically to save space.

Hubel: I heard this question a few times at the IDUG conference in Orlando earlier this month. Data compression is a known requirement that appears to have some priority. On DB2 for z/OS, hardware data compression can be used, as the platform is IBM. However, in DB2 for Linux, Unix, and Windows, no vendor provides this hardware feature.

Some work on compression has been done, including the compression of nulls and system default values in DB2 UDB v.8.1 and the option for compression on the BACKUP DATABASE command in Stinger.

The use of the COMPACT option when creating CLOB columns may also provide some relief, although this only saves space at the end of the file.

DB2 Magazine reader: What is an effective way to monitor tempspace? When tempspace (I'm using SMS) reaches a certain threshold, I'd like to be able to determine how much tempspace each application is consuming and decide which application to force before tempspace runs out entirely. I'm running a very large data warehouse where ad hoc queries are the norm. Currently, I have to force all applications to get out of a tempspace full condition. This isn't ideal.

Hubel: There is no direct way to capture monitoring data for temporary tablespaces, but my friend Phil Nelson and I played with this to come up with a suggestion we hope is helpful. There might be other data that would be useful for a postmortem review in events, but to identify the active user, we found we had to use snapshot data. We looked at tablespace, tablespace containers, and OS file information, but the only data that contains both the userid and an indication of the activity is the table snapshot.

On the systems I checked, the snapshot of temporary tables shows the schema as follows:


Table Schema = <94><PHILIP>
Table Name = TEMP (00001,00002)
Table Type = Temporary
Data Object Pages = 11993
Rows Read = 12
Rows Written = 71824
Overflows = 0
Page Reorgs = 0

The table schema shows the application handle followed by the userid (in this case, "<94><PHILIP>"). Based on the amount of activity, you may choose to force only a particular application — rather than all users — by specifying the application handle in the force command.

DB2 Magazine reader: Have you ever seen a DB2 sparse index implemented via RowIDs (direct row access)? How popular is direct row access, and what is your experience with it?

Hubel: It sounds like you might be a recent convert to DB2 from Oracle. When first learning DB2, Oracle users often look for a correlation to what they know. Two questions common among new DB2 converts relate to direct row access and providing hints to the optimizer. Neither is necessary in a DB2 environment.

There is good news and bad news on the RowID. The bad news is that RowIDs don't exist in a usable format. This allows DB2 to adhere to relational database principles, in which physical pointers to data shouldn't exist for application use.

The good news is there's no need for RowIDs at the application level. Before DB2 for z/OS v.8, a ROWID column was used to link to LOB (large object) columns. This column is now "hidden" and is not visible via a SELECT * on the table. If you need to point to a specific row, the best way is to use the primary key in your WHERE clause.

Note: The comments and responses included in this forum and represent the views of the author only. Neither IBM nor the author is liable for its contents.

CAREER CENTER
Ready to take that job and shove it?
SEARCH JOBS
RECENT JOB POSTINGS
CAREER NEWS
10 Search Engines You Don't Know About
Go beyond Google and get vertical. These specialized search sites will help you find the business information you need -- fast.

Subscribe to the new digital version of IBM Database Magazine
New Digital Version

Sponsored links:



Subscribe to the IBM Database Magazine Newsletter

Email Address *
First Name
Last Name
HTML Preference
HTML Text
 

Fields with * are required.

 




Visit these other IBM and TechWeb Partner Sites: :
Maximizing ROI Through Business Process Management (BPM) and Service-Oriented Architecture (SOA)
Internet Evolution – The Macrosite for News, Analysis, & Opinion About the Future of the Internet
Business Innovation – Technology Strategies and Solutions for Driving Business Success