Q and A
Asked and Answered
Can you have too many DB2 package collections?
Hi Robert,
I have spent the past month reworking how our shop uses plans, packages, and collections. When the developers put together the guidelines for this a number of years ago, they decided that each online CICS region would use a plan containing a single collection that holds all the packages needed for that region. We use the database segmentation you mention, so each customer has his or her own region and qualifier. We ended up with a single plan for each customer referencing a single collection for the approximately 1,000 online packages. We have 20 major customers, 20 online plans, and 20 collections. That setup has been working fine for years with no major performance problems.
For some reason, the developers took the opposite approach with batch programs. Each batch job is set up as a separate plan listing all the collections that contain the packages it needs. But then they bound each package into its own collection rather than actually grouping anything. So each plan's package list contains multiple collections, each of which holds a single package. The batch world now has 20 (customers/qualifiers) x 500 plans, each referencing multiple collections, and 20 x 825 (packages) for 16,500 collections, each with a single package. This means that each package list must name every individual collection the job needs, and it has to be modified any time a module adds a call to another program.
I am trying to convince the developers that, for the most part, we need only a single batch Collid for each customer, with all the packages bound to that customer's batch collection. The package list would need to contain only that single Collid and would not have to change simply because a call is added to a program -- as long as the called package is also bound to the group collection. If necessary, they could have a separate collection for the most frequently called packages and list it first in the package list to improve access. That said, since we haven't had a performance problem finding the correct package among the 1,000 packages in the single online collection, I don't anticipate a problem with fewer batch packages in a single batch collection. In effect, we would go from 16,500 collections of one package each to 20 or 40 collections with up to 825 packages in each.
Do you have any comments on this?
Mark Labby
Harrisburg, PA
Robert Catterall responds:
I absolutely agree that you should consolidate collections in your batch environment, and I see one collection per customer (one per database instance) as being a reasonable choice. A key advantage, as you've pointed out, would be greatly simplified plan maintenance. Rebinding a plan every time you add a package (required when every new package goes into a new collection) is not something I'd want to do. With one collection per customer, you'd bind a batch plan once and wouldn't have to do a plan rebind just to add another package to the customer collection.
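The maintenance difference can be illustrated with a minimal Python sketch. This is a toy model, not DB2 code: a plan's package list is represented as a list of collection IDs, a collection as a set of package names, and every name here (`bind_package`, `COLL_PGMA`, `CUST01_BATCH`, and so on) is hypothetical. It assumes a plan can reach any package in a collection that already appears in its package list.

```python
def bind_package(pklist, collections, package, target_collection):
    """Bind `package` into `target_collection` (toy model).

    Returns True if the plan's package list had to change, which in this
    model stands for "the plan must be rebound".
    """
    collections.setdefault(target_collection, set()).add(package)
    if target_collection in pklist:
        # The collection is already listed, so the plan is untouched.
        return False
    # A brand-new collection must be added to the plan's package list.
    pklist.append(target_collection)
    return True


# Layout 1: one collection per package (hypothetical names).
per_package_colls = {"COLL_PGMA": {"PGMA"}, "COLL_PGMB": {"PGMB"}}
per_package_pklist = ["COLL_PGMA", "COLL_PGMB"]
rebind_per_package = bind_package(
    per_package_pklist, per_package_colls, "PGMC", "COLL_PGMC"
)

# Layout 2: one batch collection per customer (hypothetical names).
per_customer_colls = {"CUST01_BATCH": {"PGMA", "PGMB"}}
per_customer_pklist = ["CUST01_BATCH"]
rebind_per_customer = bind_package(
    per_customer_pklist, per_customer_colls, "PGMC", "CUST01_BATCH"
)

print(rebind_per_package, rebind_per_customer)
```

In this model, adding package `PGMC` forces a package-list change only in the one-collection-per-package layout; the per-customer plan is left as it was.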
Aside from the benefit of simplified plan maintenance, would the shift to one collection per customer from one per package buy (or cost) you anything in terms of application performance? That depends. If the batch programs use SET CURRENT PACKAGESET to direct DB2 to the collection holding a package to be executed, going to one collection per plan versus many would probably not have a performance impact -- either way, you're pretty much going directly to the package (don't worry about having a lot of packages in a collection; there is, in effect, a unique index that DB2 uses to quickly locate a package in a collection). If, on the other hand, the batch jobs do not use SET CURRENT PACKAGESET, going to one collection per plan versus many might improve performance because in the latter case (many collections in the package list) DB2, in the absence of a collection name in the CURRENT PACKAGESET special register, will have to search through the collections listed, in the order listed, until the package is found. If the desired package is towards the end of a long list, the search would add some overhead to program execution that I'd just as soon avoid. A single collection per plan would eliminate that overhead, and while the cost savings might be relatively small as a percentage of total batch job run time, why not make your DB2 program execution environment as efficient as it can be?
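The search-order argument can be made concrete with a simplified Python model. It is a sketch under stated assumptions, not a description of DB2 internals: the function `find_package`, the collection names, and the package names are all hypothetical, each package-list entry is treated as "any package in this collection", and a set membership test stands in for the indexed lookup within a collection. The model counts how many collections are probed before the package is found.

```python
def find_package(package, pklist, collections, current_packageset=""):
    """Return (collection_id, collections_probed) for `package` (toy model).

    If `current_packageset` is non-empty, only that collection is probed.
    Otherwise the collections in `pklist` are probed in the order listed.
    """
    if current_packageset:
        found = package in collections.get(current_packageset, set())
        return (current_packageset if found else None, 1)

    probes = 0
    for collid in pklist:
        probes += 1
        if package in collections.get(collid, set()):
            return (collid, probes)
    return (None, probes)


# Many collections: 825 collections of one package each (hypothetical names).
many = {f"COLL{i:04d}": {f"PGM{i:04d}"} for i in range(1, 826)}
many_pklist = list(many)  # listed in order COLL0001 .. COLL0825

# One collection: all 825 packages in a single customer batch collection.
single = {"CUST01_BATCH": {f"PGM{i:04d}" for i in range(1, 826)}}
single_pklist = ["CUST01_BATCH"]

# Worst case for the long list: the package sits in the last collection.
searched = find_package("PGM0825", many_pklist, many)
directed = find_package("PGM0825", many_pklist, many,
                        current_packageset="COLL0825")
consolidated = find_package("PGM0825", single_pklist, single)

print(searched)      # ('COLL0825', 825)
print(directed)      # ('COLL0825', 1)
print(consolidated)  # ('CUST01_BATCH', 1)
```

In the model, a package at the end of an 825-entry list costs 825 probes when the special register is blank, while either setting the register or consolidating into one collection brings that down to a single probe. The real cost per probe is not modeled here.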
Back to "Programs and Packages, Plans and Collections" by Robert Catterall