Wednesday, July 18, 2012

Quickly and Easily Clone DB2 Subsystems and Objects with DB2 Cloning Tool for z/OS

There are times when a clone of an existing DB2 for z/OS subsystem can be very handy for program testing purposes (or to speed up a production application process -- more on that momentarily). Suppose you want to clone one of your DB2 subsystems. What will you have to do? Some of the required steps -- identifying the subsystem to z/OS via PARMLIB updates, setting up RACF (or third-party equivalent) security, providing DB2 libraries for the subsystem (e.g., SDSNLOAD, SDSNEXIT), determining the volumes in the disk subsystem that will be used for the clone -- are pretty straightforward and not particularly time-consuming. Then there's the data set part. You have to initialize all the "system" data sets for the clone (catalog, directory, bootstrap data set, log data sets, etc.), and you have to deal with all of the "user" data sets (potentially, tens of thousands) associated with table spaces and indexes -- including any required data set renaming and alteration of object identifier information so that these IDs will line up with those in the clone's DB2 catalog. Ugh. You can grind through this data set tedium yourself, but there's a MUCH better way to get the job done: use IBM's DB2 Cloning Tool for z/OS to handle all the data set-related work, and get your clone established way faster (and without the risk of error linked to the manual approach).

DB2 Cloning Tool makes life easier for DBAs with respect to two main functions: cloning a DB2 subsystem in its entirety, and cloning individual objects or sets of objects (table spaces and indexes) by overlaying objects on a target subsystem with objects from a source subsystem (this object-level cloning functionality is sometimes referred to as "object refresh"). For the cloning of an entire subsystem, what has to be done ahead of time is the system stuff I mentioned above (PARMLIB update to identify the new subsystem to z/OS, RACF set-up, DB2 libraries, etc.). DB2 Cloning Tool takes it from there, doing the heavy lifting related to dealing with all the DB2 data sets. As noted previously, a key benefit of DB2 Cloning Tool utilization is speed. One of the ways in which DB2 Cloning Tool delivers here is through its ability to utilize ultra-fast, disk subsystem-based data replication technology, such as FlashCopy in IBM disk storage systems, and compatible technologies provided by other storage system vendors. FlashCopy replication of one volume's contents to another volume is near-instantaneous from a source system perspective, and on top of that you get some nice CPU savings on the source and target z/OS LPARs because the physical data copying work is accomplished by processors in the disk subsystem.

FlashCopy exploitation is great, but DB2 Cloning Tool is by no means limited to using that technology to accomplish a subsystem cloning operation. You can direct DB2 Cloning Tool to use any of a number of mechanisms to create a DB2 subsystem clone. For example, you could choose to go with dump/restore processing, or you could have DB2 Cloning Tool create a clone using your disk subsystem's remote mirroring functionality (e.g., IBM's Metro Mirror technology, formerly known as PPRC). DB2 Cloning Tool can also take a system-level backup generated by the IBM DB2 Recovery Expert for z/OS and use that to create a clone of a DB2 subsystem (I blogged about Recovery Expert earlier this year). DB2 Cloning Tool's flexibility with respect to the mechanisms used to establish a clone of a DB2 subsystem means that you can go with the approach that best suits a particular situation in your shop.

Another way in which DB2 Cloning Tool speeds things up is its proprietary process for renaming data sets, when that is required for a cloning operation. Far faster than IDCAMS RENAME, this process by itself can take a big chunk out of the time that would otherwise be required to clone a large DB2 subsystem.

Speed of execution is also enhanced through features that boost DBA productivity. This certainly comes into play when you want to use DB2 Cloning Tool to clone objects -- that is, to refresh a subset of the data in a cloned DB2 subsystem. Here, DBA work is accelerated through an interface that is very similar to the DB2 LISTDEF utility. The DB2 utility DSN1COPY, which can be used to perform object ID translation when copying a DB2 data set from one subsystem to another, is limited to one input data set per execution. By contrast, DB2 Cloning Tool allows "wild-carding" and INCLUDE/EXCLUDE specifications that make it easy to pull a whole set of objects into one data refresh operation. For example, you could refresh data in a target subsystem using all source system table spaces in database XYZ whose names begin with the characters 'EMP'. Want to include the indexes on those table spaces? No problem. Want to do a refresh using source table space ABC and all table spaces referentially related to it? Piece of cake.

And hey, DB2 Cloning Tool doesn't just make you more productive -- it also helps to keep you out of trouble by taking care of the storage system stuff (e.g., volume initialization) that needs doing when a DB2 subsystem is cloned. You might say that DB2 Cloning Tool understands database-speak (the interface is DBA-centric) and translates that as needed into data set and storage-speak. In addition to protecting you from mistakes, this "storage-aware" aspect of DB2 Cloning Tool also cuts way down on the time investment required of your colleagues in storage administration: you collaborate with these folks for the initial set-up of a DB2 subsystem clone, and then you just turn the crank using the DB2 Cloning Tool.

As for use cases, I mentioned in my opening paragraph that a DB2 subsystem clone is most often thought of in the context of application testing. Indeed, DB2 clones are very useful for that purpose, but don't think that this is the only way your organization can derive value from DB2 Cloning Tool. One company used the product to surmount what had been a vexing challenge in their production DB2 environment. The problem in this case concerned a report-generation application that took three hours to complete. That would be OK, except for the fact that the DB2 data on which the reports are based had to remain static during execution of the application, and at most there was a two-hour window during which the target DB2 tables could be in a read-only state. This organization hit on the idea of using DB2 Cloning Tool to create a clone of the DB2 subsystem, against which the reporting application could run. The clone is created by DB2 Cloning Tool in only 7 minutes (thanks to exploitation of fast replication technology, mentioned previously), and is "thrown away" once the report-generating application has completed (clone disposal is another DB2 Cloning Tool function -- done in a neat and orderly fashion, on request). Because the clone is a separate DB2 subsystem, database object names don't have to be changed, and that means no required program code changes for the reporting application. Because the clone subsystem is created so quickly, data-changing applications can get going almost two hours earlier than before. Everybody's happy.

Seems like a good point on which to close: when creating (and refreshing) a DB2 subsystem clone is so easy and so fast (and it can be when you use DB2 Cloning Tool), you can advantageously use DB2 clones in ways that might not otherwise have occurred to you. Food for thought.

Thursday, July 5, 2012

Proactive SQL Statement Tuning with IBM InfoSphere Optim Query Workload Tuner for DB2 for z/OS

SQL statement tuning -- most often in the form of query tuning -- is important when it comes to boosting the CPU efficiency of a DB2 for z/OS application workload. Sure, there are "system-level" adjustments that one can make to reduce  the CPU consumption of a DB2 application workload (things like enlarging and/or page-fixing buffer pools, binding high-use packages with RELEASE(DEALLOCATE), setting up CICS-DB2 protected entry threads, and -- in a data sharing environment -- increasing the size of the global lock structure), and these definitely have their place as part of an overall DB2 performance tuning strategy; however, a given system-level change will often have a CPU overhead impact of just a few percentage points. For really dramatic improvements in CPU efficiency (and run times), one generally needs to turn one's focus to particular SQL statements. In some cases, I've seen statement-focused tuning actions reduce the CPU cost of a query by 80%, 90%, or more (with attendant reductions in elapsed time).

So, SQL statement tuning can be a very good thing. Trouble is, for many DBAs it's an interrupt-driven process: you're working away at something, and your work is interrupted by a phone call, or an instant message, or (oh, boy) someone standing at your desk to tell you about a query that is running too long. So, you drop what you were doing, and you analyze the problem query and take some action that (hopefully) brings execution time down to an acceptable level. Then, you get back to whatever task you had to temporarily suspend, and roll along until the next "fix this query!" disruption.

If you're tired of this reactive approach to query tuning -- tired (as we say in the USA) of chasing after horses that have gotten out of the barn -- then you should take a look at a tool that can put you in control -- a tool that can enable you to get out in front of problem queries and transform query tuning from disaster response to an ongoing process that delivers results you can easily document and communicate to the higher-ups in your organization. The tool I'm talking about is IBM's InfoSphere Optim Query Workload Tuner for DB2 for z/OS, and it can be the key to making your query tuning efforts proactive in nature.

One of the defining aspects of Query Workload Tuner is in the name: workload. With Query Workload Tuner, you can bring powerful tuning capabilities to bear on not just a single query, but a set of queries. We call that set a "query workload." What is it? It's what you need and want it to be. Query Workload Tuner makes it easy to define a set -- or sets -- of queries that matter to your company. Is it the top 25 dynamic SQL statements, ranked by CPU or elapsed time, pulled from the DB2 for z/OS dynamic statement cache? Top CPU-consuming queries -- whether static or dynamic -- as identified by IBM's DB2 Query Monitor for z/OS or another vendor's query monitor? The SQL statements in a DB2 package? The statements that access a particular database object? In any case, you tell Query Workload Tuner what you want to tune, and once you've defined a set of queries you can save the workload definition and re-generate the query set of interest when you want to. At plenty of sites, query tuning has become process-driven in exactly this way: a DBA tunes a specified query set, then periodically -- weekly, biweekly, monthly, or at some other interval that suits the needs of the organization -- regenerates the query workload and tunes it again. If that set is a "top n" group of queries, it could well change from interval to interval, as formerly high-cost queries fall out of the "top n" mix (a result of their running faster following tuning actions) and others bubble up.

To boost your SQL statement tuning productivity, Query Workload Tuner provides several "advisors" through which sets of queries can be run. Two of the most useful of these are the Statistics Advisor and the Index Advisor. The Statistics Advisor will generate RUNSTATS utility control statements that can be used to update DB2 catalog statistics so as to enable the DB2 optimizer to choose better-performing access paths for queries in the target workload (for example, it might be recommended that value-frequency statistics be gathered to provide DB2 with information about the distribution of duplicate values in a column or set of columns). The Index Advisor will suggest table-indexing changes aimed at reducing the CPU cost of a set of queries, thereby helping you to get the biggest bang for the buck with respect to new and modified indexes (maximizing the performance payback from new and/or expanded indexes is important, as indexes have associated costs and you don't want to incur these without a significant return on investment). A particularly attractive aspect of query tuning via updated statistics and indexing is the fact that performance benefits can be realized without having to change SQL statement coding. That's a big deal these days, as it's increasingly common to have a situation in which SQL statements cannot be modified, either because they are generated by a query/reporting tool or they are issued by purchased applications (e.g., a DB2-accessing ERP or CRM application).

Of course, when you're engaged in a proactive query workload tuning process it's nice to be able to see how things have changed for the better (and -- just as important -- to show other people in your organization how things have changed for the better). Query Workload Tuner delivers a big assist in this area via its Workload Access Plan Comparison feature. Using this capability, you can see the before-tuning and after-tuning change in cost for a query workload (the figures are estimated costs, but these are usually a good indicator of actual performance), along with details such changes in table join order, join method, index utilization, etc. And, Query Workload Tuner's Workload Access Plan Comparison feature isn't just a great way to track the progress of your ongoing query tuning efforts -- it's also very useful for zeroing in on access path changes resulting from a migration from one version of DB2 for z/OS to another.

One other thing: if it's been a while since you last looked at Query Workload Tuner, look again. This is a product that is being regularly enhanced by IBM. Version 3.1 of InfoSphere Optim Query Workload Tuner for DB2 for z/OS, delivered in the fall of 2011, provided the aforementioned Workload Access Plan Comparison functionality. More recently, with Version 3.1.1, the Query Workload Tuner Client was replaced by plug-ins that make IBM's Data Studio the end-user interface for the product.

So, consider what your DB2 for z/OS query tuning experiences are like now ("SMITH! Speed that query up NOW!"), and what you'd like them to be ("Nice work, Smith -- you've taken a lot of CPU time out of that application"). Query Workload Tuner can help you to get from A to B.