Tuesday, August 14, 2012

Got DB2 for z/OS? GET DATA STUDIO

If you work with DB2 for z/OS -- as a systems programmer, a DBA, an application developer, or in some other capacity -- you REALLY ought to be using IBM Data Studio. Why? Well, start with the fact that it's FREE and downloadable from the Web. Now, free is not such a big deal if the product in question is a fluffy piece of almost-nothing that delivers practically zilch in the way of useful functionality. In the case of Data Studio, free is a great thing, because the tool is loaded with features that can boost your productivity and effectiveness as a mainframe DB2 professional (in addition to DB2 for z/OS, Data Studio can be used with DB2 for Linux, UNIX, and Windows; DB2 for i, Informix, and several non-IBM relational database management systems). In this blog entry, I'll describe a few of the Data Studio capabilities that I most appreciate (and yes, I most definitely have it on my PC). I encourage you to get your own copy of Data Studio, and to explore its functionality to see how the tool can best deliver for you.

Before going further, a couple of preliminary information items: first, Data Studio as we know it today essentially made the scene in the fall of last year, with the delivery of Version 3.1 (the current release is Version 3.1.1). If you last looked at Data Studio prior to Version 3.1, look again. That was a big-time product upgrade.

Second, note that Data Studio comes in two forms: the administration client and the full client. There's a page on IBM's Web site that provides detailed information on the features of the administration client and the full client, but I'll tell you here that the major difference between the two Data Studio clients is the support for development of DB2-accessing Java code provided by the full client, and the ability you have with the full client to "shell-share" with other Eclipse-based tools, including a number of IBM's Rational and InfoSphere Optim products. Now, you might think, "Given that Data Studio is free, why wouldn't I just download the full client, and not bother with figuring out whether or not a capability I want is or isn't provided by the administration client?" In fact, going with the full client is a pretty good idea, as far as I'm concerned. One reason there are two clients is that the full client is about three times the size of the administration client. That can be important in terms of download time, IF you don't have access to a high-speed Internet connection. These days, lots of folks have high-speed Internet connections of 50+ or even 100+ Mbps, and when that's the case download time is less of a concern and getting the Data Studio full client is a good choice. If your download capacity is more limited, and if you don't need Java development support and/or shell-sharing capability with other Eclipse-based tools, the Data Studio administration client could be best for you.

OK, on now to my personal favorites among the capabilities provided by Data Studio:
  • SQL statement tuning. So much to like here. I like the access plan graph that I can get for a query. I like how I can quickly get information about an element in the access plan graph (for example, a table accessed by the query you're analyzing) just by hovering over it with my cursor. I like how I can get more information about that element in the access plan graph by clicking on it, and even more information via the handy explorer window I get for a clicked-on element of the access plan graph (if the element clicked on is a table, the explorer window lets me very quickly obtain information about indexes on the table, keys of these indexes, columns in the index keys, etc.). I like that the access plan graph shows me the optimizer estimates for the number of result set rows going into a step of the access plan, and for the number of rows coming out of that step. I REALLY like Data Studio's statistics advisor, which provides me with RUNSTATS utility control statements aimed at gathering statistics which could help the DB2 optimizer to choose a better-performing access path for the query I'm tuning (people at the IBM Support Center who spend their time working on query performance problems will tell you that MOST of the situations with which they deal could be resolved through updated and/or enriched catalog statistics). I am one who for years analyzed query performance by slogging through data retrieved from the PLAN_TABLE. I finally got wise and started leveraging the query tuning aids provided by Data Studio, and as a result I got more efficient -- and more effective -- in getting queries to run faster.
  • Submitting SQL statements. As I stayed for a long time with the old ways of query tuning before getting modern with Data Studio, so I also persisted in submitting SQL statements the old way: via SPUFI in a TSO/ISPF session. Know what got me into using Data Studio versus SPUFI for interactive SQL statement execution? XML. I was trying one day to retrieve data in an XML column in a DB2 for z/OS table using SPUFI, and got majorly frustrated in trying to get that to work. Thinking, "Why not?" I gave it a go using Data Studio, and it worked like a champ right off the bat. Not only did I get the XML data that I wanted using Data Studio, I also got it very nicely formatted, with different lines and different levels of indentation showing clearly the structure of the returned XML data. I've also gotten to where I appreciate what I can do with a multi-row query result set obtained via Data Studio -- an example is easy sorting of the result set rows by any column's values. I also like that Data Studio kind of looks over your shoulder as you enter a query's text, showing a little red X by a line of the query if you've left something out that would cause a syntax error -- I like seeing this and getting a chance to correct the text BEFORE submitting the statement for execution. I like that Data Studio keeps a record of statements you've submitted, making it very easy to re-submit a query if you want to. I could go on and on, but the bottom line is this: Data Studio is not just "SPUFI on a PC" when it comes to entering and executing SQL statements -- it's better than SPUFI.
  • Formatting SQL statements. Over the years I've done a fair amount of query analysis work in DB2 for z/OS-based data warehouse environments. Some of the queries you might be asked to look over in such an environment might take up several pages when printed out. If one of these humongous queries (lots of tables joined, lots of subquery predicates, lots of CASE expressions, etc.) is handed to you unformatted, you've got a big job on your hands just to make it readable -- a big job, that is, if you don't have Data Studio. If you do have Data Studio, you just paste the query into an SQL input window and ask Data Studio to format it for you. You then get indentations and line breaks that make the big hairy query easy to read, and that makes working with the query much easier.
  • Retrieving data from the DB2 catalog. Sure, you can do this by issuing queries against catalog tables, but it can be done a whole lot more quickly using Data Studio's data source explorer when you're connected to a DB2 for z/OS subsystem. 
  • Developing SQL stored procedures. So-called native SQL procedures, introduced with DB2 9 for z/OS, are, in my opinion, the future of DB2 stored procedures. Developing, testing, and deploying native SQL procedures is a snap with Data Studio. I've done it, and so can you.

The above-listed Data Studio features are just the ones that I've used most extensively -- there is a LOT more that you can do with the tool, such as creating Web services for DB2 data access; creating, altering, and dropping tables and indexes and other database objects; generating DDL from objects in a database; generating diagrams that show database objects and relationships between those objects; and comparing one database object with another (e.g., the definition of table A as compared to the definition of table B). The best way to see what Data Studio can do for YOU is to get your hands on it and exercise the functionality yourself.

Hey, if you're like me -- a longtime mainframe DB2 professional -- then you've done most of the DB2 work you've needed to do over the years using the good old 3270 "green screen" interface. That interface will be with us for a long time to come, I'm sure, but for a number of DB2-related tasks a GUI is not just a different way -- it's a better way. Data Studio is your opportunity to put this assertion of mine to the test. Give it a try. I think you'll like what you find.