With more than fifty systems running business-critical applications, Transco uses Athene to collect and collate data from systems running primarily on UNIX, Windows and two IBM-compatible MVS mainframes. A key application that is continuously monitored using Athene is UK-Link. This is an integrated suite of software, developed by Transco and running on UNIX and MVS platforms, which supports the operation and commercial regime for gas transportation in accordance with Government and the Gas Regulator's requirements. In addition, Athene is used to monitor asset and work management systems and metering systems, together with data warehousing, finance and payroll systems.
Chris Lees is a Team Leader at Transco's data centre in Killingworth. With overall responsibility for ensuring that the organization's central servers have sufficient capacity to meet agreed service levels, he believes that the only way of achieving this is to automate the process as much as possible: "The Capacity Planning team is small, usually three to five people. We are responsible for a large number of different platforms in different locations, and these are in turn managed by a number of different technical divisions. In this complex, somewhat fragmented environment, we have to collect data every single minute of the day from all the different systems to ensure that we have the relevant data to hand to help us identify trends and issues."
Historically, British Gas had always used Capacity Planning, although on a less complex scale. The wholesale restructuring of the organization and the formation of Transco resulted in a dramatic increase in UNIX systems during 1995 and prompted the need to implement a much more structured system which, amongst other functions, could handle a range of platforms. Chris Lees: "We were keen that any new system had to have automation built into it, and that we wouldn't have to spend time writing code or recruiting specialist skills to maintain it. Good support and a responsive supplier were also important."
Having looked at the systems in the marketplace, Transco selected Athene in December of that year and implemented it early the following year. Athene Control Centers, which collect the data and are at the heart of Athene, are installed at both Killingworth and Hinckley. Capacity Planning is run as a completely transparent operation, with staff at both locations able to capture and monitor data from any system, communicating across Transco's Wide Area Network. Chris Lees: "This provides us with a totally flexible mode of working, and the ability to share workloads and responsibilities regardless of where staff, machines or the data are located."
The day-to-day work of the Capacity Planning team is centred around the production of two monthly reports: a comprehensive internal set of graphs containing key metrics from all the systems, and a second, management report, which provides a concise summary of key issues.
Chris Lees: "In terms of the internal report, the automatic reporting facility in Athene has made a dramatic difference. Before this, we had to manually go to each of the dozens of systems, select the desired metrics, choose at what times we wanted to select the data, wait for this to be produced in a tabular format, then select and graph the particular data to be analyzed, change the settings on the graph to make it as clear as possible, and then print it out. It was an incredibly time-consuming process, which could literally take one member of the team a week to complete before we were even in a position to start looking at the data and assessing the information. With the automated reporting and scheduling in Athene, we simply decide which metrics we want to collect for each machine, specify how the resultant graphs should be presented, and schedule when and how often we want to see them. Now, the information is just sitting there waiting for us; we can start analyzing the data from day one of the month."
Transco initially looks at three primary resources each month: CPU utilization and queuing, memory utilization, and I/O performance, although the particular metrics selected to do this vary according to the flavor of operating system. "Having the data available so quickly enables us to use our experience and knowledge to assess where we might have performance issues, rather than spending time on the more mundane data collection process," said Chris Lees. "We look at the data within the Capacity Planning team, perhaps drill down further, and then discuss it with the technical teams responsible for each platform, each month. It is at this stage that we are able to judge whether a peak, for example, was a one-off occurrence due to a particular workload or whether it heralds a trend that needs to be monitored more closely."
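The distinction between a one-off peak and an emerging trend can be illustrated with a minimal Python sketch. It is not how Athene or the Transco team makes this judgement, which the case study describes as a matter of experience and discussion with the technical teams. The function names, the `spike_margin` and `trend_slope` thresholds and the sample figures are all hypothetical.

```python
from statistics import median


def slope(values):
    """Least-squares slope of values against their index (units per period)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def classify(values, spike_margin=20.0, trend_slope=1.0):
    """Label a utilization series 'trend', 'one-off peak' or 'steady'.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    values = list(values)
    if len(values) < 3:
        raise ValueError("need at least three points")
    # Fit the slope with the single highest point removed, so that one
    # isolated spike cannot be mistaken for sustained growth.
    peak_index = values.index(max(values))
    trimmed = values[:peak_index] + values[peak_index + 1:]
    if slope(trimmed) >= trend_slope:
        return "trend"
    if max(values) - median(values) >= spike_margin:
        return "one-off peak"
    return "steady"


# Hypothetical peak CPU utilization figures (percent), one per period.
one_off = classify([40, 41, 39, 42, 80, 41])
growing = classify([40, 44, 48, 52, 56, 60])
```

A rule like this could only flag candidates for the monthly discussion; deciding whether a peak was caused by a particular workload still needs the platform teams' knowledge.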
Recommendations and decisions are taken at these monthly meetings, which then go on to provide the basis for the second report, which is distributed to Transco management.
The management report uses a traffic light principle: systems are identified as being in a red, amber or green state, with the relevant notes and details of actions taken against those which merit it. Chris Lees: "Traditionally, the monthly management report showed each machine and the trends of its performance, irrespective of any issues. Even with a summary, it really was just too cumbersome and too detailed for management. Since the beginning of the year, this report has been reduced to a highly focused summary of about half a dozen pages. This new format has been very well received. It exemplifies, in my opinion, the fact that Athene handles the mechanics of collecting and collating the data, and we are able to add value by attending to the issues, rather than just passively putting data into a report with no actions attached to it."
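As an illustration of the traffic light principle, the following Python sketch assigns a red, amber or green state to each system and lists only the systems that need attention. The thresholds, the use of peak CPU utilization as the deciding metric, and the system names are assumptions for the example; the case study does not say which limits or metrics Transco applies.

```python
from dataclasses import dataclass

# Hypothetical thresholds (percent utilization); not taken from the source.
AMBER_THRESHOLD = 70.0
RED_THRESHOLD = 85.0


@dataclass
class SystemSummary:
    name: str
    peak_cpu_pct: float
    note: str = ""  # action taken or comment, where one is merited


def traffic_light(peak_pct, amber=AMBER_THRESHOLD, red=RED_THRESHOLD):
    """Map a peak utilization figure to 'red', 'amber' or 'green'."""
    if peak_pct >= red:
        return "red"
    if peak_pct >= amber:
        return "amber"
    return "green"


def management_summary(systems):
    """Return (name, status, note) for non-green systems, red first."""
    order = {"red": 0, "amber": 1}
    rows = [(s.name, traffic_light(s.peak_cpu_pct), s.note) for s in systems]
    flagged = [row for row in rows if row[1] != "green"]
    return sorted(flagged, key=lambda row: (order[row[1]], row[0]))


# Hypothetical systems and figures, for illustration only.
report = management_summary([
    SystemSummary("unix-01", 52.0),
    SystemSummary("unix-02", 74.5, "monitor next month"),
    SystemSummary("nt-03", 91.0, "review workload placement"),
])
```

Listing only the amber and red systems mirrors the move from a per-machine report to a short, issue-focused summary.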
Analysis of UNIX data over the past year has shown where I/O bottlenecks were occurring. Raising such issues has led to reviews of file placement and database reorganizations to improve throughput. Routine monitoring has also identified rogue processes that were consuming abnormal amounts of processor time. A user of an NT system recently demanded proof that both processors on his system were being utilized effectively; a quick analysis of the latest data showed an even balance of workload across the processors.
Transco is predominantly using Athene as a monitoring tool. Chris Lees, however, is keen to use it in a more proactive role for modeling, forecasting and trending. In the ideal environment, Athene can become an integral part of the business process and an important support for decision making. Proposed changes, at any level, can be modeled in Athene and a range of "what-if" scenarios can be produced.
Chris Lees: "The original specification for a Capacity Planning tool was that it could handle our primary platforms and that it was fully automated. We have achieved this with Athene. In addition, we wanted a supportive and responsive supplier, and we have found this in Metron. For example, a number of the enhancements we have recommended have now been incorporated into the product, particularly when they offer benefits to the wider Capacity Planning industry."