A Methodology for Website Performance Optimization

Often the first approach to optimizing website performance is to add more hardware. This works as long as hardware is the limiting factor, but may have little or no effect if the real bottleneck is the database behind the site. Faced with rapid growth in website users, a shortage of skilled technical staff, and the careful scrutiny of venture capital investors, many dot.com startups are struggling to find effective ways to optimize the performance of their websites. This is a growing concern among IT managers, especially those who oversee high-performance, high-availability, rapidly growing web-based systems. The ultimate goal of a system manager is to stay ahead of the growth curve while providing optimal service. Many issues can be resolved by a competent internal IT staff. However, knowing when to bring in outside technical experts and how to work effectively with them can mean the difference between a quick, efficient solution and unnecessary expenditure or even system failure.

This paper details the Rare & Dear Performance Team’s approach to diagnosing performance bottlenecks, including the strategies and tactics it employs to resolve performance-limiting issues quickly and effectively. The methodology is neither startling nor proprietary; rather, it is a well-planned, focused strategy that the team developed to intervene in performance-critical situations. Every IT professional should know a process for identifying problems, for pairing short-term tactical solutions with longer-term system strategies, and for recognizing the critical success factors in website performance remediation.

Key Success Factors

The following points describe the key success factors for a performance engagement.

  • In-depth knowledge of system:  It is important that the Performance team quickly gain a deep understanding of the entire system that is having problems.  This can be achieved through system monitoring tools, written documentation of system configuration, written records of system change activity, or oral recollections by Client staff members.
  • Team approach:  It is important that the Performance team take a broad look at the entire system, and not lose time getting lost in the details of any one sub-system.  Several individuals with complementary skills and background working as a team provide the most efficient means of achieving this goal.  A methodology that forces individuals to review and critique their own assumptions in light of all the information about the system allows the team to stay on track.
  • Top management support: The Client must commit full and unequivocal top management support to the Performance team.  Management must communicate this to the staff consistently and frequently so that access to systems and reports, answers to questions, and identification of other relevant resources are readily obtainable by the Performance team.  Even when confronted with competing urgent problems and projects, the staff must respond promptly to Performance team requests.  Without this commitment, the process will take much too long to be tolerable in the dot.com world.
  • Dedicated command center: If possible, the Performance team should work off-site, isolated from the mainstream of daily operations. If that is not possible, the Performance team should be housed in a dedicated operations room.  There, they set up network and system access, phone lines, email, and a meeting place for the team. It is a “do not disturb” zone that insulates them from being bombarded by worried developers and supervisors wanting the latest status report.
  • Detailed activities log: The Performance team maintains a detailed activity log of all meetings, data, performance issues, system development activities, communications and contacts so that, as system modifications are made, they can be correlated with other events.  The log becomes part of the system documentation.
  • Internal and External Performance Teams:  Most companies will have their own internal performance team that works on performance issues before the external Performance team arrives.  The external Performance team works closely with the internal team but brings the fresh, unbiased perspective of an outsider.
  • No Assumptions: The Performance team arrives without preconceptions or attachment to the existing systems.  They question every statement and explanation presented by the staff.  Their holistic approach emphasizes the importance of the entire system and the interdependence of its parts.
  • Road to Recovery: The Performance team will provide the Client with immediate relief through temporary work-arounds.  However, these short-term actions may not be appropriate as a long-term solution. The Performance team must also provide a prescription for system modifications that can deliver long-term improvement.
  • Definition of Success: Criteria for successfully solving any performance issues are difficult to quantify in advance.  A working definition of success may be to “provide system availability during all levels of current site traffic and allow the system to function well into the future.”  This vague, but recognizable, definition can be workable when there is mutual trust between the Client and the Performance team.  In other situations, a more measurable definition may be needed, such as:  “>99% availability in a day” with “average user response time < 2 sec” for “two consecutive days”.
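The measurable definition of success quoted above can be checked mechanically once monitoring data is available. The following is a minimal sketch in Python, not part of the Rare & Dear methodology itself. The DayStats structure, its field names, and the idea of measuring availability as the fraction of successful probes are all assumptions made for illustration; only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class DayStats:
    """One day of monitoring data (hypothetical structure, for illustration)."""
    checks_total: int            # availability probes attempted that day
    checks_ok: int               # probes that received a successful response
    response_times: list[float]  # sampled user response times, in seconds


def day_meets_target(day, min_availability=0.99, max_avg_response=2.0):
    """True if the day has >99% availability and average response time < 2 sec."""
    if day.checks_total == 0 or not day.response_times:
        return False  # no data is treated as a failed day (an assumption)
    availability = day.checks_ok / day.checks_total
    avg_response = sum(day.response_times) / len(day.response_times)
    return availability > min_availability and avg_response < max_avg_response


def success_reached(days, required_consecutive=2):
    """True once the target has been met on the required number of days in a row."""
    streak = 0
    for day in days:
        streak = streak + 1 if day_meets_target(day) else 0
        if streak >= required_consecutive:
            return True
    return False
```

A day with 995 successful probes out of 1000 and an average response time of 1.25 seconds would pass; two such days in a row would satisfy the definition, while a failing day in between resets the count.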

Typical Situation and Tactics

This section describes a typical engagement by the Rare & Dear Performance team. 

As the first step in getting to know the Client system, Rare & Dear provides the Client with a Questionnaire asking for detailed information about technical aspects of the web system and its performance characteristics.  As the second step, the Performance team installs a proprietary Remote Diagnostic package on the Client system to gather preliminary system and database statistics.  The data from a 24-hour period is analyzed in light of the information provided by the Client.  Both static and dynamic measurements are taken to highlight OS and database activity as well as website response time (from the users’ perspective).  These measurements are charted, graphed, compared with each other, and compared against “reasonable” values.  The result is a report that indicates the areas most likely to merit deeper investigation.
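The comparison against “reasonable” values can be pictured as a simple range check. The sketch below is illustrative only and does not describe the proprietary Remote Diagnostic package. The metric names and the numeric ranges are hypothetical placeholders; real ranges depend on the system being examined.

```python
def flag_out_of_range(measurements, reasonable):
    """Return {metric: (value, (low, high))} for values outside their range.

    Metrics with no entry in `reasonable` are skipped rather than flagged.
    """
    flagged = {}
    for name, value in measurements.items():
        if name not in reasonable:
            continue
        low, high = reasonable[name]
        if not (low <= value <= high):
            flagged[name] = (value, (low, high))
    return flagged


# Hypothetical metric names and placeholder ranges, for illustration only.
reasonable_ranges = {
    "cpu_utilization_pct": (0.0, 80.0),
    "buffer_cache_hit_ratio_pct": (90.0, 100.0),
    "avg_page_response_sec": (0.0, 2.0),
}

sample = {
    "cpu_utilization_pct": 45.0,
    "buffer_cache_hit_ratio_pct": 72.0,
    "avg_page_response_sec": 5.5,
}

areas_to_investigate = flag_out_of_range(sample, reasonable_ranges)
```

With these sample numbers, the cache hit ratio and page response time would be flagged while CPU utilization would not, which is the kind of pointer toward a database rather than a hardware bottleneck that the report is meant to give.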

After the Remote Diagnostics are run, a decision is made as to whether the remediation work can be accomplished remotely or whether it requires the Performance team to work on-site.  If the team must work on-site, a command center is set up as described above.  The team is then briefed by top management and introduced to key personnel who will facilitate access to systems and information.  For remote projects this is accomplished via teleconference or even video-teleconference.

As is typical of dot.com projects, this activity needs to be defined and implemented quickly, with maximum communication but minimum impact on the Client’s existing systems and staff.  To get a quick handle on current system performance with an eye to identifying bottlenecks, the Performance team takes a broad look at a number of different factors, with one sub-team assigned to investigate at the macro level (top down) while a second sub-team investigates at the micro level (bottom up).  The entire Performance team meets every two hours to share findings and fine-tune the investigation.

Any existing tools for gathering system performance data are reviewed.  A Client may have a wide variety of tools in place and still not have the staff skilled enough to identify a complex problem.

The Performance team interviews key staff to learn about historical site performance and growth patterns, recent system changes, and planned system changes and additions.

The Performance team initially surveys the entire system for performance and capacity metrics.  The depth and breadth of the Performance team’s knowledge enables them to focus quickly on the critical issues that eventually lead to optimal solutions.  Their fresh, detached perspective helps the Client’s staff question all assumptions and hypotheses, whether formed by internal staff or by Performance team members.

A number of questions are asked of each application that is having performance problems, for example:

  • How is the application using and accessing the data?
  • How is the database parsing the SQL statements?
  • Are sufficient operating system resources allocated and configured to support Oracle?
  • Are sufficient hardware resources available and configured to support Oracle?

After synchronizing performance graphs from the various sources, a picture often emerges showing a high correlation between certain database activity and site performance degradation.  A theory about the cause of the bottleneck can then be formed.  Further examination of the apparent bottleneck will either support or disprove the theory.  One or more rounds of hypothesis and examination may be required to clearly identify the problem(s).  Often more than one problem contributes to poor site performance.
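Once two measurement series have been aligned on the same time intervals, the strength of their relationship can be quantified. The sketch below computes a Pearson correlation coefficient in plain Python. It is offered as one possible way to put a number on the pattern seen in the graphs, not as the team's actual tooling, and the variable names and sample figures are hypothetical. A high coefficient only suggests where to look; as the text notes, the theory still has to be confirmed by examining the suspected bottleneck.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples taken over the same five time intervals.
db_waits_per_min = [120, 150, 400, 900, 300]
page_response_sec = [0.8, 0.9, 2.1, 4.5, 1.6]

r = pearson(db_waits_per_min, page_response_sec)
```

A value of r close to 1 would indicate that response time rises and falls with the database activity being measured, making that activity a reasonable first hypothesis to test.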

Within a day or so, the Performance team will usually have isolated the problem and provided a temporary work-around to achieve acceptable performance (if the live site is experiencing unacceptable performance).  After the Client has implemented the work-around, the Performance team continues to observe system behavior over the next 24 hours.  If performance is acceptable, they will develop both short-term and more robust long-term recommendations to make the improved performance permanent.

After a de-brief to Client management, the Performance team will end the intense site scrutiny.  However, at the Client’s request, they may continue to monitor performance for a longer period to ensure that no other problems arise.  A full report of the team’s findings and recommendations is also provided to the Client.  Recommendations may cover hardware, software, development methodology, operations, and even management approach.  The Performance team is then in a position to provide direction and oversight to Client staff in implementing any of the recommendations.

Conclusion

If you recognize the need for outside intervention, you must provide full support down the chain of command to facilitate quick turnaround and a good return on your investment.  By pairing internal knowledge with an experienced, complementary external team of database and system performance experts, you can often implement a performance improvement plan that can save hundreds of thousands of dollars in avoided hardware upgrades.

© Rare & Dear, Inc. 2002    

