Performance Tuning Tips

Best Practices, Development, Operating Systems, Portal Server by matt on June 8th, 2008

For those of you unaware, Dev2Dev is meeting a grisly fate:  it won’t be with us much longer (apparently all content except for the blogs up there will be migrated to the Oracle Mother Ship).  No doubt our friends at Oracle will come up with an alternative way for employees to speak their minds, but for now many employees and “alumni” still have something to say, and we want to give them a forum.  Today’s guest post is from Ray Gao, one he started while still at BEA (warning: dev2dev links are not long for this world…) on Performance Tuning of the ALI Portal.

The gist of his post is that there are a lot of moving parts in performance tuning, including the portal, remote tier, and the database: the performance chain is only as strong as its weakest link.

To get more of his great high-level overview on performance tuning, click on through for a good read!


I wrote a blog post several weeks ago on the dev2dev.bea.com website about how to tune portal applications for performance. As I have been accumulating more material on this topic, I decided to write a more detailed post with Function1.

The ALUI portal is based on a stacked and component-based architecture.

The following diagram shows various ALUI portal modules.

perf_modules.jpg

Over the years, the ALUI portal has grown to provide a rich set of functionality. As a result, both the skill required to manage the portal and the complexity of implementing high-performance applications (portlets) have increased as well.

 

In addition to functional and integration tests, high-performance applications should also be rigorously stress-tested (in conjunction with the portal). Load tests help detect issues such as uneven load distribution and the application’s buckling and failure thresholds: the maximum number of users, load, and throughput it can sustain.

 

There can be many “moving parts”, so a systematic approach is recommended for diagnosing performance bottlenecks. We recommend using a checklist such as the one below to test as many of these “moving parts” as possible:

  1. Examine the high-level system architecture (e.g. component and deployment diagrams, network infrastructure) to identify design flaws.
  2. Obtain and review performance metrics/reports and stress-test results.
  3. Perform a detailed analysis of subsystems and servers, e.g. resource consumption, job scheduling, etc.
  4. Carry out a sequential (time-series) study of steps 2 and 3 over a period of a few weeks or several test runs.
  5. Use a profiler to pinpoint specific implementation issues, e.g. insufficient caching or unoptimized code.

A good first step is to review the architecture diagrams and perform a capacity-planning analysis. This helps you identify whether sufficient resources (bandwidth, CPU, disk I/O, load balancing) have been allocated for the expected number of users. The ALUI Portal Deployment Guide shows some baseline numbers (http://edocs.bea.com/alui/deployment/docs604/planning/index.html), e.g. hits/pages per second for a system with a given CPU and memory configuration.
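To make the capacity-planning arithmetic concrete, here is a rough sketch that estimates how many portal servers a given user population might need. Every number in it (concurrent users, think time, pages per second per server, headroom) is a hypothetical placeholder rather than a figure from the Deployment Guide, so substitute the baseline measured for your own hardware.

```python
import math

def required_servers(concurrent_users, think_time_s, pages_per_sec_per_server, headroom=0.3):
    """Estimate the server count from a per-server baseline throughput.

    Assumes each active user requests one page every `think_time_s` seconds.
    `headroom` is the fraction of each server's capacity kept in reserve.
    """
    peak_pages_per_sec = concurrent_users / think_time_s
    usable_per_server = pages_per_sec_per_server * (1.0 - headroom)
    return math.ceil(peak_pages_per_sec / usable_per_server)

# Hypothetical inputs: 2,000 concurrent users, one page every 20 seconds,
# and a baseline of 40 pages per second per server.
print(required_servers(2000, 20, 40))  # 100 pages/s needed, 28 usable per server -> 4
```

The headroom parameter is a judgment call; it simply keeps the estimate from sizing the system to run at 100% of its measured baseline.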

 

The next step is to review performance metrics. Leading stress-testing tools, such as LoadRunner or the Microsoft Web Application Stress Tool, can produce diagrams showing the correlation between load and system performance.

perf_errors.jpg

Stress Test – User Load

 

perf_throughput.jpg

Stress Test – Throughput

 

perf_hits.jpg

Stress Test – Hits per Second

System performance can be viewed from many different perspectives, e.g. CPU load, network bandwidth consumption, memory, and disk I/O. It is therefore good practice to design multiple test cases for different criteria. For example, a “login-only” test consumes less network bandwidth than a “community page” test, which needs to repaint the entire browser window. The “login-only” test can therefore help you identify the maximum sustainable number of “Virtual Users”, whereas the “community page” test identifies the maximum “Throughput per Second” and “Hits per Second”.
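For readers without a commercial tool at hand, the idea of running one test case at several virtual-user levels can be sketched in a few lines. This is only a minimal illustration, not a replacement for a real stress-testing tool: `run_load` and `fake_login` are made-up names, and `fake_login` merely sleeps where a real test would issue an HTTP request to the portal.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load(request_fn, virtual_users, requests_per_user):
    """Call `request_fn` from `virtual_users` threads and summarize the run."""
    def one_user(_user_index):
        ok, errors, latencies = 0, 0, []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            try:
                request_fn()
                ok += 1
            except Exception:
                errors += 1
            latencies.append(time.perf_counter() - start)
        return ok, errors, latencies

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        results = list(pool.map(one_user, range(virtual_users)))
    elapsed = time.perf_counter() - started

    total = sum(r[0] + r[1] for r in results)
    all_latencies = [t for r in results for t in r[2]]
    return {
        "virtual_users": virtual_users,
        "requests": total,
        "errors": sum(r[1] for r in results),
        "hits_per_second": total / elapsed,
        "avg_response_s": sum(all_latencies) / len(all_latencies),
    }

def fake_login():
    """Stand-in for a real request to a login page (hypothetical)."""
    time.sleep(0.001)

# Step the load up and record one summary per level.
for users in (1, 5, 10):
    print(run_load(fake_login, users, requests_per_user=20))
```

A second stand-in for a heavier page could be passed as `request_fn` to compare test cases at the same load levels.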

 

Time-progression tests (e.g. tests from week one, to week two, to week three, and so on) can be used to track improvements in application performance over time.

 

All systems have “saturation points”, beyond which performance “levels off” (throughput no longer increases) even as the load continues to grow. This is similar to a car engine, where power is a byproduct of RPM. The phenomenon can be explained by examining the various joints in the system. Caching mechanisms exist at various points and submodules, e.g. the application server, web server, network load balancer, database cluster, etc. They act as buffers against increased load.

perf_throughput2.jpg

There is also a “buckling point”, at which the system fails. It can be identified by huge spikes in the number of errors and time-outs, or by the system hanging or rebooting.
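Both points can be read off a chart by eye, but they can also be pulled out of the raw stress-test numbers. The sketch below is one possible way to do it, under two assumptions of mine: saturation is taken to be the load level after which throughput grows by less than 5%, and buckling is the first load level where more than 5% of requests fail. The sample figures are invented for illustration.

```python
def find_saturation_point(samples, min_gain=0.05):
    """Return the first load level after which throughput grows by less than `min_gain`.

    `samples` is a list of (virtual_users, throughput) pairs sorted by load.
    """
    for (load_a, tput_a), (_load_b, tput_b) in zip(samples, samples[1:]):
        if tput_a > 0 and (tput_b - tput_a) / tput_a < min_gain:
            return load_a
    return None

def find_buckling_point(samples, max_error_rate=0.05):
    """Return the first load level whose error rate exceeds `max_error_rate`.

    `samples` is a list of (virtual_users, requests, errors) tuples sorted by load.
    """
    for load, requests, errors in samples:
        if requests and errors / requests > max_error_rate:
            return load
    return None

# Invented measurements: throughput flattens after 200 users,
# and errors spike at 800 users.
throughput = [(50, 120), (100, 235), (200, 410), (400, 425), (800, 430)]
errors = [(50, 1000, 0), (100, 2000, 4), (200, 4000, 20), (400, 4200, 60), (800, 4300, 1500)]

print(find_saturation_point(throughput))  # 200
print(find_buckling_point(errors))        # 800
```

The 5% thresholds are arbitrary starting values; noisy test runs may need a larger gain threshold or averaging over several runs.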

 

Next, system resource consumption should be examined. On Windows-based systems, the Perfmon tool can record system metrics during the stress test, e.g. processor load, network bandwidth consumption, memory allocation, virtual memory swapping, page faults, disk I/O, process execution time, request wait time, number of worker process restarts, etc. On Unix-based machines, there are many powerful and robust tools for gathering resource consumption data, e.g. “iostat”, “cpustat”, “vmstat”, “top”, and “sar”, to name a few. This data can also be used in making capacity planning and sizing decisions.

perf_perfmon.jpg
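Output captured from these tools during a test run is easy to summarize afterwards. As a small sketch, the function below averages one named column from text in the layout that Linux `vmstat` prints (the column set differs on other Unix flavors). The sample rows are made up for illustration, and `average_column` is just a helper name chosen here.

```python
def average_column(vmstat_text, column):
    """Average one named column from `vmstat <interval>` style output."""
    header = None
    values = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if column in fields:
            # Header row (vmstat repeats it periodically).
            header = fields
            continue
        if header and len(fields) == len(header) and fields[0].isdigit():
            values.append(float(fields[header.index(column)]))
    return sum(values) / len(values) if values else None

# Illustrative sample only, not output from a real portal server.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 812340  10240 402112    0    0     5    12  310  620 35  5 58  2  0
 6  0      0 640112  10240 402340    0    0     8    40  900 2100 70  8 20  2  0
14  1      0 512004  10240 402512    0    0    12    95 1500 4800 88  5  6  1  0
"""

print(average_column(SAMPLE, "id"))  # average idle CPU percentage: 28.0
print(average_column(SAMPLE, "r"))   # average run-queue length
```

A steadily shrinking `id` (idle) value together with a growing `r` (run queue) during a test is a typical sign that the CPU is the constrained resource.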

Portlets are hosted within application servers, so the app servers’ logs and performance metrics should also be reviewed. Depending on the platform, various tools can be attached to monitor the application servers’ health. For Java-based applications, JMX plug-ins can monitor performance metrics and manage the application server. IIS servers can be managed through MMC snap-ins.

 

ALUI has a tool called PTSpy, which is based on Log4J and Log4C. PTSpy has several useful performance logging calls. Performance results can be recorded using various appenders (network, file, DB) and reviewed later.

perf_ptspy.jpg
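The general pattern of performance logging (time an operation, then write the result through a configurable logging back end) can be illustrated without PTSpy itself. The sketch below uses Python's standard `logging` module purely as an analogy; it does not show the PTSpy API, and the logger name `portlet.perf` and the `timed` helper are invented for this example.

```python
import logging
import time
from contextlib import contextmanager

perf_log = logging.getLogger("portlet.perf")  # hypothetical logger name

@contextmanager
def timed(operation):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        perf_log.info("%s took %.1f ms", operation, elapsed_ms)

# Send records to the console; a file or network handler could be attached
# instead, much like choosing a different appender.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")

with timed("render_portlet"):
    time.sleep(0.01)  # stand-in for the real work
```

Because the timing lives in one helper, the destination of the results can be changed through logging configuration without touching the portlet code.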

It is also recommended to use application profilers in the development environment to optimize the portlet code. Application profilers can identify common issues, such as missing caching mechanisms, insufficient connection pooling, and memory leaks, and can also reveal less obvious things, such as how often certain methods are called, the size of objects being passed between methods, how many instances of an object are created, etc. Many IDEs offer either free or commercial-grade application profiler plug-ins. The Eclipse IDE has the Test & Performance Tools Platform (TPTP) project. It allows you to examine the code at line-by-line granularity as well as monitor the application at a much higher level, e.g. web-service calls. Similar tools exist for the NetBeans IDE, MS Visual Studio, Together J, etc.
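To show the kind of question a profiler answers ("how often is this method called?"), here is a small sketch using Python's built-in `cProfile` rather than any of the IDE tools named above. The functions `render_page` and `load_preference` are hypothetical stand-ins for portlet code.

```python
import cProfile
import pstats

def call_counts(func, *args):
    """Profile `func(*args)` and return {function_name: total_calls}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    counts = {}
    # Each entry maps (file, line, name) to (primitive calls, total calls, ...).
    for (_file, _line, name), entry in stats.stats.items():
        counts[name] = counts.get(name, 0) + entry[1]
    return counts

def load_preference(user_id):
    """Stand-in for an expensive remote call (hypothetical)."""
    return {"user": user_id}

def render_page(user_ids):
    return [load_preference(u) for u in user_ids]

counts = call_counts(render_page, [1, 2, 1, 2, 1])
print(counts["load_preference"])  # 5 calls, although only 2 distinct users
```

Five calls for two distinct users is the sort of finding that points to a missing cache: the repeated lookups could be served from memory instead.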

 

Application and server configuration files should also be examined. These include web.config, web.xml, portlet.xml, resources.xml, JDBC pool parameters (connection timeout, maximum number of connections), etc.

 

Applications use the database to persist non-volatile data, so properly tuning the database (SQL calls and packages) is a very important aspect of performance management. Aside from system-level indicators, such as memory consumption, network activity, total CPU time (in minutes and seconds), and number of worker threads, DB monitors can show how frequently tables are read and updated and how long queries and joins take to run. These metrics can lead to a better database schema design and/or improved queries.

perf_database.jpg

Database Server Resource Utilization

perf_database2.jpg
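How a schema change shows up in the database's own diagnostics can be tried out on a small scale. The sketch below uses an in-memory SQLite database (an assumption made only so the example is self-contained; the portal's actual database and its monitoring tools will differ) and compares the query plan before and after adding an index. The table and index names are invented.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_hits (id INTEGER PRIMARY KEY, user_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO page_hits (user_id, page) VALUES (?, ?)",
    [(i % 100, "home") for i in range(1000)],
)

sql = "SELECT * FROM page_hits WHERE user_id = ?"

before = query_plan(conn, sql, (42,))
conn.execute("CREATE INDEX idx_page_hits_user ON page_hits (user_id)")
after = query_plan(conn, sql, (42,))

print("before:", before)  # a full table scan
print("after: ", after)   # a search using idx_page_hits_user
```

The same before-and-after comparison applies to any database that exposes an execution plan: find the frequently run queries, check their plans, and add or adjust indexes where the plan shows a full scan.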

A system is only as good as its weakest link, whether that is the network, application logic (e.g. caching, patterns), the database, or management. For example, job scheduling can also affect system performance. Running virus-definition updates and file backups during peak usage hours will degrade system performance. All non-essential activities, such as system backups, anti-virus scanning, and security policy jobs (LDAP sync), should be scheduled to run during off-peak hours.
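A schedule review like this is easy to automate. As a rough sketch, the function below flags jobs whose run window overlaps peak hours. The 08:00 to 18:00 peak window and the job list are assumptions for illustration, so adjust both to your own usage pattern.

```python
def overlaps_peak(start_minute, duration_minutes, peak=(8 * 60, 18 * 60)):
    """Return True if a job overlaps the daily peak window.

    `start_minute` is minutes after midnight. The job may run past midnight,
    so the peak window is checked for today and for the following day.
    """
    end_minute = start_minute + duration_minutes
    for day_offset in (0, 24 * 60):
        peak_start = peak[0] + day_offset
        peak_end = peak[1] + day_offset
        if start_minute < peak_end and end_minute > peak_start:
            return True
    return False

# Hypothetical schedule: (job name, start as minutes after midnight, duration).
jobs = [
    ("file backup", 23 * 60, 120),                    # 23:00 to 01:00
    ("virus-definition update", 12 * 60 + 30, 30),    # 12:30 to 13:00
    ("LDAP sync", 6 * 60 + 30, 120),                  # 06:30 to 08:30
]

for name, start, duration in jobs:
    status = "overlaps peak hours" if overlaps_peak(start, duration) else "off-peak"
    print(f"{name}: {status}")
```

In this made-up schedule the LDAP sync starts before the peak window but runs into it, which is exactly the kind of overlap that is easy to miss when only start times are reviewed.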

 

Portlet application performance tuning is a methodical process. It covers multiple facets, e.g. infrastructure, management and planning, profiling, patterns, etc. Some improvements are low-hanging fruit; others are not. They can be placed into the following quadrants:

  1. Easy to implement, significant performance improvement.
  2. Easy to implement, insignificant performance improvement.
  3. Difficult to implement, significant performance improvement.
  4. Difficult to implement, insignificant performance improvement.

Decisions to modify the system can be influenced by time, budget, and required skill-sets.
