September 2007 Archives

Remote Desktop Tricks

Rarely do we get the "privilege" of working in freezing cold server rooms to access Aqualogic portal servers directly. Instead, the connection of choice is the Remote Desktop Connection application that connects to Terminal Services on those systems.

It's pretty common to have to move files to and from these servers, and the usual route is to map a file share from your desktop to the remote system. A lesser-known alternative is to access your local system from the remote session itself, so you don't have to switch back and forth between Windows Explorer and the remote desktop. To do this, click "Options" in the Remote Desktop window before connecting, then go to "Local Resources". There you can choose which local resources (such as your clipboard or local drives) are visible in the remote session. Once you connect, open Windows Explorer on the remote server and your local drives should show up there as shares.
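If you connect to the same servers a lot, the Local Resources choices can also be saved in an .rdp file instead of being clicked through each time. Here's a minimal sketch that generates one. The setting names (full address, redirectclipboard, drivestoredirect) are the ones I'd expect a recent Remote Desktop client to write when you use "Save As"; older clients may use different names, so treat them as assumptions and compare against a file saved by your own client. The function name is made up for illustration.

```python
def build_rdp_settings(server, share_clipboard=True, share_drives=True):
    """Return the text of an .rdp file that exposes local resources remotely."""
    lines = [
        f"full address:s:{server}",
        f"redirectclipboard:i:{1 if share_clipboard else 0}",
        # "*" asks for all local drives; an empty value shares none.
        f"drivestoredirect:s:{'*' if share_drives else ''}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # SERVERNAME is a placeholder for your own server.
    print(build_rdp_settings("SERVERNAME"), end="")
```

Save the output as something like portal.rdp and double-click it to connect with those resources already selected.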

Another lesser-known feature of Remote Desktop is the ability to connect to the "Console" of the remote server. If you work in an environment where a lot of people connect to the server, you're all too familiar with the usual 2-session limit imposed on remote connections, which produces the error "The terminal server has exceeded the maximum number of allowed connections".

This error can happen even when no one is currently connected but their sessions are still alive. To guarantee a connection to the remote server, use the following command:

mstsc /v:SERVERNAME /console

This will start up the remote connection as the "console", as if you were physically sitting in front of the server (rather than using one of the 2 virtual sessions normally available). Keep in mind that it will log off anyone else connected to the console, even if they're actually logged on locally. Usually, this feature is used to get in to the console, open the Terminal Services configuration, and close the stale sessions so that you can log in the "normal" way.

Publisher Fault Tolerance - Part 1

Publisher is one of the most critical infrastructure pieces of the ALUI portal, since it provides some of the most visible pieces of the average portal page (such as the header and footer). For this reason, it should be configured to be as fault-tolerant as possible.

A common misconception is that Publisher is a single, self-contained application that must be “load balanced” to achieve fault tolerance in the infrastructure. In reality, there are really four separate pieces to Publisher, each with unique characteristics and fault tolerance configurations.

Publisher Application
  Description: This is the publisher application itself that drives Publisher Administration and Explorer, where Administrators create, edit, schedule, approve, preview, and publish actual content. Once content is published, it is moved to the Published Content server.
  Target audience: Admins
  Can be redundant? Not easily

Workflow
  Description: This is the workflow engine that moves Content Items through predefined workflows on their path to publication.
  Target audience: Admins
  Can be redundant? Not easily

Publisher Redirector
  Description: This is the part of Publisher that the portal connects to when retrieving Published Content. Essentially the portal sends the portlet ID to a published_content_redirect.jsp page, which looks up the location for the published content and issues a redirect back to the portal.
  Target audience: End Users
  Can be redundant? Yes

Published Content
  Description: Once the portal gets the redirection from the redirector, it issues another HTTP request to the Published Content web server, which is essentially a web server that sits in front of a file system that contains all the HTML and other content produced by Publisher.
  Target audience: End Users
  Can be redundant? Yes

In general, the traffic flows as follows:
1) The portal goes to one or more Redirectors, which issue a redirect back to the portal.
2) The portal then issues HTTP requests to one or more Web Servers.
3) The Web Servers retrieve the content from the file systems (either local or a NAS).

Publishing content follows a separate path:
A) When new content is being published, Publisher can only write to a single file system or FTP server.
B) Some sort of mechanism needs to be implemented to synchronize the published content from one file system to another (if there are two). If the file system is actually a NAS, both Web Servers can be configured to point to the same file system location, since redundancy is already built in to the file system.
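The synchronization mechanism in step B can be anything from a scheduled copy job to a replication product. As a rough illustration only, here's a minimal one-way mirror in Python. The function name and directory layout are hypothetical, and it deliberately doesn't delete files that were removed from the primary, which a real job would have to handle.

```python
import filecmp
import shutil
from pathlib import Path


def mirror_published_content(primary, replica):
    """Copy new or changed files from primary to replica; return what was copied."""
    primary, replica = Path(primary), Path(replica)
    copied = []
    for src in sorted(primary.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(primary)
        dst = replica / rel
        # shallow=False compares file contents, not just size and mtime.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied
```

Run on a schedule against the directory Publisher writes to, this would keep the second Web Server's file system reasonably close behind the first one.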

Because the number of End Users vastly exceeds the number of administrators, and content administration is not as mission-critical as actually displaying the content to hundreds or thousands of users, it really is only critical to set up the redirector and the Published Content itself in a load balanced/fault tolerant fashion. We'll discuss configuring each of these in upcoming posts.

Brand Your Portal's Icon

Here's a neat (albeit relatively trivial) trick: ever notice that on some web sites the browser's default icon changes from the little IE or Firefox icon? You can do that for your portal too, and it's simple.

Just create a small icon (16x16), save it as "favicon.ico", and put it in the root directory of your portal's app server. Browsers will (most of the time) request this file and replace the icon automatically, even using it for bookmarks into the portal.
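To confirm the icon landed in the right place, you can request it the same way a browser would. A small sketch, assuming the icon sits at the root of the host that serves the portal; the function names and the example host are made up for illustration.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def favicon_url(portal_url):
    """Return the root-level favicon URL for the host serving portal_url."""
    # A leading slash makes urljoin discard the page's own path.
    return urljoin(portal_url, "/favicon.ico")


def favicon_is_served(portal_url, timeout=5):
    """Return True if the server answers the favicon request with HTTP 200."""
    try:
        with urlopen(favicon_url(portal_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are both OSError subclasses
        return False
```

For example, favicon_url("http://portal.example.com:8080/portal/server.pt") gives "http://portal.example.com:8080/favicon.ico", which is the URL to check.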

For more gory details on how all this works, and some other caveats, check out the Wikipedia article.

Cool Tools Part II: BGInfo

If you work in different environments (dev, test, prod) with a lot of servers (portal, publisher, collab, search), you've no doubt experienced the problem where you're looking at a remote window of a machine and completely forget which machine it is. Or, say, you know what the machine name is from the title bar, but you forget which system it is or what environment it's in. If you haven't used it already, an excellent tool is Sysinternals' BGInfo. Basically, at login, it generates a wallpaper for you with any static or dynamic text you'd like.

For example, you could make the wallpaper red for production machines, orange for test, and green for dev. You could include the machine name, processor information, hard drive space, IP address, and even a text description of the box (e.g., "ALUI Search Server") so that you can tell exactly what box you're looking at at a glance.

Setup is easy - just run the tool to configure what text and background you'd like, then add a shortcut to your startup folder. Include the /timer command line parameter ("C:\Bginfo.exe /timer:0") so that the BGInfo configuration interface doesn't appear every time you start up, but a new wallpaper is still generated.
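If you go the color-per-environment route, one way to keep it manageable is one saved BGInfo configuration per environment. The sketch below just builds the startup command; it assumes BGInfo accepts the path of a saved configuration file as its first argument (check the Sysinternals documentation for your version), and the .bgi paths are hypothetical.

```python
def bginfo_command(environment, exe=r"C:\Bginfo.exe"):
    """Build the startup command for BGInfo with a per-environment config."""
    # Hypothetical config files, one saved from the BGInfo UI per environment.
    configs = {
        "prod": r"C:\bginfo\prod-red.bgi",
        "test": r"C:\bginfo\test-orange.bgi",
        "dev": r"C:\bginfo\dev-green.bgi",
    }
    # /timer:0 applies the wallpaper without showing the configuration dialog.
    return [exe, configs[environment], "/timer:0"]
```

The resulting list is what you'd put in the startup shortcut's target, joined with spaces.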

Check it out here!

Know your traffic flow - Part II

We've touched on this time and again - optimizing your Aqualogic Portal is tricky because there are so many moving parts. We've talked about optimizations on the front end (image server), back end (portlet servers), and today I'd like to share a diagram that covers a lot of the HTTP requests in between.

The following diagram shows many of these transactions in a standard portal page request:


This diagram shows key events in a standard transaction:


Finally, this one shows key optimizations to tweak performance at various locations in the transaction chain:


Want it in Visio? It's all yours!

View ALUI Portal Source for Performance Data

Here's a quick and easy tip: when someone reports the ALUI Portal is "slow", have them view the page source (View > Source) and scroll to the bottom. There they should see an HTML comment showing how long the page took to render.

Not only can you see what server the page was loaded from (in the event that there are multiple servers behind a load balancer) and the portal version, you can see how long it took for the portal to put the page together and respond to the request (in milliseconds). If the number is low (less than 2 or 3 seconds), then the slowdown is likely the network or any of the devices in front of the portal (like the load balancer itself). If, on the other hand, it's pretty high (over 5 seconds), you can probably rule out the network as the primary cause of the slowdown, and start focusing on the portal and portlet servers.
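The rule of thumb above is easy to turn into a first-pass triage helper. A minimal sketch, using 3 and 5 seconds as the cutoffs from the paragraph above; the function name is made up, and anything between the two cutoffs is treated as inconclusive.

```python
def triage_render_time(render_ms):
    """Suggest where to look first, given the portal's reported render time in ms."""
    if render_ms < 3000:
        # The portal answered quickly, so the delay is probably outside it.
        return "network or devices in front of the portal"
    if render_ms > 5000:
        # The portal itself was slow to assemble the page.
        return "portal or portlet servers"
    return "inconclusive"
```

So a reported 800 ms points you at the network and load balancer, while 12000 ms points you at the portal and portlet servers.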

Integration products in the Aqualogic product suite often have diagnostic pages built in, and sometimes it's not that obvious how to get to them. But if you know the right URLs, they can provide a huge amount of useful information and sometimes even tips on how to remedy common problems. Note that you may be prompted for authentication when you go directly to these URLs; here you'd use the "authenticationid" user name and password you entered during installation.

Aqualogic Publisher's diagnostic page can be accessed through the "Publisher Administrator" portlet, or you can go directly to its URL: http://publisherservername:7087/ptcs/console/index.jsp.

Aqualogic Publisher's Workflow diagnostic page can be accessed via a link on the Publisher Diagnostic page, or you can go directly to its URL: http://publisherservername:7087/wfconsole/status-index.jsp.

Aqualogic Collaboration's diagnostic page can be accessed in Administration under "Select Utility->Collaboration Administration", or you can go directly to its URL: http://collabservername:11930/collab/admin/diagnostic/index.jsp.

Finally, Aqualogic Studio's diagnostic page is only accessible through a direct link (or you can create a Web Service/Portlet and view it through the portal): http://studioservername:11935/studio/jsp/Admin/diagnostics.jsp.
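If you have several environments, it can be handy to generate these links from your host names rather than retyping them. A small sketch using the ports and paths listed above; the dictionary and function names are made up, and your installation may use different ports.

```python
# Port and path for each diagnostic page, as listed in this post.
DIAGNOSTIC_PAGES = {
    "publisher": (7087, "/ptcs/console/index.jsp"),
    "workflow": (7087, "/wfconsole/status-index.jsp"),
    "collaboration": (11930, "/collab/admin/diagnostic/index.jsp"),
    "studio": (11935, "/studio/jsp/Admin/diagnostics.jsp"),
}


def diagnostic_url(product, host):
    """Return the direct diagnostic URL for a product on the given host."""
    port, path = DIAGNOSTIC_PAGES[product]
    return f"http://{host}:{port}{path}"
```

For example, diagnostic_url("studio", "studioservername") returns the Studio link shown above.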

11/2/07 Update: Ray Gao posted an even more comprehensive list than this one on his blog a couple weeks ago; check it out for a quick reference of virtually all the known diagnostic URLs out there (and one or two that aren't known...).

Don't Gateway Static Content

In my last post, I talked about all the HTTP requests that are made during a single ALUI portal page request. One of the things you may have noticed was that all HTTP requests were routed through the portal and the portal gateway. In cases where you've got portlet HTML that needs to be aggregated on a page or transformed using Adaptive Tags, that is necessary.

However, content gatewayed through the portal by definition puts load on the portal server. A common mistake people make when publishing content is to put ALL content in the gateway space for a Publisher Content Item, including things like images, style sheets, and JavaScript files. This creates unnecessary load on the portal server, since it's just acting as a proxy server and not really doing anything with the content it's proxying.

Fortunately, Publisher offers an easy fix for this; it allows you to publish images to a separate target that can be outside the gateway space. That way, the HTML that you create in a Content Item can be gatewayed (so that it can be transformed and aggregated on a portal page), but the images can be placed on the image server. Then, when the browser needs to load all the associated images, it can get them directly from the image server, resulting in faster page load times and less load on the portal server - a win/win proposition.

To access these settings, simply right-click a folder in Publisher and go to "Publishing Target". There you can specify the other Publishing Target for Images.

Note that you shouldn't just do this on your existing content without making sure the image links are set properly, because it's possible that existing content uses links that point to the original publishing target. But it still could be worth the effort of fixing all those tags, given the performance benefits you're likely to see.
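Before switching an existing folder over, it helps to know which image tags still point at the old publishing target. Here's a rough sketch that scans a piece of published HTML for them. The class and function names are made up and the URLs in the usage note are hypothetical; it only looks at img tags, so style sheets and scripts would need the same treatment.

```python
from html.parser import HTMLParser


class ImageLinkCollector(HTMLParser):
    """Collect the src attribute of every img tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def stale_image_links(html, old_target_base):
    """Return image URLs that still start with the old publishing target."""
    collector = ImageLinkCollector()
    collector.feed(html)
    collector.close()
    return [s for s in collector.sources if s.startswith(old_target_base)]
```

Calling stale_image_links(page_html, "http://publisher.example.com/publish/") on each published page gives you the list of tags to fix before (or right after) changing the Publishing Target for Images.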

Know your traffic flow - Part I

A complaint that comes up all the time is "The portal is slow". And from the end-users' perspective, this can be true - users shouldn't have to wait more than a couple of seconds for a page to load.

But that simple statement overlooks the fact that there are many different servers involved in the Aqualogic portal architecture. Unlike a simple web site, the browser is not just going to a single web server and loading static pages. Instead, there are a lot of other HTTP requests that happen behind the scenes that may be the culprit. So when someone tells you the "portal is slow", keep in mind that the issue may not be with the portal server at all.

Take the following traffic flow diagram for a very simple portal page with a single Publisher portlet on it:

Here's the basic gist of what's happening:

  1. Browser makes request for page; goes to Load Balancer
  2. Load Balancer directs the request to a single Portal Server
  3. Portal Server sees that there's a header, footer, and portlet on this page; for each of them it makes an HTTP request to the "Published Content Redirector"
  4. Published Content Redirector does a cache or DB lookup to get the published content URL for the content itself (header, footer, and portlet) and issues a 302 (Redirect) back to the Portal
  5. Portal gets the 302 (Redirect) response and makes another request for the published HTML
  6. In some cases, that published content can be code that goes and gets other content (say, a stock price to show up on the header)
  7. Portal aggregates all the Remote Portlet HTML and builds a single page; returns to Browser through Load Balancer

Notice that in this example, there are roughly 10-12 HTTP requests actually happening. If a slowdown happens in any one of them, the browser's page request will take a long time, because the portal has to wait for all the remote portlets to return their HTML before it can generate its response. This is the main reason you shouldn't set your web service timeouts too high - it's better for a single portlet to time out and show the user an error message than for the entire page to time out in the browser.

Stay tuned - more on this in another post, including a much more comprehensive diagram!
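To see why the timeout matters, here's a back-of-the-envelope model. It assumes the portal requests its portlets concurrently, so the page waits for the slowest one, capped by the web service timeout. The function name and the sample numbers are made up for illustration.

```python
def page_wait_seconds(portlet_seconds, timeout_seconds):
    """Estimate how long the portal waits before it can assemble the page."""
    # Assumes portlets are fetched concurrently, so the slowest one dominates.
    waits = {name: min(t, timeout_seconds) for name, t in portlet_seconds.items()}
    # Portlets slower than the timeout are cut off and show an error instead.
    timed_out = sorted(n for n, t in portlet_seconds.items() if t > timeout_seconds)
    return max(waits.values(), default=0), timed_out
```

With a header at 0.4 seconds, a footer at 0.3, and a stock-price portlet stuck at 45, a 10-second timeout means the page comes back after about 10 seconds with one portlet showing an error. With a 60-second timeout, the whole page hangs for 45 seconds.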

Changing SSO Settings

I was recently at a client site that was having problems with their Single Sign-On settings. They were using RSA ClearTrust with ALUI running behind WebLogic 8.1 and were seeing performance issues related to SSO. We started troubleshooting the problem by looking at a combination of RSA settings, the portalconfig.xml, and various settings on Apache and WebLogic.

The first thing we found was that the whole portal application was protected by SSO instead of just the SSOServlet; protecting only the SSOServlet is a best practice that is commonly ignored. When SSO is enabled in the portal, the first thing the portal will do when it receives a new request is redirect to the SSOServlet, which defaults to /portal/SSOServlet. By limiting your SSO protection to just the SSOServlet, you still allow the portal to log users in automatically and to redirect them to the SSO login screen when needed, without the overhead of having every portal request go through the SSO system. By making that change in their system, we were able to increase performance test numbers by over 100%.

The next thing we changed was session timeouts. When choosing session timeouts for the portal and SSO, you want them to be close, with the portal session timing out just before the SSO session. Why? Let's look at both pieces of that statement. First, you want the timeouts to be close mainly for performance reasons. Say your SSO session timeout is 8 hours and your portal session timeout is 20 minutes. In this case, after every 20 minutes of inactivity the portal's session will die, a new session will be created, and a "login" to the system will happen behind the scenes. Second, you want the portal session to time out just before the SSO session so that the portal enforces the SSO server's session timeout. Remember, as I mentioned above, the portal will only redirect you to the SSOServlet when there is no portal session. So if the scenario above is reversed, the portal session will be valid for 8 hours, but after 20 minutes of inactivity the SSO session will be invalidated. The user will be able to work normally within the portal, but if she launches another application that is protected by SSO, she will be prompted for authentication because her SSO session has expired.
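Both halves of that rule can be written down as a quick sanity check on a pair of timeout values. A minimal sketch; the function name is made up, and the 5-minute default for what counts as "close" is my own assumption, not a number from the portal or RSA documentation.

```python
def check_session_timeouts(portal_minutes, sso_minutes, max_gap_minutes=5):
    """Classify a portal/SSO timeout pair against the rule described above."""
    if portal_minutes >= sso_minutes:
        # The portal session outlives SSO, so the SSO timeout is not enforced.
        return "portal-outlives-sso"
    if sso_minutes - portal_minutes > max_gap_minutes:
        # Portal sessions expire early, forcing repeated behind-the-scenes logins.
        return "gap-too-large"
    return "ok"
```

The 20-minute portal / 8-hour SSO example comes back as "gap-too-large", the reversed case as "portal-outlives-sso", and something like 28 and 30 minutes as "ok".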

The last change we made was less a best practice and more specific to ClearTrust. When we made the first change (protecting only the SSOServlet), RSA would not recognize the URL /portal/SSOServlet as a valid protected resource and let all traffic through without protecting it. After trying numerous syntaxes to get it to work, we came up with a workaround: we changed the SSOServlet URL. This took two changes on the portal side and one on the RSA side. First, we edited the web.xml in the portal.war file and changed the SSOServlet mapping from "SSOServlet" to "/sso/SSOServlet". Then we edited the portalconfig.xml and changed the SSOVirtualDirectory Path from /portal/ to /portal/sso/. Finally, we changed what was protected in RSA to /portal/sso/*.

To summarize, the major things to look for with SSO and the portal are:
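Since the workaround touches three places, it's worth checking that they agree with each other. The sketch below combines the SSOVirtualDirectory path with the servlet name and tests the result against the protected pattern. Both function names are made up, and the matching is a simplified prefix check of my own; ClearTrust's real matching rules may well differ.

```python
def sso_servlet_path(sso_virtual_directory, servlet_name="SSOServlet"):
    """Combine the SSOVirtualDirectory path with the servlet name."""
    return sso_virtual_directory.rstrip("/") + "/" + servlet_name


def is_protected(path, protected_pattern):
    """True if path falls under a pattern of the form '/prefix/*'."""
    # Simplified prefix matching; ClearTrust's real matching rules may differ.
    if protected_pattern.endswith("/*"):
        return path.startswith(protected_pattern[:-1])
    return path == protected_pattern
```

With the values from this post, sso_servlet_path("/portal/sso/") gives "/portal/sso/SSOServlet", which falls under "/portal/sso/*", while the rest of the portal's URLs stay outside the protected space.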

  1. Make sure you are only protecting the SSOServlet.
  2. Make sure your portal session timeout is slightly shorter than your SSO session timeout.
  3. If you are having problems protecting only the SSOServlet, change the path for it.