Howdy all. Hope the nice weather is finding you happy, healthy, and allergy free. As we were doing spring ALBPM house-keeping with a client the other day, we stumbled upon a bit of a problem with the way that BPM loads some images. This led us to go digging into how the ALBPM->Portal integration works. And anytime I get a reason to dig into the guts of software, it usually ends up as a blog post….hope you enjoy.
Cliff Notes if you don’t want to read my wordy explanation
- The BPM->Portal integration makes use of the old Plumtree OpenControls libraries
- Out of the box, the OpenControls libraries in use with the BPM->Portal integration are configured to go all the way back to the BPM container to load static files (images, js, css, etc) on every request to the portlet-ized BPM workspace.
- There are a bunch of static files that get loaded every time you hit the BPM workspace. The net result is that the BPM portlets are slower to render than they should be.
- There’s a servlet filter configured in the BPM workspace web.xml to cache static files on the client, but it doesn’t seem to work.
- You can force Open Controls to load images from a webserver of your choice (i.e. the portal image server where you presumably have caching set up) by adding the following entry to $BPM_HOME/enterprise/webapps/workspace/WEB-INF/web.xml:
- You’ll also need to copy the following directory into your bpmContext folder of your imageserver:
- Bounce the BPM workspace after making the changes above and you should see a marked improvement in workspace performance
For the masochists amongst you, the long-winded explanation
First, a “Did you Know” on the BPM->Portal integration
- Did you know that the BPM->Portal integration was originally written by Plumtree way back when as a way to bolt Fuego BPM into the Plumtree Portal? This original codebase is still more or less intact, and is still used for the integration today.
- Did you know that the BPM->Portal integration is built in Java and uses Java Server Faces (JSF)?
- Did you know that the BPM->Portal integration also makes use of the somewhat dated Plumtree OpenControls libraries?
- Did you know that the code for the standalone BPM web-based client is a lot cleaner than the BPM->Portal integration?
- Did you know that there are servlet filters configured on the BPM Workspace that should do neato stuff like caching and compression auto-magically?
OK, so the whole “Did you know” thing is a bit of a stretch, and really just a vehicle to give you background on the issue we were researching. So….
The Issue We Were Researching
It turns out that, out of the box, the portlet-based BPM workspace doesn’t handle some images too smartly. Specifically, on every UI request, there are a bunch of static images that get reloaded all the way from the embedded Tomcat server that runs the BPM workspace. Now, everybody and their brother knows that it’s just good common sense to cache images, so what gives? Welp, let’s start at the very beginning and see. After users reported slow-loading images in the BPM workspace, we used one of the best tools known to mankind
to take a look at all the HTTP requests being made on page load:
As the lovely highlighting above points out, some images in the workspace are getting loaded from the portal imageserver, and are being appropriately cached in the client browser. We know they’re being cached by that handy-dandy 304 HTTP status code.
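For anybody fuzzy on why a 304 means "cached": on a repeat visit the browser sends along a validator for the copy it already holds, and the server answers 304 (Not Modified, no body) if that copy is still current, or 200 with the full file if it isn't. Here's a minimal sketch of that decision in Java. The class name, method name, and ETag values are made up for illustration, and a real server also handles If-Modified-Since, weak validators, and lists of ETags.

```java
import java.net.HttpURLConnection;

class ConditionalGetSketch {

    /**
     * Status a server would send for a GET on a static file.
     * ifNoneMatch is the validator the browser sent (null on a first visit);
     * currentEtag is the validator of the file as it exists on the server now.
     */
    static int statusFor(String ifNoneMatch, String currentEtag) {
        boolean browserCopyIsCurrent = ifNoneMatch != null && ifNoneMatch.equals(currentEtag);
        return browserCopyIsCurrent
                ? HttpURLConnection.HTTP_NOT_MODIFIED  // 304: reuse the cached copy
                : HttpURLConnection.HTTP_OK;           // 200: send the whole file again
    }

    public static void main(String[] args) {
        String etag = "\"v1\"";                        // hypothetical validator
        System.out.println(statusFor(null, etag));     // first visit, prints 200
        System.out.println(statusFor(etag, etag));     // repeat visit, prints 304
    }
}
```

So a column full of 304s in the request list is good news; a column full of 200s for the same images on every reload is not.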
How we researched the issue
This is the section of the blog post where I take you deep inside the black art of tearing apart a web application and making educated guesses to arrive at a testable hypothesis. Strap in for a non-stop thrill ride.
Anytime I’m faced with a black box type problem (i.e. there’s something going wrong with a piece of commercial software to which I don’t have the source code), I more or less go through the following checklist:
- Is this something I’ve seen before?
- Is this something that somebody else has seen before?
- Is it going to be more painful to open a support ticket on this than to just figure it out myself?
- What do I know about the inputs and outputs of the black box?
- What do I know about how the application is built?
- How can I tear the application apart to learn something useful?
In this case, we were looking at a problem that we’d never noticed before, nor could we figure out a way to ask the interwebs if anybody else had seen it before. I was pretty sure that going through the process of working a support ticket was going to be more painful than just fixing the problem (besides, what fun is working with support when you can break into the guts of a system?), so I asked myself, “What do I know about the inputs and outputs of the black box”?
- Input -> HTTP Request to a web page that loads multiple static files and some dynamic content
- Desired output -> Cached static content
- Actual output -> Non-cached static content
Then I thought about what I know about how the application is built:
Finally, how can I tear the application apart to learn something useful:
- Look at the inputs (i.e. look at the URLs being requested)
- Look at the web.xml and other configuration files of the application to see how everything is glued together
- Decompile the source to see what the hell is going on
So off we went. The starting point in our maze was to take a good look at the problem image requests (Warning: I’m going to get all stream of consciousness on you here…welcome to my troubled world)
Let’s strip that down to look at the real interesting stuff:
OK, I know that I should probably be logging onto the bpm_portlet_server box and figuring out what’s listening on port 123. What do you know, it turns out that there’s a Tomcat instance listening on port 123 - that’s nice.
Looks like the name of the web app running on that Tomcat instance is workspace. Maybe there’s a WAR file or something sitting around that I can play with. Let me take a look in the Tomcat webapps directory on bpm_portlet_server. Hmm, nothing here…WTF? Ah, but I know that this is a special Tomcat server that’s configured all crazy-like because it’s embedded with the ALBPM distro. Let me poke around in the ALBPM directories to see if there’s anything interesting there. Well how’s about that, there’s a directory at $BPM_HOME/enterprise/webapps named “workspace”. Seems like this might be useful. Let’s take a look. OK, there’s no WAR file here, but I do see a WEB-INF directory, guess they’re just deploying an exploded WAR to the Tomcat server. Let’s open up the web.xml and see how this thing is configured:
Well that was pretty useless. Except…hmm…what’s this:
Looks like there’s a servlet filter configured that should be caching the images. Let me just decompile that guy real quick and see what he’s doing:
What the hell? This thing looks to be setting the cache expiration. Let me go back to Firebug and see what the max-age is on the images I’m loading:
WTF? I guess the portal gateway is somehow stripping off the cache headers for this image. Why don’t we just hit the web-app directly to see if the portal is screwing with the headers. Back to browser/Firebug pointed at:
<Reload a few times>
<Look at FireBug to see cache headers>
WTF?!?!?! OK, guess that servlet filter isn’t working right – that’s annoying.
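For comparison, here's roughly what a working cache filter boils down to: stamp a Cache-Control header on the response before it goes out, then check from the client side that the header actually arrives. This is a minimal sketch only. It uses the JDK's built-in com.sun.net.httpserver classes instead of the servlet API, it is not the BPM filter, and the class names, the /img path, and the one-day max-age are all assumptions for illustration.

```java
import com.sun.net.httpserver.Filter;
import com.sun.net.httpserver.HttpContext;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class CacheFilterDemo {

    /** Adds a Cache-Control header to every response that passes through it. */
    static class CacheControlFilter extends Filter {
        private final int maxAgeSeconds;

        CacheControlFilter(int maxAgeSeconds) {
            this.maxAgeSeconds = maxAgeSeconds;
        }

        @Override
        public void doFilter(HttpExchange exchange, Filter.Chain chain) throws IOException {
            // Set the header first, then let the handler write the response.
            exchange.getResponseHeaders().set("Cache-Control", "public, max-age=" + maxAgeSeconds);
            chain.doFilter(exchange);
        }

        @Override
        public String description() {
            return "sets Cache-Control max-age on static content";
        }
    }

    /** Starts a throwaway local server, requests one fake image, returns the Cache-Control header seen by the client. */
    static String fetchCacheControl(int maxAgeSeconds) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        HttpContext context = server.createContext("/img", exchange -> {
            byte[] body = "not-really-a-gif".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        context.getFilters().add(new CacheControlFilter(maxAgeSeconds));
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/img/attachment.gif").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.getResponseCode();                      // forces the request to be sent
            return conn.getHeaderField("Cache-Control"); // null would mean the filter did nothing
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchCacheControl(86400));    // prints: public, max-age=86400
    }
}
```

The client half is the same test we ran by hand in Firebug: hit the app directly, skip the gateway, and look at what comes back in the headers.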
Let’s take a look at that web.xml again…
Yep..still noise. Time to start randomly looking around in directories in the workspace web application.
<cd $BPM_HOME/webapps/workspace/jsf/><Look around at a million different files that aren’t very interesting><Stumble upon …/workspace/jsf/view/viewPresentationNormal.xhtml>
OK, this looks somewhat useful. Looks like it’s building the BPM inbox. Let’s pay attention to this file. It includes viewPresentationPanel.xhtml…let’s open that up. OK, this guy is rendering a table. The BPM inbox table has images that aren’t getting loaded correctly. Let’s zero in here. OK, including yet another file…
Hmm, this guy is including a ton of stuff that looks like low-level building blocks for a table cell. I think that the attachment gif was rendering incorrectly, so let’s look at the attachment file it’s including: instance/hasAttachments.xhtml
wait for it….
wait for it…
hasAttachments.xhtml is definitely interesting, if only because of this line:
OK, so we’ve finally managed to dive down to where the app is making calls to get images. <Quick timeout to pat myself on back and IM friends a steady stream of profanities about how ridiculously ALBPM is built>. Now, that fn: prefix is a tag library. Let’s see what taglib is being used. Obviously that stupid taglib isn’t included in hasAttachments.xhtml, that would be too simple. So backtrace through the xhtml files until we get to viewPresentationPanel.xhtml which includes that line:
Now to find where the actual code for this taglib lives…BACK to web.xml
Yep, still noise. Except for:
Who am I to stop now? Let’s open up bpmWorkspaceLibrary.taglib.xml
OK, the class file backing the tag library is:
Off to WEB-INF/classes to see if we can find the class file. Nope, not there.
Never fear, WEB-INF/lib, here we come. Tons of jar files here…awesome. Oh well, seems likely that the jsfcomponent classes are probably in fuego.jsfcomponents.jar.
jar -xvf fuego.jsfcomponents.jar
Yep, there’re a bunch of class files that look useful in that there jar file. Time to start decompiling.
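Side note for anybody who'd rather not explode every jar onto disk: you can list just the class entries whose names look promising before reaching for the decompiler. A minimal sketch in Java follows. The class name, method name, and the "render" keyword are my own picks for illustration, not anything that ships with ALBPM.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

class JarClassFinder {

    /** Returns the sorted names of .class entries in the jar whose path contains the keyword (case-insensitive). */
    static List<String> findClasses(Path jar, String keyword) throws IOException {
        String needle = keyword.toLowerCase();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".class"))
                    .filter(name -> name.toLowerCase().contains(needle))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Example invocation (arguments are up to you): <path-to-jar> <keyword>
        findClasses(Paths.get(args[0]), args[1]).forEach(System.out::println);
    }
}
```

It won't tell you what the classes do, but it narrows down which ones are worth decompiling.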
Note: At this point I’m going to spare you the horror of walking through the next hour or so where I decompiled a ton of class files. The upshot is that I ended up at a dead end. The images loading from these tag libraries were the ones that were loading correctly all along.
<Take another break to tell friends on IM that I’m quitting my job to go pursue my lifelong dream of being a garbage man>
Back to troubleshooting now. I notice in the xhtml files that there are also some calls that look like:
So let’s see what the oc: tag library is all about. ViewPresentationNormal.xhtml was nice enough to include the line:
Back to our old friend web.xml.
Surprisingly, still useless.
Let’s just go look for some jar file in the WEB-INF/lib directory that looks like it probably contains the opencontrols:
jar -xvf opencontrols.jar
Yeah, there’s a bunch of stuff there, but it’s just the standard Plumtree open controls code…I know I don’t want to look at that. Meh, maybe there are some more interesting jar files in here that I can blindly look into: fuego.workspace.jar you say? OK, let’s open it up.
jar -xvf fuego.workspace.jar
Hmm, there are a bunch of renderer classes in here. Let’s decompile one and have a little looksy.
Let’s see if the decompiled source is doing anything with images:
grep -i image WorkspaceTableRender.jad
grep -i img WorkspaceTableRender.jad
BINGO. For reals this time.
Hey, that line that says:
String urlBase = "/plumtree/common/private/opencontrols/image/table";
Sure looks a lot like the problem URLs I was seeing back in Firebug. Let’s see how these URLs are getting built. Looks like something’s going on in XPResourceRequest. I suppose we’re going to have to decompile that file too.
The WorkspaceTableRender code is calling the GetReference method on XPResourceRequest, let’s hone in there:
Hey, there’s the “r=” that I see on the bad image request URLs in Firebug. I must be in the right place. Let’s look at some more code here:
AH-HA. I OWN YOU BPM. I AM SO THE BOSS OF YOU!
So I bet if we just add an initialization parameter to that useless web.xml file that tells BPM and Open Controls to use the image server for loading images, everything will work.
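To make that hunch concrete, here's the shape of the logic I'm betting on, as a minimal Java sketch. To be clear, this is not the decompiled OpenControls code: the class name, method name, parameter names, the /workspace/resource path, the example hostname, and the sort.gif file name are all hypothetical. The only pieces taken from the investigation are the “r=” query parameter and the /plumtree/common/private/opencontrols/image/table base path.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ResourceUrlSketch {

    /**
     * Builds the URL the browser will use to fetch a static resource.
     * If an image server base URL is configured, point straight at it so the
     * web server (and its caching) handles the file. Otherwise fall back to
     * asking the application itself for the file via an "r=" parameter.
     */
    static String resolve(String imageServerBase, String appResourcePath, String resourcePath) {
        if (imageServerBase != null && !imageServerBase.trim().isEmpty()) {
            String base = imageServerBase.trim();
            if (base.endsWith("/")) {
                base = base.substring(0, base.length() - 1); // avoid a double slash
            }
            return base + resourcePath;
        }
        return appResourcePath + "?r=" + URLEncoder.encode(resourcePath, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String image = "/plumtree/common/private/opencontrols/image/table/sort.gif"; // hypothetical file name

        // No image server configured: every image request goes back through the app.
        System.out.println(resolve(null, "/workspace/resource", image));

        // Image server configured: the image is fetched from the web server instead.
        System.out.println(resolve("http://imageserver.example.com/imageserver/", "/workspace/resource", image));
    }
}
```

If the real code works anything like this, the fix is pure configuration: supply the base URL and the fallback branch never runs.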
BACK ONE LAST TIME to web.xml to add:
web.xml - No Longer Useless!!!
Bounce the BPM Workspace. Reload the page a few times. Look at the images in Firebug, and, voilà! They’re all being loaded from the imageserver, and they’re all being cached. And hey, the BPM workspace sure does seem to be chugging along faster than it was before.
OK, not really the end. A quick summary
Look, I know the whole previous section was just a bunch of rambling to 99.99% of you out there. But I wrote it that way on purpose, because it’s a pretty accurate account of how a lot of debugging/trouble-shooting gets done when you’re dealing with COTS software. You don’t have the application source to definitively know what the code is doing. So you have to look for clues and follow educated guesses. Sometimes those clues are red herrings and lead to dead ends. But if you unravel enough thread, you eventually get down to the crux of the problem. It just takes time, patience, and a good understanding of how applications are built and put together.
You may have noticed that I referred to “we” several times in the text above. No, I haven’t started referring to myself in the royal third. I actually had a partner in crime for some of this debugging exercise. Some of you may know him as Howard, some of you might refer to him as Ross, but I just call him HRoss. In any case, he’s one of the best portal consultants still working directly for Oracle. So if you’re still shelling out the big bucks for an Oracle consultant (rather than a more reasonably priced alternative *cough* Function1 *cough*), ask for him by name.
Tech folks Only
Are you a non-techie trying to read this section? Seriously? Didn’t you read the header? Ugh, I’ll wait…
OK, now that all the non-geeks are gone. Remember that time two paragraphs ago when I said I wrote that whole stream of consciousness/rambling debugging section the way I did to make a point? Well I did, it’s just that the point was really that I hoped some of you guys could relate to the experience/thought process…I hope you did. And hey, even if you didn’t, the next time somebody asks you what you do for a living and you don’t feel like getting into it, just point them to this post…I’m pretty sure they won’t ask you again.