September 2008 Archives

Beware the Security Propagation Bug(s)

We've warned you before about ACL propagation when you're changing the security in ALI.  Heck, we even created a product to ease the pain of this important task.  Today's bug is about another issue with security propagation.

Well, it's actually 2 bugs (maybe 3).  Let me explain:

When you answer "yes" to that question about security propagation, a job is created.  Here's the problem:  the job is run as the user who created the folder, not the user changing the security.  What if you later delete that user?  Well, bug #1: automation server is hosed.  You're going to get an error like this:

failed-job.jpg

The Exception says "*** Job Operation #1 of 1 with ClassID 20 and ObjectID 898 cannot be run, probably because the operation has been deleted.", and the error's wrong (because it says "probably", I won't count this as a bug).  The real problem in this case is that the folder's OWNER has been deleted, not the operation itself.

This gets me to Bug #2: when you delete a user, they're removed from all groups, but apparently they're not removed as OWNER of any of the admin folder objects (and how could they be?  What should the owner revert to?).  Obviously, this is what causes the problem with automation server.  If this one were fixed somehow, bug #1 would become irrelevant.

Bug #3 (this one I haven't confirmed yet, but it is likely to exist given the testing and logs mentioned above): suppose Joe Content Manager creates an entire structure of folders.  Then Joe's boss decides Joe shouldn't be an admin for one of the child folders (say, an "Executive Committee" folder) and removes his access: the boss changes the security on that "Executive Committee" folder and propagates it.  Later, the boss wants to give Mary Executive privileges on the folder, so he changes the ACL again and chooses to propagate.  Here's the rub:  because the job is run as Joe, and Joe no longer has access, the job fails, and Mary cannot be added.  In fact, it's likely that Joe can't even be added back, so that ACL is completely frozen unless the admin goes through every object and re-adds Joe (or changes the owner of the folders through the DB).  Have I mentioned LockDown?

Anyway, want the SQL Scripts to fix the owners on the various folders?  Hit the jump.

This is a little old-school, but still a very relevant tip.  This problem has been around for all of the ALI 6.x days (including 6.5), and if I remember correctly, even the Plumtree 5.x days.  Basically, the automation server logs its activity to the PTJOBLOGS table, and if you've got jobs that log verbosely or run very frequently, this table can become HUGE:  I had a recent customer whose database had grown to over 80 GB because of this table.

To prevent this table from growing astronomically big, make sure:

  1. You don't have jobs that are running way too often (once a minute typically means a job is running - and logging - constantly),
  2. You don't have verbose logging for any jobs running in the portal, and
  3. Your portal configuration is set to only save a reasonable amount of job log data.  By default, the portal will keep around 60 days' worth of logs, which is a pretty big number.  If you ask me, any job logs older than 7 days are worthless because jobs always run more than once a week.  But, of course, you're not asking me, so I'll just tell you how to change this setting and you can decide for yourself.  The configuration setting isn't available through the UI; instead you have to tweak the database.  Specifically, you have to update the row with SETTINGID=15 in the PTSERVERCONFIG table of the ALI database.  Set the value to whatever you think is appropriate based on your use of the job log:

ptjoblogs.jpg
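If you'd rather script that tweak than click around in a query tool, here's a minimal sketch in Python.  The table name, SETTINGID=15, and the VALUE column come from these posts; everything else is an assumption: the in-memory SQLite table is only a stand-in for the real portal database, the helper name is made up, and I'm assuming the value is a number of days stored as text.  Check your own row before trusting any of that, and back up the database first.

```python
import sqlite3

JOB_LOG_RETENTION_SETTING_ID = 15  # SETTINGID named in the post


def set_job_log_retention(conn, days):
    """Update the job log retention row and return the previous value.

    Assumption: VALUE holds a number of days stored as text.
    """
    if days < 1:
        raise ValueError("retention must be at least one day")
    cur = conn.cursor()
    cur.execute(
        "SELECT VALUE FROM PTSERVERCONFIG WHERE SETTINGID = ?",
        (JOB_LOG_RETENTION_SETTING_ID,),
    )
    row = cur.fetchone()
    if row is None:
        raise LookupError("SETTINGID 15 not found in PTSERVERCONFIG")
    cur.execute(
        "UPDATE PTSERVERCONFIG SET VALUE = ? WHERE SETTINGID = ?",
        (str(days), JOB_LOG_RETENTION_SETTING_ID),
    )
    conn.commit()
    return row[0]


def demo():
    # Stand-in for the portal database: a two-column table in memory.
    # The real PTSERVERCONFIG table may well have more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PTSERVERCONFIG (SETTINGID INTEGER PRIMARY KEY, VALUE TEXT)"
    )
    conn.execute("INSERT INTO PTSERVERCONFIG VALUES (15, '60')")
    previous = set_job_log_retention(conn, 7)
    current = conn.execute(
        "SELECT VALUE FROM PTSERVERCONFIG WHERE SETTINGID = 15"
    ).fetchone()[0]
    conn.close()
    return previous, current
```

Against a real SQL Server or Oracle portal database you'd swap in that driver's connection, and the `?` placeholder style may need to change to whatever that driver expects.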

Finally, what happens if you've already got a PTJOBLOGS table that contains 200 million rows?  Here's a tip that I got from our friend and client, Mike Jones at MedSolutions, Inc.: if you run an amateur statement like "delete from ptjoblogs", it will take forever because every row deletion is written to the transaction log.  If you run "truncate table ptjoblogs", on the other hand, the database just deallocates the table's data instead of logging each row, so you don't have to wait ridiculous amounts of time (not to mention the amount of database log space you'd be consuming).  Keep in mind that truncate removes every row, so only use it if you can live without the existing job history.

Here's a little feature that some of you may find useful:  Collaboration Server 4.5 can send you an email if it can't connect to the new Notification Service.  For those of you who have had countless problems with the old Notification Server that shipped with Collab 4.2 and earlier, this is a must-have feature.

Just go to Administration: Select Utility...: Collaboration Administration: Collaboration Notification, and enable Health Monitoring:

 

collab_notify2.jpg 

It works, too (assuming your SMTP server doesn't require authentication for internal addresses).  Collab is even kind enough to mark the mail as "High importance":

 

collab_notify_email.jpg

On the other hand, for those of you that don't appreciate this new little nugget of functionality, consider this irony:  Collaboration 4.5 uses email to tell you when email is broken.  Cosmic, man.

STILL working on Collab 4.5 RSS feeds...  Today's tip will either be completely useless to you, or save you a ton of time.  If you ever see the following error:

collab.server.administrator CNS.SECURITY com.bea.notification.security.SecurityManager
Unable to authenticate user with token '1|1288135304|xarwmCj1FbIzv/Mo/yyd0tEjkgI='

Don't bother looking at the Security and Directory Service (really, what's that thing for anyway?). The following screenshot explains it all:

notification_token.jpg

Open your portal database, query PTSERVERCONFIG ("select VALUE from PTSERVERCONFIG where SETTINGID=65"), and stick that value in the "Message authentication code seed value" field on the "Aqualogic Notification Service"/"Login Tokens" page.

Maybe when I'm much less annoyed at this whole process I'll dig more into the SAML2 Token Type.  And that elusive "Security and Directory Service", which still seems to provide no more valuable service than being a red herring for bizarre portal issues. 

ALI 6.5 Configuration Manager

When I first introduced the Configuration Manager, I wondered aloud how it would work in a multi-system environment.

Well, it turned out that the Configuration Manager isn't that complicated after all:  it's pretty much an interface for editing a configuration file on disk, so it only applies to the system you're running it on.

The file can be found at bea\alui\settings\configuration.xml, and can be edited directly as easily as you would through the Configuration Manager interface (with the exception of passwords, which are encrypted).

ALI 6.5 Directory Services

In my (seemingly) never-ending quest to get Collaboration Notification working with 6.5, I ran into yet another error resulting in a ridiculous amount of diagnostic work.  The good news is that the error I was running into was a simple self-inflicted problem.  The bad news is there is an amazing lack of documentation about how the new Notification system works with Directory Services (the "BEA ALI LDAP Directory" service in Windows).

Here's the general premise behind ALI Directory Services: ALI 6.5 ships with this new Directory Services component that provides an LDAP service for Portal User accounts.  The idea is that historically, the portal has been great at synching users from external repositories (from AD, LDAP, or custom sources) into its own database. Once those users get synched and aggregated into the portal, though, they're not exposed to any other services.  Directory Services aims to resolve that problem: 6.5 provides an LDAP server that uses the industry-standard LDAP protocol to expose users that have been synched to the portal.  So any other system can use LDAP to get user information.

Fantastic feature, right? But with the dearth of documentation out there, what may not be immediately obvious is that this Directory Service is also used by internal components such as the Notification Server.

I've only begun to scratch the surface of how all these components work together, but if you're interested in reading about how they DON'T work together (saving yourself hours of diagnostic time), hit the jump.

Publisher 6.5 - Workaround for Broken Upgrades

Ruh, roh.  This one's a doozy.  Yeah, I'm pretty high on Publisher 6.5, especially the feature where Publisher no longer issues a redirect for published content, greatly improving performance because it allows the portal to cache all this content.

Unfortunately, I just ran into a major problem with this feature in a dev upgrade: the servlet for streaming all this content back doesn't do any transformations on any of the URLs.  This is bad, since if you're doing an upgrade, you've likely got hundreds or thousands of Content Items with relative links or images in them.  For example, suppose you have an existing Content Item with an image in it (that has been published to the same folder in Publisher).  The reference for this image will likely be:

<img src="image.jpg">

This worked fine with the old redirector, as the image tag would be transformed:

http://server/portal/server.pt/gateway/PTARGS_0_1_493_234_684_43/http%3B/ptpublisher%3B7087/publishedcontent/publish/folder1/folder2/image.jpg

But with the new content streaming functionality, this link does NOT get transformed the way you'd expect; the portal sees the relative link as relative to the SERVLET, not relative to the existing Content Item.  So the browser sees a URL like this:

http://server/portal/server.pt/gateway/PTARGS_0_1_493_234_684_43/http%3B/ptpublisher%3B7087/ptcs/PublishedContentServlet/image.jpg

... which obviously isn't right.
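To see why the relative link lands in the wrong place, here's a minimal sketch using Python's standard urljoin, which resolves a relative reference against a base URL the same way a browser does.  The gateway prefix and folder path are taken from the URLs above; the document name item.html and the exact shape of the servlet URL are made up for illustration, since I haven't dug out what the real base URLs look like.

```python
from urllib.parse import urljoin

GATEWAY = (
    "http://server/portal/server.pt/gateway/"
    "PTARGS_0_1_493_234_684_43/http%3B/ptpublisher%3B7087"
)

# Hypothetical base URLs: "item.html" is an invented Content Item name.
OLD_BASE = GATEWAY + "/publishedcontent/publish/folder1/folder2/item.html"
NEW_BASE = GATEWAY + "/ptcs/PublishedContentServlet/item.html"


def resolve(base_url, relative_src):
    """Resolve a relative src against the URL the page was served from."""
    return urljoin(base_url, relative_src)


old_url = resolve(OLD_BASE, "image.jpg")  # lands in the published folder
new_url = resolve(NEW_BASE, "image.jpg")  # lands next to the servlet
```

The relative src is identical in both cases; only the URL the content is served from changes, and that alone is enough to move the image request away from the folder where the file actually lives.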

In fact, Oracle seems to be aware of this: in the content.properties file, they changed the LTCUseRelativeURLs setting to "false" (it was "true" in 6.4).  That way, for NEW Content Items, the URLs are all absolute, which prevents this problem from coming up.

I was going to revoke my recommendation to upgrade to 6.5 for now, but came across a really easy fix to get all the benefits of 6.5 without this glaring bug.  Hit the link for the workaround.

Cool Tools Part XV: LDAP Browser

Granted, the vast majority of the Cool Tools we've listed here aren't exclusive to the AquaLogic line, and frankly, while I've used this one before to diagnose problems with the AD and LDAP Identity Services, I probably wouldn't have even included it until recently, when the new ALI Directory Service made its debut.

The tool is Softerra's LDAP Browser, and it comes in both free and paid versions.  Using it (and the tip for the proper user name to authenticate with), you can peruse the account information exposed by the new service.  Simply start it up, point it at the right port with the right credentials, and browse away like you would with Windows Explorer or Regedit:

ldap.jpg

The two interesting things I've noticed off the bat are that the base DN (Distinguished Name) for the user and group OUs (Organisational Units) is dc=bea,dc=com, and that different auth sources have different OUs under that (while native Plumtree accounts exist at the top level).

ALI 6.5 Directory Services Part II

In my last post, I mentioned "6.5 provides an LDAP server that uses the industry-standard LDAP protocol to expose users that have been synched to the portal.  So any other system can use LDAP to get user information."

All well and good, sure, but how do you authenticate against this fancy new LDAP service in the ALI stack?  I tried using an LDAP Browser (stay tuned for that "cool tool" coming up) to see what the service had to offer, and had no idea how to authenticate against it.  It kept requesting a password, and I kept using "administrator" and the admin password.  No dice.

So I turned to another trusty Cool Tool, TcpTrace, and exploited the fact that the Configuration Manager allows you to specify which port the LDAP server listens on separately from the port the Notification Service connects to it on (again, see the last post).  By getting the LDAP Server to listen on port 2389 and the Notification Service to connect on port 9999, I ran TcpTrace to proxy those connections from 9999 to 2389.  Here's what I saw:

ldap_tcp_trace.jpg

Aha!  The User ID isn't just "administrator"; it's "uid=administrator,ou=users,dc=bea,dc=com".  (The red blobs are censoring our administrator user's password.)  Remind me again, how was I supposed to know that?  Oh yeah, maybe I wasn't...
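If you end up scripting connections to this service, building that bind DN by hand gets old quickly.  Here's a minimal sketch of a helper (the function names are mine, not anything shipped with ALI) that produces the same format seen in the trace and escapes the characters RFC 4514 says must be escaped in a DN value.  It assumes the account lives under ou=users with a base DN of dc=bea,dc=com, which is all I've seen so far; accounts from other auth sources may sit elsewhere in the tree.

```python
def escape_rdn_value(value):
    """Escape a string for use as an attribute value inside a DN (RFC 4514)."""
    special = set('"+,;<>\\')
    last = len(value) - 1
    out = []
    for i, ch in enumerate(value):
        if ch in special:
            out.append("\\" + ch)   # always-escaped characters
        elif ch == "\0":
            out.append("\\00")      # NUL must be hex-escaped
        elif i == 0 and ch in " #":
            out.append("\\" + ch)   # leading space or '#'
        elif i == last and ch == " ":
            out.append("\\ ")       # trailing space
        else:
            out.append(ch)
    return "".join(out)


def portal_bind_dn(login, base_dn="dc=bea,dc=com"):
    """Build a bind DN in the format seen in the TcpTrace capture.

    Assumption: the account lives under ou=users of the given base DN.
    """
    return "uid=" + escape_rdn_value(login) + ",ou=users," + base_dn
```

Calling portal_bind_dn("administrator") gives back exactly the string from the trace, which you can then paste into an LDAP client as the user name.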

Anyway, when you do your 6.5 upgrade, you should be able to use the same format to connect to this LDAP service and check it out for yourself.  How?  Stay tuned!

AquaLogic Interoperability Matrix

I mentioned the ALI Interoperability Matrix (login required) in the Publisher 6.5 post, and realized that I don't always go to the official site to check compatible versions of the AquaLogic stack; instead I just use a locally saved copy because it's quicker to pull up.

Obviously, the locally saved snapshot can get out of date over time, but it's nice to have in a pinch (I've been on client sites that don't allow you to connect to their network).

If you want an Excel copy of the matrix, you can download it here.  Keep in mind it's only current as of the end of August 2008, but for prior versions of the portal, the information should be pretty static. For the latest version, visit one.bea.com (while it lasts).