Solaris, PTSocketSelector, and sun.nio.ch.PollSelectorProvider...Oh My!


Howdy all.  After a long hiatus I'm back to blog your socks off with some technical minutiae that will save a few of you lots of headaches, and help the rest of you get a good night's sleep.

 

Long Story Short (i.e. "Just the useful info, please") 

There's a bug in the JDK that causes problems when Java code sets up NIO socket selectors (the plumbing behind certain server socket constructs) on some Solaris 10 boxes.  This bug will likely manifest itself in one of the following ways in your ALUI environment:

 

1)  You're running the ptlogging utility and see your logs filled up with something like:

 

PTSelectorThread_17679958      

com.plumtree.openkernel.impl.openhttp.core.network.PTSocketSelector     Unexpected exception.

java.io.IOException: Invalid argument

        at sun.nio.ch.DevPollArrayWrapper.poll0(Native Method)

        at sun.nio.ch.DevPollArrayWrapper.poll(Unknown Source)

        at sun.nio.ch.DevPollSelectorImpl.doSelect(Unknown Source)

        at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)

        at sun.nio.ch.SelectorImpl.select(Unknown Source)

        at sun.nio.ch.SelectorImpl.select(Unknown Source)

        at com.plumtree.openkernel.impl.openhttp.core.network.PTSocketSelector.run(PTSocketSelector.java:400)

        at java.lang.Thread.run(Unknown Source)

 

2) Portal starts up fine, but it can't connect to any remote servers, and the logs show the same stack trace as above:

 

java.io.IOException: Invalid argument

        at sun.nio.ch.DevPollArrayWrapper.poll0(Native Method)

        at sun.nio.ch.DevPollArrayWrapper.poll(Unknown Source)

        at sun.nio.ch.DevPollSelectorImpl.doSelect(Unknown Source)

        at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)

        at sun.nio.ch.SelectorImpl.select(Unknown Source)

        at sun.nio.ch.SelectorImpl.select(Unknown Source)

        at com.plumtree.openkernel.impl.openhttp.core.network.PTSocketSelector.run(PTSocketSelector.java:400)

        at java.lang.Thread.run(Unknown Source)

 

So what's a good god-fearing person like yourself to do?  Well, if you act now, we'll send you TWO workarounds for the price of one:

 

1) Have an SA up the hard File Descriptor (FD) limit on the server to 8193 or greater by editing /etc/system to have the line:

 

set rlim_fd_max=8193

 

Note: You'll need to bounce the box after this change.  You can then verify it worked by running:

 

ulimit -n -H

 

This should return a number >= 8193.  Sadly, this approach will probably have the SA asking you why you want to make the change, which means you'll have to read the technical details below...so on to option #2.

 

2) Tell the appropriate JVMs to use a different Socket Selector configuration.  You do this by passing the following option to the JVM:

 

-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider

 

Depending on which ALUI component you're updating, you may pass this option in different ways.  For instance, if you're dealing with one of the back-end servers (Collab, Studio, etc.), you'll want to update wrapper.conf to add an additional argument like:

 

wrapper.java.additional.7=-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider

 

Note: Replace "7" in the above line with the appropriate number for your wrapper file.

 

Or you may just need to update a shell script somewhere that's kicking off Tomcat/WebLogic/etc.  Note that these scripts all have their own shell variables for holding additional Java arguments, so just look through them and update as appropriate.  If you have problems, feel free to post questions here and we'll do our best to help out.
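
If you want to confirm from inside the JVM that either workaround actually took, something like the following rough sketch can help. It prints which selector provider the JVM picked and the file descriptor limit the JVM reports. The class name SelectorCheck is made up for illustration, and the FD lookup assumes a Sun/Oracle-style JVM that exposes com.sun.management.UnixOperatingSystemMXBean; on anything else it just reports -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper class, not part of ALUI or the JDK.
public class SelectorCheck {

    /** Class name of the SelectorProvider this JVM is actually using. */
    public static String providerName() {
        return SelectorProvider.provider().getClass().getName();
    }

    /**
     * File descriptor limit as reported by the JVM, or -1 if this JVM
     * does not expose the Unix-specific management bean (an assumption
     * that holds for Sun/Oracle-style JVMs on Unix platforms).
     */
    public static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("Selector provider:    " + providerName());
        System.out.println("Max file descriptors: " + maxFileDescriptors());
    }
}
```

Run it with the same JVM and the same -D options as the component you're fixing. If option #2 is in place, the provider line should name sun.nio.ch.PollSelectorProvider rather than a DevPoll one.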

 

 

Long Story Long (i.e. "I'm kind of a geek, and I'm sitting at work with nothing else to do, so give me the details")

So ALUI uses the NIO Java packages that were introduced in Java 1.4.  FWIW, I always thought NIO stood for Non-blocking IO, but a little Googling reminds me that it's actually New IO...silly me.  In any case, the NIO packages let you do some cool things with sockets to more efficiently manage high-volume connections.  The underlying problem you're running into is that the out-of-the-box Selector implementation in the JDK uses /dev/poll to allocate 8192 File Descriptors (FDs) for use by the selector, and 8192 exceeds the nofiles (Number of File Descriptors) limit on your server.  So, you can either bump up the server FD limit à la workaround #1 above, or tell Java to use a different selector implementation that doesn't allocate all those FDs à la option #2 above.  If you're interested in more detail, you can find the Sun bug on the issue here.
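
For the curious, here's a minimal sketch of the kind of selector loop step that trips over this. It is not the PTSocketSelector code (I'm not reproducing that here); the class name SelectorSmokeTest and the trySelect method are invented for illustration. It opens a selector, registers a non-blocking server channel, and does a single select pass. My assumption is that on an affected box the select call is where the "Invalid argument" IOException would surface, since that's where the stack trace above points.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical smoke test, not part of ALUI or the JDK.
public class SelectorSmokeTest {

    /** Returns null if one select pass succeeds, otherwise the exception text. */
    public static String trySelect() {
        Selector selector = null;
        ServerSocketChannel server = null;
        try {
            selector = Selector.open();           // uses the JVM's default provider
            server = ServerSocketChannel.open();  // never bound, so no port is opened
            server.configureBlocking(false);      // selectors require non-blocking channels
            server.register(selector, SelectionKey.OP_ACCEPT);
            selector.selectNow();                 // one non-blocking poll of the OS
            return null;
        } catch (IOException e) {
            return e.toString();
        } finally {
            // Explicit closes (no try-with-resources) so this compiles on old JDKs too.
            try { if (server != null) server.close(); } catch (IOException ignored) { }
            try { if (selector != null) selector.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) {
        String failure = trySelect();
        System.out.println(failure == null ? "select OK" : "select failed: " + failure);
    }
}
```

Running this once with the default settings and once with the -D option from workaround #2 is a cheap way to see whether a given box is affected before touching any ALUI configs.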

 

Until next time...thanks for reading! 
