Archive for March, 2015

The Dreaded Collaboration Search Index Rebuild

Wednesday, March 18th, 2015

Does the following screen shot fill you with dread?
collab-search-rebuild

If so, you’ve likely one of us who have had the dubious displeasure of having to work with the terribly bad “Collab Reindexing” functionality, where you ask Collab to rebuild the contents of the search index – usually after a migration or upgrade. The problem is that Collab can store hundreds of thousands of documents, and this process tends to fail, fill up disk space, or do various other terrible things like kidnapping your hamster and holding it for ransom. When any of those things happens, you have to just start over from scratch and pray harder next time.

I recently did a WebCenter 10gR4 migration/upgrade because Windows Server 2003 is approaching end of life in July, and we wanted to upgrade to Windows Server 2008 without jumping through any crazy hoops that Oracle wouldn’t support in 10gR3. The 10gR4 support matrix indicates Windows 2008 (x86 and x64) support for WCI 10.3.3, Collab 10.3.3, and Analytics 10.3.0.2 (don’t ask why the versions aren’t the same, or why Publisher is now technically unsupported for those upgrading to 10gR4). I won’t even rant here about how terrible the installation process was, resolving error after error and manually installing services and deploying web apps, but the Collab Search rebuild deserves a special place in hell.

First, let’s get the “good news” out of the way since there isn’t much of it: in Collaboration Server 10gR4, there’s now an option to re-index only the documents that haven’t been submitted to search yet because the process just exploded. Which is super. While I’d prefer the process not “explode” all the time, I’ll take what I can get: restarting the process at document 110,000 out of 230,000 is much better than restarting at document 1.

The bad news is that the process still fails all the time, despite claims of fixes and notes of “re-breakings” (Oracle bug – login required). The logging is terrible (hint: you can tweak some log settings by cracking open the Collab .war file and changing log parameters there, and you can turn on the following in PTSpy – in addition to “Collab” and “SearchService” – to get a more accurate picture of this bizarrely convoluted process: ALUIRemoting, Remoting, dummy, Configuration. When the search rebuild process fails, you can no longer access Collab Administration (because it seems the first thing it tries to do is connect to the Search Service, and that’s dead now). And the errors that show up aren’t even consistent over time. You’re likely to see a ton of different errors after a period of time, usually with the remoting process:
jms-connection-failed
This error seems to be quite common though:

javax.jms.JMSException: Exceeded max capacity of 50

What can you do about this? I changed the transport protocol for the Collab Search Service from TCP to HTTP and it cleared up the problem, although for the life of me I can’t tell you why. Read on for details of how to do this yourself… (more…)