Posts Tagged ‘search’

The Dreaded Collaboration Search Index Rebuild

Wednesday, March 18th, 2015

Does the following screen shot fill you with dread?
[Screenshot: collab-search-rebuild]

If so, you’re likely one of us who has had the dubious displeasure of working with the terribly bad “Collab Reindexing” functionality, where you ask Collab to rebuild the contents of the search index, usually after a migration or upgrade. The problem is that Collab can store hundreds of thousands of documents, and this process tends to fail, fill up disk space, or do various other terrible things like kidnapping your hamster and holding it for ransom. When any of those things happens, you have to just start over from scratch and pray harder next time.

I recently did a WebCenter 10gR4 migration/upgrade because Windows Server 2003 is approaching end of life in July, and we wanted to upgrade to Windows Server 2008 without jumping through any crazy hoops that Oracle wouldn’t support in 10gR3. The 10gR4 support matrix indicates Windows 2008 (x86 and x64) support for WCI 10.3.3, Collab 10.3.3, and Analytics 10.3.0.2 (don’t ask why the versions aren’t the same, or why Publisher is now technically unsupported for those upgrading to 10gR4). I won’t even rant here about how terrible the installation process was, resolving error after error and manually installing services and deploying web apps, but the Collab Search rebuild deserves a special place in hell.

First, let’s get the “good news” out of the way since there isn’t much of it: in Collaboration Server 10gR4, there’s now an option to re-index only the documents that haven’t been submitted to search yet because the process just exploded. Which is super. While I’d prefer the process not “explode” all the time, I’ll take what I can get: restarting the process at document 110,000 out of 230,000 is much better than restarting at document 1.

The bad news is that the process still fails all the time, despite claims of fixes and notes of “re-breakings” (Oracle bug – login required). The logging is terrible. (Hint: you can tweak some log settings by cracking open the Collab .war file and changing log parameters there, and you can turn on the following in PTSpy, in addition to “Collab” and “SearchService”, to get a more accurate picture of this bizarrely convoluted process: ALUIRemoting, Remoting, dummy, Configuration.) When the search rebuild process fails, you can no longer access Collab Administration, because it seems the first thing it tries to do is connect to the Search Service, and that’s dead now. And the errors that show up aren’t even consistent over time. You’re likely to see a ton of different errors, usually from the remoting process:
[Screenshot: jms-connection-failed]
This error seems to be quite common though:

javax.jms.JMSException: Exceeded max capacity of 50

What can you do about this? I changed the transport protocol for the Collab Search Service from TCP to HTTP and it cleared up the problem, although for the life of me I can’t tell you why. Read on for details of how to do this yourself… (more…)

Portal API Search Sample Code: IPTSearchRequest

Tuesday, June 14th, 2011

Today’s post is a quick code snippet from Integryst’s LockDown product (which relies heavily on search to identify objects for reporting on security configurations), and provides a pretty good sample of how to search for objects in the Plumtree / ALUI / WebCenter portal using the Search APIs. Hope you find it helpful!

See the docs for IPTSearchRequest and PT_SEARCH_SETTING for more options in developing your own search killer app.

// get the search query from the request parameters
string searchQuery = Request.Params["query"];

if (searchQuery == null)
	return; // do nothing if there's no query

ArrayList userGroupResults = new ArrayList();
int classid, objectid;
string classname, objectname;

// get the portal session from the HTTPSession
PortalSessionManager sessionManager = PortalSessionFactory.getPortalSessionManager(Session, Request, Response);
IPTSession ptSession = sessionManager.getAPISession();

//search for users and groups that match this query
IPTSearchRequest request = ptSession.GetSearchRequest();

//turn off requests for collab/content apps
request.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_APPS, PT_SEARCH_APPS.PT_SEARCH_APPS_PORTAL);

// Restrict the search to users and groups
request.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_OBJTYPES, new int[] {PT_CLASSIDS.PT_USER_ID, PT_CLASSIDS.PT_USERGROUP_ID});

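// Restrict the search to specific folders
// (objectFolder is not declared in this snippet; it is assumed to be set elsewhere in the page to the folder(s) to search)
request.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_DDFOLDERS, objectFolder);
request.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_INCLUDE_SUBFOLDERS, 1);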

// make sure the appropriate fields are returned.
request.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_RET_PROPS, new int[] {
	PT_PROPIDS.PT_PROPID_OBJECTID,
	PT_PROPIDS.PT_PROPID_CLASSID
});

// set search order
request.SetSettings(PT_SEARCH_SETTING.PT_SEARCHSETTING_ORDERBY, PT_PROPIDS.PT_PROPID_CLASSID);
IPTSearchQuery query = request.CreateBasicQuery(searchQuery + "*", "PT" + PT_INTRINSICS.PT_PROPERTY_OBJECTNAME);

IPTSearchResponse results = request.Search(query);

UserGroupObject tempRes;
int numUsersGroupsFound = 0;

// iterate the results
for (int x = 0; x < results.GetResultsReturned(); x++)
{
	objectname = results.GetFieldsAsString(x, PT_INTRINSICS.PT_PROPERTY_OBJECTNAME);
	objectid = results.GetFieldsAsInt(x, PT_INTRINSICS.PT_PROPERTY_OBJECTID);
	classid = results.GetFieldsAsInt(x, PT_INTRINSICS.PT_PROPERTY_CLASSID);
	classname = GenericObject.getClassNameFromID(classid);

	// search filter doesn't seem to work; make sure the classid is user or group
	if ((classid == PT_CLASSIDS.PT_USER_ID) || (classid == PT_CLASSIDS.PT_USERGROUP_ID))
	{
		// do stuff
	}
}

WCI Collaboration Search Server re-indexing

Thursday, March 10th, 2011

Oracle’s Upgrade Guide for WebCenter Interaction Collaboration Server includes the step “Rebuild the Oracle WebCenter Collaboration search collection”.

A while back, I ran into an issue where the rebuild process was spiking the CPU on the Collab Server at 100% forever (which, I suppose, is more of a plateau than a spike).  In viewing the PTSpy logs, I saw hundreds of thousands of messages that included this SQL statement:

select * from csIndexBulkDeletes where status = 1

Checking that table, I found over 110 MILLION rows. Which is particularly odd, given that this client only had 42,000 Collab Docs. Now, I have no idea how the table got that enormous, but it’s clear that Collab’s Search Rebuild process uses that table to determine which documents to update, much like the Search Update job uses the PTCARDSTATUS table – which, incidentally, can also get messed up.

It was clear that if the search rebuild process goes haywire, Collab starts queuing up search server updates in this table, and if the table gets too big, cascading failures start to occur where the queue grows faster than it can get purged.

The solution is: before starting the Collab Search Re-index process, clear this entire table, which is rebuilt during the re-index process anyway. To do so, just run:

truncate table csIndexBulkDeletes

I should note that this isn’t all that common, as I’ve only seen it once, but at least now you know one possible solution if your rebuild process can’t seem to gain escape velocity.

There’s a WCI App For That 5: SearchFixer

Monday, October 18th, 2010

We’ve discussed a tiny bit about Knowledge Directory cards and how the WCI Search Update plays into the crawler ecosystem, and seen that it’s possible to directly query the WebCenter Search Service, so how ’bout a quick real-world application example, expanding both of those concepts?

Here’s the scenario: I had a client that was showing discrepancies between “Browse” and “Edit” modes in the ALUI Knowledge Directory, and in Snapshot Queries. I suppose I owe you all a more detailed explanation of these topics, which I’ll put up in a couple of days, but for the purposes of this article, suffice it to say that the WCI search index and the database were mismatched: the cards in one didn’t match the cards in the other. Worse, the regular method of repairing this discrepancy (using the Search Update job after scheduling a Search Repair) wasn’t working.

So, to fix this issue, I developed another quick and dirty application that enumerated all folders in the Knowledge Directory, doing a search for cards within the folder, then querying the database.  The application would then compare the results, and if they were different, would allow the admin to “fix” the problem by deleting all cards from the Search Index for that folder.  When the Search Repair job next ran, it would re-create these entities without all the extraneous records in there.
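The comparison step described above can be sketched roughly as follows. This is not the SearchFixer code, just a minimal illustration that assumes the card IDs for each folder have already been fetched from the search index and from the database into two dictionaries. The names `FolderMismatch`, `SearchIndexComparer`, and `FindMismatchedFolders` are hypothetical, and how the card IDs are actually retrieved is left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type: one entry per folder where index and database disagree.
public sealed class FolderMismatch
{
    public int FolderId;
    public List<int> OnlyInIndex = new List<int>();    // cards the index has but the database does not
    public List<int> OnlyInDatabase = new List<int>(); // cards the database has but the index does not
}

public static class SearchIndexComparer
{
    // Both arguments map a folder ID to the set of card IDs found in that folder.
    // Assumption: the caller has already populated them from the two sources.
    public static List<FolderMismatch> FindMismatchedFolders(
        Dictionary<int, HashSet<int>> indexCardsByFolder,
        Dictionary<int, HashSet<int>> dbCardsByFolder)
    {
        var mismatches = new List<FolderMismatch>();
        var empty = new HashSet<int>();

        // Visit every folder known to either source, in ascending folder ID order.
        foreach (int folderId in indexCardsByFolder.Keys.Union(dbCardsByFolder.Keys).OrderBy(id => id))
        {
            HashSet<int> inIndex, inDb;
            if (!indexCardsByFolder.TryGetValue(folderId, out inIndex)) inIndex = empty;
            if (!dbCardsByFolder.TryGetValue(folderId, out inDb)) inDb = empty;

            List<int> onlyInIndex = inIndex.Except(inDb).OrderBy(id => id).ToList();
            List<int> onlyInDb = inDb.Except(inIndex).OrderBy(id => id).ToList();

            // Only report folders where the two sources actually differ.
            if (onlyInIndex.Count > 0 || onlyInDb.Count > 0)
            {
                mismatches.Add(new FolderMismatch
                {
                    FolderId = folderId,
                    OnlyInIndex = onlyInIndex,
                    OnlyInDatabase = onlyInDb
                });
            }
        }
        return mismatches;
    }
}
```

Each folder in the returned list is a candidate for the “fix” described above: delete that folder’s cards from the search index and let the repair job re-create them from the database.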

Like this post, I’m not particularly proud of the code as a well-architected solution, but it works and I’d be happy to help you out if you want to get in touch.  Some of the relevant code is after the break. (more…)

Oracle Support Master Notes and Webinars

Saturday, October 2nd, 2010

I’ve been critical of Oracle Support in the past, but recently had a great experience with some of the old Plumtree support buddies that are still around. Specifically, Merrick Huang in Oracle Support was able to provide a tremendous amount of assistance on a very thorny search issue I was having at a client site, which I’ll be writing about here in upcoming posts. Before we get into the nitty gritty of that problem, I want to share with you a great resource I didn’t know existed until now: Oracle Support Master Notes and Webinars (login required).

The purpose of “Master Notes” is to “provide the most important links that users will need to install and support the product”, and there are some pretty decent pages in there if you know where to look.  For example, the IDK Master Note is a collection of a bunch of documentation, KB articles, known issues, and bug fixes all in one place.

But what I really wanted to highlight here are the Webinars provided by Oracle Support, with one in particular being the best Oracle Webinar I’ve seen: the Search Webinar, by Eno Gjerasi. Eno shows that there’s still life left from the Plumtree support group, and demonstrates a level of knowledge of the Search Server that rivals most engineers or consultants. There was one tip in particular that I’ll focus on in upcoming posts (about how to communicate directly with Search), but I encourage you to check out all three Webinars (Search, Portal / SSO, and Analytics) and the other Master Notes. You may just find a gem in there and wonder how you made it all these years without knowing “that one thing” you never knew you needed.

Keep up the good work, Oracle support!