Archive for the ‘Search Server’ Category

The Dreaded Collaboration Search Index Rebuild

Wednesday, March 18th, 2015

Does the following screen shot fill you with dread?

If so, you’re likely one of us who have had the dubious displeasure of working with the terribly unreliable “Collab Reindexing” functionality, where you ask Collab to rebuild the contents of the search index – usually after a migration or upgrade. The problem is that Collab can store hundreds of thousands of documents, and this process tends to fail, fill up disk space, or do various other terrible things like kidnapping your hamster and holding it for ransom. When any of those things happens, you have to start over from scratch and pray harder next time.

I recently did a WebCenter 10gR4 migration/upgrade because Windows Server 2003 is approaching end of life in July, and we wanted to upgrade to Windows Server 2008 without jumping through any crazy hoops that Oracle wouldn’t support in 10gR3. The 10gR4 support matrix indicates Windows 2008 (x86 and x64) support for WCI 10.3.3, Collab 10.3.3, and Analytics (don’t ask why the versions aren’t the same, or why Publisher is now technically unsupported for those upgrading to 10gR4). I won’t even rant here about how terrible the installation process was, resolving error after error and manually installing services and deploying web apps, but the Collab Search rebuild deserves a special place in hell.

First, let’s get the “good news” out of the way since there isn’t much of it: in Collaboration Server 10gR4, there’s now an option to re-index only the documents that haven’t been submitted to search yet because the process just exploded. Which is super. While I’d prefer the process not “explode” all the time, I’ll take what I can get: restarting the process at document 110,000 out of 230,000 is much better than restarting at document 1.

The bad news is that the process still fails all the time, despite claims of fixes and notes of “re-breakings” (Oracle bug – login required). The logging is terrible (hint: you can tweak some log settings by cracking open the Collab .war file and changing log parameters there, and you can turn on the following loggers in PTSpy – in addition to “Collab” and “SearchService” – to get a more accurate picture of this bizarrely convoluted process: ALUIRemoting, Remoting, dummy, Configuration). When the search rebuild process fails, you can no longer access Collab Administration (because the first thing it tries to do is connect to the Search Service, and that’s dead now). And the errors that show up aren’t even consistent over time: you’re likely to see a bunch of different errors after a while, usually from the remoting process.
This error seems to be quite common though:

javax.jms.JMSException: Exceeded max capacity of 50

What can you do about this? I changed the transport protocol for the Collab Search Service from TCP to HTTP and it cleared up the problem, although for the life of me I can’t tell you why. Read on for details of how to do this yourself… (more…)

Rebuilding WebCenter Collaboration Index By Project

Thursday, April 4th, 2013

Another tip of the hat to Brian Hak for this pretty awesome Hak (see what I did there?).

Last year, Brian was faced with a problem: Some documents in WebCenter Interaction Collaboration Server weren’t properly indexed, and his only option was to rebuild the ENTIRE index (a pain we’re all pretty familiar with). With many thousands of documents, the job would never finish, and end users would be frustrated with incomplete results while the job toiled away after wiping everything out.

So he took it upon himself to write CollabSearchRebuilder. CollabSearchRebuilder allows you to rebuild specific subsets of documents without having to wipe out the whole index and start all over again.

Feel free to download the source and build files and check it out!

Validate WCI Search Server performance

Wednesday, June 27th, 2012

WCI Search is a fickle beast. In my opinion, it started as one of the most stable pieces of the Plumtree stack and has been losing steam since the AquaLogic days with Clustered Search. Nowadays, it seems even a slight breeze will bring the damn thing down, and it either needs to be restarted or rebuilt.

But, if you’re getting reports that “search is slow”, how do you quantify that? Sure, something like Splunk would be useful for event correlation here, but we can build a pretty clear picture with what we’ve already got. And, I’m not talking about WebCenter Analytics – although that is possible too by querying the ASFACT_SEARCH tables.

Instead, let’s try to get the information from the log files. First, if you look at an entry for an actual search query in the search log files at \ptsearchserver\10.3.0\\logs\logfile-trace-*, you’ll see lines like this:

See it? There’s a line there that looks like this:

<request client="" duration="0.015" type="query">

So although there is no timestamp on that line, if we can isolate that line over a period of time and parse out the “duration” value, we can get a sense of whether the search server is slowing down or if we need to look elsewhere.

I did this using my old friend UnixUtils, and ran a simple command to spin through all the search log files and just find those lines with the phrase type=”query” in it:

cat * | grep "type=\"query\"" > ..\queryresults.txt

The “duration” value can then be parsed out of those lines, and we can produce a nice purdy chart for the client that says “indeed, it seems searches that used to be taking 100ms are now running around 1s, so we should dig into this”:
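The parsing step above can be sketched in a few lines of Python. This is a minimal example of the idea (the regex and the summary statistics are my own; it assumes the grepped lines were saved to queryresults.txt as shown above):

```python
import re
import statistics

# matches the duration="0.015" attribute on each <request ... type="query"> line
DURATION_RE = re.compile(r'duration="([\d.]+)"')

def parse_durations(lines):
    """Extract the duration value (in seconds) from each query log line."""
    durations = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations

def summarize(durations):
    """Return (min, mean, max) so you can tell 100ms searches from 1s ones."""
    return min(durations), statistics.mean(durations), max(durations)

# usage against real logs: pass open("queryresults.txt") to parse_durations
sample = [
    '<request client="" duration="0.015" type="query">',
    '<request client="" duration="1.204" type="query">',
]
print(parse_durations(sample))  # [0.015, 1.204]
```

Bucketing the durations by log file (one file per day, roughly) is what gives you the trend line for that chart.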

WebCenter Search Thesaurus

Tuesday, May 22nd, 2012

In 2001 Plumtree acquired RipFire, Inc., which provided the foundation for its search technology in the intervening decade.

Somehow, I’ve always known this search server (that’s grown up through the ALUI and WCI phases) allowed a thesaurus to be created, but I’d never created one at a client site until recently. Turns out, it’s not so hard.

The Search Thesaurus means that if an end user searches for term X, the search server internally retrieves documents for term Y as well. Creating one is pretty straight-forward: you just need to create a text file using a comma-separated list of values, where the first value is the search term and the second value is the internal entry, with one entry per line.

For example, let’s say you’re not happy with the results when users enter “business cards” in the search box (because “business” and “cards” are pretty generic terms). You could create a line in your thesaurus text file like this:

business cards,business card order login

… and import the text file into the WCI Search Server with these steps:

  1. Stop Search Service
  2. Run this command: C:\plumtree\ptsearchserver\6.5\bin\native>customize.exe -r C:\thesaurus.txt C:\plumtree\ptsearchserver\6.5\prod_search_node_01
  3. Start Search Service
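Since a malformed line can spoil the import, it can be worth sanity-checking the file before running customize.exe. Here's a minimal sketch of such a check (the validate_thesaurus function and its rules are my own, not part of the product – it just enforces the two-values-per-line format described above):

```python
def validate_thesaurus(lines):
    """Return (line_number, content) tuples for malformed thesaurus lines.

    Each line should be 'search term,internal entry' -- exactly one comma
    separating two non-empty values.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped, not flagged
        parts = line.split(",")
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            problems.append((number, line))
    return problems

sample = [
    "business cards,business card order login",
    "badline-with-no-comma",
]
print(validate_thesaurus(sample))  # [(2, 'badline-with-no-comma')]
```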

For more details on setting up a Search Thesaurus, see the BEA ALUI Admin Guide.

WCI Search Repair Scheduling Nuances

Monday, October 10th, 2011

WebCenter Interaction Search Server can be pretty finicky. It used to be rock-solid in the Plumtree days, but stability has declined a bit since Clustering was added in the BEA AquaLogic days. I don’t even recommend clustering any more if an index is small and manageable enough – I’ve just spent way too much time wiping out and rebuilding indexes due to problems with cluster sync.

Over time, the database and the search index get out of sync, which is completely normal. In some cases, a custom utility like Integryst’s SearchFixer is needed to repair the index without purging it, but in most cases, a simple Search Repair will do. To schedule a Search Repair, go to Select Utility: Search Server Manager, change the repair date to some time in the past, and click Finish:

The Search Repair process runs when you run the Search Update Job. Just make sure in the log you see the line “Search Update Agent is repairing the directories…”:

Why would you NOT see this? For years I’ve been doing this “schedule search repair to run in the past, kick off Search Update job” procedure, and it always seemed like I had to run the Search Update job twice to get the search repair to actually work. It turns out, though, that you don’t have to run the job twice – you just need to wait a minute after you schedule the search repair to run in the past.

Why? Because when you set the repair to run in the past and click Finish, the date doesn’t STAY in the past – instead, the date gets set to “Now plus 1 minute”. So if you kick off the Search Update job within a minute of changing the search repair date, the repair is still scheduled to run a couple of seconds in the future.

Note that this little date-related nuisance uses the same logic that Search Checkpoint scheduling uses – dates in the past are actually set to “now plus one”. This is significant because it changes the way you should plan schedules. For example, if you want an Automation Server job to run now, and then continue to run every Saturday at midnight, you’d just go into the job and schedule it for LAST Saturday at midnight, repeating every 1 week. Automation Server will see that the job is past-due, run it now, and schedule the next run time for next Saturday at midnight. With Search Repair or Checkpoint manager, if you do this, the date will actually be set to “now plus 1”, and the job will run now. But the next job will be scheduled for 7 days from now, not 7 days from the original “Saturday at midnight” schedule.
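The behavior described above can be sketched in code. This is my reconstruction of the observed scheduling logic, not Oracle's actual implementation:

```python
from datetime import datetime, timedelta

def effective_run_time(requested, now):
    """Search Repair / Checkpoint behavior: a date in the past
    silently becomes 'now plus 1 minute'."""
    if requested <= now:
        return now + timedelta(minutes=1)
    return requested

def next_run_time(effective, interval):
    """The next run is scheduled relative to the effective time, not the
    originally requested one -- so a weekly job set to 'last Saturday
    at midnight' drifts away from Saturday midnight."""
    return effective + interval

now = datetime(2011, 10, 10, 14, 0)            # a Monday afternoon
last_saturday = datetime(2011, 10, 8, 0, 0)    # requested: in the past
run = effective_run_time(last_saturday, now)
print(run)                                      # 2011-10-10 14:01:00
print(next_run_time(run, timedelta(days=7)))    # 2011-10-17 14:01:00, not Saturday midnight
```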

Bottom line: for heavy jobs like search repair or checkpoint management, you should schedule them to run in the future at a specific (weekend or off-hours) date rather than some time in the past.

Portal API Search Sample Code: IPTSearchRequest

Tuesday, June 14th, 2011

Today’s post is a quick code snippet from Integryst’s LockDown product (which relies heavily on search to identify objects for reporting on security configurations), and provides a pretty good sample of how to search for objects in the Plumtree / ALUI / WebCenter portal using the Search APIs. Hope you find it helpful!

See the docs for IPTSearchRequest and PT_SEARCH_SETTING for more options in developing your own search killer app.

// get the search query from the request parameters
string searchQuery = Request.Params["query"];

if (searchQuery == null)
return; // do nothing if there's no query

ArrayList userGroupResults = new ArrayList();
int classid, objectid;
string classname, objectname;

// get the portal session from the HTTPSession
PortalSessionManager sessionManager = PortalSessionFactory.getPortalSessionManager(Session, Request, Response);
IPTSession ptSession = sessionManager.getAPISession();

//search for users and groups that match this query
IPTSearchRequest request = ptSession.GetSearchRequest();

//turn off requests for collab/content apps

// Restrict the search to users and groups

// Restrict the search to specific folders

// make sure the appropriate fields are returned.

// set search order
IPTSearchQuery query = request.CreateBasicQuery(searchQuery + "*", "PT" + PT_INTRINSICS.PT_PROPERTY_OBJECTNAME);

IPTSearchResponse results = request.Search(query);

UserGroupObject tempRes;
int numUsersGroupsFound = 0;

// iterate the results
for (int x = 0; x < results.GetResultsReturned(); x++)
{
	objectname = results.GetFieldsAsString(x, PT_INTRINSICS.PT_PROPERTY_OBJECTNAME);
	objectid = results.GetFieldsAsInt(x, PT_INTRINSICS.PT_PROPERTY_OBJECTID);
	classid = results.GetFieldsAsInt(x, PT_INTRINSICS.PT_PROPERTY_CLASSID);
	classname = GenericObject.getClassNameFromID(classid);

	// search filter doesn't seem to work; make sure the classid is user or group
	if ((classid == PT_CLASSIDS.PT_USER_ID) || (classid == PT_CLASSIDS.PT_USERGROUP_ID))
	{
		// do stuff
	}
}

WCI Collaboration Search Server re-indexing

Thursday, March 10th, 2011

Oracle’s Upgrade Guide for WebCenter Interaction Collaboration Server includes a step to “Rebuild the Oracle WebCenter Collaboration search collection”.

A while back, I ran into an issue where the rebuild process was spiking the CPU on the Collab Server at 100% forever (which, I suppose, is more of a plateau than a spike).  In viewing the PTSpy logs, I saw hundreds of thousands of messages that included this SQL statement:

select * from csIndexBulkDeletes where status = 1

Checking that table, I found over 110 MILLION rows. Which is particularly odd, given that this client only had 42,000 Collab Docs. Now, I have no idea how the table got that enormous, but it’s clear that Collab’s Search Rebuild process uses that table to determine which documents to update, much like the Search Update job uses the PTCARDSTATUS table – which, incidentally, can also get messed up.

Apparently, when the search rebuild process goes haywire, Collab starts queuing up search server updates in this table, and if the table gets too big, cascading failures start to occur: the queue grows faster than it can be purged.

The solution: before starting the Collab Search Re-index process, clear this entire table – it gets rebuilt during the re-index process anyway. To do so, just run:

truncate table csIndexBulkDeletes

I should note that this isn’t all that common, as I’ve only seen it once, but at least now you know one possible solution if your rebuild process can’t seem to gain escape velocity.

Update: WCI Search Configuration through Configuration Manager

Sunday, November 7th, 2010

A couple months ago, I wrote a post about changing your WebCenter Interaction Search Service settings through the deprecated nodes.ini or cluster.ini files.  Since then, I’ve found you can achieve the same effect through the portal’s Configuration Manager:

Now you know.

There’s a WCI App For That 5: SearchFixer

Monday, October 18th, 2010

We’ve talked a bit about Knowledge Directory cards and how the WCI Search Update job plays into the crawler ecosystem, and seen that it’s possible to query the WebCenter Search Service directly, so how ’bout a quick real-world application example that expands on both of those concepts?

Here’s the scenario:  I had a client that was showing discrepancies between “Browse” and “Edit” modes in the ALUI Knowledge Directory, and in Snapshot Queries.  I suppose I owe you all a more detailed explanation of these topics – which I’ll put up in a couple of days – but for the purposes of this article, suffice it to say that the WCI search index didn’t match the database.  Worse, the regular method of repairing this discrepancy (running the Search Update job after scheduling a Search Repair) wasn’t working.

So, to fix this issue, I developed another quick-and-dirty application that enumerated all folders in the Knowledge Directory, searched for cards within each folder, and then queried the database for the same folder.  The application would then compare the results, and if they differed, would allow the admin to “fix” the problem by deleting all cards from the search index for that folder.  When the Search Repair job next ran, it would re-create those entries without all the extraneous records.
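The comparison at the heart of that utility is simple set logic. Here's a minimal sketch of the idea (the card-fetching is stubbed out with plain lists; the real version pulls IDs from the search API and from the database respectively):

```python
def folder_discrepancy(search_card_ids, db_card_ids):
    """Compare the card IDs the search index returns for a folder
    against what the database says should be there."""
    search_ids = set(search_card_ids)
    db_ids = set(db_card_ids)
    return {
        "extraneous": sorted(search_ids - db_ids),  # in index, not in DB
        "missing": sorted(db_ids - search_ids),     # in DB, not in index
    }

# the index has a stale card 17 and is missing card 42
print(folder_discrepancy([10, 11, 17], [10, 11, 42]))
# {'extraneous': [17], 'missing': [42]}
```

Any folder with a non-empty "extraneous" list is a candidate for the delete-and-repair treatment described above.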

As with this post, I’m not particularly proud of the code as a well-architected solution, but it works, and I’d be happy to help you out if you want to get in touch.  Some of the relevant code is after the break. (more…)

Communicating Directly with WebCenter Interaction Search Server

Wednesday, October 6th, 2010

Years ago I wrote about checking the Plumtree (ALUI?) Search Server Status The Hard Way.  And I just let it go.  A couple days ago, I told you about a great webinar on Oracle’s support site, and it took that great presentation for me to put two and two together: communicating directly with the search server is possible for more than just “checking its status the hard way”:  once you know how to connect to the port and issue commands via Telnet, you can do ALL KINDS of stuff.  Anything, in fact, that the portal can do – and more.

It turns out that WebCenter Interaction – the portal, IDK, custom code, you name it – is just building complex text queries under the covers based on the actions a user performs (such as typing a search term in the search box).  And you can see these queries when you run PTSpy (Turn on INFO for the SEARCH component):

Taking this search string and adding the (secret?) key, you can compose an identical query:

KEY redacted (((NAMESPACE english BESTBET "matt chiste") TAG bestbet OR ((ptsearch:"matt chiste") TAG phraseQ OR ((ptsearch:matt or ptsearch:Matt or ptsearch:Matt_) order near 25 ptsearch:chiste) TAG nearQ OR ((SPELLCORRECT (ptsearch:matt) or ptsearch:Matt or ptsearch:Matt_) and SPELLCORRECT (ptsearch:chiste)) TAG andQ)) AND (((ancestors:"dd1")[0]) OR (((subtype:"PTUSER")[0]) OR ((subtype:"PTCOMMUNITY")[0]) OR ((subtype:"PTPAGE")[0])) OR (((subtype:"PTGADGET")[0]) AND ((gadgetsearchtype:"bannersearch")[0])) OR ((@type:"PTCOLLAB")[0]) OR ((@type:"PTCONTENT")[0]))) AND (((((@type:"PTPORTAL")[0]) OR ((@type:"PTCONTENTTEMPLATE")[0]) OR ((@type:"PTCONTENT")[0])) AND (((ptacl:"u200") OR (ptacl:"212") OR (ptacl:"211") OR (ptacl:"207") OR (ptacl:"202") OR (ptacl:"201") OR (ptacl:"51") OR (ptacl:"1"))[0]) AND (((ptfacl:"u200") OR (ptfacl:"212") OR (ptfacl:"211") OR (ptfacl:"207") OR (ptfacl:"202") OR (ptfacl:"201") OR (ptfacl:"51") OR (ptfacl:"1"))[0])) OR (((@type:"PTCOLLAB")[0]) AND ((istemplate:"0")[0]) AND ((collab_acl:"- \~ 1")[0]))) METRIC logtf [1]

Then, you telnet to the search server port and paste in the text.  Search will respond with an XML formatted reply (in this case, no results):
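You can script the same exchange instead of telnetting by hand. Here's a rough sketch using a raw socket – the framing (one newline-terminated query per connection, reading until the server closes or goes quiet) is an assumption based on the telnet behavior above, and the query string is whatever you captured from PTSpy, key included:

```python
import socket

def send_search_query(host, port, query, timeout=10.0):
    """Send a raw query line to the search server port and return the reply
    (an XML document) as a string."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8") + b"\n")
        chunks = []
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection; we have the full reply
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open; take what we have
    return b"".join(chunks).decode("utf-8", errors="replace")

# usage sketch -- host, port, and query are all placeholders:
# xml_reply = send_search_query("searchhost", 15250, 'KEY redacted (((NAMESPACE ...')
```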

Of course, once you have this revelation, you can see how the search text can be tuned based on the folder you’re looking for, the ACLs you want to check, or any other number of parameters.

The other startling revelation is that security is applied at the portal tier, and not the search server tier. That is, if I’m a bad guy and I know that key, and I know the query format, I can construct a query that goes against the search server to circumvent any security that has been applied to cards. Notice there are no credentials or login token passed in the query for the search service to check. Now, before you get all up in arms about this being a major security vulnerability, I offer some counter-points:

  1. Anyone with any knowledge of network sniffers or tunneling tools could easily find this key, as the traffic is not encrypted – “security through obscurity” is not valid security.  However, I don’t consider this a fundamental design flaw or major security hole; obscurity is surely not the security the Plumtree engineers had in mind when they implemented this. Instead, the search server should reside in a DMZ, and the port shouldn’t be open within the general network anyway. The port is NEVER to be opened to the Internet (try it on my site – “telnet 15250” doesn’t work).
  2. Even if someone did have this secret key, and they had network access to the search server port, and they knew the search format, and they knew how to craft the request to omit ACLs, the most they could get was search results they didn’t have privileges for – not the documents themselves.

How can this be applied in a real-world scenario?  Stay tuned!