Archive for the ‘Best Practice’ Category

Increase WCI Collaboration Server Memory Settings

Friday, July 9th, 2010

The Plumtree server stack has had a long history, growing into a decent patchwork of usable applications but never quite reaching the point where every part of the stack is configured consistently.  When it became ALUI (AquaLogic User Interaction), there was a movement toward putting configuration settings in one place – the Configuration Manager – but now that Oracle is holding the reins and the future of the stack is in question, it looks like we’ll never get that utopian vision of a single, centralized way of configuring every application.

Case in point: configuring the memory parameters for Collaboration Server.  While Publisher uses a config file for its memory settings, Collaboration Server passes memory parameters via the service startup path.  So, if you’ve got a decently large Collaboration install, you might find that you’re running low on memory.

To up the amount of RAM available to Collaboration Server, you need to edit the registry (and yeah, back it up first!).  The value you need to change lives under HKLM\SYSTEM\CurrentControlSet\Services\ptcollaborationserver, and it’s called “ImagePath”.

Change the “-Xmx” value to something larger, restart Collab, and you’ll have a lot more breathing room.
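For illustration only – the switches below are made up, and your actual ImagePath will contain the Collab executable path and its own set of parameters, so change nothing except the -Xmx portion of whatever value you actually find – the edit looks something like this:

REM inspect the current startup path from a command prompt
reg query "HKLM\SYSTEM\CurrentControlSet\Services\ptcollaborationserver" /v ImagePath

Before:  ... -Xms256m -Xmx256m ...
After:   ... -Xms256m -Xmx1024m ...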

Use Host Files for better WCI security, portability and disaster recovery

Tuesday, June 8th, 2010

When configuring a WebCenter Interaction portal, it’s highly recommended to use host files on your machines to provide aliases for the various services.

For example, instead of referencing Publisher’s Remote Server as http://PORTALPROD6.site.org:7087/ptcs/, edit the hosts file at C:\Windows\System32\drivers\etc\hosts on each portal machine and add a line like this (IP address first, then the alias):
10.5.38.12    wci-publisher    # IP address for Publisher in this environment
… then set your Remote Server to http://wci-publisher:7087/ptcs/.
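A quick sanity check (assuming you’re on the portal server itself) is to confirm the alias now resolves to the address you just added:

ping wci-publisher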

I’m always surprised by how often the knee-jerk reaction to this suggestion is that it’s a cheap “hack”, or something worse, like this:
“Host files??? Host files on local servers need to be avoid and you should use DNS in AD for the Portal servers. Host files, again, are an antiquated and unmanageable configuration in this day and age and, in my opinion, should only be used when testing configurations—not for Production systems. I haven’t seen host files used locally on servers in a decade…is that how you are configuring this portal system? If so, I would highly recommend you try to use the AD DNS instead.”

Yes, that’s an actual response from an IT guy who prefers telling others what idiots they are rather than actually listening to WHY this approach is being used.  In all fairness, most knee-jerk reactions are rooted in the fact that host files are harder to maintain across many servers than DNS entries on a single server.  But hopefully, if you’re reading this blog, you’ve got an open mind and will agree with this approach once you see the list of benefits below.

Benefits of using host files in your portal environments:

  1. Security.  When you access a service through the portal’s gateway, the name of the remote server shows up in the URL: http://www.integryst.com/site/integryst.i/gateway/PTARGS_0_200_316_204_208_43/http%3B/wci-pubedit%3B8463/publishereditor/action?action=getHeader.  For most people, this isn’t a huge problem, but allowing the name of the servers to be published in this way can be perceived as a security risk.  By using host files, you’re essentially creating an alias that hides the actual name of the server.
  2. Service Mobility.  Take the NT Crawler Web Service, for example.  When you crawl documents into the portal, the name of the server is included in each document’s open URL.  Now suppose the NTCWS is giving you all sorts of grief and you decide to move it to another server.  If you use host files, you can just install the NTCWS somewhere else and change the IP address that the wci-ntcws alias points to.  This way, the portal has no idea the service is being provided by a different physical system.  If you had used a machine name, all documents would get crawled in as new the next time you ran the crawler, because the card locations would have changed.
  3. Maintainability.  This one’s a pretty weak argument, but is based on the fact that most of the time, the Portal Admin team doesn’t have access to create DNS entries and has to submit service requests to get that done.  By bringing “DNS-type services” into host files, the portal team can more easily maintain the environment by shifting around services without having to submit “all that paperwork” for a DNS entry (your mileage may vary with this argument).
  4. Environment Migration.  Here’s the clincher!  Most of us have a production and a development environment, and occasionally a test environment as well.  Normally, code is developed in dev and pushed to test, then to prod, but content is created in prod and periodically migrated back to test and dev so those environments stay reasonably in sync for testing.  This content migration is typically done by back-filling the entire production database (and migrating files in the document repository, etc.).  The problem is, all kinds of URLs (Remote Servers, Search, Automation Server names, etc.) are stored in this database, so if you use machine names in these URLs, your dev/test environments will now have Remote Servers that point at the production machines, and you have to go through and update every one of those URLs to get your dev environment working again!  If, however, you use host files, you can skip this painful step: your Publisher server URL (http://wci-publisher:7087/ptcs/) can be the same in both environments, while the host files in dev point to different machines than the ones in production (see the sketch just after this list).  Cool, huh?
  5. Disaster Recovery.  This is essentially the same as the “Environment Migration” benefit: when you have a replicated off-site Disaster Recovery environment, by definition your two databases are kept in sync in real time (or perhaps on some sort of daily schedule).  If a disaster occurs and you lose your primary environment, you’re going to want that DR site up as soon as possible, not spending hours changing all those URLs to get the new environment running with new machine names.  Admittedly, this argument is a bit weaker than “Environment Migration” (where your dev, test, and prod environments typically share the same DNS server): since the DR site will likely have its own DNS server, you could conceivably just use different DNS entries at the two sites and everything would work fine.
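To make benefit #4 concrete, here’s a rough sketch of the hosts files in the two environments (the dev IP address below is made up – substitute the addresses and aliases from your own environments).  The portal database stores the same http://wci-publisher:7087/ptcs/ URL in both places; only the host file entry differs:

# hosts file on the production portal servers
10.5.38.12    wci-publisher    # production Publisher box

# hosts file on the dev portal servers
10.6.12.34    wci-publisher    # dev Publisher box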

So that’s it – hopefully you’re convinced that host files are the way to go for configuring ALUI / WCI portals; if so, stay tuned for helpful tips on how to set this up for various servers.  While Remote Servers are a no-brainer, configuring things like Automation Server and Search can be a little trickier.

Analysis Paralysis

Wednesday, June 2nd, 2010

Don’t

Configure node.ini or cluster.ini for Search Services

Sunday, April 25th, 2010

Years ago, Ross Brodbeck wrote some excellent articles on AquaLogic search clustering.  The information there is still applicable in WebCenter Interaction Search, and I won’t re-hash it here.  I definitely encourage you to give those articles a read for everything you ever needed to know about search – well, except this one more thing that I personally always seem to overlook.

In the old days, there was a node.ini file that allowed you to configure Plumtree (and AquaLogic) search memory parameters.  When clustering arrived, the node.ini file disappeared from the installer, but the sample files stayed in PT_HOME\ptsearchserver\10.3.0\<servername>\config\.  The same settings are also supported in PT_HOME\ptsearchserver\10.3.0\cluster\cluster.ini, where they apply to all nodes in the cluster, but there are no sample files in that folder.

It turns out that you can still use a node.ini file: copy one of those sample files (such as win32-4GB.ini) to node.ini and restart your search node.  Whether you use node.ini or cluster.ini, you absolutely should do this.  Out of the box, Search Server uses only 75MB of RAM, which is wildly inadequate for a production environment.  You can see how overloaded Search is by going to Administration, then “Select Utility: Search Cluster Manager” – notice the document load is almost at 500%, and the Cluster Manager provides helpful recommendations.
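Creating node.ini from one of the samples is just a file copy (a sketch assuming the default C:\bea\alui install root that appears in the settings below – substitute your own install path and search node name):

copy C:\bea\alui\ptsearchserver\10.3.0\<servername>\config\win32-4GB.ini C:\bea\alui\ptsearchserver\10.3.0\<servername>\config\node.ini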

A little tweak to node.ini can go a long way, with just these settings:

# Recommended Search Server configuration for a
# Windows machine with 4Gb or more RAM and taking
# advantage of the /3GB switch in the boot.ini file
# of Windows 2000 Advanced Server.  The settings
# below allow considerable headroom (~1Gb) to support
# multiple concurrent crawls taxonomizing documents
# into the Knowledge Directory.  Installations with
# simpler crawler configurations may be able to
# increase the amount of memory allocated to the
# index and docset caches below, in which case a
# 3:1 ratio of index:docset should be maintained.

[Environment]
RFINDEX=index
RFPORT=15250
RF_NODE_NAME=servername
RF_CLUSTER_HOME=C:\bea\alui\ptsearchserver\10.3.0\cluster
#JAVA_HOME=@JAVA_HOME@

RF_DOCUMENT_TOKEN_CACHE_SIZE=1000000
RF_SPELL_TOKEN_CACHE_SIZE=50000
RF_MAPPING_TOKEN_CACHE_SIZE=5000

# Index cache 750Mb
RF_INDEX_CACHE_BYTES=786432000
# Docset cache 250Mb
RF_DOCSET_CACHE_BYTES=262144000

Clear those audit logs before it’s too late!

Saturday, March 13th, 2010

WebCenter Interaction Audit Logs are decent for tracking what’s going on in your portal, but they can grow really quickly.  So, you should make sure to configure the Audit Log to regularly purge rows from the database and write them to disk using the “Audit Manager” in the WCI portal administration page (under “Select Utility”):

[Screenshot: the Audit Manager utility]

(By the way, for the network path you specify there, remember to make sure there are no extraneous files in that directory.)

This isn’t just good practice; it could save you some huge headaches down the line.  In particular, I’ve seen two “worst-case scenarios”:

  1. At one client with an Oracle database, there was a limit on how big the audit log tablespace could grow.  As soon as that limit was hit, no one could visit the portal: the portal tried and failed to record even the guest user’s “login”, threw an exception, and refused to let anyone in.
  2. The Audit Log Management job has some wildly inefficient code that appears to load ALL rows into memory before writing them out to disk.  Once the audit log in the DB grows past a certain size, the very job that’s supposed to clear out those rows fails with an OutOfMemory exception:

[Screenshots: the Audit Log Management job failing with an OutOfMemory exception]

If you find yourself stuck in this predicament, where the audit log is too big to clear out through traditional means (the Audit Log Management job), your Last Great Hope is to simply purge the table through SQL:

delete from ptauditmsgs where auditmsgtime < '30-DEC-09'
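If the table has grown truly huge, a single delete can also blow out your undo/rollback space, so a safer approach (just a sketch, assuming the Oracle database from scenario #1 above and the same date cutoff) is to check the row count first and delete in batches:

-- see how many rows are older than the cutoff
select count(*) from ptauditmsgs where auditmsgtime < '30-DEC-09';

-- delete in manageable chunks, committing between batches;
-- rerun until no rows older than the cutoff remain
delete from ptauditmsgs where auditmsgtime < '30-DEC-09' and rownum <= 100000;
commit;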

So schedule your audit archive job now, and let’s hope you never have to resort to dropping those records just to get users logging in and events archived again.