Crawl RSS Feeds with WebCenter Interaction

Wednesday, October 19th, 2011

I don’t know whether to file this one under “obvious” or not. On one hand, most people have probably always known this. On the other, it’s such an under-used feature that it bears repeating: Web Crawlers in WebCenter Interaction (and even back in the ALUI days) aren’t just for web sites; they can crawl RSS feeds too.

Configuration is identical to that of any other Web Crawler. In Administration, select “Create Object: Content Crawler – WWW” and choose the “World Wide Web” Content Source.

Here, instead of entering the address of a web site, just provide the URL of the RSS feed.

Once the job runs, a card is created for each article in the feed.
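To preview how many cards a feed is likely to produce before pointing the crawler at it, you can count the items in the feed yourself. The snippet below is only a rough sketch in Python, not how WebCenter Interaction parses feeds internally: the sample feed, its URLs, and the `list_feed_items` helper are all made up for illustration, and it assumes a plain RSS 2.0 document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed (assumption); a real check would use the XML
# downloaded from the feed URL you plan to give the crawler.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <pubDate>Mon, 17 Oct 2011 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <pubDate>Wed, 19 Oct 2011 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def list_feed_items(rss_xml):
    """Return one (title, link, pubDate) tuple per <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append(
            (item.findtext("title"), item.findtext("link"), item.findtext("pubDate"))
        )
    return items


# Each <item> is one article, so this is roughly the number of cards to expect.
for title, link, pub_date in list_feed_items(SAMPLE_RSS):
    print(title, link, pub_date)
```

The `pubDate` values here are the dates the articles were written, which is useful to compare against what the crawler records.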

Note that the created date shows when the feed was crawled, not when the original articles were written. Also, in this example only 11 cards were created, because that’s all the Integryst RSS feed currently provides. Both problems can be mitigated by running your crawler job regularly: the created dates will be closer to when the posts were actually written, and the cards stick around after the articles have “left the feed”.
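The effect of a regular schedule is easier to see with a toy model. This is just a simulation in Python, not portal code: the `crawl` function, the link names, and the dates are invented for illustration, and it assumes that re-crawling an article that already has a card leaves that card's created date alone.

```python
from datetime import date


def crawl(cards, feed_links, crawl_date):
    """Add a card for each link not seen before; existing cards are left untouched."""
    for link in feed_links:
        # setdefault only stores crawl_date when the link has no card yet
        # (assumption: a re-crawl does not reset the created date).
        cards.setdefault(link, crawl_date)
    return cards


cards = {}

# First run: the feed lists posts 1 and 2.
crawl(cards, ["/post-1", "/post-2"], date(2011, 10, 17))

# Second run: post 1 has dropped out of the feed and post 3 has appeared.
crawl(cards, ["/post-2", "/post-3"], date(2011, 10, 19))

for link, created in sorted(cards.items()):
    print(link, created)
```

After the second run there are three cards even though the feed only ever listed two articles at a time, and each card's created date is the date of the first crawl that saw it. The more often the job runs, the smaller the gap between that date and the real publication date.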