Federated Search & External Web Sites

March 22, 2010 | 11 comments

MOSS has great search functionality, especially for indexing SharePoint content. But when it comes to indexing an organization’s external web sites, some problems may arise. The crawling and ranking mechanisms are tuned for indexing structured intranet content and do not always perform optimally with external web site content.

Indexing an organization’s external web sites also creates problems in presenting the content. If the external content is included in the default scope (“All sites” or similar), the number of search results may grow too high. If the external content is put into a separate scope, that scope will rarely be used. Marking external sites as non-authoritative pages drops their rank but also makes including them in search less meaningful.

Fortunately there is a newer approach to this. The post-SP1 Infrastructure Update, and later Service Pack 2, introduced the concept of Federated Search. This feature also exists in SharePoint Server 2010. Quoting TechNet:

A federated search is the simultaneous querying of multiple online databases (locations) for the purpose of generating a single search results page for end users.

When you add a federated location to Office SharePoint Server 2007, end users can search for and retrieve content that has not been crawled by your server. Federated locations allow queries to be sent to remote search engines and feeds, after which Office SharePoint Server 2007 formats and renders the results to your end users as part of your crawled content.

The Search Center site template includes the federated search web parts by default, and the federation connections are configured to show search results from Bing. You can create your own connectors in the Shared Services Provider (SSP) as long as the target system has an OpenSearch interface.
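As a rough illustration of what an OpenSearch-style source hands back, the sketch below parses a small RSS result feed of the kind requested with format=rss in the query template later in this post. The sample feed, its URLs and the parse_results helper are made up for this example. MOSS itself renders the feed in the Federated Results Web Part, so this only shows the shape of the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical result feed; a real source returns its own items.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example results</title>
    <item>
      <title>Products</title>
      <link>http://www.site1.com/products</link>
      <description>Product overview page.</description>
    </item>
    <item>
      <title>Contact</title>
      <link>http://www.site1.com/contact</link>
      <description>Contact details.</description>
    </item>
  </channel>
</rss>"""


def parse_results(rss_text):
    """Return one dict per <item> with its title, link and description."""
    root = ET.fromstring(rss_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return results


for result in parse_results(SAMPLE_RSS):
    print(result["title"], "->", result["link"])
```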

Just federating search results from Bing using the out-of-the-box setup may not be very meaningful. But with a simple tweak, you can target Bing to search only the web sites of your organization. This way you no longer have to index the sites with MOSS search, which frees up server resources. Search results also look more refined when the external site results are displayed in a separate web part.

Targeting federated results to only your organization’s external sites using Bing can be done like this:

1. Create a copy of the existing “Internet Search Results” federated location.

2. Modify the Query Template of the location:

http://search.live.com/results.aspx?q={searchTerms}+%28site%3Awww.site1.com+OR+site%3Awww.corp2.com+OR+site%3Awww.corp3.com%29&count={itemsPerPage}&first={startItem}&mkt={language}&format=rss&FORM=SHAREF

3. Modify the “More Results” Link Template of the location:

http://search.live.com/results.aspx?q={searchTerms}+%28site%3Awww.site1.com+OR+site%3Awww.corp2.com+OR+site%3Awww.corp3.com%29&first={startItem}&mkt={language}&FORM=SHAREM

4. Edit the properties of the Federated Results Web Part in Search Center results page to use the newly created location.
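If the list of sites changes often, the two templates from steps 2 and 3 can be generated instead of edited by hand. The following is a minimal sketch: the base URL, parameters and placeholders are copied from the templates above, while the function names are made up for this example. The output still has to be pasted into the federated location settings manually.

```python
from urllib.parse import quote

# Base URL taken from the templates in this post.
BASE = "http://search.live.com/results.aspx"


def site_filter(sites):
    """Return the URL-encoded '(site:a OR site:b)' clause."""
    if not sites:
        raise ValueError("at least one site is required")
    clause = "(" + " OR ".join("site:" + s for s in sites) + ")"
    # quote() percent-encodes '(', ')' and ':'; spaces are then turned into '+'.
    return quote(clause, safe="").replace("%20", "+")


def query_template(sites):
    """Build the Query Template (step 2); the {placeholders} stay literal."""
    return (BASE + "?q={searchTerms}+" + site_filter(sites)
            + "&count={itemsPerPage}&first={startItem}"
            + "&mkt={language}&format=rss&FORM=SHAREF")


def more_results_template(sites):
    """Build the "More Results" Link Template (step 3)."""
    return (BASE + "?q={searchTerms}+" + site_filter(sites)
            + "&first={startItem}&mkt={language}&FORM=SHAREM")


sites = ["www.site1.com", "www.corp2.com", "www.corp3.com"]
print(query_template(sites))
print(more_results_template(sites))
```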

One thing to always consider with federated search is the confidentiality of search queries, as intranet search terms may contain confidential information. When federating search results from an external system, the query terms are sent to that system over the Internet, usually unencrypted. Big search providers like Bing or Google also store query data for several months. If you have the federation web parts on your Search Center results page, Microsoft or Google gets practically all search terms used in your intranet, and in theory an electronic eavesdropper could get them as well. In most cases this is not a problem, but the risks must be analyzed beforehand.
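One way to start that analysis is to list where each federated location sends the query terms and whether the transport is encrypted. A minimal sketch, assuming the query templates have been copied out of the federated location settings by hand; the audit_templates helper, the second location and its URL are hypothetical:

```python
from urllib.parse import urlsplit


def audit_templates(templates):
    """Map each location name to (host, uses_https) for its query template."""
    report = {}
    for name, template in templates.items():
        parts = urlsplit(template)
        report[name] = (parts.hostname, parts.scheme.lower() == "https")
    return report


templates = {
    # Shortened form of the Bing template used in this post.
    "Internet Search Results":
        "http://search.live.com/results.aspx?q={searchTerms}&format=rss",
    # Made-up location for comparison.
    "Hypothetical HTTPS source":
        "https://search.example.com/feed?q={searchTerms}",
}

for name, (host, encrypted) in audit_templates(templates).items():
    status = "HTTPS" if encrypted else "plain HTTP, query terms readable in transit"
    print(f"{name}: {host} ({status})")
```

Such a list only covers the transport. What the remote provider stores and for how long still has to be checked against its own policies.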

11 comments to “Federated Search & External Web Sites”

  1. Very good tip. Also an excellent point about the security implications of this feature. Many intranet search logs are full of confidential information about internal systems, people’s names, internal project knowledge, even passwords and usernames.

    But in cases where this is not considered a problem this might be an interesting way to do a more complete search center for employees.

    The same feature can also be used to query Wikipedia, dictionaries and even translation engines like kaannos.com. Those can also be useful sources in many intranet scenarios, although the same security implications apply to them as well (and might be even bigger than with Microsoft or Google).

  3. Satish says:

    Hi All,

    I configured federated search for bing.com as per the steps mentioned above.
    I am facing one issue.

    The results are displayed in the web part only after searching for the same keyword on BING.COM explicitly; otherwise the web part remains blank.

    Please advise me on this ASAP.

    Thanks & Regards,
    Satish

