# Saturday, February 20, 2010

If you’ve ever built a custom protocol handler for MOSS, you may have used the object model to create the content source, since there is no way to do this in Central Admin.

 

Something like:

 

    SearchContext context = SearchContext.GetCurrent(spSite);
    Content content = new Content(context);
    CustomContentSource source = (CustomContentSource)content.ContentSources.Create( ...
    source.Update();

But SearchContext is marked as obsolete in SharePoint 2010, which makes total sense when you consider that the SSP is no more; it has been replaced by service applications.

 

The good news is that we can use PowerShell to create the custom content source:

 

    $sapp = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"

    New-SPEnterpriseSearchCrawlContentSource -SearchApplication $sapp -Name "Custom Source Name" -Type Custom -StartAddress protocol://servername -CrawlPriority Normal -MaxPageEnumerationDepth 1 -MaxSiteEnumerationDepth 1

 

 

So much easier …

 

Also when you register your custom protocol handler for SharePoint 2010, remember that the registry hive location now has the number ‘14.0’ in its path:

 

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\ProtocolHandlers

 

I still need to have a good play with the new BCS stuff in SharePoint 2010, but looking at this post by Todd Baginski I have a feeling that we might have a few more tricks up our sleeves before we go down the custom protocol handler or FAST pipeline route.

Saturday, February 20, 2010 9:36:00 AM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
Search | SharePoint 2010

In MOSS 2007 we could use the CodePlex faceted search solution to provide what is now called ‘Search Refiners’ in SharePoint 2010.

 

By default the standard refiner will look like this:

 

[Image: the default search refiner panel]

 

The really nice thing about the refiner is that it will only show a value if a search result is returned for it, so a user will never be faced with clicking on an option and having it return zero results.

Those of us who are familiar with the old faceted search solution will know that its web part displays a count of the results. The SharePoint 2010 refiner can do that as well:

 

[Image: the search refiner panel showing result counts]

Notice the subtle counts next to the metadata property.

 

This can be achieved by adding an attribute to the refiner’s configuration XML:

Find the <Category> element that you wish to display counts for and add: ShowCounts="Count"

Saturday, February 20, 2010 9:21:00 AM (E. Australia Standard Time, UTC+10:00)  #    Comments [2] - Trackback
Search | SharePoint 2010
# Wednesday, November 19, 2008

I was asked recently whether BDC search results (once indexed by search) can be controlled by an access list. The answer is yes: the security trimmer is the SharePoint feature that accomplishes this. In fact any search result can be trimmed, so if you wanted to index a website that uses custom permissions (for example, with a content access account that has full rights to the website) but didn’t want to show that information to, say, public users of your site, the same security trimmer functionality can be used.

The important things to note are:

  • The security trimmer is attached to a crawl rule
  • The security trimmer is a class that implements the ISecurityTrimmer interface. The registration process specifies the full assembly name, so the assembly must be installed in the GAC.
  • After the security trimmer is registered, you will need to recreate the content source and perform a full crawl
  • Performance might be an issue, since every search result will be access checked. If you’re looking for insight on how to approach this, refer to this MSDN article
Wednesday, November 19, 2008 10:49:00 AM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
BDC | Search | Tip
# Sunday, November 02, 2008

In both MOSS and Search Server it is possible to configure an XML file that controls expansion and replacement words. In effect this is a thesaurus file that you can customise with words that may be specific to your organisation.

 

The classic example given is one which expands the technology acronyms (computer types sure do like them!):

    <XML ID="Microsoft Search Thesaurus">
      <thesaurus xmlns="x-schema:tsSchema.xml">
        <diacritics_sensitive>0</diacritics_sensitive>
        <expansion>
          <sub>Internet Explorer</sub>
          <sub>IE</sub>
          <sub>IE5</sub>
        </expansion>
        <replacement>
          <pat>NT5</pat>
          <pat>W2K</pat>
          <sub>Windows 2000</sub>
        </replacement>
      </thesaurus>
    </XML>

 

To find the location of this file, first look in the registry under [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager] at the DefaultApplicationsPath value.

 

Once you have found the file you can simply add expansion and replacement elements with the child nodes as needed.

Don’t forget to do this on all the servers in your farm. You will also need to restart the search service for these changes to take effect.

 

The following table (from the enterprise search blog) lists the elements and what they do:

 

| Term | Meaning |
| --- | --- |
| thesaurus | Marks the beginning (and end) of the thesaurus |
| diacritics_sensitive | Diacritics are marks, such as accents, that are added to letters and change their pronunciation. For example, the acute accent over an e gives you é. 0 = ignore diacritics, 1 = respect diacritics |
| expansion | A list of alternative forms, each marked by a sub element |
| sub (in an expansion) | One of several alternatives in an expansion |
| replacement | Several patterns will be replaced with a substitution |
| pat | A pattern to be replaced |
| sub (in a replacement) | Item to be substituted |
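
To make the difference between expansion and replacement concrete, here is a minimal Python sketch that reads the example thesaurus and rewrites a single query term. It only illustrates the semantics in the table; it is not how the search engine implements them. The function names `load_thesaurus` and `rewrite_term` are hypothetical, the outer `<XML>` wrapper is left out for brevity, and case-insensitive matching is an assumption.

```python
import xml.etree.ElementTree as ET

# The example thesaurus from this post, without the outer <XML> wrapper.
THESAURUS_XML = """
<thesaurus xmlns="x-schema:tsSchema.xml">
  <diacritics_sensitive>0</diacritics_sensitive>
  <expansion>
    <sub>Internet Explorer</sub>
    <sub>IE</sub>
    <sub>IE5</sub>
  </expansion>
  <replacement>
    <pat>NT5</pat>
    <pat>W2K</pat>
    <sub>Windows 2000</sub>
  </replacement>
</thesaurus>
"""

NS = {"t": "x-schema:tsSchema.xml"}


def load_thesaurus(xml_text):
    """Return (expansion groups, replacement map) parsed from the XML."""
    root = ET.fromstring(xml_text.strip())
    expansions = [
        [sub.text for sub in exp.findall("t:sub", NS)]
        for exp in root.findall("t:expansion", NS)
    ]
    replacements = {}
    for rep in root.findall("t:replacement", NS):
        target = rep.find("t:sub", NS).text
        for pat in rep.findall("t:pat", NS):
            replacements[pat.text.lower()] = target
    return expansions, replacements


def rewrite_term(term, expansions, replacements):
    """Replacement swaps the term; expansion adds every alternative."""
    key = term.lower()  # assumption: matching ignores case
    if key in replacements:
        return [replacements[key]]
    for group in expansions:
        if key in (alt.lower() for alt in group):
            return list(group)
    return [term]


expansions, replacements = load_thesaurus(THESAURUS_XML)
print(rewrite_term("W2K", expansions, replacements))
print(rewrite_term("IE", expansions, replacements))
```

In this sketch a search for W2K becomes a search for Windows 2000 only, while a search for IE is widened to all three alternatives in its expansion group.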

 

I’ve only scratched the surface here. For a full overview of this topic, check out the enterprise search blog

Don’t forget that SQL Server full text search (FTS) has the same capabilities in terms of a thesaurus file that supports expansion and replacement words. If you’re looking for information on SQL Server 2005 or 2008 you should refer to this post.

The other feature that both products support is noise words. These are words that add no value to a search, like ‘been’, ‘before’, ‘being’ and ‘both’. This KB describes the process to add or remove words from this list, but in short it means modifying the contents of a file that lives in Data\Ftdata\SharePointPortalServer\Config. It’s a simple format where each word is on its own line (no XML).
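
As a rough illustration of what a noise word list does to a query, here is a small Python sketch. It assumes the one-word-per-line format described above and splits the query on whitespace, which is a simplification of what a real word breaker does. The function names and the sample query are hypothetical.

```python
def load_noise_words(text):
    """Parse noise file content: one word per line, blank lines ignored."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def strip_noise(query, noise_words):
    """Drop noise words from a whitespace-separated query."""
    return [term for term in query.split() if term.lower() not in noise_words]


# Stand-in for the contents of a noise word file.
NOISE_FILE_CONTENT = "been\nbefore\nbeing\nboth\n"

noise = load_noise_words(NOISE_FILE_CONTENT)
print(strip_noise("invoices sent before both audits", noise))
```

Only the terms that are not in the noise list survive, so adding an organisation-specific filler word to the file removes it from every query in the same way.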

Have fun customising the search experience.

Sunday, November 02, 2008 8:37:00 PM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
Search | setup | Sharepoint
# Friday, October 31, 2008

With little effort you can make your site a little more search friendly. It’s possible to embed some XML into your site that your browser can use in its search box. Most modern browsers support OpenSearch, which is the format this XML fragment uses.

 

The first step is to create the following xml, but replace the bits that are specific to your site:

 

    <?xml version="1.0" encoding="UTF-8"?>
    <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
        <ShortName>SharePoint Search</ShortName>
        <Description>Search for SharePoint</Description>
        <Url type="text/html" method="get" template="http://YourSite/SearchCenter/results.aspx?k={searchTerms}"/>
        <Image width="16" height="16">http://YourSite/favicon.png</Image>
        <InputEncoding>UTF-8</InputEncoding>
        <SearchForm>http://YourSite/SearchCenter/</SearchForm>
    </OpenSearchDescription>
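
The `{searchTerms}` placeholder in the Url template above is what the browser fills in with whatever the user types into the search box. The Python sketch below shows that substitution. The helper name `build_search_url` is hypothetical, and encoding spaces as `+` via `quote_plus` is an assumption; browsers may encode the query slightly differently.

```python
from urllib.parse import quote_plus

# The Url template from the OpenSearch description above.
TEMPLATE = "http://YourSite/SearchCenter/results.aspx?k={searchTerms}"


def build_search_url(template, search_terms):
    """Substitute the URL-encoded query for the {searchTerms} placeholder."""
    return template.replace("{searchTerms}", quote_plus(search_terms))


print(build_search_url(TEMPLATE, "content source"))
```

The resulting URL is an ordinary request to the search centre results page with the query in the `k` parameter, which is why no extra server-side work is needed to support the browser search box.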

 

In the master page, you will need to refer to this xml file:

 

    <link rel="search" type="application/opensearchdescription+xml" href="/Style%20Library/OpenSearch.xml" title="SharePoint Search">

 

Finally, you can drop down the search provider box in your browser to select your new search provider. From then on there is no need to browse to the search centre before searching.

 

[Image: the browser's search provider drop-down]

 

Little things like this can help entrench searching as the primary navigation method in an organisation.

Friday, October 31, 2008 12:18:00 AM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
Search | Sharepoint | Tip
# Thursday, October 23, 2008

One of the cool things that MOSS offers is the ability to display the people search results ordered by social distance. You can see which result format was returned by looking at the search action links web part:

 

[Image: search action links web part showing the social distance view]

The default people search results view can be changed in the Core results web part:

 

[Image: Core Results web part settings]

 

The MSDN SharePoint blog has a detailed post which outlines how the colleague connections are formed:

  • Immediate colleagues which are formed using the manager profile field.
  • Colleagues added by you
  • Suggested colleagues

It’s also an interesting read to find out some small details like:

  • The first 3 pages of search results are grouped by colleague-ness: first your colleagues appear, then colleagues of your colleagues, then everyone else.
  • Within each group, the ordering is still by relevance.
  • When paging through results, another 3 pages of results will be grouped once you reach page 4, then page 7, etc.
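
The grouping described in the list above can be sketched in Python as follows. This is my reading of the described behaviour, not the actual MOSS ranking code: the `Person` type, the numeric `distance` values and the function name are all hypothetical. Results arrive in relevance order, are cut into windows of three pages, and each window is stably sorted by social distance so that relevance order is preserved within each group.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    distance: int  # 0 = colleague, 1 = colleague of a colleague, 2 = everyone else


def group_by_social_distance(results, page_size=10, pages_per_window=3):
    """Regroup relevance-ordered results one window of pages at a time."""
    window = page_size * pages_per_window
    ordered = []
    for start in range(0, len(results), window):
        chunk = results[start:start + window]
        # sorted() is stable, so relevance order survives inside each group.
        ordered.extend(sorted(chunk, key=lambda person: person.distance))
    return ordered


# Relevance-ordered sample; tiny pages keep the example readable.
by_relevance = [
    Person("A", 2), Person("B", 0), Person("C", 1),
    Person("D", 0), Person("E", 0), Person("F", 2),
]
grouped = group_by_social_distance(by_relevance, page_size=2, pages_per_window=2)
print([person.name for person in grouped])
```

With a window of four results, the first window is regrouped on its own and the remaining two results form the next window, which mirrors the way a new grouping starts at page 4, page 7 and so on.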

 

Overall I think the feature works extremely well, although I’ve seen some users struggle with it. These users were typically expecting the results to be in alphabetical order, like their previous pre-MOSS system. I don’t agree with that approach, since the results are returned by relevance (but in a social context), but it is possible to sort the results alphabetically: Paul Galvin has posted some XSLT that does this. Remember that this is done outside of the search engine itself, so the XSLT only sorts the results within each page. So it’s possible to have page 1 contain 10 results ordered alphabetically, then page 2 contain another 10 results that start the alphabet again, which might confuse some users.

I think ultimately these users just need a little bit of training to understand the social distance format. Just like with any search engine, if you don’t get the results you are looking for on the first page, I really think you need to refine your search. If you’re just looking for a colleague’s details, the social distance view is fantastic and saves lots of time. In my experience I use people search to find colleagues 90% of the time; of course your situation may be different.
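
The per-page sorting caveat is easy to see with a small Python sketch. The names and the page size of three are made up for illustration; the point is only that sorting each page on its own is not the same as sorting the whole result set.

```python
def paginate(items, page_size):
    """Split a list into consecutive pages of at most page_size items."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


# Results as the search engine returns them, in relevance order.
relevance_order = ["Zoe", "Adam", "Mia", "Ben", "Yara", "Carl"]

# What an XSLT sort applied to each rendered page effectively does.
pages = [sorted(page) for page in paginate(relevance_order, 3)]
print(pages)

# What users expecting an alphabetical directory would like to see.
print(sorted(relevance_order))
```

Page 1 ends with a name late in the alphabet and page 2 starts again near the beginning, which is exactly the behaviour that can surprise users.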

Thursday, October 23, 2008 7:39:14 AM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
MOSS | Search
# Thursday, October 16, 2008

By default SharePoint will create a content source called ‘Local Office SharePoint Server sites’ such as:

 

[Image: the default content source in the content sources list]

 

This will contain the starting addresses of all the sites on your SharePoint server such as:

[Image: the edit content source page showing the start addresses]

 

Notice how it also includes the sps3:// address; this is what indexes your user profiles.

My tip is to remove the sps3:// link from the default content source and add it as a new content source on its own.

 

The reasons why I think this is helpful:

  • By default you need to crawl all your other content just to update your user profile information.
  • You can schedule your profile crawls at a time that suits your active directory imports

In any case it’s worth considering breaking the profile crawl into its own content source.

Thursday, October 16, 2008 10:55:00 PM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
Search | Sharepoint | Tip
# Sunday, October 05, 2008

MOSS 2007 has the option to use a dedicated web front end server for crawling content:

[Image: the dedicated web front end crawler setting]

 

Why would you want to do this?

 

Advantages:

  • Search doesn’t compete with the end users – Large environments that need to crawl constantly can generate more traffic than the normal user load, and you don’t want your users to experience slow pages just because a crawl is running. By using a dedicated web front end that isn’t part of the load balanced cluster, your indexing won’t impact your users as much (I say as much, because you still need to think about the impact on the database server).
  • Easy to move the WFE into the load balanced cluster – It’s a rather crude disaster recovery method, but it’s not that hard to move this box into the load balanced cluster if you really need the extra capacity or if one of your other servers fails. After all it’s just a normal web front end, but one that is reserved for the indexer.
  • Perfect place to run a backup central admin – You should always try to have more than one server running central admin (on a large farm anyway), that way if your main central admin server goes down, you still have a way to manage the farm.

 

Disadvantages:

  • More hardware – The obvious disadvantage to having an extra machine is the requirement of more hardware, which also means:
  • More cost – New hardware is an additional cost, but now that you have an idea of the advantages it brings, you can make a more informed decision.

 

 Joel has some other tips such as adding a robots.txt to servers that you don’t wish to participate in the indexing process.

Sunday, October 05, 2008 7:24:00 AM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
configuration | MOSS | Sharepoint | Search
# Wednesday, September 24, 2008

Just a quick tip: if you’re crawling external sources with MOSS (or Enterprise Search) and you find that your crawl doesn’t finish or hangs, it is worth checking whether the site you are crawling has a calendar with links:

 

[Image: a calendar with date and next links]

 

You will need to determine the URL of the calendar and then add a crawl rule to exclude that path. Otherwise the crawler will see an infinite number of pages, because it treats each date and next link as a separate page.
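
The effect of such an exclusion rule can be sketched in Python. This is only an illustration of wildcard path exclusion, not the crawler's own rule engine: the intranet.example.com URLs, the rule pattern and the `should_crawl` helper are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion rule covering everything under the calendar path.
EXCLUDE_RULES = ["http://intranet.example.com/calendar/*"]


def should_crawl(url, exclude_rules=EXCLUDE_RULES):
    """Return False when the URL matches any wildcard exclusion rule."""
    return not any(
        fnmatchcase(url.lower(), rule.lower()) for rule in exclude_rules
    )


# Every "next" link on a calendar produces yet another distinct URL.
print(should_crawl("http://intranet.example.com/calendar/view.aspx?date=2008-10-01"))
print(should_crawl("http://intranet.example.com/news/default.aspx"))
```

Because the rule matches the whole path prefix, every generated date URL is skipped while the rest of the site is still crawled.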

Wednesday, September 24, 2008 1:17:00 PM (E. Australia Standard Time, UTC+10:00)  #    Comments [0] - Trackback
MOSS | Search | Sharepoint