Monday, December 20, 2010

Developing SharePoint 2010 Search Solutions (Fast and SharePoint)


Developing SP2010 custom search solutions can be rewarding. Custom solutions can enhance SharePoint search by giving users the ability to search by properties and manipulate the results. However, making custom search solutions that can be used with either MSS or FAST search can be much more complicated. In this post I am going to lay out the similarities, differences and problems between MSS search and FAST search. I am also going to explain problems that currently exist in SP2010 and FAST search and possible remedies.

MSS compared to FAST search

MSS (Microsoft Shared Services) and FAST have much in common. In fact, if you have both installed on your farm, users will not see much difference between the SharePoint and FAST search web parts and search centers. The noticeable difference is in the results, where FAST adds counts to the refinement web part and shows thumbnail images of Word and PowerPoint files. Even from an administrative perspective both MSS and FAST support the following:

Service Application Infrastructure

Metadata schema management

Crawl scheduling

Scopes, Best Bets and Synonyms

The biggest difference between SharePoint and FAST is FAST’s more robust ability to crawl millions of documents and its better relevance in search results. SharePoint search can efficiently crawl and query up to 100 million documents, whereas FAST can efficiently do the same for up to 500 million documents.

Fast Search Capacity Planning

Another substantial difference is the object model and many other little quirks that you will encounter when developing custom search solutions.

Search | Supported Syntax     | Object Model
MSS    | Keyword, FullTextSQL | Microsoft.Office.Server.Search
FAST   | Keyword, FQL         | Microsoft.Office.Server.Search, Microsoft.SharePoint.Search.Extended.Administration

When developing search solutions that support managed property searching you can use either the KeywordQuery or the FullTextSQLQuery class. The KeywordQuery class now supports the operators (OR, AND, NOT, NEAR) when doing property searching. In SP2007 these types of operators were only available through the FullTextSQLQuery class using SharePoint Search SQL syntax. Keyword Query Syntax

In some situations you may want to use the FullTextSQLQuery class, which supports other proximity operators and full text operators such as CONTAINS that can be more effective for exact results. In addition, the mapped crawled property does not need to be included in the full text index, as is required for Keyword property searching. SharePoint Search SQL syntax reference

FAST does not support SharePoint Search SQL queries. Microsoft now recommends you develop all your search solutions using the KeywordQuery class so they can be used seamlessly with both SharePoint and FAST search. However, just like with SharePoint search, if you need to create more complex searches in your solution, then you should use FQL (FAST Query Language). The KeywordQuery class exposes the EnableFQL property. By setting this property to true your solution can use FQL, a query language whose grammar is defined in ABNF (Augmented Backus-Naur Form), for more exact searching using managed properties. FQL syntax reference

Below are examples of the same query using SQL, Keyword and FQL Syntax:

SQL:     SELECT Title,Created,Path FROM SCOPE() WHERE (Title = 'Whatever' OR FileExtension = 'PDF')
Keyword: title:whatever OR fileextension:PDF
FQL:     or(title:equals("whatever"),fileextension:equals("PDF"))

So this is where things start getting different. You will notice that the SQL query includes a SELECT list of properties to retrieve in the results. So how do you tell SharePoint which properties to return with Keyword or FQL syntax? In the code below you will see how using the SelectProperties collection of the KeywordQuery class lets you add the properties you want to return. You can easily add a range of property names.

 

// Namespaces used: System, System.Collections.Generic, System.Data, System.Linq,
// Microsoft.SharePoint, Microsoft.SharePoint.Administration,
// Microsoft.Office.Server.Search.Administration, Microsoft.Office.Server.Search.Query
public DataTable Execute(string queryText,
    Dictionary<string, Type> selectProperties)
{
    DataTable retResults = new DataTable();

    try
    {
        // Resolve the default search service application proxy.
        SPServiceContext context =
            SPServiceContext.GetContext(SPServiceApplicationProxyGroup.Default,
            SPSiteSubscriptionIdentifier.Default);

        SearchServiceApplicationProxy ssap =
            context.GetDefaultProxy(typeof(SearchServiceApplicationProxy))
            as SearchServiceApplicationProxy;

        using (KeywordQuery query = new KeywordQuery(ssap))
        {
            query.QueryText = queryText;
            query.ResultTypes = ResultType.RelevantResults;
            query.RowLimit = 5000;
            query.ResultsProvider = SearchProvider.FASTSearch; // target FAST
            query.EnableFQL = true;                            // queryText is FQL
            query.EnableStemming = true;
            query.EnablePhonetic = true;
            query.TrimDuplicates = true;

            // Replace the default select list with the requested managed properties.
            if (selectProperties != null && selectProperties.Count > 0)
            {
                query.SelectProperties.Clear();
                query.SelectProperties.AddRange(selectProperties.Keys.ToArray<string>());
            }

            ResultTableCollection rtc = query.Execute();

            if (rtc.Count > 0)
            {
                // Load the relevant results into the DataTable that is returned.
                using (ResultTable relevantResults = rtc[ResultType.RelevantResults])
                    retResults.Load(relevantResults, LoadOption.OverwriteChanges);
            }
        }
    }
    catch (Exception ex)
    {
        //TODO:Error logging
    }

    return retResults;
}

 

Notice how you can switch search providers by using the ResultsProvider property. The property is set to the FASTSearch provider, but it can also be set to SharePointSearch or Default. Default will use whatever provider the search service application proxy is configured for. If your query uses FQL syntax you must set EnableFQL to true. If you don’t and the solution submits an FQL syntax query, it will raise an error. A final note about using FQL and FAST search is that the property names must be in lower case. SQL and Keyword search property names are case insensitive, but FQL property names are not. So if you use a property name that is not all lower case, then the code will raise a “Property doesn't exist or is used in a manner inconsistent with schema settings” error.

The Execute method of both the FullTextSQLQuery and KeywordQuery classes returns a ResultTableCollection object, from which you then load the results into a DataTable. Here is the strange part with FAST. It returns a DataTable object where the data columns are read only and all the data column types are strings. This can be a problem if your solution binds directly to the DataTable. For instance, if your grid has sorting and the managed property is expected to be a date time value, then the dates are sorted as strings. You can fix this issue by cloning the DataTable, changing each column’s data type and then importing the rows.

         

          DataTable convertedResults = results.Clone();

          foreach (DataColumn dc in convertedResults.Columns)
          {
              dc.ReadOnly = false;

              if (selectProperties.ContainsKey(dc.ColumnName))
                  dc.DataType = selectProperties[dc.ColumnName];
          }

          foreach (DataRow dr in results.Rows)
          {
             convertedResults.ImportRow(dr);
          }
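To see the effect of the conversion without a SharePoint farm, here is a minimal, self-contained sketch that applies the same Clone and ImportRow technique to a hand-built table. The class name, method names, and sample rows are all made up for illustration; the all-string table only stands in for what FAST returns, and the date format used is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// Hypothetical helper for illustration; not part of the SharePoint object model.
public static class ResultTableConverter
{
    // Returns a copy of 'results' whose columns use the types listed in
    // 'columnTypes'. Columns that are not listed keep their original type.
    public static DataTable ConvertColumns(DataTable results,
        IDictionary<string, Type> columnTypes)
    {
        DataTable converted = results.Clone(); // copies the schema only, no rows

        foreach (DataColumn dc in converted.Columns)
        {
            dc.ReadOnly = false;

            Type target;
            if (columnTypes.TryGetValue(dc.ColumnName, out target))
                dc.DataType = target; // allowed because the clone holds no rows yet
        }

        // Each value is converted to the new column type as the row is imported.
        foreach (DataRow dr in results.Rows)
            converted.ImportRow(dr);

        return converted;
    }

    // Builds a small all-string table that stands in for FAST results.
    // The invariant locale makes the sample dates parse the same way everywhere.
    public static DataTable BuildSampleResults()
    {
        DataTable t = new DataTable { Locale = CultureInfo.InvariantCulture };
        t.Columns.Add("Title", typeof(string));
        t.Columns.Add("Created", typeof(string));
        t.Rows.Add("Budget", "12/9/2010");
        t.Rows.Add("Plan", "1/15/2009");
        t.Rows.Add("Notes", "2/1/2010");
        return t;
    }
}
```

Sorting the sample table on Created as strings puts "12/9/2010" ahead of "2/1/2010". After calling ConvertColumns with Created mapped to typeof(DateTime), the same sort orders the rows chronologically.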

 

Searching Problems

Both SharePoint and FAST have quirky issues when searching decimal type managed properties. SharePoint search has a feature in the schema where you can automatically create new managed properties for newly discovered crawled properties. However, if the crawled property is a decimal, then the crawler does not store the decimal portion of the value from SharePoint. For example, if your value in SharePoint is 10.12345, then the value stored is 10.00000. This basically makes searching for decimal amounts useless. Fortunately, Microsoft will be issuing a hotfix for this in the February 2011 cumulative update. Until then, the workaround is to delete the automatically created managed property, create your own, and then do a full crawl.

FAST has similar issues with decimal type managed properties, but they are more subtle. When using the FQL int or float functions, FAST will only search up to 3 decimal places. Using the example above, if you search for 10.123 you will find your document; however, if you search for 10.12345 you will not. Is this a problem? I am not sure how many people use more than 3 decimal places in SharePoint.

One of the most common ways to search in SharePoint is to find a document based on a text managed property. Unfortunately, SP2010 has made this more complicated. SP2010 search is more scalable than SP2007 and one reason is the new feature of reducing storage space for text type managed properties. When creating a new text managed property you can set the “Reduce storage requirements for text properties by using a hash for comparison” option. If you do not set this option the “=” operator will not work. You can only use the CONTAINS full text predicate function with the FullTextSQLQuery class or the “:” operator with the KeywordQuery class, both of which will return results where the term is located within the text. This does not produce an exact match.

Schema Issues

Both SharePoint and FAST give you access to managed and crawled properties using the object model. You can access the SharePoint search schema using the Microsoft.Office.Server.Search.Administration namespace. However, with FAST you must use the Microsoft.SharePoint.Search.Extended.Administration.Schema namespace located in Microsoft.SharePoint.Search.Extended.Administration.dll. FAST schema administration object model reference

One of the most common errors seen when searching SharePoint is the “Property doesn't exist or is used in a manner inconsistent with schema settings” error. To avoid getting this error in your custom solution you must prevent managed properties that are not “Queryable” from being used. The “Queryable” criteria are different between SharePoint and FAST. With SharePoint search you must use the ManagedProperty.GetDocumentsFound method to determine if any documents in the index are using this managed property. However, with FAST you must check both the ManagedProperty.Queryable and ManagedProperty.SummaryType properties. Queryable must be true and the SummaryType cannot be disabled. Both these options are available when creating a new managed property in FAST.

A convenient feature in SharePoint search is the ability to have your managed properties automatically generated when a new crawled property is discovered during crawling. This eliminates the need to have an administrator set up the managed property before your solution can start using it. This setting can be set by editing the crawled property category. Unfortunately, setting it in FAST does not work. All your managed properties must be manually created when using FAST.

Best bets for SharePoint search solutions

Microsoft is recommending standardizing on using the KeywordQuery class for custom search solutions to make it easier for your solution to seamlessly use both search technologies. However, there are still many differences between both which require your solution to add logic that depends on which technology you are using. To keep your solution clean and maintainable, I recommend that you develop your own provider based object model to abstract away the differences between SharePoint and FAST search. Your solution would then interact with a standard interface and each one of your custom providers would handle differences in syntax, schema, searching and object model dependencies.
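As a rough illustration of that provider idea, the sketch below hides only the syntax difference behind one interface. Every type and member name here is hypothetical (none of them exist in the SharePoint or FAST object models), and the keyword provider assumes single-word values that need no quoting or escaping. A real provider would also need to cover schema checks and result handling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction; these types are not part of any SharePoint assembly.
public interface ISearchSyntaxProvider
{
    // Whether the calling code should set KeywordQuery.EnableFQL for this provider.
    bool EnableFql { get; }

    // Builds a query that matches when ANY of the property/value pairs match.
    string BuildAnyMatch(IEnumerable<KeyValuePair<string, string>> terms);
}

// Keyword syntax, usable with both SharePoint search and FAST.
public sealed class KeywordSyntaxProvider : ISearchSyntaxProvider
{
    public bool EnableFql { get { return false; } }

    public string BuildAnyMatch(IEnumerable<KeyValuePair<string, string>> terms)
    {
        // Property names are case insensitive in keyword syntax; lower case is
        // used only so that both providers produce comparable text.
        string[] parts = terms
            .Select(t => t.Key.ToLowerInvariant() + ":" + t.Value)
            .ToArray();
        return string.Join(" OR ", parts);
    }
}

// FQL syntax, FAST only.
public sealed class FqlSyntaxProvider : ISearchSyntaxProvider
{
    public bool EnableFql { get { return true; } }

    public string BuildAnyMatch(IEnumerable<KeyValuePair<string, string>> terms)
    {
        // FQL property names must be lower case.
        string[] parts = terms
            .Select(t => t.Key.ToLowerInvariant() + ":equals(\"" + t.Value + "\")")
            .ToArray();

        // A single term is returned as-is rather than wrapped in or(...).
        return parts.Length == 1
            ? parts[0]
            : "or(" + string.Join(",", parts) + ")";
    }
}
```

Given the pairs Title=whatever and FileExtension=PDF, the two providers produce the Keyword and FQL examples shown earlier in this post. The calling code would pass the resulting string to KeywordQuery.QueryText and copy EnableFql to the query.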

Microsoft has made it easy to use FAST in SP2010, but in order to leverage it you still must have a deeper knowledge.

Thursday, December 9, 2010

Using SharePoint 2010 Secure Store Service Object Model

The secure store service in SP2010 is a great new feature which enables the BDC (Business Data Connectivity) service to connect to external resources. The secure store service and BDC are considered two components of SharePoint’s Business Connectivity Services. You can read more about these services in the Overview. The secure store service can also be used by your own custom SharePoint solutions to access external resources such as web services. For example, there are times when your solution may need access to external resources on another domain, so you would need to map the current user to credentials stored for that external resource. Your solution may also want to redirect users to a custom credential page to have users enter credentials for other applications, thus eliminating the need to prompt them every time they try accessing an external application. In this post I will show you how to set and get credentials for users from the secure store service object model. In addition, I will show you how to use stored credentials to access a web service. I have also put together a class that contains all the code in this posting, along with code for other secure store service tasks, such as creating different types of secure store applications and deleting credentials. The code can be downloaded from here: SecureStoreManagement.zip

The secure store service provides two basic types of applications, group and individual. Group type applications are used to assign one set of credentials to groups and individual users. An individual type application is used to store one set of credentials for each individual user. You can also create group or individual ticketed type applications. The ticketed type of application gives the ability to issue tickets to obtain credentials that will expire or time out. These are useful for more secure types of external applications. Finally, you can also create group or individual restricted type applications. The restricted application type only allows fully trusted code to obtain credentials. The examples below deal with the individual type application. The classes used in the examples can be found in Microsoft.BusinessData.dll and Microsoft.Office.SecureStoreService.dll.

 

Setting user credentials

The biggest problem that I ran into when using the secure store service object model was trying to determine which credential corresponded with which field or parameter in the external resource. When you create a secure store application you are allowed to create up to 10 fields. You set each field’s name and credential type (Windows User Name, Windows Password, Generic, PIN …). However, when using the object model you must use two different collections so you know which values you are setting or getting. You must use the collection of TargetApplicationField and the collection of ISecureStoreCredential. In an individual type of application each set of credentials must be associated with a SecureStoreServiceClaim, which you can create from a user’s login. The example below takes a user name, password, domain name and a user login to create the credentials and add them to the SecureStoreCredentialCollection in the same order as the TargetApplicationField collection. This ensures that you can retrieve a specific credential from the SecureStoreCredentialCollection for a given TargetApplicationField. Finally, when creating the credential you must store the value as a System.Security.SecureString. The downloadable code contains the simple code to do this.

public static void SetUserCredentials(string userName,
    string userPassword,
    string domain,
    string targetApplicationID,
    string userLogin)
{
    // Build the secure store claim that identifies the owner of the credentials.
    SPClaim claim = SPClaimProviderManager.CreateUserClaim(userLogin,
        SPOriginalIssuerType.Windows);
    SecureStoreServiceClaim ssClaim = new SecureStoreServiceClaim(claim);

    SPServiceContext context =
        SPServiceContext.GetContext(SPServiceApplicationProxyGroup.Default,
        SPSiteSubscriptionIdentifier.Default);

    SecureStoreServiceProxy ssp = new SecureStoreServiceProxy();
    ISecureStore iss = ssp.GetSecureStore(context);

    // The fields defined for the target application, in their defined order.
    IList<TargetApplicationField> applicationFields =
        iss.GetApplicationFields(targetApplicationID);

    IList<ISecureStoreCredential> creds =
        new List<ISecureStoreCredential>(applicationFields.Count);

    using (SecureStoreCredentialCollection credentials =
        new SecureStoreCredentialCollection(creds))
    {
        // Add one credential per field, in field order, so that positions in
        // both collections line up when the credentials are read back.
        foreach (TargetApplicationField taf in applicationFields)
        {
            switch (taf.Name)
            {
                case "Windows User Name":
                    creds.Add(new SecureStoreCredential(MakeSecureString(userName),
                        SecureStoreCredentialType.WindowsUserName));
                    break;

                case "Windows Password":
                    creds.Add(new SecureStoreCredential(MakeSecureString(userPassword),
                        SecureStoreCredentialType.WindowsPassword));
                    break;

                case "Domain":
                    creds.Add(new SecureStoreCredential(MakeSecureString(domain),
                        SecureStoreCredentialType.Generic));
                    break;
            }
        }

        iss.SetUserCredentials(targetApplicationID, ssClaim, credentials);
    }
}
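The MakeSecureString helper called above lives in the downloadable code and is not shown in this post. A minimal sketch of what such a helper might look like follows; the class name is made up and the body is an assumption, not the code from the download.

```csharp
using System;
using System.Security;

// Hypothetical stand-in for the helper in the downloadable code.
public static class SecureStringHelper
{
    // Copies a plain string into a read-only SecureString, one character at a time.
    public static SecureString MakeSecureString(string value)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        SecureString secure = new SecureString();

        foreach (char c in value)
            secure.AppendChar(c);

        secure.MakeReadOnly(); // prevents further modification
        return secure;
    }
}
```

Note that the plain string passed in still sits in managed memory, so this only protects the value from the point where it is handed to the secure store.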

Getting user credentials

Getting a user’s credentials for a particular application is very straightforward. The code below takes an application ID and gets the current user’s credentials. This method works for both group and individual type applications. The key thing to note is that it returns the credentials of the currently logged in user. If you are hoping to obtain the credentials for another user, you will be out of luck. The object model has no methods to retrieve credentials for other users. The internal code uses the current thread’s identity to look up the credentials in the secure store database. Even if you were able to somehow change the identity of the thread, that identity would also have to be logged in. There should be no need to impersonate another user when you can just as easily map credentials to groups and individual users.

public static SecureStoreCredentialCollection GetCredentials(string targetApplicationID)
{
    SecureStoreCredentialCollection credentials = null;
    SPServiceContext context =
    SPServiceContext.GetContext(SPServiceApplicationProxyGroup.Default,
    SPSiteSubscriptionIdentifier.Default);

    SecureStoreServiceProxy ssp = new SecureStoreServiceProxy();
    ISecureStore iss = ssp.GetSecureStore(context);
    credentials = iss.GetCredentials(targetApplicationID);
    return credentials;
}

 

Using user’s secure store credentials

Now you can set and get credentials. So how do you use them? The code below shows how to get the current user’s credentials for an external application and create new credentials to call a web service. You can have many credentials, but your code needs to know which ones map to which fields to use them effectively. The code takes an application ID, gets the credentials, and then walks the list of TargetApplicationField objects. It maps each value to a variable by using the index position of the TargetApplicationField in the list, which corresponds to the same position in the list of ISecureStoreCredential. The new NetworkCredential can then be used to call a web service. I chose to use it with the SharePoint Lists web service to return the schema of a list. The secure store credential is stored as a System.Security.SecureString, so you must translate it back to a string.

public static void UseSecureStoreCredentials(string targetApplicationID)
{

    listservice.Lists listsProxy = new listservice.Lists();
    listsProxy.Url = "http://basesmc2008/_vti_bin/lists.asmx";
    listsProxy.UseDefaultCredentials = false;
    string userName = string.Empty;
    string userPassword = string.Empty;
    string domain = string.Empty;

    using(SecureStoreCredentialCollection ssCreds =
        GetCredentials(targetApplicationID))
    {

        IList<TargetApplicationField> applicationFields =
            GetTargetApplicationFields(targetApplicationID);

        if (ssCreds != null && ssCreds.Count() > 0)
        {
            foreach (TargetApplicationField taf in applicationFields)
            {
                switch (taf.Name)
                {
                    case "Windows User Name":
                        userName =
                            ReadSecureString(ssCreds[applicationFields.IndexOf(taf)].Credential);
                        break;

                    case "Windows Password":
                        userPassword =
                            ReadSecureString(ssCreds[applicationFields.IndexOf(taf)].Credential);
                        break;

                    case "Domain":
                        domain =
                            ReadSecureString(ssCreds[applicationFields.IndexOf(taf)].Credential);
                        break;

                }
            }

            NetworkCredential externalCredential =
                new NetworkCredential(userName, userPassword, domain);

            listsProxy.Credentials = externalCredential;

            XmlNode listNode = listsProxy.GetList("shared documents");
        }
    }

}
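The ReadSecureString helper used above is also part of the downloadable code. One common way to translate a SecureString back to a string is through the Marshal class, as in the sketch below; the class name is made up and the body is an assumption about how such a helper could be written, not the code from the download.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;

// Hypothetical stand-in for the helper in the downloadable code.
public static class SecureStringReader
{
    // Copies the SecureString into unmanaged memory, reads it as a managed
    // string, then zeroes and frees the unmanaged copy.
    public static string ReadSecureString(SecureString secure)
    {
        if (secure == null)
            throw new ArgumentNullException("secure");

        IntPtr ptr = IntPtr.Zero;
        try
        {
            ptr = Marshal.SecureStringToBSTR(secure);
            return Marshal.PtrToStringBSTR(ptr);
        }
        finally
        {
            if (ptr != IntPtr.Zero)
                Marshal.ZeroFreeBSTR(ptr);
        }
    }
}
```

The returned string is an ordinary managed string, so it is best kept in scope only as long as it takes to build the NetworkCredential.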

 

Storing it all up

The secure store service is one component of SharePoint’s Business Connectivity Services and is used to enable integration of external resources with Microsoft Office. Here I have shown how your own solutions can leverage this service. Microsoft could have made this object model better by giving developers the ability to navigate the credentials without needing the target application fields. However, having a central service to store, retrieve and manage external application credentials can enable SSO solutions between SharePoint and CRM systems. It can help you with NTLM “double hop” issues where credentials cannot be transferred across more than one computer boundary. The secure store service provides a way to store credentials securely rather than hard coding them in code or configuration files. SP2010 is making it easier to create more sophisticated enterprise solutions.