Friday, June 24, 2011

Advanced Search for SharePoint 2010 Hold and eDiscovery


In the last ten years, “eDiscovery” has become more important as many companies deal with litigation. Courts can require companies to search for and produce evidence within electronic documents. It is the responsibility of the company to put these documents on “hold”. “Holding” a document can lock the document in place, preventing it from being edited, moved, deleted, or checked out. The document can also be copied to an authorized document or record center. Hold and eDiscovery is a very important ECM task.

Recently I have been working with the SharePoint 2010 Hold and eDiscovery feature, and it left me wanting more. Unfortunately, the feature is not easy to use. For example, it is very difficult to construct an effective keyword search. First, the user must have an intimate knowledge of the SharePoint keyword syntax. Second, the text box used to enter the keyword search shows only the first 25 characters. Constructing complex keyword searches within such a small space is nearly impossible.

Getting search administrators to adopt this feature would be difficult given these problems.  Fortunately, SharePoint offers the Search Center and the Advanced Search Web Part that can enhance the Hold and eDiscovery feature.

In this post I will show you how to add a “Hold and Discover” tab to an enterprise search center. This tab will host an Advanced Search Web Part, which is an excellent tool for users to easily construct complex keyword searches. The Advanced Search Web Part will post its keyword search to the SearchAndAddToHold.aspx page, which is the page used for creating Holds. The keyword search can then be displayed in the text box and used for creating Holds. Finally, I will show you how to add a custom button that launches the Text Editor Web Page Dialog to display the complete keyword search and let users construct keyword searches in a larger text space.

Create an Advanced Search Page

This step assumes you have already created an enterprise search center. The search center gives users a consistent user interface for different types of searches. In this case we're creating a Hold and Discover type of search. To create a new Advanced Search Page, navigate to your search center. From the Site Actions menu select More Options.

The More Options dialog gives you the option to create a publishing page. The Search Center framework contains four types of publishing pages: Advanced Search, Search Results, Search Box and People Search. In the Create dialog select Publishing Page and then click Create. Enter HoldQueryBuilder as the title and URL, select the Advanced Search page layout, and then click Create.

 

Add a new Hold and Discover tab to your search center

Your next step is to add a new tab to your search center. Click on the Site Actions menu and select the Edit Page menu item. Here you will see an Add New Tab link. Click on this link to add a new tab.

In the Tab Page Name text box, enter “Hold and Discover”. Enter HoldQueryBuilder.aspx as the page that will be displayed when a user clicks on the tab. This is the Advanced Search Page created in the previous step.

 

Customize the Advanced Search Web Part

The next step is to customize the Advanced Search Web Part on the HoldQueryBuilder.aspx advanced search page. Navigate to the advanced search page located at http://servername/yoursearchcenter/pages/HoldQueryBuilder.aspx. Select the Site Actions menu and then select the Edit Page menu item. Select the Edit Web Part menu item from the web part's context menu in the upper right-hand corner. In the web part tool pane expand the Search Box section. In the Search Box Section Label enter “Hold and Discover Documents that have…”. Next, expand the Miscellaneous section. In the Results URL enter the URL of the SearchAndAddToHold.aspx page. This is the page for Holds and eDiscovery and is located in the _layouts directory, for example, http://servername/_layouts/SearchAndAddToHold.aspx. The Results URL is where the Advanced Search Web Part will send the constructed keyword search, using a query string appended to this URL. Click OK when done. Finally, in the Page tab, click the Check In ribbon button, then in the Publish tab, click the Publish ribbon button.
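To make the hand-off concrete, here is a minimal sketch of how a keyword search could end up on the Results URL. It assumes the search travels in a query-string parameter named k (the name the script later in this post reads). buildResultsUrl is a hypothetical helper written for illustration; it is not part of SharePoint and is not how the web part itself is implemented.

```javascript
// Hypothetical helper (not a SharePoint API): appends a keyword search to a
// results URL as the "k" query-string parameter, URL-encoding the value.
function buildResultsUrl(resultsUrl, keywordQuery) {
    // Use "&" if the URL already carries a query string, otherwise start one.
    var separator = resultsUrl.indexOf("?") === -1 ? "?" : "&";
    return resultsUrl + separator + "k=" + encodeURIComponent(keywordQuery);
}

// Example call with the Results URL used in this post.
var exampleUrl = buildResultsUrl(
    "http://servername/_layouts/SearchAndAddToHold.aspx",
    'title:"annual report"');
```

The point to notice is that the keyword search arrives URL-encoded, so the receiving page has to decode it before showing it to the user.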

 

Almost Finished

So below is what you have so far. You have a search center with a new tab called “Hold and Discover”. When a user clicks on it they are presented with the customized Advanced Search Web Part. After filling in search criteria and clicking the search button, the user is taken to the SearchAndAddToHold.aspx page.

Hooking it all up

When you click on the search button, the Advanced Search Web Part appends the keyword search as a query string argument to the SearchAndAddToHold.aspx URL and navigates to it. Now we need to inject some JavaScript into the SearchAndAddToHold.aspx page to grab the keyword search, URL decode it, and insert it into the search criteria text box. Below is a screen shot showing how the keyword search is displayed after the injection.

To capture the query string and place the keyword search in the search criteria text box, the following JavaScript should be inserted at the bottom of the SearchAndAddToHold.aspx page.

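<script type="text/javascript">

    // Read the "k" query string value appended by the Advanced Search Web Part.
    var keywordQuery = querySt("k");

    // If a keyword search was passed in, decode it and show it in the search criteria text box.
    if (keywordQuery !== undefined) {
        document.getElementById('<%=m_tbSearchString.ClientID%>').value = decodeURIComponent(keywordQuery);
    }

    // Returns the raw (still URL-encoded) value of the named query string
    // parameter, or undefined if the parameter is not present.
    function querySt(name) {
        var query = window.location.search.substring(1);
        var pairs = query.split("&");

        for (var i = 0; i < pairs.length; i++) {
            var pair = pairs[i].split("=");

            if (pair[0] == name) {
                return pair[1];
            }
        }
    }

    // Opens the dialog at builderUrl for the given text box (editorId) and
    // copies the dialog's return value back into the text box.
    function loadViewer(builderUrl, editorId, dialogFeatures) {
        var pReturnValue = showModalDialog(builderUrl, editorId, dialogFeatures);
        editorId.value = pReturnValue;
        try {
            editorId.focus();
        }
        catch (exception) {
            // Setting focus is best effort; ignore any failure.
        }
    }

</script>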

This JavaScript code is executed when the page loads. It calls the “querySt” function to grab the value of the “k” query string variable, which contains the keyword search built by the Advanced Search Web Part. If the value is present, the built-in JavaScript function decodeURIComponent decodes it, since it is URL encoded. The script then places the decoded value in the search criteria text box.
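A slightly more defensive variant of the lookup can be sketched as follows. getQueryParam is a hypothetical helper, not something SharePoint provides. It takes the query string as an argument, splits each pair on the first "=" only, and converts "+" to a space before decoding. Whether the Advanced Search Web Part ever emits "+" for spaces is an assumption on my part; the conversion is there only as a precaution.

```javascript
// Hypothetical helper: returns the decoded value of one query-string
// parameter, or undefined when the parameter is absent.
function getQueryParam(search, name) {
    // Accept both "?k=a" and "k=a".
    var query = search.charAt(0) === "?" ? search.substring(1) : search;
    var pairs = query.split("&");

    for (var i = 0; i < pairs.length; i++) {
        // Split on the first "=" only, so values containing "=" stay intact.
        var eq = pairs[i].indexOf("=");
        var key = eq === -1 ? pairs[i] : pairs[i].substring(0, eq);

        if (key === name) {
            var raw = eq === -1 ? "" : pairs[i].substring(eq + 1);
            // Form-style encoding writes spaces as "+"; convert before decoding.
            return decodeURIComponent(raw.replace(/\+/g, " "));
        }
    }
    return undefined;
}
```

On the page it would be called as getQueryParam(window.location.search, "k"), and the result could be assigned directly to the text box without a second decoding step.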

Something is missing

You can now leverage the Advanced Search Web Part to easily construct complex keyword searches without having to know the keyword search syntax. As you can see, the keyword search built in the example cannot be completely viewed by the user. It would be nice to view the complete query. This would help the user learn the syntax or even add additional conditions. Many SharePoint web parts use the Text Editor Web Page Dialog. For example, the Core Search Results Web Part uses it to help users edit XML to add additional return properties to display. The following HTML can be added to the SearchAndAddToHold.aspx page to display a button next to the search criteria text box. The button calls the “loadViewer” JavaScript function listed in the previous code. This function launches the Text Editor with the keyword search shown. The user can now view the complete keyword search and even edit it.
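If you would rather wire up such a button from script than from markup, one possible approach is sketched below. addViewerButton is a hypothetical helper, and the button label, builderUrl and dialogFeatures values are placeholders you would have to supply yourself; none of them come from SharePoint.

```javascript
// Hypothetical helper: creates a button, places it right after the given
// text box, and passes the text box to the handler when the button is clicked.
function addViewerButton(doc, textBox, label, onClick) {
    var button = doc.createElement("input");
    button.type = "button";
    button.value = label;
    button.onclick = function () {
        onClick(textBox);
    };
    // insertBefore with a null reference node appends at the end of the parent.
    textBox.parentNode.insertBefore(button, textBox.nextSibling);
    return button;
}

// Usage on the page (builderUrl and dialogFeatures are placeholders):
// addViewerButton(document,
//     document.getElementById('<%=m_tbSearchString.ClientID%>'), "...",
//     function (box) { loadViewer(builderUrl, box, dialogFeatures); });
```

Passing the document and the handler in as arguments keeps the helper independent of the page, so the same function works with whichever dialog URL you decide to use.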

 

 

Summary

It is important to remember that only site collection administrators and users given permission to the Holds list can access the SearchAndAddToHold.aspx page. If a user without this permission clicks the search button, an “Access Denied” page will be displayed. In addition, you should create a backup copy of the SearchAndAddToHold.aspx file located in the _layouts directory. You could even make a copy, rename it, make your changes there, and have the Advanced Search Web Part use the new page.

This post has shown how to create a useful Hold and eDiscovery search. The Advanced Search Web Part is a great tool for creating complex keyword searches without having to know the syntax. The actual search can then be viewed or edited easily using the familiar Text Editor used throughout SharePoint. The Hold and eDiscovery process is an important process in an ECM system. Should it not be easy to use? You can use this example as a starting point to help record administrators be more productive.

Wednesday, June 1, 2011

The Life and Times of a SharePoint Search Results Click Event


In my previous post I wrote about how to customize your own ranking models. Ranking models are used to sort search results according to their relevance: the higher the rank of a document, the higher it is listed in the search results. SharePoint Search 2010 implements the BM25 ranking model. This model uses field weighting (managed properties) for dynamic ranking. For instance, some properties, like title or social rating, are more important than others. For static ranking calculations, SharePoint Search ranks a document higher if it was visited (clicked through) from a search results page. In this post I will explain how SharePoint Search tracks click-through events made by users from the search results page. I will also show how custom search applications can use a SharePoint web service and the object model to log click-through events.

 

Enable Query Logging

The first step of making sure search result click through events are used in ranking calculations is to enable query logging in the Search Service Application associated with the web application.  This can be enabled by navigating to the Search Service Application in Central Administration and clicking the Enable link next to Query Logging.

 

 

The Life of a Search Results Click Event

The flow diagram below illustrates the steps taken to process a search result click through in order for it to affect the relevance ranking of the document.

 

The first step of this process is initiated by the Core Search Results Web Part using JavaScript. The web part emits JavaScript to call the Search web service’s RecordClick method when a user clicks on the title or the URL of the result item. You can easily verify this using Fiddler. In the second step the Search web service calls the associated Search Service Application’s RecordClick method. The service application then calls the internal QueryLogger’s RecordClick method which in turn then calls the internal QueryLogQueue. This object caches the information sent and periodically flushes the data to the MSSQLogUnprocessed table contained in the Search Service Application’s database. These unprocessed URLs are processed and used in the next ranking calculation.
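The cache-then-flush behavior described for the QueryLogQueue can be pictured with a small conceptual sketch. This is not SharePoint's implementation, which is internal. ClickLogQueue, writeBatch and maxPending are hypothetical names, and the size-based trigger stands in for the periodic flush, which in practice a timer could drive by calling flush().

```javascript
// Conceptual sketch only: buffers click records in memory and hands them to
// a writer in batches instead of writing each click individually.
function ClickLogQueue(writeBatch, maxPending) {
    this.pending = [];
    this.writeBatch = writeBatch;   // callback that persists one batch
    this.maxPending = maxPending;   // flush once this many clicks are queued
}

// Queue one click and flush when the buffer is full.
ClickLogQueue.prototype.record = function (click) {
    this.pending.push(click);
    if (this.pending.length >= this.maxPending) {
        this.flush();
    }
};

// Hand all queued clicks to the writer and start a fresh buffer.
ClickLogQueue.prototype.flush = function () {
    if (this.pending.length === 0) {
        return;
    }
    var batch = this.pending;
    this.pending = [];
    this.writeBatch(batch);
};
```

Batching like this keeps the cost of recording a click low for the user who clicked, at the price of a short delay before the click reaches the database.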

In addition to tracking click-through events from search result items, the web part also tracks click-through events for Best Bets. Best Bets are suggested links set up by search administrators that represent the most relevant documents for given keyword terms. Best Bet click-through events help manage Best Bets through the Best Bet Usage page in the Search Keywords section of Site Collection Administration. The search administrator can use this information to determine whether Best Bets are being used.

 

Enable Custom Search Applications Click Through

SharePoint Search only records the search result click-through event if query logging is enabled and the search results are hosted in the Core Search Results Web Part. Custom search applications that do not use this web part must implement the tracking themselves. There are two ways to record search result click-through events: the Search web service’s RecordClick method, or the SearchServiceApplication class’s RecordClick method.

 

Search Web Service RecordClick Method

Custom search applications running remotely should use the Search web service to record a click-through event. The important task is to create and populate the XML that is sent to this method. The XML represents a serialized Microsoft.Office.Server.Search.Query.QueryInfo class. If you reflect the QueryInfo class you will see custom XML serialization attributes and elements defined for each property. The attribute names are cryptic. Below is an example of an XML serialized QueryInfo object:

The “c” element represents the URL that was clicked. The “g” attribute is required and represents the GUID of the SPSite that the search was executed from. The “qi” element is required and represents the ID of the query. In the case of custom search applications you can create your own GUID for the query ID. Finally, the “t” attribute is required and represents the time that the search was executed. You must send it in UTC format. Below is sample code for recording a click-through event using the Search web service.

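public static void RecordSearchClick()
{
    // Build the serialized QueryInfo XML that RecordClick expects.
    string xmlTemplate = "<i  ";
    // g: ID of the SPSite the search was executed from (required).
    xmlTemplate += "g=\"e1c43a36-4dcf-4fd9-9287-30e7773c4159\" ";
    // t: time the search was executed, expressed in UTC (required).
    xmlTemplate += "t=\"2011-06-01T02:49:28.0642745Z\" ";
    xmlTemplate += "xmlns=\"urn:Microsoft.Search\"> ";
    // qi: query ID; a custom application can generate its own GUID.
    xmlTemplate += "<qi>" + Guid.NewGuid().ToString() + "</qi>";
    // c: URL of the search result that was clicked.
    xmlTemplate += "<c>http://basesmc2008/tester/pdfimg.pdf</c> ";
    xmlTemplate += "</i>";

    // search.QueryService is the proxy class for the search.asmx web service.
    search.QueryService queryService = new search.QueryService();
    queryService.Url = "http://basesmc2008/_vti_bin/search.asmx";
    queryService.UseDefaultCredentials = true;
    queryService.RecordClick(xmlTemplate);
}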
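
If the click is recorded from browser script instead, the same payload can be assembled in JavaScript. The sketch below only builds the XML string described above; it does not call the web service. buildClickInfoXml and escapeXml are hypothetical helpers, and the assumption that the service accepts the timestamp format produced by toISOString() is mine and has not been verified.

```javascript
// Hypothetical helper: escapes characters that are special in XML text and
// attribute values, so a URL containing "&" does not produce malformed XML.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Hypothetical helper: builds the serialized QueryInfo XML with the "g" and
// "t" attributes and the "qi" and "c" elements described in this post.
function buildClickInfoXml(siteGuid, queryId, clickedUrl, searchTime) {
    return '<i g="' + escapeXml(siteGuid) + '"' +
        // toISOString() always reports the time in UTC.
        ' t="' + searchTime.toISOString() + '"' +
        ' xmlns="urn:Microsoft.Search">' +
        "<qi>" + escapeXml(queryId) + "</qi>" +
        "<c>" + escapeXml(clickedUrl) + "</c>" +
        "</i>";
}
```

Escaping matters here because result URLs often carry their own query strings, and an unescaped "&" would make the XML invalid.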

 

Search Service Application RecordClick Method

Custom search applications that can access the SharePoint object model should use the Microsoft.Office.Server.Search.Administration.SearchServiceApplication object associated with the web application they are running in. This class exposes a RecordClick method that takes a QueryInfo object as an argument. The Search web service's RecordClick method calls this same method: it de-serializes the XML sent to it into a QueryInfo object and passes that object as the argument. The code below shows an example.

SPFarm farm = SPFarm.Local;
SearchServiceApplication searchApp = (SearchServiceApplication)farm.Services.
    GetValue<SearchQueryAndSiteSettingsService>().
    Applications.GetValue<SearchServiceApplication>("Search Service Application");

using(SPSite site = new SPSite("http://basesmc2008"))
{
    QueryInfo qi = new QueryInfo();
    qi.QueryGuid = Guid.NewGuid().ToString();
    qi.ClickedUrl = "http://basesmc2008/tester/pdfimg.pdf";
    qi.SearchTime = DateTime.Now;
    qi.ClickTime = DateTime.Now;
    qi.SiteGuid = site.ID.ToString();
    searchApp.RecordClick(qi);
}

 

Relevant Clicking

Search result click through events influence the relevance of a document. The action is important enough for Microsoft to integrate the tracking of this action directly into the Core Search Results Web Part. If your custom search applications do not extend this web part, then you can enable the same type of tracking using one of these two methods. The QueryInfo class is the structure used to hold this information. I recommend looking at this class further. The class has many properties and is also used for logging queries for search administration reporting. Other properties may play a role in influencing relevance, for example, the NonClickedUrls property can hold a string array of URLs that were not clicked. So it appears that not clicking on search result items can lower their relevance.