SharePoint 2010 Content Source loading very slow in Central Admin Website

******************
UPDATE JUNE 10TH 2011

This issue has been resolved and confirmed – included in April 2011 Cumulative Update, as well as Service Pack 1.

The following PowerShell property has been added to the Search Service Application. This property will allow you to set the number of days to retain the crawl log history:

CrawlLogCleanUpIntervalInDays

Example PowerShell to set the property:

1. Run the following to get a list of the search service application Ids.

Get-SPServiceApplication | Where {$_.TypeName -eq "Search Service Application"}

2. Run the following command to get the Search Service Application Object. Replace the Id with the Id of your search application that was output in the previous PowerShell command.

$searchApp = Get-SPServiceApplication | Where {$_.Id -eq "<your content service application id>"}

3. Set CrawlLogCleanUpIntervalInDays to 10

$searchApp.CrawlLogCleanUpIntervalInDays = 10
$searchApp.Update()

******************
UPDATE MAY 5TH 2011

MS Support have been in touch to say that the issue has been resolved as part of the April 2011 Cumulative Update, however I cannot see any mention of this issue being resolved in the Knowledge Base article itself. I have contacted them to confirm.

******************
UPDATE APRIL 6TH 2011

I believe the stored procedure in the Search Service Application DB (on my wizard-built install this is Search_Service_Application_DB_<GUID>) called proc_MSS_CleanupWithInterval is the procedure responsible for removing old history data after the hardcoded 90 days.

USUAL CAVEAT/NOTE/WARNING – MAY NOT BE DIRECTLY SUPPORTED BY MS – DANGER WILL ROBINSON etc etc

Running the following command in SQL Server on the Search Service Application DB:

exec [proc_MSS_CleanupWithInterval] @CleanupInterval=60

will remove all search crawl history items older than 60 days. I’ve tested this on one environment and it worked fine (and check that stored proc for the strangest bit of SQL I think I’ve seen… you have to see it to believe it!). IMHO you would be pretty safe running this manually as it goes through and deletes corresponding entries from:

MSSCrawlHistory

MSSCrawlComponentsState

MSSCrawlComponentsStateLog

MSSCrawlComponentsStateTemp

MSSCrawlHistoryLog

MSSCrawlStoresState

which neatly reduces the size of your crawl history table to a much more managable, and more importantly speedier, size. I still prefer the option of adding an index to the table myself – I like my crawl history! And besides which you may need to run analysis over crawl history data at some stage for troubleshooting purposes. But the proposed fix sounds like it will help with the central issue – slow loading of content source history.

******************
UPDATE 31-March-2011
My MS Support case has been updated – the issue will be addressed in the April 2011 CU for SharePoint 2010.
“The fix is to expose crawl log cleanup interval as property of a Search Service Application via powershell. Previously it is hardcoded to 90 days, which causes the slow loading of the listcontentsources.aspx when there is a large crawlhistory table. ”
I prefer the table index fix myself as that would cater for more aggressive search interval requirements, but I guess this works too.
******************

When using Central Administration to view content sources for our FAST Search Connector (Search Service Application) the content source history and content source pages takes a very long time to load. In November it took 40 seconds, in December it took 60 seconds, and now at the end of January it takes 100 seconds. I have five content sources made up of three Intranet sites and two very small static websites. Total content is around 20,000 items (pages/data/list items/etc). Two of the Intranet sites are set to index incrementally every 15 minutes, and the third is set to every 5 minutes (it needs to most up-to-date search results and changes the most).

Looking in the SharePoint 2010 log files there were several entries similar to the following:

12/24/2010 10:02:36.61 w3wp.exe (0x1B6C) 0x16F8 SharePoint Server Database fa43 High Slow Query Duration: 33749.6701396407 4fb4bc60-f4a4-4467-9d63-03dee577b04b

which isn’t hugely helpful but does point to the fact that something in SQL Server is taking A Long Time ™.

Using SQL Server Management Studio – Manager Activity Monitor (available via right-clicking on the Server Name in Ent Manager and selecting “Activity Monitor” underneath “Recent Expensive Queries”) you should find any long-running queries that SQL Server has logged. You can right-click on these queries to view the query text. On our server I found the following:

SELECT
max(a.CrawlID) as CrawlID ,
max(a.CrawlType) as CrawlType ,
max(a.ContentSourceID) as ContentSourceID ,
max(a.Status) as Status ,
max(a.StartTime) as StartTime ,
max(a.EndTime) as EndTime ,
sum(c.SuccessCount) as Success ,
sum(c.WarningCount) as Warning ,
sum(c.ErrorCount ) as Errors ,
sum(c.DeleteCount) as Deletes ,
sum(c.NotModifiedCount) as NotModified,
sum(c.SecurityOnlyCount) as SecurityUpdates ,
sum(c.LevelHighErrorCount) as TopLevelError
FROM dbo.MSSCrawlHistory a INNER JOIN @TEMP b
ON a.ContentSourceID = b.id INNER JOIN
(
select CrawlID,sum(SuccessCount)as SuccessCount,sum(WarningCount)as WarningCount,sum(ErrorCount)as ErrorCount,sum(DeleteCount) as DeleteCount,sum(SecurityOnlyCount) as SecurityOnlyCount,sum(SecurityOnlyErrorCount) as SecurityOnlyErrorCount,sum(NotModifiedCount) as NotModifiedCount,sum(LevelHighErrorCount) as LevelHighErrorCount from dbo.MSSCrawlComponentsState GROUP BY CrawlID
union
select CrawlID,sum(SuccessCount)as SuccessCount,sum(WarningCount)as WarningCount,sum(ErrorCount)as ErrorCount,sum(DeleteCount) as DeleteCount,sum(SecurityOnlyCount) as SecurityOnlyCount,sum(SecurityOnlyErrorCount) as SecurityOnlyErrorCount,sum(NotModifiedCount) as NotModifiedCount,sum(LevelHighErrorCount) as LevelHighErrorCount from dbo.MSSCrawlComponentsStateLog GROUP BY CrawlID
) c
ON a.CrawlID=c.CrawlID
AND
a.CrawlID IN
(SELECT TOP (@Count) e.CrawlID FROM dbo.MSSCrawlHistory e
WHERE e.ContentSourceID = a.ContentSourceID
ORDER BY e.StartTime DESC
)
WHERE a.StartTime IS NOT NULL
GROUP BY a.CrawlID,a.ContentSourceID,a.StartTime
ORDER BY a.StartTime DESC

You can also view the execution plan by right-clicking on the query as well:

Expensive sort taking up precious query time

Expensive sort operation taking up precious query time

The MSSCrawlHistory table stores a record for each successful/unsuccessful crawl attempted/completed on a content source, and is set to store a maximum of 90 days of crawl history. The MSSCrawlHistory table in the FAST Content database has no indexes on the field “StartTime”. As per the query above, the ORDER BY StartTime DESC hammers this query, and the situation gets worse as the size of data in the MSSCrawlHistory table increases. As an example, I have about 20,000 entries in this table and the query takes 100 seconds+. I should add I have a very aggressive content source incremental indexing happening (once per 5 minutes 12 hours per day) because one of our sites relies heavily on accurate data for searching.

PLEASE NOTE THE FOLLOWING CHANGES WERE MADE IN A TEST ENVIRONMENT – I have not cleared this with Microsoft as yet (support call in place) as the following scripts adds an Index to the MSSCrawlHistory table which in theory invalidates your warranty for SharePoint 2010. SO ONCE AGAIN DO NOT IMPLEMENT THIS IN PRODUCTION WITHOUT REALISING THE CONSEQUENCES!

While I can’t change the query that is being executed, I can add an Index to the SQL table to help speed things up a lot. This SQL script will add a DESCending index on StartTime to the MSSCrawlHistory table. This improved my query times from 100seconds to <1 second. This works a treat in my test environment, I am currently checking with Microsoft to see if this would become a supported fix/hotfix. Adding an index would only impact a table when there is a lot of data being added to it – however in this case the Crawl History table logs the results of crawls and is quite a small table. It should be reasonably safe to run – in theory 🙂

/* To prevent any potential data loss issues, you should review this script in detail before running it outside the context of the database designer.*/
BEGIN TRANSACTION
SET QUOTED_IDENTIFIER ON
SET ARITHABORT ON
SET NUMERIC_ROUNDABORT OFF
SET CONCAT_NULL_YIELDS_NULL ON
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
COMMIT
BEGIN TRANSACTION
GO
CREATE NONCLUSTERED INDEX IX_MSSCrawlHistory_StartTime ON dbo.MSSCrawlHistory
(
StartTime DESC
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
ALTER TABLE dbo.MSSCrawlHistory SET (LOCK_ESCALATION = TABLE)
GO
COMMIT

Advertisements

3 Responses to “SharePoint 2010 Content Source loading very slow in Central Admin Website”

  1. testing2 Says:

    I’ve the same issues. Did you receive any answers from Microsoft regarding your question:

    UPDATE MAY 5TH 2011

    MS Support have been in touch to say that the issue has been resolved as part of the April 2011 Cumulative Update, however I cannot see any mention of this issue being resolved in the Knowledge Base article itself. I have contacted them to confirm.

    • gavinmckay Says:

      No, I haven’t received any confirmation. But I have tested the Powershell command in my SharePoint environment, and it does work. I set it to 10 days and it automatically removed entries older than 10 days. This changed the content source speed to seconds, instead of minutes!

  2. DN Says:

    We also have the same problem.

    UPDATE JUNE 10TH 2011:

    We executed your code examples. Is it correct, that the Shell does not give a hint after entering and submitting the code? There was no error message, but also no success message!?
    2nd question: How long did it take, after proceeding the code, until the content sources were faster again?
    Thanks for your help anyway.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: