Archive for the ‘Stepwise’ Category

Software boundaries and limits for SharePoint 2013 – Content database limits

April 14, 2013

One of the most oft-quoted Microsoft TechNet articles for architecting and designing SharePoint 2013 environments is Software boundaries and limits for SharePoint 2013 (and previously SharePoint Server 2010 capacity management: Software boundaries and limits). This is an excellent article that explicitly details the upper boundaries and thresholds that have been tested and are supported by SharePoint across all environments. Most of the boundaries and thresholds, such as 30,000,000 items per list, will likely never become a problem for most organisations. They do, however, need to be monitored and sized accordingly.
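If you want a repeatable way to keep an eye on those numbers, the farm's PowerShell API exposes content database sizes directly. A minimal sketch, run from the SharePoint Management Shell (the rounding is purely cosmetic):

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# List every content database with its current size in GB, so growth
# can be tracked against the documented boundaries and thresholds
Get-SPContentDatabase |
    Select-Object Name, WebApplication,
        @{Name = 'SizeGB'; Expression = { [math]::Round($_.DiskSizeRequired / 1GB, 1) }}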

The one section that is almost always included as part of any SharePoint design is the section on Content Database limits, and in particular the general usage content database size scenario. It states:

We strongly recommended limiting the size of content databases to 200 GB, except when the circumstances in the following rows in this table apply.

If you are using Remote BLOB Storage (RBS), the total volume of remote BLOB storage and metadata in the content database must not exceed this limit.

and it continues on to discuss IOPS per GB, high availability, disaster recovery, and so on. There are also references to Remote Blob Storage, which is of huge interest to me and my company’s product, Stepwise.

Whenever I meet with a client and discuss their architecture, one of the first points I hear is “and of course we are planning to split up our Content Databases so they don’t get bigger than 200GB”. And that’s when I ask them “why?” and they say “because Microsoft”. And if you read the article, it is really quite clear – stick to 200GB. But here is why I think the article is misinformed. Let me go through the scenario line by line.

Line 1: “We strongly recommended limiting the size of content databases to 200 GB, except when the circumstances in the following rows in this table apply.”

I’ll get to this later. Suffice to say, if you have more than 200GB of content, you probably meet all the criteria that say you *can* have a larger-than-200GB content database.

Line 2: “If you are using Remote BLOB Storage (RBS), the total volume of remote BLOB storage and metadata in the content database must not exceed this limit.”

Remote Blob Storage is a fully supported Microsoft solution that enables electronic files (Binary Large OBjects, or BLOBs) to be extracted from the content database and stored on the file system. Stepwise is an example of a Remote Blob Storage system. BLOBs take up the majority of any content database, so we often find that databases shrink by as much as 95% after implementing Stepwise. So your 200GB database just became about 5GB in size.
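If you want to see how much of your own content database is BLOB data before committing to RBS, you can measure it. A hedged sketch, assuming the standard AllDocStreams table (the same table queried later in this archive) and hypothetical server and database names:

# Sum the bytes of all BLOBs still stored inline in the content database;
# rows already externalised carry a non-null RbsId and are excluded
Invoke-Sqlcmd -ServerInstance "SQL01" -Database "WSS_Content" -Query @"
SELECT SUM(CAST(DATALENGTH(Content) AS bigint)) / 1048576 AS InlineBlobMB
FROM dbo.AllDocStreams
WHERE RbsId IS NULL;
"@

Compare that figure to the overall database size and you have a reasonable estimate of how much Stepwise could externalise.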

Line 3: “Content databases of up to 4 TB are supported when the following requirements are met: Disk sub-system performance of 0.25 IOPS per GB. 2 IOPS per GB is recommended for optimal performance.”

If you are working with a medium to large organisation, and/or have a lot of data, you are probably using enterprise storage like a SAN, and most SANs offer this level of performance. As a worked example, a 4 TB (4,096 GB) content database needs 4,096 × 0.25 = 1,024 IOPS as a minimum, or 8,192 IOPS at the recommended 2 IOPS per GB. Check with your enterprise storage team, but you should be able to tick this one.

Line 4: “Content databases of up to 4 TB are supported when the following requirements are met: You must have developed plans for high availability, disaster recovery, future capacity, and performance testing.”

As with Line 3 above, any medium-to-large organisation probably has this well in hand. They will have failover plans, multiple servers to handle load and loss of service, possibly secondary and even tertiary redundant sites, and will be managing their infrastructure on a daily basis. This is usually a tick.

Line 5: “Requirements for backup and restore may not be met by the native SharePoint Server 2013 backup for content databases larger than 200 GB. It is recommended to evaluate and test SharePoint Server 2013 backup and alternative backup solutions to determine the best solution for your specific environment.”

I worked with a team to set up an enterprise backup solution, and this is what I asked them to add to their backups:

  1. Farm configuration database
  2. Content database(s)
  3. Virtual server drives (C:, D:, etc)

and if using Stepwise:

  1. Stepwise administration database
  2. File system(s) used to store BLOB data

You can include search databases in this list as well if you have them, and if they are worthwhile to back up. I have a complete article on backup/restore for Remote Blob Storage using Stepwise.
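As a rough sketch of how that checklist might be scripted with native tools – Invoke-Sqlcmd from the SQL Server snap-in, with every server, database, and path name below being hypothetical:

$sql = "SQL01"

# 1 & 2. Farm configuration database and content database(s)
"SharePoint_Config", "WSS_Content" | ForEach-Object {
    Invoke-Sqlcmd -ServerInstance $sql `
        -Query "BACKUP DATABASE [$_] TO DISK = 'E:\Backups\$_.bak';"
}

# If using Stepwise: the administration database first...
Invoke-Sqlcmd -ServerInstance $sql `
    -Query "BACKUP DATABASE [StepwiseAdmin] TO DISK = 'E:\Backups\StepwiseAdmin.bak';"

# ...then the file system(s) used to store BLOB data
robocopy \\filer\StepwiseBlobs E:\Backups\StepwiseBlobs /MIR

# 3. Virtual server drives (C:, D:, etc) are normally captured by the
#    hypervisor or an image-level backup product, not from inside the guest.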

Line 6: “It is strongly recommended to have proactive skilled administrator management of the SharePoint Server 2013 and SQL Server installations.”

You need a SQL person and a SharePoint person, or at least someone with skills in these areas if you don’t have a dedicated resource.

Line 7: “The complexity of customizations and configurations on SharePoint Server 2013 may necessitate refactoring (or splitting) of data into multiple content databases. Seek advice from a skilled professional architect and perform testing to determine the optimum content database size for your implementation. Examples of complexity may include custom code deployments, use of more than 20 columns in property promotion, or features listed as not to be used in the over 4 TB section below.”

This is more of a configuration control issue than a content database issue. In plain English, if you install things that will create dependencies and/or impact future upgrades, separate them out into their own content database. That will limit the impact the customisation and configuration can cause.

Line 8: “Refactoring of site collections allows for scale out of a SharePoint Server 2013 implementation across multiple content databases. This permits SharePoint Server 2013 implementations to scale indefinitely. This refactoring will be easier and faster when content databases are less than 200 GB.”

The issue here is the time refactoring will take. If your refactoring takes 3 hours instead of 2 hours, will that be an issue? What about 20 hours instead of 16 hours? For most organisations this is a task that is not performed often. Large content migrations can be done out of hours, with little to no downtime. You will spend more time planning the content migration and fixing things like link changes than actually performing the task itself.

Line 9: “It is suggested that for ease of backup and restore that individual site collections within a content database be limited to 100 GB. For more information, see Site collection limits.”

Most enterprise storage and backup solutions will not find this a problem, but it should be included as part of the calculations for disaster recovery. The length of time to restore your SharePoint environment is the sum total of restoring all the individual components. If your content database takes 8 hours to restore, then that is how long your environment could be down.

This, of course, is relevant if you only need to restore a single content database because it has become corrupted. But what happens if you need to restore all your site collections? Then it won’t matter if your content is split across 2 content databases or across 200 – the total time to restore them will be the same.

So let’s summarise:

  1. Don’t let the 200GB recommendation control your design.
  2. Content databases can go beyond 200GB, without issue.
  3. You need to be managing your environment properly and effectively.
  4. Backup and restore works; make sure you document the process and test it.
  5. If your environment can handle it, everything will be OK.



Remote Blob Storage

June 17, 2012

Remote Blob Storage is a Microsoft SQL Server 2008/R2 downloadable package that can be used by applications to store Binary Large Objects (BLOBs) on a filesystem, rather than storing them in an SQL Server database. The Remote Blob Storage package includes client-side components (.NET assemblies) that clients can integrate into their applications to access the Remote Blob Storage API. SharePoint 2010 has native support for Remote Blob Storage, and Stepwise can provide the integration between SharePoint 2010 and your enterprise storage environment.
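Enabling that native support is done per content database from PowerShell. This is the standard pattern from Microsoft’s RBS documentation, run on a SharePoint server once a provider (Stepwise, in our case) has been installed; the database name is hypothetical:

$cdb = Get-SPContentDatabase "WSS_Content"
$rbss = $cdb.RemoteBlobStorageSettings

$rbss.Installed()                                          # confirm a provider is registered
$rbss.Enable()                                             # switch the database over to RBS
$rbss.SetActiveProviderName($rbss.GetProviderNames()[0])   # make it the active provider
$rbss                                                      # display the resulting settings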

Remote Blob Storage with Stepwise externalises SharePoint documents to cheaper storage

There are numerous benefits to using Remote Blob Storage to externalise (i.e. not store in a varbinary(MAX) field or similar) BLOBs, including:

  1. Smaller SQL server databases
  2. BLOBs don’t use tempdb
  3. You can choose the storage type and location
  4. Purchase cheaper disk storage
  5. Use on-site or off-site storage (such as Cloud storage)

But if you implement a Remote Blob Storage solution like Stepwise as part of your SharePoint environment, what does this mean to various user roles in your business?

SharePoint Users

Your SharePoint users are not impacted at all. They carry on using SharePoint as they always have, while Remote Blob Storage and Stepwise handle the integration. In most cases they should get better open/save performance using Remote Blob Storage.

SharePoint Administrators

Implementing Remote Blob Storage with Stepwise reduces the size of SharePoint content databases. This has a number of good results for SharePoint Administrators, including quicker backup and restore times, less time to upgrade SharePoint (via Cumulative Updates and Service Packs), and a reduced need to split content into multiple site collections.

The last point is probably the most important, and one I’ve blogged about previously. Maintaining the same site collection structure instead of using the traditional method of splitting content into multiple site collections vastly reduces the overhead of managing a SharePoint environment.

Database Administrators (DBAs)

DBAs have a lot less to be concerned about. Because the BLOB files themselves (which could be quite large – up to 2GB each if you really want!) never pass through the SQL Server engine, I/O on the server can be greatly reduced. The database is smaller (a 95% reduction is quite common), so repairs are shorter, backups are shorter, and the impact on infrastructure is greatly reduced.

Also, the BLOBs don’t pass through tempdb. If you are not using Stepwise, all files are part of the SQL transaction so they end up traveling via tempdb. Remote Blob Storage is much friendlier to your SQL Server environment.

Storage Administrators

For Storage Admins, things just get a whole lot easier. For a start, they don’t need to purchase very expensive disk drives to keep their very large SharePoint databases working. This can add up to tens/hundreds of thousands of dollars worth of savings, and is one of the initial Return on Investment (ROI) savings that are gained immediately.

Storage admins don’t need to maintain large expensive environments for SQL Server. SQL Server is demanding in terms of storage I/O, which means you need great storage speed and performance.

Backup Administrators

Stepwise can be configured to match your enterprise backup retention schedule. This makes designing your backups, replication, and off-site storage solutions a lot simpler and more streamlined. Most backup designs rotate through hourly, daily, weekly, and monthly retention strategies. Stepwise supports splitting your files across different disk areas (such as network drives, folders, CIFS shares, cloud storage, etc) so your Stepwise storage design can match your backup and retention schedules.

This becomes important for backup and restore for a number of reasons:

  1. Quick recovery. In a disaster recovery scenario, you want to get your most important information back first. For a SharePoint and Stepwise integrated environment, you can restore your (much smaller) SQL Server databases, then your first tier of storage, then your second tier, and so on. Your first tier can contain your <30-day-old documents, which may be the most important and most used documents in your SharePoint site.
  2. Data deduplication. As data is “aged” and moved to older tiers of storage, it changes less and less often, so deduplication becomes highly effective. This gives you a huge amount of data storage reduction, and consequently less drive storage used.
  3. Off-site replication. Replication can be managed more efficiently because you can manage changes in different locations. Your <30-day storage replication can use a more up-to-date replication strategy, whereas your 90+ day replication may only need to be done nightly.

Backing up SharePoint 2010 and Remote Blob Storage with Data Protection Manager 2012

June 10, 2012

I’ve recently finished upgrading one of my SMB clients from Data Protection Manager 2010 (DPM) to DPM 2012. The upgrade went well, added a bunch of useful functionality, and gave me a chance to modify their long-term storage solution to use external USB drives with a Firestreamer virtual tape library. All going well so far!

One of the other components we installed was Stepwise for Remote Blob Storage, which also meant slightly modifying the protection groups to make sure their backups were consistent. I’ve already blogged about backup with Stepwise previously, but the fundamental principle is to back up your database first, then your filesystem(s) second. This ensures that every file your SharePoint database references actually exists. With incremental backups in DPM, however, you can relax this somewhat – see below for more information.

I created a Protection Group that consisted of my SharePoint 2010 content databases and a separate Protection Group for my files and network file shares. The protection groups were set up to complete an express full backup every night at 10pm, with a sync frequency of 15 minutes. DPM uses block-based backup technology when it performs express full backups, which can save an enormous amount of time and effort when backing up data. It does this by only backing up blocks on the disk that have changed, rather than iterating through the filesystem searching for individually changed files.

Protecting SharePoint 2010 and Stepwise Remote Blob Storage with DPM 2012

The issue with DPM is that you can’t force protection groups to run in a guaranteed order. For example, even though the sync frequency is every 15 minutes, there is no guarantee that my file-based protection group and my SQL Server protection group will run at *exactly* the same time. For a “perfect” backup scenario it would be better to have the backups synchronised in their proper order, but here is where I think it doesn’t really matter too much.

Scenario 1 – you need to restore your content database

Simple enough – use DPM to restore the content database. Your Remote Blob Storage files haven’t changed, so you don’t need to restore them. IMPORTANT: all Remote Blob Storage technologies are effectively Write-Once Read-Many (WORM) devices – they *never* overwrite existing files, only create new ones. Even if you save over the same document in a SharePoint library that does not have version control turned on, Stepwise creates a completely new file, as the Remote Blob Storage interface requires.

Scenario 2 – you need to restore your filesystem

This should usually only occur when there has been a hardware failure or corruption, but if so, you would restore your filesystem from the DPM protection group to the same, or another, location.

Scenario 3 – you need to restore both SharePoint and your filesystem

Complete site failure? Then hopefully you are just testing in your pre-production environment! Regardless, you would restore both protection groups to get all your data back again.

What about restoring only one file? Or one site?

Here is where things get fun. Because Stepwise has a copy of all files and all versions of files, you typically don’t need to restore your filesystem at all. As long as the SharePoint content database still has a reference to your original file, you only need to worry about the SharePoint content database. It may be as simple as restoring your content database under a different database name and then using the SharePoint Central Admin site to export and re-import your data to a location of your choosing. Because the backup of your database will have a reference to the Stepwise document id, the document can be retrieved as part of an export process.
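A hedged sketch of that approach, assuming the content database has been restored to the same SQL instance under the hypothetical name WSS_Content_Restored. The -ConnectAsUnattachedDatabase switch lets you browse the restored copy without attaching it to the farm; the export/import itself is then done through Central Admin’s unattached content database recovery:

# Connect to the restored copy without attaching it to the farm
$db = Get-SPContentDatabase -ConnectAsUnattachedDatabase `
    -DatabaseName "WSS_Content_Restored" -DatabaseServer "SQL01"

# Browse the site collections it contains to locate the lost content
$db.Sites | Select-Object Url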

For a more in-depth discussion of Remote Blob Storage, SharePoint 2010 and Stepwise, see my previous post series.

Remote Blob Storage for SharePoint 2010 with Stepwise – Part 3 Disaster Recovery

May 25, 2012

This is the third post in this series on Stepwise, Backup and Restore, and Disaster Recovery.

Using Stepwise Remote Blob Storage we have learned that the documents are “externalised” from the SharePoint SQL Server content database and stored on alternate filesystems. One of the interesting side-effects of this externalisation process is the fact that documents are created by Stepwise whenever a file is added to SharePoint, but if the same document is updated in SharePoint a completely new file is created in Stepwise. This happens regardless of whether you have version control turned on in your document library or not – every write to SharePoint results in a new file being created in Stepwise.

The interesting part of this is what happens in a disaster recovery scenario when you lose your SQL Server content database(s). Because Stepwise has a copy of every single document written to SharePoint, when restoring your data you only need to restore the SQL Server databases. The filesystem that Stepwise is using already has your previous versions of documents on it (up to a point – see Garbage Collection below). This can drastically reduce your restore time.

Furthermore, Stepwise actively collects SharePoint information and metadata as it processes documents. This information is maintained with the document and is accessible to Stepwise administrators. So here’s another nice side-effect of the process – you may have lost 4 hours of SharePoint content database changes, but Stepwise can determine what files were added in the last 4 hours and provide the metadata and documents for everything that occurred during that window. Result – no lost documents!

But what happens if you lose your filesystem, and 4 hours of that goes up in smoke? Obviously that is not a good thing, and it is definitely the time to get your restore process started, but again Stepwise can assist. Stepwise still maintains metadata and file information in its own database, and it can report back all the documents that were added or changed in the 4-hour window. So while it isn’t all good news, you can at least tell clients exactly what they have lost rather than leave them in the wilderness. And some more good news? Stepwise caches recently accessed documents locally on your web front-ends – including documents that have been uploaded to SharePoint. So we can interrogate the Stepwise cache and pull documents out of there as well.

It’s not perfect – but anything that can help you recover your data after a disaster is a good thing!

What is Garbage Collection?

Remote Blob Storage uses the term Garbage Collection to describe the clean-up process of deleting documents that are no longer referenced. As an example, consider a document that has been deleted from a document library in SharePoint. It first hits the end-user recycle bin, then the site collection recycle bin, and is then finally purged from the site collection recycle bin. After it leaves this area, SharePoint no longer maintains any connection to the document.

It is at this stage that the Garbage Collection process in Stepwise kicks in. Stepwise uses the Remote Blob Storage API to identify any documents that SharePoint no longer references. It then checks a configurable age threshold (the default is 7 days), and if the document has been unreferenced for longer than that, it physically deletes the document from the back-end storage.
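Stepwise schedules this clean-up itself, but for comparison, Microsoft’s FILESTREAM provider drives the same garbage-collection phases through the RBS Maintainer task. A typical scheduled invocation looks something like the following (the install path is the default, and the connection string name is whatever was registered at install time):

# r, d, o phases = reference scan, delete propagation, orphan cleanup
& "C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe" `
    -ConnectionStringName RBSMaintainerConnection `
    -Operation GarbageCollection `
    -GarbageCollectionPhases rdo `
    -TimeLimit 120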

In some situations, such as WORM-based configurations of Stepwise and/or for compliance reasons, Garbage Collection can be disabled completely. This ensures that all documents are maintained by the system indefinitely.

Remote Blob Storage for SharePoint 2010 with Stepwise – Part 2 Backup and Restore

May 25, 2012

You can’t have a storage conversation about SharePoint 2010, Remote Blob Storage, and Stepwise without quickly getting into Backup and Restore options.

Remote Blob Storage with Stepwise externalises documents from a SharePoint content database to external storage systems – see Part 1 of this blog series for additional information. This then presents two separate sets of data that need to be backed up, the SQL Server content database(s) and the storage devices themselves (such as a file share). Backup and restore operations need to synchronise their schedule to ensure they are capturing all data.

How Much Data Can I Afford To Lose?

This is the classic backup/restore question. It is entirely possible with today’s technology to get very close to perfect data integrity – given enough budget! Enterprise storage systems such as EMC Data Domain offer extremely high performing data backup solutions. It really depends on how much you want to spend to achieve your goals.

I recently spoke with a courier company about the impact of losing their system for a day: how much would it cost them? They keep electronic run-sheets of their jobs, obtain signatures for completed work, and have GPS systems on hand-held devices. They could recreate their entire day – it would take time, but it could be done. The cost to them of being offline for a day would be small.

Switch over to a legal services company who charges in 10-minute increments. They have multiple sites world-wide, across time-zones, and are heavily reliant on their IT systems for both case management and time management. Being offline for a day for this company would cost them tens of thousands of dollars.

Often it is situations like these that should govern your backup and restore designs. It must be matched to the business requirements and of course what is affordable!

Backing Up SharePoint when Using Stepwise

When documents are externalised (stored outside the SQL content database) via Stepwise they are stored on a filesystem, and the storage location is maintained within the Stepwise administration database. The SharePoint database is also updated to store metadata about the document as well as tracking information about the document’s usage status (i.e. is it still active, is it in the recycle bin, has it been deleted entirely).

In order to ensure you maintain all the components of a SharePoint + Stepwise installation, you can use this procedure:

1. Snapshot the SQL Server content database(s) and the Stepwise administration database

2. Backup the databases

3. Remove the snapshots

4. Backup the file share(s)

Let’s examine this in more detail.

Snapshotting the Databases

Snapshotting the databases is a technique available in most commercial backup software solutions. Snapshotting creates a read-only, static view of a SQL Server database and ensures the data is not being updated while the backup is taken. SharePoint and Stepwise can still access the primary, writable databases, but the snapshot guarantees a stable view for the backup process.

Backup the Databases

The databases contain not only the metadata for the documents in SharePoint, but they also contain the physical paths to the documents on the file system(s) configured for use by Stepwise. By backing up the databases first, you ensure that the links to the documents exist and are valid at the time the backup is taken.

Remove the Snapshots

After the backup has been completed successfully, the snapshots are no longer required. This step is usually done automatically by the backup software or SQL Server backup processes and does not need to be manually completed.

Backup the Filesystem(s)

The filesystems contain the physical files that have been stored by Stepwise on behalf of SharePoint. These need to be backed up as part of your backup/restore solution to ensure you get both your SharePoint data and the externalised files.
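Putting the four steps together, here is a minimal scripted sketch of the ordering. It assumes Invoke-Sqlcmd; all server, database, logical file, and path names are hypothetical, and most commercial backup products perform the snapshot step for you via VSS:

# 1. Snapshot the content database for a frozen, read-only view
Invoke-Sqlcmd -ServerInstance "SQL01" -Query @"
CREATE DATABASE WSS_Content_Snap
ON (NAME = WSS_Content, FILENAME = 'D:\Snaps\WSS_Content.ss')
AS SNAPSHOT OF WSS_Content;
"@

# 2. Back up the databases – the document links they hold are valid as of now
Invoke-Sqlcmd -ServerInstance "SQL01" -Query @"
BACKUP DATABASE WSS_Content TO DISK = 'E:\Backups\WSS_Content.bak';
BACKUP DATABASE StepwiseAdmin TO DISK = 'E:\Backups\StepwiseAdmin.bak';
"@

# 3. Remove the snapshot once the backups succeed
Invoke-Sqlcmd -ServerInstance "SQL01" -Query "DROP DATABASE WSS_Content_Snap;"

# 4. Back up the file share(s) last, so every file the database backup
#    references already exists on disk (Remote Blob Storage never overwrites)
robocopy \\filer\StepwiseBlobs E:\Backups\StepwiseBlobs /MIR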

Restoring Stepwise

The restore steps are similar, but depend on what data you have lost. You can read more on this in the final piece of the puzzle: disaster recovery scenarios in part 3 of this series.

Remote Blob Storage for SharePoint 2010 with Stepwise – Part 1

May 24, 2012

This post has spent a long time in draft, but as we have released the second version of our product and things are progressing nicely, now is a good time to post this.

For several years my company (Invizion) has been working on a Remote Blob Storage product for SharePoint 2010 called Stepwise. Remote Blob Storage is a technology and API provided by Microsoft that allows you to move documents out of your SQL Server database and store them on file-system based storage (think network drive or cloud storage) via a process called “externalisation”. There are numerous advantages, but the biggest ones are:

  • Your databases are smaller. Sometimes hugely smaller – up to 95% is not uncommon for SharePoint 2010 content databases in particular
  • The documents don’t travel via SQL Server at all; they are stored directly on a file-system. So SQL doesn’t get slammed with I/O, your temp databases aren’t hammered, and transactions are smaller (but not shorter – see below)
  • You can utilise existing storage systems, like high-capacity drives, which are much cheaper. SQL Server uses (or should use!) high-performance drives, which are expensive. This means organisations spend less on physical storage for their SQL Server environment
  • Backup and restore tasks are in most cases substantially quicker. Ask your backup engineer – would they prefer backing up a 1TB SQL Server database, or a 100GB SQL Server database + 900GB of documents? There are also huge improvements in Data Deduplication, since backup software deduplicates file-based data more efficiently than SQL Server databases

There are a lot of other benefits, but these are the main ones.

There are several competitors in the Remote Blob Storage provider-space, but Stepwise has some pretty unique features which I think are worth detailing here.

  1. Stepwise isn’t reliant on SharePoint. If SharePoint is down, can you still access and, more importantly, manage your documents? Stepwise can.
  2. Stepwise uses Microsoft Management Console to control all configuration of the system. No Central Admin features to deploy, no separate website, no timer jobs to run, no impact on SharePoint (beyond reading and writing the documents of course!)
  3. Stepwise manages documents, rather than just storing them. Want to add more storage? Covered. Need to move documents to a new location with no down-time? Stepwise can do that. Want to calculate the cost of cloud storage? Stepwise has inbuilt functionality to show you how much the cloud is going to hit your budget.
  4. Stepwise can integrate with your non-SharePoint applications. Stepwise is a fully-featured Content Addressable Storage (CAS) system based on Microsoft’s Remote Blob Storage technology. That means your custom applications can benefit in the same fully-supported way.

What about the 200GB Content Limit?

This is my favorite topic at the moment. The SharePoint 2010 Boundaries and Limits published by Microsoft has a section on supported content database sizes and what you need to support an infrastructure based on your planned usage of SharePoint. I have had various long and very useful discussions with Microsoft SharePoint engineers both in Australia and the US about what this actually means.

First up – the term “content database” isn’t just about the size of your SQL Server database. It is the sum total of all content that resides in your site collection(s), i.e. if it passes through a SharePoint Web Front-End, it is counted in the size of your “content database”. The reason is that Microsoft derives its scalability guidance from customer feedback and experience, based on the amount of data that is processed through SharePoint components.

Some of the clients I have spoken to were concerned about this limit, but most organisations with 500+ users are probably going to have the infrastructure to support >200GB “content databases”. The requirements are as follows:

  • Disk sub-system performance of 0.25 IOPS (Input/output Operations Per Second) per GB, with 2 IOPS per GB recommended for optimal performance. Decent local disk in a RAID configuration should meet this easily in most cases, and the majority of SAN configurations will also meet this criterion.
  • Have a good backup/restore strategy. Common-sense, often-overlooked, but achievable.
  • SharePoint 2010 Administrators. You need them – get some good ones.
  • Customisation complexity. Needs to be assessed on a case-by-case basis by each organisation.
  • Site collection refactoring. More on this later.
  • Backup and restore. See below.

Site Collection Management

Let’s look at what a site collection should be used for. From Sites and site collections overview:

“The sites in a site collection have shared administration settings, common navigation, and other common features and elements. Each site collection contains a top-level site and (usually) one or more sites below it in a hierarchical structure.”

For me this is critical for how you design your site collection, and I believe one of the factors that gets overlooked the most. It is fairly common practice to create new content databases and/or site collections to manage the size and growth of a SharePoint environment. But this presents several problems:

  • Master pages and UI customisations need to be copied over/modified
  • Administration and permissions need to be copied over
  • Navigation between site collections must be manually addressed
  • Site authors may need to redo their work
  • Search needs to be reassessed to match the content in each site collection
  • Workflows may need to be redesigned to cope with routing approvals and documents between site collections

These tasks should not be undertaken lightly!

My advice to my clients is to make sure they assess all the impacts of creating additional site collections if they are doing so just to avoid the 200GB boundary for SharePoint 2010. The considerations above are a good starting point for assessing whether supporting 200GB+ content databases is better than sticking to a <200GB size.

Continue on to the next post in this series – Backup and Restore options with Stepwise

SharePoint error: Unable to read full interval from database in ULS Logs

February 29, 2012

We have just implemented Remote Blob Storage using Stepwise and had a .Migrate() crash with the following error:

Exception calling “Migrate” with “0” argument(s): “A new blob could not be created Pool <GUID> ”
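For context, the migration was being driven from PowerShell through the content database’s Remote Blob Storage settings – the usual pattern, with a hypothetical database name, is:

$cdb = Get-SPContentDatabase "WSS_Content"
$rbs = $cdb.RemoteBlobStorageSettings
$rbs.Migrate()    # pushes existing inline BLOBs out to the RBS provider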

Over 20,000 documents had already been migrated, and general Remote Blob Storage entries were successful. Further examination of ULS log files showed the following:

Unable to read full interval from database

Background file fill operation caught exception: System.InvalidOperationException: Operation is not valid due to the current state of the object.     at Microsoft.SharePoint.CoordinatedStreamBuffer.SPBackgroundSqlFileFiller.OnReadComplete(IAsyncResult result)

The “file fill” it refers to is the document chunking that SQL Server performs when extracting data from the AllDocStreams table and pushing it into Remote Blob Storage.

It turned out that we had a mismatch between the AllDocVersions table and the AllDocStreams table in the number of bytes stored. For example, AllDocVersions said “size = 10,020 kb” but the actual binary content stored in AllDocStreams was “10,003 kb”. The Remote Blob Storage components were expecting 10,020, but only received 10,003.

I fixed this error by *gasp* very carefully modifying the AllDocVersions entry for “Size” to match the AllDocStreams entry. You MUST BE VERY CAUTIOUS WHEN DOING THIS! In our case there were three documents out of 50,000 that had this issue, and it was also stopping content export from working (I assume for the same reason – mismatched size).
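For the record, the change amounted to an UPDATE along these lines – a hedged sketch only, run strictly after a full database backup; modifying a content database directly is unsupported by Microsoft:

# DANGER: direct content database modification – take a full backup first!
Invoke-Sqlcmd -ServerInstance "SQL01" -Database "WSS_Content" -Query @"
UPDATE docversions
SET docversions.Size = LEN(docstreams.Content)
FROM dbo.AllDocVersions docversions
INNER JOIN dbo.AllDocStreams docstreams
    ON docstreams.Id = docversions.Id
    AND docstreams.InternalVersion = docversions.InternalVersion
WHERE docstreams.RbsId IS NULL
    AND LEN(docstreams.Content) <> docversions.Size;
"@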

After running .Migrate() again, everything worked perfectly and the migration completed.

In this case I could have deleted the file from SharePoint as well as from the two-stage recycle bin, but that would have meant potentially losing the versions of the file as well as the problem file itself. I chose the more dangerous path instead. I don’t know how the mismatch occurred in the first place, but I will keep an eye on it using this script to identify mismatched documents:

select docstreams.Id
     , docstreams.InternalVersion
     , LEN(docstreams.Content) as StreamLength
     , docversions.Size
from dbo.AllDocStreams docstreams
inner join dbo.AllDocVersions docversions
   on docstreams.Id = docversions.Id
  and docstreams.InternalVersion = docversions.InternalVersion
where docstreams.RbsId is null
  and LEN(docstreams.Content) <> docversions.Size

Using the ULS Viewer for your custom application

January 8, 2012

The ULS Viewer is a wonderful tool that all SharePoint admins absolutely should have on their servers. There is no other utility that is as quick and easy to use for ULS log viewing – and more importantly, it’s free!

I am helping develop some custom Remote Blob Storage services for SQL Server 2008 and SharePoint 2010 (Stepwise Enterprise Storage – imminent release!) and I wanted to be able to use the ULS log viewer, but I didn’t want to spam the SharePoint logs and our services don’t need to be installed on SharePoint itself.

I set up the Microsoft Enterprise Library Logging Application Block and the Rolling Flat File Trace Listener to do the job for me. The rolling flat file trace listener automatically creates a new file when the current file meets certain criteria, such as size or age. Note the hard-coded [fileName="MyService-20200101-0000.log"]: this is required because the ULS Log Viewer is hard-coded to look for files matching the format *-????????-????.log, i.e. MyService-yyyyMMdd-HHmm.log. You can set this to something different, but I thought this would be fairly safe.

First I added a Rolling Flat File Trace Listener:

<add name="Rolling Flat File Trace Listener"
fileName="MyService-20200101-0000.log"
footer=""
formatter="ULS SingleLine Text Formatter"
header=""
rollFileExistsBehavior="Increment"
rollInterval="Day"
rollSizeKB="2048"
timeStampPattern="-yyyyMMdd-HHmm"
traceOutputOptions="None"
listenerDataType="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.RollingFlatFileTraceListenerData, Microsoft.Practices.EnterpriseLibrary.Logging, Version=4.1.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
filter="All"
type="Microsoft.Practices.EnterpriseLibrary.Logging.TraceListeners.RollingFlatFileTraceListener, Microsoft.Practices.EnterpriseLibrary.Logging, Version=4.1.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
/>

Then I needed to make sure the format of the output was correct. ULS wants the following fields supplied:

  • Timestamp: equivalent to the “TimeGenerated” field in the “Application” event log
  • Process: the image name of the process logging its activity, followed by its process ID (PID) in parentheses
  • TID: thread ID
  • Area: maps to the “Source” field in the “Application” event log
  • Category: maps to the “Category” field in the “Application” event log
  • EventID: a unique internal event ID (which you can generate)
  • Level: logging level, i.e. Information, Verbose, Error, etc.
  • Message: text message
  • Correlation: may contain a link to the EventID of another logged event, which again you can generate

So this is the formatter I used:

<add
template="{timestamp(local:dd/MM/yyyy HH:mm:ss.fff)} {tab}{appDomain} {tab}{win32ThreadId} {tab}{source} {tab}{category} {tab}{eventid} {tab}{severity} {tab}{message} - {priority} - {dictionary({key} - {value}|)}"           
type="Microsoft.Practices.EnterpriseLibrary.Logging.Formatters.TextFormatter, Microsoft.Practices.EnterpriseLibrary.Logging, Version=4.1.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"           
name="ULS SingleLine Text Formatter" />