Remote Blob Storage for SharePoint 2010 with Stepwise – Part 1

This post has spent a long time in draft, but as we have released the second version of our product and things are progressing nicely, now is a good time to post this.

For several years my company (Invizion) has been working on a Remote Blob Storage product for SharePoint 2010 called Stepwise. Remote Blob Storage is a technology and API provided by Microsoft that allows you to move documents out of your SQL Server database and store them on file-system based storage (think network drive or cloud storage) by a process called “externalisation”. There are numerous advantages, but the biggest ones are:

  • Your databases are smaller. Sometimes hugely smaller – up to 95% is not uncommon for SharePoint 2010 content databases in particular
  • The documents don’t travel via SQL Server at all; they are stored directly on a file system. So SQL Server doesn’t get slammed with I/O, your temp databases aren’t hammered, and transaction size is smaller (but not shorter – see below)
  • You can utilise existing storage systems, like high-capacity drives which are much cheaper. SQL Server uses (or should use!) high-performance drives, which are expensive. This means organisations spend less on physical storage for their SQL Server environment
  • Backup and restore tasks are in most cases substantially quicker. Ask your backup engineer – would they prefer backing up a 1TB SQL Server database, or a 100GB SQL Server database + 900GB of documents? There are also huge improvements in data deduplication, because backup software works more efficiently on file-based data than on SQL Server databases

There are a lot of other benefits, but these are the main ones.

There are several competitors in the Remote Blob Storage provider-space, but Stepwise has some pretty unique features which I think are worth detailing here.

  1. Stepwise isn’t reliant on SharePoint. If SharePoint is down, can you still access and, more importantly, manage your documents? Stepwise can.
  2. Stepwise uses Microsoft Management Console to control all configuration of the system. No Central Admin features to deploy, no separate website, no timer jobs to run, no impact on SharePoint (beyond reading and writing the documents of course!)
  3. Stepwise manages documents, rather than just storing them. Want to add more storage? Covered. Need to move documents to a new location with no down-time? Stepwise can do that. Want to calculate the cost of cloud storage? Stepwise has inbuilt functionality to show you how much the cloud is going to hit your budget.
  4. Stepwise can integrate with your non-SharePoint applications. Stepwise is a fully-featured Content Addressable Storage (CAS) system based on Microsoft’s Remote Blob Storage technology. That means your custom applications can benefit in the same fully-supported way.
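
To make the term concrete, here is a minimal sketch of the general idea behind content addressable storage: a document is stored under an identifier derived from its own bytes, so identical content is only written once. This is a generic illustration and not how Stepwise or the Remote Blob Storage API is implemented; the class name, the choice of SHA-256 and the folder layout are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentAddressableStore:
    """Toy content-addressable store (hypothetical, for illustration only).

    A blob's identifier is the SHA-256 hex digest of its bytes.
    """

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path_for(self, blob_id: str) -> Path:
        # Fan out into subfolders so no single folder holds every blob.
        return self.root / blob_id[:2] / blob_id[2:]

    def put(self, data: bytes) -> str:
        """Store the bytes and return their identifier."""
        blob_id = hashlib.sha256(data).hexdigest()
        path = self._path_for(blob_id)
        if not path.exists():
            # Identical content maps to the same path, so it is written once.
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        """Read the bytes back using the identifier returned by put()."""
        return self._path_for(blob_id).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        store = ContentAddressableStore(Path(folder))
        first = store.put(b"quarterly report")
        second = store.put(b"quarterly report")
        print(first == second)   # same content, same identifier
        print(store.get(first))  # the original bytes
```

The calling application only needs to keep the returned identifier, which is why any application (not just SharePoint) can use a store like this.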

What about the 200GB Content Limit?

This is my favourite topic at the moment. The SharePoint 2010 Boundaries and Limits published by Microsoft has a section on supported content database sizes and what you need to support an infrastructure based on your planned usage of SharePoint. I have had various long and very useful discussions with Microsoft SharePoint engineers both in Australia and the US about what this actually means.

First up – the term “content database” isn’t just about the size of your SQL Server database. It is the sum total of all content that resides in your site collection(s), i.e. if it passes through a SharePoint Web Front-End, it is counted in the size of your “content database”. The reason is that Microsoft bases its scalability guidance on customer feedback and experience, and on the amount of data that is processed through SharePoint components.

Some of the clients I have spoken to were concerned about this limit, but most organisations with 500+ users are probably going to have the infrastructure to support >200GB “content databases”. As an example, requirements are as follows:

  • Disk sub-system performance of 0.25 IOPS (Input/output Operations Per Second) per GB, with 2 IOPS per GB recommended for optimal performance. Decent local disk in a RAID configuration should meet this easily in most cases, and the majority of SAN configurations will also meet this requirement.
  • Have a good backup/restore strategy. Common-sense, often-overlooked, but achievable.
  • SharePoint 2010 Administrators. You need them – get some good ones.
  • Customisation complexity. Needs to be assessed on a case-by-case basis by each organisation.
  • Site collection refactoring. More on this later.
  • Backup and restore. See below.
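
As a rough worked example of the disk requirement, the sketch below turns a content size into an IOPS target. It assumes Microsoft's published figures of 0.25 IOPS per GB as the minimum and 2 IOPS per GB as the recommendation, and it follows this post's reading that externalised documents count towards the size of the “content database”. The function names are made up for the example.

```python
# Assumed figures from Microsoft's SharePoint 2010 boundaries and limits guidance.
MIN_IOPS_PER_GB = 0.25
RECOMMENDED_IOPS_PER_GB = 2.0
LARGE_DATABASE_THRESHOLD_GB = 200


def effective_content_gb(sql_database_gb: float, externalised_gb: float) -> float:
    """Total content size: what stays in SQL Server plus what is externalised."""
    return sql_database_gb + externalised_gb


def required_iops(content_gb: float) -> tuple[float, float]:
    """Return (minimum, recommended) disk sub-system IOPS for a content size."""
    return (content_gb * MIN_IOPS_PER_GB, content_gb * RECOMMENDED_IOPS_PER_GB)


def exceeds_threshold(content_gb: float) -> bool:
    """True when the extra requirements for databases over 200GB apply."""
    return content_gb > LARGE_DATABASE_THRESHOLD_GB


if __name__ == "__main__":
    # The backup example from earlier: 100GB in SQL Server + 900GB of documents.
    size = effective_content_gb(100, 900)
    minimum, recommended = required_iops(size)
    print(f"{size:.0f}GB of content, over threshold: {exceeds_threshold(size)}")
    print(f"Minimum IOPS: {minimum:.0f}, recommended IOPS: {recommended:.0f}")
```

For the 1TB example this works out to 250 IOPS as the minimum and 2,000 IOPS as the recommended figure, which gives you a concrete number to take to your storage team.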

Site Collection Management

Let’s look at what a site collection should be used for. From Sites and site collections overview:

“The sites in a site collection have shared administration settings, common navigation, and other common features and elements. Each site collection contains a top-level site and (usually) one or more sites below it in a hierarchical structure.”

For me this is critical for how you design your site collection, and I believe one of the factors that gets overlooked the most. It is fairly common practice to create new content databases and/or site collections to manage the size and growth of a SharePoint environment. But this presents several problems:

  • Master pages and UI customisations need to be copied over/modified
  • Administration and permissions need to be copied over
  • Navigation between site collections must be manually addressed
  • Site authors may need to redo their work
  • Search needs to be reassessed as well to match the content in each site collection
  • Workflows may need to be redesigned to cope with routing approvals and documents between site collections

These tasks should not be undertaken lightly!

My advice to my clients is to make sure they assess all the impacts of creating additional site collections if they are doing so just to avoid the 200GB boundary for SharePoint 2010. The considerations above are a good starting point to help assess whether supporting 200GB+ content databases is better than sticking to a <200GB size.

Continue on to the next post in this series – Backup and Restore options with Stepwise



