SharePoint Content Structure – Let a thousand content types bloom?

“How many content types should you have?”

This is the question that came up in a conference call last week on SharePoint architecture. This organization had implemented their corporate portal on SharePoint 2007 and was interested in going forward with more portal sites but had some concerns about the approach to information architecture they had undertaken.

I answered what I would answer no matter what technology it was – “Only as many as you really need to implement the appropriate level of metadata, workflow and templates.” Which is of course vague, as most good consultant-ese is. I followed up with some stats: when we work on web content management implementations, we typically end up with about 10-15 content types for a site of medium complexity. We always try to keep the structure simple and number of content types few for many good reasons, ranging from ease of content structure management to content publisher user experience.

The folks on the phone were quiet for a minute… You see, the previous consultant they had worked with had a bit of a different (read opposite) approach. The philosophy they described was that SharePoint content types should be created to the maximum degree of granularity (e.g. one content type per library) so as to reduce the need for content publishers to select a content type and tag metadata values. For example, if you had a site for human resources forms, you would have one library and content type for medical forms, one library and content type for dental forms, etc. Each content type would be extremely specific and require little tagging. “If you need 30,000 content types, then so be it” is the idea. (insert eye twitch.)

The intent behind this – to reduce uncertainty and effort for content publishers – is noble and good, and in some specific cases might be the right approach. But in general, the overly-granular content types seems to be in the realm of sledgehammer to kill a fly. To help explain why, I thought I’d enlist the help of a couple of friends and colleagues.

First, I emailed content management guru Bob Boiko, author of the Content Management Bible, to see if he agreed. His response?

“How many content types is the right number? The fewest possible to squeeze the most value out of the info you possess. If it were my system, I would create a generic type and put all the info that I could not find a business justification for into that bucket. It’s not worth naming if you can’t say clearly why you are managing it. Then I would start with the info we have decided is most valuable and put real energy into naming the type and fleshing out the metadata behind it. Then on to the next most valuable and so on till I ran out of resources. In that way, the effort of typing is spent on the stuff that is most likely to repay the effort.

Amen to that! But I also wanted to get a tool-specific view from my colleague and SharePoint expert friend Shawn Shell. So I skyped him…

ImageSo, what do you think?

Image Well, having a content type for every document library is certainly an interesting approach, though I think your SharePoint administrators, as well as your users, will go quite mad.

ImageSo, I think the argument is that having this many content types is supposed to make it easier on the users by presetting all choices and removing the potential for error. If you never have to choose a content type because each library has a very specific default that matches the content you are creating, then there’s no confusion, the idea seems to be… From a general content management perspective, this is flawed. But what about from a SharePoint-specific standpoint?

ImageI can understand why this might make sense on the surface.  Unfortunately, I think you end up exchanging one kind of confusion for another.  Further, there’s a huge maintenance implication as well. For example, if you have a content type for each library, you are, for all practical purposes requiring the user to decide where to physically store a document.  This physical storage then implies your classification — regardless of whether a default content type is applied.

ImageSo, you’re basically recreating all the ills of a fileshare folder structure!

ImageIn essence yes. To make matters worse, more complex SharePoint environments will necessarily include multiple applications and multiple site collections. Because content types are site collection bound, administrators will have lots more administration to create, maintain and ensure consistency across the applications and site collection. This would normally be true, but when you have such an overload of content types and libraries, the complexities of management are compounded.

ImageSo, if you have 50 content types, and you need to use them in 2 or 3 site collections, you’d have to create 150 content types. Good argument to keep your use of content types judicious. Is there a hard limit to the number of content types one can manage in a site collection?

ImageThe answer is “sort of.”  There’s no specific hard limit to the number of content types in a site collection, but there are some general “soft limits” in the product around numbers of objects (generally 2000). This particular limit is an interface limit where users will see slower performance if you’re trying to display more than 2000 items.  The condition won’t typically manifest itself for normal users, but it will for administration. The other real limit is the content type schema can’t exceed 2 Gb.  While this seems like a pretty high limit, if you have a content type for each library, loads of libraries in a site collection and robust content types, there’s certainly a chance to hit this limit.

ImageWhat about search? I assume that a plethora of content types would have adverse effects on search.

ImageIt absolutely does.  Like everything we’ve discussed here, the impact is primarily two fold: 1) administration and 2) user experience. Content types, as well as columns, can be used as facets for search.  If you have an overwhelming number of facets in results, the value facets bring is reduced.  Plus, as I mentioned before, having large numbers of content types could also produce performance problems when trying to enumerate all of the type included in the search result.

From an administrative standpoint, we’re back to managing all of these content types across site collections, ensuring that the columns in those content types are mapped to managed columns (a requirement for surfacing the metadata in search results) and, if you have multiple Shared Services providers, that this work is done across all SSPs.

ImageI expect there will also be a usability issue for those trying to create content outside of the SharePoint interface. Wouldn’t users have to choose from the plethora of content types if they started in Word for?

ImageThis is another excellent point.  Often, when discussing solutions within SharePoint, we think only of the web interface. When developing any solution, however, you need to keep both the Office and Windows Explorer interface in mind as well. Interestingly, using multiple document libraries, with a content type for each library, makes a little more sense from the end users perspective, since it’s similar to physical file shares and folders.
However, the same challenges that many organizations are facing related to management of file shares can manifest themselves when using the multiple library and matching content type approach as well — putting these organizations back in the same unmanageable place they started.

ImageGreat, thanks Shawn for your insights! I’ll be sure to spread the word to avoid a content type pandemic.

So there you have it folks. As a general rule, less is more. Standardize, simplify and don’t let your content types multiply needlessly. Your content contributors and SharePoint administrators will thank you.