SharePoint Content Structure – Let a thousand content types bloom?

“How many content types should you have?”

This is the question that came up in a conference call last week on SharePoint architecture. This organization had implemented their corporate portal on SharePoint 2007 and was interested in going forward with more portal sites but had some concerns about the approach to information architecture they had undertaken.

I answered what I would answer no matter what technology it was – “Only as many as you really need to implement the appropriate level of metadata, workflow and templates.” Which is of course vague, as most good consultant-ese is. I followed up with some stats: when we work on web content management implementations, we typically end up with about 10-15 content types for a site of medium complexity. We always try to keep the structure simple and number of content types few for many good reasons, ranging from ease of content structure management to content publisher user experience.

The folks on the phone were quiet for a minute… You see, the previous consultant they had worked with had a bit of a different (read: opposite) approach. The philosophy they described was that SharePoint content types should be created to the maximum degree of granularity (e.g. one content type per library) so as to reduce the need for content publishers to select a content type and tag metadata values. For example, if you had a site for human resources forms, you would have one library and content type for medical forms, one library and content type for dental forms, etc. Each content type would be extremely specific and require little tagging. “If you need 30,000 content types, then so be it” is the idea. (Insert eye twitch.)

The intent behind this – to reduce uncertainty and effort for content publishers – is noble and good, and in some specific cases might be the right approach. But in general, overly granular content types seem to be in the realm of using a sledgehammer to kill a fly. To help explain why, I thought I’d enlist the help of a couple of friends and colleagues.

First, I emailed content management guru Bob Boiko, author of the Content Management Bible, to see if he agreed. His response?

“How many content types is the right number? The fewest possible to squeeze the most value out of the info you possess. If it were my system, I would create a generic type and put all the info that I could not find a business justification for into that bucket. It’s not worth naming if you can’t say clearly why you are managing it. Then I would start with the info we have decided is most valuable and put real energy into naming the type and fleshing out the metadata behind it. Then on to the next most valuable and so on till I ran out of resources. In that way, the effort of typing is spent on the stuff that is most likely to repay the effort.”

Amen to that! But I also wanted to get a tool-specific view from my colleague and SharePoint expert friend Shawn Shell. So I skyped him…

Me: So, what do you think?

Shawn: Well, having a content type for every document library is certainly an interesting approach, though I think your SharePoint administrators, as well as your users, will go quite mad.

Me: So, I think the argument is that having this many content types is supposed to make it easier on the users by presetting all choices and removing the potential for error. If you never have to choose a content type because each library has a very specific default that matches the content you are creating, then there’s no confusion, the idea seems to be… From a general content management perspective, this is flawed. But what about from a SharePoint-specific standpoint?

Shawn: I can understand why this might make sense on the surface. Unfortunately, I think you end up exchanging one kind of confusion for another. Further, there’s a huge maintenance implication as well. For example, if you have a content type for each library, you are, for all practical purposes, requiring the user to decide where to physically store a document. This physical storage then implies your classification, regardless of whether a default content type is applied.

Me: So, you’re basically recreating all the ills of a fileshare folder structure!

Shawn: In essence, yes. To make matters worse, more complex SharePoint environments will necessarily include multiple applications and multiple site collections. Because content types are site collection bound, administrators will have a lot more work to create, maintain and ensure consistency across the applications and site collections. This would be true in any case, but when you have such an overload of content types and libraries, the complexities of management are compounded.

Me: So, if you have 50 content types, and you need to use them in 2 or 3 site collections, you’d have to create 100 to 150 content types. Good argument to keep your use of content types judicious. Is there a hard limit to the number of content types one can manage in a site collection?

Shawn: The answer is “sort of.” There’s no specific hard limit to the number of content types in a site collection, but there are some general “soft limits” in the product around numbers of objects (generally 2000). This particular limit is an interface limit where users will see slower performance if you’re trying to display more than 2000 items. The condition won’t typically manifest itself for normal users, but it will for administration. The other real limit is that the content type schema can’t exceed 2 GB. While this seems like a pretty high limit, if you have a content type for each library, loads of libraries in a site collection and robust content types, there’s certainly a chance to hit this limit.

Me: What about search? I assume that a plethora of content types would have adverse effects on search.

Shawn: It absolutely does. Like everything we’ve discussed here, the impact is primarily twofold: 1) administration and 2) user experience. Content types, as well as columns, can be used as facets for search. If you have an overwhelming number of facets in results, the value facets bring is reduced. Plus, as I mentioned before, having large numbers of content types could also produce performance problems when trying to enumerate all of the types included in the search result.

From an administrative standpoint, we’re back to managing all of these content types across site collections, ensuring that the columns in those content types are mapped to managed properties (a requirement for surfacing the metadata in search results) and, if you have multiple Shared Services Providers, that this work is done across all SSPs.

Me: I expect there will also be a usability issue for those trying to create content outside of the SharePoint interface. Wouldn’t users have to choose from the plethora of content types if they started in Word, for example?

Shawn: This is another excellent point. Often, when discussing solutions within SharePoint, we think only of the web interface. When developing any solution, however, you need to keep both the Office and Windows Explorer interfaces in mind as well. Interestingly, using multiple document libraries, with a content type for each library, makes a little more sense from the end user’s perspective, since it’s similar to physical file shares and folders. However, the same challenges that many organizations are facing related to management of file shares can manifest themselves when using the multiple library and matching content type approach as well, putting these organizations back in the same unmanageable place they started.

Me: Great, thanks Shawn for your insights! I’ll be sure to spread the word to avoid a content type pandemic.

So there you have it folks. As a general rule, less is more. Standardize, simplify and don’t let your content types multiply needlessly. Your content contributors and SharePoint administrators will thank you.
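To put rough numbers on Shawn’s point about site collections, here is a minimal Python sketch of the arithmetic. It is back-of-the-envelope only: the function names are made up for illustration and are not SharePoint APIs, and the 2000 figure is simply the soft display limit Shawn mentioned, treated here as a warning threshold per site collection (my assumption, not a documented rule).

```python
# Rough arithmetic only; these names are illustrative, not SharePoint APIs.

# Assumption: the "soft limit" on displayed objects quoted in the chat,
# used here as a warning threshold for one site collection.
SOFT_DISPLAY_LIMIT = 2000


def definitions_to_maintain(content_types: int, site_collections: int) -> int:
    """Content types are bound to a site collection, so each one has to be
    recreated (and kept in sync) in every collection that uses it."""
    return content_types * site_collections


def exceeds_soft_limit(content_types: int) -> bool:
    """True when a single site collection holds more content types than the
    threshold assumed above."""
    return content_types > SOFT_DISPLAY_LIMIT


if __name__ == "__main__":
    for collections in (2, 3):
        total = definitions_to_maintain(50, collections)
        print(f"50 content types x {collections} site collections = {total} definitions")

    # A judicious set versus the "so be it" scenario.
    print("50 types over the threshold?", exceeds_soft_limit(50))
    print("30,000 types over the threshold?", exceeds_soft_limit(30_000))
```

The multiplication is the whole point: every content type you add is paid for once per site collection, so a small, shared set stays manageable while the one-type-per-library approach does not.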

Taxonomy Bootcamp 2009… A regular smorgasbord

Looking for a good way to spend a week in the California sun and learn more about taxonomy, search and knowledge management? Look no further than the triple-slam event of the fall conference season:

Taxonomy Bootcamp / KM World / Enterprise Search Summit West
Register today with our discount code to save $200!

Mark your calendars, cause we have a full slate of taxonomy-related presentations this year, including:

Workshop: Taxonomy Implementation & Integration (Seth Earley & Stephanie Lemieux)
Date: November 16, 2009 – 9:00 – 12:00
Come hear Seth and me talk about how some of the companies we’ve worked with have been able to implement their taxonomies and integrate them with WCM, ECM and digital asset management systems, among others. Hear about practical applications of taxonomy within different classes of tools as well as technical integration challenges (hierarchy challenges, build-vs-buy issues, etc.).

Workshop: SharePoint Information Architecture: Integrating Taxonomy & Metadata (Stephanie Lemieux & Shawn Shell)
Date: November 16, 2009 – 1:30 – 4:30
My friend Shawn Shell and I will cover the ups and downs of trying to build taxonomy and metadata frameworks in SharePoint – a tool with a distinct handicap when it comes to hierarchical metadata and search relevancy. We’ll talk about 3rd party add-ons that can help with tagging, taxonomy and faceted search.

Session: SharePoint Information Architecture: Integrating Taxonomy & Metadata (Jeff Carr & Stephanie Lemieux)
Date: November 19, 2009 – 1:15 – 2:00
If you can’t make it for the workshop, don’t miss this condensed version giving highlights on how to achieve taxonomy in SharePoint. We’ll cover a couple of case studies here as well, and give a quick overview of add-ons.

Session: Best Bet ROIs: We’ve Seen It All (Panel) (Seth Earley)
Date: November 19, 2009 – 3:30 – 4:15 EST
This panel of content management problem-solvers shares their experiences and perspectives on successfully determining the return on investment for folksonomy, taxonomy, and ontology initiatives.

Session: Increasing Traffic by Integrating Taxonomy & SEO (Panel) (Jeff Carr)
Date: November 19, 2009 – 3:15 – 4:00 EST
Jeff is taking part in a fun panel format where speakers get just a few slides and a few minutes to make their point… Hear about how taxonomy is an important factor in many SEO ranking signals.

And if you’re not in info overload yet…

Session: Folksonomies: Beyond the Folks Tales (Panel) (Stephanie Lemieux)
Date: November 20, 2009 – 10:40 – 11:15
Join me for a panel that promises to be fun and informative, where Tom Reamy (KAPS) and I will go head to head on the merits and applications of Folksonomies.

This year promises to be a great show – join us in San Jose this November to chat about all things taxonomy, folksonomy, ontology, and any other “onomy” or “ology” you care to bring to the table. Use this link for a $200 discount.

Special shout out to the TaxoCoP members – we’ll be sure to organize a get together for those of you who’ll be onsite.

Collaboration, Groove and SharePoint – History Repeating Itself?

I just read that Groove is being renamed as SharePoint Workspace 2010.  For those of you who are not familiar with Groove or its history, I’ll take you back to the early 80’s. 

Ray Ozzie is the visionary behind Groove and currently the Chief Software Architect at Microsoft (a role he took over from Bill Gates). At the University of Illinois (as many know, home to the NCSA, which created Mosaic, the early web browser on which Internet Explorer is based), Ozzie worked on early iterations of some of today’s knowledge management, collaboration and social media applications (discussion forums, message boards, e-learning, e-mail, chat rooms, instant messaging, remote screen sharing, and multi-player games).

He also worked with some of the pioneers in personal computing and on products like VisiCalc, one of the first spreadsheet programs, which ushered in the age of personal productivity.

Ozzie worked for a time at Lotus Development and went on to form a new venture called Iris Associates, which developed a collaboration tool called Notes. Lotus acquired the rights to Notes, with Iris remaining a separate entity but doing all of the research and development behind the product.


MOSS 2007 Requirements Gathering: Fast and Focused

Since Microsoft Office SharePoint Server is a mature platform for collaboration, content management and portals, companies can implement the package without much planning or even requirements gathering. Too often, the IT department is assigned the task of technically implementing SharePoint, with little context for its use or its potential value to the organization. The individuals in business units or departments who will use the system are kept in the dark about the plans and the functionality of SharePoint. Once IT is satisfied that MOSS is technically stable, it rolls the package out to users with little training or follow-up. This approach rarely succeeds.

In this post, I want to examine how to set the foundation for a successful SharePoint implementation by starting with a clear understanding of user requirements and the business results stakeholders want to achieve. Governance, site construction, etc. can wait until there is a base level of understanding of the business objectives and user requirements.

SharePoint 2007 – Implementing and Managing Taxonomy

We’ve been doing a lot of work with SharePoint lately, so I thought I’d put together a quick post on some approaches to implementing taxonomies in the new version. As you may or may not know, MOSS 2007 (or Microsoft Office SharePoint Server) is quickly becoming the new platform of choice for many organizations. This newer version of the application is being leveraged in the development of corporate intranets, extranets and even public-facing Internet websites, providing information workers with enhanced collaboration and document management capability.

With the exponential growth of implementations worldwide (MOSS is the fastest-growing server product in Microsoft’s history) come greater challenges and opportunities for improving knowledge management and information access within the enterprise. The need for consistent organizing principles across enterprise information is of ever-increasing importance and, when done correctly, can result in leaps and bounds in employee productivity.

Before we get to any of the details however, let’s remind ourselves that the purpose of building and maintaining taxonomies is to improve the findability of information by:

Continue reading