The challenges of content creation vs. content discovery

In developing the NHS SharePoint Solution Accelerator we hit a common problem: how to design a document library scheme that meets the core needs of our clients. At the heart of this is a set of design goals, which can be simplified as:

          A single place to access information

          A simple means of creating or adding information (and a single place to save it to)

          High system performance

In an ideal world, one would store everything in a central, monolithic library, with powerful search and views to give access to the content, and rich metadata to locate it within the library.

However, prevailing wisdom around SharePoint suggests that an approach using many discrete libraries is preferred, using a Content Query web part to present aggregated views wherever they are needed on a site.

So we are left with these two extremes, neither of which is ideal.

The monolithic approach provides a single place to ‘look for’ information and to store information. However, it relies on a large number of views being created to ‘slice and dice’ the extensive list of documents (generally measured in tens of thousands) into manageable virtual libraries. The use of Content Types helps support this approach in SharePoint 2007, allowing different content in a single library to have different metadata, security and so on. It also makes life easy for content creators, as they only have to know one place to save their documents. Importantly, it addresses the classic problem of hierarchical taxonomies, namely ‘how to store a single document in more than one place in the structure’. Metadata-driven views allow a Policy document on Infection Control (for example) to appear in both a ‘Policies’ and a ‘Public Health’ area.
Ultimately this option is scuppered by the performance requirement. SharePoint struggles with this many items in a library.
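To illustrate the metadata-driven views described above, here is a minimal sketch in plain Python. It is not SharePoint code: the `Document` class, the `view` function and the sample documents are hypothetical names invented for the illustration. It only shows how one stored item can surface in several 'virtual libraries' through filtering on its metadata.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """A single stored item with its metadata (hypothetical model)."""
    title: str
    content_type: str
    topics: frozenset = field(default_factory=frozenset)


def view(library, *, content_type=None, topic=None):
    """Return the documents that match the given metadata filters."""
    return [
        doc for doc in library
        if (content_type is None or doc.content_type == content_type)
        and (topic is None or topic in doc.topics)
    ]


# One monolithic library: every document is saved to the same place.
library = [
    Document("Infection Control Policy", "Policy", frozenset({"Public Health"})),
    Document("Staff Rota Template", "Template", frozenset({"Workforce"})),
]

# Two views over the same library; the policy is stored once but shows in both.
policies = view(library, content_type="Policy")
public_health = view(library, topic="Public Health")
```

Every view here scans the whole library, which hints at why this design becomes sensitive to the number of items held in one place.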

The other extreme largely solves the performance issue, with a relatively small number of items per library. Also, any views of that library and functions associated with it are clear and well defined, because each library has a specific purpose. The danger is that this ends up looking like a traditional directory-like filing scheme, which forces users to look in different libraries for documents (and expects them to know which library the document they seek is in). This can be largely avoided by using Content Query web parts to create filtered, sorted views of documents sourced across multiple libraries. Again, metadata is used to allow a document to appear in multiple lists.
This approach is undermined by the continued requirement that someone who needs to create or save a new document must know which library it is supposed to go into before it can be saved, and the choice of library determines which content types, templates and metadata are provided.
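The aggregation idea behind the many-libraries approach can be sketched in the same way. This is plain Python, not the Content Query web part itself; the `content_query` function, the library names and the sample documents are assumptions made up for the example.

```python
def content_query(libraries, predicate, sort_key):
    """Collect matching documents from every library, then sort them.

    `libraries` maps a library name to a list of document dicts.
    """
    matches = [
        doc
        for docs in libraries.values()
        for doc in docs
        if predicate(doc)
    ]
    return sorted(matches, key=sort_key)


# Several small, single-purpose libraries (hypothetical sample data).
libraries = {
    "Policies": [
        {"title": "Infection Control Policy", "topics": {"Public Health"}},
        {"title": "Expenses Policy", "topics": {"Finance"}},
    ],
    "Guidance": [
        {"title": "Hand Hygiene Guidance", "topics": {"Public Health"}},
    ],
}

# An aggregated 'Public Health' view that spans both libraries.
public_health = content_query(
    libraries,
    predicate=lambda doc: "Public Health" in doc["topics"],
    sort_key=lambda doc: doc["title"],
)
titles = [doc["title"] for doc in public_health]
```

Reading is solved by the aggregated view, but note that the data still had to be placed under "Policies" or "Guidance" in the first place, which is exactly the decision the content creator is forced to make when saving.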

Inevitably a compromise must be reached; however, such a hybrid solution tends to have an element of ‘worst of both worlds’ in the way it works.

We have been developing some concepts to mitigate this, based on enhanced search, alternate navigation and pre-emptive actions. Even we don’t have all the answers though. It’s a problem that will continue to vex both informatics professionals and solution developers for many more years, we suspect.

Filed under Informatics