<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Not Otherwise Categorized... &#187; Indexing</title>
	<atom:link href="http://sethearley.wordpress.com/category/indexing/feed/" rel="self" type="application/rss+xml" />
	<link>http://sethearley.wordpress.com</link>
	<description></description>
	<lastBuildDate>Wed, 04 Nov 2009 14:22:38 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='sethearley.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/f17c13f278f2660f3032371111f594e3?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Not Otherwise Categorized... &#187; Indexing</title>
		<link>http://sethearley.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://sethearley.wordpress.com/osd.xml" title="Not Otherwise Categorized&#8230;" />
		<item>
		<title>Taxonomies and change: the nature of the beast</title>
		<link>http://sethearley.wordpress.com/2006/08/14/taxonomies-and-change-the-nature-of-the-beast/</link>
		<comments>http://sethearley.wordpress.com/2006/08/14/taxonomies-and-change-the-nature-of-the-beast/#comments</comments>
		<pubDate>Mon, 14 Aug 2006 23:28:51 +0000</pubDate>
		<dc:creator>sethearley</dc:creator>
				<category><![CDATA[Content management]]></category>
		<category><![CDATA[Indexing]]></category>
		<category><![CDATA[Taxonomy]]></category>
		<category><![CDATA[User Interfaces]]></category>
		<category><![CDATA[Change management]]></category>
		<category><![CDATA[Navigation]]></category>
		<category><![CDATA[Retrospective indexing]]></category>

		<guid isPermaLink="false">https://sethearley.wordpress.com/2006/08/14/taxonomies-and-change-the-nature-of-the-beast/</guid>
		<description><![CDATA[An interesting problem was posed to a mailing list I am a part of&#8230;
Imagine that you have been using a single hierarchy to structure and organize your information for years, and it has been very successful up until now&#8230;
But now it is time to move to a different content management system, and not only that [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>An interesting problem was posed to a mailing list I am a part of&#8230;</p>
<p>Imagine that you have been using a single hierarchy to structure and organize your information for years, and it has been very successful up until now&#8230;</p>
<p>But now it is time to move to a different content management system, and not only that &#8211; business has changed (of course), and not every way of organizing and understanding the information could possibly have been anticipated. (Or perhaps you did anticipate some, but for practical reasons limited the amount of metadata you might apply to content.) So users now want to search and navigate in new ways that were never considered at the start.  What do you do?</p>
<p><span id="more-28"></span></p>
<p>Well, there are (at least) three issues behind this problem:</p>
<ol>
<li>First of all, can your infrastructure expose faceted navigation? (If not, you can consider bolting on a search interface that leverages entity extraction or metadata)</li>
<li>How stable are top level terms and how flexible are the &#8216;core&#8217; organizing principles?</li>
<li>Can content be retrospectively indexed with metadata?</li>
</ol>
<p>Before a facet was conceived, there was nothing captured that could represent that organizing metaphor.  We now need to map new terms to content or update terms that were already applied to content.</p>
<p>In a meeting today to review a faceted taxonomy for an insurance company, we were asked about the implications of change.</p>
<p>The answer is that there will always be new and evolving terms, but the high level characterization of those terms (the top term, facet or meta-data field) should stay somewhat stable. What do sales people, customer service reps and claims processors need to understand? Well, certainly Product for one. Coverage issues for another, etc. There are certain organizing principles that naturally characterize information and that should stay somewhat stable.</p>
<p>If there are completely new processes or business characteristics that emerge over time, can those be described in the current framework (adding a new hierarchy to an existing facet and selectively exposing that to the user), or do we need to create new metadata facets and populate those?</p>
<p>The challenge is always about exposing the metadata to the UI both for tagging and navigation and of course going back and adding the metadata that was not captured before the requirement was identified.</p>
<p>There are strategies for dealing with this situation.  We need to consider whether metadata can be derived from content or whether it needs to be applied to the content.  In either case, search and navigation tools can then expose that information to help users find what they need to find to accomplish their task.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://sethearley.wordpress.com/2006/08/14/taxonomies-and-change-the-nature-of-the-beast/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/063f12546a6bd40d0348ae6690d4b4ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sethearley</media:title>
		</media:content>
	</item>
		<item>
		<title>A very bad index&#8230;</title>
		<link>http://sethearley.wordpress.com/2006/06/08/a-very-bad-index/</link>
		<comments>http://sethearley.wordpress.com/2006/06/08/a-very-bad-index/#comments</comments>
		<pubDate>Thu, 08 Jun 2006 16:11:00 +0000</pubDate>
		<dc:creator>sethearley</dc:creator>
				<category><![CDATA[Indexing]]></category>

		<guid isPermaLink="false">https://sethearley.wordpress.com/2005/06/08/a-very-bad-index/</guid>
		<description><![CDATA[Indexing and Taxonomy creation are closely related processes. In the first case we start with a body of content and then pull from it the key ideas, concepts, pieces of knowledge that we think users would like to access and then create pointers to the content. In the second, we look at a body of [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Indexing and Taxonomy creation are closely related processes. In the first case we start with a body of content and then pull from it the key ideas, concepts, pieces of knowledge that we think users would like to access and then create pointers to the content. In the second, we look at a body of information and determine the categories that can be used to describe the content. (Usually without regard to the pointers to instances of terms).</p>
<p><span id="more-10"></span><br />
I wrote the following up as a good example of a bad index for a discussion list:</p>
<p>I had to determine the correct pressure for my tires since one was low. One would think that this handy bit of information would be readily accessible. Here is what I went through in searching for this in my car owner&#8217;s manual:</p>
<p>Turn to the index and look for Tire Pressure.</p>
<p>The entries are as follows:<br />
Tire<br />
Low Message<br />
Tire Inflation Check<br />
TIRE MON (Tire Inflation Monitor Reset)<br />
Tire Sidewall Labeling<br />
Tire Size<br />
Tire Terminology and Definitions<br />
Tires<br />
Buying New tires<br />
Chains<br />
Changing a Flat Tire<br />
Compact Spare Tire<br />
If a Tire Goes Flat<br />
Inflation &#8211; Tire Pressure</p>
<p>(I realize I should have started with the last entry, but I started with the first one that made sense to me and followed the trail from there&#8230;)</p>
<p>Starting with &#8220;Tire &#8211; Low Message&#8221;, P 3-48 has the valuable insight that Low Tire Pressure Message indicates that the tire pressure is low and to see page 5-64 for information on the &#8220;Tire Pressure Monitor System&#8221; under &#8220;Tires&#8221;</p>
<p>Tire Inflation Check on page 6-9 says to check to be sure tires are inflated to the correct pressure. See &#8220;Tires&#8221; on page 5-64.</p>
<p>I guess we need to be on P 5-64 to learn all about Tires.</p>
<p>Tires on 5-64 tells us that &#8220;the vehicle comes with high quality tires made by a leading tire manufacturer.&#8221;</p>
<p>&#8220;If you have questions about your tire warranty and where to obtain service, see your GM Warranty booklet for details. For additional information refer to the tire manufacturers booklet included with your vehicle&#8217;s Owner&#8217;s Manual.&#8221;</p>
<p>Wait a minute, I thought I was in the Owner&#8217;s Manual&#8230; Just checked. Yup, I was in the Owner&#8217;s Manual.</p>
<p>That page also contains yellow boxes with the words CAUTION and warnings about &#8220;overloading your tires can cause overheating and result in an &#8216;air-out&#8217;&#8230;&#8221; and to see &#8220;Loading Your Vehicle&#8221; in the index.</p>
<p>(Good thing we are not talking about a &#8220;blow-out&#8221;&#8230; an &#8220;air-out&#8221; sounds much less risky.)</p>
<p>There are also warnings of under inflating, over inflating and worn tires&#8230;. Looks like all we need to know now is the TIRE PRESSURE!!!</p>
<p>Obviously I should have first turned to &#8220;Tires &#8211; Inflation &#8211; Tire Pressure&#8221; on page 5-72</p>
<p>OK, this must be it&#8230;. Let&#8217;s see</p>
<p>Inflation &#8211; Tire Pressure P 5-72<br />
&#8220;The tire and loading information label, shows the correct inflation pressures for your tires when they&#8217;re cold. &#8216;Cold&#8217; means your vehicle has been sitting for at least 3 hours&#8230;. See &#8216;Loading Your Vehicle&#8217; on page 4-32 for the location of your vehicles tire and loading information&#8221;&#8230; This was followed by more warnings about over and under inflation, etc, etc&#8230;</p>
<p>Loading Your Vehicle P 4-32<br />
&#8220;It is very important to know how much weight your vehicle can carry. This weight is called the vehicle capacity weight&#8230; Two labels indicate how much the vehicle can carry&#8230; the Tire and Loading Information label and the Certification label.&#8221;</p>
<p>&#8220;The Tire and Loading Information label shows the seating capacity and the total weight your vehicle can properly carry. This weight is called the vehicle capacity weight. If your vehicle has the Tire and Loading Information label, Example 1, the label is attached to the center pillar, near the drivers door latch. If your vehicle has Tire and Loading Information label, Example 2, the label is on the inside trunk lid.&#8221;</p>
<p>I see, I was looking in the wrong place the entire time. It wasn&#8217;t even in the book after all. No wonder the index did not help&#8230;</p>
<p>This lost a little of its ridiculousness in my abridged translation, and I know this is not entirely the indexer&#8217;s fault. There was a tremendous amount of useless and self-evident explanation that I had to wade through to get to the next reference point. But talk about a frustrating user experience!</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://sethearley.wordpress.com/2006/06/08/a-very-bad-index/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/063f12546a6bd40d0348ae6690d4b4ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sethearley</media:title>
		</media:content>
	</item>
		<item>
		<title>Continuum of value of documents</title>
		<link>http://sethearley.wordpress.com/2006/05/21/continuum-of-value-of-documents/</link>
		<comments>http://sethearley.wordpress.com/2006/05/21/continuum-of-value-of-documents/#comments</comments>
		<pubDate>Sun, 21 May 2006 11:48:00 +0000</pubDate>
		<dc:creator>sethearley</dc:creator>
				<category><![CDATA[Indexing]]></category>

		<guid isPermaLink="false">https://sethearley.wordpress.com/2005/05/21/continuum-of-value-of-documents/</guid>
		<description><![CDATA[This is another response to a post about the &#8220;shared drive problem.&#8221; Shiv Singh of Avenue A- Razorfish commented that &#8220;Every document in an organization is not necessarily important enough to tag. Some organizations address this problem by first determining what knowledge/information/data is worth capturing for retrieval and then putting KM mechanisms in place to [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>This is another response to a post about the &#8220;shared drive problem.&#8221; Shiv Singh of <a href="http://www.avenuea-razorfish.com/">Avenue A- Razorfish</a> commented that &#8220;Every document in an organization is not necessarily important enough to tag. Some organizations address this problem by first determining what knowledge/information/data is worth capturing for retrieval and then putting KM mechanisms in place to capture, codify and distribute it.&#8221;</p>
<p>My thought is that there is a continuum of document value. At one end of the spectrum are news feeds, unmoderated discussions, etc.: chaotic, but useful for creativity and problem solving &#8211; ongoing discussions like this one, for example. At the other end might be best practices, templates and methodologies &#8211; structured, scrubbed, edited and tagged. Higher-value knowledge is more costly to vet, tag, file and maintain. The vast majority of documents fall somewhere in between. Many (perhaps most) are intermediary products. Since their value is context-dependent (as others have mentioned) and may only become apparent as a need arises, it&#8217;s very difficult to organize them without some judgment about current and future value. I&#8217;ve seen environments where documents were nominated to be example deliverables &#8211; someone thought the document would be useful to others. There was a process in place to measure submissions, and people were somewhat incentivized to submit.</p>
<p><span id="more-8"></span>Documents that were simply intermediary work product were organized with the specific project (this was a consulting firm) so people could browse through them if they came across a similar engagement. At some point, someone still needs to make a judgment about the documents and clean them up or reorganize them. People are notoriously bad at filing things, applying metadata or even naming documents in a meaningful way. So someone was responsible for reviewing and organizing documents on the file share. Overall records policies should dictate retention schedules beyond that. But people are loath to delete documents, so there is a huge amorphous mass of relatively useless content. So there is no free lunch. High-value information requires an investment of time, energy and money.</p>
<p>Many organizations are looking at this challenge of &#8220;retrospective indexing&#8221; &#8211; tagging documents after the fact. There are tools that are getting better at entity extraction, clustering and so on, but tagging requires judgment and context so these solutions have drawbacks. As Shiv says &#8220;tagging never works if it is an afterthought&#8221;&#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://sethearley.wordpress.com/2006/05/21/continuum-of-value-of-documents/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/063f12546a6bd40d0348ae6690d4b4ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sethearley</media:title>
		</media:content>
	</item>
	</channel>
</rss>