A Library Sciences Approach to Tagging Content

By Steve Studer

I’ve often said that enterprise content management (ECM) uses a library sciences approach to organize and manage unstructured information. Go into any library and you’ll find it’s a repository that contains a wide variety of artifacts that are structured, organized, cataloged, and indexed to help individuals locate information quickly and easily. This is also a key benefit of ECM. Whether you’re talking about a library or ECM, the key to managing content and artifacts is tagging information so that it is organized and easily accessible to the knowledge seeker.

As with a library, organizing information in an ECM system requires defining classification schemes that determine categories as well as fields with field values (also known as managing metadata). More than 200,000 libraries categorize books and other artifacts using the Dewey Decimal System (DDS). The DDS starts with ten main classes, each of which has ten divisions, and within each division are ten sections. Books are placed on the shelf in numerical order but can easily be found through a cross-referenced index based on title, subject, and author. Another popular categorization system for academic libraries is the Library of Congress Classification (LCC), which organizes content by subject, then author, then date. Regardless of the method, without some form of tagging it would be nearly impossible to find books or other content within a library in context.
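To make the DDS hierarchy concrete, here is a minimal sketch of how a call number decomposes into its class, division, and section. The function and its labels are illustrative only, not part of any real catalog system:

```python
# Hypothetical sketch: splitting a Dewey Decimal call number into the
# three hierarchy levels described above (class, division, section).
def dewey_parts(call_number: str) -> dict:
    """Decompose a call number like '516.3' into its DDS hierarchy."""
    whole = call_number.split(".")[0].zfill(3)  # keep the 3-digit integer part
    return {
        "class": whole[0] + "00",     # e.g. 500 = Natural sciences
        "division": whole[:2] + "0",  # e.g. 510 = Mathematics
        "section": whole,             # e.g. 516 = Geometry
    }

print(dewey_parts("516.3"))
# {'class': '500', 'division': '510', 'section': '516'}
```

Each level narrows the subject, which is exactly the role a well-designed ECM category tree plays for business content.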

This also holds true for ECM and records management systems. However, there are critical differences between ECM and a library. Organizations tend to have a wider variety of categories, and the metadata used tends to pertain to business transactions or activities. Additionally, some of the metadata is critical for records and retention purposes. Another significant difference is that the end users applying the tags in an ECM system are typically not well versed in, or well suited to, the practice. Consequently, in many cases, user adoption and the effectiveness of tagging do not achieve the desired corporate outcomes.

To overcome these challenges, IT needs to look for ways to automate the categorization and tagging of content. They also need to look at how to validate the tags that are applied so that they are consistent and universally accepted. To that end, we will cover four tips that I’ve found to be the most effective.

1. Tag content based upon the upstream and downstream processes

Knowing where the content comes from, where the information gets used, and how the tags define content sensitivity and retention, both internally and externally, is critical. The first thing I evaluate when implementing an ECM project in any organization is how content is used internally and externally. For me, it’s the most fundamental step because it helps to identify the what, where, when, how, and why (the five Ws) of categorizing and tagging content, which is what makes ECM effective. Many times, I find organizations create more work within their business, or for their customers and partners, because they do not consider the five Ws of content as they apply to both internal and external processes.

2. Identify and simplify the tagging process through automation

It’s easy to teach someone how to find content in a library, but it’s tough to make everyone a librarian.

Over the last 25+ years, I’ve watched organizations try to force users into tagging content. Only in cases where tagging was imperative to the safety and quality of operations did I find that user acceptance wasn’t a big obstacle. Automating the tagging process is essential: it improves consistency and reduces human error.
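As a minimal sketch of what automated tagging can look like, the following assumes simple keyword rules; the tag names and patterns are hypothetical, and production systems typically rely on capture software or trained classifiers rather than hand-written rules:

```python
import re

# Hypothetical rule table: each tag is applied when its pattern matches
# the document text, so users never have to choose tags by hand.
TAG_RULES = {
    "invoice": re.compile(r"\binvoice\b|\bamount due\b", re.I),
    "contract": re.compile(r"\bagreement\b|\bterms and conditions\b", re.I),
    "hr-record": re.compile(r"\bemployee id\b|\bperformance review\b", re.I),
}

def auto_tags(text: str) -> list[str]:
    """Return every tag whose pattern matches the document text."""
    return [tag for tag, pattern in TAG_RULES.items() if pattern.search(text)]

print(auto_tags("Invoice #1234: amount due $500"))  # ['invoice']
```

Because the rules run the same way on every document, the resulting tags are consistent in a way that manual tagging by busy end users rarely is.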

3. Tag content so it stays with the material

I talk a great deal about frictionless information transfer, and for me, that means tags need to flow between people, processes, and systems. For most organizations, automating the tagging of content starts with getting content into a system. Often, this is where it ends, because organizations don’t look at how their content affects the people and processes downstream.

Let me give you a simple example. I wish I had a dollar for every hour I’ve spent pulling travel expense details out of printed documents and receipts. For me, that’s an excellent example of where an upstream process, like sending an electronic folio, impacts someone downstream. It’s also a perfect example of how tagging and metadata generated upstream in the printed output (typically PDF) can be incorporated in a way that makes information easier to extract downstream.
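The folio example can be sketched as follows, assuming (hypothetically) that the upstream system embeds the expense fields as key-value metadata that travels with the rendered document, so downstream systems read values directly instead of re-parsing printed text:

```python
import json

# Hypothetical electronic folio: the human-readable rendering and the
# machine-readable tags travel together as one artifact.
folio = {
    "rendered": "Grand Hotel ... Total: $412.80 ...",  # what a person sees
    "metadata": {                                      # what systems read
        "vendor": "Grand Hotel",
        "total": "412.80",
        "currency": "USD",
        "checkout_date": "2024-03-14",
    },
}

transmitted = json.dumps(folio)  # sent downstream alongside the document
expense = json.loads(transmitted)["metadata"]  # downstream: a lookup, not a scrape
print(f'{expense["vendor"]}: ${expense["total"]}')  # Grand Hotel: $412.80
```

The point is not the serialization format (real systems might use embedded PDF metadata or an XMP sidecar) but that the tags stay attached to the material all the way downstream.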

In years past, paper and dot matrix printers ruled the world—now it’s electronically generated content. I love helping customers take a hard look at how the content produced upstream is affecting people, processes, and even other organizations downstream. While I tell everyone that printed output will still be incredibly important for a long time, they must remember that it’s the information contained within that’s important. For businesses, this information typically represents a record of an agreement or transaction—but that doesn’t mean it needs to be a burden to the automated records keeping process.

4. Leverage data validation as well as audit and analytics

It’s tax season again, and I can’t think of a better example of how my first three tips align with my final one. I noticed something new with my W2 this year. In the past, individuals had to wait for a hard copy of their W2 to arrive in the mail, then painstakingly enter its values into tax forms like the 1040. The next evolution came when many employers provided a way to download W2s electronically. This sped up the filing process, offered an eco-friendly alternative that reduced mailing expenses, and improved processing times. It also made extracting the data from a W2 simpler because the output was electronic text, not just an image of text.

In the last couple of years, the government made what I consider to be a digital transformation: many companies now provide the means to transfer W2 data directly into a tax filing application. All that is needed is the employer’s Tax Identification Number, the employee’s Social Security Number for the user to validate, and one or two other values found on the W2, which can easily be cut and pasted from the PDF version of the form. Once validated, the rest of the tagged content from the W2 is transferred accurately and immediately into the tax filing software, reducing the tedious and error-prone work previously required of the filer. That’s what I call frictionless information transfer. Unless, of course, the IRS audit and analytics tools find a discrepancy in the filing; in that case, it may spark another form of friction in the filing process. (My attempt at a little humor.)
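The validation step in this story can be sketched roughly as follows; the field names and format rules are illustrative assumptions, not how any real filing application works:

```python
import re

# Hypothetical format checks applied before trusting the rest of the
# tagged data: validate a few anchor values, then accept the transfer.
VALIDATORS = {
    "ein": re.compile(r"^\d{2}-\d{7}$"),        # employer Tax ID, e.g. 12-3456789
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),  # employee SSN
    "wages": re.compile(r"^\d+(\.\d{2})?$"),    # dollar amount from the form
}

def validate(record: dict) -> list[str]:
    """Return the names of fields that fail their format check."""
    return [field for field, rx in VALIDATORS.items()
            if not rx.match(record.get(field, ""))]

w2 = {"ein": "12-3456789", "ssn": "987-65-4321", "wages": "52000.00"}
print(validate(w2))  # [] -> safe to transfer the remaining tagged values
```

Simple format checks like these are the first line of defense; audit and analytics tools then catch the discrepancies that well-formed but wrong values slip past.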

At Zia Consulting, we’re experts at applying a library sciences approach to digitally transform businesses that rely on unstructured content as part of their decision making and records processes. Whether you are looking to extract and tag information from paper or digitally born documents, we can help you identify and classify content so it provides the most value.

We can also help you identify, manage, and automate content tagging between the people, processes, and systems used both upstream and downstream, so that information flows frictionlessly and the supporting content is distributed and maintained in a cost-effective way.

Please contact us for a free hour of consulting to see how we help organizations like yours structure and process content through tagging. We’d be happy to talk to you about how we can help you assess your digital maturity model.
