Fixing Enterprise Search
September 4, 2010
· Posted by Greg Lloyd
A few days ago the Enterprise 2.0 Blog published Venkatesh Rao's excellent post The Real Reasons Enterprise Search is Broken. When he hears ironic jokes comparing search on the public Web versus internal enterprise search, Venkatesh notes: "People move on because they seem to think that this is incompetence at work. Search is soo 1.0 right? It's been solved and we're just fumbling the execution, right?" He says: "I have reached a radical conclusion: broken search is the problem, but fixing search is not the solution. Search breaks behind the firewall for social, not technical reasons... Let's start with the blindingly obvious, and then draw some weird conclusions." I think they are perceptive conclusions based on sound analysis, and agree with most, but come at the problem from a different angle.
I strongly agree with Venkatesh's main conclusion: "The fundamental social and information flow assumptions of “search” need to be deconstructed and reconstructed for the enterprise. Local/silo search within single sites/assets is fine. Enterprise-wide search in its naive form is a terrible idea."
I agree that flat and dumb enterprise-wide search is broken for the reasons he points out. The principles that make relevance-ranked search work well on the public Web fail miserably in the link-poor, (relatively) small-scale, cc-spammed, siloed and obscurely hidden environment behind the firewall in most companies.
I agree that "Web 1.0" technology can't address problems of redundant attachments, email hairballs, point-to-point back channel communication that's not even indexed, let alone officialese used to send timed and coded signals in the clear! [ I think Venkatesh just outed enterprise steganography. Cool! ]
A few years ago I wrote a note based on similar customer disappointment with enterprise search versus what they saw on the public Web, and came to the conclusion that the Web 2.0 / Enterprise 2.0 layer over the Web 1.0 layer offers hope for technology-based improvement:
1) Make business context manifest for discovery and search: This means the ability to create spaces that frame natural business context.
A space defines a context for documents, blog posts, wiki pages, status, what we now call activity stream items and more. A space can also carry permission to see or use content within that space - or know that the space exists.
For example, a space "Board of Directors" or "Medical Records" may have more restrictive access than a space "Engineering". Although there are good reasons for making spaces as transparent to as many people as possible, a client-facing space in a law firm would need to have permission limited to that client and members of the firm - reliably excluding other clients.
Likewise a space in which private medical issues are analyzed and discussed would need to be more private than most other spaces, but provides a perfectly transparent named context for search, faceted navigation and discovery, tagging, activity stream aggregation, or automatic notification - for those with appropriate permission.
A space provides context and can carry permissions, both of which are important for discovery, tagging, navigation, search, and cross-boundary discussions spanning as many spaces as an individual has permission to see (see Borders, Spaces and Places).
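As a rough illustration of point 1, here is a minimal Python sketch in which every item belongs to a space and a search only considers the spaces a reader is permitted to see. The `Space` and `Item` classes, the `search` function and the sample names are hypothetical, invented for this example; they are not TeamPage's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    name: str
    readers: set = field(default_factory=set)  # people allowed to see the space


@dataclass
class Item:
    space: Space
    title: str
    text: str


def search(items, user, query):
    """Return items matching query, restricted to spaces the user may read."""
    q = query.lower()
    return [
        item for item in items
        if user in item.space.readers
        and (q in item.title.lower() or q in item.text.lower())
    ]


# Hypothetical spaces and content, for illustration only.
engineering = Space("Engineering", {"alice", "bob"})
board = Space("Board of Directors", {"alice"})

items = [
    Item(engineering, "Acme integration notes", "Open issues for the Acme rollout"),
    Item(board, "Acme acquisition briefing", "Confidential: Acme valuation"),
]

alice_hits = [i.title for i in search(items, "alice", "acme")]
bob_hits = [i.title for i in search(items, "bob", "acme")]
print(alice_hits)
print(bob_hits)
```

In this sketch bob gets only the Engineering hit, and nothing in his results reveals that the Board of Directors space exists.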
2) Searching across business systems is not the major problem. Enterprise search engines such as Attivio do a great job searching across multiple enterprise business systems. An MRP system, Accounting system, or CAD/CAE repository may be a "silo", but it's a silo with a well-defined purpose that functions like a "bounding space" by providing context that an enterprise search engine can map and use.
For example, searching for a part number might return hits in MRP, Accounting, and CAD/CAE design databases. These different sources can be modeled as facets, which makes it pretty easy to choose from among MRP, Accounting or CAD/CAE hits when you enter a part number depending on content: Are you interested in manufacturing, accounting, or design related part information?
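The part-number example can be sketched in a few lines of Python. This is only an assumption about how hits might be represented: the dictionary layout, the `facet_by_source` function and the part number are made up for illustration and do not reflect how Attivio or any other engine models facets.

```python
from collections import defaultdict


def facet_by_source(hits):
    """Group search hits by the business system they came from."""
    facets = defaultdict(list)
    for hit in hits:
        facets[hit["source"]].append(hit)
    return dict(facets)


# Hypothetical hits for a part-number query; the sources mirror the example in the text.
hits = [
    {"source": "MRP", "summary": "Work order for part 4711-A"},
    {"source": "Accounting", "summary": "Invoice line for part 4711-A"},
    {"source": "CAD/CAE", "summary": "Drawing revision C of part 4711-A"},
    {"source": "MRP", "summary": "Inventory count for part 4711-A"},
]

facets = facet_by_source(hits)
counts = {source: len(group) for source, group in facets.items()}
print(counts)
```

The per-source counts are what a faceted result page would show next to each system name, so the reader can narrow to manufacturing, accounting or design hits with one click.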
In my opinion, the real problem pops up when you have one, two or a myriad of generic Content Management Systems or email repositories that are used as general purpose buckets where data and documents are dumped:
- Without a bit of context (think of the Ark at the end of Raiders of the Lost Ark)
- Without hope of making stored content discoverable and linkable.
- With redundant copies of email conversations and attachments scattered over tens, hundreds or thousands of email accounts and email servers
I don't know a technical solution that disentangles this mess once it has been created. The context was never recorded or was destroyed when it was stored.
Doing a content search for "Acme Briefing" and finding thousands of hits in redundant copies of different versions of the same .ppt attached to email that has been cc:'d and re:'d to death for months is very discouraging.
Intelligent indexing of context as well as content in an Enterprise 2.0 system such as Traction TeamPage solves the nasty problem of correlating messy human details that people rely on beyond the neatly ordered world of functionally siloed business systems. I realize that "neatly ordered" is a relative term for business systems, but a part number is a lot less ambiguous than "a really big contract problem with important customer" when you're trying to discover something.
The answer to a messily human search for "what's the contract problem" is likely to be found in the Enterprise 2.0 record of work, conversation, tagging and tracking by finding either the answer or people who know the answer (and who are now a click away).
I believe that the transactions and content in functionally siloed business systems should be observable and addressable using standard Web protocols (with appropriate identity and security to allow transparent linking). It would also be best to avoid creating separate social silos that act as walled gardens embedded within each business system, see Intertwingled Work.
3) Make "who links to what and who talks to whom about what" indexable for search and discovery. Indexing author, date, space and other metadata as well as content associated with links and attachments adds valuable context for faceted content navigation, discovery and search.
When Mr. Dithers shouts: "Bumstead! Where are we on the Acme Account?", the most timely, frequently discussed and contextually relevant version of Dagwood's slide set could pop closer to the top of the result list, along with the cloud of tags and people who have touched or talked about that account (see Why Enterprise Search Sucks).
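To make the Dithers example concrete, here is a minimal Python sketch of ranking that weighs recency and discussion activity alongside a plain title match. The `Doc` fields, the 30-day half-life and the weights in `context_score` are arbitrary assumptions chosen for illustration, not a description of how TeamPage or any search engine actually scores results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    title: str
    modified: date
    comments: int  # how often the item was discussed
    links_in: int  # how many other items link to it


def context_score(doc, today, half_life_days=30.0):
    """Combine recency and social signals into one score (weights are arbitrary)."""
    age_days = (today - doc.modified).days
    recency = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
    activity = 1 + doc.comments + 2 * doc.links_in
    return recency * activity


def rank(docs, query, today):
    """Keep docs whose title contains the query, best context score first."""
    matches = [d for d in docs if query.lower() in d.title.lower()]
    return sorted(matches, key=lambda d: context_score(d, today), reverse=True)


# Hypothetical versions of the same slide set.
today = date(2010, 9, 4)
docs = [
    Doc("Acme Briefing v1", date(2010, 3, 1), comments=1, links_in=0),
    Doc("Acme Briefing v7", date(2010, 8, 30), comments=12, links_in=4),
    Doc("Acme Briefing v3 (copy)", date(2010, 5, 15), comments=0, links_in=0),
]

ranked = [d.title for d in rank(docs, "acme briefing", today)]
print(ranked)
```

All three titles match the query equally well on content alone; it is the recorded context (a recent edit, many comments and inbound links) that puts v7 at the top.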
4) Be clear what you mean by Enterprise Search. No technology or search engine will be able to find its way through the twisty maze of social relationships and pathological behavior that people can use to hide information that they want to keep private and off the record. The only exception is the technology proposed by Prof Germain Gervais in his Green Chameleon video.
What if you really want the inside scoop on what's happening with the Acme Account?
4a) Use the social network. Find someone in the know and ask them. Enterprise social networking can be a lifesaver when the big barrier is finding something which wants to be widely known in the company but is hidden by an infrastructure that sucks. If what you want to know is the unofficial word (often the truth), find people in the know who are part of a network of trust. If you've been in the military, you'd ask a trusted Sergeant or Chief who's well connected in the NCO network.
4b) Hire a private detective. It takes human intelligence and experience to find something that doesn't want to be widely known in the company (even if it should be widely known by policy). You could also become a whistle blower and get professional investigators on the case if the issue is really serious.
4c) Find a better place to work. If you're just plain frustrated with an inability to find what you need to get your job done and feel happy about it, rise to a position of power and fix it, or find a job in an organization that's not so damaged.
For an example of how Traction TeamPage handles boundary aware navigation, tagging and search, see TeamPage Attivio® Search Module and Michael Sampson's Currents: "TeamPage - the One System to Rule It All".