
In the past, search was viewed as a feature: a nice-to-have, or an additional capability. An organisation would have some requirement, say, that a certain group needs to find documents on a particular subject more efficiently, with those documents stored across a set of digital storage locations. It would then be a case of providing software and services to that organisation in a way that reduced the difficulty of finding the required information. Search is now an application in its own right, and has come a long way as it has evolved from the role of ‘bolt on feature’ to that of ‘core application’. The reasons for this evolution are many, but let’s look at user interaction as a starting point.

Historically, search has generally been a non-interactive process from the user’s perspective. The process is simple: enter your keyword, hit search. Whichever search technology the user happened to be working with would find matching text in the index and return a set of results in an order determined by an analysis based on various criteria (to be discussed in a later post). The user would then inspect the results, notice some irrelevant content, and go back and refine the query. Refinement is usually not a simple task. The user would need 1) knowledge of the syntax used by the search technology in question, 2) some knowledge of Boolean logic, and 3) a strong enough command of both the language and the field of study to generate synonyms efficiently. Given those three knowledge sets (uncommon to John Q Public), the user could successfully embark on a search refinement process, provided that the information being sought actually existed in the first place. A query submitted to an academic search engine might look like this:

((“The Photo Electric Effect”) AND (“Physics”) AND (“Classical Mechanics” OR “Newtonian Mechanics”)) NOT (“Quanta” OR “Planck”)

While the balance between precision and recall is made patently obvious to the user in this situation (and the user can be as precise as he or she is capable of being), this approach has several inefficiencies:

  1. It is time consuming
  2. Syntax is always a concern
  3. It leaves all the work on the user’s plate, in an age of information!
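To make the manual approach concrete, here is a minimal sketch of how a Boolean query like the one above can be evaluated with set operations over an inverted index. The four-document corpus, the phrase-level index and the helper names are all assumptions made up for illustration; a real engine tokenises and ranks far more carefully.

```python
# Toy corpus (hypothetical): document id -> phrases the document contains.
DOCS = {
    1: {"the photo electric effect", "physics", "classical mechanics"},
    2: {"the photo electric effect", "physics", "newtonian mechanics", "planck"},
    3: {"the photo electric effect", "physics", "quanta"},
    4: {"physics", "classical mechanics"},
}


def build_index(docs):
    """Build an inverted index: phrase -> set of ids of documents containing it."""
    index = {}
    for doc_id, phrases in docs.items():
        for phrase in phrases:
            index.setdefault(phrase, set()).add(doc_id)
    return index


def matches(index, phrase):
    """Return the ids of documents containing the phrase (case-insensitive)."""
    return index.get(phrase.lower(), set())


index = build_index(DOCS)

# AND is set intersection, OR is set union, NOT is set difference.
required = (
    matches(index, "The Photo Electric Effect")
    & matches(index, "Physics")
    & (matches(index, "Classical Mechanics") | matches(index, "Newtonian Mechanics"))
)
excluded = matches(index, "Quanta") | matches(index, "Planck")

results = sorted(required - excluded)
print(results)  # [1]
```

Every operator in the query corresponds to one set operation, which is exactly the knowledge of syntax and Boolean logic that the user is expected to bring.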

The contemporary approach to search lets all the dirty work be handled by a combination of advances in the field of linguistic processing and some clever user interface design. No longer is search a back-button-thrashing exercise. Information can be processed as it’s fed into the index so that its value is increased. Each document can be analysed so that key phrases are extracted and added to its ‘meta information’. At the same time we can automatically identify the names of people and places, telephone numbers, email addresses, physical street addresses, the names of companies, subjects, sub-disciplines and field-specific jargon, and store all of this alongside each document at the moment it is fed into the index. This means that the contemporary process analogous to the above would be as follows:

  1. Enter search terms: ‘the photoelectric effect’
  2. View search results, click ‘Physics’, click ‘Classical Mechanics’, click ‘Newtonian Mechanics’, click the X next to ‘Quanta’ and the one next to ‘Planck’

Through the whole process, the result set is dynamic, shifting as the user selects refinements and filters. All the while, the user is actually firing off a series of queries in the background, but the value added to the raw information removes the effort that would have been required in the past. This is all enabled by the fact that computers can group sets of information together much faster than we can. It would have taken the user a long while to inspect the result set, only to realise that it contained the quantum explanation for the photoelectric effect when the historical research really required an account of how the effect was explained prior to that.
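The click-to-refine flow can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular engine does it: the subject vocabulary, the email pattern, the sample documents and the function names (`enrich`, `search`, `facet_counts`, `refine`) are all hypothetical, and real entity extraction relies on much more than substring matching.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: subject label -> phrases that signal it.
SUBJECT_TERMS = {
    "Physics": ["physics", "photoelectric"],
    "Classical Mechanics": ["classical mechanics", "wave theory"],
    "Quanta": ["quanta", "quantum"],
    "Planck": ["planck"],
}
# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def enrich(doc):
    """Attach extracted subjects and email addresses at index time."""
    text = doc["text"].lower()
    subjects = {s for s, terms in SUBJECT_TERMS.items() if any(t in text for t in terms)}
    return {**doc, "subjects": subjects, "emails": set(EMAIL_RE.findall(doc["text"]))}


def search(index, query):
    """Plain substring search over the indexed text."""
    return [d for d in index if query.lower() in d["text"].lower()]


def facet_counts(results):
    """Count how many results carry each subject; these become the clickable facets."""
    return Counter(s for d in results for s in d["subjects"])


def refine(results, include=(), exclude=()):
    """Keep results that have every included subject and none of the excluded ones."""
    include, exclude = set(include), set(exclude)
    return [d for d in results
            if include <= d["subjects"] and not (exclude & d["subjects"])]


RAW_DOCS = [
    {"id": 1, "text": "Wave theory and the photoelectric effect: a classical mechanics "
                      "account. Contact editor@example.org"},
    {"id": 2, "text": "Einstein, Planck and light quanta: the photoelectric effect explained"},
    {"id": 3, "text": "Quantum efficiency of solar cells"},
]
index = [enrich(d) for d in RAW_DOCS]

hits = search(index, "photoelectric effect")
counts = facet_counts(hits)
refined = refine(hits, include={"Classical Mechanics"}, exclude={"Quanta", "Planck"})
print([d["id"] for d in refined])  # [1]
```

The user only ever types the search terms and clicks facets; the `include` and `exclude` sets are built from those clicks, and each click simply re-runs `refine` behind the scenes.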

Another side effect of having so much additional meta information is that it enables an entire portal to be developed in a write-once, use-many approach. A single template can be used which aggregates disparate information from a variety of sources, say, a financial history source, a current events news source and an encyclopaedic source. This disparate information can then be presented (after some user interface sorcery) in an informative, easy to read, concise and clear manner. As soon as the user clicks on a name, the whole page shifts to the new context.

The advantages to this context driven, interactive user experience that modern search technologies enable are:

  1. It allows all the precision and recall tuning of the past
    1. This is all driven by the metadata attached to each document.
  2. It allows anyone to refine a query without having to think about synonyms, or field of study
    1. The engine can index the original content in such a way that it converts all words to their synonyms, or the query can be processed to be submitted with synonyms.
    2. Entity extraction allows drilldown to narrow result sets, and increase precision
    3. Processing content into a taxonomy can group similar items together, also increasing precision
  3. The user does not need prior knowledge of the syntactical conventions of the search engine they may be using.
    1. In fact, it’s even possible to swap the backend search engine without the users noticing!
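As a rough illustration of the query-side synonym option mentioned in the list above, the sketch below expands a search term before matching. The synonym table, the sample documents and the function names are hypothetical; a real deployment would load a curated thesaurus and match on analysed tokens rather than raw substrings.

```python
# Hypothetical synonym table, keyed by lower-cased term.
SYNONYMS = {
    "classical mechanics": {"newtonian mechanics"},
    "newtonian mechanics": {"classical mechanics"},
}


def expand(term):
    """Return the term together with any known synonyms."""
    term = term.lower()
    return {term} | SYNONYMS.get(term, set())


def search(docs, term):
    """Return ids of documents mentioning the term or any of its synonyms."""
    variants = expand(term)
    return sorted(
        doc_id for doc_id, text in docs.items()
        if any(v in text.lower() for v in variants)
    )


DOCS = {
    1: "An account of classical mechanics before 1900",
    2: "Newtonian mechanics and the behaviour of light",
    3: "Planck and the quantum hypothesis",
}

print(search(DOCS, "Classical Mechanics"))  # [1, 2]
```

The user types one phrase and still finds the document that uses the other, without ever having to think of the synonym.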

So you can see that an interactive contextual search experience really can lead to the relevant information, and deliver it on time too!

Digital media has become one of the most valuable assets in modern times. Even though it is not a physical object, people have built their empires around these digital files, and they have become just as important as the building that the company operates from. Production houses, film studios, online news sites and others all have digital media that is a core asset to their business. But how do you protect your digital media assets from falling into the wrong hands?

One of the biggest problems for media owners is that their media assets land in the wrong hands, and they lose a lot of money as a result. The number of pirated DVDs and CDs is growing each day, and the criminals are making a lot of money which should have gone to the owner of the media. Think about popular movies that have not yet reached the cinemas and have built up so much hype and anticipation through viral marketing. The marketing team has spent hours and piles of money trying to spread the word about the film. Everyone you know tells you how eager they are to go and watch this movie. Two months before the movie is released, you notice that someone on the street corner is already selling the DVD for a very low price.

Media ‘pirates’ are getting more inventive and sly with each passing day, as the technology they have at their disposal is ever improving. The fight against piracy is endless, and the people who supply these pirates with the media are never caught, which then allows them to do the same thing with the next big movie.

So how do you as a media owner manage to safeguard your media assets? A good place to start is to determine how the media reaches the outside world. At which point in production was your media leaked to the outside world? If you manage to find the leak, you can take action to plug the hole and make it more difficult for the media to leak out.

Digital fingerprinting is a good technology to use during the production phases of your media. One of the numerous uses of this technology is to add a small “fingerprint” to your media, which can serve as a trail of ‘breadcrumbs’ that lets you know at what point your media was leaked.

Here is an example: You send your media assets to pre-production, after which they move on to the production phase and then post-production. Somewhere between production and post-production, your media gets leaked to the outside world. Adding a digital fingerprint to the media before it changes hands stores information about where the media was last used. Let’s say you find a pirated copy of your film. You can then go back to your studio and examine the fingerprint to see who was behind the illegal distribution of your media. Now you have the information you need to take action against this person, either personally or in court, where some damages could be recovered. Your media will still have reached the outside world, but the leak will be plugged and future business will be protected.
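The bookkeeping side of this example can be sketched as a small ledger. This is only an illustration under loud assumptions: the key, the asset and team names and the `HandoffLedger` class are hypothetical, and the hard part, embedding a mark in audio or video so that it survives copying and re-encoding, is replaced here by a plain field on a dictionary.

```python
import hashlib
import hmac

SECRET_KEY = b"studio-secret"  # hypothetical; a real key would not live in source code


def make_mark(asset_id, stage, recipient):
    """Derive a short mark unique to this asset, stage and recipient."""
    message = f"{asset_id}|{stage}|{recipient}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]


class HandoffLedger:
    """Records which mark was issued at each handoff so a leak can be traced."""

    def __init__(self):
        self._by_mark = {}

    def hand_off(self, asset_id, stage, recipient):
        mark = make_mark(asset_id, stage, recipient)
        self._by_mark[mark] = {"asset": asset_id, "stage": stage, "recipient": recipient}
        # Stand-in for embedding the mark in the media file itself.
        return {"asset": asset_id, "mark": mark}

    def trace(self, leaked_copy):
        """Look up the handoff that produced the mark found in a leaked copy."""
        return self._by_mark.get(leaked_copy["mark"])


ledger = HandoffLedger()
ledger.hand_off("film-001", "pre-production", "Team A")
production_copy = ledger.hand_off("film-001", "production", "Team B")
ledger.hand_off("film-001", "post-production", "Team C")

# A pirated copy turns up carrying the mark issued at the production handoff.
source = ledger.trace(production_copy)
print(source["stage"], source["recipient"])  # production Team B
```

Because every handoff produces a different mark, the mark recovered from a pirated copy points back to exactly one stage and one recipient.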

This alone will not stop piracy, but it will help you to minimize the impact that piracy has on your company. Protect your media from the parasites who make money off you, and your business can thrive and remain competitive in an already brutal market.

When discussing the Abyss that is digital archiving, and archiving in general, the average person is terrified of where their intellectual property, material, content and meta-data disappear to when archived. This fear can be so strong that users and managers create their own backups before submitting material to the archive, which leads to duplication. That in turn can lead to serious version and rights management issues.

Any business process that is not clear and easily understood by those required to apply or utilise it will always be problematic. Let us attempt to throw some light on the Abyss and its processes to make it a less scary prospect.

Archiving is a process that requires meticulous attention to detail. When documents, tapes, clips or any other material are archived, relevant information about the material needs to be stored as well. It is thus not just a case of archiving the material but also the crucial meta-data around it. Part of the problem with archiving solutions is that they cater for this meta-data, but in a rigid and fixed way. Secondly, this data is often stored in a database; databases can store loads of information, but they are usually pretty poor at searching over it in an intuitive manner.

The meta-data structures need to be adaptable to the material type. A document, for example, may have multiple authors, publishers and editors. A TV documentary, on the other hand, requires usage rights information, broadcast dates, directors, narrators and more. Film productions require fields relating to the cast. In each of these there may be a requirement to add meta-data to portions of the material (chapters in a novel or short story collection, for example, or a scene in a movie).
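One way to picture such adaptable structures is a shared base record with type-specific fields layered on top, plus optional per-portion meta-data. The sketch below uses Python dataclasses; the class and field names, including `location`, are assumptions chosen for illustration rather than a schema from any particular archiving product.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A portion of an item (a chapter, a scene) carrying its own meta-data."""
    label: str
    metadata: dict = field(default_factory=dict)


@dataclass
class ArchiveItem:
    """Fields every archived item shares, whatever its material type."""
    title: str
    location: str
    segments: list = field(default_factory=list)


@dataclass
class Document(ArchiveItem):
    authors: list = field(default_factory=list)
    publishers: list = field(default_factory=list)
    editors: list = field(default_factory=list)


@dataclass
class Documentary(ArchiveItem):
    usage_rights: str = ""
    broadcast_dates: list = field(default_factory=list)
    directors: list = field(default_factory=list)
    narrators: list = field(default_factory=list)


@dataclass
class Film(ArchiveItem):
    cast: list = field(default_factory=list)


film = Film(
    title="Example Feature",
    location="Shelf 12, Tape 7",
    cast=["Lead Actor"],
    segments=[Segment("Scene 1", {"setting": "harbour"})],
)
print(film.segments[0].metadata["setting"])  # harbour
```

New material types can be added as further subclasses without touching the existing ones, which is the flexibility that rigid, fixed meta-data schemas lack.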

The primary reason for adding rich and accurate meta-data is the simple problem of retrieving the material when required: the richer the data set, the higher the probability of retrieving it.

Further complications to archiving that create the “Abyss” effect include search technology. In most search engines the stop words (common words like and, the, if, to, be etc.) are typically not indexed, as they lead to lower precision when searching. Imagine, then, how difficult it would be to find Hamlet when searching on “to be, or not to be”. Search engines come in many varieties, each bringing different advantages to the solution. It is crucial to select a search engine and solution that best meets your requirements, and consulting experienced search specialists such as Knowledge Focus is crucial in addressing this aspect.
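The Hamlet problem is easy to reproduce. The sketch below is a minimal illustration with a hypothetical stop-word list (the words named above plus “or” and “not”, which are added here as an assumption) and a deliberately naive tokeniser.

```python
import re

# Illustrative stop-word list; "or" and "not" are assumed additions.
STOP_WORDS = {"and", "the", "if", "to", "be", "or", "not"}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def index_terms(text, remove_stop_words=True):
    """Return the terms that would be indexed or searched on."""
    tokens = tokenize(text)
    if remove_stop_words:
        return [t for t in tokens if t not in STOP_WORDS]
    return tokens


query = "to be, or not to be"
print(index_terms(query))                           # []
print(index_terms(query, remove_stop_words=False))  # ['to', 'be', 'or', 'not', 'to', 'be']
```

With stop words removed, nothing of the query survives, so there is nothing left to match against the index. Whether an engine can be configured to keep stop words, at least for phrase searches, is one of the things worth checking when selecting one.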

Till now I have also not addressed the human factor: errors in archiving the data, errors in the search clues, or simply not storing the original book, tape or DVD in the right place as per the “location” reference can all easily slip into the process. Regardless of how good the technology supporting your archive may be, the personnel managing the archive need to be precise and meticulous to the highest degree.

In summary, you require advanced storage, database and MAM (Media Asset Management) technology to help you transform the Abyss into a rich and resourceful asset in your organization, one that drives the capitalisation of opportunities, resources and, of course, business.

Invest in good technology, digitize your assets, train your archive staff well and manage them well, and suddenly the knowledge within your organization can be unlocked and used to drive profit and revenue.
