Research and Measurement Challenges in an Interconnected World
That is, it aims to weave a common thread between the challenges of reconciling disparate information needs in an environment complicated by fragmented or disparate data sources. Underpinning the relevance of this investigation are several emerging trends that should begin to challenge traditional notions of our research field:
. Proliferation of communications channels. Web, wireless and e-mail: none of these communications channels was in wide use only 15 years ago. The rapid growth of the Internet, and now of wireless devices and services, underscores the need to understand the impact of communication channel expansion on both information consumers and providers. Companies and consumers adopted these technologies quickly, sometimes in an almost ad hoc manner, and as a result data are stored in multiple stand-alone information silos – getting at, integrating and deriving useful information from organisational data stores can be an enormous undertaking.
. Increasing information overload. How many e-mails, phone calls and other sources of information are information players exposed to in a day? With 700 billion documents on the Web and employees receiving an average of 30 e-mails per day (Adams, 2003), “information overload” is becoming a serious and potentially expensive issue. Information users have to worry not only about the amount of data but also about its quality. Some estimates suggest that 60-80 per cent of organisational communications are not understood, resulting in $650 million to $1.3 billion in associated costs (Maitland, 2002). Given the demands placed on the modern information seeker’s attention, it is likely that he or she is not even aware of how much of the information received is outdated or of poor quality.
. Increased need and ability to store information. In the past few years, the ability to gather, store and retrieve information has progressed significantly. Over the same period, data storage costs have dropped rapidly and the understanding of information processing has increased immensely. These trends are creating markets and products that capitalise on the new capabilities. For example, in the US the annual sales of digital surveillance products and services are expected to reach $8.5 billion by the end of 2005, up from $5.7 billion in 2002 (Flynn, 2003). One UK company, National Car Parks, has installed 400 digital surveillance cameras in its car parks across Britain. This information gathering creates enormous data stores that need to be classified, catalogued and readily accessible to be useful. Without sound information retrieval taxonomies, much of the data will remain useless or, at the very least, under-utilised.
. Need for faster information processing. The tracking of international terrorism and last year’s outbreak of SARS in the Far East (and its rapid spread to the West) underscore the importance of being able to gather and process large amounts of information in a short period of time. In an effort to contain the spread of SARS, Hong Kong police had to keep track of massive amounts of information – including the three “w’s” (who, where and when) for all patients, family and close contacts of those who fell ill with the disease (Bradsher, 2003). Assuming that there were 6,000 cases in a 14-day exposure period, and that an average person comes into casual contact with just 20 people in that time, the three “w’s” need to be gathered, analysed and acted upon for 120,000 people (6,000 × 20)! This is certainly not possible using a detective’s notebook.
. Need for auditability and traceability. In the wake of the Enron, WorldCom and HealthSouth scandals, regulation is being introduced to govern how corporate data are handled and stored. In the US, the Sarbanes-Oxley Act (named after the bill’s sponsors, but officially the Public Company Accounting Reform and Investor Protection Act of 2002) requires that CEOs and chief financial officers sign and publicly attest to the validity of annual reports. This requirement has huge implications for the keepers of corporate information stores. Because the validity of information is now a board-level issue (and the penalty for non-compliance is jail), information processing must be fully auditable, and the information flow from source to printed report must be traceable – including authorisations and sign-offs. According to a recent CIO.com article (Worthen, 2003), 47 per cent of companies use stand-alone spreadsheets (personal information stores) for planning and budgeting. Clearly the use of personal information stores can lead to significant problems – data are not backed up or widely accessible and are prone to human error.
. Integration of organisational data-stores. Through company mergers and acquisitions and the rapid channel proliferation described above, the need to integrate organisational data-stores is becoming paramount. According to the analyst firm Gartner, Enterprise Business Integration is slated to grow to a $6.7 billion business by 2006 (Everett, 2002). However, integration is not just a technological problem, as employee work habits, organisational culture and organisational processes must also change as part of the effort. Software application and business integration is as much a technology issue as it is an information problem.
. The emergence of standards and new technologies. In every technology sector, from the systems-centric deployments of the 1960s through the mid-1980s to the rise of the PC and networks in the past 20 years, the emergence of standards has been a harbinger of a new era in computing. Systems standards led to the divergence of hardware and software, making the personal computer a reality, and PC standards have since enabled the spread of networked computing. And now data standards, such as XML, are fuelling a new age of computing based on information and content (Moschella, 2003). These emerging standards and new technologies such as Web services are having a huge impact on the way that firms think about data, integration, retrieval and analysis.
. Information security and privacy. Barely a week goes by without a new virus, worm or Trojan horse plaguing the Internet. What motivates a h4x0r (“hacker”) to destroy information and restrict others’ right of access? Every interaction that occurs on the Web is logged and tracked somewhere. What are the ethical considerations in using these digital footprints for research purposes? For commercial purposes? For military and national security?
. E-governance. Although we live in a multilateral world, particularly on the Internet, the governance of the World Wide Web is country specific. Online gambling is legal in the UK and Europe whereas it is not in the US; data privacy legislation differs by country; and enforcement of copyright protection varies. Even what is patentable varies by country. In an interconnected and global world, criminality is defined at the point of access. While businesses struggle to protect their brands and intellectual property rights and governments pass increasingly irrelevant laws, this disparity creates fertile ground for researchers interested in the nature of information seeking and retrieval.
Each of the contributors to this special issue begins to address the notion of information disparity in their own way. First, Hastings investigates the disparate stakeholder information needs in public service broadcasting. She argues that the dual objectives of competing for the information consumer’s narrowing attention while addressing society’s social need – coupled with the disparate information needs of wide and divergent constituencies – require new thinking and new research techniques. This debate becomes all the more relevant in the context of the BBC’s upcoming charter review in 2006. Boyd’s article presents a practical application of goal-based methodologies to optimise multi-channel information retrieval. This case study highlights some of the challenges that arise when seeking to measure disparate information channel usage. Levitt presents a new research technique for analysing and deriving meaning from disparate information sources. Coupled with the properties inherent to digitally searchable text (either documents or the World Wide Web), the multifaceted/magpie approach seeks to extrapolate composite understanding from seemingly disconnected units of knowledge that have been developed in other contexts. This issue concludes with two articles that investigate information disparity from a technical angle. Evans looks at emerging technology that aims to satisfy the information user’s need to consolidate their communications technologies and mediums. With relevance for both technically and non-technically inclined readers, Smith’s application of consequential theory serves as a thorough primer to e-security issues.
Clearly the trends outlined above are larger than a single issue can handle. As such, rather than presenting a definitive statement on the problem domain, it was this editor’s intention to present the early thinking that begins to frame the research issues.
Previously published in: Aslib Proceedings, Volume 56, Number 5, 2004
55 pages; ISBN 9781845441913
Title: Information Disparity
Author: Andrew Boyd