Next: Economics of Information Up: Information Policy Previous: Introduction

Definition of Information

 

Both data and information have acquired additional meanings as the computer has become a ubiquitous feature of offices, factories and now homes. For example, the word `data', which is now used as both a singular and a plural form, can be defined as facts, information, statistics, or the like, either historical or derived by calculation or experimentation. Alternatively, data can be defined as information, especially information organized for analysis or used as the basis for a decision.

To deal with the evolution of information technology we shall make a strong distinction between data and information. Data will be defined as any type of representation of an object or an event. Such a representation could be in the form of numbers, text, symbols, voice, or static or dynamic images. Most data represents only a very small number of the attributes of the object or the event in question. In a few cases, such as a book, data can be a complete representation. Let us consider several examples. A numeric example of data in the singular would be the quarterly revenues of a division of a corporation. Another example would be a scientific observation such as the position of a comet. A text example of data would be the description of a house in a real estate ad. A static image example would be a picture of an automobile. A dynamic image example of data would be a video such as a security video.

Originally most computers were used for business decisions and scientific analysis. Thus most data processed by computers was numerical. Data organized into data files and data banks was primarily numerical and was used for analysis and decision making. With the advance of information technology, computers can store and process increasing amounts of all types of data. With the advent of multimedia computers and massive storage, data files can be any combination of numbers, text, symbols, sound, and both still and dynamic images.

Indeed, any computer data file constitutes data under our definition. Our primary concern will be data for analysis and decision making. Many computer files are likely to be used only very occasionally for analysis or decision making. For example, a music professor making a study of the evolution of rock music could analyze a data file of rock music sound. Likewise a data file of digitized movies could be used to study the evolution of cinematography. Nevertheless, in most cases such files will be used for recreational purposes.

Information will be defined as useful data for a particular analysis or decision task. What is useful is specific both to the task and to the knowledge of the analyst (or decision maker). For many tasks, useful data consists of observations of behavior recorded as numerical measurements, text, symbols, voice messages, or images. Such observations enable the analyst to determine the current status of the observed behavior and to predict future behavior. The value of observations is frequently enhanced by processing. In comparing automobiles, a consumer does not want the wind tunnel test data used to determine the drag of a potential design but rather the actual gas mileage. An employer evaluating recent graduates frequently wants the earned GPA rather than a complete list of every grade. In analysis and decision making, much information is processed data.
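The GPA example can be made concrete with a minimal sketch. The 4-point grade scale, the credit weighting, and the transcript below are hypothetical assumptions chosen only for illustration; they are not taken from the text. The sketch shows a file of raw observations being reduced to the single processed figure an employer might actually use.

```python
# Hypothetical 4-point scale; real institutions use varying scales.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def gpa(transcript):
    """Return the credit-weighted mean of grade points.

    transcript is a list of (letter_grade, credits) pairs.
    """
    total_credits = sum(credits for _, credits in transcript)
    if total_credits == 0:
        raise ValueError("transcript has no credits")
    points = sum(GRADE_POINTS[grade] * credits for grade, credits in transcript)
    return points / total_credits


# Raw data: every grade earned (illustrative values only).
transcript = [("A", 3), ("B", 4), ("A", 3), ("C", 2)]

# Processed data: the one number used for the decision task.
summary = gpa(transcript)
```

The full transcript remains data; only the summary is information for this particular task, and a different task (say, admission to a mathematics program) might call for a different reduction of the same file.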

It should be pointed out that the purpose for which data is collected and its potential usefulness as information are not identical. For example, much data collected for administrative purposes by the government is used by academics for analysis. To some extent science is turned around: instead of collecting data to answer a question, the analyst starts from the data. Because proprietary rights and privacy greatly limit observations of the political economy, academics examining a data file collected for administrative purposes ask what questions this data base can resolve. Thus data can become information for a large number of analyses other than those intended by the creators of the data base.

With the advent of multimedia document files in data bases, these files too can become information for particular types of studies. For example, all data released by corporations, whether as numerical data or as multimedia corporate reports, is analyzed by investment analysts.

An important point concerning information is that information is generally a small subset of the data which could be collected in a data file. Consider buying an automobile. Observations could be made to the limit of the Heisenberg uncertainty principle on the position and motion of every atom. The potential buyer could be supplied with the exact composition of the paint and detailed specifications of every component of the car. Besides the details of the design, the potential buyer could be supplied with all the production details and every test report. Such a huge data file would totally overwhelm most buyers and might well confuse rather than illuminate their analysis of a possible purchase.

The relationship between information and knowledge needs to be clarified. The knowledge of behavior possessed by the analyst determines what data is information and what data is unintelligible noise. As in the example above, most consumers have the knowledge to meaningfully interpret only a tiny fraction of all the possible data that could be assembled concerning an automobile. Effective decision making requires the analyst to cull the information from the data at low cost.

The purpose of defining information as useful data is to focus attention on the conflict between privacy and other social concerns such as efficiency. From an economic perspective, the issue is the social value of the information to the analyst versus the social value of the analyst not being able to obtain the information.

A well-known definition of information is Shannon's information statistic of a message. Shannon's information statistic measures the randomness of a message. This statistic is very useful for designing information channels of the proper size to communicate messages without error. It also provides a lower bound on how far digital messages can be compressed without loss.

However, the focus on information in this book is not on Shannon's definition. Shannon's definition has no relationship to the meaning of the message. While this may seem a bit odd to readers not familiar with communication theory, it is valid. The communication engineer cannot concern himself or herself with whether the sender's message is a declaration of love or a string of nonsense syllables. The communication engineer uses Shannon's statistic to create a channel large enough to send the message, regardless of whether the message has any meaning whatsoever.
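The independence of Shannon's statistic from meaning can be seen in a small sketch. It estimates the entropy H = -sum(p log2 p) from the character frequencies of a single message, under the simplifying assumption that characters are drawn independently; this is an illustrative estimate, not a full treatment of Shannon's theory. The function name and the sample messages are hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(message: str) -> float:
    """Empirical Shannon entropy of the character frequencies, in bits per character."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A declaration of love, and the same characters rearranged into nonsense.
meaningful = "i love you"
scrambled = "".join(sorted(meaningful))

h_meaningful = entropy_bits_per_symbol(meaningful)
h_scrambled = entropy_bits_per_symbol(scrambled)

# Both messages have the same character frequencies, so the statistic
# agrees for them (up to floating-point rounding).
same = math.isclose(h_meaningful, h_scrambled)
```

Because the statistic depends only on the frequencies of the symbols, the meaningful message and its scrambled version are indistinguishable to it, which is exactly why it serves the communication engineer but not the analyst concerned with usefulness.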



Fred Norman
Mon 14 Dec 98