Entropy has a different meaning in information theory than it does in physics. In information theory, entropy is simply a measure of information content. People often encounter two issues when learning this concept. The first is confusing information entropy with the thermodynamic entropy you may have encountered in elementary physics courses. In such courses, entropy is often described as “the measure of disorder in a system.” This is usually followed by a discussion of the second law of thermodynamics, which states that “in a closed system, entropy tends to increase.” In other words, without the addition of energy, a closed system will become more disordered over time. Before you can understand information entropy, however, you need to firmly grasp that information entropy and thermodynamic entropy are not the same thing.

The second problem you might have with understanding information theory is that many references define the concept in different ways, some of which can seem contradictory. We will demonstrate some of the typical ways that information entropy is explained so that you can gain a complete understanding of these seemingly disparate explanations.

In information theory, entropy is the amount of information in a given message. This is simple, easy to understand, and, as you will see, essentially synonymous with other definitions you may encounter. Information entropy is sometimes described as “the number of bits required to communicate information.” In other words, if you represent a message in binary format, its entropy is the number of bits of information the message contains. It is entirely possible that a message contains some redundancy, or even data you already have (which, by definition, would not be information); thus the number of bits required to communicate the information can be less than the total number of bits in the message.
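This idea can be made concrete with Shannon's formula, H = −Σ p·log₂(p), which gives the average number of bits of information per symbol. Here is a minimal sketch in Python (the function name is my own, and the probabilities are simply estimated from symbol frequencies in the message):

```python
from collections import Counter
from math import log2

def shannon_entropy(message: str) -> float:
    """Average bits of information per symbol: H = -sum(p * log2(p)),
    with p estimated from the observed frequency of each symbol."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A repetitive message carries little information per symbol,
# while one with eight equally likely symbols carries log2(8) = 3 bits each.
print(shannon_entropy("aaaaaaab"))   # low: the message is mostly redundant
print(shannon_entropy("abcdefgh"))   # 3.0 bits per symbol
```

Note that a message consisting of a single repeated symbol has an entropy of zero: it tells you nothing you could not have predicted.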

This is actually the basis for lossless data compression, which seeks to remove redundancy from a message in order to compress it. The initial step is to establish the minimum number of bits required to communicate the information within the message—or, put another way, to calculate the information entropy. Many texts describe entropy as “the measure of uncertainty in a message.” You may be wondering, how can both of these definitions be true? Actually, they are both saying the same thing, as I’ll explain.

Let’s examine the definition that is most likely causing you some consternation: entropy as a measure of uncertainty. It might help you to think of it in the following manner: only uncertainty can provide information. For example, if I tell you that right now you are reading this article, this does not provide you with any new information. You already knew that, and there was absolutely no uncertainty regarding that issue. However, the content you are about to read in other articles is uncertain, and you don’t know what you will encounter—at least, not exactly. There is, therefore, some level of uncertainty, and thus information content. Put even more directly, uncertainty is information. If you are already certain about a given fact, no information is communicated to you. New information, by definition, resolves uncertainty you previously had. Thus, the measure of uncertainty in a message is the measure of information in a message.
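This point can be checked numerically for the simplest possible case, a single yes/no question. The sketch below (the function name is my own) shows that a question whose answer you already know carries zero bits, while a maximally uncertain one carries a full bit:

```python
from math import log2

def binary_entropy(p: float) -> float:
    """Entropy, in bits, of a yes/no question answered 'yes' with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # no uncertainty, so no information is conveyed
    return -(p * log2(p) + (1 - p) * log2(1 - p))

print(binary_entropy(1.0))   # 0.0: you were already certain of the answer
print(binary_entropy(0.5))   # 1.0: maximum uncertainty, one full bit
```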

In addition to information entropy, there are a few closely related concepts that you should understand:

**Joint entropy**

This is the entropy of two variables or two messages. If the two messages are truly independent, their joint entropy is just the sum of the two entropies.

**Conditional entropy**

If the two variables are not independent, but rather variable Y depends on variable X, then instead of joint entropy you have conditional entropy.

**Differential entropy**

This is a more complex topic and involves extending Shannon’s concept of entropy to cover continuous random variables, rather than discrete random variables.
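The additivity of joint entropy for independent variables can be verified with a small numerical sketch. Assuming two independent fair coin flips (a toy example of my own), the joint distribution is P(x, y) = P(x)·P(y), and the chain rule H(X, Y) = H(X) + H(Y|X) reduces to H(X) + H(Y):

```python
from itertools import product
from math import log2

def entropy(dist: dict) -> float:
    """Shannon entropy of a probability distribution (outcome -> probability)."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Two independent fair coins.
X = {"heads": 0.5, "tails": 0.5}
Y = {"heads": 0.5, "tails": 0.5}

# Under independence, the joint probability factors: P(x, y) = P(x) * P(y).
joint = {(x, y): X[x] * Y[y] for x, y in product(X, Y)}

# H(X) = H(Y) = 1 bit, and H(X, Y) = H(X) + H(Y) = 2 bits.
print(entropy(X), entropy(Y), entropy(joint))
```

When the variables are not independent, H(X, Y) is strictly less than H(X) + H(Y), and the shortfall is captured by the conditional entropy H(Y|X) = H(X, Y) − H(X).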


For example, even prehistoric people required information, such as locations for obtaining food and game. That information was peripheral to the tangible commodity of food. In this example, the food was the goal—the actual commodity. In the information age, the information itself is the commodity. If you reflect on this even briefly, I think you will concur that in modern times information itself is a product. Consider, for example, this book you now hold in your hands. Certainly the paper and ink used were not worth the price of the book. It is the information encoded on the pages that you pay for. In fact, you may have an electronic copy and not actually have purchased any paper and ink at all. If you are reading this book as part of a class, you paid tuition for the class. The commodity you purchased was the information transmitted to you by the professor or instructor (and, of course, augmented by the information in this book). So, clearly, information as a commodity can exist separately from computer technology. The efficient and effective transmission and storage of information, however, requires computer technology.

Still another perspective on the information age is the proliferation of information. Just a few decades ago, news meant a daily paper, or perhaps a 30-minute evening news broadcast. Today news is available 24 hours a day on several cable channels and on various websites. In my own childhood, research meant going to the local library and consulting a limited number of publications which were, hopefully, no more than ten years out of date. Now, with the click of a mouse button, you have access to scholarly journals, research websites, almanacs, dictionaries, encyclopedias—an avalanche of information. So we could view the information age as the age in which most people have ready access to a wide range of information.

Younger readers who have grown up with the Internet and cell phones, immersed in a sea of instant information, may not fully realize how much information has exploded. The more you appreciate the magnitude of the information explosion, the more you can appreciate the need for information theory. To give you some perspective on just how much information is being transmitted and consumed in our modern civilization, consider the following facts: As early as 2003, experts estimated that humanity had accumulated a little over 12 exabytes of data during the entire course of human history, while modern media, such as magnetic storage, print, and film, had produced 5 exabytes in just the first two years of the 21st century. In 2009, researchers claimed that in a single year, Americans consumed more than 3 zettabytes of information. As of 2013, the World Wide Web was said to hold 4 zettabytes, or 4,000 exabytes, of data.

These incredible scales of data can be daunting to grasp, but they should give you an idea of why information theory is so important. They should also clearly demonstrate that whether you measure the information age by the sheer amount of information we access, or by the fact that we value information itself as a commodity, we are truly in the Information Age.
