Quote:
|
Originally Posted by Alan Anderson
If a file has redundancy, or repeating patterns, it contains less information than if it had no such patterns. The redundancy can be removed and the file can be represented in a smaller form. Text tends to exhibit statistical patterns. Executable programs and pictures and audio have other patterns. Truly random data by definition has no redundancy, and thus can't be compressed without losing information.
|
That isn't entirely true. Of course, as stated many times in this thread, no compression scheme can reduce the size of all possible files, or, put another way, consistently reduce the size of a stream of random data. The problem with Mr. Anderson's explanation is that the idea of "truly random data" doesn't really bear on the matter. Given a finite piece of data, how does one decide whether it's random or has patterns? In general, it's impossible, but one way we can sort finite pieces of data by randomness is to choose a compression scheme and check whether it can compress that piece of data. Of course, different compression schemes will not all yield the same classification of which data counts as random.
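To make that concrete, here's a minimal sketch of the idea, assuming Python's standard zlib module as the arbitrary choice of compression scheme (the function name looks_random is just mine, not anything standard): call a piece of data "random" relative to zlib if zlib can't make it any smaller.

import os
import zlib

def looks_random(data: bytes) -> bool:
    """'Random' relative to one arbitrary scheme (zlib):
    the compressed form is no shorter than the original."""
    return len(zlib.compress(data, 9)) >= len(data)

text = b"the quick brown fox jumps over the lazy dog " * 100  # repetitive text
noise = os.urandom(4096)                                      # random bytes from the OS

print(looks_random(text))   # almost certainly False: the repetition compresses well
print(looks_random(noise))  # almost certainly True: zlib adds overhead instead of shrinking it

Swap zlib for bzip2, LZMA, or anything else and the borderline cases will be classified differently, which is exactly the point: "random" here only means "random with respect to the scheme we happened to pick."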
The reason we can sort of decide whether data is random is that we can easily spot some types of patterns and use those as our "compression scheme."
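For instance, a toy "compression scheme" that only knows about one kind of pattern, runs of a repeated byte, already sorts some data as non-random. The run_length_encode helper below is purely illustrative, not any particular library's API:

def run_length_encode(data: bytes) -> bytes:
    """Encode runs of a repeated byte as (count, byte) pairs, capping counts at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

print(len(run_length_encode(b"a" * 1000)))  # 8 bytes: long runs are the one pattern it can remove
print(len(run_length_encode(b"abcdef")))    # 12 bytes: no runs, so the "compressed" form is bigger

Data full of long runs looks highly non-random to this scheme, while data with no runs at all looks "random" to it even if a smarter scheme (like zlib above) would find other patterns to exploit.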