#13
Re: New compression method
Quote:
A few definitions first: let c be the compression function, so c(f) is the compressed version of f, and let d be the decompression function, so d(e) is the decompressed version of e. Since the compression is lossless, we have d(c(p)) = p for all p.

Lemma 1: In a lossless compression scheme, at most 2^k different files can be represented in k bits.

Proof: Assume for the sake of contradiction that 2^k + 1 distinct files can be represented in k bits. Since there are only 2^k possible bit strings of length k, by the pigeonhole principle two distinct files (say f1 and f2) must compress to the same output e. Formally: c(f1) = c(f2) = e, and f1 <> f2 (I use <> for "not equal"). Applying the function d to both sides, we get d(c(f1)) = d(c(f2)) = d(e), and by our cancellation law d(c(p)) = p this gives f1 = f2, which contradicts the assumption that f1 <> f2. Thus the lemma holds.

Applying Lemma 1 to the range of sizes given, we conclude that at most

2^(8*508) + 2^(8*509) + 2^(8*510) + 2^(8*511) + 2^(8*512) + 2^(8*513) + 2^(8*514) + 2^(8*515)
= 2^(8*508) * (1 + 2^8 + 2^16 + 2^24 + 2^32 + 2^40 + 2^48 + 2^56)
< 2^(8*508) * 2^57
= 2^4121

distinct files can be compressed to sizes between 508 and 515 bytes. Now, there are 2^(8*1024*1024) = 2^8388608 files of size 1 MB. With a little division, we see that at most one in every 2^8388608 / 2^4121 = 2^8384487 of them can be compressed to within that range. AT THE MOST. For reference, there are only on the order of 2^265 atoms in the entire universe.

On the other hand, it is perfectly possible that your compression is amazing for certain types of files. It just isn't mathematically possible for it to be _that_ amazing for an arbitrary file.

EDIT: Before someone argues that the lemma breaks down because it doesn't take into account running the algorithm multiple times, consider the following. Let c be the one-pass compressor, and d be the one-pass decompressor.
Then we can write (in pseudocode):

function cmult(f) {
    g = c(f)
    while (size(g) < size(f)) {
        f = g
        g = c(f)
    }
    return f
}

Thus cmult keeps compressing with whatever algorithm you choose until the output stops getting smaller. Then just replace c with cmult in the proof of the lemma: cmult is itself a single deterministic lossless compressor, so the same counting argument applies to it.

Last edited by rbayer : 07-09-2004 at 23:37. Reason: repeated function application note
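Both halves of the argument can be checked concretely. Below is a small sketch of my own (not the original poster's code): it verifies the counting bound with Python's arbitrary-precision integers, and implements cmult using zlib as a stand-in one-pass compressor, since the algorithm under discussion isn't available. On incompressible (random) input, repeated compression gains nothing, exactly as the lemma predicts.

```python
# Sanity-check of the counting argument, plus a cmult demo.
# zlib is an illustrative stand-in compressor, not the poster's algorithm.
import os
import zlib

# (a) Counting bound: number of bit strings of length 508..515 bytes.
outputs = sum(2 ** (8 * n) for n in range(508, 516))
assert outputs < 2 ** 4121                 # the bound derived above

files_1mb = 2 ** (8 * 1024 * 1024)         # number of distinct 1 MB files
# At most one in every 2^8384487 one-megabyte files can land in that range:
assert (files_1mb // outputs).bit_length() - 1 >= 8384487

# (b) cmult: apply the one-pass compressor until output stops shrinking.
def cmult(f: bytes, c=zlib.compress) -> bytes:
    g = c(f)
    while len(g) < len(f):
        f, g = g, c(g)
    return f

# Random data is incompressible: cmult returns it unchanged in size.
blob = os.urandom(1 << 20)
print(len(cmult(blob)) == len(blob))  # True

# Highly redundant data shrinks dramatically, then stalls.
print(len(cmult(b"a" * (1 << 20))) < 2048)  # True
```

The key point the code makes explicit: iterating the compressor is just another (single) lossless compressor, so it is bound by the same pigeonhole count as one pass.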