Dangerous Minds
Harringtons Compression Method (HCM)

Posted by: Einstein on Monday, August 25, 2008 - 08:51 PM PST

The Harrington Compression Method (HCM) creates a unique filing system which sorts data into temporary files whose binary contents have wildly divergent ratios of 1's to 0's. This allows the HCM to compress Random Binary Data, also known as Entropic Data.



Harrington Compression Method



The Harrington Compression Method, henceforth known as HCM, is a repeatable, self-tabulating, statistical compression method unlike any other compression system in use today. HCM incorporates a built-in dictionary which allows the user to run the system repeatedly on individual files, or subfiles, via triggers for certain events, and includes command sections for each built-in. As a result, HCM allows for nearly endless variations and possibilities. Most importantly of all, the degree of file compression is far greater than that of any existing compression software. In short, the HCM is a revolutionary compression system.


This is a White Paper intended for peer review of the basic fundamentals behind the system. Michael Hugh Harrington reserves all rights®.

The method:

1: Using a standard Huffman table:


11 = 111

10 = 110

01 = 10

00 = 0


Additional variations as needed can be utilized.


When all four 2-bit pairs are equally likely, this table produces an average of 9 bits for every 8 initial bits.
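The 9-for-8 figure can be checked by encoding every possible 8-bit input with the table above. This is a minimal sketch; the names `STEP1_TABLE` and `encode_pairs` are illustrative and not part of the white paper.

```python
from itertools import product

# Step 1 table: each 2-bit input pair maps to a prefix-free codeword.
STEP1_TABLE = {"11": "111", "10": "110", "01": "10", "00": "0"}


def encode_pairs(bits: str) -> str:
    """Replace every 2-bit pair of `bits` with its codeword."""
    if len(bits) % 2:
        raise ValueError("input length must be even")
    return "".join(STEP1_TABLE[bits[i:i + 2]] for i in range(0, len(bits), 2))


# Average over every possible 8-bit input, i.e. a uniform 50/50 source.
all_bytes = ["".join(b) for b in product("01", repeat=8)]
average_output_bits = sum(len(encode_pairs(b)) for b in all_bytes) / len(all_bytes)
print(average_output_bits)  # 9.0
```

For example, `encode_pairs("11100100")` gives `111110100`, which is 9 bits for the 8 bits supplied.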



2: Using HCM, we make different 'files' out of the encoded information. The first bit of every codeword goes to File #1. Each 1 in a codeword (until three bits are used or a 0 occurs) sends the next bit of that codeword to the next 'file': the second bit goes to File #2 and the third bit to File #3. In this context each 'file' stands for a temporary file used to hold the information.


Table 1 below shows this division for the four codewords (111, 110, 10, 0) of the standard conversion table in step 1.


File #1:   1   1   1   0

File #2:   1   1   0

File #3:   1   0

Table 1


Note the ratios in each 'file'. File #1 has a 75% ratio of 1's to a 25% ratio of 0's. File #2 has a 66.7% ratio of 1's to a 33.3% ratio of 0's. File #3 has a 50% to 50% ratio. Of course this is purely a statistical representation as we will have a large file being compressed with many more results added to the files in question.
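The division into 'files' can be sketched as follows, assuming (as Table 1 suggests) that the n-th bit of each codeword is appended to File #n. The function names are hypothetical.

```python
def split_codewords(codewords):
    """Distribute each codeword's bits over three temporary 'files'.

    Bit 1 of every codeword goes to File #1, bit 2 (if present) to
    File #2, and bit 3 (if present) to File #3.
    """
    files = ["", "", ""]
    for word in codewords:
        for position, bit in enumerate(word):
            files[position] += bit
    return files


def ones_ratio(bits: str) -> float:
    """Fraction of the bits that are 1."""
    return bits.count("1") / len(bits)


file1, file2, file3 = split_codewords(["111", "110", "10", "0"])
print(file1, file2, file3)  # 1110 110 10
print(ones_ratio(file1), ones_ratio(file3))  # 0.75 0.5
```

Feeding one of each codeword reproduces the 75%, 66.7% and 50% ratios quoted above.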


If the original data had a preexisting imbalance of 1's to 0's, we could conceivably have an imbalance that would affect the whole. We will presume for our current example that the imbalance favors the 1's, that the system itself favors 1's (1's and 0's can be interchanged at will), and that if 0's were predominant a single bit in a command section could switch 1's for 0's and make 1's predominant again. This imbalance will lead to even greater imbalances than 75% in File #1 and File #2, and File #3 will then show an imbalance equal to the original imbalance.
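A minimal sketch of the single-bit switch described here, assuming the flag is stored as one character alongside the data. The function names and the flag encoding are illustrative only.

```python
# Translation table that swaps every 0 for 1 and every 1 for 0.
SWAP = str.maketrans("01", "10")


def favour_ones(bits: str):
    """Return (flag, bits); flag '1' means the bits were inverted."""
    zeros = bits.count("0")
    ones = len(bits) - zeros
    if zeros > ones:
        return "1", bits.translate(SWAP)
    return "0", bits


def undo_favour_ones(flag: str, bits: str) -> str:
    """Reverse favour_ones using the stored flag."""
    return bits.translate(SWAP) if flag == "1" else bits


flag, data = favour_ones("0001")
print(flag, data)  # 1 1110
```

Because the inversion is its own inverse, the one flag bit is enough to restore the original bits.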


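3: Repeat the conversion and step #2 on the contents of File #1. The results will astonish you. Using a sub-filing system of 1.1, 1.2, 1.3, 2, 3 we end up with very interesting results across the whole spectrum. The shares below take File #1 to divide in the same 4 : 3 : 2 proportion as the first pass.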


File 1.1 will have 19.75% of the results and a ratio imbalance of 93.75% to 6.25%.


File 1.2 will have 14.81% of the results and a ratio imbalance of 80% to 20%


File 1.3 will have 9.87% of the results and a ratio imbalance of 75% to 25%


File 2 will have 33.33% of the results and a ratio imbalance of 66.7% to 33.3%


File 3 will have 22.22% of the results and a ratio imbalance of 50% to 50%
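The ratio imbalances quoted for Files 1.1, 1.2 and 1.3 can be reproduced with a short calculation. This sketch assumes the bits of File #1 are independent with a 75% chance of being 1, and that File #1 is converted with the step 1 table before being split; both are assumptions, and the function name is hypothetical.

```python
from fractions import Fraction


def subfile_ones_ratios(p):
    """Ones-ratio of each sub-file for independent input bits with P(1) = p."""
    q = 1 - p
    pair = {"11": p * p, "10": p * q, "01": q * p, "00": q * q}
    starts_with_1 = pair["11"] + pair["10"] + pair["01"]  # codewords 111, 110, 10
    second_is_1 = pair["11"] + pair["10"]                 # codewords 111, 110
    return (
        starts_with_1,                # first bits  -> sub-file .1
        second_is_1 / starts_with_1,  # second bits -> sub-file .2
        pair["11"] / second_is_1,     # third bits  -> sub-file .3
    )


ratios = subfile_ones_ratios(Fraction(3, 4))
print([float(r) for r in ratios])  # [0.9375, 0.8, 0.75]
```

With `Fraction(1, 2)` the same function returns 3/4, 2/3 and 1/2, the ratios of the first pass.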


These variables are critically important due to the mathematics involved in how a Huffman can create compression.


Take the example below:


11 = 1

10 = 01

01 = 001

00 = 000


This table relies upon a statistical imbalance of 1's to 0's. The formula is, if the proportion of 1's = A and the proportion of 0's = B, ( (A*A) + (A*B*2) + (B*A*3) + (B*B*3) ) / 2 = C. Each term is the probability of a 2-bit pair multiplied by the length of its codeword, and the division by 2 converts the total to output bits per input bit. C is the fraction of the original file size that remains. Using File 1.1 as an example, we have: ( (.9375 * .9375) + (.9375 * .0625 * 2) + (.0625 * .9375 * 3) + (.0625 * .0625 * 3) ) / 2 = 0.591796875, or 59.1796875% of the original size of File 1.1. This is, of course, a statistical figure, since all outcomes have variances. However, in very large files the end result will fall within approximately +/- .1% of it. This represents a dramatic decrease in the size of File #1.1.
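The formula can be evaluated exactly with rational arithmetic. This is a sketch of the calculation only; the function name `remaining_fraction` is an assumption, and the formula treats successive bits as independent.

```python
from fractions import Fraction


def remaining_fraction(a):
    """Evaluate C = (A*A*1 + A*B*2 + B*A*3 + B*B*3) / 2 with B = 1 - A.

    The multipliers 1, 2, 3, 3 are the codeword lengths for the
    pairs 11, 10, 01 and 00 in the table above.
    """
    b = 1 - a
    return (a * a * 1 + a * b * 2 + b * a * 3 + b * b * 3) / 2


print(float(remaining_fraction(Fraction(15, 16))))  # 0.591796875
print(float(remaining_fraction(Fraction(4, 5))))    # 0.78
```

The first line is the File 1.1 example (A = .9375); the second applies the same formula to an 80% ratio.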



4: Apply the following replacement method to every file except File #3 (a command section notation can allow it for File #3 when its ratio exceeds 65% to 35%):



11 = 1

10 = 01

01 = 001

00 = 000
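The replacement table is a prefix-free code, so it can be applied and undone mechanically. The sketch below assumes an even-length input; the names are illustrative.

```python
STEP4_TABLE = {"11": "1", "10": "01", "01": "001", "00": "000"}
STEP4_REVERSE = {code: pair for pair, code in STEP4_TABLE.items()}


def replace_pairs(bits: str) -> str:
    """Apply the replacement table to each 2-bit pair."""
    if len(bits) % 2:
        raise ValueError("input length must be even")
    return "".join(STEP4_TABLE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def restore_pairs(coded: str) -> str:
    """Undo replace_pairs by reading one codeword at a time."""
    pairs, current = [], ""
    for bit in coded:
        current += bit
        if current in STEP4_REVERSE:
            pairs.append(STEP4_REVERSE[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(pairs)


print(replace_pairs("1111101111"))  # 110111
```

A 1-heavy input such as `1111101111` shrinks from 10 bits to 6, while an input of all 0's would grow by half.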


We get the following sizes per file, statistically:


File 1.1 = 1.05 bits

File 1.2 = 1.04 bits

File 1.3 = .75 bits

File 2 = 2.83 bits

File 3 = 2 bits


Statistically speaking there are 7.67 bits for every original 8 bits. This is based upon a 50% to 50% ratio of 1's to 0's in the original code. This represents a 4% reduction in the size of our original file prior to the next portion below. No other compression system can do this on 'random' or 'entropic' data.



5: Command Section

We now need a command section to handle all the different files, changes, etc. Each file will have a specific number of bits. This can be easily represented with a simplistic counting system allowing for maximum space savings. If < 1 kilobyte then 00, if less than 1 megabyte then 01, if less than 1 gigabyte then 10, if greater than 1 gigabyte then 11.


This can be expanded for custom applications if needed to handle terabyte or larger sizes. If the marker is 00, then 10 bits count the number of bits in the file; if 01, then 20 bits; if 10, then 30 bits; and if 11, then 40 bits. Repeat for each file (1.1, 1.2, and so on). Then attach the files to each other; they are now accounted for exactly. We might also include a switch for compressing File #3. We might also include a switch to flip 0 for 1, and/or 11 for 00 results. We might include other switches as needed. We should also include a command section entry recording the number of compression cycles completed.
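One way to lay out the size marker and bit count is sketched below. It assumes the thresholds are 2^10, 2^20 and 2^30 bits, so that the 10-, 20- and 30-bit count fields are always wide enough; the text's kilobyte, megabyte and gigabyte wording may intend different cut-offs. The function names are hypothetical.

```python
# (marker, width of the count field in bits), smallest first.
FIELD_WIDTHS = (("00", 10), ("01", 20), ("10", 30), ("11", 40))


def write_length_header(n_bits: int) -> str:
    """Return the 2-bit marker followed by the bit count in binary."""
    for marker, width in FIELD_WIDTHS:
        if n_bits < 2 ** width:
            return marker + format(n_bits, f"0{width}b")
    raise ValueError("file too large for a 40-bit count field")


def read_length_header(stream: str):
    """Return (n_bits, header_length) parsed from the front of `stream`."""
    width = dict(FIELD_WIDTHS)[stream[:2]]
    return int(stream[2:2 + width], 2), 2 + width


print(write_length_header(5))  # 000000000101
```

Reading the marker first tells the decoder how many count bits follow, so the files can be concatenated without separators.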


6: Reversing the Compression

Our compression results can be undone by exactly going in reverse of our system. The command section will point the way accurately. There can be no errors whatsoever so long as the code is written correctly. It has an absolute lossless function built-in.
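Reversal of steps 1 and 2 can be sketched as a round trip: File #1 is read one bit at a time, and each 1 tells the decoder to take the next bit from the following file. The function names are illustrative, and the input is assumed to have an even number of bits.

```python
ENCODE = {"11": "111", "10": "110", "01": "10", "00": "0"}
DECODE = {code: pair for pair, code in ENCODE.items()}


def convert_and_split(bits: str):
    """Steps 1 and 2: encode each pair, then file the codeword bits."""
    files = ["", "", ""]
    for i in range(0, len(bits), 2):
        for position, bit in enumerate(ENCODE[bits[i:i + 2]]):
            files[position] += bit
    return files


def merge_and_restore(file1: str, file2: str, file3: str) -> str:
    """Rebuild each codeword from the three files and decode it."""
    second, third = iter(file2), iter(file3)
    pairs = []
    for first_bit in file1:
        word = first_bit
        if word == "1":
            word += next(second)       # a leading 1 means File #2 holds a bit
            if word == "11":
                word += next(third)    # a second 1 means File #3 holds a bit
        pairs.append(DECODE[word])
    return "".join(pairs)


original = "1110010011"
files = convert_and_split(original)
print(files)  # ['11101', '1101', '101']
print(merge_and_restore(*files) == original)  # True
```

The round trip covers only the conversion and filing steps; the command section and any later cycles would need their own inverse.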



How is this all possible? There are several factors to consider.


Look at only the numbers listed in Table 2 below:


Source:    111   110   10    0

File #1:    1     1     1    0

File #2:    1     1     0

File #3:    1     0

Table 2


As you can see, any imbalance in the actual source will NOT hurt the whole. If all 110 results are missing, File #3 can carry a command section note to only count the 1's. If a single 110 contributes a 0 among a slew of 1's, we can still obtain a near 50% reduction in size via the standard replacement noted in section 4. Other replacement schemes can make our file as small as desired if the ratio is truly imbalanced. This can be hard-coded and triggered as needed. If 10 results are missing then File #2 compresses to a much greater extent. Even if 111 results are entirely absent, we can simply flip 11 for 00 at the beginning and record the switch with a command section note. Many modifications can be installed to affect the initial base code, which will make any preexisting imbalance a nonissue. All of this can be easily coded into a command section. Our command section can be large or small, based upon our needs. Tree triggers can be incorporated as needed to allow for unusual contingencies, which will lead to even better compression ratios. In short, this method can outperform all the current compression systems on its own. Any variation in the two-bit ratios will not, cannot, hinder the system. This leads to a situation where only the number of cycles, and the command section, truly limit how far a binary sequence can be compressed.


The objection of the pigeonhole principle naturally arises at this point. However, all the steps are reversible and lead IMMEDIATELY back to the same original code. No two results will be exactly the same. It is mathematically not possible. The HCM system leads completely back to the data as originally inputted.


What this means is that data currently thought of as entropic is not entropic for our purposes. Entropy exists at much lower levels than previously expected for this system. This does not mean 1 = all data. Neither does 0. It means that a new flexible view of data needs to be researched and examined.


I liken the compression to being a camera in a wall in a fog filled super-sized room. If you try to look too far you won't see to the end. However, if you move the wall in, the fog thickens, but you can see further than you did before. Eventually there is a point where the fog is so thick that any forward movement will not gain you any viewing distance, and even eventually the fog grows so thick you cannot see at all.


~~~~~~~~~~~



Another imperfect analogy is to think of the following:


If there was a way to make a dictionary with no cost, and have these possible results of 6 bits or less inside of it . . .


# # # # # #

# # # # #

# # # #

# # #

# #

#


. . . then we have the following outcomes possible:


64

32

16

8

4

2


This adds up to 126, or in another form (2^7)-2

This is from 1 to 6 bits in length, and seems adequate. Statistically we will range between 5 and 6 bits for this example.
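The total of 126 can be confirmed by counting the bit strings of each length from 1 to 6. This is a plain arithmetic check; the variable names are illustrative.

```python
# Number of distinct bit strings of each length from 1 to 6.
counts = {length: 2 ** length for length in range(1, 7)}
total = sum(counts.values())

print(counts)                  # {1: 2, 2: 4, 3: 8, 4: 16, 5: 32, 6: 64}
print(total, total == 2 ** 7 - 2)  # 126 True
```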


Any bit size can be used in this scheme.


Finally, suppose our dictionary could tell multiple 2 bit and 1 bit results apart:

# #

#

This is 6 points (four 2 bit results and two 1 bit results).

Three of them in a row give 6^3 or 216 results, and the whole can vary from 3 bits to 6 bits in length. Statistically we will range between 4 and 6 bits for this identification of nearly 8 bits!
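The count of 216 and the 3 to 6 bit range can be checked by listing every sequence of three results. The sketch assumes, as the analogy does, that the dictionary can distinguish the six results at no cost; the names are illustrative.

```python
from itertools import product

# The six distinguishable results: two 1-bit and four 2-bit strings.
SYMBOLS = ["0", "1", "00", "01", "10", "11"]

triples = list(product(SYMBOLS, repeat=3))
lengths = [sum(len(symbol) for symbol in triple) for triple in triples]

print(len(triples), min(lengths), max(lengths))  # 216 3 6
```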




Re: Harringtons Compression Method (HCM) (Score: 1)
by Einstein (michaelhh@gmail.com) on Aug 25, 2008 - 08:57 PM
HTTP://Security1.free2host.net/Compress/Compressstart.php
Oh and by the way:



PATENT PENDING!


