
Programming, C++


'The Static Model'

The one & only Static Model: 'a structure to store single & pair symbol occurrences, the spread or range of symbols & the total number of bytes in / out, fully sortable using qsort'

The Static Model Explained:

I use the C++ Static Model for byte analysis of large files. It consists of 2 nested structures and some variables, all contained within a 'Mother' structure. The first nested structure is a single array of 256 entries, one per byte value (ascii code 0-255); each entry stores the byte or char as its symbol together with a count, which is incremented every time that byte is read. The second nested structure is a multi-dimensional (256 x 256) pair array which works the same way for pairs of bytes: each entry stores both symbols of the pair and a count, which is incremented every time that pair of bytes is read. The 'Mother' structure itself then has individual members to hold information such as the range of different single or pair combinations and the total number of bytes in and out.

To understand how the model works for pairs of bytes, let's say the first 2 bytes read in are 'a' and 'b'. After declaring the new Static Model, StaticModel SM; and filling it with the relevant symbols for the ascii codes (see the code example below), we increment the pair count by 1 at the array location indexed by the ascii codes of the two symbols, thus: SM.Pair[nch][ch].count++; (where nch is the second char read in & ch is the first). To reference individual members of any of the nested structures use the '.' (dot) operator, e.g. to store the total number of file bytes in I simply use SM.bytes_in=ftell(stdin); which assigns the value to the bytes_in member of the Static Model SM (ftell reports the current read position, so this is done after reading and needs stdin to be redirected from a file). To increment the count member of the nested single array structure use SM.Single[ch].count++; or, for 'a' (ascii code 97), SM.Single[97].count++;.
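The counting step described above can be sketched roughly as follows. This is only a minimal sketch: make_model and count_buffer are hypothetical helper names (not part of the original code), the model is put on the heap because it is around a megabyte in size, and counting every overlapping pair is an assumption, since the text does not say whether pairs overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Same layout as the StaticModel declared in this article.
struct StaticModel {
    struct { long count; int symbol; } Single[256];
    struct { long count; int symbol_a; int symbol_b; } Pair[256][256];
    int single_range;
    int pair_range;
    long bytes_in;
    long bytes_out;
};

// Hypothetical helper: allocate a zeroed model on the heap and fill in
// the symbols, as the article's "fill symbol arrays" loops do.
std::unique_ptr<StaticModel> make_model()
{
    std::unique_ptr<StaticModel> SM(new StaticModel()); // () zero-initialises
    for (int i = 0; i < 256; i++) {
        SM->Single[i].symbol = i;
        for (int j = 0; j < 256; j++) {
            SM->Pair[i][j].symbol_a = i;
            SM->Pair[i][j].symbol_b = j;
        }
    }
    return SM;
}

// Hypothetical helper: count each byte and each overlapping byte pair.
// Bytes are read as unsigned char so that values above 127 never turn
// into negative array indexes.
void count_buffer(StaticModel &SM, const unsigned char *data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    for (std::size_t i = 0; i < size; i++) {
        unsigned char ch = data[i];            // first char of the pair
        SM.Single[ch].count++;
        if (i + 1 < size) {
            unsigned char nch = data[i + 1];   // second char of the pair
            SM.Pair[nch][ch].count++;          // article's order: [second][first]
        }
    }
    SM.bytes_in += static_cast<long>(size);
}
```

Feeding the 4 bytes 'a' 'b' 'a' 'b' through count_buffer would leave both single counts at 2 and the count at Pair['b']['a'] at 2, because the pair 'a' then 'b' occurs twice.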

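So that's my Static Model. I use it within a thread and sort both the single and pair array structures using qsort (see code example below). To sort the pair array structure use qsort((void *)SM.Pair, 65536, sizeof(SM.Pair[0][0]), comp );, where 65536 is the number of elements (256 x 256) and the third argument is the size of one element, not of the whole array. The qsort routine requires a comparison function (comp here) to be passed in from your source code. See the example code below for how I use it to sort even multi-dimensional array structures very efficiently. Because sorting moves the entries around, an entry's index no longer matches its ascii code afterwards, which is why each entry stores its own symbol(s). To use the code in a threaded environment you will need to include a declaration TMyThread *MyThread; in your Header File.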
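The comparison function itself is not shown on this page, so here is one possible sketch of it. It is an assumption, not the original comp: it orders entries by count, highest first, and relies on both entry types having long count as their first member, so the one function can serve both arrays. sort_model is a hypothetical wrapper name.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Same layout as the StaticModel declared in this article.
struct StaticModel {
    struct { long count; int symbol; } Single[256];
    struct { long count; int symbol_a; int symbol_b; } Pair[256][256];
    int single_range;
    int pair_range;
    long bytes_in;
    long bytes_out;
};

// Hypothetical comparison function (assumed ordering: highest count first).
// Both entry types begin with 'long count', so a pointer to an entry can
// be read as a pointer to its count.
int comp(const void *a, const void *b)
{
    long ca = *static_cast<const long *>(a);
    long cb = *static_cast<const long *>(b);
    if (ca > cb) return -1;   // a sorts before b
    if (ca < cb) return 1;    // a sorts after b
    return 0;
}

// Hypothetical wrapper: sort both arrays of a model.
void sort_model(StaticModel &SM)
{
    // 256 entries, each the size of one Single entry
    std::qsort(SM.Single, 256, sizeof(SM.Single[0]), comp);

    // the 256 x 256 Pair array is sorted as one flat run of 65536 entries
    std::qsort(SM.Pair, 256 * 256, sizeof(SM.Pair[0][0]), comp);

    // sanity check: the most frequent entry is now at the front
    assert(SM.Single[0].count >= SM.Single[255].count);
}
```

After sorting, SM.Single[0] and SM.Pair[0][0] hold the most frequent symbol and pair, and the symbol members say which bytes they were.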


The Static Model in operation after reading in an MPEG movie file and then sorting the single / pair counts

C++ Structure, Initialisation, Fill Array, Reset Memory & Qsort


/* declaration */

struct StaticModel{

 struct{
  long count;      // count of single byte occurrences
  int symbol;      // symbol matching ascii code
 }Single[256];     // Single[256] single, sortable, symbol counts

 struct{
  long count;      // count of pair byte occurrences
  int symbol_a;    // symbol matching ascii code (first index)
  int symbol_b;    // symbol matching ascii code (second index)
 }Pair[256][256];  // Pair[256][256] pairs, sortable, symbol counts

 int single_range; // range of single occurrences
 int pair_range;   // range of pair occurrences
 long bytes_in;    // count of all bytes in
 long bytes_out;   // count of all bytes out

};

/* required - place in main body of code
   (memset needs <string.h>, qsort needs <stdlib.h>) */

// init Static Model SM
StaticModel SM;

// clear SM memory
memset(&SM,0,sizeof(SM));

// fill symbol arrays
for(int i=0; i<256; i++)
 SM.Single[i].symbol=i;

for(int i=0; i<256; i++){
 for(int j=0; j<256; j++){
  SM.Pair[i][j].symbol_a=i;
  SM.Pair[i][j].symbol_b=j;
 }                                            
}

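// sort Single & Pair arrays using quick sort
// (comp is your own comparison function)
qsort((void *)SM.Single, 256, sizeof(SM.Single[0]), comp );

// powerful sort routine using pointer to array:
// the 256 x 256 Pair array is sorted as one run of 65536 elements
qsort((void *)SM.Pair, 65536, sizeof(SM.Pair[0][0]), comp );

// NB: the size of one array element (not the whole array) must be passed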
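
The listing above never fills in single_range and pair_range. The sketch below shows one way they might be set. It assumes that 'range' means the number of different symbols (or pairs) seen at least once, which is only my reading of the description. update_ranges and make_zero_model are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Same layout as the StaticModel declared in this article.
struct StaticModel {
    struct { long count; int symbol; } Single[256];
    struct { long count; int symbol_a; int symbol_b; } Pair[256][256];
    int single_range;
    int pair_range;
    long bytes_in;
    long bytes_out;
};

// Hypothetical helper: a zeroed model on the heap (it is too large to
// sit comfortably on every platform's stack).
std::unique_ptr<StaticModel> make_zero_model()
{
    return std::unique_ptr<StaticModel>(new StaticModel());
}

// Hypothetical helper, assuming "range" = number of entries with a
// non-zero count. Works whether or not the arrays have been sorted.
void update_ranges(StaticModel &SM)
{
    SM.single_range = 0;
    SM.pair_range = 0;
    for (int i = 0; i < 256; i++) {
        if (SM.Single[i].count > 0)
            SM.single_range++;
        for (int j = 0; j < 256; j++) {
            if (SM.Pair[i][j].count > 0)
                SM.pair_range++;
        }
    }
    // there can never be more distinct entries than array slots
    assert(SM.single_range <= 256 && SM.pair_range <= 65536);
}
```

With this reading, a file containing only the bytes 'a' and 'b' would give a single_range of 2, however long the file is.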