When reprinting, please cite the source: http://blog.csdn.net/ns_code/article/details/19174553
Huffman Tree Introduction
A Huffman tree, also known as an optimal binary tree, is the binary tree with the smallest weighted path length. Suppose there are n weights {w1, w2, …, wn}. Among all binary trees with n leaf nodes whose leaves carry the weights w1, w2, …, wn, the one with the smallest weighted path length is called the Huffman tree.
This relies on the concept of the weighted path length of a tree. The weighted path length of a tree is the sum, over all leaf nodes, of the path length from the leaf to the root multiplied by the weight of that leaf. If wi denotes the weight of the i-th leaf and li the length of the path from that leaf to the root, then the weighted path length of the binary tree is WPL = w1*l1 + w2*l2 + … + wn*ln.
Depending on the number of nodes and their weights, the shape of the Huffman tree differs. A Huffman tree has the following characteristics:
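As a quick check of the formula, the sketch below computes the WPL from two parallel arrays. The function name `weighted_path_length` and the sample values in the note that follows are illustrative assumptions, not part of the original article.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted path length: sum of w[i] * l[i] over all n leaves.
   w[i] is the weight of leaf i, l[i] its distance (in edges) from the root. */
long weighted_path_length(const int *w, const int *l, size_t n)
{
    assert(w != NULL && l != NULL);
    long wpl = 0;
    for (size_t i = 0; i < n; i++)
        wpl += (long)w[i] * l[i];
    return wpl;
}
```

For example, leaves with weights 7, 5, 2, 4 at depths 1, 2, 3, 3 give WPL = 7*1 + 5*2 + 2*3 + 4*3 = 35.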
- For the same set of weights, the resulting Huffman tree is not necessarily unique.
- The left and right subtrees of any node in a Huffman tree can be swapped, because this does not change the tree's weighted path length.
- All nodes that carry one of the given weights are leaf nodes; the nodes without a given weight are the roots of sub-binary-trees.
- The larger a node's weight, the closer it is to the root of the Huffman tree; the smaller its weight, the farther it is from the root.
- A Huffman tree contains only leaf nodes and nodes of degree 2; there are no nodes of degree 1.
- A Huffman tree with n leaf nodes has 2n-1 nodes in total.
Huffman Tree Construction
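The usual greedy construction, which the code later in this article follows, repeatedly takes the two smallest remaining weights, merges them under a new parent whose weight is their sum, and puts that parent back into the pool until a single tree remains. The sketch below tracks only the weights (no tree is built) and returns the sum of all merged weights, which equals the WPL of the resulting tree. The name `huffman_cost` and the scratch-buffer limit `MAX_WEIGHTS` are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WEIGHTS 64  /* arbitrary limit chosen for this sketch */

/* Returns the WPL of a Huffman tree over weights[0..n-1] by simulating
   the merges; returns -1 if n is outside 1..MAX_WEIGHTS. */
long huffman_cost(const int *weights, int n)
{
    long pool[MAX_WEIGHTS];
    long cost = 0;

    if (n < 1 || n > MAX_WEIGHTS)
        return -1;
    assert(weights != NULL);
    for (int i = 0; i < n; i++)
        pool[i] = weights[i];

    while (n > 1) {
        /* Find indices a, b of the smallest and second smallest entries. */
        int a = 0, b = 1;
        if (pool[b] < pool[a]) { a = 1; b = 0; }
        for (int i = 2; i < n; i++) {
            if (pool[i] < pool[a]) { b = a; a = i; }
            else if (pool[i] < pool[b]) { b = i; }
        }
        long merged = pool[a] + pool[b];
        cost += merged;
        /* Keep the merged weight, move the last entry into the freed slot. */
        pool[a] = merged;
        pool[b] = pool[n - 1];
        n--;
    }
    return cost;
}
```

For the weights 7, 5, 2, 4 the merges are 2+4=6, 5+6=11 and 7+11=18, so the function returns 6+11+18 = 35.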
Huffman Coding
C Implementation of Huffman Coding
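- #include <stdio.h>
- #include <stdlib.h>
- #include <string.h>
- /*
- Storage structure for the Huffman tree. It is an ordinary binary tree structure,
- and the same array-based layout can represent either a single tree or a forest.
- */
- typedef struct Node
- {
- int weight; // weight of the node
- int parent; // index of the parent node; -1 means this node is a root
- int lchild,rchild; // indices of the child nodes; -1 means this node is a leaf
- }HTNode,*HuffmanTree; // array used to store all nodes of the Huffman tree
- typedef char **HuffmanCode; // used to store the Huffman code of each leaf node
- // Prototypes, so that the functions below can be defined in any order.
- // The int & parameters are C++ references, so this code must be compiled as C++.
- void select_minium(HuffmanTree HT,int k,int &min1,int &min2);
- int min(HuffmanTree HT,int k);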
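To make the array layout concrete, the sketch below hand-builds the 7-node table for the weights 7, 5, 2, 4 and follows the `parent` links to measure how deep a node sits. The struct repeats the `HTNode` fields so the block compiles on its own as plain C; the helper name `leaf_depth` and the table `sample_tree` are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int weight;
    int parent;          /* -1 for the root */
    int lchild, rchild;  /* -1 for a leaf */
} HTNode;

/* Leaves occupy indices 0..3; merged nodes occupy 4..6, with the root last. */
static const HTNode sample_tree[7] = {
    { 7,  6, -1, -1},   /* 0: leaf, weight 7 */
    { 5,  5, -1, -1},   /* 1: leaf, weight 5 */
    { 2,  4, -1, -1},   /* 2: leaf, weight 2 */
    { 4,  4, -1, -1},   /* 3: leaf, weight 4 */
    { 6,  5,  2,  3},   /* 4: merges nodes 2 and 3 */
    {11,  6,  1,  4},   /* 5: merges nodes 1 and 4 */
    {18, -1,  0,  5},   /* 6: root, merges nodes 0 and 5 */
};

/* Number of edges between node i and the root, following parent links. */
int leaf_depth(const HTNode *ht, int i)
{
    int depth = 0;
    assert(ht != NULL && i >= 0);
    while (ht[i].parent != -1) {
        i = ht[i].parent;
        depth++;
    }
    return depth;
}
```

With n = 4 leaves the table has 2n-1 = 7 entries, and the leaf depths come out as 1, 2, 3, 3, so the lightest leaves sit farthest from the root.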
Following the construction steps for a Huffman tree, we can write the code that builds it as follows:
- /*
- Builds a Huffman tree from the n given weights, which are stored in the array wet
- */
- HuffmanTree create_HuffmanTree(int *wet,int n)
- {
- // A Huffman tree with n leaf nodes has 2n-1 nodes in total
- int total = 2*n-1;
- HuffmanTree HT = (HuffmanTree)malloc(total*sizeof(HTNode));
- if(!HT)
- {
- printf("HuffmanTree malloc failed!");
- exit(-1);
- }
- int i;
- // Every index is initialized to -1 below, so that when a parent, lchild or rchild
- // index is inspected in the encoding functions,
- // it cannot be confused with the index of any element of the HT array
- // HT[0], HT[1], ..., HT[n-1] hold the n leaf nodes to be encoded
- for(i=0;i<n;i++)
- {
- HT[i].parent = -1;
- HT[i].lchild = -1;
- HT[i].rchild = -1;
- HT[i].weight = *wet;
- wet++;
- }
- // HT[n], HT[n+1], ..., HT[2n-2] hold the roots of the intermediate binary trees built along the way
- for(;i<total;i++)
- {
- HT[i].parent = -1;
- HT[i].lchild = -1;
- HT[i].rchild = -1;
- HT[i].weight = 0;
- }
- int min1,min2; // indices of the two nodes with the smallest weights among those whose parent is -1 in each round
- // In each round the nodes min1 and min2 are merged into a small binary tree; the last merge yields the Huffman tree
- for(i=n;i<total;i++)
- {
- select_minium(HT,i,min1,min2);
- HT[min1].parent = i;
- HT[min2].parent = i;
- // The left and right children could be swapped here; the result is still a Huffman tree, but the resulting codes differ
- HT[i].lchild = min1;
- HT[i].rchild = min2;
- HT[i].weight =HT[min1].weight + HT[min2].weight;
- }
- return HT;
- }
- /*
- From the first k elements of the HT array, selects the two elements with the smallest weights whose parent is -1, and stores their indices in min1 and min2 respectively.
- */
- void select_minium(HuffmanTree HT,int k,int &min1,int &min2)
- {
- min1 = min(HT,k);
- min2 = min(HT,k);
- }
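`select_minium` takes `int &` parameters, which is C++ syntax. For readers who want to stay in plain C, a pointer-based variant that finds both minima in a single pass might look like the following. The name `select_two_min` is hypothetical, the struct is repeated so the block is self-contained, and unlike `min()` this version does not modify `parent`.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int weight;
    int parent;          /* -1 while the node is still a root */
    int lchild, rchild;
} HTNode;

/* Among ht[0..k-1], find the two nodes with parent == -1 that have the
   smallest weights. *min1 receives the index of the smallest, *min2 the
   index of the second smallest.
   Returns 0 on success, -1 if fewer than two such nodes exist. */
int select_two_min(const HTNode *ht, int k, int *min1, int *min2)
{
    int a = -1, b = -1;   /* indices of smallest and second smallest */
    assert(ht != NULL && min1 != NULL && min2 != NULL);
    for (int i = 0; i < k; i++) {
        if (ht[i].parent != -1)
            continue;                 /* already merged into a subtree */
        if (a == -1 || ht[i].weight < ht[a].weight) {
            b = a;
            a = i;
        } else if (b == -1 || ht[i].weight < ht[b].weight) {
            b = i;
        }
    }
    if (b == -1)
        return -1;
    *min1 = a;
    *min2 = b;
    return 0;
}
```

A caller would pass `&min1` and `&min2` and then set the `parent` field of both selected nodes itself, which removes the need for a temporary marker value.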
The min() function called here is implemented as follows:
- /*
- Selects, from the first k elements of the HT array, the element with the smallest weight whose parent is -1, and returns its index
- */
- int min(HuffmanTree HT,int k)
- {
- int i = 0;
- int min; // index of the element with the smallest weight whose parent is -1
- int min_weight; // weight of that element
- // First assign the weight of the first element whose parent is -1 to min_weight, for use in the comparisons below.
- // Note that the usual practice of starting with HT[0].weight does not work here:
- // if HT[0].weight is small, HT[0] is selected the first time a binary tree is constructed,
- // yet every later round would still use HT[0].weight as the initial minimum,
- // so HT[0] could be selected again, which is a logic error.
- while(HT[i].parent != -1)
- i++;
- min_weight = HT[i].weight;
- min = i;
- // Select the element with the smallest weight whose parent is -1, and store its index in min
- for(;i<k;i++)
- {
- if(HT[i].weight<min_weight && HT[i].parent==-1)
- {
- min_weight = HT[i].weight;
- min = i;
- }
- }
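- // After selecting the element with the smallest weight, set its parent to 1 (any value other than -1 works) so that the next call to min() excludes it;
- // create_HuffmanTree overwrites this placeholder with the real parent index afterwards.
- HT[min].parent = 1;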
- return min;
- }
- /*
- Walks from each leaf node up to the root to compute the Huffman codes of the n leaf nodes in the Huffman tree HT, and stores them in HC
- */
- void HuffmanCoding(HuffmanTree HT,HuffmanCode &HC,int n)
- {
- // Array of pointers, one per Huffman code string
- HC = (HuffmanCode)malloc(n*sizeof(char *));
- if(!HC)
- {
- printf("HuffmanCode malloc failed!");
- exit(-1);
- }
- // Temporary buffer that holds the code string currently being built
- // In a Huffman tree with n leaf nodes, no leaf's code is longer than n-1 characters,
- // and one more byte is needed for the terminating '\0', so the buffer has length n
- char *code = (char *)malloc(n*sizeof(char));
- if(!code)
- {
- printf("code malloc failed!");
- exit(-1);
- }
- code[n-1] = '\0'; // string terminator, stored in the last element of the buffer
- // Compute the Huffman code of each character
- int i;
- for(i=0;i<n;i++)
- {
- int current = i; // node currently being visited
- int father = HT[i].parent; // parent of the current node
- int start = n-1; // write position; initially the position of the terminator
- // Walk up the Huffman tree from the leaf node until the root is reached
- while(father != -1)
- {
- if(HT[father].lchild == current) // a left child contributes '0'
- code[--start] = '0';
- else // a right child contributes '1'
- code[--start] = '1';
- current = father;
- father = HT[father].parent;
- }
- // Allocate storage for the code string of the i-th character
- HC[i] = (char *)malloc((n-start)*sizeof(char));
- if(!HC[i])
- {
- printf("HC[i] malloc failed!");
- exit(-1);
- }
- // Copy the code string from code into HC
- strcpy(HC[i],code+start);
- }
- free(code); // release the temporary buffer used to build the code strings
- }
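The leaf-to-root walk can be tried in isolation on a hand-built table. The sketch below is plain C, writes one leaf's code into a caller-supplied buffer, and fills a scratch buffer from the end in the same way as `HuffmanCoding`. The helper name `encode_leaf`, the table `demo_tree` and the 64-leaf limit are assumptions for illustration; the struct is repeated so the block compiles on its own.

```c
#include <assert.h>
#include <string.h>

typedef struct Node {
    int weight;
    int parent;          /* -1 for the root */
    int lchild, rchild;  /* -1 for a leaf */
} HTNode;

/* Tree for the weights 7, 5, 2, 4: leaves at 0..3, root at index 6. */
static const HTNode demo_tree[7] = {
    { 7,  6, -1, -1}, { 5,  5, -1, -1}, { 2,  4, -1, -1}, { 4,  4, -1, -1},
    { 6,  5,  2,  3}, {11,  6,  1,  4}, {18, -1,  0,  5},
};

/* Writes the code of leaf `leaf` into out, which must hold at least n bytes
   (n = number of leaves), and returns out. */
char *encode_leaf(const HTNode *ht, int leaf, int n, char *out)
{
    char tmp[64];              /* scratch buffer; this sketch assumes n <= 64 */
    int start = n - 1;
    int current = leaf;
    int father = ht[leaf].parent;

    assert(n >= 1 && n <= 64);
    tmp[start] = '\0';
    while (father != -1) {
        /* A left edge contributes '0', a right edge '1', filled back to front. */
        tmp[--start] = (ht[father].lchild == current) ? '0' : '1';
        current = father;
        father = ht[father].parent;
    }
    strcpy(out, tmp + start);
    return out;
}
```

On this table the four leaves receive the codes 0, 10, 110 and 111, so the heaviest leaf gets the shortest code.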
- /*
- Traverses the Huffman tree HT from the root to the leaves, non-recursively and without a stack, to compute the Huffman codes of the n leaf nodes, and stores them in HC
- */
- void HuffmanCoding2(HuffmanTree HT,HuffmanCode &HC,int n)
- {
- // Array of pointers, one per Huffman code string
- HC = (HuffmanCode)malloc(n*sizeof(char *));
- if(!HC)
- {
- printf("HuffmanCode malloc failed!");
- exit(-1);
- }
- // Temporary buffer that holds the code string currently being built
- // In a Huffman tree with n leaf nodes, no leaf's code is longer than n-1 characters,
- // and one more byte is needed for the terminating '\0', so the buffer has length n
- char *code = (char *)malloc(n*sizeof(char));
- if(!code)
- {
- printf("code malloc failed!");
- exit(-1);
- }
- int cur = 2*n-2; // index of the node currently being visited; initially the root
- int code_len = 0; // current length of the code
- // Once the Huffman tree has been built, the weight field is reused as a per-node state flag during the traversal
- // weight = 0: neither child of the current node has been visited yet
- // weight = 1: the left child has been visited, the right child has not
- // weight = 2: both children have been visited
- int i;
- for(i=0;i<cur+1;i++)
- {
- HT[i].weight = 0;
- }
- // Start at the root; the traversal ends after it finally returns to the root
- // The loop exits when cur becomes the root's parent index, i.e. -1
- while(cur != -1)
- {
- // Neither child has been visited yet: go left first
- if(HT[cur].weight == 0)
- {
- HT[cur].weight = 1; // mark the left child as visited
- if(HT[cur].lchild != -1)
- { // not a leaf node: record a '0' and continue down to the left
- code[code_len++] = '0';
- cur = HT[cur].lchild;
- }
- else
- { // leaf node: terminate the code string and save it
- code[code_len] = '\0';
- HC[cur] = (char *)malloc((code_len+1)*sizeof(char));
- if(!HC[cur])
- {
- printf("HC[cur] malloc failed!");
- exit(-1);
- }
- strcpy(HC[cur],code); // copy the code string
- }
- }
- // The left child has been visited: now go right
- else if(HT[cur].weight == 1)
- {
- HT[cur].weight = 2; // mark both children as visited
- if(HT[cur].rchild != -1)
- { // not a leaf node: record a '1' and continue down to the right
- code[code_len++] = '1';
- cur = HT[cur].rchild;
- }
- }
- // Both children have been visited: go back to the parent and shorten the code by 1
- else
- {
- HT[cur].weight = 0;
- cur = HT[cur].parent;
- --code_len;
- }
- }
- free(code);
- }
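`HuffmanCoding2` overwrites every `weight` field while it traverses, so the weights can no longer be read from the tree afterwards. If recursion is acceptable, a root-to-leaf traversal that leaves the table untouched could be sketched as below. This is an alternative technique, not the article's stack-free method; the name `collect_codes` is hypothetical, and the caller is assumed to supply a scratch buffer of at least n bytes and an output array of n pointers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    int weight;
    int parent;          /* -1 for the root */
    int lchild, rchild;  /* -1 for a leaf */
} HTNode;

/* Recursively visits the subtree rooted at `node`. `path` holds the edge
   labels from the root down to `node`; `depth` is how many are valid.
   On reaching leaf i, a heap copy of the path is stored in codes[i].
   Returns 0 on success, -1 if an allocation fails. */
int collect_codes(const HTNode *ht, int node, char *path, int depth, char **codes)
{
    assert(ht != NULL && path != NULL && codes != NULL);
    if (ht[node].lchild == -1) {            /* leaf: save the finished code */
        path[depth] = '\0';
        codes[node] = (char *)malloc((size_t)depth + 1);
        if (codes[node] == NULL)
            return -1;
        strcpy(codes[node], path);
        return 0;
    }
    path[depth] = '0';                      /* descend into the left child */
    if (collect_codes(ht, ht[node].lchild, path, depth + 1, codes) != 0)
        return -1;
    path[depth] = '1';                      /* descend into the right child */
    return collect_codes(ht, ht[node].rchild, path, depth + 1, codes);
}
```

A caller would start at the root with `collect_codes(HT, 2*n-2, path, 0, codes)` and later `free` each `codes[i]`. The recursion depth is at most n-1, which is the trade-off against the stack-free version above.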