Calculate the information entropy (Shannon entropy) of a given input string.

Entropy is the expected value of the measure of information content in a system. In general, the Shannon entropy of a variable *X* is defined as:

*H*(*X*) = − Σ_{i} *P*(*x*_{i}) log_{b} *P*(*x*_{i})

where the information content *I*(*x*) = − log_{b} *P*(*x*). If the base of the logarithm *b* = 2, the result is expressed in *bits*, a unit of information. Therefore, given a string *S* of length *n* where *P*(*s*_{i}) is the relative frequency of each character, the entropy of the string in bits is:

*H*(*S*) = − Σ_{i} *P*(*s*_{i}) log_{2} *P*(*s*_{i})

For this task, use `1223334444` as an example. The result should be around 1.84644 bits.

```cpp
#include <string>
#include <map>
#include <iostream>
#include <cmath>

int main( int argc, char *argv[] ) {
    // Use the command-line argument if given, otherwise the task's example.
    std::string teststring = ( argc > 1 ) ? argv[1] : "1223334444";

    // Count the occurrences of each character.
    std::map<char, int> frequencies;
    for ( char c : teststring )
        frequencies[c]++;

    // Sum -P(s_i) * log2(P(s_i)) over the distinct characters.
    const double numlen = teststring.length();
    double infocontent = 0;
    for ( const auto &p : frequencies ) {
        double freq = p.second / numlen;
        infocontent -= freq * std::log2( freq );
    }

    std::cout << "The information content of " << teststring
              << " is " << infocontent << " !\n";
    return 0;
}
```

- Output:

The information content of 1223334444 is 1.84644 !

Content is available under GNU Free Documentation License 1.2.