Histograms

The Hist class gives a simple implementation of one-dimensional histograms, useful for quick-and-dirty testing, without the need to link to more sophisticated packages. For this reason it is used in many of the sample main programs found in the examples subdirectory.

Basic principles

We here provide a simple overview of what is involved. As a first step you need to declare a histogram, giving it a variable name and specifying its title, number of bins and x range (from, to).
 
   Hist ZpT( "Z0 pT spectrum", 100, 0., 100.); 
Alternatively you can first declare it and later define it:
 
   Hist ZpT; 
   ZpT.book( "Z0 pT spectrum", 100, 0., 100.); 
Once declared, its contents can be added by repeated calls to fill,
 
   ZpT.fill( 22.7, 1.); 
where the first argument is the x value and the second the weight. Since the weight defaults to 1 the last argument could have been omitted in this case.
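
To make the bookkeeping behind fill concrete, the following is a minimal sketch of how an x value could be mapped onto a bin. The function binIndex is a hypothetical helper, not part of the Hist interface. It uses the index convention described for getBinContent in the method overview (1 through numberOfBins for regular bins, 0 for underflow, numberOfBins + 1 for overflow). The treatment of values exactly on an edge is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of Hist: map an x value to a bin index.
// Regular bins are numbered 1..nBin, 0 is underflow, nBin + 1 is overflow.
int binIndex(double x, int nBin, double xMin, double xMax) {
  assert(nBin > 0 && xMax > xMin);
  if (x < xMin) return 0;                // underflow
  if (x >= xMax) return nBin + 1;        // overflow (assumed: upper edge excluded)
  double width = (xMax - xMin) / nBin;   // all bins have the same width
  int iBin = 1 + static_cast<int>(std::floor((x - xMin) / width));
  return iBin > nBin ? nBin : iBin;      // guard against rounding at the top edge
}
```

With the booking used above (100 bins from 0 to 100), the value 22.7 lands in bin 23 under these assumptions.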

A set of overloaded operators has been defined, so that histograms can be added to, subtracted from, multiplied by or divided by each other. The contents are then modified accordingly, bin by bin. Thus the relative deviation between two histograms data and theory can be found as

 
  diff = (data - theory) / (data + theory); 
assuming that diff, data and theory have been booked with the same number of bins and x range. That responsibility rests on the user; some checks are made for compatibility, but not enough to catch all possible mistakes.
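
What this expression amounts to, bin by bin, can be sketched without the Hist class at all. In the sketch below plain vectors stand in for the bin contents, and relativeDeviation is a hypothetical name. How Hist itself treats a vanishing denominator is not stated here, so setting such bins to zero is purely an assumption of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for (data - theory) / (data + theory), evaluated bin by bin.
// The vectors hold bin contents only; this is not the Hist class.
std::vector<double> relativeDeviation(const std::vector<double>& data,
                                      const std::vector<double>& theory) {
  assert(data.size() == theory.size());  // same binning is the caller's job
  std::vector<double> diff(data.size(), 0.0);
  for (std::size_t i = 0; i < data.size(); ++i) {
    double sum = data[i] + theory[i];
    // Assumption of this sketch: a bin with zero denominator is left at 0.
    if (sum != 0.0) diff[i] = (data[i] - theory[i]) / sum;
  }
  return diff;
}
```

For example, a bin with data = 30 and theory = 10 gives (30 - 10) / (30 + 10) = 0.5.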

Overloaded operations with real numbers of type double are also available. Again these four operations are defined bin by bin, i.e. the corresponding amount is added to, subtracted from, multiplied by or divided by each bin. The double number can come before or after the histogram, with obvious results. Thus the inverse of a histogram result is given by 1. / result. The two kinds of operations can be combined, e.g.

 
  allpT = ZpT + 2. * WpT; 
Finally, the +=, -=, *= and /= operators are also overloaded, with the right-hand side being either a histogram or a real number.
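
The way such mixed expressions resolve can be illustrated with a small stand-alone type. MiniHist below is a hypothetical miniature, introduced only for this example and unrelated to the actual Hist implementation. It defines just enough free operators to evaluate a sum of the form h1 + 2. * h2 and an inverse of the form 1. / h. Empty bins are not treated specially in the division.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature type for illustration only (not the Hist class).
struct MiniHist {
  std::vector<double> bins;
};

// histogram + histogram, bin by bin
MiniHist operator+(const MiniHist& a, const MiniHist& b) {
  assert(a.bins.size() == b.bins.size());
  MiniHist r = a;
  for (std::size_t i = 0; i < r.bins.size(); ++i) r.bins[i] += b.bins[i];
  return r;
}

// number * histogram: every bin is scaled by the same factor
MiniHist operator*(double f, const MiniHist& h) {
  MiniHist r = h;
  for (double& c : r.bins) c *= f;
  return r;
}

// histogram * number: same result as number * histogram
MiniHist operator*(const MiniHist& h, double f) { return f * h; }

// number / histogram: gives the bin-by-bin inverse for f = 1
// (empty bins are not treated specially in this sketch)
MiniHist operator/(double f, const MiniHist& h) {
  MiniHist r = h;
  for (double& c : r.bins) c = f / c;
  return r;
}
```

Because the multiplication binds more tightly than the addition, the second histogram is scaled first and the sum is taken afterwards, exactly as for ordinary numbers.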

Output format

A histogram can be printed by making use of the overloaded << operator, e.g.:

 
   cout << ZpT; 
The printout format is inspired by the old HBOOK one. To understand how to read this format, consider the simplified example
 
 
        3.50*10^ 2  9 
        3.00*10^ 2  X   7 
        2.50*10^ 2  X  1X 
        2.00*10^ 2  X6 XX 
        1.50*10^ 2  XX5XX 
        1.00*10^ 2  XXXXX 
        0.50*10^ 2  XXXXX 
 
          Contents 
            *10^ 2  31122 
            *10^ 1  47208 
            *10^ 0  79373 
 
          Low edge  -- 
            *10^ 1  10001 
            *10^ 0  05050 
The key feature is that the Contents and Low edge have to be read vertically. For instance, the first bin has the contents 3 * 10^2 + 4 * 10^1 + 7 * 10^0 = 347. Correspondingly, the other bins have contents 179, 123, 207 and 283. The first bin stretches from -(1 * 10^1 + 0 * 10^0) = -10 to the beginning of the second bin, at -(0 * 10^1 + 5 * 10^0) = -5.
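
The vertical reading rule can be spelled out as a few lines of code. The helper readVertical is hypothetical and only decodes digit rows typed in by hand; it is not a parser for the full printout. The first row is taken to be the highest power of ten and the last row the 10^0 digits, with an optional sign row in which a '-' marks a negative column.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: combine digit rows, read column by column.
// digitRows[0] holds the highest power of ten, the last row holds 10^0.
// A '-' in signRow marks the corresponding column as negative.
std::vector<long> readVertical(const std::vector<std::string>& digitRows,
                               const std::string& signRow = "") {
  assert(!digitRows.empty());
  std::size_t nCol = digitRows[0].size();
  std::vector<long> values(nCol, 0);
  for (const std::string& row : digitRows) {
    assert(row.size() == nCol);
    for (std::size_t i = 0; i < nCol; ++i) {
      assert(row[i] >= '0' && row[i] <= '9');
      values[i] = 10 * values[i] + (row[i] - '0');  // shift and append digit
    }
  }
  for (std::size_t i = 0; i < nCol && i < signRow.size(); ++i)
    if (signRow[i] == '-') values[i] = -values[i];
  return values;
}
```

Applied to the three Contents rows of the example this gives 347, 179, 123, 207 and 283, and applied to the two Low edge rows with the sign row "--" it gives -10, -5, 0, 5 and 10.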

The visual representation above the contents gives a simple impression of the shape. An X means that the contents are filled up to this level, while a digit in the topmost row of a column gives the fraction, in tenths, to which the last level is filled. So the 9 of the first column indicates that this bin is filled 9/10 of the way from 3.00*10^2 = 300 to 3.50*10^2 = 350, i.e. somewhere close to 345, or more precisely in the range 342.5 to 347.5.
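
The arithmetic behind this estimate is short enough to write down. The helper topDigitRange is hypothetical; it takes the lower value and the height of the level in which the digit appears and returns the range of contents consistent with that digit, assuming the digit is the nearest tenth.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: range of bin contents implied by the top digit of a
// column, for a level that spans levelLow to levelLow + levelStep.
std::pair<double, double> topDigitRange(double levelLow, double levelStep,
                                        int digit) {
  assert(digit >= 0 && digit <= 9 && levelStep > 0.0);
  double centre = levelLow + digit * levelStep / 10.0;  // digit tenths of a level
  double half = levelStep / 20.0;                       // half of one tenth
  return {centre - half, centre + half};
}
```

For the first column of the example, topDigitRange(300., 50., 9) returns the pair 342.5 and 347.5, which indeed brackets the true contents of 347.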

The printout also provides some other information, such as the number of entries, i.e. how many times the histogram has been filled, the total weight inside the histogram, the total weight in underflow and overflow, and the mean value and root-mean-square width (disregarding underflow and overflow). The mean and width are calculated as if all the contents of a bin were located at its center. This is especially relevant when you plot an integer quantity, such as a multiplicity. Then it makes sense to book with limits that are half-integers, e.g.

 
   Hist multMPI( "number of multiparton interactions", 20, -0.5, 19.5); 
so that the bins are centered at 0, 1, 2, ..., respectively. This also avoids ambiguities about which bin is filled when an entry lies exactly at the border between two bins. Also note that the fill( xValue) method automatically performs a cast to double precision where necessary, i.e. xValue can be an integer.
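
The effect of placing all contents at the bin centers can be checked with a small calculation. The function meanAndWidth below is a hypothetical stand-in that works on a plain vector of regular bin contents. It takes the width to be sqrt(<x^2> - <x>^2), which is an assumption about what root-mean-square width means here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in: mean and width from regular bin contents, with all
// the weight of a bin placed at its centre. Underflow and overflow are not
// part of the input. Width is taken as sqrt(<x^2> - <x>^2) (an assumption).
std::pair<double, double> meanAndWidth(const std::vector<double>& bins,
                                       double xMin, double xMax) {
  assert(!bins.empty() && xMax > xMin);
  double dx = (xMax - xMin) / bins.size();
  double sumW = 0.0, sumWX = 0.0, sumWX2 = 0.0;
  for (std::size_t i = 0; i < bins.size(); ++i) {
    double xMid = xMin + (i + 0.5) * dx;  // centre of bin i
    sumW += bins[i];
    sumWX += bins[i] * xMid;
    sumWX2 += bins[i] * xMid * xMid;
  }
  if (sumW == 0.0) return {0.0, 0.0};     // empty histogram
  double mean = sumWX / sumW;
  double var = sumWX2 / sumW - mean * mean;
  return {mean, std::sqrt(var > 0.0 ? var : 0.0)};
}
```

With three bins booked from -0.5 to 2.5 the centers are exactly 0, 1 and 2, so a multiplicity distribution is reproduced without any shift. Booking the same three bins from 0 to 3 would instead move every entry by half a unit in the mean.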

The methods

We here collect a more complete and formal overview of the methods.

Hist::Hist()  
declares a histogram, but does not define it.

Hist::Hist(string title, int numberOfBins, double xMin, double xMax)  
declare and define a histogram, where
argument title : is a string with the title of the histogram at output,
argument numberOfBins : is the number of bins the x range will be subdivided into, limited to be at most 1000,
argument xMin : is the lower edge of the histogram,
argument xMax : is the upper edge of the histogram.

Hist::Hist(const Hist& h)  
creates an identical copy of the histogram in the argument, including bin contents.

Hist::Hist(string title, const Hist& h)  
creates an identical copy of the histogram in the argument, including bin contents, except that a new title is provided as first argument.

Hist& Hist::operator=(const Hist& h)  
copies all properties of the histogram in the argument, except that the original histogram title is retained.

void Hist::book(string title, int numberOfBins, double xMin, double xMax)  
define a histogram that previously was only declared; see above for the meaning of the arguments.

void Hist::title(string title)  
change the title of a histogram, but keep other properties unchanged.

void Hist::null()  
reset bin contents, but keep other histogram properties unchanged.

void Hist::fill(double xValue, double weight)  
fill the histogram, where
argument xValue : is the x position where the filling should occur, and
argument weight (default = 1.) : is the amount of weight to be added at this x value.

friend ostream& operator<<(ostream& os, const Hist& h)  
appends a simple histogram printout (see above for format) to the ostream, while leaving the histogram object itself unchanged. At most 100 columns are allowed to be displayed. If the number of bins is larger than 100 then the contents of adjacent bins are added to give the value in each column. (Two by two up to 200 bins, three by three up to 300, and so on, with the very last column possibly summing fewer bins than the others.)
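
The grouping rule in the parenthesis can be sketched as follows. The function groupForPrintout is hypothetical and only reproduces the rule as stated here, with the group size chosen as the smallest integer that brings the number of columns down to at most 100.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the column grouping: sum adjacent bins so that at
// most maxCol columns remain. The last column may collect fewer bins.
std::vector<double> groupForPrintout(const std::vector<double>& bins,
                                     std::size_t maxCol = 100) {
  assert(maxCol > 0);
  if (bins.empty()) return {};
  std::size_t group = 1 + (bins.size() - 1) / maxCol;    // bins per column
  std::size_t nCol = (bins.size() + group - 1) / group;  // rounded up
  std::vector<double> cols(nCol, 0.0);
  for (std::size_t i = 0; i < bins.size(); ++i) cols[i / group] += bins[i];
  return cols;
}
```

For instance, 250 bins are grouped three by three into 84 columns, of which the last one contains a single bin.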

void Hist::table(ostream& os = cout, bool printOverUnder = false, bool xMidBin = true)  
void Hist::table(string fileName, bool printOverUnder = false, bool xMidBin = true)  
print a two-column table, where the first column gives the center of each bin and the second one the corresponding bin contents. The table may be useful for plotting e.g. with Gnuplot.
The desired output stream or file name can be provided as argument. The former is more flexible (e.g., it allows easy appending to an existing file), whereas the latter is simpler for the case that each histogram should be a file of its own.
An optional printOverUnder = true argument allows also underflow and overflow contents to be printed. (The arbitrary x coordinates for these are placed as if corresponding to same-size bins just below or above the regular histogram bins.)
An optional xMidBin = false argument will have the x value at the beginning of each bin printed, rather than the default midpoint value.
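
The role of the xMidBin switch can be illustrated with a stand-alone sketch. The function twoColumnTable is hypothetical and works on a plain vector of bin contents. The number formatting is simply the stream default and is not meant to reproduce the layout written by Hist::table, and underflow and overflow are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch of a two-column table: x value and bin contents.
// xMidBin = true uses the bin centre, false the lower bin edge.
std::string twoColumnTable(const std::vector<double>& bins, double xMin,
                           double xMax, bool xMidBin = true) {
  assert(!bins.empty() && xMax > xMin);
  double dx = (xMax - xMin) / bins.size();
  std::ostringstream os;
  for (std::size_t i = 0; i < bins.size(); ++i) {
    double x = xMin + (xMidBin ? i + 0.5 : static_cast<double>(i)) * dx;
    os << x << ' ' << bins[i] << '\n';  // default stream formatting (assumed)
  }
  return os.str();
}
```

For two bins between 0 and 2 the first column reads 0.5 and 1.5 with the default setting, and 0 and 1 with xMidBin = false.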

void Hist::rivetTable(ostream& os = cout, bool printError = false)  
void Hist::rivetTable(string fileName, bool printError = false)  
print a five-column table, where the first two columns give the lower and upper borders of each bin, the third one the bin contents, and the fourth and fifth the error (up and down) associated with the contents. This format matches the one that Rivet uses for its histograms. The choice between the two methods is the same as above for the table methods.
The error bins are put to zero by default, since the PYTHIA histogramming is not sophisticated enough to compensate for rescalings or other operations, or for weighted events. With the optional printError = true the error will be taken as the square root of the bin content, as is relevant if this content has the same unit weight for each entry to it.
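
The content of the five columns, and the effect of printError, can be sketched as below. The function fiveColumnRows is hypothetical and returns the numbers of each row rather than writing them, so nothing is implied about the exact text layout that Rivet expects. The square-root error is only meaningful for unit-weight entries, as stated above.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: one row per bin with lower edge, upper edge,
// contents, error up and error down. Errors are zero unless printError is
// set, in which case they are the square root of the contents.
std::vector<std::array<double, 5>> fiveColumnRows(
    const std::vector<double>& bins, double xMin, double xMax,
    bool printError = false) {
  assert(!bins.empty() && xMax > xMin);
  double dx = (xMax - xMin) / bins.size();
  std::vector<std::array<double, 5>> rows;
  for (std::size_t i = 0; i < bins.size(); ++i) {
    double err = (printError && bins[i] > 0.0) ? std::sqrt(bins[i]) : 0.0;
    std::array<double, 5> row = {{xMin + i * dx, xMin + (i + 1) * dx,
                                  bins[i], err, err}};
    rows.push_back(row);
  }
  return rows;
}
```

A bin with 4 unit-weight entries thus gets an error of 2 in both directions when printError is true, and 0 otherwise.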

friend void table(const Hist& h1, const Hist& h2, ostream& os = cout, bool printOverUnder = false, bool xMidBin = true)  
friend void table(const Hist& h1, const Hist& h2, string fileName, bool printOverUnder = false, bool xMidBin = true)  
print a three-column table, where the first column gives the center of each bin and the second and third ones the corresponding bin contents of the two histograms. Only works if the two histograms have the same x axis (within a tiny tolerance), else nothing will be done. The optional last two arguments allow also underflow and overflow contents to be printed, and the x to refer to the beginning of the bin rather than the center; see above.

string Hist::getTitle()  
return the title of the histogram.

double Hist::getBinContent(int iBin)  
return the value in bin iBin, ranging from 1 through numberOfBins, with 0 for underflow and numberOfBins + 1 for overflow.

int Hist::getEntries()  
return the number of entries, i.e. the number of times that fill(...) has been called.

bool Hist::sameSize(const Hist& h)  
checks that the number of bins and upper and lower limits are the same as in the histogram in the argument.

void Hist::takeLog(bool tenLog = true)  
by default take the 10-logarithm of the current contents bin by bin. With the optional argument false instead take the e-logarithm of the contents bin by bin. If used at all, this should be done right before the histogram is output.

void Hist::takeSqrt()  
take square root of current contents bin by bin, with negative contents set to zero.

Hist& Hist::operator+=(const Hist& h)  
Hist& Hist::operator-=(const Hist& h)  
adds the contents of the histogram in the argument to the current histogram, or subtracts them from it, bin by bin, if sameSize(...) is true, else does nothing.
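
The guarded behaviour, where an incompatible right-hand side leaves the histogram untouched, can be sketched with a small stand-alone type. BookedHist, its sameSize function and addInPlace are hypothetical names for this example only, and the numerical tolerance on the limits is an assumed value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical miniature of a booked histogram (not the Hist class).
struct BookedHist {
  int nBin;
  double xMin, xMax;
  std::vector<double> bins;
};

// Same number of bins and same limits, within an assumed small tolerance.
bool sameSize(const BookedHist& a, const BookedHist& b) {
  const double tol = 1e-9 * std::fabs(a.xMax - a.xMin);
  return a.nBin == b.nBin && std::fabs(a.xMin - b.xMin) <= tol
      && std::fabs(a.xMax - b.xMax) <= tol;
}

// Counterpart of += in this sketch: add b to a bin by bin, or leave a
// unchanged if the two bookings are not compatible.
BookedHist& addInPlace(BookedHist& a, const BookedHist& b) {
  if (!sameSize(a, b)) return a;  // incompatible: do nothing
  assert(a.bins.size() == b.bins.size());
  for (std::size_t i = 0; i < a.bins.size(); ++i) a.bins[i] += b.bins[i];
  return a;
}
```

Note that a silent no-op of this kind is easy to overlook, so checking the booking of both histograms beforehand is advisable.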

Hist& Hist::operator*=(const Hist& h)  
Hist& Hist::operator/=(const Hist& h)  
multiplies or divides the current histogram by the contents of the histogram in the argument if sameSize(...) is true, else does nothing.

Hist& Hist::operator+=(double f)  
Hist& Hist::operator-=(double f)  
adds the common offset f to each bin content, or subtracts it from each bin content.

Hist& Hist::operator*=(double f)  
Hist& Hist::operator/=(double f)  
multiplies or divides each bin content by the common factor f.

friend Hist operator+(double f, const Hist& h1)  
friend Hist operator+(const Hist& h1, double f)  
friend Hist operator+(const Hist& h1, const Hist& h2)  
add a constant to a histogram or two histograms to each other, bin by bin.

friend Hist operator-(double f, const Hist& h1)  
friend Hist operator-(const Hist& h1, double f)  
friend Hist operator-(const Hist& h1, const Hist& h2)  
subtract a histogram from a constant, a constant from a histogram, or two histograms from each other, bin by bin.

friend Hist operator*(double f, const Hist& h1)  
friend Hist operator*(const Hist& h1, double f)  
friend Hist operator*(const Hist& h1, const Hist& h2)  
multiply a constant by a histogram or two histograms by each other, bin by bin.

friend Hist operator/(double f, const Hist& h1)  
friend Hist operator/(const Hist& h1, double f)  
friend Hist operator/(const Hist& h1, const Hist& h2)  
divide a constant by a histogram, a histogram by a constant, or two histograms by each other, bin by bin.