Histograms
The Hist
class gives a simple implementation of
one-dimensional histograms, useful for quick-and-dirty testing,
without the need to link to more sophisticated packages.
For this reason it is used in many of the
sample main programs
found in the examples
subdirectory.
Basic principles
We here provide a simple overview of what is involved.
As a first step you need to declare a histogram, with name,
title, number of bins and x range (from, to).
Hist ZpT( "Z0 pT spectrum", 100, 0., 100.);
Alternatively you can first declare it and later define it:
Hist ZpT;
ZpT.book( "Z0 pT spectrum", 100, 0., 100.);
Once declared, its contents can be added by repeated calls to
fill
,
ZpT.fill( 22.7, 1.);
where the first argument is the x value and the second the
weight. Since the weight defaults to 1 the last argument could have
been omitted in this case.
A set of overloaded operators have been defined, so that histograms
can be added, subtracted, divided or multiplied by each other. Then the
contents are modified accordingly bin by bin. Thus the relative
deviation between two histograms data
and
theory
can be found as
diff = (data - theory) / (data + theory);
assuming that diff
, data
and theory
have been booked with the same number of bins and x range. That
responsibility rests on the user; some checks are made for compatibility,
but not enough to catch all possible mistakes.
Also overloaded operations with double real numbers are available.
Again these four operations are defined bin by bin, i.e. the
corresponding amount is added to, subtracted from, multiplied by or
divided by each bin. The double number can come before or after the
histograms, with obvious results. Thus the inverse of a histogram
result
is given by 1. / result
.
The two kind of operations can be combined, e.g.
allpT = ZpT + 2. * WpT
Finally, also the +=, -+, *=, /=
are overloaded, with
the right-hand side being either a histogram or a real number.
Output format
A histogram can be printed by making use of the overloaded <<
operator, e.g.:
cout << ZpT;
The printout format is inspired by the old HBOOK one. To understand
how to read this format, consider the simplified example
3.50*10^ 2 9
3.00*10^ 2 X 7
2.50*10^ 2 X 1X
2.00*10^ 2 X6 XX
1.50*10^ 2 XX5XX
1.00*10^ 2 XXXXX
0.50*10^ 2 XXXXX
Contents
*10^ 2 31122
*10^ 1 47208
*10^ 0 79373
Low edge --
*10^ 1 10001
*10^ 0 05050
The key feature is that the Contents
and
Low edge
have to be read vertically. For instance,
the first bin has the contents
3 * 10^2 + 4 * 10^1 + 7 * 10^0 = 347
. Correspondingly,
the other bins have contents 179, 123, 207 and 283. The first bin
stretches from -(1 * 10^1 + 0 * 10^0) = -10
to the
beginning of the second bin, at -(0 * 10^1 + 5 * 10^0) = -5
.
The visual representation above the contents give a simple impression
of the shape. An X
means that the contents are filled up
to this level, a digit in the topmost row the fraction to which the
last level is filled. So the 9 of the first column indicates this bin
is filled 9/10 of the way from 3.00*10^2 = 300
to
3.50*10^2 = 350
, i.e. somewhere close to 345,
or more precisely in the range 342.5 to 347.5.
The printout also provides some other information, such as the
number of entries, i.e. how many times the histogram has been filled,
the total weight inside the histogram, the total weight in underflow
and overflow, and the mean value and root-mean-square width (disregarding
underflow and overflow). The mean and width assumes that all the
contents is in the middle of the respective bin. This is especially
relevant when you plot a integer quantity, such as a multiplicity.
Then it makes sense to book with limits that are half-integers, e.g.
Hist multMPI( "number of multiparton interactions", 20, -0.5, 19.5);
so that the bins are centered at 0, 1, 2, ..., respectively. This also
avoids ambiguities which bin gets to be filled if entries are
exactly at the border between two bins. Also note that the
fill( xValue)
method automatically performs a cast
to double precision where necessary, i.e. xValue
can be an integer.
The methods
We here collect a more complete and formal overview of the methods.
Hist::Hist()
declare a histogram, but does not define it.
Hist::Hist(string title, int numberOfBins, double xMin, double xMax)
declare and define a histogram, where
argument
title :
is a string with the title of the histogram at output,
argument
numberOfBins :
is the number of bin the x range will be subdivided into,
limited to be at most 1000,
argument
xMin :
is the lower edge of the histogram,
argument
xMax :
is the upper edge of the histogram.
Hist::Hist(const Hist& h)
creates an identical copy of the histogram in the argument,
including bin contents.
Hist::Hist(string title, const Hist& h)
creates an identical copy of the histogram in the argument,
including bin contents, except that a new title is provided
as first argument.
Hist& Hist::operator=(const Hist& h)
copies all properties of the histogram in the argument,
except that the original histogram title is retained.
void Hist::book(string title, int numberOfBins, double xMin, double xMax)
define a histogram that previously was only declared;
see above for the meaning of the arguments.
void Hist::title(string title)
change the title of a histogram, but keep other properties unchanged.
void Hist::null()
reset bin contents, but keep other histogram properties unchanged.
void Hist::fill(double xValue, double weight)
fill the histogram, where
argument
xValue :
is the x position where the filling should occur, and
argument
weight (default = 1.
) :
is the amount of weight to be added at this x value.
friend ostream& operator<<(ostream& os, const Hist& h)
appends a simple histogram printout (see above for format) to the
ostream
, while leaving the histogram object itself
unchanged. At most 100 columns are allowed to be displayed.
If the number of bins is larger than 100 then the contents of
adjacent bins are added to give the value in each column. (Two by two
up to 200 bins, three by three up to 300, and so on, with the very
last column possibly summing fewer rows than the others.)
void Hist::table(ostream& os = cout, bool printOverUnder = false, bool xMidBin = true)
void Hist::table(string fileName, bool printOverUnder = false, bool xMidBin = true)
print a two-column table, where the first column gives the center of
each bin and the second one the corresponding bin contents. The table
may be useful for plotting e.g. with Gnuplot.
The desired output stream or file name can be provided as argument.
The former is more flexible (e.g., it allows easy append to an existing
file), whereas the latter is simpler for the case that each histogram
should be a file of its own.
An optional printOverUnder = true
argument allows also
underflow and overflow contents to be printed. (The arbitrary x
coordinates for these are placed as if corresponding to same-size bins
just below or above the regular histogram bins.)
An optional xMidBin = false
argument will have the
x value at the beginning of each bin printed, rather than the
default midpoint value.
void Hist::rivetTable(ostream& os = cout, bool printError = false)
void Hist::rivetTable(string fileName, bool printError = false)
print a five-column table, where the first two columns give the lower
and upper borders of each bin, the third one the bin contents, and the
fourth and fifth the error (up and down) associated with the contents.
This format matches the one that Rivet uses for its histograms.
The choice between the two methods is the same as above for the
table
methods.
The error bins are put to zero by default, since the PYTHIA
histogramming is not sophisticated enough to compensate for rescalings
or other operations, or for weighted events. With the optional
printError = true
the error will be taken as the
square root of the bin content, as is relevant if this content has
the same unit weight for each entry to it.
friend void table(const Hist& h1, const Hist& h2, ostream& os = cout, bool printOverUnder = false, bool xMidBin = true)
friend void table(const Hist& h1, const Hist& h2, string fileName, bool printOverUnder = false, bool xMidBin = true)
print a three-column table, where the first column gives the center of
each bin and the second and third ones the corresponding bin contents
of the two histograms. Only works if the two histograms have the same
x axis (within a tiny tolerance), else nothing will be done.
The optional last two arguments allows also underflow and overflow
contents to be printed, and the x to refer to the beginning
of the bin rather than the center; see above.
string Hist::getTitle()
return the title of the histogram.
double Hist::getBinContent(int iBin)
return the value in bin iBin
, ranging from 1 through
numberOfBins
, with 0
for underflow and
numberOfBins + 1
for overflow.
int Hist::getEntries()
return the number of entries, i.e. the number of time that
fill(...)
has been called.
bool Hist::sameSize(const Hist& h)
checks that the number of bins and upper and lower limits are the
same as in the histogram in the argument.
void Hist::takeLog(bool tenLog = true)
by default take 10-logarithm of current contents bin by bin. With
optional argument false
instead take e-logarithm
of contents bin by bin. If to be used, then right before the
histogram is output.
void Hist::takeSqrt()
take square root of current contents bin by bin, with negative contents
set to zero.
Hist& Hist::operator+=(const Hist& h)
Hist& Hist::operator-=(const Hist& h)
adds or subtracts the current histogram by the contents of the
histogram in the argument if sameSize(...)
is true,
else does nothing.
Hist& Hist::operator*=(const Hist& h)
Hist& Hist::operator/=(const Hist& h)
multiplies or divides the current histogram by the contents of the
histogram in the argument if sameSize(...)
is true,
else does nothing.
Hist& Hist::operator+=(double f)
Hist& Hist::operator-=(double f)
adds or subtracts each bin content by the common offset f.
Hist& Hist::operator*=(double f)
Hist& Hist::operator*=(double f)
multiplies or divides each bin content by the common factor f.
friend Hist operator+(double f, const Hist& h1)
friend Hist operator+(const Hist& h1, double f)
friend Hist operator+(const Hist& h1, const Hist h2)
add a constant to a histogram or two histograms to each other, bin by bin.
friend Hist operator-(double f, const Hist& h1)
friend Hist operator-(const Hist& h1, double f)
friend Hist operator-(const Hist& h1, const Hist h2)
subtract a histogram from a constant, a constant from a histogram,
or two histograms from each other, bin by bin.
friend Hist operator*(double f, const Hist& h1)
friend Hist operator*(const Hist& h1, double f)
friend Hist operator*(const Hist& h1, const Hist h2)
multiply a constant by a histogram or two histograms by each other,
bin by bin.
friend Hist operator/(double f, const Hist& h1)
friend Hist operator/(const Hist& h1, double f)
friend Hist operator/(const Hist& h1, const Hist h2)
divide a constant by a histogram, a histogram by a constant,
or two histograms by each other, bin by bin.