

EXAMPLES ON MY SAGE PAGES: Basic Descriptive Statistics or Uniform Distributions plus Histograms


Syntax: plt.hist(datalist, histogram options)

(Notice the plt. in front of this command. We import matplotlib.pyplot as plt. See examples below and on pages.)

    datalist – a list of values (numbers).


  • A Histogram or Frequency Diagram is a statistical diagram of a list of data in which the range of the data is divided into a number of bins (sequential intervals) and the frequency of each bin (that is, the number of data values in the list that fall into that bin) is plotted.
    • Histograms are for numeric data; Bar Charts are for categorical data.
  • As of Jan 2013, there is no useful way to plot a histogram in Sage. (See the bottom of THIS SAGE PAGE to see what is currently available.)
  • However, we can import matplotlib and get good results reasonably easily.
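
To make the idea of bins and frequencies concrete, here is a minimal sketch in plain Python (no Sage or matplotlib needed) of the counting that a histogram is built from. The helper name bin_counts and its arguments are made up for illustration, and the sketch assumes equal-width bins with the top edge counted in the last bin; it is not matplotlib's actual implementation.

```python
import random


def bin_counts(data, bins, lo, hi):
    """Count how many values fall into each of `bins` equal-width intervals on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if x < lo or x > hi:
            continue  # values outside the range are ignored
        # index of the bin containing x; the top edge hi goes into the last bin
        k = min(int((x - lo) / width), bins - 1)
        counts[k] += 1
    return counts


random.seed(0)  # fixed seed so the run is repeatable
my_data = [random.random() for j in range(100)]
counts = bin_counts(my_data, bins=5, lo=0.0, hi=1.0)
print(counts)  # five frequencies, one per bin, adding up to 100
```

The heights of the bars in a histogram of my_data with bins=5 and range=(0.,1.) correspond to these five counts.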

Example 1 - using matplotlib - minimum code required to get a proper histogram

my_data=[random() for j in range(100)]
import matplotlib.pyplot as plt
plt.hist(my_data, bins=5, range=(0.,1.))
plt.savefig('Histogram.png')
plt.close()

   Deciphering the code

my_data=[random() for j in range(100)]    #this generates a list of 100 random numbers in the interval [0,1)
import matplotlib.pyplot as plt                   #this imports the "library" with a useful histogram command
plt.hist(my_data, bins=5, range=(0.,1.))     #this creates a plot of a histogram of the data (more below on the options here)
plt.savefig('Histogram.png')                       #these last 2 lines are a workaround to force the plot to display: savefig writes the image file that the Sage notebook then shows, and close releases the figure
plt.close()

Histogram Options

  • bins          ex. bins=10 is the number of equal-width, sequential intervals (bins) that the range is divided into; default is bins=10
  • range        ex. range=(-3.,4.5) gives the lower and upper limits of the bins, and data outside this interval are ignored; default is (min, max) of the datalist.
  • normed     ex. normed=1 determines whether the frequencies are normed so that the area under the histogram is 1; default is normed=0 (in newer versions of matplotlib this option is named density)
  • facecolor  ex. facecolor='red' is the color inside the bars; default: the first color in matplotlib's color cycle (a shade of blue)
  • alpha        ex. alpha=0.5 is the opacity of the facecolor; default: 1 (completely opaque)
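
As a rough illustration of what norming does, the sketch below rescales bin counts in plain Python so that the total area of the bars is 1. The function name normalize and the counts are made up for illustration; it assumes equal-width bins and is not matplotlib's own code.

```python
def normalize(counts, lo, hi):
    """Rescale bin counts so that the bar areas (height * width) sum to 1."""
    n = sum(counts)                   # total number of data values
    width = (hi - lo) / len(counts)   # every bin has the same width
    return [c / (n * width) for c in counts]


counts = [18, 22, 25, 15, 20]  # made-up frequencies for 5 bins on [0, 1]
heights = normalize(counts, 0.0, 1.0)

width = (1.0 - 0.0) / len(counts)
area = sum(h * width for h in heights)
print(heights)
print(area)  # equal to 1 up to floating-point rounding
```

Note that the normed heights can exceed 1 when the bins are narrow; it is the area, not the height, that is fixed.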

Plot Options: I really don't know all the options for plt, i.e. matplotlib. I included some in the SAGE PAGES (links at top).


Keywords: histograms, statistics, matplotlib
<hr />
<table border="1" cellpadding="2" cellspacing="2" width=1025>
<tr>
<td><hr />
<h3>EXAMPLES ON MY SAGE PAGES: <a href="http://sage.math.canterbury.ac.nz/home/pub/254">Basic Descriptive Statistics</a> or <a href="http://sage.math.canterbury.ac.nz/home/pub/255">Uniform Distributions plus Histograms</a></h3>
<hr />
<p ><strong>Syntax: </strong><font color="#800080">plt.hist(</font><em><font color="#660000">datalist</font></em>, <a href="#histogram"><em><font color="#003300">histogram options</font></em></a><font color="#800080">)</font></p>
<p style="margin-top:-3px; margin-left:25px">(Notice the <strong>plt.</strong> in front of this command. We <strong>import matplotlib.pyplot as plt</strong>. See examples below and on pages.)</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;<em><font color="#660000"><strong>datalist</strong></font></em> – a list of values (numbers).</p>
<hr />
<ul>
<li>A <strong>Histogram</strong> or <strong>Frequency Diagram</strong> is a statistical diagram of a <em>list of data</em> in which the range of the data is divided into a number of bins (sequential intervals) and the frequency of each bin (that is, the number of data values in the list that fall into that bin) is plotted.
<ul>
<li>Histograms are for numeric data; Bar Charts are for categorical data.</li>
</ul>
</li>
<li>As of Jan 2013, there is no useful way to plot a histogram in Sage. (See the bottom of <a href="http://sage.math.canterbury.ac.nz/home/pub/255">THIS SAGE PAGE</a> to see what is currently available.)</li>
<li>However, we can import <strong><font color="#996600">matplotlib</font></strong> and get good results reasonably easily.</li>
</ul></td>
</tr>
</table>
<hr />
<p><strong><font color="#FF0066">Example 1</font> - using <font color="#996600">matplotlib </font> - </strong>minimum code required</p>
<div style="margin-left:20px; width:800px; border: thin dashed #000000">my_data=[random() for j in range(100)]<br />
import matplotlib.pyplot as plt<br />
plt.hist(my_data, bins=5, range=(0.,1.))<br />
plt.savefig('Histogram.png')<br />
plt.close()</div>
<p>&nbsp;&nbsp;&nbsp;<strong><font color="#FF0066">Deciphering the code</font> </strong></p>
<div style="margin-left:20px; width:800px; border: thin dashed #000000">my_data=[random() for j in range(100)] &nbsp;&nbsp;&nbsp;#this generates a <a href="\Lists">list</a> of 100 random numbers in the interval [0,1)<br />
import matplotlib.pyplot as plt&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#this imports the &quot;library&quot; with a useful histogram command <br />
plt.hist(my_data, bins=5, range=(0.,1.))&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#this creates a plot of a histogram of the data (more below on the options here)<br />
plt.savefig('Histogram.png')&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#these last 2 lines are related to a bug in matplotlib; they force the plot to display<br />
plt.close()</div>

<p><a name="histogram" id="histogram"></a><strong><font color="#006600">Histogram Options</font></strong></p>
<ul style=" margin-top:-3px">
<li>bins &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ex. <strong>bins=10 </strong>is the number of equal-width, sequential intervals (bins) that the range is divided into; default is bins=10</li>
<li>range&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ex. <strong>range=(-3.,4.5)</strong> gives the lower and upper limits of the bins, and data outside this interval are ignored; default is (min, max) of the datalist.</li>
<li>normed &nbsp;&nbsp;&nbsp;&nbsp;ex. <strong>normed=1</strong> determines whether the frequencies are normed so that the area under the histogram is 1; default is normed=0 (in newer versions of matplotlib this option is named density)</li>
<li>facecolor&nbsp;&nbsp;ex. <strong>facecolor='red'</strong> is the <a href="\colors">color</a> inside the bars; default: the first color in matplotlib's color cycle (a shade of blue)</li>
<li>alpha &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;ex. <strong>alpha=0.5</strong> is the <a href="\opacity">opacity</a> of the facecolor; default: 1 (completely opaque)</li>
</ul>
<p><strong>Plot Options</strong>: I really don't know all the options for plt, i.e. matplotlib. I included some in the SAGE PAGES (links at top).</p>