The lower two subplots show the easiest way to do spatial binning, with hexbin. The lower left plot shows just the number of data points in each hexagonal cell (zero for most cells, hence the monotonous red), using

    hexbin(ra,dec,cmap=cm.hsv)
    colorbar()

The lower right plot shows how to average some other quantity in the hexagonal cells, with

    hexbin(ra,dec,C=mag,gridsize=25,cmap=cm.hsv)
    colorbar()

The gridsize (the number of hex cells across the x direction) is now reduced from the default of 100, so that I sometimes get more than one data point in a cell! The absence of data is shown as white; I probably should have used a denser data set for this example. If you don't like hexagonal bins, scroll down to "More Spatial Binning" to see an alternative, which takes more than one line of code.
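For readers without the example file at hand, here is a minimal self-contained sketch of the two calls above. The random `ra`, `dec`, and `mag` arrays are made-up stand-ins for the real catalog columns, and the `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the catalog columns (not the author's data).
rng = np.random.default_rng(0)
ra = rng.uniform(0.0, 1.0, 500)
dec = rng.uniform(0.0, 1.0, 500)
mag = rng.normal(20.0, 0.5, 500)

fig, (ax_count, ax_mean) = plt.subplots(1, 2, figsize=(10, 4))

# Left: number of points per hexagon, with the default gridsize of 100.
hb_count = ax_count.hexbin(ra, dec, cmap="hsv")
fig.colorbar(hb_count, ax=ax_count)

# Right: mean of mag per hexagon, on a coarser grid.
hb_mean = ax_mean.hexbin(ra, dec, C=mag, gridsize=25, cmap="hsv")
fig.colorbar(hb_mean, ax=ax_mean)

counts = hb_count.get_array()  # one value per hexagon, empty cells included
means = hb_mean.get_array()    # one value per non-empty hexagon
```

The arrays returned by `get_array()` are handy for checking what the colors actually encode before trusting the picture.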

To do outlier rejection, try plotting the median rather than the mean of the quantities in each cell: `hexbin(ra,dec,C=mag,gridsize=25,cmap=cm.hsv,reduce_C_function=numpy.median)`
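To see what the median buys you, the sketch below contaminates synthetic data with a few wild values and bins it both ways. Everything about the data is an assumption made for illustration: the 1% contamination rate, the sentinel-like value of 99, and the coarse `gridsize=5` chosen so that every cell holds many points.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 5000
ra = rng.uniform(0.0, 1.0, n)
dec = rng.uniform(0.0, 1.0, n)
mag = rng.normal(20.0, 0.2, n)

# Replace 1% of the magnitudes with a wild value (hypothetical bad data).
bad = rng.choice(n, size=50, replace=False)
mag[bad] = 99.0

fig, (ax_mean, ax_median) = plt.subplots(1, 2, figsize=(10, 4))

# Default reduction is the mean, which the outliers drag upward.
hb_mean = ax_mean.hexbin(ra, dec, C=mag, gridsize=5)
# The median ignores a small fraction of wild values in each cell.
hb_median = ax_median.hexbin(ra, dec, C=mag, gridsize=5,
                             reduce_C_function=np.median)

mean_map = hb_mean.get_array()
median_map = hb_median.get_array()
print("largest cell mean:  ", mean_map.max())
print("largest cell median:", median_map.max())
```

The median only helps while outliers are a minority in each cell, so very fine grids with one or two points per cell gain nothing from it.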

Example data for making these plots. If you want to reproduce these plots with the commands above, first do

    data = numpy.loadtxt('example.fiat')
    ra = data[:,3]
    dec = data[:,4]
    mag = data[:,5]

(after your normal import statements, of course). Note: on very old installations of matplotlib, the hexbin function does not exist. If you need it, update your system!

The way I prefer to accommodate crowded scatterplots, which I inherited from sm, the plotting package I used prior to matplotlib, is to make points "open," that is, they have a perimeter but nothing in the center so that it is easy to see exactly how markers overlap. In matplotlib this is accomplished with

    plot(v1, v2, 'o', markerfacecolor='None')

Before a reader told me about this solution, my workaround was to keep the markers big enough to overlap, but make them partially transparent with the option

In summary, you can separately control the color of the face and the perimeter of the marker, including by specifying None for either color. This was not obvious to me just by looking at the example gallery.
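A minimal sketch of both approaches side by side, on made-up data. The edge color, the offset between the two point sets, and the alpha value of 0.3 are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
v1 = rng.normal(size=300)
v2 = rng.normal(size=300)

fig, ax = plt.subplots()

# Open markers: perimeter drawn, face left empty.
(open_pts,) = ax.plot(v1, v2, 'o',
                      markerfacecolor='none', markeredgecolor='blue')

# Alternative: filled but partially transparent markers, shifted for comparison.
(faint_pts,) = ax.plot(v1 + 4.0, v2, 'o', color='red', alpha=0.3)
```

Face and edge are independent properties, so the same keywords also allow, say, a filled face with no visible perimeter by passing `markeredgecolor='none'`.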

A challenge for matplotlib experts out there: how do I make a plot which is a scatterplot when the density of points is low, but transitions to a contour plot when the density of points is high? I've seen these types of plot prepared by other graphics systems, and they seem to me to be the optimal way to represent density when both high and low density regions are interesting.
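One possible approach, offered as a rough sketch rather than a polished answer: bin the points with `numpy.histogram2d`, draw filled contours only at levels above some count threshold, and plot individually only those points that fall in bins below it. The threshold of 20 points per bin, the 40 bins, and the Gaussian test data are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=20000)
y = rng.normal(size=20000)

threshold = 20  # hypothetical cutoff: bins with fewer points are drawn as dots
hist, xedges, yedges = np.histogram2d(x, y, bins=40)

# Find the bin of each point; clip so points on the last edge land in the last bin.
ix = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1)
iy = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1)
sparse = hist[ix, iy] < threshold

# Bin centers, needed to place the contours.
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])

fig, ax = plt.subplots()
ax.plot(x[sparse], y[sparse], 'k.', markersize=2)   # low-density regime
levels = np.linspace(threshold, hist.max(), 6)
ax.contourf(xc, yc, hist.T, levels=levels)          # high-density regime
```

Because `contourf` leaves everything below the lowest level unfilled, the dots show through exactly where the contours stop. The boundary follows the bin grid, so it looks blocky unless the bins are fine or the histogram is smoothed first.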

Here is the simplest test case. It generates Gaussian distributions
with different offsets and variances in x and y so you can verify that
it's really doing the right thing. Notice the use of the
transpose, *hist.T*.

    from pylab import *
    import numpy

    # the x distribution will be centered at -1, the y distro
    # at +1 with twice the width.
    x = numpy.random.randn(3000)-1
    y = numpy.random.randn(3000)*2+1

    hist,xedges,yedges = numpy.histogram2d(x,y,bins=40,range=[[-6,4],[-4,6]])
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1] ]
    imshow(hist.T,extent=extent,interpolation='nearest',origin='lower')
    colorbar()
    show()

On my machine at least, it produces this image:
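The reason for the transpose can be checked on a tiny case without plotting anything. `numpy.histogram2d` puts x along the first axis of the returned array, while `imshow` treats the first axis as rows, that is, as y. The three points and the 2-by-3 binning below are made up so that the two axes are easy to tell apart.

```python
import numpy as np

# Three points sharing the same x but with different y values.
x = np.array([0.5, 0.5, 0.5])
y = np.array([0.5, 1.5, 2.5])

# Two bins in x, three bins in y.
hist, xedges, yedges = np.histogram2d(x, y, bins=[2, 3],
                                      range=[[0, 2], [0, 3]])

print(hist.shape)    # first axis indexes x bins
print(hist.T.shape)  # after the transpose, rows index y bins
print(hist.T)        # the points form a vertical column, as they should
```

With unequal bin counts like these, forgetting the transpose shows up immediately as a mismatch between the image shape and the extent.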

    maps=[m for m in cm.datad if not m.endswith("_r")]
    print(maps)

Each (some?) map name also has a _r version to reverse it.
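The "(some?)" can be settled on your own installation with a quick check. This sketch uses `plt.colormaps()` to list every registered name rather than `cm.datad`, which is a swapped-in approach, and `hsv` is used only as a spot check.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

names = plt.colormaps()  # all registered colormap names
forward = [n for n in names if not n.endswith("_r")]

# Any forward map lacking a reversed twin would be listed here.
missing = [n for n in forward if n + "_r" not in names]
print("maps without an _r version:", missing)

# Spot check: the start of hsv_r should match the end of hsv.
start_of_reversed = plt.get_cmap("hsv_r")(0.0)
end_of_forward = plt.get_cmap("hsv")(1.0)
print(np.allclose(start_of_reversed, end_of_forward))
```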

To do this, you need to also set the tick formatter on the relevant axes to null. For example, for the upper right subplot you want to do it on both axes:

    from matplotlib.ticker import NullFormatter
    ...
    ax = subplot(222)
    ax.xaxis.set_major_formatter( NullFormatter() )
    ax.yaxis.set_major_formatter( NullFormatter() )

You may also need to adjust the

    ax.xaxis.set_major_locator(MultipleLocator(0.01))

To make it clear that the label "exposure residual (mag)" applies to all subplots, I used

Note that this plot also makes use of the alpha transparency option discussed briefly above. You can use alpha with lots of routines.

To see how all these bits fit together, read the full script used to make the above plot.
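As a small stand-in for that script, here is a sketch of a 2x2 grid in which only the outer edges keep their tick labels. The plotted data are placeholders, and closing the gaps with `subplots_adjust(wspace=0, hspace=0)` is an assumption about the desired layout, not necessarily what the original script did.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter

fig = plt.figure()
for position in (221, 222, 223, 224):
    ax = fig.add_subplot(position)
    ax.plot([0, 1], [0, 1], 'o')  # placeholder data
    if position in (221, 222):
        # Top row: hide x tick labels, the bottom row carries them.
        ax.xaxis.set_major_formatter(NullFormatter())
    if position in (222, 224):
        # Right column: hide y tick labels, the left column carries them.
        ax.yaxis.set_major_formatter(NullFormatter())

# Assumed layout choice: no gaps, so the panels abut.
fig.subplots_adjust(wspace=0, hspace=0)
```

The ticks themselves stay in place; only their text is suppressed, which keeps the grids of neighboring panels visually aligned.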

    xlabel(string,fontdict={'fontsize':20})

Here is a comparison of a shrunken figure with and without this optional argument:

Even 20-point font may not be big enough for the y label in this case! There are ways of changing your default font size, but personally I think the numeric tick labels

**Fiddling with which ticks are labeled:** sometimes the default
choice of which ticks are labeled is awkward, for example when labels
at the corners run into each other. You can control this with:

    from matplotlib.ticker import MultipleLocator
    ...
    majorLocator = MultipleLocator(0.01)
    ax=subplot(223)
    ax.xaxis.set_major_locator( majorLocator )

This puts a label every 0.01 units as in the exposure residual example above, in the Subplots section. In that plot, for consistency I used the same tick locator in multiple subplots.
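A runnable sketch of the same idea on two stacked subplots, with placeholder data spanning 0 to 0.05. Each axis gets its own `MultipleLocator` instance with the same spacing here; as far as I know the matplotlib documentation advises against attaching one locator object to more than one axis, so this is the cautious variant.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

spacing = 0.01  # same tick spacing for every subplot

fig, (ax_top, ax_bottom) = plt.subplots(2, 1)
for ax in (ax_top, ax_bottom):
    ax.plot([0.0, 0.05], [0.0, 1.0])  # placeholder data
    # A fresh locator per axis, all with identical spacing.
    ax.xaxis.set_major_locator(MultipleLocator(spacing))

ticks = ax_bottom.get_xticks()
print(ticks)
```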

**Math symbols in labels:** everyone knows that you can put TeX in
the labels, but it was not obvious to me how to quote a string to make
this happen correctly. Here's how:

    ylabel(r'$D_l D_{ls}/D_s\ (Mpc)$',fontdict={'fontsize':20})

This is exactly the code used in the above plot.
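The `r` prefix is what matters: it makes a raw string, so Python leaves the backslashes alone and they reach the TeX parser intact. The `\nu` label below is a made-up example chosen because `\n` silently becomes a newline without the prefix.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

plain = '$\nu$'   # Python turns \n into a newline, so TeX never sees \nu
raw = r'$\nu$'    # raw string: backslash and n survive as two characters

print(repr(plain))
print(repr(raw))

fig, ax = plt.subplots()
# Same label as in the text, set through the Axes method.
ax.set_ylabel(r'$D_l D_{ls}/D_s\ (Mpc)$', fontdict={'fontsize': 20})
label = ax.get_ylabel()
```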

**Legends:** these are nice when plotting multiple curves, and are
set up to work basically automatically. Here is the result of

    for i in range(len(priors)):
        plot(z,y[i,:],label='%d%%' % (priors[i]*100))
    legend(title='Prior on lens mass')

(Read the whole script if necessary.) A few things to note here:

- Each plot command needs to have the optional *label* argument to build up the information necessary for the legend. Note that *by itself* adding *label* to the *plot* command does nothing! (Similarly, *legend()* is useless unless you've set the *label*s.)
- The legend is built up in the same order in which you invoked the plot commands. Because the curve corresponding to the 10% prior is on top (followed by successively smaller priors) in the actual data, I wanted the same order in the legend, and I was careful to *plot()* them in that order. You could also set the order by explicitly giving the order to *legend()*; see the manual for how to do that.
- The *title* is optional for the legend. If your *label*s are descriptive enough, you just don't need it.
- As always with Python's % string formatting, "%%" is necessary to produce a literal percent symbol.
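The points above can be tried out with the sketch below. The `z`, `y`, and `priors` arrays are invented placeholders, not the lensing data from the real script, and `round()` is added only to guard against floating-point values such as 28.999 being truncated by `%d`. The second `legend()` call shows one way to set the order explicitly, by passing handles and labels.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

# Placeholder curves: one row of y per prior value.
z = np.linspace(0.1, 1.0, 50)
priors = [0.10, 0.05, 0.01]
y = np.array([p * z for p in priors])

fig, ax = plt.subplots()
for i in range(len(priors)):
    # '%%' yields a literal percent sign in %-formatting.
    ax.plot(z, y[i, :], label='%d%%' % round(priors[i] * 100))

# Default: entries appear in the order the curves were plotted.
legend = ax.legend(title='Prior on lens mass')
default_order = [t.get_text() for t in legend.get_texts()]

# Explicit order: pass handles and labels yourself (here, reversed).
handles, labels = ax.get_legend_handles_labels()
legend = ax.legend(handles[::-1], labels[::-1], title='Prior on lens mass')
explicit_order = [t.get_text() for t in legend.get_texts()]

print(default_order, explicit_order)
```

Calling `legend()` a second time replaces the first legend, so only the reversed one would appear in a saved figure.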

    plot((x1,x2),(y1,y2),'k-')

This still doesn't work as nicely as sm's draw command, because it makes matplotlib think these are datapoints, and thus expands the plot if the points you give are outside the range of the other (real) data. So you have to pay more attention to the points you give. Being lazy, I would like a command that says "draw a line from this point to that point, but still use the range of real data to determine the cropping."
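One possible workaround, sketched below, is to record the axis limits once the real data are plotted, draw the line, and then put the limits back. The helper name `draw_segment` is hypothetical, not a matplotlib function, and the data are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

def draw_segment(ax, p1, p2, fmt='k-'):
    """Draw a line from p1 to p2 without changing the current axis limits.

    Hypothetical helper: call it after the real data have been plotted.
    """
    xlim, ylim = ax.get_xlim(), ax.get_ylim()  # limits set by the real data
    (line,) = ax.plot((p1[0], p2[0]), (p1[1], p2[1]), fmt)
    ax.set_xlim(xlim)  # restore, so the line is cropped to the data range
    ax.set_ylim(ylim)
    return line

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], 'o')  # placeholder "real" data
limits_before = (ax.get_xlim(), ax.get_ylim())

# End points far outside the data range; the view should not grow.
draw_segment(ax, (-10, -10), (10, 10))
limits_after = (ax.get_xlim(), ax.get_ylim())
```

One side effect to be aware of: setting the limits explicitly switches off autoscaling for that axes, so data plotted afterwards will not expand the view either.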