Often I find myself needing to visualise an array, such as a bunch of pixel or audio channel values. A nice way to do this is via a histogram.
When building histograms you have two options: numpy’s histogram or matplotlib’s hist. As you might expect, numpy is faster when you just need the data rather than the visualisation, while matplotlib makes it easier to get a nice bar chart.
So I remember, here is a quick post with an example.
# First import numpy and matplotlib
import numpy as np
import matplotlib.pyplot as plt
I started with a data volume of size 256 x 256 x 8 x 300, corresponding to 300 frames of video at a resolution of 256 by 256 with 8 different image processing operations. The data values were 8-bit, i.e. 0 to 255. I wanted to visualise the distribution of pixel values within this data volume.
Using numpy, you can pass in the whole data volume and the function will flatten it for you. Note that the second argument is a list of bin edges, so 256 one-wide bins covering the values 0 to 255 need 257 edges (0 to 256). Hence, to get a set of histogram values and integer bins you can run:
values, bins = np.histogram(data_vol, np.arange(0, 257))
You can then use matplotlib’s bar chart to plot this:
plt.bar(bins[:-1], values, width=1, align='edge')
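As a quick sanity check of the numpy approach, here is a minimal sketch on synthetic data. The small random volume and the names `rng`, `edges` and `per_value_counts` are my own stand-ins, not the real video data. It relies on `np.histogram` treating its second argument as bin edges, so 257 edges give 256 one-wide bins, with the last bin including its right edge.

```python
import numpy as np

# Assumption: a small random stand-in for the real 256 x 256 x 8 x 300 volume.
rng = np.random.default_rng(0)
data_vol = rng.integers(0, 256, size=(32, 32, 8, 10), dtype=np.uint8)

# 257 edges (0, 1, ..., 256) give 256 bins, one per 8-bit value.
edges = np.arange(0, 257)
values, bins = np.histogram(data_vol, bins=edges)

# Every element of the volume should land in exactly one bin.
total_counted = int(values.sum())

# Cross-check against a direct per-value count of the flattened data.
per_value_counts = np.bincount(data_vol.ravel(), minlength=256)
matches_bincount = np.array_equal(values, per_value_counts)

print(len(values), total_counted == data_vol.size, matches_bincount)
```

For integer data like this, `np.bincount` is a handy alternative when all you want is one count per value.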
Using matplotlib’s hist function, we need to flatten the data first:
results = plt.hist(data_vol.ravel(), bins=np.arange(0, 257))
plt.show()
The result of both approaches is the same. To be a bit more professional, we can also use more of matplotlib’s functionality:
fig, ax = plt.subplots()
results = ax.hist(data_vol.ravel(), bins=np.arange(0, 257))
ax.set_title('Pixel Value Histogram')
ax.set_xlabel('Pixel Value')
ax.set_ylabel('Count')
plt.show()
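To check that the numpy and matplotlib routes really do agree, the counts and bin edges they return can be compared directly. This is only a sketch on synthetic data: the random volume is a stand-in, and the `Agg` backend is an assumption so that it runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np

# Assumption: a small random stand-in for the real data volume.
rng = np.random.default_rng(0)
data_vol = rng.integers(0, 256, size=(16, 16, 8, 10), dtype=np.uint8)

# Shared bin edges: 257 edges for 256 one-wide bins.
edges = np.arange(0, 257)

# numpy flattens the volume itself.
np_counts, np_edges = np.histogram(data_vol, bins=edges)

# matplotlib needs the data flattened first; hist returns
# (counts, edges, patches).
fig, ax = plt.subplots()
mpl_counts, mpl_edges, _ = ax.hist(data_vol.ravel(), bins=edges)
plt.close(fig)

counts_match = np.array_equal(np_counts, mpl_counts)
edges_match = np.array_equal(np_edges, mpl_edges)
print(counts_match, edges_match)
```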
Things get a little more tricky when we start changing our bin sizes. A good run through is found here. In this case, the slower Matplotlib function becomes easier:
fig, ax = plt.subplots()
# 17 edges from 0 to 256 give 16 bins, each 16 values wide
results = ax.hist(
    data_vol.ravel(),
    bins=np.linspace(0, 256, 17)
)
ax.set_title('Pixel Value Histogram (4-bit)')
ax.set_xlabel('Pixel Value')
ax.set_ylabel('Count')
plt.show()
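For comparison, here is a rough sketch of the same coarse histogram done with numpy and a bar chart. The extra work is that the bar widths have to be derived from the bin edges by hand. It assumes 17 edges from 0 to 256, so that each of the 16 bins spans 16 values, and it again uses a random stand-in volume and the `Agg` backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np

# Assumption: a small random stand-in for the real data volume.
rng = np.random.default_rng(0)
data_vol = rng.integers(0, 256, size=(16, 16, 8, 10), dtype=np.uint8)

# 17 evenly spaced edges: 0, 16, 32, ..., 256.
edges = np.linspace(0, 256, 17)
values, bins = np.histogram(data_vol, bins=edges)

# Bar widths must match the bin widths, so compute them from the edges.
widths = np.diff(bins)

fig, ax = plt.subplots()
# align="edge" starts each bar at its bin's left edge, as hist does.
ax.bar(bins[:-1], values, width=widths, align="edge")
ax.set_title("Pixel Value Histogram (4-bit)")
ax.set_xlabel("Pixel Value")
ax.set_ylabel("Count")
plt.close(fig)

# Cross-check: dropping the low 4 bits of each value gives the bin index.
top_nibble_counts = np.bincount(data_vol.ravel() >> 4, minlength=16)
matches_top_nibble = np.array_equal(values, top_nibble_counts)
print(len(values), matches_top_nibble)
```

With `ax.hist` the widths and alignment are handled for you, which is why the matplotlib version ends up shorter once the bins are no longer one unit wide.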