I have a large list of data (3.2 million real numbers), and I would like to plot a histogram of it. The built-in Histogram function is very nice, but on my computer it is often extremely slow for very long lists (~1 million real numbers).
So, I would like to pre-bin the data, put it into {x, y} form (i.e., a list of ordered pairs), and plot it with ListPlot, in the hope that this will be a workaround to using Histogram[list, PerformanceGoal -> "Speed"] directly.
The BinCounts function is very nice: it takes a list, followed by a bin specification, and outputs the number of elements found within each bin. For example, consider one of the examples given in the documentation:
BinCounts[{1, 3, 2, 1, 4, 5, 6, 2}, {0, 10, 1}]
(* {0, 2, 2, 1, 1, 1, 1, 0, 0, 0} *)
where the bin specification $\{x_\min, x_\max, \text{dx}\}$ tells Mathematica to use bins which satisfy the relation $${x_\min + (i-1) \text{ dx} \leq x < x_\min + i \text{ dx}}$$ for bin $i$.
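To make the bin relation concrete, here is a small sketch that reproduces the counts above by hand. It assumes the same example data and bin specification as the BinCounts call; binIndex is a hypothetical helper written for illustration, not a built-in function.

```mathematica
(* Assumed example values, taken from the BinCounts call above *)
data = {1, 3, 2, 1, 4, 5, 6, 2};
{xmin, xmax, dx} = {0, 10, 1};

(* binIndex is a hypothetical helper, not a built-in: it returns the i for which
   xmin + (i - 1) dx <= x < xmin + i dx *)
binIndex[x_] := Floor[(x - xmin)/dx] + 1

indices = binIndex /@ data;
(* {2, 4, 3, 2, 5, 6, 7, 3} *)

(* Count how many elements landed in each bin *)
nbins = Ceiling[(xmax - xmin)/dx];
manualCounts = Table[Count[indices, i], {i, nbins}];
(* {0, 2, 2, 1, 1, 1, 1, 0, 0, 0}, the same as the BinCounts output above *)
```

Note that the lower edge of each bin is inclusive and the upper edge is exclusive, so the value 1 falls in bin 2, not bin 1.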
But, while BinCounts efficiently and effectively outputs the "y" values (the counts), it does not output the "x" values (the bin positions). This is probably because the term "bin position" is somewhat ambiguous, especially for lists containing a small number of elements. For a list of many elements (and correspondingly narrow bins), I think this ambiguity matters less.
Is there any way to automatically output both the "x" and the "y" values for a "histogram" to be plotted using ListPlot? Or should I write my own function? I can write my own function, but I wanted to ask first, because it seems somewhat odd that there is apparently no way to use Histogram to simply output the data (and suppress display of the fancy, time- and memory-consuming chart graphic).
As for what to use as the working "bin position," I would like to use the midpoint of the bounds of each bin, which would be $$\frac{(x_\min + (i-1) \text{ dx}) + (x_\min + i \text{ dx})}{2} = \frac{1}{2}(2x_\min + (2 i - 1) \text{ dx}) = x_\min + \left(i - \tfrac{1}{2}\right) \text{ dx}.$$
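A minimal sketch of the kind of function I have in mind, assuming a fixed bin width and the midpoint convention, where the midpoint of bin $i$ is $x_\min + (i - 1/2) \text{ dx}$. The variable names (data, xmin, xmax, dx, mids, pairs) are placeholders for illustration, and the small list stands in for the real 3.2-million-element data set.

```mathematica
(* Placeholder data and bin specification, same as the documentation example *)
data = {1, 3, 2, 1, 4, 5, 6, 2};
{xmin, xmax, dx} = {0, 10, 1};

(* "y" values: counts per bin *)
counts = BinCounts[data, {xmin, xmax, dx}];

(* "x" values: midpoint of bin i is xmin + (i - 1/2) dx *)
mids = xmin + (Range[Length[counts]] - 1/2) dx;

(* Pair them up into {x, y} form for ListPlot *)
pairs = Transpose[{mids, counts}];
(* {{1/2, 0}, {3/2, 2}, {5/2, 2}, {7/2, 1}, {9/2, 1},
    {11/2, 1}, {13/2, 1}, {15/2, 0}, {17/2, 0}, {19/2, 0}} *)

ListPlot[pairs, Filling -> Axis]
```

Filling -> Axis is only a cosmetic choice to make the points read more like histogram bars; a plain ListPlot[pairs] works as well.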