There is no math for meaningfully aggregating percentiles. Once you've summarized things as percentiles (and discarded the raw data or histogram distribution behind them) there is no way to aggregate the summarized percentiles into anything useful for the same percentile levels. And yes, this means that those "average percentile" legend numbers that show in various percentile monitoring charts are completely bogus.
A simple way to demonstrate why any attempt at aggregating percentiles by averaging them (weighted or not) is useless is to try it with a percentile that is simple to reason about: the 100%'ile (the max).
E.g., if I had the following 100%'iles reported for each one minute interval, each with the same overall event count:
[1, 0, 3, 1, 601, 4, 2, 8, 0, 3, 3, 1, 1, 0, 2]
The (weighted or not) average of this sequence is 42. And it has as much relation to the overall 100%'ile as the phase of the moon does. No amount of fancy averaging (weighted or not) will produce a correct answer for "what is the 100%'ile of the overall 15 minute period?". There is only one correct answer: 601 was the 100%'ile seen during the 15 minute period.
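The arithmetic above can be checked in a few lines of Python (a sketch using the per-interval maxima listed above; with equal event counts, the weighted average equals the plain average):

```python
# Per-interval 100%'iles (maxima) for fifteen one-minute intervals,
# each interval carrying the same event count.
interval_maxes = [1, 0, 3, 1, 601, 4, 2, 8, 0, 3, 3, 1, 1, 0, 2]

# "Averaging the percentiles" -- the bogus number a chart legend shows:
bogus_aggregate = sum(interval_maxes) / len(interval_maxes)
print(bogus_aggregate)  # 42.0 -- unrelated to any actual percentile

# The only correct aggregation for the 100%'ile: the max of the maxes.
true_aggregate = max(interval_maxes)
print(true_aggregate)  # 601 -- the actual max over the 15 minutes
```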
There are only two percentiles for which you can actually find math that works for accurate aggregation across intervals:
- the 100%'ile (for which the answer is "the max is the max of the maxes")
- 0%'ile (for which the answer is "the min is the min of the mins")
For all other percentiles, the only correct answer is "The aggregate N%'ile is somewhere between the lowest and highest N%'ile seen in any interval in the aggregate time period". And that's not a very useful answer. Especially when the range for those can cover the entire spectrum. In many real world data sets, it often amounts to something close to "it's somewhere between the overall min and overall max".
For more ranting on this subject:
http://latencytipoftheday.blogspot.com/2014/06/latencytipoftheday-q-whats-wrong-with_21.html
http://latencytipoftheday.blogspot.com/2014/06/latencytipoftheday-you-cant-average.html