I have a time series of a disease passing through a horse population. I want to re-index it by cases rather than by time: the data frame should keep its current order, but every row should represent exactly 1000 cases. A row with more than 1000 cases should be split into several rows, and a row with fewer should be merged with the rows that follow it, averaging Severity and States weighted by the number of cases each source row contributes. The goal is to remove heteroscedasticity from the data. I realize I could do this with one big loop, but I am wondering whether there is a less computationally intensive, apply-style (vectorized) way to accomplish the same task.

In the example below, Time 0 would produce four rows. The last of these holds only 699 cases, so it would be merged with the 230 cases from Time 1 plus 71 from Time 2, averaging their severity and states by the number of input cases.
Time Severity Cases States
0 4 3699 39
1 7 230 15
2 2 1300 27
3 3 740 13
4 2 3000 23
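
One possible loop-free approach, sketched here in Python with NumPy as an assumption (the question does not name a language): take the cumulative case count and the cumulative case-weighted sum of each column, interpolate the weighted sum at every 1000-case boundary, and difference the result. The names `rebin_by_cases` and `bin_size` are hypothetical. The sketch also assumes every row has a positive case count and that a final bin with fewer than 1000 cases is kept as a shorter row rather than dropped.

```python
import numpy as np

def rebin_by_cases(cases, values, bin_size=1000):
    """Case-weighted mean of `values` over consecutive bins of `bin_size` cases.

    Hypothetical helper: assumes positive case counts and keeps a final
    partial bin if the total is not a multiple of `bin_size`.
    """
    cases = np.asarray(cases, dtype=float)
    values = np.asarray(values, dtype=float)

    # Cumulative cases and cumulative (cases * value), both starting at 0.
    cum_cases = np.concatenate(([0.0], np.cumsum(cases)))
    cum_weighted = np.concatenate(([0.0], np.cumsum(cases * values)))

    # Bin boundaries at 0, bin_size, 2*bin_size, ..., plus the grand total.
    total = cum_cases[-1]
    edges = np.append(np.arange(0.0, total, bin_size), total)

    # The cumulative weighted sum is piecewise linear in cumulative cases,
    # so linear interpolation gives its exact value at each boundary.
    weighted_at_edges = np.interp(edges, cum_cases, cum_weighted)

    bin_cases = np.diff(edges)
    bin_means = np.diff(weighted_at_edges) / bin_cases
    return bin_cases, bin_means

# Example data from the question.
cases = [3699, 230, 1300, 740, 3000]
severity = [4, 7, 2, 3, 2]
states = [39, 15, 27, 13, 23]

bin_cases, new_severity = rebin_by_cases(cases, severity)
_, new_states = rebin_by_cases(cases, states)

for n, sev, st in zip(bin_cases, new_severity, new_states):
    print(int(n), round(sev, 3), round(st, 3))
```

With the sample data the fourth output row combines 699, 230 and 71 cases, giving a severity of about 4.548 and a states value of about 32.628. The original row order is preserved because the bins are cut along the cumulative case count; the Time column could be handled the same way if a case-weighted time per bin is wanted.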