22
$\begingroup$

How can Mathematica be used to check if Easter days are normally distributed?

By definition, Easter day is the first Sunday after the first full moon after the spring equinox (so between March 23d and April 25th).

This is a self-answered question but of course the purpose is to discover new ideas/commands/etc. so feel free to post your own answer!

$\endgroup$
8
  • $\begingroup$ See JulianEasterSunday on FindRepeat documentation for one starting point. $\endgroup$
    – kirma
    Commented Apr 1, 2018 at 16:09
  • $\begingroup$ @kirma Mmh, the function seems to be wrong (it returns March 26th in both Julian and Gregorian): link only valid for today:. $\endgroup$
    – anderstood
    Commented Apr 1, 2018 at 16:16
  • $\begingroup$ Send a bug report to WRI! ;) $\endgroup$
    – kirma
    Commented Apr 1, 2018 at 16:17
  • 2
    $\begingroup$ I have filed bugs even on terminological inaccuracies on political geography of examples on documentation and those have been corrected! $\endgroup$
    – kirma
    Commented Apr 1, 2018 at 16:29
  • 1
    $\begingroup$ @kirma I filled a report, ref 4039192. $\endgroup$
    – anderstood
    Commented Apr 1, 2018 at 16:49

4 Answers 4

14
$\begingroup$

Instead of FindFit[], DistributionFitTest[] is the function to use to test the distribution followed by your data. Using easter[] from this answer, generate the number of days Easter is counted from March 23 of that same year:

data = Table[DayCount[DateObject[{k, 3, 23}], DateObject[easter[k]]], {k, 1700, 2018}];

Then,

hh = DistributionFitTest[data // N, Automatic, "HypothesisTestData"];

See the test results:

hh["TestDataTable", All]

test results

The $p$-values of all except one test are quite tiny. More starkly put,

hh["TestConclusion", All]
   {"The null hypothesis that the data is distributed according to the
     NormalDistribution[x, y] is rejected at the 5 percent level based on
     the Anderson-Darling test.",

    ...

    "The null hypothesis that the data is distributed according to the
     NormalDistribution[x, y] is rejected at the 5 percent level based on
     the Shapiro-Wilk test."}

where I have omitted some of the output.

$\endgroup$
11
  • 2
    $\begingroup$ (I'm pretty sure Jim has railed about (ab)using FindFit[] to fit against probability distributions at least twice on this site...) $\endgroup$ Commented Apr 1, 2018 at 16:56
  • 2
    $\begingroup$ Is it really so that a bit over three centuries is enough data for analysis? Julian Easter cycle was over half a millennia long. $\endgroup$
    – kirma
    Commented Apr 1, 2018 at 17:44
  • 1
    $\begingroup$ @kirma, I'll leave it to someone else to sample further into the past, as I'm running something else at the moment. ;) At least I've demo'd how to analyze this. $\endgroup$ Commented Apr 1, 2018 at 17:47
  • 1
    $\begingroup$ @J.M. I was mostly thinking of sampling towards the future, preferably with the cycle length - I assume there is one. (Too lazy to do much about it now, though!) $\endgroup$
    – kirma
    Commented Apr 1, 2018 at 17:49
  • 2
    $\begingroup$ @J.M. I'm pretty sure it's been more than twice and I see no signs of the abuse stopping anytime soon. There must be some textbook or professor out there telling folks to fit probability distributions with regression routines. I do have a constructive answer to the above question but it will likely not be for a day or two from now. $\endgroup$
    – JimB
    Commented Apr 1, 2018 at 22:55
12
$\begingroup$

This is code to compute Easter Sunday for each year (the "Computus"), from Gauss, in the proleptic Gregorian calendar:

computusGauss[year_Integer] := 
  Module[{a, b, c, d, e, f, g, h, i, j, k, month, day},
   a = Mod[year, 19];
   {b, c} = QuotientRemainder[year, 100];
   {d, e} = QuotientRemainder[b, 4];
   f = Quotient[8 b + 13, 25];
   g = Mod[19 a + b - d - f + 15, 30];
   {h, i} = QuotientRemainder[c, 4];
   j = Quotient[a + 11 g, 319];
   k = Mod[2 e + 2 h - i - g + j + 32, 7];
   month = Quotient[g - j + k + 90, 25];
   day = Mod[g - j + k + month + 19, 32];
   {year, month, day}
   ];

The result is periodic with period 5700000 years.

Compute a full period:

data = computusGauss /@ Range[5700000];

Project the results to a common year (it's irrelevant which one you choose):

data[[All, 1]] = 2018;

A date histogram shows this trapezoidal structure, between March 22 and April 25, both included:

DateListStepPlot[Tally[data]]

enter image description here

$\endgroup$
4
  • $\begingroup$ Could you add the code for the image too please? $\endgroup$
    – anderstood
    Commented Apr 3, 2018 at 20:24
  • 1
    $\begingroup$ I've added code to generate the final graphics. $\endgroup$
    – jose
    Commented Apr 4, 2018 at 2:31
  • $\begingroup$ Just to confirm that your function computeGauss gives the same result as J.M.'s easter. $\endgroup$
    – anderstood
    Commented Apr 4, 2018 at 14:19
  • $\begingroup$ @anderstood, it's actually very interesting to note the similarities and differences between the algorithms by Gauss and Anonymous. ;) $\endgroup$ Commented Apr 5, 2018 at 22:43
12
$\begingroup$

First, we need to get the data, here from the website tlarsen2.tripod.com:

data = Import["http://tlarsen2.tripod.com/thomaslarsen/easterdates.html", "Table"];
dates = Reverse /@ SortBy[Partition[Flatten@Select[data, 
         MemberQ[#, "April"] || MemberQ[#, "March"] &], 3], Last] /. 
    "April" -> 4 /. "March" -> 3;
dates = Table[date[[1 ;; 2]]~Join~{ToExpression@
      StringReplace[date[[3]], LetterCharacter -> ""]}, {date, dates}];
(* {{1700, 4, 11}, {1701, 3, 27}, {1702, 4, 16}, ... *)

Then, convert the dates into the number of days from the first possible date (March 23d), and use Tally to count:

monthsDays = dates[[All, {3, 2}]];
days = If[#[[2]] == 4, #[[1]] + 31 - 22., (#[[1]] - 22.)] & /@ monthsDays;
dist = Sort@Tally[days];

That's the date (counted in days after March 23d) as a function of the year (counted from 1700).

ListPlot[days]

enter image description here

Then, we can fit a Gaussian curve on the distribution (note: to see how to do this properly, check J.M.'s answer)

model[x_] = ampl Evaluate[PDF[NormalDistribution[mu, sigma], x]];
fit = FindFit[dist, model[x], {ampl, mu, sigma}, x]
Show[ListPlot[dist], 
 Plot[model[x] /. fit, {x, 0, 35}, PlotStyle -> Red]]

enter image description here

The result is not really well approximated by a Gaussian.

$\endgroup$
8
$\begingroup$

An article from the Irish Astronomical Journal, "The Frequency Distribution of the Dates of Easter", explains a way to find the frequency of occurrence for each date without computing Easter for the 5,700,000 years in the period. Here's code for the method.

{p, q, r} = {27550, 27075, 26600};
freq = {p, 2 q, 3 q, 4 p, 5 r, 6 p, 7 r, 7 p, 7 q, 7 q, 7 p, 7 r, 7 p,
    7 r, 7 p, 7 q, 7 q, 7 p, 7 r, 7 p, 7 r, 7 p, 7 q, 7 q, 7 p, 7 r, 
   7 p, 141/19 r, 8 p, 7 q, 6 q, 5 p, 4 r, 3 p, 30/19 r};
hist = Partition[Riffle[DayRange["Mar 22", "Apr 25"], freq], 2];
$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.