$\begingroup$

How scientists should deal with errors in observations of natural phenomena was debated for about 150 years, from roughly 1720 to 1870. The history is well documented in numerous publications by Oscar Sheynin (also Oskar Cheinine, Л. Б. Шейнин).

Two issues of central importance to the debate were:

  • whether the mean of a series of observations provides a better estimate of the true value than does a single "good" observation (see, for example, the paper by Simpson), and
  • how best to characterize the distribution of errors (see, for example, the paper on Gauss and the Theory of Errors by Sheynin).

Although I have looked at several of Sheynin's papers, and tracked down some of the papers to which he refers, I have been unable to find any actual set of observations (of anything ... astronomical, geodetic, or otherwise) that dates from the period and was used in any writing on either the mean or error distributions.

Can anyone suggest where I might locate such a data set, even one of only a handful of observations? ... It would be especially useful to me if I could find a set of repeated (i.e., resampling) observations, as opposed to a time series.

I might have emailed Sheynin himself, but I have only just learnt that he died a few weeks ago, on 3 January 2024. I am therefore asking here because of the subject matter, although SE Opendata is an obvious alternative if my question is not considered appropriate for this forum.

$\endgroup$

1 Answer

$\begingroup$

As described in this paper, Gauss became famous for his prediction of the orbit of Ceres, based on observations by the astronomer Piazzi. Piazzi's astronomical data (from January and February 1801) were published in the September 1801 issue of von Zach's journal Monatliche Correspondenz zur Beförderung der Erd- und Himmelskunde. Gauss's calculations are described in the December issue. Luckily, the journal is available online (in particular Volume 4, 1801), via the state library of Thuringia, Germany.

Edit: I'm adding a screenshot with Piazzi's observations ("data") from loc. cit. (Sept. 1801). Gauss used different subsets of these to predict the later position of Ceres.

[Image: Piazzi's observations, Jan/Feb 1801]

To deal with more data than is mathematically required, that is, to find an optimal solution to an overdetermined problem, Gauss invented the method of least squares. He describes this method in the third section of his book Theoria Motus (English translation online here).
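As a rough illustration of what "overdetermined" means here, the following is a minimal sketch, not Gauss's actual orbit computation (which is nonlinear and far more involved). It fits a straight line y = a + b·t to five points, although two would suffice to fix a line, by minimizing the sum of squared residuals. The function name `fit_line` and the numbers are hypothetical and made up for illustration; they are not Piazzi's observations.

```python
def fit_line(ts, ys):
    """Least-squares fit of y = a + b*t to paired data; returns (a, b).

    Solves the normal equations in closed form for the two-parameter case.
    """
    n = len(ts)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (t, y) pairs of equal length")
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    # Sums of squares and cross-products about the means.
    s_tt = sum((t - t_mean) ** 2 for t in ts)
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    if s_tt == 0:
        raise ValueError("all t values are identical; slope is undefined")
    b = s_ty / s_tt
    a = y_mean - b * t_mean
    return a, b


# Made-up illustrative values (assumption): five noisy points near a line.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a, b = fit_line(ts, ys)
residuals = [y - (a + b * t) for t, y in zip(ts, ys)]
print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
```

No line passes through all five points, so the fit leaves non-zero residuals. The least-squares solution is the one for which those residuals sum to zero and are uncorrelated with t, which is what the normal equations express.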

The method of least squares was developed independently by Legendre and apparently put to use in the meridian measurements by Delambre and Méchain. This story is told in Chapter 11 of Ken Alder's book cited on Méchain's Wikipedia page. With some effort, one should be able to find the data collected by Delambre and Méchain as well. A good source is Delambre's 1806 text Base du système métrique décimal, ou Mesure de l'arc du méridien compris entre les parallèles de Dunkerque et Barcelone, in particular the section on geodesic observations here.

$\endgroup$
  • $\begingroup$ Thank you, and especially for the detail. Having looked at the papers you mention, I only then thought clearly enough to realise that it is a resampling set that I will find especially useful. I have amended my question accordingly. But most definitely +1 ! Sadly, Simpson approaches the problem from the point of view of a hypothetical (with a hypothesized error distribution) but does not include an actual observational data set. $\endgroup$ Commented Feb 23 at 4:53
  • $\begingroup$ I've added a screenshot of Piazzi's "data" consisting of astronomical positions. $\endgroup$
    – Tom Heinzl
    Commented Feb 23 at 9:58
  • $\begingroup$ Thank you. I saw Piazzi's data ... unfortunately it is a time series, with the major part of the change in the measure being attributable to the motion of Ceres and not to sampling error. It would be different if he were observing a (notionally) "fixed" star on successive nights. $\endgroup$ Commented Feb 23 at 11:27
  • $\begingroup$ I see - but note that Gauss only needed three observations (of a total of 22) to predict the elliptical orbit. The December article says he used two distinct subsets of size 3 to make two slightly different predictions. This obviously raises the general question of how to choose/find the optimal prediction from a "sample" of predictions. $\endgroup$
    – Tom Heinzl
    Commented Feb 23 at 12:24
