9
$\begingroup$

I want to use the numerical approximation of the integral of a function given a list of data:

$$\int_a^bf(x)dx\approx\sum_{k=1}^N\frac{f(x_{k})+f(x_{k-1})}{2}(x_{k}-x_{k-1}),$$

where $f(x_0)=f(a)$ and $f(x_N)=f(b)$. I have the following list of data $(x_k,f(x_k))$:

Data1:={{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735,4.56075}, {0.361943,3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, {0.9695,-14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, {1.33873, -13.9118}, {1.34442, -13.9659}, {1.50195,-15.9091}, {1.54436,-16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, {2.12293,-13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, {2.81243,-8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, {4.06738,-2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, {4.50307,-1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, {4.68833,-0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}}

which I'd like to take to construct the result. I've tried RecurrenceTable and Interpolation, but I want to use only the given data.

EDIT:

I've tried

Sum[(Data1[[k, 2]] + Data1[[k - 1, 2]])/2(Data1[[k, 1]]-Data1[[k-1,1]]),{k,2,Length[Data1]}]
$\endgroup$
2
  • 1
    $\begingroup$ Mathematica v12.2 evalutes your code correctly! $\endgroup$ Commented Jun 24 at 7:06
  • $\begingroup$ Your code gives the correct answer in V14.0.0, too! $\endgroup$
    – Michael E2
    Commented Jun 24 at 23:44

8 Answers 8

5
$\begingroup$

To compute the numerical approximation of the integral of a function given a list of data using the trapezoidal rule, you may need to follow the approach you described. Your method is essentially correct; however, there are a few minor issues in your code. The expression (Data1[[k, 2]] + Data1[[k - 1, 2]])/2(Data1[[k, 1]]-Data1[[k-1,1]]) is missing a multiplication operator between 1/2 and the interval length.

Here is the improved code:

Data1 = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 4.56075}, 
         {0.361943, 3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, 
         {0.9695, -14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, 
         {1.33873, -13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, 
         {1.54436, -16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, 
         {2.12293, -13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, 
         {2.81243, -8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, 
         {4.06738, -2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, 
         {4.50307, -1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, 
         {4.68833, -0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};

integral = Sum[(Data1[[k, 2]] + Data1[[k - 1, 2]])/2 * (Data1[[k, 1]] - Data1[[k - 1, 1]]), {k, 2, Length[Data1]}]

Print["The integral approximation is: ", integral]

enter image description here

$\endgroup$
1
  • $\begingroup$ "(Data1[[k, 2]] + Data1[[k - 1, 2]])/2(Data1[[k, 1]]-Data1[[k-1,1]]) is missing a multiplication operator between 1/2 and the interval length" — This statement is false. There is no need to insert the multiplication operator in Mathematica though there often is such a need in other languages. InputForm[] will show all multiplication operators, though. Check InputForm[ Hold[(Data1[[k, 2]] + Data1[[k - 1, 2]])/2 (Data1[[k, 1]] - Data1[[k - 1, 1]])]] to see that the multiplication is automatically interpreted. $\endgroup$
    – Michael E2
    Commented Jun 24 at 23:48
11
$\begingroup$

Another approach is to combine Interpolation with NIntegrate.

Data1 = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 
4.56075}, {0.361943, 
3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, {0.9695, \
-14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, {1.33873, \
-13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, {1.54436, \
-16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, {2.12293, \
-13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, {2.81243, \
-8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, {4.06738, \
-2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, {4.50307, \
-1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, {4.68833, \
-0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};

f = Interpolation[Data1, InterpolationOrder -> 1];

Int = NIntegrate[f[x], Flatten@{x, MinMax[Data1[[All, 1]]]}]

Print["The integral approximation is: ", Int]

enter image description here

$\endgroup$
3
  • 3
    $\begingroup$ I'd suggest you use Integrate instead of NIntegrate on interpolations. The results will be more accurate because no sampling is involved. Integrate[f[x], Flatten@{x, f["Domain"]}] $\endgroup$
    – Roman
    Commented Jun 24 at 10:41
  • 2
    $\begingroup$ What @Roman suggests is in effect the built-in trapezoidal rule and should be the accepted answer. In fact, Integrate[f[x], x] produces an InterpolatingFunction antiderivative of an InterpolatingFunction f[x] over its domain (with f[a] == 0, where a is the lower limit of the domain). This useful fact in not sufficiently emphasized in the documentation, it seems, although it is now pointed out, once for an indefinite integral and once for this type of definite integral in the documentation for Integrate[]. $\endgroup$
    – Michael E2
    Commented Jun 24 at 14:16
  • $\begingroup$ Yes, exactly. Very well put, @MichaelE2! $\endgroup$
    – Roman
    Commented Jun 24 at 15:23
6
$\begingroup$

Using:

Data1 = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 4.56075}, 
         {0.361943, 3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, 
         {0.9695, -14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, 
         {1.33873, -13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, 
         {1.54436, -16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, 
         {2.12293, -13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, 
         {2.81243, -8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, 
         {4.06738, -2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, 
         {4.50307, -1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, 
         {4.68833, -0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};

You can use Differences and BlockMap, e.g.

Differences[#1] . BlockMap[Mean, #2, 2, 1] & @@ Transpose[Data1]

-> -35.8367

$\endgroup$
1
  • 2
    $\begingroup$ Nice answer. Not posting as my own answer because it's not unique enough, but you could also swap BlockMap for MovingAverage if you wanted to. $\endgroup$
    – ydd
    Commented Jun 24 at 12:16
6
$\begingroup$

I think @Roman's modification of @JocelynMinini's use of Interpolation + Integrate is easy to code, easy to understand, and accurate; and on small data sets, it is probably the best way to code the solution.

If performance matters, then the following is fast and presented in functional style, with each step following the postfix operator //:

Developer`ToPackedArray@data1 // (* see below about packing *)
  Transpose //
 Apply[{x, y} |-> (Most@y + Rest@y).Differences@x / 2] //

(* OR *)

Developer`ToPackedArray@data2 //
  Transpose //
 Apply[{x, y} |-> ListConvolve[{0.5, 0.5}, y] . Differences@x]

Developer`ToPackedArray may be omitted if the data is known to be packed. If the data has mixed types, but can be represented in machine floating-point, then use Developer`ToPackedArray@N@data1 or Developer`ToPackedArray[data1, Real].

ListConvolve[{0.5, 0.5}, y] is nearly as efficient as Most@y + Rest@y, sometimes beating it in timings. I omit it from the discussion of performance below. It can be sped up slightly by packing the kernel {0.5, 0.5}. And ListConvolve[{0.5, 0.5}, y].Differences[x] may (or may not) be a familiar trapezoidal rule formula.

Performance issues.

Developer`ToPackedArray@data1 //
   Transpose //
  Apply[{x, y} |-> (Most@y + Rest@y).Differences@x / 2] //
 RepeatedTiming

Interpolation[data1, InterpolationOrder -> 1] // (* @Roman/@Jocelyn *)
  Integrate[#[x], Flatten@{x, #["Domain"]}] & //
 RepeatedTiming

Sum[                                             (* @zeraoulia rafik *)
 (data1[[k, 2]] + data1[[k - 1, 2]])/
    2*(data1[[k, 1]] - data1[[k - 1, 1]]), {k, 2, Length[data1]}] //
 RepeatedTiming

Total[(data1[[;; -2, 2]] + data1[[2 ;;, 2]])/    (* @so_sure *)
    2  (data1[[2 ;;, 1]] - data1[[;; -2, 1]])] //
 RepeatedTiming

Differences[#1] . BlockMap[Mean, #2, 2, 1] & @@  (* @ubpdqn *)
  Transpose[data1] //
 RepeatedTiming
(*
{5.72664*10^-6, -35.8367}  <-- Dot[]
{0.0000518135, -35.8367}   <-- My rec of @Jocelyn's: slowest, but not slow
{0.0000450536, -35.8367}   <-- Sum[]
{0.000042, -35.8367}       <-- Total[] is usually faster than Sum[]
{0.0000493315, -35.8367}   <-- BlockMap[]
*)

A larger data set, in which performance is more evident:

SeedRandom[0];
data2 = SortBy[RandomReal[{-1, 1.1}, {10^4, 2}], First];

Developer`ToPackedArray@data2 // (* ToPackedArray[] not needed *)
   Transpose //
  Apply[{x, y} |-> (Most@y + Rest@y).Differences@x / 2] //
RepeatedTiming

Interpolation[data2, InterpolationOrder -> 1] //
  Integrate[#[x], Flatten@{x, #["Domain"]}] & //
 RepeatedTiming

Sum[(data2[[k, 2]] + data2[[k - 1, 2]])/
    2*(data2[[k, 1]] - data2[[k - 1, 1]]), {k, 2, Length[data2]}] //
 RepeatedTiming

Total[(data2[[;; -2, 2]] + data2[[2 ;;, 2]])/
    2  (data2[[2 ;;, 1]] - data2[[;; -2, 1]])] // RepeatedTiming

Differences[#1] . BlockMap[Mean, #2, 2, 1] & @@ Transpose[data2] //
 RepeatedTiming
(*    |
{0.0000663423, 0.125917}     <-- Very fast   (1st pl.)
{0.0158068, 0.125917}        <-- My rec is starting to look slow
{0.0239918, 0.125917}        <-- Sum[], quite the slowest
{0.000302133, 0.125917}      <-- Total[]     (2nd pl.)
{0.00250427, 0.125917}       <-- BlockMap[]  (3rd pl.)
      |                                                       *)

Note on Sum[]. Sum[] is principally a symbolic summation tool. It will symbolically analyze its argument and try to choose a summation method. In the case of a definite sum, it will choose the "Procedural" method if the number of terms is small, which method sums terms step by step. It is not very fast at this, at present, as one can see in the timings. If the number of terms is large, Sum[] may try to find a general formula it can apply instead of adding up all the terms (for instance, Sum[2. i, {i, 10^12}]). Sum[] falls back on "Procedural" if all else fails, which is what happens in the cases above. (In fact, it probably does that quickly here, since the summand depends on the data list.) In any case, the function provided for summing data in Mathematica is Total[]. If one thinks the advantage of Sum[] is to use a formula for the summand, then Total@Table[formula, {k, a, b}] will do the same thing, usually much faster. For instance, compare 0.0005 sec. of Total@Table[] below with the 0.024 sec. of Sum[] above, almost 50 times the speed:

Total@Table[
   (data2[[k, 2]] + data2[[k - 1, 2]])/
     2*(data2[[k, 1]] - data2[[k - 1, 1]]),
   {k, 2, Length[data2]}] // RepeatedTiming

(*  {0.000506291, 0.125917}  *)

The documentation for Sum[] points out the equivalence of Sum[..] and Total[Table[..]], but it does not give advice on use-case for each. Hence this somewhat lengthy answer for an elementary problem.

One can find the bits of advice given in this answer in other posts here over the last decade, which is how I learned most of it. We have many new members since then, and with almost a hundred thousand Q&As, I'm not sure how to share the accumulated knowledge contained in the site. Over 2,000 have a score of 20 or more; 8,000 with 10+ votes. It's getting hard for me to find things. I can hardly expect others to get up to speed in a short time by going through the site.

$\endgroup$
2
  • $\begingroup$ Very well-written answer. It seems to me that WRI need to be more explicit about their implementation details, or at least rewrite some of the documentation with an eye towards best practices for performance. $\endgroup$ Commented Jun 24 at 19:49
  • $\begingroup$ @MichaelE2 thank you for this practical insight. I had noticed my answer had poor performance. Your answer provides the insights. +1 of course. What is “under the hood” affects choices and I need to think about it more. $\endgroup$
    – ubpdqn
    Commented Jun 26 at 23:12
2
$\begingroup$

Here's a faster approach using Total.

Data1 = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 
    4.56075}, {0.361943, 
    3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, {0.9695, \
-14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, {1.33873, \
-13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, {1.54436, \
-16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, {2.12293, \
-13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, {2.81243, \
-8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, {4.06738, \
-2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, {4.50307, \
-1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, {4.68833, \
-0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};
 

Total[(Data1[[;; -2, 2]] + Data1[[2 ;;, 2]])/
    2 (Data1[[2 ;;, 1]] - Data1[[;; -2, 1]])] // AbsoluteTiming

which gives

{0.00003, -35.8367}

In comparison,

Sum[(Data1[[k, 2]] + Data1[[k - 1, 2]])/
     2 (Data1[[k, 1]] - Data1[[k - 1, 1]]), {k, 2, Length[Data1]}] // 
  AbsoluteTiming

gives

{0.000065, -35.8367}

If you are dealing with large datasets, using Total will be more efficient.

$\endgroup$
2
$\begingroup$
Data1 = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 4.56075}, 
     {0.361943, 3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, 
     {0.9695, -14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, 
     {1.33873, -13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, 
     {1.54436, -16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, 
     {2.12293, -13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, 
     {2.81243, -8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, 
     {4.06738, -2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, 
     {4.50307, -1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, 
     {4.68833, -0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};

Following @ubpdqn's idea, you can also use Differences and MovingMap, e.g.

Differences[#1] . MovingMap[Mean, #2, 1] & @@ Transpose[Data1]

-35.8367

Or, as @ydd points out, you can also use MovingAverage as follows:

With[{diff = Differences[Data1[[All, 1]]],                              
      aver = MovingAverage[Data1[[All, 2]], 2]
      }, Total[MapThread[(#1*#2) &, {diff, aver}]]]

-35.8367

$\endgroup$
2
$\begingroup$
data = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 4.56075}, 
     {0.361943, 3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, 
     {0.9695, -14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, 
     {1.33873, -13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, 
     {1.54436, -16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, 
     {2.12293, -13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, 
     {2.81243, -8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, 
     {4.06738, -2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, 
     {4.50307, -1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, 
     {4.68833, -0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};

A variant of E. Chan-López' second answer using Query

Dot @@ Query[{1 -> Differences, 2 -> (MovingAverage[#, 2] &)}] @ Transpose[data]

-35.8367

$\endgroup$
1
  • $\begingroup$ This variant is the nicest! ;) $\endgroup$ Commented Jun 25 at 1:31
1
$\begingroup$
data = {{0.0708461, 3.81099}, {0.152771, 6.5486}, {0.334735, 4.56075}, 
     {0.361943, 3.44458}, {0.734831, -13.7032}, {0.881342, -15.2212}, 
     {0.9695, -14.8131}, {1.23953, -13.3162}, {1.26711, -13.4099}, 
     {1.33873, -13.9118}, {1.34442, -13.9659}, {1.50195, -15.9091}, 
     {1.54436, -16.4444}, {1.66381, -17.5282}, {1.99997, -15.3791}, 
     {2.12293, -13.5003}, {2.32644, -11.0439}, {2.42306, -10.3641}, 
     {2.81243, -8.57635}, {3.02891, -6.8037}, {3.07009, -6.42764}, 
     {4.06738, -2.08671}, {4.31106, -1.35298}, {4.49278, -1.02712}, 
     {4.50307, -1.01467}, {4.63782, -0.892636}, {4.66641, -0.873371}, 
     {4.68833, -0.859377}, {4.87293, -0.740146}, {4.95529, -0.67409}};

Using DiscreteIntegralPlot by Dennis M Schneider

DiscreteIntegralPlot = ResourceFunction["DiscreteIntegralPlot"];

DiscreteIntegralPlot[data, "Trap"]

enter image description here

To only get the area value:

DiscreteIntegralPlot[data, "Trap", "DrawGraph" -> False]

-35.8367

One of the other methods:

DiscreteIntegralPlot[data, "Left"]

enter image description here

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.