10
$\begingroup$

this question relates to interpolation of 3D data and Improved interpolation of mostly-structured 3d data

I have an irregular 3d data set that is quite large (around 70,000 triplets---let's call it Data1). ListPointPlot3D[Data1] gives

list point plot of data

Clearly, the characteristic feature of this set is that its resolution differs vastly from one direction to the other so that main information for the data lies in thirty stripes of $> 2000$ points each. I have issues with creating an interpolation for this dataset for the reasons mentioned in the other two posts so I would like to "normalise" my grid so that this becomes interpolatable.

At this stage, I'd be happy to get even linear interpolation along the non-dense direction but running Interpolation[Data1] just crashes the Kernel.

What I thought of doing was split the strips via FindClusters, find the one with the least elements and then reduce the elements of the rest by dropping elements (I can afford to lose some information at the edges) but I don't really know how to implement the right distance in FindClusters.

The default Euclidean distance obviously doesn't work: the $(x, y)$ coordinates of the set forced to split in thirty clusters gives

Take[#, 2] & /@ Data1;
FindClusters[%, 30];
ListPlot[%]
clustered_data

So my question is two-fold:

  • How to implement the distance that would cluster the data properly?

  • Can you think of a better way to regularise the grid so as to be able to interpolate this data set?

Here's a link to part of the data.

----EDIT----

I managed to solve my problem but I'll keep the question open in case anyone would care to improve.

For the record, the Non-Grid Interpolation Package looks really good but didn't really help here - I gave up on the Delaunay triangulation after 15' of evaluation time and the nearest neighbour method didn't look all that good.

So primarily what I did was I straightened the data set (an answer explains one way to obtain the angle) and then sorted it and split it to clusters by the $ x $-coordinate:

Rot[x_] := {{Cos[x], Sin[x], 0}, {-Sin[x], Cos[x], 0}, {0, 0, 1}};
RotData = Rot[.10795 Pi].# & /@ Data1;
RotData = SortBy[#, #[[2]] &] & /@ (RotData~GatherBy~(Round[First@#, 165]&)~SortBy~First);

(don't ask how I chose to round to 165!).I then added a missing stripe by averaging:

Mean@(RotData[[1]] + RotData[[2]]);
RotData = Insert[RotData, %, 2];

Following that, I calculated the domain where all data take values and defined a resolution and step for future reference:

lowlimit = Max@(Min /@ (Map[#[[2]] &, RotData, {2}]));
upplimit = Min@(Max /@ (Map[#[[2]] &, RotData, {2}]));
resolution = 2200;
step = (upplimit - lowlimit)/resolution;

And I finally created interpolating functions along each cluster and ran a loop to get them to evaluate to points on a regular grid. This wouldn't work if I didn't average over the x-coordinate but for this dataset this gives a very small error.

Table[Module[{TempXData, TempInterpData, t},
    TempXData = Evaluate@RotData[[j]];
    TempInterpData = Drop[#, 1] & /@ TempXData;
    t = Interpolation[TempInterpData, Method -> "Spline"];
    Table[{Mean@(First /@ TempXData), y, t[y]}, {y, lowlimit,upplimit,step}]], {j, 1, Length[RotData]}];
IntData = Flatten@%~Partition~3;

Now, IntData is an interpolatable set:

enter image description here

So (in the hope this can be useful to someone OTHER than myself) if you run into a dataset you can't interpolate, you might want to think about "striping it", Interpolating the stripes and then using the interpolation to populate a regular grid.

$\endgroup$
7
  • 1
    $\begingroup$ Can you post some sample data to play with please? $\endgroup$
    – Szabolcs
    Commented Jul 18, 2012 at 16:22
  • 1
    $\begingroup$ There was a question where I had to "straighten" a lattice. I'm wondering if we can do the same here, then interpolate, then transform back to the original cooridnates. $\endgroup$
    – Szabolcs
    Commented Jul 18, 2012 at 16:24
  • 1
    $\begingroup$ Yes, giving a link to the data set would be very good, then I can check if the crash is fixed in the development version. $\endgroup$
    – user21
    Commented Jul 18, 2012 at 16:27
  • $\begingroup$ Yep - I've uploaded a sample of the data. I remember trying to straighten it at some point but that wasn't really the issue. In any case, the angle was a little less than $ \pi /10 $ $\endgroup$
    – gpap
    Commented Jul 18, 2012 at 16:55
  • 3
    $\begingroup$ I filed the crash as a bug and we hopefully have this resolved soon. Thanks for reporting. $\endgroup$
    – user21
    Commented Jul 18, 2012 at 17:32

2 Answers 2

5
$\begingroup$

This will create the clusters by hand (but quickly), calculate the mean slope and use it to rotate ("straighten") your plot. Sorry for the unclean code.

(*read your file*)l = ReadList["c:\\long.dat", Number];
l1 = Partition[l, 3];
(*xy coordinates*)
l3 = l1[[All, 1 ;; 2]];
(*Clusterization by hand
using the fact that points are sequentially listed*)
pp = Mean[EuclideanDistance[#[[1]], #[[2]]] & /@ Partition[l3, 2, 1]];
i = 1; j = 2;
s = {};
AppendTo[s, {l3[[1]]}];
While[j <= Length@l3, 
  If[Norm[l3[[j - 1]] - l3[[j]]] < 20 pp, AppendTo[s[[i]], l3[[j++]]],
    i++; AppendTo[s, {l3[[j++]]}]]];
ListPlot@s
(*Mean Slope*)
ms = ArcTan@Mean@(Coefficient[#, x, 1] & /@ Fit[#, {1, x}, x] & /@ s);
l4 = Map[RotationMatrix[-ms].# &, s, {2}];
ListPlot@l4

enter image description hereenter image description here

$\endgroup$
1
  • $\begingroup$ Thanks (+1). I actually managed to do this just now too. Once I got the angle (which for my accuracy was $0.10795 \pi$ I used (rot[0.10795 \[Pi]].# & /@ Data1)~GatherBy~(Round[First@#, 165] &). It's really clumsy as rounding to 165 was chosen as the number that produces 30 sublists, but it works. I will make an edit tomorrow as I am not sure how to proceed for interpolating. I am favouring creating an interpolation for each line and then using this to populate a regular grid. $\endgroup$
    – gpap
    Commented Jul 18, 2012 at 20:35
7
$\begingroup$

I was going to update my other answer, but just found this:

Non-Grid Interpolation Package

Seems nice ... let's see if it can cope with your large dataset.

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.