I have two shapefiles. One is Point Data (Call it Adj). The other is a polygon(s) shape file. I want to clip the data to keep only the points that are not in the polygons. Is there an easy way to do this? The example .clip() function on geopandas keeps everything that is in the polygon.
2 Answers
You may use the overlay
function to do this by using the difference
option.
Example:
from shapely.geometry import Point, Polygon
import geopandas
poly = geopandas.GeoSeries([Polygon([(0,0), (2,0), (2,2), (0,2)])])
gdf = geopandas.GeoDataFrame({'geometry': poly, 'df':[1]})
points = geopandas.GeoSeries([Point([0, 1]), Point([3, 0])])
gdf2 = geopandas.GeoDataFrame({'geometry':points, 'df':[1,2]})
gdf2.overlay(gdf, how='difference')
Which should result in:
geometry df
0 POINT (3.00000 0.00000) 2
Using geopandas, the typical way to determine spatial relations efficiently (= using a spatial index) is by using sjoin. You are looking for the predicate=disjoint
relation though, which is not directly supported (reference).
However, you can easily bypass this by doing an sjoin
first with predicate=intersects
, and then selecting the rows that were not joined.
This sample script illustrates how:
import geopandas as gpd
from shapely import box, Point
import matplotlib.pyplot as plt
# Some test data
polys = gpd.GeoDataFrame(
data={
"poly_name": ["poly_left", "poly_right"],
"geometry": [box(0, 0, 2, 2), box(4, 0, 6, 2)],
}
)
points = gpd.GeoDataFrame(
{
"point_name": ["point1", "point2", "point3", "point4"],
"geometry": [Point(3, 0), Point(1, 1), Point(7, 1), Point(5, 1)],
},
)
# Join the points that intersect with the polygons.
# This uses an rtree under the hood, so is fast.
points_intersects = points.sjoin(polys, predicate="intersects")
# Now only retain the points that don't intersect.
points_disjoint = points.loc[~points.index.isin(points_intersects.index)]
# Print result
print(points_disjoint)
# Plot input data and result
fig, ax = plt.subplots()
polys.plot(ax=ax, edgecolor="red", facecolor="none")
points.plot(
ax=ax,
color="blue",
facecolor="none",
)
points_disjoint.plot(ax=ax, edgecolor="green", facecolor="none")
plt.show()
Result: the points in the polygons (intersects) are blue, the ones outside the polygons (disjoint):