12

I am writing a post-processing flow on Python. I am trying to come up with an algorithm to remove the small holes from the polygon.

I am using osgeo OGR right now, but I'm ok to do this part with shapely instead.

Currently, my solution is using Buffer like: geometry.Buffer(0.2 * 1 / 111111)

This works most of the time, but sometimes GDAL eats the whole memory on the machine and just crashes. I guess this is not optional at all. How can I accomplish my desired result?

The long gap on the right must persist, only the holes that are smaller the certain threshold have to be removed.

Polython with small holes

5
  • Is the geometry a polygon or a multipolygon? Commented Aug 27, 2021 at 21:02
  • 2
    If the polygons should have 0 holes, you could get the exterior ring and make a polygon from that. This will remove any holes in the interior of the polygon.
    – jbalk
    Commented Aug 28, 2021 at 0:59
  • @KadirŞahbaz the geometry is polygon
    – PYPL
    Commented Aug 28, 2021 at 7:03
  • Either do as @jbalk suggests or iterate through the rings and only keep the holes that are bigger than some threshold
    – Ian Turton
    Commented Aug 28, 2021 at 7:56
  • 1
    The "long line" isn't a "line", it's a gap between large exterior rings.
    – Vince
    Commented Aug 28, 2021 at 12:03

2 Answers 2

28

You can use the following example script. The script removes all holes from a polygon whose area is smaller than eps.

from shapely import wkt
from shapely.geometry import Polygon

# sample polygon
polygon_wkt = 'Polygon ((670651.58235156117007136 4087356.72129086637869477, 670941.20806862262543291 4087351.57990169199183583, 670944.42234629089944065 4087244.82082350691780448, 670648.40824284462723881 4087249.48037469061091542, 670651.58235156117007136 4087356.72129086637869477),(670669.30897117499262094 4087340.80642584012821317, 670671.69532915309537202 4087328.12050161883234978, 670709.06604568683542311 4087324.62686039274558425, 670709.12268763419706374 4087339.48419455718249083, 670669.30897117499262094 4087340.80642584012821317),(670670.63015638350043446 4087310.41313102655112743, 670670.24676115158945322 4087294.13430501008406281, 670711.11178680183365941 4087293.54062445322051644, 670711.46667919436004013 4087311.23376588197425008, 670670.63015638350043446 4087310.41313102655112743),(670669.74969205493107438 4087283.51269924966618419, 670668.85500870971009135 4087257.31934182625263929, 670755.23019460134673864 4087254.81062116287648678, 670754.28513388405553997 4087284.15052131982520223, 670669.74969205493107438 4087283.51269924966618419),(670796.18594406009651721 4087338.05096173053607345, 670797.80696410103701055 4087257.43525168858468533, 670807.4089543444570154 4087256.92088755592703819, 670806.12872222356963903 4087338.25089854281395674, 670796.18594406009651721 4087338.05096173053607345),(670821.05711140809580684 4087337.8436735225841403, 670824.12722124985884875 4087255.84220110531896353, 670929.67811466054990888 4087253.72090833634138107, 670930.47750282462220639 4087337.56890620524063706, 670821.05711140809580684 4087337.8436735225841403))'
polygon = wkt.loads(polygon_wkt)

list_interiors = []
eps = 1000

for interior in polygon.interiors:
    p = Polygon(interior)    
    if p.area > eps:
        list_interiors.append(interior)

new_polygon = Polygon(polygon.exterior.coords, holes=list_interiors)

Before:
enter image description here

After:
enter image description here


For a multipolygon, you should iterate over the list of its polygons as well.

from shapely import wkt
from shapely.geometry import Polygon, MultiPolygon

multipolygon_wkt = 'MultiPolygon (((670651.58235156117007136 4087356.72129086637869477, 670941.20806862262543291 4087351.57990169199183583, 670944.42234629089944065 4087244.82082350691780448, 670648.40824284462723881 4087249.48037469061091542, 670651.58235156117007136 4087356.72129086637869477),(670669.30897117499262094 4087340.80642584012821317, 670671.69532915309537202 4087328.12050161883234978, 670709.06604568683542311 4087324.62686039274558425, 670709.12268763419706374 4087339.48419455718249083, 670669.30897117499262094 4087340.80642584012821317),(670670.63015638350043446 4087310.41313102655112743, 670670.24676115158945322 4087294.13430501008406281, 670711.11178680183365941 4087293.54062445322051644, 670711.46667919436004013 4087311.23376588197425008, 670670.63015638350043446 4087310.41313102655112743),(670669.74969205493107438 4087283.51269924966618419, 670668.85500870971009135 4087257.31934182625263929, 670755.23019460134673864 4087254.81062116287648678, 670754.28513388405553997 4087284.15052131982520223, 670669.74969205493107438 4087283.51269924966618419),(670796.18594406009651721 4087338.05096173053607345, 670797.80696410103701055 4087257.43525168858468533, 670807.4089543444570154 4087256.92088755592703819, 670806.12872222356963903 4087338.25089854281395674, 670796.18594406009651721 4087338.05096173053607345),(670821.05711140809580684 4087337.8436735225841403, 670824.12722124985884875 4087255.84220110531896353, 670929.67811466054990888 4087253.72090833634138107, 670930.47750282462220639 4087337.56890620524063706, 670821.05711140809580684 4087337.8436735225841403)),((670648.88963292469270527 4087236.86040469771251082, 670943.25558728328906 4087232.40550666907802224, 670942.91290281957481056 4087129.60016754502430558, 670646.83352614217437804 4087133.71238110959529877, 670648.88963292469270527 4087236.86040469771251082),(670667.39459396700840443 4087217.67007472785189748, 670718.45457906532101333 4087218.35544365551322699, 670716.3984722828026861 4087206.36148742400109768, 670667.73727843072265387 4087207.04685635166242719, 670667.39459396700840443 4087217.67007472785189748),(670666.02385611203499138 4087192.65410887403413653, 670736.27417118020821363 4087192.65410887403413653, 670734.90343332523480058 4087158.38566249934956431, 670665.3384871844900772 4087159.07103142701089382, 670666.02385611203499138 4087192.65410887403413653),(670752.72302544000558555 4087176.54793907795101404, 670773.96946219238452613 4087175.17720122309401631, 670770.19993309106212109 4087158.72834696341305971, 670750.32423419377300888 4087157.70029357215389609, 670752.72302544000558555 4087176.54793907795101404),(670794.18784555338788778 4087175.86257015075534582, 670922.00915053102653474 4087176.20525461435317993, 670919.26767482107970864 4087139.19533252995461226, 670794.18784555338788778 4087140.90875484840944409, 670794.18784555338788778 4087175.86257015075534582),(670806.52448624826502055 4087215.61396794533357024, 670825.02944729058071971 4087214.92859901767224073, 670821.94528711691964418 4087184.08699728036299348, 670803.09764161077328026 4087183.40162835316732526, 670806.52448624826502055 4087215.61396794533357024),(670838.73682584054768085 4087214.24323009047657251, 670923.72257284971419722 4087213.55786116281524301, 670922.69451945857144892 4087182.37357496190816164, 670838.05145691300276667 4087186.48578852694481611, 670838.73682584054768085 4087214.24323009047657251)))'
multipolygon = wkt.loads(multipolygon_wkt)

list_parts = []
eps = 1000

for polygon in multipolygon.geoms:
    list_interiors = []
    
    for interior in polygon.interiors:
        p = Polygon(interior)

        if p.area > eps:
            list_interiors.append(interior)

    temp_pol = Polygon(polygon.exterior.coords, holes=list_interiors)
    list_parts.append(temp_pol)
    
new_multipolygon = MultiPolygon(list_parts)

Before:

enter image description here

After:

enter image description here

2
  • Consider using new_multipolygon = unary_union(list_parts) for the creation of the multipolygon. Otherwise you might receive not a JSON compliant number error
    – odedfos
    Commented Mar 23, 2023 at 13:57
  • Amazing! I tried many alternatives but this was the only that worked well!
    – MauTOz
    Commented Jan 9 at 12:33
0

Here is a nicely packaged function that works with many datatypes:

from typing import TypeVar, Callable

import geopandas as gpd
from shapely.geometry import Polygon, MultiPolygon


def _remove_holes(geom, should_remove):
    def p(p: Polygon, should_remove) -> Polygon:
        holes = [i for i in p.interiors if not should_remove(Polygon(i))]
        return Polygon(shell=p.exterior, holes=holes)

    def mp(mp: MultiPolygon, should_remove) -> MultiPolygon:
        return MultiPolygon([p(i, should_remove) for i in mp.geoms])

    if isinstance(geom, Polygon):
        return p(geom, should_remove)
    elif isinstance(geom, MultiPolygon):
        return mp(geom, should_remove)
    else:
        return geom


_Geom = TypeVar("_Geom", Polygon, MultiPolygon, gpd.GeoSeries, gpd.GeoDataFrame)


def remove_holes(
    geom: _Geom, *, should_remove: Callable[[Polygon], bool] = lambda hole: True
) -> _Geom:
    """Remove all holes from a geometry that satisfy the filter function."""
    if isinstance(geom, gpd.GeoSeries):
        return geom.apply(_remove_holes, should_remove=should_remove)
    elif isinstance(geom, gpd.GeoDataFrame):
        geom = geom.copy()
        geom["geometry"] = remove_holes(geom["geometry"], should_remove=should_remove)
        return geom
    return _remove_holes(geom, should_remove=should_remove)


# remove holes smaller than 1 km^2
# remove_holes(gdf, should_remove=lambda hole: hole.area < 1e6)

Not the answer you're looking for? Browse other questions tagged or ask your own question.