For today's #30DayMapChallenge prompt "Process" I was processing a coastline dataset for Norway, trying to determine its fractal dimension.
osmdata.openstreetmap.de provides a nice extract of coastlines here: https://osmdata.openstreetmap.de/data/coastlines.html; you just need to clip it to your desired region. I did so by applying a buffer to the GADM dataset and then clipping by location in QGIS to get a Norway coastline line layer:
First step in calculating the fractal dimension: we need to reduce the resolution. I did this in steps so I could visualize the differences in a GIF afterwards:
import os

import geopandas as gpd
from shapely.geometry import LineString, MultiLineString
from shapely.ops import linemerge

# -----------------------------
# CONFIGURATION
# -----------------------------
INPUT_PATH = "/Users/riccardoklinger/Documents/work/norwa_coastline_diss_25832.gpkg"  # your detailed coastline line dataset
OUTPUT_FOLDER = "simplified_steps"
os.makedirs(OUTPUT_FOLDER, exist_ok=True)

# 11 simplification levels, in metres
tolerance_values = [
    10,
    50,
    100,
    200,
    400,
    800,
    1500,
    2500,
    4000,
    7000,
    12000,
]
min_length_threshold = 500  # drop fragments shorter than this (metres)

# -----------------------------
# LOAD DATA
# -----------------------------
print("Loading...")
gdf = gpd.read_file(INPUT_PATH)

# Ensure a projected CRS (coastline fractal analysis requires metres!)
if gdf.crs is None or gdf.crs.is_geographic:
    raise ValueError(
        "Dataset must be in a projected CRS (metres). Reproject before running!"
    )

# dissolve all lines into a single geometry
merged = gdf.union_all()
# linemerge yields a cleaner MultiLineString
merged = linemerge(merged)
# ensure it's a MultiLineString
if isinstance(merged, LineString):
    merged = MultiLineString([merged])

# -----------------------------
# FUNCTION: simplify + remove small components
# -----------------------------
def process_step(lines: MultiLineString, tolerance: float, min_length: float):
    simplified_parts = []
    for geom in lines.geoms:
        simp = geom.simplify(tolerance, preserve_topology=False)
        if simp.length >= min_length:
            simplified_parts.append(simp)
    if len(simplified_parts) == 0:
        print("Warning: after simplifying, all components were removed!")
        return MultiLineString([])
    return MultiLineString(simplified_parts)

# -----------------------------
# ITERATE THROUGH STEPS
# -----------------------------
current = merged
for i, tol in enumerate(tolerance_values, start=1):
    print(f"Step {i}: tolerance = {tol} m")
    current = process_step(current, tol, min_length_threshold)
    out_gdf = gpd.GeoDataFrame(geometry=[current], crs=gdf.crs)
    out_path = os.path.join(OUTPUT_FOLDER, f"coast_simplified_step_{i:02d}.gpkg")
    # Each step is stored in its own GPKG; a layer name is required
    layer_name = f"step_{i:02d}"
    out_gdf.to_file(out_path, layer=layer_name, driver="GPKG")
    print("Saved:", out_path)

print("Done.")
After this I got 11 nice files, each with a different level of simplification:

Here is a small glimpse of the Vesterålen, Lofoten and Ofoten islands at different levels of simplification:

After this I was curious about the fractal dimension of each coastline. The fractal dimension lies somewhere between 1 and 2: a strongly curved line is more than a purely linear feature (dimension 1), but it is not a true area (dimension 2) either. Put differently: a curved line fills some of the plane, and the fractal dimension measures how much. You can read about it on good old Wikipedia.
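The box-counting idea behind it can be written down compactly: cover the curve with square boxes of side length ε and count how many boxes N(ε) are occupied; then

```latex
D = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log(1/\varepsilon)}
```

In practice D is estimated as the slope of a straight-line fit of log N(ε) against log(1/ε) over a range of box sizes.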
The mathematical concept is explained quite well here. A basic implementation in Python might look like this:
# ---------------------------------------------
# Box-counting fractal dimension
# ---------------------------------------------
import numpy as np
from sklearn.linear_model import LinearRegression

def box_counting_dimension(geom, epsilons):
    minx, miny, maxx, maxy = geom.bounds
    Ns = []
    for eps in epsilons:
        occupied = set()
        # For each linestring: sample points along the line,
        # roughly every eps/2 metres so no box is skipped
        for line in geom.geoms:
            length = line.length
            n_samples = max(int(length / (eps / 2)), 2)
            for d in np.linspace(0, length, n_samples):
                x, y = line.interpolate(d).coords[0]
                ix = int((x - minx) // eps)
                iy = int((y - miny) // eps)
                occupied.add((ix, iy))
        Ns.append(len(occupied))
    # Regress log N(eps) against log(1/eps); the slope is the dimension
    X = np.log(1 / np.array(epsilons)).reshape(-1, 1)
    y = np.log(np.array(Ns))
    model = LinearRegression().fit(X, y)
    D = model.coef_[0]
    return D, Ns
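As a quick sanity check (my addition, not part of the original script): applied to a straight segment, the box count should scale like 1/ε, giving a dimension close to 1. Here is a dependency-light version of the same idea working on raw point samples, with `np.polyfit` standing in for scikit-learn's regression:

```python
import numpy as np


def box_count(xs, ys, eps):
    """Number of eps-sized boxes occupied by the sampled points."""
    ix = ((xs - xs.min()) // eps).astype(int)
    iy = ((ys - ys.min()) // eps).astype(int)
    return len(set(zip(ix.tolist(), iy.tolist())))


def box_dimension(xs, ys, epsilons):
    Ns = [box_count(xs, ys, eps) for eps in epsilons]
    # slope of log N(eps) vs log(1/eps) estimates the dimension
    slope, _intercept = np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(Ns), 1)
    return slope


# a straight 100 km segment, sampled every 5 m
t = np.linspace(0.0, 100_000.0, 20_001)
D = box_dimension(t, 0.5 * t, epsilons=[1000, 2000, 4000, 8000, 16000])
print(f"D = {D:.2f}")  # a value close to 1, as expected for a straight line
```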
In this process the ruler/box size is the interesting factor; I used sizes ranging from 1 km to 100 km. The script in the end does the following:
- loads the GPKGs
- calculates the fractal dimension for each
- adds the results to a dataframe
- prints the dataframe
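A sketch of that driver, assuming the simplified files from the first script live in `simplified_steps/` (the box-counting function is repeated here, with `np.polyfit` in place of scikit-learn, so the snippet is self-contained):

```python
import glob
import os

import geopandas as gpd
import numpy as np
import pandas as pd


def line_box_dimension(geom, epsilons):
    """Box-counting dimension of a (Multi)LineString via point sampling."""
    lines = geom.geoms if hasattr(geom, "geoms") else [geom]
    minx, miny, maxx, maxy = geom.bounds
    Ns = []
    for eps in epsilons:
        occupied = set()
        for line in lines:
            n = max(int(line.length / (eps / 2)), 2)
            for d in np.linspace(0, line.length, n):
                x, y = line.interpolate(d).coords[0]
                occupied.add((int((x - minx) // eps), int((y - miny) // eps)))
        Ns.append(len(occupied))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(Ns), 1)
    return slope


def build_dimension_table(folder, epsilons):
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, "*.gpkg"))):
        gdf = gpd.read_file(path)
        geom = gdf.union_all()
        rows.append({
            "file": os.path.basename(path),
            "fractal_dimension": line_box_dimension(geom, epsilons),
        })
    return pd.DataFrame(rows, columns=["file", "fractal_dimension"])


# ruler/box sizes from 1 km to 100 km
epsilons = [1000, 2000, 5000, 10000, 20000, 50000, 100000]
df = build_dimension_table("simplified_steps", epsilons)
print(df)
```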
Feel free to download the file here.
The final result was made in QGIS and EZGIF:
