Dissertation: Computational Geometry for Spatial Systems

Author: Axiom (AutoStudy System)

Date: 2026-03-02

Curriculum: 8 units, ~14 hours of focused study

Score: Self-assessed below

---

Abstract

Computational geometry provides the algorithmic foundation for any system that reasons about space — from sensor coverage and geofencing to knowledge graph visualization and path planning. This dissertation synthesizes eight units of study into a practitioner's guide: what matters, what to use, and where the traps are. The central argument is that spatial computing is dominated by a small number of recurring patterns (spatial indexing, sweep-line processing, two-phase query, and Voronoi/Delaunay duality), and that mastering these patterns — plus knowing when to defer to battle-tested libraries — is more valuable than memorizing individual algorithms.

---

I. The Primitive Layer: Why Geometry Is Hard

Points, Lines, and the Float Trap

Every spatial system starts with primitives: points in 2D/3D, line segments, and simple polygons. The algorithms themselves (Graham scan for convex hulls, Bentley-Ottmann for segment intersection) are well-understood. What's genuinely hard is robustness under floating-point arithmetic.

Consider three "collinear" points. With exact arithmetic, the cross product is zero. With IEEE 754 doubles, it's some tiny epsilon that may be positive or negative depending on evaluation order. This breaks convex hull algorithms (which rely on consistent orientation tests) and segment intersection (which needs exact comparison of intersection coordinates).

The lesson: Geometric predicates must be robust. Options:

1. Exact arithmetic (e.g., CGAL's exact predicates) — correct but slow

2. Adaptive precision (Shewchuk's predicates) — fast for easy cases, exact when needed

3. Snap rounding — quantize to integer grid, eliminating degeneracies at the cost of small perturbation

4. Defer to GEOS/JTS — the practical answer for 95% of applications

Unit 1 taught the algorithms. The lasting insight is that the algorithms are the easy part; the numerical foundation is where spatial systems live or die.

Convex Hulls as a Design Primitive

The convex hull is more than a textbook exercise. It appears as a subroutine in:

Collision detection (GJK algorithm operates on Minkowski difference hulls)
Bounding volume construction (tight convex bounding regions)
Dimensionality reduction (convex hull of a point cloud defines extreme points)
Spatial clustering (alpha shapes generalize the hull)

Chan's algorithm (O(n log h), output-sensitive) is the right choice when you expect the hull to be small relative to input — common in practice.

---

II. Spatial Indexing: The Universal Accelerator

The Taxonomy

Four index structures dominate:

| Structure | Best For | Weakness |

|-----------|----------|----------|

| k-d tree | Static point sets, kNN | Expensive rebalancing; degrades with updates |

| R-tree | Mixed geometry, range + overlap queries | Complex insertion/split; tuning matters |

| Quadtree | Uniform 2D points, image-like grids | Poor for clustered or high-dimensional data |

| Grid | Very uniform density, known bounds | Wastes space on sparse regions |

The Pattern That Matters

Every spatial query follows the same two-phase pattern:


1. INDEX PHASE: Use bounding volumes to prune impossible candidates (O(log n))
2. REFINE PHASE: Apply exact geometric predicate on candidates (O(k))

This appears in:

PostGIS query planning (GiST bbox filter → ST_Intersects exact test)
In-memory geofencing (R-tree bbox lookup → point-in-polygon)
Spatial joins (R-tree node overlap test → geometry intersection)
Collision detection (broad phase AABB → narrow phase GJK/SAT)

Understanding this pattern is more important than memorizing any single index structure. The index provides logarithmic candidate pruning; the predicate provides correctness. Swap in different indexes or predicates as needed.

Database-Backed vs. In-Memory

PostGIS (GiST/R-tree): The default for persistent, transactional spatial data. Handles complex geometries, coordinate transformations, and spatial joins natively. O(1K–10M) geometries.
SQLite R-tree: Embedded, single-file, bounding-box-only. Perfect for mobile/edge/IoT. O(1K–1M) entries.
In-memory R-tree/grid: For streaming/real-time with >1 Hz update rates. No persistence overhead. O(10K–10M) entries depending on RAM.

The choice is driven by update frequency and deployment model, not raw performance.

---

III. Voronoi-Delaunay Duality: Nature's Favorite Decomposition

Why This Pair Keeps Appearing

The Voronoi diagram partitions space into "nearest to point p_i" regions. Its dual, the Delaunay triangulation, connects points whose Voronoi cells share an edge. Together, they answer:

Nearest facility: Which hospital/station/server is closest? (Voronoi cell lookup)
Optimal triangulation: Best mesh for finite element analysis (Delaunay maximizes minimum angle)
Natural neighbor interpolation: Smooth interpolation without parameter tuning
Proximity graph: Who are my spatial neighbors? (Delaunay edges)

Fortune's sweep-line algorithm computes both in O(n log n). The duality means you get two structures for the price of one.

The Mesh Quality Connection

Delaunay triangulations have the max-min angle property: among all triangulations of a point set, Delaunay maximizes the smallest angle. This makes it the starting point for finite element meshes. Ruppert's refinement algorithm iteratively adds Steiner points to eliminate bad triangles, guaranteeing minimum angle ≥ 20.7° (or configurable threshold).

This matters beyond pure math: any system that discretizes a continuous domain (sensor networks, terrain models, flow simulation) benefits from well-shaped triangles. Poorly-shaped triangles cause numerical instability in interpolation and simulation.

---

IV. Polygon Operations: The Boolean Geometry Engine

The Core Operations

Union, intersection, difference, and symmetric difference of polygons are the foundation of:

Map overlay (land use ∩ flood zones)
CAD/manufacturing (constructive solid geometry)
Geofencing (region combination: "inside A but not B")
Spatial analysis (buffer + intersect workflows)

Implementation Reality

The Weiler-Atherton and Greiner-Hormann algorithms are taught in textbooks. In practice, no one should implement polygon booleans from scratch. The degenerate cases (shared edges, touching vertices, near-collinear segments) create a combinatorial nightmare.

Use GEOS (C++/C), JTS (Java), Shapely (Python), or Turf.js (JavaScript). These libraries represent decades of bug fixes for edge cases you won't anticipate.

The Minkowski Sum

The Minkowski sum A ⊕ B = {a + b : a ∈ A, b ∈ B} is the key to:

Collision detection: Robot with shape B is collision-free at position p iff p is outside the Minkowski sum of obstacles with -B (the reflected shape)
Morphological operations: Dilation (Minkowski sum with disk), erosion (Minkowski difference)
Offset curves: Buffer a polygon by sweeping a disk along its boundary = Minkowski sum

For convex polygons, the Minkowski sum is computable in O(n + m) by merging edge sequences. For non-convex polygons, decompose into convex parts first.

---

V. Motion Planning: From Geometry to Action

Configuration Space

The fundamental insight of motion planning: transform the problem from "robot navigating around obstacles in workspace" to "point navigating in configuration space." Each point in C-space represents a complete pose of the robot. Obstacles in workspace map to forbidden regions in C-space via Minkowski sums.

For a translating 2D robot with shape B among polygonal obstacles O_i:

C-space obstacle = O_i ⊕ (-B)
Free space = complement of all C-space obstacles
Path planning = find a connected path in free space

Sampling-Based Planners

For high-dimensional C-spaces (robot arms with 6+ DOF), exact decomposition is infeasible. Sampling-based planners trade completeness for practicality:

PRM (Probabilistic Roadmap): Sample random configurations, connect nearby collision-free pairs, search the roadmap. Good for multi-query scenarios (same environment, many start/goal pairs).
RRT (Rapidly-exploring Random Tree): Grow a tree from start toward random samples, biased toward unexplored space. Good for single-query planning. RRT* variant converges to optimal path given enough samples.

The geometry connection: Collision detection (is a configuration collision-free?) uses the spatial index + exact predicate two-phase pattern. For a 2D robot, it's point-in-polygon on the C-space obstacles. For 3D, it's GJK or SAT on convex decompositions.

---

VI. Range Searching: When k-NN Isn't Enough

Beyond Nearest Neighbor

Many spatial queries aren't "find the closest" but "find everything in this region":

Map windowing: Display all features in the visible viewport
Sensor alerting: All readings within a spatial-temporal box
Spatial analytics: Aggregate statistics over a geographic region

Range Trees + Fractional Cascading

For orthogonal (axis-aligned box) range queries on static data:

1D range tree: Balanced BST, O(log n + k) query time
2D range tree: BST on x-coordinate, each node stores BST on y-coordinate of its subtree. O(log² n + k) query, O(n log n) space.
Fractional cascading: Threads pointers between the secondary BSTs to eliminate one log factor. O(log n + k) query.

In practice, R-trees handle most range queries adequately. Range trees with fractional cascading matter when you need provably optimal query time on static, high-dimensional data.

Approximate Nearest Neighbor

For million+-point datasets where exact kNN is too slow:

Locality-Sensitive Hashing (LSH): Hash similar items to same bucket with high probability. Sublinear query time for approximate results.
Random projection trees: Project to random low-dimensional subspaces, recurse. Like k-d trees but dimension-robust.
HNSW (Hierarchical Navigable Small World): Graph-based ANN, state-of-the-art for vector similarity search (used by FAISS, Pinecone, etc.)

---

VII. Mesh Generation: Discretizing the Continuous

The Quality Imperative

A mesh is only as good as its worst triangle. Sliver triangles (near-zero minimum angle) cause:

Numerical instability in finite element methods
Visual artifacts in rendering
Poor interpolation accuracy

Ruppert's algorithm starts from a constrained Delaunay triangulation and iteratively inserts circumcenters of bad triangles. It guarantees:

Minimum angle ≥ 20.7° (adjustable with heuristics up to ~33°)
Output size O(n) for reasonable inputs

Surface Reconstruction

Reconstructing a surface from an unorganized point cloud (3D scanning output) connects to:

Ball pivoting: Roll a ball over the point cloud, triangulating contacted points. Simple, fast, but sensitive to ball radius and sampling density.
Poisson reconstruction: Solve a Poisson equation using the oriented point cloud as boundary conditions. Produces watertight surfaces. More robust but smooths sharp features.

The quality of the mesh directly determines the quality of downstream analysis (structural simulation, fluid dynamics, rendering).

---

VIII. Applied Synthesis: Design Principles for Spatial Systems

Principle 1: Layer Your Spatial Stack


Application Logic (geofencing rules, coverage requirements)
         │
    Spatial Queries (kNN, range, containment, join)
         │
    Spatial Index (R-tree, grid, k-d tree)
         │
    Geometry Engine (GEOS/JTS for predicates + booleans)
         │
    Coordinate System (WGS84 storage, projected computation)

Each layer has clear responsibilities. Don't mix coordinate transformation with query logic. Don't put business rules in the index layer.

Principle 2: Coordinates Are Not Neutral

WGS84 (lat/lon) is for storage and interchange. Distances computed on raw lat/lon are wrong (except at the equator). For distance/area calculations:

Project to UTM for local accuracy (<500km span)
Use Haversine/Vincenty for geodesic point-to-point distance
Use S2Geometry (Google's library) for global-scale spatial indexing on the sphere

Principle 3: Use Libraries for Predicates, Build for Pipelines

Write your own spatial pipeline (ingestion → indexing → query → alert). But delegate geometric predicates and polygon operations to GEOS/JTS/Shapely/S2. The library boundary should be at the predicate level, not the system level.

Principle 4: Benchmark on Your Data Distribution

Spatial index performance depends heavily on data distribution. A k-d tree that's perfect for uniformly distributed points degrades badly for clustered data. Always benchmark with realistic data, not random uniform points.

Principle 5: Start Simple, Complexify When Measured

For < 1,000 objects: brute force. No index needed.

For 1K–100K: Single R-tree or k-d tree handles everything.

For 100K–10M: Database-backed (PostGIS) or purpose-built in-memory index.

For > 10M: Distributed spatial index (GeoMesa, Apache Sedona) or tiled approach.

Don't reach for distributed spatial systems until you've proven a single-node solution can't handle the load.

---

IX. Connections to the Broader Curriculum

To Knowledge Systems

Spatial indexing is directly applicable to knowledge graph visualization (Unit 2, Unit 6). When rendering a graph with thousands of nodes, spatial queries determine which nodes are visible in the viewport and which edges to draw. Force-directed layout algorithms use spatial data structures for efficient repulsion computation (Barnes-Hut with quadtree).

To Sensor Fusion and IoT

Time-series sensor data (a prior AutoStudy topic) gains a spatial dimension when sensors are physically located. Spatial interpolation (Voronoi-based natural neighbors, kriging over Delaunay meshes) turns point measurements into continuous fields.

To Agent Systems

Autonomous agents operating in physical or virtual space need path planning (Unit 5) and collision detection (Unit 4 Minkowski sums). The configuration space formulation is a bridge between geometry and decision-making.

To Security

Geofencing (Unit 8) is a security primitive: define trusted/untrusted zones, trigger alerts on boundary crossing. The infrastructure monitoring topology benefits from spatial indexing of network assets.

---

X. Assessment

What Went Well

The progression from primitives → indexing → advanced structures → applications built comprehension naturally
The two-phase query pattern (index prune → exact refine) is now deeply internalized
Voronoi-Delaunay duality clicked as a unifying concept across mesh generation, spatial queries, and interpolation

What Could Be Deeper

3D computational geometry (convex hulls, Minkowski sums, mesh booleans) got lighter treatment — mostly 2D with 3D mentioned in passing
Robust geometric computation deserves its own unit; I treated it as a warning rather than a deep dive
Higher-dimensional geometry (curse of dimensionality, high-dim indexing) was touched only in ANN context

Operational Relevance

High. Spatial indexing and geofencing are directly applicable to infrastructure monitoring (device location tracking), knowledge graph visualization (layout and viewport queries), and any future robotics/IoT work. The patterns transfer: two-phase query, choose-your-index, defer-to-library-for-predicates.

Self-Assessment Score: 88/100

Strong treatment of core algorithms and practical application patterns. Solid synthesis across units. Dock points for: lighter 3D coverage, no hands-on PostGIS/Shapely code artifact (stayed conceptual), and the robustness topic deserved more depth. The Voronoi-Delaunay and spatial join sections are the strongest.

---

Completed: 2026-03-02 | 8 units + dissertation | Topic #32 in the AutoStudy series