Texas Real Estate Sourcing GIS Pipeline
GOAL
Create a reproducible GIS workflow to identify, screen, and rank real estate sourcing opportunities across the Dallas–Fort Worth market using public spatial data and transparent proxy scoring logic.
RESULT
Delivered a QA-tested spatial pipeline that screened 2,401 candidate areas, ranked 25 priority sites, excluded waterbody and edge-fragment risks, and produced static maps, GeoJSON/GPKG exports, audit tables, and a public GitHub-ready workflow.
PROJECT DURATION
3-day focused portfolio sprint, covering data preparation, spatial screening, scoring, map production, QA validation, documentation, and GitHub publication.
2,401 Candidate Zones
1,000m Analyst-Defined Grid
2,161 Qualified Screening Areas
240 Disqualified Risk Areas
Top 25 Ranked Candidate Sites
Census TIGER/Line + HUD Context Data
EPSG:32138 Texas North Central CRS
This project transforms public geospatial data into a reproducible GIS screening pipeline for identifying and ranking proxy real estate candidate sites across Dallas County and the wider Dallas–Fort Worth market.
Project Summary
This project is a reproducible GIS-based real estate sourcing pipeline designed for Dallas–Fort Worth, Texas. It screens analyst-defined candidate areas, applies spatial constraints, ranks qualified locations with a frozen proxy scoring model, and produces portfolio-ready GIS outputs including static maps, GeoJSON exports, a GeoPackage, CSV audit tables, and an interactive Folium web map.
The goal was not to create an official parcel acquisition model, but to demonstrate a professional spatial analysis workflow: source preparation, validation, candidate screening, scoring, QA, reproducibility, documentation, and map-based communication.
Problem
Real estate sourcing workflows often rely on fragmented map checks, manual spreadsheet scoring, and inconsistent site comparisons. This makes it difficult to explain why one candidate location is stronger than another, especially when spatial constraints and contextual factors must be considered together.
This project addresses that problem by creating a structured GIS pipeline that can generate, screen, score, rank, and audit candidate locations in a repeatable way.
Solution
The pipeline creates analyst-defined proxy candidate areas across Dallas County and evaluates them using spatial context, constraint avoidance, submarket context, developable geometry, and policy/incentive context.
The final model uses a frozen v2 scoring framework and produces a Top 25 ranked candidate set. The workflow includes waterbody exclusion, edge-fragment QA, rank stability checks, source preflight checks, and repository QA to make the results easier to reproduce and review.
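The screening step described above can be sketched as a small rule function that records why each candidate is disqualified. This is an illustrative sketch, not the project's actual code: the field names, the `Candidate` class, and both threshold values are assumptions, since the source does not publish its cut-offs.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the source does not state its actual cut-off values.
MAX_WATER_OVERLAP = 0.10  # share of the cell area covered by water
MIN_AREA_RATIO = 0.50     # clipped cell area / full 1,000 m cell area


@dataclass
class Candidate:
    candidate_id: str
    water_overlap_ratio: float
    centroid_in_water: bool
    area_ratio: float


def screen(c: Candidate) -> list[str]:
    """Return the disqualification reasons; an empty list means qualified."""
    reasons = []
    if c.centroid_in_water or c.water_overlap_ratio > MAX_WATER_OVERLAP:
        reasons.append("waterbody")
    if c.area_ratio < MIN_AREA_RATIO:
        reasons.append("edge_fragment")
    return reasons


# Synthetic candidates, for illustration only.
candidates = [
    Candidate("C0001", 0.00, False, 1.00),
    Candidate("C0002", 0.45, True, 1.00),
    Candidate("C0003", 0.00, False, 0.20),
    Candidate("C0004", 0.30, False, 0.35),
]

# The audit table keeps every candidate together with its reasons.
audit = {c.candidate_id: screen(c) for c in candidates}
qualified = [cid for cid, reasons in audit.items() if not reasons]
```

In this sketch a candidate can carry more than one reason, so per-reason counts in an audit table need not sum to the total number of disqualified candidates.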
Technical Workflow
The workflow includes:
- Census and contextual GIS source preparation
- Dallas County and DFW study area processing
- ZCTA-based submarket proxy creation
- Candidate grid generation
- Candidate screening and disqualification rules
- Waterbody exclusion QA
- Edge-fragment QA
- Frozen v2 proxy scoring model
- Ranked candidate exports
- Platform-ready GeoJSON and GeoPackage outputs
- Interactive Folium web map generation
- Static portfolio map exports
- Repository and reproducibility QA
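As an illustration of the candidate grid generation step, the sketch below tiles a rectangular extent with 1,000 m square cells in projected metres. The function name, the ID scheme, and the extent coordinates are assumptions. The extent is a hypothetical 49 km by 49 km box chosen only because 49 × 49 = 2,401 matches the reported candidate count; the real study extent is not given in the source.

```python
CELL_SIZE_M = 1_000  # grid spacing reported in the source


def make_grid(minx, miny, maxx, maxy, cell=CELL_SIZE_M):
    """Yield (candidate_id, (x0, y0, x1, y1)) for square cells inside the extent."""
    cols = int((maxx - minx) // cell)
    rows = int((maxy - miny) // cell)
    for r in range(rows):
        for c in range(cols):
            x0 = minx + c * cell
            y0 = miny + r * cell
            yield f"R{r:03d}C{c:03d}", (x0, y0, x0 + cell, y0 + cell)


# Hypothetical extent in projected metres (arbitrary illustrative coordinates).
extent = (700_000.0, 2_100_000.0, 749_000.0, 2_149_000.0)
grid = list(make_grid(*extent))
```

In a GeoPandas workflow, each bounds tuple could be turned into a polygon with `shapely.geometry.box(x0, y0, x1, y1)` and collected into a `GeoDataFrame` with `crs="EPSG:32138"`, so that cell sizes and areas are measured in metres.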
Scoring Model
The final scoring model is versioned as:
v2_professional_proxy_screening_limited
Final scoring weights:
- Developable geometry score: 30%
- Constraint avoidance score: 25%
- Spatial context score: 25%
- Submarket context score: 15%
- Opportunity incentive score: 5%
School district context is retained as neutral metadata only and is not used as a ranking variable. Opportunity Zone context is treated only as a policy/incentive proxy with limited weight.
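The weights above can be combined into a composite score as a weighted sum. The following is a minimal sketch under stated assumptions: the dictionary keys, the function names, the 0 to 100 component scale, and the tie-breaking rule are all hypothetical, and only the weight values come from the source.

```python
# Weight values are taken from the list above; the key names are hypothetical.
WEIGHTS = {
    "developable_geometry": 0.30,
    "constraint_avoidance": 0.25,
    "spatial_context": 0.25,
    "submarket_context": 0.15,
    "opportunity_incentive": 0.05,
}

# Guard against accidental edits: a frozen model's weights should sum to 1.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9


def composite_score(components):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def rank(candidates, top_n=25):
    """Return the top_n (candidate_id, score) pairs, highest score first.

    Ties are broken by candidate ID so the ordering is deterministic.
    """
    scored = [(cid, composite_score(comp)) for cid, comp in candidates.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Synthetic component scores, for illustration only.
candidates = {
    "A": {"developable_geometry": 80, "constraint_avoidance": 90,
          "spatial_context": 70, "submarket_context": 60,
          "opportunity_incentive": 0},
    "B": {"developable_geometry": 60, "constraint_avoidance": 100,
          "spatial_context": 80, "submarket_context": 70,
          "opportunity_incentive": 100},
}
ranking = rank(candidates)
```

Neutral metadata such as school district context would simply be left out of `WEIGHTS`, so it can travel with each record without influencing the rank.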
QA & Reproducibility
The final release includes a dedicated analytical QA evidence package.
Key QA results:
- Total candidates: 2,401
- Qualified candidates: 2,161
- Disqualified candidates: 240
- Top 25 ranked candidates: 25
- Waterbody-disqualified candidates: 200
- Top 25 max water overlap ratio: 0.0
- Top 25 centroid-inside-water count: 0
- Edge-fragment candidates: 50
- Top 25 edge-fragment failures: 0
- Repository QA: 54/54 checks passed
The project also includes a one-command full pipeline runner and source preflight checks for release-grade reproducibility.
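Several of the QA results above can be expressed as automated pass/fail checks on the ranked table. The sketch below is one possible shape for such a gate, not the project's actual QA code: the function name, the row fields, and the check names are assumptions, and the rows are synthetic. Only the summary counts come from the source.

```python
# Summary counts as reported in the QA results above.
REPORTED = {"total": 2401, "qualified": 2161, "disqualified": 240, "top_n": 25}


def run_qa(top_rows, summary=REPORTED):
    """Return a dict mapping each check name to True (pass) or False (fail)."""
    ranks = sorted(r["rank"] for r in top_rows)
    return {
        "counts_reconcile":
            summary["qualified"] + summary["disqualified"] == summary["total"],
        "top_n_size": len(top_rows) == summary["top_n"],
        "ranks_contiguous": ranks == list(range(1, len(top_rows) + 1)),
        "max_water_overlap_zero":
            max((r["water_overlap_ratio"] for r in top_rows), default=0.0) == 0.0,
        "no_centroid_in_water":
            sum(r["centroid_in_water"] for r in top_rows) == 0,
        "no_edge_fragments":
            sum(r["is_edge_fragment"] for r in top_rows) == 0,
    }


# Synthetic Top 25 rows, for illustration only.
top_rows = [
    {"rank": i, "water_overlap_ratio": 0.0,
     "centroid_in_water": False, "is_edge_fragment": False}
    for i in range(1, 26)
]

checks = run_qa(top_rows)
failed = [name for name, ok in checks.items() if not ok]
```

A pipeline runner could exit with a non-zero status whenever `failed` is non-empty, so a release is blocked if any ranked candidate touches water or is an edge fragment.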
Key Outputs
The project produces:
- Ranked candidate CSV
- Top 25 candidate GeoJSON
- Platform-ready GeoJSON exports
- GeoPackage exports
- Static PNG/PDF portfolio maps
- Interactive Folium web map
- QA audit tables
- Analytical QA summary
- Reproducibility and repository QA reports
Tools Used
- Python
- GeoPandas
- Pandas
- Shapely
- Folium
- Matplotlib
- Contextily
- QGIS-compatible GeoJSON/GPKG outputs
- Git / GitHub
Important Limitations
This is a portfolio-grade GIS screening pipeline, not an official parcel acquisition recommendation.
Candidate geometries are analyst-defined grid proxies, not official parcels. The model does not include ownership, parcel valuation, zoning, utilities, legal access, engineering feasibility, FEMA flood analysis, road accessibility, or full land-use suitability analysis.
Basemap layers are used for visualization only. The scoring model should be interpreted as a proxy screening framework, not investment advice.
Result
The final project demonstrates a complete GIS analysis workflow: from source planning and spatial validation to candidate screening, ranked outputs, QA evidence, reproducibility checks, and portfolio-ready map communication.

