AI & Data Scientist | Environmental Justice, Climate & Energy
Building intelligent machine learning systems for environmental justice, climate resilience, and sustainable energy. Specializing in geospatial analysis, satellite imagery, and data-driven solutions for a better planet.
About Me
Background
I'm an AI and Data Scientist with a focus on environmental justice, climate science, and energy systems. My work combines machine learning, geospatial analysis, and satellite imagery processing to build data-driven solutions that address sustainability challenges and promote environmental equity.
I create accessible, open-source tools that enable communities and organizations to leverage climate data and AI for informed decision-making on environmental issues.
Core Expertise
Tools: TensorFlow, PyTorch, GDAL, Rasterio, Jupyter, Vercel, GitHub, Google AI Studio
Featured Projects
Filter by topic to explore my recent work in AI, environmental justice, and geospatial analysis.
Articles & Insights
Technical case studies and analysis on AI, energy infrastructure, and data engineering.
AnchorFlow: Technical Case Study
Unified Telemetry & Intelligence Mesh for US Natural Gas Infrastructure
1. The Problem: The High Cost of Data Fragmentation
The United States natural gas system is a $100 billion industrial backbone that increasingly behaves like a real-time network—but is monitored as if it were a filing cabinet [web:84][web:86]. Market participants are forced to stitch together insights from disconnected systems: Electronic Bulletin Boards (EBBs) for pipeline operations, maritime AIS streams for LNG cargoes, and financial feeds for spot and futures prices. The result is a grid that moves at the speed of physics, managed with tools that move at the speed of paperwork.
When a compressor station in the Permian Basin trips offline, the first signal is often a terse notice buried deep in an EBB, formatted for compliance rather than comprehension. Minutes later, physical flows re-route, pressures drop, and the Waha Hub can see its basis spread blow out as local gas is stranded behind a bottleneck. Yet the causal chain—what failed, where it failed, and how that failure propagated downstream—is typically reconstructed hours after the price has already moved. The grid effectively pays a latency penalty: physical reality moves first, and human understanding arrives later.
This is the core pathology of data fragmentation. Each system—pipelines, LNG terminals, tankers, trading desks—holds a partial, time-shifted view of reality [web:87][web:91]. Analysts build ad hoc spreadsheets and custom dashboards, but these are static snapshots of a dynamic infrastructure. The structural asymmetry is simple: volatility in natural gas markets is now measured in seconds and minutes; infrastructure auditing is still measured in hours and days. The cost shows up in widened bid-ask spreads, mispriced basis risk, and delayed operational responses when it matters most.
AnchorFlow was designed to collapse this gap. It treats the entire network—pipes, tankers, hubs, and indices—as a single Infrastructure Mesh, fusing telemetry into a coherent, time-aligned picture. Instead of asking humans to reconcile siloed feeds, AnchorFlow delivers a continuously updated, machine-reasoned view of the grid that makes infrastructure risk and opportunity legible in real time.
2. The Engine: Grounded Intelligence & Gemini 3 Pro
AnchorFlow's core design principle is Grounded Intelligence: every analytical statement must trace back to an observable signal. Where conventional generative AI systems are optimized for eloquence, AnchorFlow is optimized for Industrial Fidelity. Its reasoning engine, powered by Gemini 3 Pro, operates under a Zero-Hallucination Protocol that converts a general-purpose model into a constrained analytical instrument.
The Validation Gate
Every insight passes through a three-part Validation Gate before it reaches the user:
Telemetry Check: The platform compares current pipeline throughput and pressures against historical baselines and 95th-percentile capacity estimates. If the data does not confirm a constraint, the model cannot claim one.
Contextual Grounding: The engine scans for real-time maintenance and capacity notices in EBB postings and related operational sources. If an outage or derate is not documented or strongly implied, the system must label its assessment as uncertain or refrain from causal attribution.
Market Echo: AnchorFlow cross-references physical constraints against observed price action, including regional hub prices, Henry Hub benchmarks, and basis spread dynamics. An inferred bottleneck that leaves no trace in price or dispatch patterns is treated with heightened skepticism.
Only when all three tests align—telemetry anomaly, documented context, and market echo—does the platform assert a causal narrative. Otherwise, it delivers qualified diagnostics or explicitly says, in effect, "the data does not support that conclusion yet."
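A minimal sketch of that gate logic, assuming a flattened `Signal` record and illustrative thresholds (none of these field names or cutoffs are AnchorFlow internals):

```python
from dataclasses import dataclass

@dataclass
class Signal:
    throughput: float        # current segment flow, MMcf/d
    p95_capacity: float      # inferred 95th-percentile Practical Capacity
    notice_found: bool       # matching EBB maintenance or derate posting
    basis_move_sigma: float  # basis spread move, in trailing standard deviations

def validation_gate(s: Signal) -> str:
    """Assert a causal narrative only when all three checks align."""
    telemetry_ok = s.throughput >= 0.85 * s.p95_capacity  # anomaly vs. baseline
    context_ok = s.notice_found                           # documented operational cause
    market_ok = abs(s.basis_move_sigma) >= 2.0            # visible price echo

    if telemetry_ok and context_ok and market_ok:
        return "constraint confirmed: telemetry, context, and market echo align"
    if telemetry_ok:
        return "uncertain: telemetry anomaly without documented cause or price echo"
    return "no claim: the data does not support that conclusion yet"
```

The design choice worth noting is that the refusal branch is a first-class output, not an error path.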
The "Anchor" Effect
Gemini 3 Pro's extended context window becomes a structural advantage in this architecture [web:90][web:92]. With up to a million tokens of context, AnchorFlow can ingest EBB streams from multiple interstate pipelines, feedgas telemetry from LNG terminals, AIS tracks from dozens of LNG carriers, and multi-hub price histories into a single reasoning pass. That enables global pattern recognition across the Infrastructure Mesh.
For example, a constraint emerging in Transco Zone 5 in the Northeast can be analyzed in the same context as changing feedgas flows into Sabine Pass on the Gulf Coast. Rather than treating them as separate events, AnchorFlow can evaluate whether increased Gulf export demand is tightening flows along the Transco corridor, pushing utilization to levels that translate into basis pressure hundreds of miles away. This is the Anchor effect: local anomalies interpreted through a global, grounded context.
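One way to make a million-token window useful is to time-align every feed into a single document before the reasoning pass. The sketch below is an assumption about how such packing could work: the feed names, record format, and 4-characters-per-token heuristic are all illustrative.

```python
import heapq

def pack_context(feeds: dict[str, list[tuple[float, str]]],
                 token_budget: int = 1_000_000) -> str:
    """Merge timestamp-sorted feeds into one time-aligned context string.

    feeds maps a source name (e.g. 'ebb:transco', 'ais:lng_fleet',
    'price:waha') to (unix_timestamp, record) pairs sorted by time.
    """
    merged = heapq.merge(
        *[[(ts, name, rec) for ts, rec in recs] for name, recs in feeds.items()]
    )
    lines = [f"[{ts:.0f}] {name}: {rec}" for ts, name, rec in merged]

    budget_chars = token_budget * 4  # rough heuristic: ~4 chars per token
    total = sum(len(line) + 1 for line in lines)
    while lines and total > budget_chars:
        total -= len(lines[0]) + 1
        lines.pop(0)  # evict the oldest records first
    return "\n".join(lines)
```

The single merged document is what enables cross-feed pattern recognition: the model sees the Transco notice, the Sabine Pass feedgas dip, and the Waha print as adjacent lines on one timeline rather than as three separate queries.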
Hardened Security, Industrial Posture
Because infrastructure intelligence is operationally sensitive, AnchorFlow embeds a hardened security layer:
- AES-256 encryption for all telemetry streams and query traffic
- Strict separation between system prompts, user input, and external content to mitigate prompt injection and cross-tenant data leakage
- Policy controls that prevent the model from inventing sources or fabricating operational data, favoring "no answer" over a plausible but ungrounded explanation
The result is an AI layer that behaves less like a chatbot and more like a calibrated instrument—always anchored to what the data can actually support.
3. Pillar I: Pipeline Telemetry & Capacity Inference
Traditional pipeline dashboards answer the question "What is flowing right now?" AnchorFlow is built to answer the more consequential question: "What could flow—and what cannot?"
95th-Percentile Practical Capacity
Pipeline nameplate capacity is often proprietary, outdated, or expressed in ranges that are operationally imprecise [web:89]. AnchorFlow sidesteps this by inferring Practical Capacity from observed behavior. It analyzes rolling 30-day throughput and derives the 95th-percentile flow level for each segment. That empirical benchmark becomes the capacity reference used in its risk models.
This approach adapts as the system changes. If operators debottleneck a segment by adding compression or looping pipe, observed flows will climb and the 95th-percentile threshold rises. Conversely, equipment derates or chronic maintenance will drag the Practical Capacity lower, tightening the room available before a constraint is reached. Users see not just static limits, but capacity as it lives in the real grid.
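A minimal pandas sketch of that inference, assuming a tidy table of daily flows (the column layout is hypothetical):

```python
import pandas as pd

def practical_capacity(flows: pd.DataFrame) -> pd.DataFrame:
    """Rolling 30-day, 95th-percentile Practical Capacity per segment.

    flows: daily throughput in MMcf/d, indexed by date,
    one column per pipeline segment (hypothetical schema).
    """
    return flows.rolling(window=30, min_periods=30).quantile(0.95)

def utilization(flows: pd.DataFrame) -> pd.DataFrame:
    """Current throughput as a fraction of inferred Practical Capacity."""
    return flows / practical_capacity(flows)
```

Because the benchmark is re-derived daily, there is no manual capacity table to maintain: added compression raises the threshold within a month of entering service, and chronic derates drag it down.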
Severity Audit Ranking & Constraint Nodes
AnchorFlow runs a continuous Severity Audit Ranking across the network. For each pipeline segment and node:
- It computes current utilization as a fraction of 95th-percentile Practical Capacity
- It identifies Constraint Nodes where utilization exceeds 85% of that benchmark, a level empirically associated with congestion risk and heightened basis volatility
- It ranks nodes by severity, persistence, and systemic importance (e.g., key interconnects, high-volume corridors, or nodes feeding major demand centers)
These rankings are visualized as a live constraint heatmap, making it immediately obvious where the system is approaching its stress limits.
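Continuing the same sketch, the audit itself is a filter-and-sort over that utilization table. The 85% threshold comes from the ranking rules above; the persistence window and systemic weights are illustrative assumptions.

```python
import pandas as pd

def severity_audit(util: pd.DataFrame,
                   systemic_weight: dict[str, float],
                   threshold: float = 0.85) -> pd.DataFrame:
    """Rank Constraint Nodes by severity, persistence, and importance.

    util: daily utilization (flow / Practical Capacity), one column per node.
    systemic_weight: analyst-assigned importance per node (assumed input,
    one entry per node).
    """
    latest = util.iloc[-1]
    persistence = (util.tail(14) > threshold).sum()  # days constrained, last 2 weeks
    audit = pd.DataFrame({
        "utilization": latest,
        "persistence_days": persistence,
        "systemic_weight": pd.Series(systemic_weight),
    })
    constrained = audit[audit["utilization"] > threshold].copy()
    constrained["severity"] = (
        constrained["utilization"]
        * constrained["persistence_days"].clip(lower=1)
        * constrained["systemic_weight"]
    )
    return constrained.sort_values("severity", ascending=False)
```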
Case Context: Permian Basin 2024
Consider the Permian Basin in 2024, when associated gas production climbed toward 20.7 Bcf/d and new takeaway projects like the Matterhorn Express Pipeline entered service. In a conventional workflow, analysts would discover saturation after the fact—once spread compression and negative Waha pricing began surfacing in screens and news feeds.
With AnchorFlow, the story is different:
- Within weeks of Matterhorn's September 2024 launch, the system would observe throughput converging toward its inferred Practical Capacity
- Utilization breaching the 85% threshold would flag Matterhorn as a rising Constraint Node, even before negative pricing reappeared
- Traders and risk managers monitoring the Permian corridor would receive early signals that incremental production was again colliding with finite takeaway, setting the conditions for a basis blowout
Instead of reacting to headlines, stakeholders would see the bottleneck forming in the telemetry layer and could reposition ahead of the market.
4. Pillar II: Maritime–Infrastructure Synergy (AIS)
As the United States has become a leading LNG exporter, the functional endpoint of the natural gas grid is no longer the last onshore meter. It is the tanker manifold of an LNG carrier several miles offshore. AnchorFlow's second pillar connects offshore logistics to onshore constraints through a maritime mesh.
Proximity Logic & Maritime Mesh
AnchorFlow ingests real-time AIS streams and classifies vessels of interest: LNG carriers, which typically broadcast in the AIS tanker class (ship types 80–89). The Proximity Logic engine continuously:
- Detects LNG vessels within a configurable radius (e.g., 20 km) of coastal infrastructure nodes and LNG export terminals
- Correlates these positions with terminal nominal capacity, berth availability, and historical loading patterns
- Identifies clusters of vessels that may indicate congestion, weather avoidance, or synchronized loading schedules
Each vessel becomes a moving node in the broader mesh, linked to specific terminals and, through those terminals, to upstream pipelines.
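A minimal sketch of the proximity check, pairing a haversine distance with the tanker-class filter; the AIS record schema here is an assumption.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in km."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def vessels_near_terminal(ais_records, terminal, radius_km=20.0):
    """Yield LNG-candidate vessels within radius_km of a terminal.

    ais_records: dicts with 'mmsi', 'lat', 'lon', 'ship_type' (assumed schema).
    terminal:    dict with 'name', 'lat', 'lon'.
    """
    for rec in ais_records:
        if not 80 <= rec["ship_type"] <= 89:  # AIS tanker class
            continue
        d = haversine_km(rec["lat"], rec["lon"], terminal["lat"], terminal["lon"])
        if d <= radius_km:
            yield {**rec, "terminal": terminal["name"], "distance_km": round(d, 1)}
```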
Predictive Dwell Times & Loading Latency
By correlating AIS tracks with upstream feedgas levels and pipeline load factors, AnchorFlow can infer how efficiently terminals are converting pipeline gas into LNG cargoes. It learns typical dwell time distributions for different terminals and operational states.
When conditions deviate—for instance:
- Three LNG carriers are moored at Corpus Christi
- Upstream pipeline segments feeding the terminal are operating at 98% of Practical Capacity
- Feedgas flow patterns indicate that the terminal is near its operational ceiling
AnchorFlow flags a Loading Latency Risk. It can project that cargo loading times will extend beyond normal ranges, potentially delaying departures and disrupting onward schedules. This is actionable intelligence for traders managing destination-flexible cargoes, shipowners scheduling fleet deployment, and pipeline operators anticipating backpressure effects if terminal intake falls behind planned levels.
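A sketch of how those conditions could combine into a single flag; the thresholds are illustrative, not calibrated values:

```python
def loading_latency_risk(moored_lng_vessels: int,
                         upstream_utilization: float,
                         feedgas_headroom: float) -> tuple[bool, str]:
    """Flag a Loading Latency Risk from maritime and telemetry conditions.

    moored_lng_vessels:   carriers currently within the terminal radius
    upstream_utilization: feed pipeline flow / Practical Capacity (0..1+)
    feedgas_headroom:     spare fraction of the terminal's operating ceiling
    """
    crowded = moored_lng_vessels >= 3
    saturated = upstream_utilization >= 0.95
    pinched = feedgas_headroom <= 0.05

    if crowded and saturated and pinched:
        return True, "loading times likely to extend beyond normal ranges"
    if crowded and (saturated or pinched):
        return True, "elevated dwell-time risk; monitor berth turnover"
    return False, "within normal operating envelope"
```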
In effect, the maritime layer becomes a real-time barometer of coastal throughput stress, tightly coupled to the onshore telemetry fabric.
5. Pillar III: Market Volatility & Causal Analysis
AnchorFlow treats price as a diagnostic signal, not the primary object of analysis. The central thesis: price moves are the symptom; physical constraints are the disease.
Dual-Layer Volatility: SMA-7 and SMA-20
To structure this diagnosis, the platform overlays two simple moving averages:
SMA-7 (7-day): Highlights rapid price inflections driven by transient events—short outages, weather anomalies, or brief scheduling failures.
SMA-20 (20-day): Filters noise and reveals structural shifts—persistent takeaway deficits, new capacity coming online, or demand regime changes.
When the SMA-7 crosses above or below the SMA-20, AnchorFlow marks a Volatility Event. But instead of leaving that pattern as a purely statistical artifact, the engine immediately interrogates the Infrastructure Mesh.
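The crossover detection itself is a few lines of pandas; the value comes from what the engine does next. A sketch, assuming a daily hub price series:

```python
import pandas as pd

def volatility_events(price: pd.Series) -> pd.DataFrame:
    """Mark days where the SMA-7 crosses the SMA-20 on a daily price series."""
    sma7 = price.rolling(7).mean()
    sma20 = price.rolling(20).mean()
    above = sma7 > sma20
    cross = above.ne(above.shift())  # state changed vs. prior day
    cross &= sma20.shift().notna()   # ignore the 20-day warm-up period
    return pd.DataFrame({
        "price": price[cross],
        "direction": above[cross].map({True: "bullish cross", False: "bearish cross"}),
    })
```

Each flagged row is then handed to the Causal Correlation Engine described next, rather than traded on directly.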
Causal Correlation Engine
Upon detecting a Volatility Event, the Causal Correlation Engine:
- Scans pipeline telemetry for new or intensifying Constraint Nodes
- Checks EBBs for fresh maintenance postings or force majeure notices
- Examines AIS and terminal data for feedgas anomalies or loading disruptions
- Considers regional weather signals and seasonal demand profiles when relevant
The system then assembles a causal hypothesis: for instance, "Henry Hub is spiking because import flows from a particular Appalachian interconnect are operating at 100% inferred capacity under cold-weather demand, not simply because of generic 'winter storms.'"
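Structurally, that interrogation is an evidence-gathering loop over the mesh. The sketch below shows the shape of the loop only; `event`, `mesh`, and every checker method are hypothetical stand-ins for the telemetry, EBB, AIS, and weather queries listed above.

```python
def causal_hypothesis(event, mesh):
    """Collect grounded evidence for a Volatility Event, then attribute it.

    `event` stands in for a detected SMA crossover at a hub; `mesh` stands
    in for the live Infrastructure Mesh (both hypothetical objects).
    """
    evidence = {
        "constraint_nodes": mesh.constraint_nodes(near=event.hub),
        "ebb_notices": mesh.ebb_notices(window_hours=48),
        "maritime": mesh.loading_anomalies(region=event.hub.region),
        "weather": mesh.demand_signals(region=event.hub.region),
    }
    grounded = {k: v for k, v in evidence.items() if v}
    if not grounded:
        return "no claim: volatility event has no physical correlate yet"
    return grounded  # passed onward to the Validation Gate for narration
```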
This fusion—volatility signals tied directly to physical diagnostics—allows users to distinguish between narrative-driven price action and moves grounded in hard constraints. In practice, that is the difference between chasing noise and acting on structural signals.
6. Strategic Use Cases
AnchorFlow's architecture unlocks distinct advantages for different classes of stakeholders, all anchored in the same telemetry-driven intelligence mesh.
For Energy Traders: From Reactive Hedging to Predictive Positioning
Traders traditionally discover basis risk when it shows up in P&L. With AnchorFlow, they can see the conditions for a basis blowout forming before it hits the tape.
- Constraint Nodes exceeding 85% utilization in a production basin signal tightening takeaway
- Maritime congestion at key terminals hints at temporary dampening or amplification of coastal demand
- Volatility Events accompanied by clear causal pathways support conviction in directional trades or structured hedges
Instead of merely reacting with defensive hedges, traders can build proactive positions that anticipate how the physical grid will re-price spreads over days and weeks.
For Grid Operators: Early Warning on Fuel Switching
Grid operators balancing generation portfolios need an early view into gas deliverability. AnchorFlow's load persistence metrics—how long a segment has remained in a high-utilization regime—provide a leading indicator of stress.
- In a winter surge, persistent high load factors on key corridors can precede capacity allocations and pressure drops
- When pipeline telemetry and market signals indicate sustained strain, operators can preemptively plan fuel switching—ramping coal, nuclear, hydro, or storage—before gas supply reliability degrades
The result is a smoother response curve, fewer emergency calls, and a more resilient grid under peak conditions.
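The load persistence metric referenced above can be as simple as a run-length count over the daily utilization series from Pillar I; a minimal sketch:

```python
import pandas as pd

def load_persistence(util: pd.Series, threshold: float = 0.85) -> int:
    """Consecutive most-recent days a segment has stayed above threshold."""
    high = (util > threshold).astype(int).iloc[::-1]  # 1/0 flags, newest first
    return int(high.cummin().sum())                   # length of the leading run
```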
For Infrastructure Investors: Quantifying Real-World Impact
For investors evaluating midstream projects, the key question is not just "How big is the pipe?" but "What did it actually fix?"
AnchorFlow provides a before-and-after lens:
- Prior to an expansion, it characterizes the frequency, location, and severity of Constraint Nodes in a corridor
- Post-commissioning, it tracks whether constraints have disappeared, shifted, or multiplied downstream
- It quantifies reductions in basis volatility and constraint persistence as realized value, not just as theoretical capacity
This enables investors and developers to validate that a project delivered genuine system relief rather than simply moving congestion 50 miles down the line.
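A minimal before-and-after comparison, assuming a daily basis series, a daily corridor utilization series, and a known in-service date (all hypothetical inputs):

```python
import pandas as pd

def project_impact(basis: pd.Series, util: pd.Series,
                   in_service: str, window_days: int = 90) -> dict:
    """Compare basis volatility and constrained days pre/post commissioning.

    basis: daily basis spread ($/MMBtu); util: daily corridor utilization;
    in_service: commissioning date, e.g. '2024-09-01'.
    """
    t0 = pd.Timestamp(in_service)
    pre = slice(t0 - pd.Timedelta(days=window_days), t0)
    post = slice(t0, t0 + pd.Timedelta(days=window_days))
    return {
        "basis_vol_pre": float(basis.loc[pre].std()),
        "basis_vol_post": float(basis.loc[post].std()),
        "constrained_days_pre": int((util.loc[pre] > 0.85).sum()),
        "constrained_days_post": int((util.loc[post] > 0.85).sum()),
    }
```

A project that merely relocated the bottleneck shows up as constrained days reappearing in the downstream corridor's series rather than vanishing from the system.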
7. Conclusion: The Era of the Intelligence Mesh
The next decade of energy infrastructure will not be defined solely by more steel in the ground. It will be defined by how intelligently that steel is monitored, interpreted, and orchestrated [web:88]. AnchorFlow embodies this shift—a move from disjointed monitoring tools to a unified, Grounded Intelligence layer that treats the entire US natural gas system as a living telemetry fabric.
By enforcing a Zero-Hallucination Protocol, binding every claim to verifiable signals, and continuously reconciling pipelines, tankers, and markets in a single reasoning context, AnchorFlow transforms the grid from an opaque collection of assets into a transparent, queryable Intelligence Mesh. The payoff is tangible: earlier warning of constraints, clearer attribution of volatility, more precise investment decisions, and a marked reduction in the informational asymmetry that has long defined the sector.
In an industry where seconds of latency can mean millions in mispriced risk, AnchorFlow's mission is straightforward: ensure that insight arrives as fast as the molecules move.
Technical Specifications (Conceptual Profile)
- Core Engine: Google Gemini 3 Pro-based analytical instrument
- Security: AES-256 encryption with hardened prompt and context isolation
- Data Latency Target: ~15-second refresh cycle for key telemetry streams
- Capacity Inference Model: Rolling 30-day, 95th-percentile Practical Capacity calculation across pipeline segments