VÖRNTEC

Real-Time SCADA Anomaly Detection for Critical Infrastructure

Version 2.0 | Production Validated with Real Wind Farm Data

Executive Summary

VÖRNTEC delivers a production-validated anomaly detection system for SCADA environments, specifically engineered for oil and gas operations. Building on proven technology originally developed for wind turbine monitoring, the system has been successfully validated using authentic operational data from Wind Farm B (January-September 2023).

The system employs a cascading machine learning architecture combining unsupervised detection (Isolation Forest) with supervised annotation models, achieving 100% event recall and 77-100% event precision across three walk-forward validation folds. With a mean time to detect (MTTD) of 0 minutes (detection within the first 10-minute data window) and an operational load under 1 event per week, the system is optimized for production deployment in critical infrastructure environments.

Key Achievement

VÖRNTEC has moved from laboratory demonstration to a production-grade system validated on real-world data. The system was tested with an event-based evaluation methodology across three temporal folds, showing consistent performance and an operational load suitable for deployment in high-consequence industrial environments.

Validated Performance Metrics

Results from walk-forward backtesting across three validation folds using real Wind Farm B operational data (January-September 2023; each fold covers 6 months: 3 months of training followed by 3 months of evaluation).

100% Event Recall

Percentage of real operational events successfully detected by the system

77-100% Event Precision

Percentage of detected events that correspond to real operational events (few false positives)

0 min Detection Time (MTTD)

Detection within the first 10-minute data window after event onset, reported as 0 minutes at the data's 10-minute resolution

<1/week Operational Load

Low false alarm rate maintains operator focus and prevents alert fatigue

Walk-Forward Validation Results

Fold    Evaluation Period      Event Recall  Event Precision  MTTD       Events Detected  Performance
Fold 1  April - July 2023      100%          100%             0 minutes  4 of 4           Excellent
Fold 2  May - August 2023      100%          100%             0 minutes  5 of 5           Excellent
Fold 3  June - September 2023  100%          77%              0 minutes  6 of 6           Good

Note: Event-based evaluation methodology employed, aligning with operational reality where partial detection of multi-hour events still provides actionable intelligence to operators.

System Architecture

Design Philosophy: Cascading Detection with Clear Role Separation

The system implements a cascading architecture (Option B) where Isolation Forest serves as the sole detection mechanism, while supervised models provide informative metadata without affecting detection decisions. This separation ensures predictable operational behavior and eliminates potential conflicts between detection stages.

1. SCADA Data Ingestion

Real-time sensor data acquisition from industrial control systems

Multi-dimensional time series: pressure, temperature, vibration, flow rates

2. Baseline Statistical Filter

Fast statistical screening for computational efficiency

Z-score analysis, temporal gradients, standard deviation thresholds

Filters approximately 8-10% of normal operational time
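The screening step above can be illustrated with a minimal sketch for a single sensor channel. The function name `baseline_screen` and both thresholds (`z_limit`, `grad_limit`) are assumptions chosen for illustration; the source does not state the values used in production.

```python
import numpy as np

def baseline_screen(values, z_limit=3.0, grad_limit=3.0):
    """Return a boolean mask of samples that warrant further analysis.

    z_limit and grad_limit are illustrative assumptions, not documented values.
    """
    x = np.asarray(values, dtype=float)
    mean, std = x.mean(), x.std()
    if std == 0:
        # A perfectly flat signal has nothing to flag
        return np.zeros(x.shape, dtype=bool)
    z = np.abs(x - mean) / std                    # distance from the mean in std units
    grad = np.abs(np.diff(x, prepend=x[0])) / std  # step-to-step change in std units
    return (z > z_limit) | (grad > grad_limit)

# Hypothetical temperature channel with one abrupt spike at index 50
readings = [50.0] * 50 + [80.0] + [50.0] * 49
mask = baseline_screen(readings)
```

Because the gradient term reacts to abrupt changes, the sample immediately after a spike is also flagged when the signal drops back to its normal level.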

3. Isolation Forest Detection

Primary anomaly detection using unsupervised learning

Trained exclusively on normal operational data to model baseline behavior

Sole authority for event detection and escalation decisions
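As a rough illustration of this step, the sketch below fits scikit-learn's `IsolationForest` on synthetic data standing in for normal operation and then classifies individual 10-minute samples. The three sensor columns, the synthetic distribution, and the hyperparameters are assumptions; the source does not document the production configuration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic stand-in for normal operation. Columns are hypothetical sensors:
# temperature, pressure, vibration.
normal = rng.normal(loc=[60.0, 5.0, 0.2], scale=[2.0, 0.3, 0.02], size=(2000, 3))

# Hyperparameters are illustrative assumptions
detector = IsolationForest(n_estimators=200, contamination="auto", random_state=0)
detector.fit(normal)  # trained only on normal data, as described above

def is_anomalous(sample):
    """True when the detector labels a single multi-sensor sample as an outlier."""
    row = np.asarray(sample, dtype=float).reshape(1, -1)
    return detector.predict(row)[0] == -1  # scikit-learn returns -1 for outliers
```

In this sketch the detector's output is the only signal that opens or escalates an event, matching the role separation described for the cascading design.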

4. Operational Event Management

State management for temporal event tracking

OPEN/CLOSED state transitions, automatic closure after 120-minute quiet period
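One way to picture the state handling is a small tracker that opens an event on the first anomalous sample and closes it after 120 minutes without further anomalies. The class and field names below are hypothetical; only the OPEN/CLOSED states and the 120-minute quiet period come from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

QUIET_PERIOD = timedelta(minutes=120)  # quiet period stated in the text

@dataclass
class Event:
    start: datetime
    last_anomaly: datetime
    state: str = "OPEN"

@dataclass
class EventManager:
    events: List[Event] = field(default_factory=list)
    current: Optional[Event] = None

    def update(self, timestamp: datetime, anomalous: bool) -> Optional[Event]:
        # Close the open event once the quiet period has elapsed
        if self.current and timestamp - self.current.last_anomaly >= QUIET_PERIOD:
            self.current.state = "CLOSED"
            self.current = None
        if anomalous:
            if self.current is None:
                # First anomalous sample opens a new event
                self.current = Event(start=timestamp, last_anomaly=timestamp)
                self.events.append(self.current)
            else:
                # Further anomalies extend the open event
                self.current.last_anomaly = timestamp
        return self.current
```

Feeding the tracker one sample per 10-minute window groups consecutive anomalous windows into a single multi-hour event instead of producing one alert per window.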

5. Supervised Model Annotation

Metadata enrichment using Decision Tree and Logistic Regression

Calculates confidence scores, severity levels (low/medium/high/critical), event labels

Does not affect detection or escalation decisions
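The annotation stage might look like the following sketch, in which a logistic regression supplies a confidence score and a decision tree assigns one of the four severity levels. The event features, the synthetic labels, and the split of responsibilities between the two models are assumptions made for illustration; the source names the models but not how they are trained or combined.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

LEVELS = ["low", "medium", "high", "critical"]

rng = np.random.default_rng(1)
# Hypothetical per-event features: [peak anomaly score, duration in hours]
X = rng.uniform([0.0, 0.0], [1.0, 48.0], size=(400, 2))
# Synthetic labels, for illustration only
genuine = (X[:, 0] > 0.5).astype(int)
severity_idx = np.digitize(X[:, 1], [6, 12, 24])  # 0..3, indexes into LEVELS

confidence_model = LogisticRegression(max_iter=1000).fit(X, genuine)
severity_model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, severity_idx)

def annotate(features):
    """Return metadata for an already-detected event; never changes detection."""
    f = np.asarray(features, dtype=float).reshape(1, -1)
    return {
        "confidence": float(confidence_model.predict_proba(f)[0, 1]),
        "severity": LEVELS[int(severity_model.predict(f)[0])],
    }
```

The function only returns metadata for an event that has already been detected, so a low confidence score cannot suppress an alert.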

6. Operator Escalation

Guaranteed escalation for all Isolation Forest-confirmed events

Priority assignment based on supervised severity metadata

100% escalation rate for confirmed events ensures no missed alerts
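The escalation rule can be sketched as a function that forwards every confirmed event and uses severity only to order the queue. The priority mapping and the dictionary-based event format are hypothetical.

```python
# Hypothetical mapping from severity metadata to queue priority (1 = most urgent)
PRIORITY = {"critical": 1, "high": 2, "medium": 3, "low": 4}

def escalate(confirmed_events):
    """Escalate every confirmed event, ordered by severity-derived priority.

    Events without severity metadata still escalate, at the lowest priority.
    """
    queue = [
        {**event, "priority": PRIORITY.get(event.get("severity"), 4)}
        for event in confirmed_events
    ]
    return sorted(queue, key=lambda e: e["priority"])

events = [
    {"id": "a", "severity": "low"},
    {"id": "b", "severity": "critical"},
    {"id": "c"},  # annotation missing
]
queue = escalate(events)
```

No event is dropped here, so the output always has the same length as the input, which mirrors the 100% escalation rate for confirmed events.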

System Architecture Diagram

System architecture showing the cascading detection pipeline

Technical Specifications

Detection Method

Isolation Forest (unsupervised anomaly detection trained on normal operational data)

Annotation Models

Decision Tree and Logistic Regression for confidence scoring and severity classification

Evaluation Methodology

Event-based metrics with walk-forward backtesting (3-month training, 3-month evaluation)

Temporal Resolution

10-minute data windows with sub-hour detection capability

Event Duration Handling

Multi-hour event tracking with automatic state management (OPEN/CLOSED)

Operational Load

Less than 1 event per week to maintain operator efficiency and prevent alert fatigue

Core Capabilities

Real-Time Processing

Streaming data analysis with immediate detection capability (0 minute MTTD)

Event-Based Evaluation

Operationally correct assessment methodology aligned with industrial monitoring practices

Production-Grade Reliability

Validated on real operational data with consistent performance across temporal folds

Scalable Architecture

Designed for multi-sensor environments with hundreds of data points per facility

Validation Methodology

Walk-Forward Backtesting

The system underwent rigorous validation using walk-forward backtesting, a methodology that simulates real-world deployment by training on historical data and evaluating on subsequent time periods. This approach prevents data leakage and provides realistic performance estimates.

Validation Period

Continuous operational data from January - September 2023, with each fold covering a 6-month window

Fold Configuration

3 temporal folds with 3-month training windows and 3-month evaluation periods

Data Source

Wind Farm B real production data (SCADA multi-sensor time series)

Events Evaluated

15 real operational anomalies across all validation folds
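The fold layout described above can be reproduced with a short sketch that advances a 3-month training window and a 3-month evaluation window one month at a time. The one-month step and the half-open date boundaries are assumptions inferred from the evaluation periods in the results table; the function names are hypothetical.

```python
from datetime import date

def add_months(d, months):
    """Shift a first-of-month date by a number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

def walk_forward_folds(start, n_folds=3, train_months=3, eval_months=3, step_months=1):
    """Build (train, eval) windows as half-open [start, end) date ranges."""
    folds = []
    for i in range(n_folds):
        train_start = add_months(start, i * step_months)
        train_end = add_months(train_start, train_months)
        eval_end = add_months(train_end, eval_months)
        folds.append({"train": (train_start, train_end), "eval": (train_end, eval_end)})
    return folds

folds = walk_forward_folds(date(2023, 1, 1))
for n, fold in enumerate(folds, start=1):
    print(f"Fold {n}: train {fold['train'][0]} to {fold['train'][1]}, "
          f"eval {fold['eval'][0]} to {fold['eval'][1]}")
```

Each model is fitted only on data that precedes its evaluation window, which is what prevents leakage from the future into training.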

Event-Based Evaluation Rationale

Traditional interval-by-interval evaluation penalizes systems for detecting only portions of multi-hour events, despite such detections being operationally valuable. Event-based methodology credits detection if any part of the event is identified, aligning metrics with operational reality.
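A minimal sketch of event-based scoring follows, assuming that true events and detections are both represented as (start, end) intervals. The exact formulas are assumptions consistent with the description above: a true event counts as detected if any detection overlaps it, a detection counts as valid if it overlaps any true event, and MTTD averages the delay from event onset to the first overlapping detection.

```python
from datetime import datetime, timedelta

def overlaps(a, b):
    """True when two (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]

def event_metrics(true_events, detections):
    """Return (event recall, event precision, mean time to detect)."""
    hits = [t for t in true_events if any(overlaps(t, d) for d in detections)]
    valid = [d for d in detections if any(overlaps(d, t) for t in true_events)]
    delays = []
    for t in hits:
        first = min(d[0] for d in detections if overlaps(t, d))
        # A detection that starts before onset counts as zero delay
        delays.append(max(first - t[0], timedelta(0)))
    recall = len(hits) / len(true_events) if true_events else 0.0
    precision = len(valid) / len(detections) if detections else 0.0
    mttd = sum(delays, timedelta(0)) / len(delays) if delays else None
    return recall, precision, mttd

# Hypothetical example: two real events, three detections (one false alarm)
t0 = datetime(2023, 6, 1)
day, hour = timedelta(days=1), timedelta(hours=1)
true_events = [(t0, t0 + 6 * hour), (t0 + 2 * day, t0 + 2 * day + 3 * hour)]
detections = [
    (t0 + hour, t0 + 2 * hour),            # partial coverage of the first event
    (t0 + 2 * day, t0 + 2 * day + hour),   # catches the second event at onset
    (t0 + 5 * day, t0 + 5 * day + hour),   # no matching real event
]
recall, precision, mttd = event_metrics(true_events, detections)
```

In this example the first event is covered for only one of its six hours, yet it still counts as detected. An interval-by-interval score would instead penalize the five uncovered hours.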

Validation Performance Across Three Folds

Walk-forward validation results demonstrating consistent high recall (100%) and precision (77-100%) across temporal folds

Real Production Data Characteristics

Wind Farm B Dataset

Temporal Coverage

Continuous operation, January - September 2023 (each validation fold draws a 6-month window from this period)

Data Points

Multi-dimensional SCADA sensor measurements at 10-minute intervals

Sensor Types

Temperature, pressure, vibration, power output, environmental conditions

Event Complexity

Real operational anomalies with variable duration (hours to days)

Data Quality

Production-grade with inherent noise, missing values, and operational variability

Ground Truth

Annotated operational events from facility records

Operational Challenges Addressed

The validation data encompasses real-world complexities including sensor noise, missing values, operational variability, and anomalies lasting from hours to days.

Technology Stack

Python 3.x
scikit-learn
Isolation Forest
Decision Trees
Logistic Regression
Pandas
NumPy
Statistical Analysis
Time Series Processing
Event-Based Metrics
Walk-Forward Validation
Real-Time Monitoring

Target Applications

Primary Target: Oil & Gas Operations

The system is specifically engineered for deployment in oil and gas production facilities where catastrophic failure prevention is critical. The architecture directly addresses the operational requirements of continuous monitoring in high-consequence environments.

Gas Leak Detection

Real-time monitoring of pressure, flow, and chemical sensors to identify potential leak events before catastrophic failure

Production Anomalies

Detection of operational deviations in extraction, processing, and distribution systems

Equipment Degradation

Early warning system for mechanical and process equipment showing abnormal behavior patterns

Safety System Monitoring

Continuous validation of safety-critical systems and instrumentation

Proven Foundation: Wind Energy

The core technology has been validated in wind turbine monitoring (Wind Farm A and B), demonstrating adaptability across different industrial SCADA applications. This proven track record provides confidence in the system's generalization capability to oil and gas environments.

Why Wind Farm Validation Matters for Oil & Gas

Wind turbine and oil & gas SCADA systems share fundamental operational characteristics that make this validation directly relevant:

Common SCADA Properties

Multi-sensor continuous monitoring (pressure, temperature, vibration, flow), gradual-onset events spanning hours, and zero tolerance for missed critical detections

Domain-Agnostic Architecture

The ML system learns operational patterns from normal data without requiring labeled failures, making it transferable across industrial domains

Production Complexity Validation

Real-world sensor noise, data quality issues, and temporal dependencies successfully handled in operational environment

Deployment Strategy

Oil & gas deployment requires sensor mapping and parameter tuning, not architectural redesign - a standard cross-domain approach in industrial monitoring