ecpet.ecengine module
EC-PeT Processing Engine
Main orchestration module for complete eddy-covariance data processing workflows. Coordinates the entire processing pipeline, from raw data ingestion through final flux calculations to quality-controlled output generation. Provides a command-line interface and stage-based processing that can resume from a saved checkpoint.
- Key Features:
  - Stage-based processing with checkpoint/resume functionality
  - Parallel processing support for large datasets
  - Automatic time range detection from raw data files
  - Column mapping for TOA5 datalogger formats
  - SQLite database for intermediate data management
  - Comprehensive progress reporting and logging
  - Command-line interface with configurable verbosity
- Command Line Interface:
Supports processing control, parallel execution settings, logging verbosity adjustment, and stage-specific restart capabilities for efficient workflow management.
- ecpet.ecengine.check_raw_columns(conf)
Resolve TOA5 column specifications from numbers to database names.
- Parameters:
conf (object) – Configuration object with column specifications
- Returns:
Updated configuration with resolved column names
- Return type:
Converts column numbers to standardized database column names by scanning TOA5 file headers. Handles both slow and fast data channels, validates consistency across multiple files, and updates configuration accordingly.
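The header scan described above can be illustrated with a minimal sketch. It relies on the TOA5 convention that the second header line carries the field names. The helper names, the dict-based column specification, and the 1-based column numbering are assumptions made for this example and are not taken from the EC-PeT source.

```python
import csv
import io


def read_toa5_field_names(text):
    """Return the field names from the second header line of a TOA5 file."""
    reader = csv.reader(io.StringIO(text))
    next(reader)         # line 1: environment info ("TOA5", station, logger, ...)
    return next(reader)  # line 2: field names


def resolve_columns(spec, headers):
    """Map column numbers to field names; names pass through unchanged.

    `spec` maps a variable key to a column number (assumed 1-based) or a name.
    `headers` holds one field-name list per file; all files must agree.
    """
    first = headers[0]
    for other in headers[1:]:
        if other != first:
            raise ValueError("TOA5 files disagree on column layout")
    resolved = {}
    for key, col in spec.items():
        if isinstance(col, int):
            if not 1 <= col <= len(first):
                raise ValueError(f"column {col} out of range for {key!r}")
            resolved[key] = first[col - 1]
        else:
            resolved[key] = col
    return resolved


# Hypothetical TOA5 content: four header lines followed by one data record.
sample = (
    '"TOA5","station","CR3000","1234","os","prog","sig","fast"\n'
    '"TIMESTAMP","RECORD","Ux","Uy","Uz","Ts"\n'
    '"TS","RN","m/s","m/s","m/s","C"\n'
    '"","","Smp","Smp","Smp","Smp"\n'
    '"2020-01-01 00:00:00.05",0,1.2,0.3,-0.1,12.5\n'
)
names = read_toa5_field_names(sample)
resolved = resolve_columns({"u": 3, "v": 4, "w": "Uz"}, [names, names])
print(resolved)
```

Passing one header list per file lets the consistency check fail early, before any data is ingested.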
- ecpet.ecengine.read_raw_data(conf)
Ingest raw data files into SQLite database for processing.
- Parameters:
conf (object) – Configuration object with file paths and format settings
Reads the TOA5 files specified in the configuration into database tables, with parallel processing support. Fast (high-frequency) and slow (reference) measurements are stored in separate tables.
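As a rough illustration of this ingestion step, the following sketch loads TOA5 text into an in-memory SQLite database using only the standard library. The function name `ingest_toa5`, the table names `fast` and `slow`, and the sample records are hypothetical. The sketch runs sequentially and stores values as text, so it does not reproduce the parallel processing mentioned above.

```python
import csv
import io
import sqlite3


def ingest_toa5(conn, table, text):
    """Insert the data records of one TOA5 file into `table`; return the row count."""
    reader = csv.reader(io.StringIO(text))
    next(reader)            # line 1: environment info
    fields = next(reader)   # line 2: field names become column names
    next(reader)            # line 3: units
    next(reader)            # line 4: processing types
    cols = ", ".join(f'"{name}"' for name in fields)
    marks = ", ".join("?" for _ in fields)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    conn.commit()
    return len(rows)


# Hypothetical fast (high-frequency) and slow (reference) files.
fast_text = (
    '"TOA5","station","CR3000","1234","os","prog","sig","fast"\n'
    '"TIMESTAMP","RECORD","Ux","Ts"\n'
    '"TS","RN","m/s","C"\n'
    '"","","Smp","Smp"\n'
    '"2020-01-01 00:00:00.05",0,1.2,12.5\n'
    '"2020-01-01 00:00:00.10",1,1.3,12.6\n'
)
slow_text = (
    '"TOA5","station","CR3000","1234","os","prog","sig","slow"\n'
    '"TIMESTAMP","RECORD","Pressure"\n'
    '"TS","RN","hPa"\n'
    '"","","Avg"\n'
    '"2020-01-01 00:30:00",0,1013.2\n'
)

conn = sqlite3.connect(":memory:")
n_fast = ingest_toa5(conn, "fast", fast_text)
n_slow = ingest_toa5(conn, "slow", slow_text)
print(n_fast, n_slow)
```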
- ecpet.ecengine.get_start_end(conf)
Determine processing time range from configuration or auto-detect from data.
- Parameters:
conf (object) – Configuration object with date settings
- Returns:
Updated configuration with resolved time range
- Return type:
Auto-detects time range from database if not specified in configuration. Updates DateBegin and DateEnd parameters for subsequent processing stages.
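One way to picture the auto-detection is a `MIN`/`MAX` query against the ingested data. The sketch below assumes a dict-like configuration, a table named `fast`, and ISO-formatted text in a `TIMESTAMP` column (which sorts chronologically as text). Only the `DateBegin` and `DateEnd` keys come from the description above; everything else is hypothetical.

```python
import sqlite3


def resolve_time_range(conn, conf, table="fast"):
    """Return a copy of `conf` with missing DateBegin/DateEnd filled from the data."""
    resolved = dict(conf)
    if resolved.get("DateBegin") and resolved.get("DateEnd"):
        return resolved  # both bounds configured: nothing to detect
    first, last = conn.execute(
        f'SELECT MIN("TIMESTAMP"), MAX("TIMESTAMP") FROM "{table}"'
    ).fetchone()
    resolved["DateBegin"] = resolved.get("DateBegin") or first
    resolved["DateEnd"] = resolved.get("DateEnd") or last
    return resolved


# Hypothetical database content standing in for previously ingested records.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE fast ("TIMESTAMP" TEXT)')
conn.executemany(
    "INSERT INTO fast VALUES (?)",
    [
        ("2020-01-01 00:00:00",),
        ("2020-01-01 12:00:00",),
        ("2020-01-02 06:30:00",),
    ],
)

auto = resolve_time_range(conn, {"DateBegin": None, "DateEnd": None})
partial = resolve_time_range(
    conn, {"DateBegin": "2020-01-01 06:00:00", "DateEnd": None}
)
print(auto)
print(partial)
```

In this sketch, values given explicitly in the configuration take precedence, and only the missing bounds are read from the database.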
- ecpet.ecengine.collectdata(conf)
Stage 0: Collect and ingest raw data files into processing database.
- Parameters:
conf (object) – Configuration object with file specifications
- Returns:
Updated configuration with resolved parameters
- Return type:
Expands file patterns, auto-detects time ranges, resolves column mappings, and ingests all specified data files into SQLite database.
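The pattern expansion at the start of this stage might look roughly like the sketch below, which uses the standard `glob` module. The function name `expand_patterns`, the file names, and the choice to raise an error on an unmatched pattern are assumptions for illustration; how EC-PeT actually treats unmatched patterns is not stated here.

```python
import glob
import os
import tempfile


def expand_patterns(patterns, base_dir):
    """Expand shell-style patterns relative to `base_dir` into a duplicate-free list."""
    found = []
    for pattern in patterns:
        matches = sorted(glob.glob(os.path.join(base_dir, pattern)))
        if not matches:
            raise FileNotFoundError(f"no files match {pattern!r}")
        for path in matches:
            if path not in found:  # a file matched by two patterns is listed once
                found.append(path)
    return found


# Hypothetical raw-data directory with two fast files and one slow file.
with tempfile.TemporaryDirectory() as raw_dir:
    for name in ("fast_001.dat", "fast_002.dat", "slow_001.dat"):
        with open(os.path.join(raw_dir, name), "w") as handle:
            handle.write("")
    expanded = expand_patterns(["fast_*.dat", "*_001.dat"], raw_dir)
    file_names = [os.path.basename(path) for path in expanded]

print(file_names)
```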
- ecpet.ecengine.write_output(conf, intervals)
Stage 5: Generate final output files from processed intervals.
- Parameters:
conf (object) – Configuration object with output settings
intervals (pandas.DataFrame) – DataFrame with processed interval results
Creates standardized output files including quality flags, flux data, and comprehensive results in multiple formats.
- ecpet.ecengine.process(conf, startat)
Main processing pipeline coordinator with stage-based execution.
- Parameters:
Orchestrates complete processing workflow with checkpoint capabilities. Each stage saves results to database, enabling restart at any point without reprocessing earlier stages.
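The checkpoint idea can be sketched as a loop that skips every stage numbered below the restart point and relies on results saved earlier. In this sketch a plain dict stands in for the database, and the stage names and functions are hypothetical; only the `startat` parameter name is taken from the signature above.

```python
def run_pipeline(stages, store, startat=0):
    """Run stages numbered `startat` and above; return the names of the stages run.

    Each stage reads earlier results from `store` and saves its own result there,
    so a later call can resume without repeating completed stages.
    """
    executed = []
    for number, (name, func) in enumerate(stages):
        if number < startat:
            continue  # result from an earlier run is expected in `store`
        store[name] = func(store)
        executed.append(name)
    return executed


# Hypothetical three-stage workflow; the real engine's stages differ.
stages = [
    ("collect", lambda s: [1, 2, 3]),
    ("scale", lambda s: [x * 2 for x in s["collect"]]),
    ("total", lambda s: sum(s["scale"])),
]

store = {}
first_run = run_pipeline(stages, store)           # runs all three stages
resumed = run_pipeline(stages, store, startat=2)  # reruns only the last stage
print(first_run, resumed, store["total"])
```

Persisting each stage result is what makes the restart cheap: the resumed run needs only the stored output of the stage before `startat`.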
- ecpet.ecengine.cli()
Command-line interface for EC-PeT processing engine.
Provides argument parsing for processing control, parallel execution settings, verbosity levels, and stage-specific restart capabilities. Initializes logging, configuration, and database before starting the main processing pipeline.
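For orientation, an argument parser covering these four concerns could be structured as in the sketch below. The program name and every option name here (`--verbose`, `--quiet`, `--start-at`, `--workers`) are illustrative placeholders and are not EC-PeT's actual flags; consult the tool's own help output for the real interface.

```python
import argparse
import logging


def build_parser():
    """Build a parser with placeholder options (not the real EC-PeT flags)."""
    parser = argparse.ArgumentParser(prog="example-engine")
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="increase logging verbosity (repeatable)")
    parser.add_argument("-q", "--quiet", action="count", default=0,
                        help="decrease logging verbosity (repeatable)")
    parser.add_argument("--start-at", type=int, default=0,
                        help="stage number at which to (re)start processing")
    parser.add_argument("--workers", type=int, default=1,
                        help="number of parallel worker processes")
    return parser


def log_level(verbose, quiet):
    """Translate repeat counts into a logging level, clamped to DEBUG..CRITICAL."""
    level = logging.WARNING - 10 * verbose + 10 * quiet
    return max(logging.DEBUG, min(logging.CRITICAL, level))


args = build_parser().parse_args(["-vv", "--start-at", "3", "--workers", "4"])
level = log_level(args.verbose, args.quiet)
logging.basicConfig(level=level)
print(args.start_at, args.workers, logging.getLevelName(level))
```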