NEMDataTools Documentation
An MIT-licensed Python package for accessing and preprocessing data from the Australian Energy Market Operator (AEMO) for the National Electricity Market (NEM).
Overview
NEMDataTools provides a production-ready interface for:
Complete data pipeline: Download → Extract → Process → Cache → Analyze
Multi-source support: MMSDM, pre-dispatch, and static data
Advanced processing: Time series resampling, statistical analysis
Intelligent caching: Metadata-based local caching with configurable TTL
Production features: Error handling, retry logic, comprehensive testing
This package is designed for researchers, analysts, and developers who need reliable access to AEMO data.
Installation
From PyPI (Recommended)
pip install nemdatatools
From TestPyPI (Pre-releases)
pip install --index-url https://test.pypi.org/simple/ nemdatatools
From Source (Development)
# Clone the repository
git clone https://github.com/ZhipengHe/nemdatatools.git
cd nemdatatools
# Install in development mode with all dependencies
pip install -e ".[dev,docs]"
# Or install just the core package
pip install -e .
Requirements
Python 3.10 or higher
pandas, numpy, requests, pyarrow, tqdm
Quick Start
import nemdatatools as ndt
# Download and process dispatch price data with automatic caching
data = ndt.fetch_data(
    data_type="DISPATCHPRICE",
    start_date="2023/01/01",
    end_date="2023/01/02",
    regions=["NSW1", "VIC1"],
    cache_path="./cache",  # Enable local caching
)
# Data is already processed and standardized
print(f"Downloaded {len(data)} records")
print(data.head())
# Advanced analysis with built-in functions
stats = ndt.calculate_price_statistics(data)
resampled = ndt.resample_data(data, '1H') # Resample to hourly
windows = ndt.create_time_windows(data, window_size='4H') # 4-hour windows
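For readers who want to see what hourly resampling of 5-minute data involves, the sketch below uses plain pandas on synthetic values. It does not show how `ndt.resample_data` works internally. The `RRP` series name and the interval-ending convention (each timestamp marks the end of its 5-minute interval) are assumptions about the data layout.

```python
import pandas as pd

# Twelve synthetic 5-minute prices covering one hour (values are made up).
index = pd.date_range("2023-01-01 00:05", periods=12, freq="5min")
prices = pd.Series(range(12), index=index, name="RRP", dtype="float64")

# Assuming interval-ending timestamps, close and label each hourly bin on the
# right so that 00:05 through 01:00 all fall into the hour ending at 01:00.
hourly = prices.resample("60min", closed="right", label="right").mean()
print(hourly)
```

With the default left-closed bins, the 01:00 reading would instead be counted in the following hour, which is why the bin edges are set explicitly here.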
Core Features
🚀 Complete Data Pipeline: Download → Extract → Process → Cache → Analyze in one API call
📊 Core Data Types: MMSDM dispatch data, pre-dispatch forecasts, with framework for expansion
⚡ Intelligent Caching: Metadata-based local caching with configurable TTL
🔄 Advanced Processing: Data standardization, time series resampling, statistical analysis
⏰ Time-Aware: Proper AEST timezone handling and dispatch interval management
🌏 Region Support: All NEM regions (NSW1, VIC1, QLD1, SA1, TAS1) with filtering
🛡️ Production Ready: Robust error handling, retry logic, comprehensive testing
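As an illustration of the time-handling point above, the following standard-library sketch builds the 5-minute dispatch interval timestamps for one day in AEST. NEM market time is a fixed UTC+10 offset with no daylight saving. The helper name `dispatch_interval_ends` is hypothetical and is not part of the NEMDataTools API; the interval-ending labelling is an assumption about how dispatch data is timestamped.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+10 offset: market time does not follow daylight saving.
AEST = timezone(timedelta(hours=10), name="AEST")
DISPATCH_INTERVAL = timedelta(minutes=5)


def dispatch_interval_ends(start: datetime, end: datetime) -> list[datetime]:
    """List interval-ending timestamps in (start, end], stepping by 5 minutes."""
    stamps = []
    current = start + DISPATCH_INTERVAL
    while current <= end:
        stamps.append(current)
        current += DISPATCH_INTERVAL
    return stamps


day_start = datetime(2023, 1, 1, tzinfo=AEST)
intervals = dispatch_interval_ends(day_start, day_start + timedelta(days=1))
print(len(intervals), intervals[0].isoformat(), intervals[-1].isoformat())
```

Using a fixed offset rather than a city-based zone such as `Australia/Sydney` avoids the one-hour shifts that daylight saving would otherwise introduce.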
Development Status
NEMDataTools has reached production readiness with core functionality complete and thoroughly tested.
✅ Completed Features
[x] Complete Data Pipeline
  [x] Multi-source data downloading (MMSDM, pre-dispatch, static)
  [x] ZIP file extraction and CSV processing
  [x] Intelligent caching with metadata management
  [x] End-to-end data standardization and validation
[x] Advanced Processing Capabilities
  [x] Time series resampling and statistical analysis
  [x] Price and demand calculation functions
  [x] Time window creation for analysis
  [x] AEST timezone and dispatch interval handling
[x] Production Infrastructure
  [x] Comprehensive error handling and retry logic
  [x] 79 test functions with 58% coverage
  [x] Pre-commit hooks with Black, Ruff, MyPy
  [x] GitHub Actions CI/CD pipeline
  [x] Type annotations throughout codebase
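The retry logic listed above is not described in detail here, so the following is only a generic sketch of one common approach, retrying with exponential backoff. The `with_retries` helper and its parameters are hypothetical and may differ from what the package actually does.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

A download call could be wrapped as `with_retries(lambda: requests.get(url, timeout=30))`. In real code it is usually better to catch only network-related exceptions rather than every `Exception`.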
🚧 In Progress
[ ] Data Type Expansion: Adding support for remaining MMSDM tables
[ ] Documentation: API reference and advanced usage guides
📋 Tested Data Types
| Data Type | Status | Description |
|---|---|---|
| | ✅ Fully Tested | 5-minute dispatch prices by region |
| | ✅ Fully Tested | 5-minute regional dispatch summary |
| | ✅ Fully Tested | Generator SCADA readings |
| | ✅ Fully Tested | Pre-dispatch price forecasts |
| | ✅ Tested | Direct CSV price and demand data |
| | ⚠️ Framework Ready | 5-minute pre-dispatch (implementation complete, testing pending) |
| Static Data Types | ✅ Framework Ready | Registration lists and boundaries |
Documentation Structure
Development Guides: Setup instructions and development workflow
API Reference: Coming soon - detailed function documentation
Examples: Coming soon - working code examples and tutorials
API Reference
Core Functions
import nemdatatools as ndt

# Main data fetching function
data = ndt.fetch_data(
    data_type="DISPATCHPRICE",
    start_date="2023/01/01",
    end_date="2023/01/02",
    regions=["NSW1", "VIC1"],
    cache_path="./cache",
)
# Check available data types
available_types = ndt.get_available_data_types()
# Batch operations
ndt.download_multiple_tables(
    tables=["DISPATCHPRICE", "DISPATCHREGIONSUM"],
    start_date="2023/01/01",
    end_date="2023/01/02",
)
# Advanced analysis
stats = ndt.calculate_price_statistics(data)
resampled = ndt.resample_data(data, '1H')
windows = ndt.create_time_windows(data, window_size='4H')
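The output format of `ndt.calculate_price_statistics` is not documented here. As a rough idea of the kind of per-region summary involved, the sketch below computes a few statistics with plain pandas on synthetic rows. The `REGIONID` and `RRP` column names follow AEMO's dispatch price table, but whether the processed DataFrame keeps those names is an assumption.

```python
import pandas as pd

# Synthetic rows shaped like dispatch price records (values are made up).
frame = pd.DataFrame(
    {
        "REGIONID": ["NSW1", "NSW1", "VIC1", "VIC1"],
        "RRP": [50.0, 70.0, 40.0, 100.0],
    }
)

# Mean, minimum and maximum price for each region.
summary = frame.groupby("REGIONID")["RRP"].agg(["mean", "min", "max"])
print(summary)
```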
License
NEMDataTools is released under the MIT License.