Testing Strategy ================= BigPlanet uses a comprehensive two-tier testing approach that balances speed, reliability, and API compatibility verification. Overview -------- The BigPlanet test suite consists of: - **133 unit tests** using synthetic VPLanet data (~1 second total runtime) - **12 integration tests** using real VPLanet simulations (several minutes runtime) - **74% code coverage** and growing This hybrid approach provides: 1. **Fast feedback** during development (unit tests complete in <1 second) 2. **Platform independence** (unit tests work without VPLanet binary installed) 3. **API compatibility verification** (integration tests detect VPLanet format changes) 4. **Comprehensive coverage** (both synthetic edge cases and real-world scenarios) Unit Tests: Synthetic Data --------------------------- Location ~~~~~~~~ All unit tests are in ``tests/unit/``: - ``test_read.py`` - Configuration file parsing (22 tests) - ``test_process.py`` - Log file and output file processing (31 tests) - ``test_extract.py`` - Data extraction and statistics (42 tests) - ``test_filter.py`` - Key categorization (25 tests) - ``test_bigplanet.py`` - CLI argument parsing (15 tests) Why Synthetic Data? ~~~~~~~~~~~~~~~~~~~ Unit tests use hand-crafted VPLanet output files created in test fixtures. This approach provides: **Speed** - Unit test suite completes in <1 second - No VPLanet binary execution required - Ideal for rapid development iteration **Platform Independence** - Tests pass without VPLanet installation - Works on fresh developer machines - Compatible with minimal CI environments - No dependency on VPLanet version or build configuration **Test Isolation** - Tests verify BigPlanet code only, not VPLanet correctness - Clear failure attribution (BigPlanet bug vs VPLanet bug) - Deterministic results across all platforms **Edge Case Coverage** - Easy to create malformed files for error handling tests - Can test extreme values without long simulations - Complete control over input structure Example: Synthetic Log File ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Unit tests use fixtures like this:: @pytest.fixture def minimal_vplanet_log(tempdir): """Hand-crafted VPLanet log file for testing.""" log_content = \"\"\" ---- FINAL SYSTEM PROPERTIES ---- System Age: 4.500000e+09 years ----- BODY: earth ------ Mass (Mearth): 1.000000e+00 Radius (Rearth): 1.000000e+00 \"\"\" log_file = tempdir / "earth.log" log_file.write_text(log_content) return log_file This synthetic approach means: - **No VPLanet execution** - file created instantly - **Controlled values** - test exactly what you need - **Portable** - works on any system Fixtures and Test Data ~~~~~~~~~~~~~~~~~~~~~~ All synthetic test data is created using pytest fixtures in: - ``tests/conftest.py`` - Shared fixtures for all tests - ``tests/fixtures/generators.py`` - Helper functions to create simulation structures Key fixtures include: ``tempdir`` Temporary directory (pathlib.Path) automatically cleaned up after each test ``minimal_simulation_dir`` Complete simulation directory structure with synthetic VPLanet files:: test_sims/sim_00/ ├── vpl.in # Primary input file ├── earth.in # Body input file ├── earth.log # Synthetic log file └── earth.earth.forward # Synthetic forward evolution file ``sample_vplanet_help_dict`` Mock VPLanet parameter dictionary (replaces ``vplanet -H`` output):: { "dMass": { "Type": "Double", "Dimension(s)": "mass", "Default value": "0.0" }, ... } ``mock_vplanet_help`` Monkeypatch fixture that mocks ``GetVplanetHelp()`` across all modules Running Unit Tests ~~~~~~~~~~~~~~~~~~ Run all unit tests:: pytest tests/unit/ -v Run with coverage report:: pytest tests/unit/ --cov=bigplanet --cov-report=term-missing Run specific test file:: pytest tests/unit/test_read.py -v Run specific test:: pytest tests/unit/test_read.py::TestReadFile::test_read_file_basic -v Integration Tests: Real VPLanet -------------------------------- Test Scenarios ~~~~~~~~~~~~~~ Integration tests are in scenario-specific directories:: tests/CreateHDF5/ # Archive creation tests/SingleSim/ # Single simulation archive tests/ExtractArchive/ # Archive extraction tests/ExtractFilterArchive/ # Filtered archive extraction tests/ExtractFilterRawData/ # Raw data filtering tests/Stats/ # Statistical aggregations tests/UlyssesAggregated/ # Ulysses MCMC format tests/UlyssesForward/ # Ulysses forward mode tests/MD5CheckSum/ # Deprecated MD5 tests tests/Bpstatus/ # Status checking tests/Fletcher32CheckSum/ # HDF5 checksum verification What They Test ~~~~~~~~~~~~~~ Integration tests run the complete BigPlanet pipeline: 1. **vspace** - Generate VPLanet input parameter space 2. **multiplanet** - Run VPLanet simulations (real binary execution) 3. **bigplanet** - Create archive or filtered files 4. **Verification** - Check output correctness Example: CreateHDF5 Integration Test ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``tests/CreateHDF5/test_CreateHDF5.py`` test: 1. Runs ``vspace vspace.in`` to generate 3 simulation directories 2. Runs ``multiplanet vspace.in`` to execute 3 VPLanet simulations 3. Runs ``bigplanet bpl.in -a`` to create archive 4. Verifies archive contains all expected data **VPLanet Configuration:** - Modules: ``radheat`` + ``thermint`` (thermal evolution) - Duration: 4.5 billion years - Runtime per simulation: ~1.1 seconds - Total test time: ~5 seconds (3 sims + overhead) VPLanet API Contract Verification ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Integration tests **automatically detect VPLanet API changes**: **Log File Format Changes** If VPLanet changes how it writes log files (e.g., different parameter names, different formatting), BigPlanet's ``ProcessLogFile()`` will fail to parse the output, causing integration tests to fail. **Forward File Format Changes** If VPLanet changes time series output formats, ``ProcessOutputfile()`` will fail to read forward evolution data. **Help Output Changes** If ``vplanet -H`` output format changes, ``GetVplanetHelp()`` will fail to parse parameter metadata. This provides **implicit API contract testing** - any breaking change in VPLanet's output format is immediately detected by integration test failures. Requirements ~~~~~~~~~~~~ Integration tests require: - VPLanet binary in ``PATH`` or ``/Users/rory/src/vplanet/bin/vplanet`` - vspace installed (``pip install vspace``) - multiplanet installed (``pip install multi-planet``) Running Integration Tests ~~~~~~~~~~~~~~~~~~~~~~~~~ Run all integration tests:: pytest tests/CreateHDF5/ tests/SingleSim/ tests/ExtractArchive/ -v Run specific scenario:: pytest tests/CreateHDF5/test_CreateHDF5.py -v **Warning:** Integration tests take several minutes due to VPLanet simulation time. Why Not Use Real VPLanet in Unit Tests? ---------------------------------------- We **intentionally avoid** using real VPLanet in unit tests because: Time Cost ~~~~~~~~~ - **Current unit tests:** <1 second - **With real VPLanet (best case):** 22 seconds (20 VPLanet runs × 1.1 sec) - **With real VPLanet (worst case):** 146 seconds (133 tests × 1.1 sec) - **CI/CD pipeline:** 12 test matrix cells → 4-29 minutes vs <12 seconds Environmental Dependencies ~~~~~~~~~~~~~~~~~~~~~~~~~~ Unit tests would fail on: - Fresh developer machines before VPLanet installation - CI environments without VPLanet pre-installed - Systems with incompatible VPLanet versions - Platforms with different VPLanet compilation flags Platform Variability ~~~~~~~~~~~~~~~~~~~~ VPLanet behavior can vary across: - Python versions (3.9 vs 3.13 subprocess differences) - Operating systems (macOS Intel vs ARM vs Ubuntu 22.04 vs 24.04) - VPLanet versions (development branch vs release) - Numerical precision (floating point rounding differences) This creates **flaky tests** that pass sometimes and fail other times. Test Isolation Violation ~~~~~~~~~~~~~~~~~~~~~~~~~ Unit tests should test **BigPlanet code**, not VPLanet correctness. **Current approach:** Tests verify BigPlanet can parse *any* valid VPLanet output format **With real VPLanet:** Test failures could indicate VPLanet bugs, BigPlanet bugs, or both - making debugging difficult Non-Determinism ~~~~~~~~~~~~~~~ Real VPLanet may have: - Floating point rounding variations across platforms - Different numerical integration behavior - Platform-specific SIMD optimizations - Build-specific configurations This makes tests **unreliable** and hard to debug. Coverage Strategy ----------------- Current Coverage by Module ~~~~~~~~~~~~~~~~~~~~~~~~~~ :: Module Coverage Lines Tests Uncovered Lines ================================================================ read.py 92% 179 22 Lines 66-67, 77-80, etc. process.py 83% 322 31 Lines 241-270, 346-423 extract.py 73% 242 42 Lines 416-451 archive.py 67% 117 30 Lines 61-88 (multiprocessing) filter.py 40% 121 25 Lines 208-271 (orchestration) bigplanet.py 24% 49 15 Lines 35-73 (deleterawdata) bpstatus.py 0% 35 0 Not yet tested ================================================================ TOTAL 74% 1107 133 Coverage Goals ~~~~~~~~~~~~~~ **Target: 90% coverage** through: 1. **Unit test expansion** (in progress) - filter.py refactoring and testing - archive.py multiprocessing tests - bpstatus.py basic tests - Edge case coverage in process.py 2. **Integration test maintenance** (complete) - Already covers all major workflows - Provides VPLanet API compatibility verification - No additional integration tests needed 3. **Synthetic fixtures** (ongoing) - Expand fixtures in conftest.py - Add edge case generators - Mock multiprocessing where possible GitHub Actions CI/CD -------------------- BigPlanet uses GitHub Actions to test across multiple environments: Test Matrix ~~~~~~~~~~~ :: Python Versions: 3.9, 3.10, 3.11, 3.12 Operating Systems: Ubuntu 22.04, Ubuntu 24.04, macOS-13 (Intel), macOS-14 (ARM) Total combinations: 4 × 4 = 16 test matrix cells Unit Tests in CI ~~~~~~~~~~~~~~~~ **Always run** on every commit: - Fast (<1 second per matrix cell) - Platform independent (synthetic data) - No VPLanet installation required - Catches BigPlanet regressions immediately Integration Tests in CI ~~~~~~~~~~~~~~~~~~~~~~~~ **Run selectively** (comprehensive CI only): - Require VPLanet binary installation - Take several minutes per matrix cell - Verify VPLanet API compatibility - Run before releases or major merges Running Tests Locally ---------------------- Quick Development Feedback ~~~~~~~~~~~~~~~~~~~~~~~~~~ During active development, run unit tests only:: pytest tests/unit/ -v This provides sub-second feedback for BigPlanet code changes. Pre-Commit Verification ~~~~~~~~~~~~~~~~~~~~~~~ Before committing, run unit tests with coverage:: pytest tests/unit/ --cov=bigplanet --cov-report=term-missing Verify coverage hasn't decreased and new code is tested. Pre-Push Verification ~~~~~~~~~~~~~~~~~~~~~ Before pushing to GitHub, run integration tests:: pytest tests/CreateHDF5/ tests/SingleSim/ tests/ExtractArchive/ -v This catches any VPLanet API incompatibilities. Full Test Suite ~~~~~~~~~~~~~~~ To run everything (takes several minutes):: pytest tests/ -v This runs all 133 unit tests + 12 integration tests. Test Development Guidelines ---------------------------- When writing new tests, follow these principles: 1. **Unit tests use synthetic data** - Create fixtures in conftest.py - Hand-craft minimal VPLanet output files - Mock external dependencies (GetVplanetHelp, subprocess calls) 2. **Integration tests use real VPLanet** - Create new test scenario directory - Include VPLanet input files (vpl.in, body.in) - Run full vspace → multiplanet → bigplanet pipeline 3. **Test one thing at a time** - Unit tests should test a single function or behavior - Use Given-When-Then documentation format - Assert specific expected outcomes 4. **Keep functions testable** - Functions <20 lines are easier to test - Pure functions (no side effects) are easiest to test - Extract complex logic into separate testable functions 5. **Use descriptive test names** - ``test_process_log_file_with_grid_output_order()`` is better than ``test_log()`` - Name should describe the scenario being tested - Someone should understand what's tested without reading the code Example: Adding a New Test ~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Good - Unit test with synthetic data:** .. code-block:: python def test_process_log_file_missing_body_section(tempdir): \"\"\" Given: Log file without a body section When: ProcessLogFile is called Then: Returns empty data dictionary \"\"\" log_content = \"\"\" ---- FINAL SYSTEM PROPERTIES ---- System Age: 0.0 sec \"\"\" log_file = tempdir / "test.log" log_file.write_text(log_content) result = process.ProcessLogFile("test.log", {}, str(tempdir)) assert result == {} **Bad - Unit test calling real VPLanet:** .. code-block:: python def test_process_log_file_real_vplanet(tempdir): # DON'T DO THIS in unit tests subprocess.run(["vplanet", "vpl.in"], cwd=tempdir) result = process.ProcessLogFile("earth.log", {}, str(tempdir)) # This belongs in integration tests, not unit tests Summary ------- BigPlanet's testing strategy provides: ✅ **Fast development iteration** (<1 second unit tests) ✅ **Platform independence** (synthetic data works everywhere) ✅ **VPLanet API verification** (integration tests catch format changes) ✅ **Comprehensive coverage** (74% → 90% goal) ✅ **Reliable CI/CD** (deterministic test results) The two-tier approach (synthetic unit tests + real integration tests) balances speed, reliability, and API compatibility verification without sacrificing any of these critical goals.