Selection Utilities API Documentation
This document provides comprehensive API documentation for the staggered selection utilities in MetaPulsar.
Overview
The selection_utils module provides a modern, well-documented API for creating Enterprise-compatible selection functions. It replaces the legacy create_selection_stag function with improved functionality, better documentation, and enhanced type safety.
Key Features
Hierarchical Flag Selection: Support for staggered selection with automatic fallback
Single Flag Selection: Simple flag-based selection for basic use cases
Frequency Filtering: Optional frequency band filtering
Enterprise Compatibility: Full compatibility with enterprise.signals.selections.Selection
Type Safety: Complete type hints for better IDE support
Error Handling: Graceful handling of edge cases
API Reference
create_staggered_selection
def create_staggered_selection(
    name: str,
    flag_criteria: Dict[Union[str, Tuple[str, ...]], Optional[str]] = None,
    freq_range: Optional[Tuple[float, float]] = None,
) -> Callable
Creates a staggered selection function for Enterprise.
Parameters
name : str
Base name for the selection (e.g., “efac”, “ecorr”, “band”)
Used as prefix for all generated selection names
Example: "efac" → "efac_ASP_430", "efac_ASP_800"
flag_criteria : Dict[Union[str, Tuple[str, ...]], Optional[str]], optional
Dictionary mapping flag specifications to target values
Key types:
str: Single flag name (e.g., "group")
Tuple[str, ...]: Staggered flags for fallback (e.g., ("group", "f"))
Value types:
None: Select all unique values (excluding empty strings)
str: Select specific value only
Examples:
{"group": None} - Use all values of “group” flag
{"group": "ASP_430"} - Use only “ASP_430” value of “group” flag
{("group", "f"): None} - Use “group” if available, fallback to “f”
{("group", "f"): "ASP_430"} - Use “group” if available and matches “ASP_430”, fallback to “f”
freq_range : Tuple[float, float], optional
Optional frequency range for filtering (low_freq, high_freq)
Only frequencies within this range will be selected
Range is inclusive of low_freq, exclusive of high_freq: [low_freq, high_freq)
Example: (400, 1000) selects frequencies from 400 MHz up to, but not including, 1000 MHz
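The half-open convention can be checked directly with NumPy. The snippet below is a minimal sketch of how such a mask could be computed; the variable names are illustrative and are not taken from the module's source.

```python
import numpy as np

# Illustrative frequencies in MHz and a (low_freq, high_freq) pair.
freqs = np.array([399.9, 400.0, 999.999, 1000.0])
low_freq, high_freq = 400, 1000

# Half-open interval [low_freq, high_freq): the lower bound is included,
# the upper bound is excluded.
freq_mask = (freqs >= low_freq) & (freqs < high_freq)

# 400.0 and 999.999 are selected; 399.9 and 1000.0 are not.
print(freq_mask.tolist())
```

A TOA at exactly 1000 MHz is therefore left out of a (400, 1000) band and would need a neighbouring band starting at 1000 to be picked up.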
Returns
Callable
Selection function compatible with enterprise.signals.selections.Selection
Function signature: (flags: Dict[str, np.ndarray], freqs: np.ndarray) -> Dict[str, np.ndarray]
Returns dictionary mapping selection names to boolean masks
Function Signature
The returned selection function has the signature:
def selection_function(flags: Dict[str, np.ndarray], freqs: np.ndarray) -> Dict[str, np.ndarray]:
    """
    Selection function that works with Enterprise.

    Args:
        flags: Dictionary of flag arrays (e.g., {"group": ["ASP_430", "ASP_800"], "B": ["1", "2"]})
        freqs: Array of frequencies

    Returns:
        Dictionary mapping selection names to boolean masks
    """
_create_selections_for_flag
def _create_selections_for_flag(
    flag_values: np.ndarray,
    target_value: Optional[str],
    base_name: str,
    freq_mask: np.ndarray,
) -> Dict[str, np.ndarray]
Internal helper function for creating selection masks for a specific flag.
Parameters
flag_values : np.ndarray
Array of flag values for the current flag
target_value : Optional[str]
Target value to match (None means all values)
base_name : str
Base name for the selection
freq_mask : np.ndarray
Frequency mask to apply
Returns
Dict[str, np.ndarray]
Dictionary mapping selection names to boolean masks
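To make the parameter descriptions concrete, here is a minimal sketch of what a helper with this signature might do. It is not the module's implementation: the function name `sketch_selections_for_flag` is hypothetical, and the choice to return an empty dictionary when the target value is absent is an assumption based on the troubleshooting notes.

```python
from typing import Dict, Optional

import numpy as np


def sketch_selections_for_flag(
    flag_values: np.ndarray,
    target_value: Optional[str],
    base_name: str,
    freq_mask: np.ndarray,
) -> Dict[str, np.ndarray]:
    """Hypothetical stand-in for _create_selections_for_flag."""
    if target_value is None:
        # Use every unique flag value, skipping empty strings.
        values = [v for v in np.unique(flag_values) if v != ""]
    elif np.any(flag_values == target_value):
        values = [target_value]
    else:
        # Assumption: a target value that never occurs yields no selection.
        values = []

    # One mask per value: the TOA carries the value AND passes the frequency cut.
    return {
        f"{base_name}_{value}": (flag_values == value) & freq_mask
        for value in values
    }


flag_values = np.array(["ASP_430", "ASP_800", "", "ASP_430"])
all_freqs = np.ones(4, dtype=bool)  # no frequency filtering
masks = sketch_selections_for_flag(flag_values, None, "efac", all_freqs)
```

In this sketch the selection name is always `base_name` joined to the flag value with an underscore, which matches the naming shown for the `name` parameter.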
Usage Examples
Basic Flag Selection
import numpy as np
from metapulsar.selection_utils import create_staggered_selection
# Simple group-based selection (all values)
group_sel = create_staggered_selection("efac", {"group": None})
# Specific value selection
specific_sel = create_staggered_selection("efac", {"group": "ASP_430"})
# Test with mock data
flags = {"group": np.array(["ASP_430", "ASP_800", "ASP_430"])}
freqs = np.array([100.0, 200.0, 300.0])
result = group_sel(flags, freqs)
# Returns: {"efac_ASP_430": [True, False, True], "efac_ASP_800": [False, True, False]}
Staggered Selection
# Staggered selection with fallback
staggered_sel = create_staggered_selection("ecorr", {("group", "f"): None})
# Test with both flags available (uses 'group')
flags_both = {
    "group": np.array(["ASP_430", "ASP_800"]),
    "f": np.array(["GASP_430", "GASP_800"]),
}
freqs = np.array([430.0, 800.0])  # one frequency per TOA, matching the two-element flag arrays
result_both = staggered_sel(flags_both, freqs)
# Returns: {"ecorr_ASP_430": [True, False], "ecorr_ASP_800": [False, True]}
# Test with only fallback flag (uses 'f')
flags_fallback = {"f": np.array(["GASP_430", "GASP_800"])}
result_fallback = staggered_sel(flags_fallback, freqs)
# Returns: {"ecorr_GASP_430": [True, False], "ecorr_GASP_800": [False, True]}
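The fallback behaviour in the two calls above can be sketched as a small lookup. This is an illustration only: `resolve_flag` is a hypothetical helper, and it assumes that a flag counts as "available" when its name is a key of the flags dictionary. The real module may apply a different rule, for example deciding per TOA.

```python
from typing import Dict, Optional, Tuple, Union

import numpy as np


def resolve_flag(
    flags: Dict[str, np.ndarray],
    spec: Union[str, Tuple[str, ...]],
) -> Optional[str]:
    """Return the first flag name in `spec` that is present in `flags`."""
    # A plain string is treated as a one-element tuple.
    candidates = (spec,) if isinstance(spec, str) else spec
    for name in candidates:
        if name in flags:
            return name
    return None  # none of the requested flags is available


both = {"group": np.array(["ASP_430"]), "f": np.array(["GASP_430"])}
only_f = {"f": np.array(["GASP_430"])}

chosen_both = resolve_flag(both, ("group", "f"))    # "group" comes first and is present
chosen_f = resolve_flag(only_f, ("group", "f"))     # falls back to "f"
```

Ordering the tuple from most to least preferred flag keeps the selection names stable across data sets that all carry the preferred flag.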
Frequency Filtering
# Frequency band selection
band_sel = create_staggered_selection(
    "band",
    {"group": None},
    freq_range=(400, 1000),  # 400-1000 MHz band
)
flags = {"group": np.array(["ASP_430", "ASP_800", "ASP_430"])}
freqs = np.array([100.0, 500.0, 1500.0]) # Only 500.0 is in range
result = band_sel(flags, freqs)
# Returns: {"band_ASP_430": [False, False, False], "band_ASP_800": [False, True, False]}
# (the only in-band TOA, at 500.0 MHz, carries the ASP_800 flag)
Multiple Flag Criteria
# Multiple criteria
multi_sel = create_staggered_selection("efac", {
    "pta": "EPTA",         # EPTA-specific
    ("group", "f"): None,  # All groups with fallback
})
flags = {
    "pta": np.array(["EPTA", "PPTA", "EPTA"]),
    "group": np.array(["ASP_430", "ASP_800", "ASP_430"]),
    "f": np.array(["GASP_430", "GASP_800", "GASP_430"]),
}
freqs = np.array([100.0, 200.0, 300.0])
result = multi_sel(flags, freqs)
# Returns: {
# "efac_ASP_430": [True, False, True], # From group flag
# "efac_ASP_800": [False, False, False] # From group flag
# }
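The output above is consistent with the criteria being combined with a logical AND: the "pta" criterion restricts which TOAs are eligible, and the "group" criterion names the resulting masks. This is an inference from the documented result, not a statement about the module's internals. The sketch below reproduces the same masks by hand under that assumption.

```python
import numpy as np

pta = np.array(["EPTA", "PPTA", "EPTA"])
group = np.array(["ASP_430", "ASP_800", "ASP_430"])

# Assumed behaviour: the "pta" criterion acts as a filter on every group mask.
pta_mask = pta == "EPTA"

combined = {
    f"efac_{value}": (group == value) & pta_mask
    for value in np.unique(group)
}

for selection_name, mask in combined.items():
    print(selection_name, mask.tolist())
```

Under this reading, "efac_ASP_800" is all False because the only ASP_800 TOA belongs to PPTA.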
Enterprise Integration
Basic Integration
from enterprise.signals.selections import Selection
from metapulsar.selection_utils import create_staggered_selection
# Create selection function
efac_sel = create_staggered_selection("efac", {"group": None})
# Wrap with Enterprise Selection
selection = Selection(efac_sel)
# Create selection instance with pulsar
selection_instance = selection(pulsar)
masks = selection_instance.masks
# Use in Enterprise model (MeasurementNoise receives the selection through its `selection` argument)
from enterprise.signals import white_signals
white_signal = white_signals.MeasurementNoise(selection=selection)
Parameter Generation
# Generate parameters for Enterprise
params, param_masks = selection_instance("efac", lambda x: f"param_{x}")
# params: {"efac_ASP_430_efac": "param_pulsar_name_efac_ASP_430_efac", ...}
# param_masks: {"efac_ASP_430_efac": [True, False, True], ...}
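The key and parameter names shown above follow a simple pattern: the selection name is joined to the parameter name, and the pulsar name is prepended before the parameter factory is called. The pure-Python sketch below mimics that pattern without importing Enterprise. `sketch_param_names` and the pulsar name "J0000+0000" are placeholders introduced for illustration.

```python
from typing import Callable, Dict, List, Tuple


def sketch_param_names(
    psr_name: str,
    masks: Dict[str, List[bool]],
    parname: str,
    parameter: Callable[[str], object],
) -> Tuple[Dict[str, object], Dict[str, List[bool]]]:
    """Mimic how selection keys and parameter names are combined."""
    params, keyed_masks = {}, {}
    for key, mask in masks.items():
        # "<selection name>_<parameter name>", or just the parameter name
        # when the selection key is empty.
        kname = f"{key}_{parname}" if key else parname
        # The pulsar name is prepended before the factory is called.
        params[kname] = parameter(f"{psr_name}_{kname}")
        keyed_masks[kname] = mask
    return params, keyed_masks


params, param_masks = sketch_param_names(
    "J0000+0000",  # placeholder pulsar name
    {"efac_ASP_430": [True, False, True]},
    "efac",
    lambda x: f"param_{x}",
)
print(params)
print(param_masks)
```

Because the base name passed to create_staggered_selection and the parameter name are both "efac" here, the resulting key contains "efac" twice; a different base name avoids that repetition.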
Advanced Integration
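# Complex selection with multiple criteria
complex_sel = create_staggered_selection("efac", {
    ("group", "f"): None,  # Staggered selection
    "pta": "EPTA",         # PTA-specific
})
# Use in Enterprise model: MeasurementNoise takes the EFAC prior as `efac`
# and the selection as `selection`
from enterprise.signals import parameter
white_signal = white_signals.MeasurementNoise(
    efac=parameter.Uniform(0.5, 1.5),  # example prior bounds, adjust as needed
    selection=Selection(complex_sel),
)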
Migration Guide
From Legacy create_selection_stag
The new API is designed as a near drop-in replacement for the legacy function; only the function name and the way the frequency bounds are passed change:
# Legacy code
from legacy.metapulsar import create_selection_stag
legacy_sel = create_selection_stag("efac", {"group": None}, lowfreq=400, highfreq=1000)
# New API
from metapulsar.selection_utils import create_staggered_selection
new_sel = create_staggered_selection("efac", {"group": None}, freq_range=(400, 1000))
Key Differences
Function name: create_selection_stag → create_staggered_selection
Frequency parameters: (lowfreq, highfreq) → freq_range=(low_freq, high_freq)
Type hints: Full type annotations for better IDE support
Documentation: Comprehensive docstrings and examples
Error handling: Improved error handling for edge cases
Migration Steps
Update imports:
# Old
from legacy.metapulsar import create_selection_stag
# New
from metapulsar.selection_utils import create_staggered_selection
Update function calls:
# Old
sel = create_selection_stag("efac", {"group": None}, lowfreq=400, highfreq=1000)
# New
sel = create_staggered_selection("efac", {"group": None}, freq_range=(400, 1000))
Test compatibility: Verify that selections work identically with your data
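One way to carry out the compatibility check is to run both selection functions on the same inputs and compare the resulting dictionaries. The helper below is a sketch: `selections_match` is a hypothetical name, and the two lambdas stand in for the legacy and new selection functions so that the example runs without MetaPulsar installed.

```python
from typing import Callable, Dict

import numpy as np

SelectionFn = Callable[[Dict[str, np.ndarray], np.ndarray], Dict[str, np.ndarray]]


def selections_match(
    old_sel: SelectionFn,
    new_sel: SelectionFn,
    flags: Dict[str, np.ndarray],
    freqs: np.ndarray,
) -> bool:
    """Return True if both functions give the same names and the same masks."""
    old_masks = old_sel(flags, freqs)
    new_masks = new_sel(flags, freqs)
    if set(old_masks) != set(new_masks):
        return False
    return all(np.array_equal(old_masks[k], new_masks[k]) for k in old_masks)


# Stand-ins for the legacy and new selection functions.
legacy_like = lambda flags, freqs: {"efac_A": flags["group"] == "A"}
new_like = lambda flags, freqs: {"efac_A": flags["group"] == "A"}

flags = {"group": np.array(["A", "B", "A"])}
freqs = np.array([430.0, 800.0, 430.0])
print(selections_match(legacy_like, new_like, flags, freqs))
```

In a real migration the two stand-ins would be replaced by the functions returned from create_selection_stag and create_staggered_selection, called with flags and frequencies taken from your own pulsar data.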
Performance Considerations
Memory Usage
Selection functions use numpy arrays for efficient memory usage
Boolean masks are created on-demand to minimize memory footprint
Large datasets (>10,000 TOAs) may benefit from chunked processing
Computational Complexity
Time complexity: O(n) per generated mask, where n is the number of TOAs
Space complexity: O(n × m) where m is the number of unique flag values
Frequency filtering adds minimal overhead
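As a rough worked example of the O(n × m) space estimate, the snippet below builds one boolean mask per unique flag value and sums their sizes. The flag values are synthetic and the numbers are illustrative; NumPy stores one byte per boolean element.

```python
import numpy as np

n_toas, n_values = 10_000, 5

# Synthetic flag array that cycles through five made-up backend names.
flag_values = np.array([f"backend_{i % n_values}" for i in range(n_toas)])

# One boolean mask of length n per unique flag value.
masks = {value: flag_values == value for value in np.unique(flag_values)}

# Total footprint of all masks: n * m bytes.
total_bytes = sum(mask.nbytes for mask in masks.values())
print(len(masks), total_bytes)
```

At this size the masks occupy about 50 kB in total, so memory only becomes a concern when the number of TOAs or unique flag values is far larger.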
Optimization Tips
Pre-filter data: Apply frequency filtering early if possible
Use specific values: Specify target values instead of None when possible
Minimize flag criteria: Use only necessary flag criteria
Batch processing: Process multiple pulsars in batches for large datasets
Troubleshooting
Common Issues
1. Empty Selection Results
Problem: Selection returns empty dictionary
result = sel_func(flags, freqs) # Returns: {}
Causes:
No flags match the criteria
All flag values are empty strings
Frequency range excludes all TOAs
Solutions:
Check flag names and values
Verify frequency range parameters
Use None for target_value to select all values
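The causes listed above can be checked programmatically before calling the selection function. The helper below is a sketch with a hypothetical name, `diagnose_empty_selection`; it reports which of the listed causes apply to a given flag and frequency range, using the half-open range convention described for freq_range.

```python
from typing import Dict, List, Optional, Tuple

import numpy as np


def diagnose_empty_selection(
    flags: Dict[str, np.ndarray],
    freqs: np.ndarray,
    flag_name: str,
    freq_range: Optional[Tuple[float, float]] = None,
) -> List[str]:
    """List the likely reasons a selection on `flag_name` would be empty."""
    problems = []
    if flag_name not in flags:
        problems.append(f"flag '{flag_name}' is not in the flags dictionary")
    elif np.all(flags[flag_name] == ""):
        problems.append(f"all values of '{flag_name}' are empty strings")
    if freq_range is not None:
        low, high = freq_range
        # Same half-open convention as freq_range: [low, high).
        if not np.any((freqs >= low) & (freqs < high)):
            problems.append("no TOAs fall inside freq_range")
    return problems


flags = {"group": np.array(["", ""])}
freqs = np.array([100.0, 200.0])
report = diagnose_empty_selection(flags, freqs, "group", (400, 1000))
print(report)
```

An empty list means none of these three causes applies, in which case the target value itself is the next thing to check.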
2. Missing Flag Values
Problem: Expected flag values not found
# Expected "ASP_430" but got different values
result = sel_func({"group": np.array(["ASP_800", "ASP_1400"])}, freqs)
Solutions:
Check actual flag values in your data
Use None for target_value to select all available values
Verify flag names are correct
3. Enterprise Integration Issues
Problem: Selection doesn’t work with Enterprise
# TypeError: selection_function() missing 1 required positional argument
Solutions:
Ensure function signature is (flags, freqs)
Check that pulsar has required attributes
Verify Enterprise version compatibility
Debugging Tips
Test selection function directly:
result = sel_func(flags, freqs)
print(f"Selection keys: {list(result.keys())}")
print(f"Mask shapes: {[mask.shape for mask in result.values()]}")
Check flag values:
print(f"Available flags: {list(flags.keys())}")
for flag_name, values in flags.items():
    print(f"{flag_name}: {np.unique(values)}")
Verify frequency range:
print(f"Frequency range: {freqs.min():.1f} - {freqs.max():.1f}")
print(f"TOAs in range: {((freqs >= 400) & (freqs < 1000)).sum()}")
Error Messages
| Error | Cause | Solution |
|---|---|---|
| | Missing required arguments | Provide |
| | Flag not found in flags dictionary | Check flag names in your data |
| | Array shape mismatch | Verify flag arrays have same length as freqs |
| | Mask length mismatch | Check that all arrays have same length |
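The shape and length mismatches in the table can be caught early with a small validation step. The function below is a sketch, not part of the module: `check_lengths` is a hypothetical name, and raising ValueError is a choice made for this example.

```python
from typing import Dict

import numpy as np


def check_lengths(flags: Dict[str, np.ndarray], freqs: np.ndarray) -> None:
    """Raise ValueError if any flag array differs in length from freqs."""
    n_toas = len(freqs)
    for name, values in flags.items():
        if len(values) != n_toas:
            raise ValueError(
                f"flag '{name}' has {len(values)} entries but freqs has {n_toas}"
            )


good_flags = {"group": np.array(["ASP_430", "ASP_800"])}
bad_flags = {"group": np.array(["ASP_430", "ASP_800", "ASP_430"])}
freqs = np.array([430.0, 800.0])

check_lengths(good_flags, freqs)  # passes silently

try:
    check_lengths(bad_flags, freqs)
except ValueError as err:
    message = str(err)
    print(message)
```

Running such a check before calling the selection function turns a confusing downstream broadcasting failure into a message that names the offending flag.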
Contributing
When contributing to the selection utilities:
Add tests for new functionality
Update documentation for API changes
Maintain backward compatibility when possible
Keep type hints complete on new and changed functions for better IDE support
Test with Enterprise to ensure compatibility
License
This module is part of MetaPulsar and follows the same license terms.