ASTROMER
Models
Single-Band Encoder
The Single-Band Encoder is the main model class; it loads pre-trained weights, fits the model to data, and encodes light curves into embeddings.
It takes single-band light curves as input; these may vary in length from star to star, depending on the objectives of the survey being carried out.
The input X is a set of observations of a celestial object (such as a star) over time. Each observation has two characteristics: the magnitude (brightness) of the object and the Modified Julian Date (MJD) when the observation was made.
We propose to use learned representations from a transformer-based encoder to create embeddings that capture the variability of objects in a fixed-dimensional space. This makes it easy to fine-tune the model weights to match other surveys and use them to solve downstream tasks, such as classification or regression.
- class ASTROMER.models.SingleBandEncoder(num_layers=2, d_model=200, num_heads=2, dff=256, base=10000, dropout=0.1, maxlen=100, batch_size=None)[source]
Bases:
object
This class is a transformer-based model that processes the input and generates a fixed-size representation. Since each light curve has two characteristics (magnitude and time), the encoder transforms it into an embedding Z of shape 200x256.
The maximum number of observations remains fixed, and shorter light curves are padded and masked, so every Z has the same length even if some light curves are shorter than others. A minimal instantiation sketch follows the parameter list below.
- Parameters:
num_layers (Integer) – Number of self-attention blocks or transformer layers in the encoder.
d_model (Integer) – Determines the dimensionality of the model’s internal representation (must be divisible by ‘num_heads’).
num_heads (Integer) – Number of attention heads used in an attention layer.
dff (Integer) – Number of neurons for the fully-connected layer applied after the attention layers. It consists of two linear transformations with a non-linear activation function in between.
base (Float32) – Value that defines the maximum and minimum wavelengths of the positional encoder (see Eq. 4 in Donoso-Oliva et al. 2022). It is used to define the range of positions the attention mechanism uses to compute the attention weights.
dropout (Float32) – Regularization applied to the output of the fully-connected layer to prevent overfitting. It randomly drops out (i.e., sets to zero) a fraction of the input units in a layer during training.
maxlen (Integer) – Maximum length to process in the encoder. It is used in the SingleBandEncoder class to limit the input sequences’ length when passed to the transformer-based model.
batch_size (Integer) – Number of samples to be used in a forward pass. Note that an epoch is completed when all batches have been processed (default: None).
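For reference, a minimal instantiation sketch using the default arguments listed above (any of them can be overridden):

import numpy as np
from ASTROMER.models import SingleBandEncoder

# Build an untrained encoder with the documented default arguments.
model = SingleBandEncoder(num_layers=2,
                          d_model=200,
                          num_heads=2,
                          dff=256,
                          base=10000,
                          dropout=0.1,
                          maxlen=100)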
- encode(dataset, oids_list=None, labels=None, batch_size=1, concatenate=True)[source]
This method encodes a dataset of light curves into fixed-dimensional embeddings using the ASTROMER encoder. The method first checks the format of the dataset containing the light curves.
Then, it loads the dataset using predefined functions from the ‘data’ module. In this step, if a light curve contains more than 200 observations, ASTROMER divides it into shorter windows of length 200.
After loading, the data passes through the encoder layer to obtain the embeddings; see the usage sketch after the parameter list.
- Parameters:
dataset – The input data to be encoded. It can be a list of numpy arrays or a TensorFlow dataset.
oids_list (List) – List of object IDs. Since ASTROMER can only process fixed-length sequences of 200 observations, providing the IDs allows the model to concatenate windows when an object’s light curve is longer than 200 observations.
labels – An optional list of labels for the objects associated with the input dataset.
batch_size – the number of samples to be used in a forward-pass within the encoder. Default is 1.
concatenate (Boolean) – a boolean indicating whether to concatenate the embeddings of objects with the same ID into a single vector.
- Returns:
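A minimal usage sketch, assuming two toy light curves formatted as Lx3 matrices of (MJD, magnitude, magnitude std) as in the Quick-start below; the IDs and values are illustrative:

import numpy as np
from ASTROMER.models import SingleBandEncoder

# Two toy light curves of different lengths (MJD, magnitude, std).
light_curves = [np.array([[5200., 0.3, 0.2],
                          [5300., 0.5, 0.1]]),
                np.array([[5400., 0.2, 0.3]])]

model = SingleBandEncoder()
model = model.from_pretraining('macho')

# One embedding per object; windows sharing an ID are merged
# because concatenate=True.
embeddings = model.encode(light_curves,
                          oids_list=['1', '2'],
                          batch_size=1,
                          concatenate=True)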
- fit(train_batches, valid_batches, epochs=2, patience=40, lr=0.001, project_path='.', verbose=0)[source]
The ‘fit()’ method trains ASTROMER for a given number of epochs. After each epoch, the model’s performance is evaluated on the validation data, and training stops if there is no improvement after a specified number of epochs (patience). See the example after the parameter list.
- Parameters:
train_batches (Object) – Training data already formatted as a tf.data.Dataset
valid_batches (Object) – Validation data already formatted as a tf.data.Dataset
epochs (Integer) – Number of training loops; in each epoch, all light curves are processed once.
patience (Integer) – The number of epochs with no improvement after which training will be stopped.
lr (Float32) – A float specifying the learning rate.
project_path – Path for saving weights and training logs.
verbose (Integer) – If nonzero, progress messages are printed, and their frequency increases with the verbosity level. Above 10, all iterations are reported; above 50, the output is sent to stdout.
- Returns:
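A sketch of a training call; ‘train_batches’ and ‘valid_batches’ are assumed to be tf.data datasets built with the preprocessing utilities (see the Preprocessing section), and the project path is illustrative:

# Train for up to 100 epochs, stopping early after 40 epochs
# without improvement on the validation data.
model.fit(train_batches,
          valid_batches,
          epochs=100,
          patience=40,
          lr=1e-3,
          project_path='./experiment',
          verbose=0)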
- from_pretraining(name='macho')[source]
Loads a pre-trained model with pre-trained weights for a specific astronomical dataset. This method allows users to easily load pre-trained models for astronomical time-series datasets and use them for their own tasks.
This method checks whether you have the weights locally; if not, it downloads them and then loads them into the model, as shown in the example below.
- Parameters:
name – Corresponds to the name of the survey used to pre-train ASTROMER. The name of the survey must match the name of a zip file in https://github.com/astromer-science/weights
- Returns:
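For example (‘macho’, ‘atlas’, and ‘ztfg’ are the surveys listed in the Utils section):

from ASTROMER.models import SingleBandEncoder

# Download the MACHO weights if they are not cached locally,
# then return a model with those weights loaded.
model = SingleBandEncoder()
model = model.from_pretraining('macho')  # also: 'atlas', 'ztfg'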
- load_weights(weights_folder)[source]
The ‘load_weights()’ method loads pre-trained parameters into the model architecture. The method loads the weights from the {weights_folder}/weights directory, which is assumed to be in TensorFlow checkpoint format. A minimal sketch follows below.
- Parameters:
weights_folder – the path to the folder containing the pre-trained weights.
- Returns:
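A minimal sketch, assuming weights were previously saved by ‘fit()’ under an illustrative folder ‘./my_folder’:

from ASTROMER.models import SingleBandEncoder

# Restore the TensorFlow checkpoint stored at './my_folder/weights'.
model = SingleBandEncoder()
model.load_weights('./my_folder')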
Preprocessing
- ASTROMER.preprocessing.make_pretraining(input, batch_size=1, shuffle=False, sampling=False, max_obs=100, msk_frac=0.0, rnd_frac=0.0, same_frac=0.0, repeat=1, **numpy_args)[source]
Load and format data to feed the ASTROMER model. It can process either a list of numpy arrays or TF records. At the end of this method, a TensorFlow dataset is generated following the preprocessing pipeline explained in Section 5.3 (Donoso-Oliva et al. 2022). A usage sketch follows the parameter list.
- Parameters:
input (object) – The data set containing the light curves.
batch_size (Integer) – Number of samples per batch passed to the model.
shuffle (Boolean) – A boolean indicating whether to rearrange samples randomly.
sampling (Boolean) – A boolean that, when True, tells the model to take random samples of every light curve instead of using all observations.
max_obs (Integer) – Maximum number of observations per light-curve sample. E.g., with max_obs = 100, a light curve of 720 observations is split into 7 full windows of 100 observations plus a final window holding the remaining 20 observations, which is zero-padded after the last point to reach length 100.
msk_frac (Float32) – The fraction of observations that will be masked by the model.
rnd_frac (Float32) – The fraction of observations whose values will be replaced by random numbers.
same_frac (Float32) – The fraction of the masked observations that are unmasked and allowed to be processed in the attention layer.
repeat (Integer) – This Integer determines the number of times the same data set is repeated.
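A usage sketch with illustrative masking fractions; the random arrays stand in for real Lx3 light curves:

import numpy as np
from ASTROMER.preprocessing import make_pretraining

# Toy list of Lx3 light curves (MJD, magnitude, magnitude std).
samples = [np.random.rand(300, 3), np.random.rand(150, 3)]

# Build a tf.data dataset of 100-observation windows: 50% of the
# observations are masked, 20% are replaced by random values, and
# 20% of the masked ones are left visible to the attention layer.
dataset = make_pretraining(samples,
                           batch_size=16,
                           shuffle=True,
                           max_obs=100,
                           msk_frac=0.5,
                           rnd_frac=0.2,
                           same_frac=0.2,
                           repeat=1)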
Utils
- ASTROMER.utils.download_weights(url, target)[source]
This function downloads the weights requested by the SingleBandEncoder class through its ‘from_pretraining()’ method, which accepts the available surveys: ‘macho’, ‘atlas’, and ‘ztfg’. The utils module is a set of functions covering functionality not included in the models and preprocessing modules.
It provides a simple and convenient way to download and extract a zipped weights file from a URL into a specified directory, as sketched below.
- Parameters:
url – The URL of the zip file containing the pre-trained weights.
target – The local directory where the weights are downloaded and extracted.
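A sketch; the exact zip-file URL inside the weights repository is an assumption, only the repository address comes from this documentation:

from ASTROMER.utils import download_weights

# Hypothetical URL pointing at the 'macho' zip inside the public
# weights repository; adjust to the actual file location.
url = 'https://github.com/astromer-science/weights/raw/main/macho.zip'
target = './weights/macho'  # local folder to extract into
download_weights(url, target)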
Quick-start
Install
First, install the ASTROMER wheel using pip
pip install ASTROMER
Then import and initialize the model
from ASTROMER.models import SingleBandEncoder
model = SingleBandEncoder()
model = model.from_pretraining('macho')
It will automatically download the weights from the public GitHub repository and load them into the SingleBandEncoder instance. Assume you have a list of variable-length (numpy) light curves.
import numpy as np
samples_collection = [np.array([[5200, 0.3, 0.2],
                                [5300, 0.5, 0.1],
                                [5400, 0.2, 0.3]]),
                      np.array([[5200, 0.6, 0.1],
                                [5300, 0.3, 0.2]])]
Light curves are Lx3 matrices with time, magnitude, and magnitude std. To encode samples use:
attention_vectors = model.encode(samples_collection,
oids_list=['1', '2'],
batch_size=1,
concatenate=True)
Fine Tune
ASTROMER can be easily trained by using the ‘fit()’ method. For example:
from ASTROMER.models import SingleBandEncoder
model = SingleBandEncoder(num_layers= 2,
d_model = 256,
num_heads = 4,
dff = 128,
base = 1000,
dropout = 0.1,
maxlen = 200)
model = model.from_pretraining('macho')
where,
num_layers: Number of self-attention blocks
d_model: Self-attention block dimension (must be divisible by num_heads)
num_heads: Number of heads within the self-attention block
dff: Number of neurons for the fully-connected layer applied after the attention blocks
base: Positional encoder base (see Eq. 4 in Donoso-Oliva et al. 2022)
dropout: Dropout applied to output of the fully-connected layer
maxlen: Maximum length to process in the encoder
Notice you can omit model.from_pretraining(‘macho’) to train from scratch.
model.fit(train_data,
validation_data,
epochs=2,
patience=20,
lr=1e-3,
project_path='./my_folder',
verbose=0)
where,
train_data: Training data already formatted as tf.data
validation_data: Validation data already formatted as tf.data
epochs: Number of epochs for training
patience: Early stopping patience
lr: Learning rate
project_path: Path for saving weights and training logs
verbose: 0 to display information during training; 1 to suppress it
train_data and validation_data should be loaded using load_numpy or pretraining_records functions. Both functions are in the ASTROMER.preprocessing module.
For large datasets, it is recommended to use TensorFlow Records; see the data-pipeline tutorial to execute our pipeline.
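Putting it together, a sketch of an end-to-end fine-tuning run using make_pretraining from the Preprocessing section (which accepts lists of numpy light curves); the masking fractions, sample arrays, and paths are illustrative:

import numpy as np
from ASTROMER.models import SingleBandEncoder
from ASTROMER.preprocessing import make_pretraining

# Toy stand-ins for real Lx3 light curves (MJD, magnitude, std).
train_samples = [np.random.rand(300, 3) for _ in range(8)]
val_samples = [np.random.rand(300, 3) for _ in range(2)]

# Format both splits as tf.data datasets of 200-observation windows.
train_data = make_pretraining(train_samples, batch_size=4, shuffle=True,
                              max_obs=200, msk_frac=0.5,
                              rnd_frac=0.2, same_frac=0.2)
validation_data = make_pretraining(val_samples, batch_size=4,
                                   max_obs=200, msk_frac=0.5,
                                   rnd_frac=0.2, same_frac=0.2)

# Start from the MACHO weights and fine-tune on the new data.
model = SingleBandEncoder(maxlen=200)
model = model.from_pretraining('macho')
model.fit(train_data, validation_data,
          epochs=2, patience=20, lr=1e-3,
          project_path='./my_folder', verbose=0)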