The Photo-z (PZ) Server is an online service available to the LSST community for hosting and sharing lightweight photo-z-related data products. Data and metadata can be uploaded and downloaded at the website pz-server.linea.org.br (during the development phase, a test environment is available at pz-server-dev.linea.org.br). There, you will find two separate pages, each containing a list of data products: one for LSST Data Management's official data products, and another for user-generated data products. The registered data products can also be accessed directly from Python code using the PZ Server's data access API, as demonstrated below.
The PZ Server is developed and delivered as part of the in-kind contribution program BRA-LIN, from LIneA to the Rubin Observatory's LSST. The service is hosted in the Brazilian IDAC and is not directly connected to the Rubin Science Platform (RSP). However, it requires RSP credentials for user authentication. For comprehensive documentation about the PZ Server, please visit the PZ Server's documentation page. There, you will also find an overview of all LIneA's contributions related to photo-zs. The internal documentation of the API functions is available on the API's documentation page.
To upload a data product, click on the button NEW PRODUCT on the top left of the User-generated Data Products page and fill in the Upload Form with relevant metadata.
The photo-z-related products are organized into four categories (product types): Spec-z Catalog, Training Set, Validation Results, and Photo-z Table.
To download a data product available on the Photo-z Server, go to one of the two pages by clicking on the card "LSST PZ Data Products" (for official products released by the LSST DM Team) or "User-generated Data Products" (for products uploaded by members of the LSST community). The download button is on the left side of each data product (each row of the list).
Using pip
The PZ Server API is available on pip as pz-server-lib. To install the API and its dependencies, type in the terminal:
$ pip install pzserver
For developers
Alternatively, clone the repository:
$ git clone https://github.com/linea-it/pzserver.git
Then, from the root of the cloned repository, install the API and its dependencies:
$ pip install -e .
$ pip install .[dev]
Note: You might need to restart the notebook kernel for the newly installed library to become available.
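A quick way to confirm that the library is visible to the current kernel is to look the module up without importing it. This is only a minimal sketch using the standard library; the helper name is_importable is hypothetical and not part of the PZ Server API.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


# "pzserver" is the import name used in this tutorial.
if is_importable("pzserver"):
    print("pzserver is importable in this environment.")
else:
    print("pzserver was not found: install it, then restart the kernel.")
```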
from pzserver import PzServer
import matplotlib.pyplot as plt
%reload_ext autoreload
%autoreload 2
The connection with the PZ Server from Python code is made through an object of the class PzServer. To be authorized to create an instance of PzServer, users must provide an API Token, generated from the top right menu on the PZ Server website.
pz_server = PzServer(token="<paste your API Token here>", host="pz-dev") # "pz-dev" is the temporary host for test phase
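To avoid pasting the token into a notebook that may be shared, one option is to read it from an environment variable. The sketch below assumes a variable named PZ_SERVER_TOKEN, which is a hypothetical name chosen for illustration; the helper load_api_token is likewise not part of the PZ Server API.

```python
import os


def load_api_token(env_var: str = "PZ_SERVER_TOKEN") -> str:
    """Read the API token from an environment variable (name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "generate a token on the PZ Server website first."
        )
    return token


# Usage (requires the pzserver package and a valid token):
# from pzserver import PzServer
# pz_server = PzServer(token=load_api_token(), host="pz-dev")
```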
The object pz_server just created above provides access to data and metadata stored in the PZ Server. It also offers useful methods for navigating the available contents. The methods with the prefix get_ return the result of a query on the PZ Server database as a Python dictionary or a list of dictionaries, and are most useful programmatically (see details on the API documentation page). Alternatively, those with the prefix display_ show the results as styled Pandas DataFrames, optimized for Jupyter Notebook (note: column names might change in the display version). For instance:
Display the list of product types supported with a short description;
pz_server.display_product_types()
Product type | Description |
---|---|
Spec-z Catalog | Catalog of spectroscopic redshifts and positions (usually equatorial coordinates). |
Training Set | Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts. |
Validation Results | Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set and photo-z validation metrics. |
Photo-z Table | Results of a photo-z estimation procedure. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (instructions on accessing the data should be provided in the description field. |
Display the list of users who uploaded data products to the server;
pz_server.display_users()
GitHub username | name |
---|---|
crisingulani | Cristiano Singulani |
drewoldag | Drew Oldag |
glaubervila | Glauber Costa Vila-Verde |
gschwend | Julia Gschwend |
gverde | |
singulani |
Display the list of data releases available at the time;
pz_server.display_releases()
Release | Description |
---|---|
LSST DP0 | LSST Data Preview 0 |
Display all data products available (WARNING: this list can grow rapidly during the survey's operation).
pz_server.display_products_list()
id | internal_name | product_name | product_type | release | uploaded_by | official_product | pz_code | description | created_at |
---|---|---|---|---|---|---|---|---|---|
14 | 14_gama_specz_subsample | GAMA spec-z subsample | Spec-z Catalog | None | gschwend | False | | A small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature. | 2023-03-29T20:02:45.223568Z |
13 | 13_vvds_specz_subsample | VVDS spec-z subsample | Spec-z Catalog | None | gschwend | False | | A small subsample of the VVDS spec-z catalog (Le Fèvre et al. 2004, Garilli et al. 2008) as an example of a typical spec-z catalog from the literature. | 2023-03-29T19:50:27.593735Z |
12 | 12_goldenspike_knn | Goldenspike KNN | Validation Results | None | gschwend | False | KNN | Results of photoz validation using KNN on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:49:35.652295Z |
11 | 11_goldenspike_flexzboost | Goldenspike FlexZBoost | Validation Results | None | gschwend | False | FlexZBoost | Results of photoz validation using FlexZBoost on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:48:34.864629Z |
10 | 10_goldenspike_bpz | Goldenspike BPZ | Validation Results | LSST DP0 | gschwend | False | BPZ | Results of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:42:04.424990Z |
9 | 9_goldenspike_train_data_hdf5 | Goldenspike train data hdf5 | Training Set | None | gschwend | False | | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in hdf5 format. | 2023-03-29T19:12:59.746096Z |
8 | 8_goldenspike_train_data_fits | Goldenspike train data fits | Training Set | None | gschwend | False | | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in fits format. | 2023-03-29T19:09:12.958883Z |
7 | 7_goldenspike_train_data_parquet | Goldenspike train data parquet | Training Set | None | gschwend | False | | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in parquet format. | 2023-03-29T19:06:58.473920Z |
6 | 6_simple_training_set | Simple training set | Training Set | LSST DP0 | gschwend | False | | A simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms. | 2023-03-23T19:46:48.807872Z |
1 | 1_simple_true_z_catalog | Simple true z catalog | Spec-z Catalog | None | gschwend | False | | A simple example of a spectroscopic (true) redshifts catalog created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains only coordinates and redshifts, as an illustration of a typical spec-z catalog. | 2023-03-23T13:19:32.050795Z |
The information about product types, users, and releases shown above can be used to filter the data products of interest for your search. For that, the methods display_products_list and get_products_list accept the argument filters, a dictionary mapping product attributes to their values.
pz_server.display_products_list(filters={"release": "LSST DP0",
"product_type": "Training Set"})
id | internal_name | product_name | product_type | release | uploaded_by | official_product | pz_code | description | created_at |
---|---|---|---|---|---|---|---|---|---|
6 | 6_simple_training_set | Simple training set | Training Set | LSST DP0 | gschwend | False | | A simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms. | 2023-03-23T19:46:48.807872Z |
It also works if we type a string pattern that is part of the value. For instance, just "DP0" instead of "LSST DP0":
pz_server.display_products_list(filters={"release": "DP0"})
id | internal_name | product_name | product_type | release | uploaded_by | official_product | pz_code | description | created_at |
---|---|---|---|---|---|---|---|---|---|
10 | 10_goldenspike_bpz | Goldenspike BPZ | Validation Results | LSST DP0 | gschwend | False | BPZ | Results of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:42:04.424990Z |
6 | 6_simple_training_set | Simple training set | Training Set | LSST DP0 | gschwend | False | | A simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms. | 2023-03-23T19:46:48.807872Z |
It also allows searching for multiple strings by adding the suffix __or (two underscores + "or") to the search key. For instance, to get spec-z catalogs and training sets in the same search (notice that filtering is not case sensitive):
pz_server.display_products_list(filters={"product_type__or": ["Spec-z Catalog", "training set"]})
id | internal_name | product_name | product_type | release | uploaded_by | official_product | pz_code | description | created_at |
---|---|---|---|---|---|---|---|---|---|
14 | 14_gama_specz_subsample | GAMA spec-z subsample | Spec-z Catalog | None | gschwend | False | | A small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature. | 2023-03-29T20:02:45.223568Z |
13 | 13_vvds_specz_subsample | VVDS spec-z subsample | Spec-z Catalog | None | gschwend | False | | A small subsample of the VVDS spec-z catalog (Le Fèvre et al. 2004, Garilli et al. 2008) as an example of a typical spec-z catalog from the literature. | 2023-03-29T19:50:27.593735Z |
9 | 9_goldenspike_train_data_hdf5 | Goldenspike train data hdf5 | Training Set | None | gschwend | False | | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in hdf5 format. | 2023-03-29T19:12:59.746096Z |
8 | 8_goldenspike_train_data_fits | Goldenspike train data fits | Training Set | None | gschwend | False | | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in fits format. | 2023-03-29T19:09:12.958883Z |
7 | 7_goldenspike_train_data_parquet | Goldenspike train data parquet | Training Set | None | gschwend | False | | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in parquet format. | 2023-03-29T19:06:58.473920Z |
6 | 6_simple_training_set | Simple training set | Training Set | LSST DP0 | gschwend | False | | A simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms. | 2023-03-23T19:46:48.807872Z |
1 | 1_simple_true_z_catalog | Simple true z catalog | Spec-z Catalog | None | gschwend | False | | A simple example of a spectroscopic (true) redshifts catalog created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains only coordinates and redshifts, as an illustration of a typical spec-z catalog. | 2023-03-23T13:19:32.050795Z |
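The filtering behavior described above (partial string match, case-insensitive comparison, and the __or suffix) can be mimicked locally to make the semantics concrete. The following is only an illustrative sketch under those stated rules; it is not the server's implementation, and the function matches and the trimmed sample records are hypothetical.

```python
def matches(product: dict, filters: dict) -> bool:
    """Return True if `product` satisfies every entry in `filters`."""
    for key, wanted in filters.items():
        if key.endswith("__or"):
            # "field__or": any one of the listed patterns may match.
            field = key[: -len("__or")]
            patterns = list(wanted)
        else:
            field = key
            patterns = [wanted]
        # Missing or None values are compared as empty strings.
        value = str(product.get(field) or "").lower()
        if not any(str(p).lower() in value for p in patterns):
            return False
    return True


# Trimmed records based on the listing shown above.
sample_products = [
    {"id": 14, "product_type": "Spec-z Catalog", "release": None},
    {"id": 10, "product_type": "Validation Results", "release": "LSST DP0"},
    {"id": 6, "product_type": "Training Set", "release": "LSST DP0"},
]

selected = [p["id"] for p in sample_products if matches(p, {"release": "DP0"})]
print(selected)
```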
To fetch the results of a search and assign them to a variable, just replace the prefix display_ with get_, like this:
search_results = pz_server.get_products_list(filters={"product_type": "results"}) # PZ Validation results
search_results
[{'id': 12, 'release': None, 'release_name': None, 'product_type': 3, 'product_type_name': 'Validation Results', 'uploaded_by': 'gschwend', 'is_owner': True, 'internal_name': '12_goldenspike_knn', 'display_name': 'Goldenspike KNN', 'official_product': False, 'pz_code': 'KNN', 'description': "Results of photoz validation using KNN on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.", 'created_at': '2023-03-29T19:49:35.652295Z', 'status': 1}, {'id': 11, 'release': None, 'release_name': None, 'product_type': 3, 'product_type_name': 'Validation Results', 'uploaded_by': 'gschwend', 'is_owner': True, 'internal_name': '11_goldenspike_flexzboost', 'display_name': 'Goldenspike FlexZBoost', 'official_product': False, 'pz_code': 'FlexZBoost', 'description': "Results of photoz validation using FlexZBoost on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.", 'created_at': '2023-03-29T19:48:34.864629Z', 'status': 1}, {'id': 10, 'release': 1, 'release_name': 'LSST DP0', 'product_type': 3, 'product_type_name': 'Validation Results', 'uploaded_by': 'gschwend', 'is_owner': True, 'internal_name': '10_goldenspike_bpz', 'display_name': 'Goldenspike BPZ', 'official_product': False, 'pz_code': 'BPZ', 'description': "Results of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository.", 'created_at': '2023-03-29T19:42:04.424990Z', 'status': 1}]
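Because the get_ version returns plain Python objects, the results can be processed with ordinary list and dictionary operations. The sketch below uses a hand-trimmed copy of the output above (only four of the keys are kept) so that it runs without a server connection; the variable names are illustrative.

```python
from collections import defaultdict

# Trimmed copy of the search results shown above.
sample_results = [
    {"id": 12, "internal_name": "12_goldenspike_knn",
     "pz_code": "KNN", "release_name": None},
    {"id": 11, "internal_name": "11_goldenspike_flexzboost",
     "pz_code": "FlexZBoost", "release_name": None},
    {"id": 10, "internal_name": "10_goldenspike_bpz",
     "pz_code": "BPZ", "release_name": "LSST DP0"},
]

# Collect the internal names, e.g. to download each product later.
internal_names = [product["internal_name"] for product in sample_results]

# Group the photo-z codes by release (None is shown as "no release").
codes_by_release = defaultdict(list)
for product in sample_results:
    release = product["release_name"] or "no release"
    codes_by_release[release].append(product["pz_code"])

print(internal_names)
print(dict(codes_by_release))
```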
The metadata of a given data product is the information provided by the user on the upload form. This information is attached to the data product contents and is available for consultation on the PZ Server page or through this Python API (pz-server-lib).
All data products stored on the PZ Server are identified by a unique id number or a unique name, a string called internal_name, which is created automatically at the moment of the upload by concatenating the product id with the name given by its owner (replacing blank spaces with "_", converting to lowercase, and removing special characters).
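The naming rule described above can be sketched in a few lines. This is an approximation written from the description alone, not the server's code: the function make_internal_name is hypothetical, and the exact set of characters the server treats as "special" is an assumption (here, anything other than lowercase letters, digits, and underscores).

```python
import re


def make_internal_name(product_id: int, product_name: str) -> str:
    """Approximate the internal_name rule: id + "_" + normalized name."""
    name = product_name.strip().lower()      # lowercase
    name = re.sub(r"\s+", "_", name)         # blank spaces -> "_"
    name = re.sub(r"[^a-z0-9_]", "", name)   # drop special characters (assumed set)
    return f"{product_id}_{name}"


print(make_internal_name(14, "GAMA spec-z subsample"))
```

With the product names listed earlier in this tutorial, the sketch reproduces the internal names shown in the product list, for example 14_gama_specz_subsample.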
The PzServer method get_product_metadata() returns a dictionary with the attributes stored in the PZ Server about a given data product, identified by its id or internal_name. For use in a Jupyter notebook, the equivalent display_product_metadata() shows the results in a formatted table.
# pz_server.display_product_metadata(<id (int or str) or internal_name (str)>)
# pz_server.display_product_metadata(6)
# pz_server.display_product_metadata("6")
pz_server.display_product_metadata("6_simple_training_set")
key | value |
---|---|
id | 6 |
release | LSST DP0 |
product_type | Training Set |
uploaded_by | gschwend |
internal_name | 6_simple_training_set |
product_name | Simple training set |
official_product | False |
pz_code | |
description | A simple example training set created based on the Jupyter notebook simple_pz_training_set.ipynb created by Melissa Graham, available in the repository delegate-contributions-dp02. The file contains coordinates, redshifts, magnitudes, and errors, as an illustration of a typical training set for photo-z algorithms. |
created_at | 2023-03-23T19:46:48.807872Z |
main_file | simple_pz_training_set.csv |
To download any data product stored in the PZ Server, use the PzServer method download_product, passing the product's id or internal_name and the path where it will be saved (the default is the current folder). This method downloads a compressed .zip file which contains all the files uploaded by the user, including data, ancillary files, and description files. The time spent downloading a data product depends on the internet connection between the user and the host. Let's try it with a small data product.
pz_server.download_product(14, save_in=".")
Connecting to PZ Server... File saved as: ./14_gama_specz_subsample_af889.zip Done!
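Once the .zip file is on disk, its contents can be listed and unpacked with Python's standard zipfile module. The helper below is a minimal sketch, not part of the PZ Server API; the function name extract_product and the destination folder in the usage comment are illustrative choices.

```python
import zipfile
from pathlib import Path


def extract_product(zip_path: str, dest_dir: str) -> list[str]:
    """Extract a downloaded product archive and return the names it contained."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    return names


# Usage, with the file name reported by the download above:
# files = extract_product("./14_gama_specz_subsample_af889.zip", "./gama_specz")
# print(files)
```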
All data products uploaded to the PZ Server are immediately available and visible to all PZ Server users (people with RSP credentials) through the PZ Server website or via the API. Besides providing the product id or internal_name for programmatic access, another way to share a data product is to provide the product's URL, which leads to the product's download page. The URL is composed of the PZ Server website address + /product/ + internal_name:
https://pz-server.linea.org.br/product/ + internal_name
or, if still in the development phase,
https://pz-server-dev.linea.org.br/product/ + internal_name
For example:
https://pz-server-dev.linea.org.br/product/6_simple_training_set
WARNING: The URL works only with the internal name, not with the id number.
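For sharing links from code, the URL pattern above can be wrapped in a small helper. This is a sketch based only on the two addresses given in this section; the function product_url and its dev flag are hypothetical conveniences, not part of the PZ Server API.

```python
from urllib.parse import quote

PROD_BASE = "https://pz-server.linea.org.br"
DEV_BASE = "https://pz-server-dev.linea.org.br"


def product_url(internal_name: str, dev: bool = False) -> str:
    """Build the product page URL from an internal_name (not from the id)."""
    base = DEV_BASE if dev else PROD_BASE
    return f"{base}/product/{quote(internal_name, safe='')}"


print(product_url("6_simple_training_set", dev=True))
```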
Another feature of the PZ Server API is to let users retrieve the contents of a given data product to work with in memory (by assigning the result of the method get_product() to a variable in the code). This feature is available only for tabular data (product types: Spec-z Catalog and Training Set).
By default, the method get_product returns an object of a particular class, depending on the product's type. The classes SpeczCatalog and TrainingSet are simple extensions of pandas.DataFrame (via class composition) with a couple of additional attributes and methods, such as the attribute metadata and the method display_metadata(). Let's see an example:
catalog = pz_server.get_product(8)
catalog
Connecting to PZ Server... Done!
<pzserver.catalog.TrainingSet at 0x7f38a35e1030>
catalog.display_metadata()
key | value |
---|---|
id | 8 |
release | None |
product_type | Training Set |
uploaded_by | gschwend |
internal_name | 8_goldenspike_train_data_fits |
product_name | Goldenspike train data fits |
official_product | False |
pz_code | |
description | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in fits format. |
created_at | 2023-03-29T19:09:12.958883Z |
main_file | goldenspike_train_data.fits |
The tabular data is stored in the attribute data, which is a pandas.DataFrame.
catalog.data
| redshift | mag_u_lsst | mag_err_u_lsst | mag_g_lsst | mag_err_g_lsst | mag_r_lsst | mag_err_r_lsst | mag_i_lsst | mag_err_i_lsst | mag_z_lsst | mag_err_z_lsst | mag_y_lsst | mag_err_y_lsst |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.769521 | 26.496852 | 0.288986 | 25.863170 | 0.056997 | 24.729555 | 0.020702 | 23.610683 | 0.012011 | 23.143518 | 0.013714 | 22.915156 | 0.024561 |
1 | 1.088857 | 26.258727 | 0.237964 | 25.509524 | 0.041668 | 24.469344 | 0.016648 | 23.532860 | 0.011344 | 22.546680 | 0.008992 | 22.070255 | 0.012282 |
2 | 1.333098 | 25.373855 | 0.112257 | 24.943293 | 0.025359 | 24.524998 | 0.017431 | 24.013649 | 0.016486 | 23.733274 | 0.022315 | 23.102123 | 0.028906 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 0.986374 | 26.050653 | 0.200164 | 25.641624 | 0.046837 | 25.161078 | 0.030090 | 24.460152 | 0.024047 | 23.977239 | 0.027567 | 23.831974 | 0.055121 |
60 | 0.474281 | 27.048056 | 0.444683 | 26.428211 | 0.093854 | 24.839984 | 0.022755 | 24.209226 | 0.019403 | 23.855082 | 0.024787 | 23.507456 | 0.041329 |
61 | 0.561923 | 24.680480 | 0.061182 | 23.958609 | 0.011430 | 22.900135 | 0.006346 | 22.143581 | 0.005820 | 21.867563 | 0.006465 | 21.612692 | 0.008967 |
62 rows × 13 columns
type(catalog.data)
pandas.core.frame.DataFrame
It preserves the useful methods from pandas.DataFrame, such as:
catalog.data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 62 entries, 0 to 61
Data columns (total 13 columns):
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   redshift        62 non-null     >f8
 1   mag_u_lsst      61 non-null     >f8
 2   mag_err_u_lsst  61 non-null     >f8
 3   mag_g_lsst      62 non-null     >f8
 4   mag_err_g_lsst  62 non-null     >f8
 5   mag_r_lsst      62 non-null     >f8
 6   mag_err_r_lsst  62 non-null     >f8
 7   mag_i_lsst      62 non-null     >f8
 8   mag_err_i_lsst  62 non-null     >f8
 9   mag_z_lsst      62 non-null     >f8
 10  mag_err_z_lsst  62 non-null     >f8
 11  mag_y_lsst      61 non-null     >f8
 12  mag_err_y_lsst  61 non-null     >f8
dtypes: float64(13)
memory usage: 6.4 KB
catalog.data.describe()
| redshift | mag_u_lsst | mag_err_u_lsst | mag_g_lsst | mag_err_g_lsst | mag_r_lsst | mag_err_r_lsst | mag_i_lsst | mag_err_i_lsst | mag_z_lsst | mag_err_z_lsst | mag_y_lsst | mag_err_y_lsst |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 62.000000 | 61.000000 | 61.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 61.000000 | 61.000000 |
mean | 0.780298 | 25.446008 | 0.188050 | 24.820000 | 0.038182 | 24.003970 | 0.018770 | 23.384804 | 0.016165 | 23.074481 | 0.021478 | 22.932354 | 0.054682 |
std | 0.355365 | 1.269277 | 0.193747 | 1.314112 | 0.036398 | 1.387358 | 0.013750 | 1.381587 | 0.010069 | 1.400673 | 0.014961 | 1.540284 | 0.115875 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
50% | 0.764600 | 25.577029 | 0.133815 | 25.069970 | 0.028309 | 24.470215 | 0.016660 | 23.748506 | 0.013390 | 23.514185 | 0.018540 | 23.293384 | 0.034199 |
75% | 0.948494 | 26.263284 | 0.238859 | 25.705486 | 0.049576 | 24.985225 | 0.025802 | 24.488654 | 0.024650 | 24.165944 | 0.032557 | 23.993010 | 0.063585 |
max | 1.755764 | 28.482391 | 1.154073 | 27.296152 | 0.198195 | 26.036958 | 0.065360 | 24.949645 | 0.036932 | 24.693132 | 0.051883 | 27.342151 | 0.909230 |
8 rows × 13 columns
The section on product types below gives more details about these specific classes. For those who prefer working with astropy.table.Table or pure pandas.DataFrame, the method get_product() gives the flexibility to choose the output format (fmt="pandas" or fmt="astropy").
dataframe = pz_server.get_product(8, fmt="pandas")
print(type(dataframe))
dataframe
Connecting to PZ Server... <class 'pandas.core.frame.DataFrame'>
| redshift | mag_u_lsst | mag_err_u_lsst | mag_g_lsst | mag_err_g_lsst | mag_r_lsst | mag_err_r_lsst | mag_i_lsst | mag_err_i_lsst | mag_z_lsst | mag_err_z_lsst | mag_y_lsst | mag_err_y_lsst |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.769521 | 26.496852 | 0.288986 | 25.863170 | 0.056997 | 24.729555 | 0.020702 | 23.610683 | 0.012011 | 23.143518 | 0.013714 | 22.915156 | 0.024561 |
1 | 1.088857 | 26.258727 | 0.237964 | 25.509524 | 0.041668 | 24.469344 | 0.016648 | 23.532860 | 0.011344 | 22.546680 | 0.008992 | 22.070255 | 0.012282 |
2 | 1.333098 | 25.373855 | 0.112257 | 24.943293 | 0.025359 | 24.524998 | 0.017431 | 24.013649 | 0.016486 | 23.733274 | 0.022315 | 23.102123 | 0.028906 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 0.986374 | 26.050653 | 0.200164 | 25.641624 | 0.046837 | 25.161078 | 0.030090 | 24.460152 | 0.024047 | 23.977239 | 0.027567 | 23.831974 | 0.055121 |
60 | 0.474281 | 27.048056 | 0.444683 | 26.428211 | 0.093854 | 24.839984 | 0.022755 | 24.209226 | 0.019403 | 23.855082 | 0.024787 | 23.507456 | 0.041329 |
61 | 0.561923 | 24.680480 | 0.061182 | 23.958609 | 0.011430 | 22.900135 | 0.006346 | 22.143581 | 0.005820 | 21.867563 | 0.006465 | 21.612692 | 0.008967 |
62 rows × 13 columns
table = pz_server.get_product(8, fmt="astropy")
print(type(table))
table
Connecting to PZ Server... <class 'astropy.table.table.Table'>
redshift | mag_u_lsst | mag_err_u_lsst | mag_g_lsst | mag_err_g_lsst | mag_r_lsst | mag_err_r_lsst | mag_i_lsst | mag_err_i_lsst | mag_z_lsst | mag_err_z_lsst | mag_y_lsst | mag_err_y_lsst |
---|---|---|---|---|---|---|---|---|---|---|---|---|
float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 | float64 |
0.7695210576057434 | 26.49685173335998 | 0.28898640164514966 | 25.863170180148593 | 0.0569968492513252 | 24.72955523266535 | 0.020702469899475762 | 23.610683261247523 | 0.012011391457007867 | 23.14351797933142 | 0.013714272888189844 | 22.915156068508104 | 0.02456124411372624 |
1.0888570547103882 | 26.25872690364715 | 0.23796354746659837 | 25.50952422860369 | 0.041667922409552444 | 24.46934448716597 | 0.016647621314186963 | 23.532859983884297 | 0.011343729522451391 | 22.546679503178662 | 0.008992167497723039 | 22.070255473243666 | 0.01228199507795122 |
1.3330978155136108 | 25.373855139450704 | 0.11225669597772256 | 24.94329329099596 | 0.025358932801191274 | 24.52499778455543 | 0.01743053515568277 | 24.01364895511997 | 0.016486310070442982 | 23.73327434921557 | 0.022315314171620415 | 23.102123362449476 | 0.028905678864388565 |
0.721265435218811 | 25.99409631118909 | 0.1908826681547846 | 25.61777238197774 | 0.045858038831305514 | 25.005785747799642 | 0.026268063533916104 | 24.371285501987305 | 0.022273920520691406 | 24.221670678108204 | 0.034167014800438315 | 24.065810802830256 | 0.06782119456710035 |
0.5086992383003235 | 23.45564492849768 | 0.021183649676627364 | 22.154983461483162 | 0.005456197651685896 | 20.854221072900675 | 0.005054305345075906 | 20.25151778574991 | 0.005043240089602512 | 19.987932992458255 | 0.005076700720766547 | 19.7531794486966 | 0.00522219293990742 |
1.654597520828247 | 25.577029312797993 | 0.1338145411656938 | 25.357190234659992 | 0.036422068466217185 | 24.985364097317376 | 0.025805305576530817 | 24.619865947930514 | 0.027629428515503464 | 24.31542705984716 | 0.03711719103644559 | 23.99301047249068 | 0.06358486650868432 |
0.6302117109298706 | 26.29456973098354 | 0.24508980808616634 | 25.661960742379378 | 0.04768867750224801 | 24.970350788516424 | 0.025470686194542756 | 24.44359540978862 | 0.023705065840900583 | 24.38252568961461 | 0.03938805593592655 | 24.27431455145406 | 0.08154556925795062 |
0.9446004629135132 | 23.04439144957979 | 0.01520055564490648 | 22.859827028976415 | 0.006358416100559978 | 22.392031080275324 | 0.005598497145960838 | 21.763171712833255 | 0.005443871002437149 | 21.315606488304542 | 0.005611256900621502 | 21.078853697855322 | 0.006819135385098381 |
0.785059928894043 | 26.10350506888941 | 0.20920946155191722 | 25.640570562466237 | 0.046793502527176206 | 25.224572910770192 | 0.031816876817820854 | 24.570714814956105 | 0.026469463529125777 | 24.44276676325899 | 0.041547464844769 | 24.36359719897942 | 0.08821825914170406 |
0.6398977041244507 | 24.959361919922344 | 0.07816884731386628 | 24.244064605397742 | 0.014174781356760707 | 23.44601725615679 | 0.008066495888990537 | 22.766850651619972 | 0.0071680218663888115 | 22.484472832218465 | 0.008659665351347316 | 22.24640717531644 | 0.014065524898988742 |
0.7637529373168945 | 27.05244719531529 | 0.4461580569029143 | 26.33447472955853 | 0.08643568272158875 | 25.37302615791678 | 0.036269714237240215 | 24.57116347198508 | 0.026479813705039875 | 24.324487244772236 | 0.037415823835815064 | 24.268661439857713 | 0.08114000001493893 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
0.35827815532684326 | 26.986694355880676 | 0.4244749494226304 | 25.39650987146171 | 0.03770768150771031 | 24.08007097505154 | 0.012243013726166119 | 23.53253704696021 | 0.011341074343008767 | 23.20709039946786 | 0.014418211893728716 | 23.16932714470006 | 0.03066062433201604 |
0.8610506057739258 | 27.545130772053 | 0.6379664388863776 | 26.015261610781764 | 0.06521429873554471 | 24.028767419869993 | 0.011784216384769213 | 22.89099966825509 | 0.007607711020947721 | 22.04041162182893 | 0.006910416122961773 | 21.653444566410805 | 0.009196841638425917 |
1.3468643426895142 | 25.799056669111714 | 0.16183582800185503 | 25.172747798870194 | 0.030969628744991953 | 24.73176896763009 | 0.020741528746953675 | 24.24588972463209 | 0.020013745398497446 | 23.87326184529821 | 0.025181136250533236 | 23.301751883351493 | 0.034452507469982616 |
0.9497920870780945 | 23.69882557052796 | 0.026003860754992805 | 23.42078468199699 | 0.008110795059243107 | 22.710885142422338 | 0.005998309477644766 | 21.896858815109162 | 0.005550891391829063 | 21.27308173264384 | 0.005570915565717193 | 21.040724684461797 | 0.006716225485067447 |
0.7403952479362488 | 25.443020428241766 | 0.11919338915656763 | 24.960899785834137 | 0.025748364844749856 | 24.37755640900205 | 0.015447104969121968 | 23.64996369229125 | 0.012370124053356303 | 23.551387592243383 | 0.01911789566400906 | 23.447021025450375 | 0.03917395554632945 |
0.9094947576522827 | 23.47539996365302 | 0.021535240070639578 | 23.342457978167225 | 0.0077817484631403306 | 22.761771226542443 | 0.0060823223178512 | 21.944599210682128 | 0.0055950363419199995 | 21.445133373289764 | 0.005752266599951927 | 21.230035191915672 | 0.007284651334311665 |
0.9731865525245667 | 24.915322792667247 | 0.07520573265330482 | 24.460645127806607 | 0.016865228377118156 | 23.660006166694796 | 0.009149925971566488 | 22.893359188501577 | 0.00761678230670769 | 22.41079286888505 | 0.00829697412050601 | 27.34215088878518 | 0.9092302567884765 |
0.6099322438240051 | 24.6012726318867 | 0.05706459847822567 | 23.09359402404072 | 0.006932587215462636 | 21.69608131308399 | 0.005195645207561415 | 20.69557037495898 | 0.005082744472086711 | 20.38720846897639 | 0.005140138024108047 | 20.12785634011939 | 0.00540364900110484 |
0.9768770933151245 | -- | -- | 26.84622073667464 | 0.13506185165217507 | 25.7092891194238 | 0.04886766832783457 | 24.866576205074942 | 0.03431791453068037 | 24.048870122903367 | 0.029349601464304948 | 23.78405588758275 | 0.052825335046252773 |
0.9863744974136353 | 26.050653283784623 | 0.20016426980075472 | 25.641624009646335 | 0.04683719015216041 | 25.161078181218834 | 0.030089536757067062 | 24.460152414129137 | 0.024046729003518723 | 23.977239003621 | 0.027566781051099783 | 23.831973618634528 | 0.05512066706093889 |
0.4742807149887085 | 27.048056087407986 | 0.4446825063577354 | 26.428211280519175 | 0.09385433945963481 | 24.83998360318214 | 0.02275493531289512 | 24.2092260174936 | 0.01940261275081239 | 23.855082243159934 | 0.02478730171099941 | 23.507455929574288 | 0.041328512368478044 |
0.5619226694107056 | 24.680479530543163 | 0.061181531929665633 | 23.958608997973702 | 0.01142956636817526 | 22.900134967933102 | 0.006345869773581998 | 22.143580633270624 | 0.005819630970810428 | 21.867562849329406 | 0.006465480863342269 | 21.61269159453626 | 0.008966510628950788 |
Clean up
del search_results, catalog, dataframe, table
The PZ Server API provides Python classes with useful methods to handle particular product types. Let's recap the product types available:
pz_server.display_product_types()
Product type | Description |
---|---|
Spec-z Catalog | Catalog of spectroscopic redshifts and positions (usually equatorial coordinates). |
Training Set | Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts. |
Validation Results | Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set and photo-z validation metrics. |
Photo-z Table | Results of a photo-z estimation procedure. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (instructions on accessing the data should be provided in the description field. |
In the context of the PZ Server, Spec-z Catalogs are defined as any catalog containing spherical equatorial coordinates and spectroscopic redshift measurements (or, analogously, true redshifts from simulations). A Spec-z Catalog can include data from a single spectroscopic survey or a combination of data from several sources. To be considered a single Spec-z Catalog, the data should be provided as a single file to the PZ Server's upload tool. For multi-survey catalogs, it is recommended to add the survey name or identification as an extra column.
Mandatory columns:
float
float
float
Recommended columns:
float
integer, float, or string
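As a rough illustration of the requirements above, the sketch below combines two mock surveys into a single table, adds a survey identification column, and checks that three mandatory columns are present and floating-point. The column names RA, DEC, and Z are borrowed from the GAMA example shown in this notebook and are assumed here, not confirmed as the names the PZ Server requires. The SURVEY column name, the helper check_specz_columns, and the mock values are hypothetical and not part of the PZ Server API.

```python
import pandas as pd

# Assumed column names, borrowed from the GAMA example in this notebook.
MANDATORY_COLUMNS = ("RA", "DEC", "Z")


def check_specz_columns(catalog, mandatory=MANDATORY_COLUMNS):
    """Return a list of problems found in the mandatory columns (empty if none)."""
    problems = []
    for name in mandatory:
        if name not in catalog.columns:
            problems.append(f"missing column: {name}")
        elif not pd.api.types.is_float_dtype(catalog[name]):
            problems.append(f"column {name} is not float: {catalog[name].dtype}")
    return problems


# Two mock single-survey catalogs (made-up values, for illustration only).
survey_a = pd.DataFrame({"RA": [150.1, 150.2], "DEC": [2.1, 2.2], "Z": [0.31, 0.52]})
survey_b = pd.DataFrame({"RA": [34.5], "DEC": [-4.9], "Z": [1.02]})

# Tag each row with its survey of origin before combining into a single table.
survey_a["SURVEY"] = "survey_a"
survey_b["SURVEY"] = "survey_b"
combined = pd.concat([survey_a, survey_b], ignore_index=True)

problems = check_specz_columns(combined)
print(problems)
```

Saving the combined table with combined.to_csv("my_catalog.csv", index=False) would then produce the single file expected by the upload tool (the file name is arbitrary).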
Let's see an example of Spec-z Catalog:
gama = pz_server.get_product(14)
Connecting to PZ Server... Done!
gama.display_metadata()
key | value |
---|---|
id | 14 |
release | None |
product_type | Spec-z Catalog |
uploaded_by | gschwend |
internal_name | 14_gama_specz_subsample |
product_name | GAMA spec-z subsample |
official_product | False |
pz_code | |
description | A small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature. |
created_at | 2023-03-29T20:02:45.223568Z |
main_file | specz_subsample_gama_example.csv |
Display basic statistics
gama.data.describe()
| | ID | RA | DEC | Z | ERR_Z | FLAG_DES |
---|---|---|---|---|---|---|
count | 2.576000e+03 | 2576.000000 | 2576.000000 | 2576.000000 | 2576.0 | 2576.000000 |
mean | 1.105526e+06 | 154.526343 | -1.101865 | 0.224811 | 99.0 | 3.949534 |
std | 4.006668e+04 | 70.783868 | 2.995036 | 0.102571 | 0.0 | 0.218947 |
... | ... | ... | ... | ... | ... | ... |
50% | 1.103558e+06 | 180.140145 | -0.480830 | 0.217804 | 99.0 | 4.000000 |
75% | 1.140619e+06 | 215.836583 | 1.170363 | 0.291810 | 99.0 | 4.000000 |
max | 1.176440e+06 | 223.497080 | 2.998180 | 0.728717 | 99.0 | 4.000000 |
8 rows × 6 columns
The Spec-z Catalog object has a very basic plot method for quick visualization of catalog properties:
gama.plot()
The attribute data, which is a Pandas DataFrame, preserves the plot method from Pandas.
gama.data.plot(x="RA", y="DEC", kind="scatter")
<Axes: xlabel='RA', ylabel='DEC'>
In the context of the PZ Server, Training Sets are defined as the product of matching (spatially) a given Spec-z Catalog (single survey or compilation) to the photometric data, in this case, the LSST Objects Catalog. The PZ Server API offers a tool called Training Set Maker for users to build customized Training Sets based on the Spec-z Catalogs available. Please see the companion Jupyter Notebook pz_tsm_tutorial.ipynb for details.
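To make the idea of spatial matching concrete, here is a minimal brute-force sketch that pairs each spec-z object with its nearest photometric object within a search radius. It is not how the Training Set Maker is implemented. The function names, the 1 arcsec radius, and the mock coordinates are assumptions for illustration, and a real catalog would call for an indexed method rather than an all-pairs comparison.

```python
import numpy as np


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))


def nearest_match(specz_ra, specz_dec, phot_ra, phot_dec, radius_arcsec=1.0):
    """For each spec-z object, return the index of the nearest photometric
    object, or -1 if none lies within radius_arcsec."""
    specz_ra = np.asarray(specz_ra, dtype=float)[:, None]
    specz_dec = np.asarray(specz_dec, dtype=float)[:, None]
    phot_ra = np.asarray(phot_ra, dtype=float)[None, :]
    phot_dec = np.asarray(phot_dec, dtype=float)[None, :]
    # All-pairs separation matrix with shape (n_specz, n_phot).
    sep = angular_separation_deg(specz_ra, specz_dec, phot_ra, phot_dec)
    idx = np.argmin(sep, axis=1)
    nearest = sep[np.arange(sep.shape[0]), idx]
    return np.where(nearest * 3600.0 <= radius_arcsec, idx, -1)


# Mock coordinates in degrees (made-up values).
specz_ra, specz_dec = [150.0000, 150.1000], [2.0000, 2.1000]
phot_ra = [150.1001, 150.00005, 151.0]
phot_dec = [2.1300, 2.00005, 3.0]

matches = nearest_match(specz_ra, specz_dec, phot_ra, phot_dec)
print(matches)
```

In this mock case the first spec-z object has a photometric neighbor a fraction of an arcsecond away, while the second has none within the radius and is flagged with -1.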
Note 1: The training set is commonly split into two or more subsets for photo-z validation purposes. If the Training Set owner has previously defined which objects should belong to each subset (training and validation/test sets), this information must be available either as an extra column in the table or as clear instructions for reproducing the subset separation in the data product description.
Note 2: The PZ Server only supports catalog-level Training Sets. Image-based Training Sets, e.g., for deep-learning algorithms, are not supported yet.
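One way to satisfy Note 1 is to store the subset assignment in the table itself. The sketch below adds such a column using a seeded random split, so the separation can be reproduced exactly. The column name subset, the 80/20 fraction, the seed, and the mock values are assumptions, not a PZ Server convention.

```python
import numpy as np
import pandas as pd


def add_subset_column(table, train_fraction=0.8, seed=42, column="subset"):
    """Return a copy of table with a column labeling each row 'train' or 'test'."""
    rng = np.random.default_rng(seed)
    n_rows = len(table)
    n_train = int(round(train_fraction * n_rows))
    # Shuffle row positions and mark the first n_train of them as training rows.
    shuffled = rng.permutation(n_rows)
    labels = np.full(n_rows, "test", dtype=object)
    labels[shuffled[:n_train]] = "train"
    result = table.copy()
    result[column] = labels
    return result


# Mock training set (made-up values, for illustration only).
mock = pd.DataFrame({
    "mag_i_lsst": np.linspace(20.0, 25.0, 10),
    "redshift": np.linspace(0.1, 1.0, 10),
})
labeled = add_subset_column(mock)
print(labeled["subset"].value_counts())
```

Because the generator is seeded, calling the function again with the same arguments yields the same labels, which is what makes the split reproducible by other users.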
Mandatory column:
float
Other expected columns
integer
float
float
float
float
integer, float, or string
integer, float, or string
train_goldenspike = pz_server.get_product(9)
Connecting to PZ Server... Done!
! conda list h5
# packages in environment at /home/julia/anaconda3/envs/pz-lib:
#
# Name Version Build Channel
h5py 3.8.0 pypi_0 pypi
train_goldenspike.display_metadata()
key | value |
---|---|
id | 9 |
release | None |
product_type | Training Set |
uploaded_by | gschwend |
internal_name | 9_goldenspike_train_data_hdf5 |
product_name | Goldenspike train data hdf5 |
official_product | False |
pz_code | |
description | A mock training set created using the example notebook goldenspike.ipynb available in RAIL's repository. Test upload of files in hdf5 format. |
created_at | 2023-03-29T19:12:59.746096Z |
main_file | goldenspike_train_data.hdf5 |
Display basic statistics
train_goldenspike.data.describe()
| | mag_err_g_lsst | mag_err_i_lsst | mag_err_r_lsst | mag_err_u_lsst | mag_err_y_lsst | mag_err_z_lsst | mag_g_lsst | mag_i_lsst | mag_r_lsst | mag_u_lsst | mag_y_lsst | mag_z_lsst | redshift |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 62.000000 | 62.000000 | 62.000000 | 61.000000 | 61.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 61.000000 | 61.000000 | 62.000000 | 62.000000 |
mean | 0.038182 | 0.016165 | 0.018770 | 0.188050 | 0.054682 | 0.021478 | 24.820000 | 23.384804 | 24.003970 | 25.446008 | 22.932354 | 23.074481 | 0.780298 |
std | 0.036398 | 0.010069 | 0.013750 | 0.193747 | 0.115875 | 0.014961 | 1.314112 | 1.381587 | 1.387358 | 1.269277 | 1.540284 | 1.400673 | 0.355365 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
50% | 0.028309 | 0.013390 | 0.016660 | 0.133815 | 0.034199 | 0.018540 | 25.069970 | 23.748506 | 24.470215 | 25.577029 | 23.293384 | 23.514185 | 0.764600 |
75% | 0.049576 | 0.024650 | 0.025802 | 0.238859 | 0.063585 | 0.032557 | 25.705486 | 24.488654 | 24.985225 | 26.263284 | 23.993010 | 24.165944 | 0.948494 |
max | 0.198195 | 0.036932 | 0.065360 | 1.154073 | 0.909230 | 0.051883 | 27.296152 | 24.949645 | 26.036958 | 28.482391 | 27.342151 | 24.693132 | 1.755764 |
8 rows × 13 columns
Quick visualization of training set properties:
train_goldenspike.plot(mag_name="mag_i_lsst")
Validation Results are the outputs of any photo-z algorithm applied to a Validation Set. The format and number of files of this data product depend strongly on the algorithm used to create it, so there are no constraints on either. If the product consists of multiple files, for instance, when the user includes the results of training procedures (e.g., neural network weights, decision tree files, or any machine learning by-product) or additional files (SED templates, filter transmission curves, theoretical magnitude grids, Bayesian priors, etc.), all files must be put together in a single compressed file (.zip, .tar, or .tar.gz) before uploading it to the Photo-z Server.
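Bundling several output files into one archive can be done with Python's standard library. The sketch below writes a few placeholder files and packs them into a single .tar.gz. The helper name, file names, and contents are made up for illustration.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_files(paths, archive_path):
    """Pack the given files into a single gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in paths:
            # Store each file under its base name, without the local directory.
            archive.add(path, arcname=Path(path).name)
    return Path(archive_path)


# Placeholder files standing in for real validation outputs (hypothetical names).
workdir = Path(tempfile.mkdtemp())
files = []
for name, content in [("photoz_estimates.csv", "id,z_phot\n1,0.52\n"),
                      ("metrics.txt", "placeholder metrics\n")]:
    path = workdir / name
    path.write_text(content)
    files.append(path)

archive_path = bundle_files(files, workdir / "validation_results.tar.gz")

# Reopen the archive to confirm that both files were included.
with tarfile.open(archive_path, "r:gz") as archive:
    member_names = sorted(archive.getnames())
print(member_names)
```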
pz_server.display_products_list(filters={"product_type": "Validation Results"})
id | internal_name | product_name | product_type | release | uploaded_by | official_product | pz_code | description | created_at |
---|---|---|---|---|---|---|---|---|---|
12 | 12_goldenspike_knn | Goldenspike KNN | Validation Results | None | gschwend | False | KNN | Results of photoz validation using KNN on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:49:35.652295Z |
11 | 11_goldenspike_flexzboost | Goldenspike FlexZBoost | Validation Results | None | gschwend | False | FlexZBoost | Results of photoz validation using FlexZBoost on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:48:34.864629Z |
10 | 10_goldenspike_bpz | Goldenspike BPZ | Validation Results | LSST DP0 | gschwend | False | BPZ | Results of photoz validation using BPZ on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. | 2023-03-29T19:42:04.424990Z |
pz_server.display_product_metadata("11_goldenspike_flexzboost")
key | value |
---|---|
id | 11 |
release | None |
product_type | Validation Results |
uploaded_by | gschwend |
internal_name | 11_goldenspike_flexzboost |
product_name | Goldenspike FlexZBoost |
official_product | False |
pz_code | FlexZBoost |
description | Results of photoz validation using FlexZBoost on a mock test set from the example notebook goldenspike.ipynb available in RAIL's repository. |
created_at | 2023-03-29T19:48:34.864629Z |
main_file | pz_valid_fzboost.tar.gz |
This product type is not necessarily (only) tabular data and can be a list of files. The get_product method shown above only loads data into memory and supports only single tabular files. To retrieve Photo-z Validation Results, you must download the data and open it locally.
# pz_server.download_product(11, save_in=".")
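Once a compressed product has been downloaded, it can be inspected with the standard library before extracting anything. In the sketch below a small stand-in archive is built first so that the example runs on its own. With a real download, only the reading part applies, and the path would be wherever the file was saved. The helper name and file names are hypothetical.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_archive(archive_path):
    """Return (name, size in bytes) for every regular file in a tar archive."""
    with tarfile.open(archive_path, "r:*") as archive:
        return [(m.name, m.size) for m in archive.getmembers() if m.isfile()]


# Build a stand-in archive (a real one would come from download_product).
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "stand_in_product.tar.gz"
payload = b"id,z_phot\n1,0.52\n"
with tarfile.open(archive_path, "w:gz") as archive:
    info = tarfile.TarInfo(name="estimates.csv")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

contents = list_archive(archive_path)
print(contents)

# Read one member directly from the archive without unpacking it to disk.
with tarfile.open(archive_path, "r:*") as archive:
    text = archive.extractfile("estimates.csv").read().decode()
print(text)
```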
Photo-z Tables are the results of photo-z estimation on photometric samples. The data format is usually tabular and might vary according to the photo-z estimation method used.
The size limit for uploading files to the PZ Server is 200MB, so it does not support large Photo-z Tables such as the photo-zs of the LSST Objects catalog. The PZ Server can host small Photo-z Tables or, for large datasets, a data product can be registered containing only the Photo-z Table's metadata. In these cases, the instructions to find and access the data must be provided in the product's description.
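A quick local check can tell whether a file fits under the upload limit before attempting an upload. The sketch below assumes that 200MB means 200 * 1024 * 1024 bytes, which the text does not specify. The helper name is hypothetical.

```python
import os
import tempfile

# Assumption: "200MB" is interpreted as 200 * 1024 * 1024 bytes.
UPLOAD_LIMIT_BYTES = 200 * 1024 * 1024


def fits_upload_limit(path, limit_bytes=UPLOAD_LIMIT_BYTES):
    """Return True if the file at path is no larger than limit_bytes."""
    return os.path.getsize(path) <= limit_bytes


# Small temporary file used only to demonstrate the check.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as handle:
    handle.write(b"id,z_phot\n1,0.52\n")
    sample_path = handle.name

small_enough = fits_upload_limit(sample_path)
too_big_for_tiny_limit = fits_upload_limit(sample_path, limit_bytes=4)
print(small_enough, too_big_for_tiny_limit)
os.remove(sample_path)
```

If the check fails for a real Photo-z Table, the product can still be registered with metadata only, as described above.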
# pz_server.download_product(<id number or internal name>)
Is something important missing? Click here to open an issue in the PZ Server library repository on GitHub.