msmu._preprocessing._summarise._summariser

Aggregator

Aggregator(identification_df, quantification_df, decoy_df, agg_method, score_method)

Base class for aggregating identification and quantification data.

peptide classmethod

peptide(identification_df, quantification_df, decoy_df, agg_method, score_method, protein_col, peptide_col)

Create a peptide-level aggregator.

protein classmethod

protein(identification_df, quantification_df, decoy_df, agg_method, score_method, protein_col)

Create a protein-level aggregator.

ptm_site classmethod

ptm_site(identification_df, quantification_df, agg_method)

Create a PTM site-level aggregator.

FeatureRanker

Ranking methods for selecting top features based on quantification data.

max_intensity staticmethod

max_intensity(identification_df, quantification_df, col_to_groupby)

Rank features based on maximum intensity across all samples.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| identification_df | DataFrame | DataFrame containing feature identifications. | required |
| quantification_df | DataFrame | DataFrame containing feature quantifications. | required |
| col_to_groupby | str | Column name to group by for ranking. | required |

Returns:

| Type | Description |
| --- | --- |
| pd.DataFrame | DataFrame with added 'rank_score' and 'rank' columns. |
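As a rough illustration of the ranking idea described above, the following pandas sketch scores each feature by its maximum intensity and ranks features within each group. It is not msmu's implementation: the helper name `rank_by_max_intensity`, the assumption that both DataFrames share the same feature index, the within-group ranking, and the tie-breaking rule are all assumptions made for this example.

```python
import pandas as pd


def rank_by_max_intensity(identification_df, quantification_df, col_to_groupby):
    """Hypothetical stand-in for illustration; not the msmu implementation."""
    ranked = identification_df.copy()
    # Score each feature by its highest intensity across samples (NaN is skipped).
    ranked["rank_score"] = quantification_df.max(axis=1)
    # Rank within each group; rank 1 is the most intense feature of that group.
    ranked["rank"] = ranked.groupby(col_to_groupby)["rank_score"].rank(
        method="first", ascending=False
    )
    return ranked


# Toy data: rows are features (peptides), columns are samples.
identification_df = pd.DataFrame(
    {"protein": ["P1", "P1", "P2"]}, index=["pepA", "pepB", "pepC"]
)
quantification_df = pd.DataFrame(
    {"s1": [10.0, 30.0, 5.0], "s2": [20.0, 15.0, None]},
    index=["pepA", "pepB", "pepC"],
)

ranked = rank_by_max_intensity(identification_df, quantification_df, "protein")
print(ranked)
```

In this toy data, pepB outranks pepA within protein P1 because its maximum intensity (30) is higher than pepA's (20).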

median_intensity staticmethod

median_intensity(identification_df, quantification_df, col_to_groupby)

Rank features based on median intensity across all samples.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| identification_df | DataFrame | DataFrame containing feature identifications. | required |
| quantification_df | DataFrame | DataFrame containing feature quantifications. | required |
| col_to_groupby | str | Column name to group by for ranking. | required |

Returns:

| Type | Description |
| --- | --- |
| pd.DataFrame | DataFrame with added 'rank_score' and 'rank' columns. |

total_intensity staticmethod

total_intensity(identification_df, quantification_df, col_to_groupby)

Rank features based on total intensity across all samples.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| identification_df | DataFrame | DataFrame containing feature identifications. | required |
| quantification_df | DataFrame | DataFrame containing feature quantifications. | required |
| col_to_groupby | str | Column name to group by for ranking. | required |

Returns:

| Type | Description |
| --- | --- |
| pd.DataFrame | DataFrame with added 'rank_score' and 'rank' columns. |
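The three ranking statistics (maximum, median, total) can disagree on which feature comes first. The short pandas sketch below uses invented toy intensities to show this; it only illustrates the statistics themselves and does not call msmu.

```python
import pandas as pd

# Toy intensities: pepA has one very high value, pepB is consistently moderate.
quant = pd.DataFrame(
    {"s1": [100.0, 40.0], "s2": [1.0, 40.0], "s3": [1.0, 40.0]},
    index=["pepA", "pepB"],
)

# One candidate score per feature for each statistic, computed across samples.
scores = pd.DataFrame(
    {
        "max": quant.max(axis=1),
        "median": quant.median(axis=1),
        "total": quant.sum(axis=1),
    }
)

# Feature with the highest score under each statistic.
top_feature = scores.idxmax()
print(scores)
print(top_feature)
```

Here the maximum favours pepA, while the median and the total favour pepB, so the choice of ranking method matters when a feature has a single outlying intensity.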

PtmSummarisationPrep

PtmSummarisationPrep(adata, modi_identifier, fasta)

Bases: SummarisationPrep

Preparation steps for PTM site summarisation:

1. Filter the data to keep only modified peptides that carry modi_identifier.
2. Get the modified sites from each peptide.
3. Label the peptide site.
4. Explode the data to single proteins so that protein sites can be labelled.
5. Label the protein site on each single protein.
6. Wrap the single proteins back up into a single protein group.
7. Group by modified peptide and its peptide site.
8. Merge the data with the peptide values, indexed by peptide.
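The first four steps above can be sketched with plain pandas. Everything specific in this example is an assumption made for illustration: the column names `modified_peptide` and `proteins`, the bracketed modification notation, the `;` separator for protein groups, and the helper `peptide_sites`. It also assumes that `modi_identifier` is the only modification annotation in the peptide string. Labelling protein sites (step 5 onward) needs the protein sequences from the FASTA file and is not shown.

```python
import pandas as pd

modi_identifier = "[Phospho]"  # hypothetical identifier string

peptides = pd.DataFrame(
    {
        "modified_peptide": ["AS[Phospho]PK", "LLEK", "T[Phospho]GS[Phospho]R"],
        "proteins": ["P1;P2", "P1", "P3"],  # assumed ';'-separated protein group
    }
)


def peptide_sites(modified_peptide, identifier):
    """Return labels such as 'S2' for residues directly followed by identifier."""
    sites = []
    position = 0  # 1-based position in the unmodified sequence
    last_residue = ""
    i = 0
    while i < len(modified_peptide):
        if modified_peptide.startswith(identifier, i):
            sites.append(f"{last_residue}{position}")
            i += len(identifier)
        else:
            last_residue = modified_peptide[i]
            position += 1
            i += 1
    return sites


# Step 1: keep only peptides that carry the modification.
mask = peptides["modified_peptide"].str.contains(modi_identifier, regex=False)
modified = peptides[mask].copy()

# Steps 2-3: find the modified sites and label them, one row per peptide site.
modified["peptide_site"] = [
    peptide_sites(p, modi_identifier) for p in modified["modified_peptide"]
]
per_site = modified.explode("peptide_site").reset_index(drop=True)

# Step 4: explode each protein group into single proteins.
per_site["protein"] = per_site["proteins"].str.split(";")
per_protein = per_site.explode("protein").reset_index(drop=True)
print(per_protein[["modified_peptide", "peptide_site", "protein"]])
```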

label_ptm_site

label_ptm_site(data)

Label the PTM site on each single protein and arrange the data by peptide and peptide site.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | Peptide data from msmu mudata['peptide'] | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| ptm_data | DataFrame | PTM data arranged by peptide - peptide site |

Scorer

Scorer(pep)

Scoring methods for aggregating PSM scores to peptide/protein scores.

picked_pep property

picked_pep

The aggregated PEP value.

picked_score property

picked_score

The −log10 transformed score.
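For orientation, a −log10 transform maps a small probability to a large positive score. The sketch below assumes that the transformed value is the PEP (posterior error probability); the function name `pep_to_score` is hypothetical and not part of msmu.

```python
import math


def pep_to_score(pep):
    """Hypothetical helper: -log10 of a PEP in the interval (0, 1]."""
    # math.log10 raises ValueError for pep <= 0, so a PEP of exactly 0 must be
    # handled separately (for example by clipping) in real data.
    return -math.log10(pep)


print(pep_to_score(0.01))   # a PEP of 1% gives a score of about 2
print(pep_to_score(0.001))  # a PEP of 0.1% gives a score of about 3
```

A lower PEP therefore yields a higher score, which keeps "higher is better" semantics when scores are compared or sorted.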

best_pep classmethod

best_pep(values)

Factory for best PEP aggregation.

func classmethod

func(method)

Return a pure function that returns numeric PEPs (for pandas .agg).
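A minimal sketch of how a pure aggregation function can be passed to pandas `.agg` follows. It assumes that "best PEP" means the lowest PEP in each group; the factory `make_pep_aggregator`, the method name string, and the column names are hypothetical and do not reflect msmu's actual code.

```python
import pandas as pd


def make_pep_aggregator(method):
    """Hypothetical factory returning a plain function usable with .agg."""
    if method == "best_pep":
        # Assumption: the best PEP is the smallest one in the group.
        return lambda values: float(values.min())
    raise ValueError(f"Unknown scoring method: {method}")


# Toy PSM table: two PSMs for peptide A, one for peptide B.
psms = pd.DataFrame(
    {"peptide": ["A", "A", "B"], "pep": [0.01, 0.2, 0.05]}
)

# Aggregate PSM-level PEPs to one numeric PEP per peptide.
peptide_pep = psms.groupby("peptide")["pep"].agg(make_pep_aggregator("best_pep"))
print(peptide_pep)
```

Returning plain floats rather than wrapper objects keeps the aggregated column numeric, so it can be filtered or transformed directly afterwards.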

SummarisationPrep

SummarisationPrep(adata, col_to_groupby, has_decoy)

Preparation steps for summarisation.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| mdata | MuData | MuData object containing feature-level data. |
| filter_dict | dict | Dictionary specifying filtering criteria. |
| rank_dict | dict | Dictionary specifying ranking criteria. |