Metadata-Version: 2.4
Name: awnf
Version: 0.1.3
Summary: A package for adaptive weighted similarity network fusion
Author: Sevinj Yolchuyeva
Author-email: sevinj.yolchuyeva@crchudequebec.ulaval.ca
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: scikit-learn
Requires-Dist: boruta
Requires-Dist: snfpy
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary


# Adaptive Weighted Network Fusion (AWNF): A flexible framework for integrating multi-modal data

**AWNF** is a Python package that implements **Similarity Network Fusion (SNF)** with an **adaptive weighting mechanism**. It is intended for computational biology, machine learning, and data science tasks that involve multi-view data, such as genomics, imaging, and clinical data.

The package helps integrate heterogeneous data sources into a single, unified similarity network, which can be used for predictive modeling and analysis.


---

## Installation

To install the **awnf** package, you can use **pip**.

### Install via PyPI (if published)
```bash
pip install awnf
```
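
After installing, it can be useful to confirm which version is visible to the current interpreter. The snippet below is a minimal sketch that assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is hypothetical and not part of **awnf**.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is not found.

    Hypothetical helper for illustration; it is not part of awnf.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed awnf version, or None if awnf is not installed
print(installed_version("awnf"))
```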

## Project links

The source code is available on [GitHub](https://github.com/Manem-Lab/AWNF/tree/main).

## Usage

Once installed, you can use the **awnf** package by importing its functions into your Python scripts. Below is an example usage:

### Example
```python

# Import the necessary functions from the 'awnf' package and other libraries
from awnf import feature_selection, make_affinity_with_weight, SNF_modality_weights, process_feature_weights_and_mad
import pandas as pd
import numpy as np
from sklearn.datasets import make_classification

# Example input data for affinity matrix calculation
# First, generate a synthetic dataset to demonstrate feature selection and affinity matrix calculation

# Generate a classification dataset with 100 samples, 50 features, and 2 informative features
# The dataset will have 2 classes (target variable)
X, y = make_classification(n_samples=100, n_features=50, n_informative=2, n_classes=2, random_state=42)

# Convert the data into a pandas DataFrame for easier manipulation and inspection
X_df1 = pd.DataFrame(X, columns=[f'Feat_mod1_{i+1}' for i in range(X.shape[1])])
y_df1 = pd.DataFrame(y, columns=['Target'])

# Generate another classification dataset with a different set of features
X, y = make_classification(n_samples=100, n_features=60, n_informative=2, n_classes=2, random_state=42)

# Convert the second dataset to a pandas DataFrame for ease of manipulation
X_df2 = pd.DataFrame(X, columns=[f'Feat_mod2_{i+1}' for i in range(X.shape[1])])
y_df2 = pd.DataFrame(y, columns=['Target'])

# Display the first few rows of the first dummy dataset (X_df1 and y_df1)
print(X_df1.head())
print(y_df1.head())

# Perform feature selection using the Boruta algorithm for the first dataset (X_df1)
num_features = 10  # Specify the number of features to select
selected_genes1, feature_ranks1 = feature_selection(X_df1, np.ravel(y_df1), num_features=num_features, n_estimators=100)

# Display the feature ranks for the first dataset
print(feature_ranks1)

# Perform feature selection on the second dataset (X_df2)
num_features = 18  # Specify the number of features to select
selected_genes2, feature_ranks2 = feature_selection(X_df2, np.ravel(y_df2), num_features=num_features, n_estimators=50)

# Display the feature ranks for the second dataset
print(feature_ranks2)

# Update the feature sets for both datasets by selecting only the top-ranked features
X_df1 = X_df1[selected_genes1]
X_df2 = X_df2[selected_genes2]

# Process the feature weights and calculate the feature importance
# The 'process_feature_weights_and_mad' function is assumed to calculate weights based on feature ranks
sorted_weights = process_feature_weights_and_mad(
    X_v2_list=[X_df1, X_df2],  # List of feature datasets
    feature_ranks_list=[feature_ranks1, feature_ranks2],  # Corresponding feature ranks
    betta=0.5,  # A parameter controlling the weight scaling (assumed)
)

# Display the first set of weights (for the first dataset)
print(sorted_weights[0])

# Generate the similarity (affinity) matrices for each dataset using the feature weights
similarity_view1_w = make_affinity_with_weight(X_df1, weight=sorted_weights[0]['feature_weight'].to_list())
similarity_view2_w = make_affinity_with_weight(X_df2, weight=sorted_weights[1]['feature_weight'].to_list())

# Combine the similarity matrices from both datasets using weighted SNF (Similarity Network Fusion)
# We are assigning different weights to the modalities (views) based on their importance
fused_network = SNF_modality_weights([similarity_view1_w, similarity_view2_w], weight_modality=[0.8, 0.2])

# Print the resulting fused network, which combines information from both datasets
print('fused_network', fused_network)
```
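
One possible downstream use of a fused network is unsupervised grouping of samples. The sketch below assumes the fused network is a square, non-negative, sample-by-sample affinity matrix and passes it to scikit-learn's `SpectralClustering` with `affinity="precomputed"`. The helper `cluster_fused_network` and the synthetic `demo_network` are hypothetical stand-ins for illustration and are not part of **awnf**; in practice you would pass the `fused_network` computed above.

```python
import numpy as np
from sklearn.cluster import SpectralClustering


def cluster_fused_network(fused, n_clusters=2, random_state=0):
    """Cluster samples from a precomputed affinity matrix.

    Hypothetical helper for illustration; it is not part of awnf.
    """
    fused = np.asarray(fused, dtype=float)
    # Symmetrize defensively, since spectral clustering expects a symmetric affinity
    affinity = (fused + fused.T) / 2.0
    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity="precomputed",
        random_state=random_state,
    )
    return model.fit_predict(affinity)


# Synthetic stand-in for a fused network: two well-separated groups of 10 samples
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(10, 2)),
    rng.normal(3.0, 0.3, size=(10, 2)),
])
# Gaussian kernel on squared Euclidean distances gives a symmetric affinity matrix
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
demo_network = np.exp(-sq_dists / 2.0)

labels = cluster_fused_network(demo_network, n_clusters=2)
print(labels)
```

The cluster labels are arbitrary integers, so compare groupings rather than label values when checking results against known classes.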


## AWNF is developed based on the following code repositories

1. [SNFpy](https://github.com/rmarkello/snfpy)
2. [boruta_py](https://github.com/scikit-learn-contrib/boruta_py)

---

## Authors

**Sevinj Yolchuyeva, Venkata Manem**  
Email: [sevinj.yolchuyeva@crchudequebec.ulaval.ca](mailto:sevinj.yolchuyeva@crchudequebec.ulaval.ca), [venkata.manem@crchudequebec.ulaval.ca](mailto:venkata.manem@crchudequebec.ulaval.ca)


## Citation
Will be published soon.
