Example 2: Weighted Gene Co-expression Network Analysis (WGCNA) Workflow with Graph Neural Network (GNN) Embeddings
========================================================================================================================

This tutorial walks through a complete workflow that uses **WGCNA** to build a network from multi-omics data, generates **GNN-based embeddings** from that network, and integrates the embeddings into subject-level omics data for downstream analysis.

**Workflow Overview:**

1. **Network Construction (WGCNA):** Builds a network from multi-omics data using `WGCNA`. The resulting adjacency matrix represents the relationships between features.
2. **GNN-Based Embedding Generation:** Uses a Graph Neural Network (GNN) to create embeddings from the constructed network, capturing the relationships between features.
3. **Subject Representation Integration:** Integrates the generated embeddings into subject-level omics data, enriching the dataset for downstream analyses such as clustering or disease prediction.

**Step-by-Step Guide:**

1. **Set Up Input Data:**

   - Prepare your omics data (`omics_data`), phenotype data (`phenotype_data`), and clinical data (`clinical_data`) as pandas DataFrames or Series.
   - Load or create these data structures within your application or script.

2. **Run WGCNA Workflow:**

   .. literalinclude:: ../examples/example_2.py
      :language: python
      :lines: 23-33
      :caption: Running WGCNA to generate the adjacency matrix.

   This step instantiates the `WGCNA` class and generates an adjacency matrix from your multi-omics and phenotype data.

3. **Run GNN Embedding Generation:**

   .. literalinclude:: ../examples/example_2.py
      :language: python
      :lines: 35-51
      :caption: Generating GNN Embeddings from the Adjacency Matrix.

   This step computes correlation-based node features and uses a GNN to generate embeddings from the adjacency matrix.

4. **Integrate Embeddings into Subject Representation:**

   .. literalinclude:: ../examples/example_2.py
      :language: python
      :lines: 53-61
      :caption: Integrating GNN Embeddings into Subject-Level Omics Data.

   Here, the generated embeddings are integrated into the subject-level omics data, enriching the dataset for downstream analyses such as clustering or disease prediction.

5. **Complete Workflow Execution:**

   .. literalinclude:: ../examples/example_2.py
      :language: python
      :lines: 63-83
      :caption: Complete WGCNA Workflow Execution with Sample Data.

   This step runs the full workflow on sample data. It initializes the input data, runs the WGCNA workflow, and outputs the enhanced omics data integrated with GNN embeddings.

**Running the Example:**

The complete script is shown below.

.. literalinclude:: ../examples/example_2.py
   :language: python
   :caption: Complete WGCNA Workflow Execution with Sample Data.

Upon successful execution, you will find:

- **Adjacency Matrix**: Generated by WGCNA, stored as a DataFrame.
- **GNN Embeddings**: Created using GNNs, stored as `embeddings_df`.
- **Enhanced Omics Data**: Subject-level data enriched with embeddings, stored as `enhanced_omics_data`.

**Result Interpretation:**

- **Adjacency Matrix**: Represents the network constructed from the multi-omics data, indicating which features are connected and how strongly.
- **GNN Embeddings**: Numerical representations of the network nodes that capture both the network structure and the node features, for use in downstream analyses.
- **Enhanced Omics Data**: Combines the original omics data with embedding information, providing a richer dataset for downstream tasks such as clustering or predictive modeling.
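
**Conceptual Sketch of the Three Stages:**

To make the shapes of the three outputs concrete, the following is a minimal sketch that uses only NumPy and pandas. It does not use the library's `WGCNA` class or a trained GNN, and it is not the code in `example_2.py`. The helper names (`soft_threshold_adjacency`, `propagate_features`, `integrate_embeddings`), the soft-threshold power of 6, the single untrained graph-convolution step standing in for a GNN, and the integration rule (scaling each feature by the mean of its embedding) are all assumptions chosen for illustration.

.. code-block:: python

   import numpy as np
   import pandas as pd


   def soft_threshold_adjacency(omics_data: pd.DataFrame, power: int = 6) -> pd.DataFrame:
       """Unsigned WGCNA-style adjacency: |Pearson correlation| ** power."""
       corr = omics_data.corr(method="pearson")
       return corr.abs() ** power


   def propagate_features(adjacency: pd.DataFrame, node_features: pd.DataFrame,
                          embedding_dim: int = 2, seed: int = 0) -> pd.DataFrame:
       """One untrained graph-convolution step: tanh(D^-1/2 A D^-1/2 X W).

       A stand-in for a GNN; W is random, so nothing is learned here.
       """
       a = adjacency.to_numpy(dtype=float)
       # The adjacency diagonal is already 1, so self-loops are included.
       d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
       a_norm = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
       rng = np.random.default_rng(seed)
       w = rng.normal(size=(node_features.shape[1], embedding_dim))
       h = np.tanh(a_norm @ node_features.to_numpy(dtype=float) @ w)
       columns = [f"emb_{k}" for k in range(embedding_dim)]
       return pd.DataFrame(h, index=adjacency.index, columns=columns)


   def integrate_embeddings(omics_data: pd.DataFrame, embeddings_df: pd.DataFrame) -> pd.DataFrame:
       """Assumed integration rule: scale each feature by its mean embedding value."""
       weights = embeddings_df.mean(axis=1)  # one scalar per feature (node)
       return omics_data.mul(weights, axis="columns")


   # Synthetic sample data: 20 subjects, 5 omics features, 3 clinical variables.
   rng = np.random.default_rng(42)
   omics_data = pd.DataFrame(rng.normal(size=(20, 5)),
                             columns=[f"gene_{i}" for i in range(5)])
   phenotype_data = pd.Series(rng.normal(size=20), name="phenotype")
   clinical_data = pd.DataFrame(rng.normal(size=(20, 3)),
                                columns=[f"clin_{i}" for i in range(3)])

   # Stage 1: feature-by-feature adjacency matrix.
   adjacency = soft_threshold_adjacency(omics_data)

   # Stage 2: node features are correlations of each omics feature with each trait.
   traits = clinical_data.assign(phenotype=phenotype_data)
   node_features = pd.DataFrame(
       {col: omics_data.corrwith(traits[col]) for col in traits.columns}
   ).reindex(adjacency.index)
   embeddings_df = propagate_features(adjacency, node_features)

   # Stage 3: subject-level data reweighted by the embeddings.
   enhanced_omics_data = integrate_embeddings(omics_data, embeddings_df)

   print(adjacency.shape, embeddings_df.shape, enhanced_omics_data.shape)

In this sketch the adjacency matrix is feature-by-feature, the embeddings have one row per feature, and the enhanced data keeps the original subject-by-feature shape. The actual workflow may differ in all three respects, so treat the sketch only as a guide to how the stages connect.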