Introduction
Cancer remains one of the leading causes of mortality worldwide, yet advances in personalized medicine and artificial intelligence are fundamentally transforming how physicians predict patient survival and recommend treatment strategies. Traditional prognostic approaches rely on limited clinical variables and single-source data, often missing the complex biological heterogeneity that characterizes modern cancer. Recent breakthroughs in computational pathology and multimodal machine learning are changing this landscape.
Researchers have developed M³Surv, a sophisticated artificial intelligence framework that integrates histopathological images with comprehensive genomic, transcriptomic, and proteomic data to deliver unprecedented accuracy in cancer survival prediction. What makes this innovation particularly significant is its ability to function effectively even when critical data is missing—a real-world challenge that has long plagued multimodal AI systems in clinical settings.
This comprehensive article explores the groundbreaking methodology behind M³Surv, its clinical implications, and why this technology represents a paradigm shift in precision oncology.
Understanding the Challenge: Why Traditional Survival Prediction Falls Short
The Complexity of Cancer Heterogeneity
Cancer is inherently heterogeneous. Two patients diagnosed with the same cancer type can exhibit vastly different molecular profiles, treatment responses, and survival outcomes. Traditional survival prediction models often rely on basic clinical factors—tumor grade, size, and stage—which provide only a superficial understanding of disease biology.
Modern cancer research has revealed that patient prognosis depends on intricate interactions between:
- Tumor morphology (cellular architecture visible in tissue slides)
- Gene expression patterns (which genes are active)
- Genomic mutations (DNA-level alterations)
- Protein abundance (functional molecular components)
The fundamental problem: No single data source captures all this information. Histopathology images reveal cellular organization but lack molecular details. Gene sequencing provides molecular insights but misses morphological context. Survival prediction requires synthesizing all these dimensions simultaneously.
The Missing Data Problem in Real-World Clinics
Hospital laboratories face genuine resource constraints. Preparing Fresh-Frozen (FF) tissue samples takes 15 minutes, while Formalin-Fixed Paraffin-Embedded (FFPE) processing requires approximately 36 hours. Genetic sequencing can be expensive, transcriptomic analysis demanding, and proteomic profiling resource-intensive. In practice, not every patient has complete multimodal data available.
Existing multimodal AI systems break down when data is incomplete. They rely on inter-modality correlations that disappear when an entire data type is missing. This creates a critical clinical gap: the most sophisticated predictive tools become unusable precisely when circumstances prevent complete data collection.
M³Surv: A Breakthrough Framework for Robust Multimodal Prediction
The Three Pillars of M³Surv
M³Surv addresses these challenges through three innovative components:
1. Multi-Slide Hypergraph Learning
Rather than treating FF and FFPE slides as separate entities, M³Surv employs sophisticated hypergraph neural networks to model them as complementary views of the same patient’s tissue.
Hypergraphs extend traditional graph theory by allowing a single edge (called a hyperedge) to connect multiple nodes simultaneously. This is more realistic for biological tissue, where clusters of similar cells interact together rather than in simple pairwise relationships.
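To make this concrete, here is a minimal sketch of how a hyperedge spanning more than two nodes is encoded in the `torch_geometric` convention also used by the implementation at the end of this article: row 0 lists node indices, row 1 the hyperedge each node belongs to. The sizes and features here are arbitrary toy values.

```python
import torch
from torch_geometric.nn import HypergraphConv

# Hyperedge 0 groups nodes {0, 1, 2}; hyperedge 1 groups nodes {2, 3}.
# Row 0: node indices, row 1: the hyperedge each node belongs to.
hyperedge_index = torch.tensor([[0, 1, 2, 2, 3],
                                [0, 0, 0, 1, 1]])
x = torch.randn(4, 16)          # 4 nodes with 16-dim features
conv = HypergraphConv(16, 32)   # one hypergraph convolution layer
out = conv(x, hyperedge_index)  # (4, 32) aggregated node features
print(out.shape)
```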
The framework constructs two types of hyperedges within each slide:
- Topological hyperedges: Connect adjacent patches based on spatial proximity, capturing local tissue architecture
- Structural hyperedges: Link patches with similar morphological features, identifying semantically similar regions regardless of location
These intra-slide representations are then connected through inter-slide hyperedges that align semantically similar regions across FF and FFPE samples from the same patient. This ensures the model recognizes that the same biological structures appear in both slide types, despite staining and preparation differences.
Mathematical representation:
For a given patch pi, topological neighbors are identified as:
$$\mathcal{N}_T(p_i) = \{p_j \mid \|\zeta_{p_j} - \zeta_{p_i}\|_2 \leq \delta\}$$
where $\zeta_{p_i}$ and $\zeta_{p_j}$ denote spatial coordinates and $\delta$ is a distance threshold.
Similarly, structural neighbors based on feature similarity are:
$$\mathcal{N}_F(p_i) = \{p_j \mid \text{sim}(p_i, p_j) \geq \gamma\}$$
where $\text{sim}(\cdot, \cdot)$ denotes feature similarity and $\gamma$ is a similarity threshold.
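As an illustration, both neighbor sets can be computed directly from patch coordinates and features. This is a minimal sketch assuming pre-extracted patch embeddings and cosine similarity for $\text{sim}(\cdot, \cdot)$, with `delta` and `gamma` standing in for the thresholds above:

```python
import torch
import torch.nn.functional as F

def build_neighbor_masks(feats, coords, delta=100.0, gamma=0.7):
    """Boolean masks for topological (spatial) and structural (feature) neighbors."""
    # N_T: patches within Euclidean distance delta of each other
    topo_mask = torch.cdist(coords, coords) <= delta  # (N, N)
    # N_F: patches whose cosine similarity is at least gamma
    f = F.normalize(feats, p=2, dim=1)
    struct_mask = (f @ f.t()) >= gamma                # (N, N)
    return topo_mask, struct_mask

feats = torch.randn(300, 768)        # hypothetical patch embeddings
coords = torch.rand(300, 2) * 1000   # hypothetical (x, y) patch positions
topo, struct = build_neighbor_masks(feats, coords)
print(topo.sum().item(), struct.sum().item())
```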
2. Comprehensive Multi-Omics Integration
M³Surv moves beyond single-modality genomics analysis by integrating three complementary molecular layers:
- Genomics: Copy number variations and RNA-seq data grouped into six functional categories (Tumor Suppression, Oncogenesis, Protein Kinases, Cellular Differentiation, Transcription, and Cytokines/Growth)
- Transcriptomics: 331 biological pathways from Reactome and MSigDB databases, capturing dynamic gene expression across known cellular processes
- Proteomics: Top 100 proteins selected by expression level, with sequence embeddings generated through evolutionary-scale modeling
This comprehensive approach captures biology at multiple regulatory layers. Genomic alterations represent permanent mutations; transcriptomic changes reflect current gene activity; proteomic profiles represent the actual functional molecules driving cell behavior. Together, they paint a complete molecular picture unavailable through any single modality.
3. Prototype-Based Memory Bank for Missing Modality Imputation
Perhaps the most clinically valuable innovation is the memory bank—a learned repository of representative pathology-omics feature patterns discovered during training.
During training, the system clusters paired pathology-genomic features using k-means clustering, generating modality-specific prototypes that represent distinct patient subgroups. These prototypes are stored alongside their features in the memory bank with momentum-based updates that refine them across epochs.
The inference magic: When a modality is missing during prediction, the model computes a prototype from available data and queries the memory bank for the most similar stored prototype. It then retrieves the missing modality’s corresponding features from the most similar patients in the training cohort:
$$\widehat{M}_m = \sum_{j=1}^{\mu} \frac{\exp(\text{sim}(\mathbf{C}_a, \mathbf{C}_a^{(j)}))}{\sum_{h=1}^{\mu} \exp(\text{sim}(\mathbf{C}_a, \mathbf{C}_a^{(h)}))} M_m^{(j)}$$
This similarity-weighted imputation is more robust than trying to predict missing data from a single patient's partial information, because it leverages patterns from the entire training population.
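In code, this retrieval is just a softmax over prototype similarities. A minimal sketch, assuming `mu` prototypes per modality and cosine similarity for $\text{sim}(\cdot, \cdot)$, consistent with the full implementation later in this article:

```python
import torch
import torch.nn.functional as F

mu, d = 10, 768
proto_avail = F.normalize(torch.randn(mu, d), dim=1)  # prototypes of the available modality
proto_missing = torch.randn(mu, d)                    # paired prototypes of the missing modality

def impute_missing(c_a: torch.Tensor) -> torch.Tensor:
    """Similarity-weighted retrieval of the missing modality's features."""
    sims = F.normalize(c_a, dim=1) @ proto_avail.t()  # sim(C_a, C_a^(j)) for all j
    weights = F.softmax(sims, dim=1)                  # the softmax fraction in the equation
    return weights @ proto_missing                    # weighted sum over M_m^(j)

c_a = torch.randn(4, d)            # available-modality features for 4 patients
print(impute_missing(c_a).shape)   # torch.Size([4, 768])
```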
Clinical Evidence: M³Surv Outperforms State-of-the-Art
Extensive experiments across multiple cancer types demonstrate M³Surv’s superior predictive power:
Performance on Five TCGA Datasets
Evaluation across five large The Cancer Genome Atlas (TCGA) cohorts shows consistent improvements:
| Cancer Type | Sample Size | M³Surv C-Index | Previous SOTA | Improvement |
|---|---|---|---|---|
| Bladder Carcinoma (BLCA) | 384 | 67.4% | 65.1% | +2.3% |
| Breast Carcinoma (BRCA) | 968 | 73.2% | 72.7% | +0.5% |
| Colorectal Adenocarcinoma (CRAD) | 298 | 78.8% | 78.6% | +0.2% |
| Head/Neck Cancer (HNSC) | 392 | 66.4% | 65.6% | +0.8% |
| Gastric Adenocarcinoma (STAD) | 317 | 67.7% | 66.8% | +0.9% |
| Average | 2359 | 70.7% | 68.5% | +2.2% |
The Concordance Index (C-Index) measures agreement between predicted risk scores and observed survival, ranging from 0.5 (random prediction) to 1.0 (perfect prediction). An average improvement of 2.2 percentage points represents a clinically meaningful enhancement in prognostic accuracy.
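For intuition, the C-Index asks, over all comparable patient pairs, how often the patient who died earlier was assigned the higher risk. A toy calculation with made-up numbers:

```python
import numpy as np

def c_index(risk, time, event):
    """Fraction of comparable pairs where higher risk matches shorter survival."""
    concordant, pairs = 0.0, 0
    for i in range(len(time)):
        for j in range(len(time)):
            # Pair is comparable if patient i had an event before patient j's time
            if event[i] == 1 and time[i] < time[j]:
                pairs += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties count as half-concordant
    return concordant / pairs if pairs else 0.5

# Three patients: the earliest death has the highest risk -> perfect concordance
print(c_index(np.array([0.9, 0.5, 0.1]), np.array([12, 30, 60]), np.array([1, 1, 0])))  # 1.0
```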
Robustness Across Missing Data Scenarios
The memory bank’s effectiveness becomes apparent when evaluating performance under realistic clinical constraints:
With missing pathology data:
- 30% missing: 65.8% C-Index
- 50% missing: 67.3% C-Index
- 70% missing: 66.5% C-Index
- 100% missing: 66.9% C-Index
With missing multi-omics data:
- 30% missing: 68.0% C-Index
- 50% missing: 67.4% C-Index
- 70% missing: 68.5% C-Index
- 100% missing: 66.6% C-Index
Remarkably, performance at 100% missing data (where only one modality exists) often exceeded intermediate missing scenarios. This counterintuitive finding reveals that the memory bank works most effectively with complete information from a single modality, avoiding confusion from contradictory partial information.
Validation on In-House Data
Testing on an Anhui Provincial Hospital liver cancer cohort (APH-LC, n=302) corroborated findings from public datasets, confirming that M³Surv’s benefits generalize beyond academic research settings to real clinical populations.
Key Technical Innovations Driving Performance
Divide-and-Conquer Multi-Slide Processing
Rather than forcing FF and FFPE slides into a single representation, M³Surv employs a divide-and-conquer strategy:
- Aggregation: Each slide type is processed independently through hypergraph convolution layers, capturing slide-specific patterns
- Alignment: Cross-attention mechanisms refine FF and FFPE features using inter-slide features as shared guidance, ensuring biological consistency
- Integration: Refined representations are concatenated into a unified patient-level pathology representation
This approach respects the unique characteristics of each slide type—FFPE slides preserve morphology better; FF slides preserve molecular information better—while exploiting their complementarity at the patient level.
Adaptive Weighting for Dynamic Slide Contribution
An often-overlooked challenge: slides vary in diagnostic relevance. A single tumor-dense region may carry more prognostic weight than the surrounding normal tissue. M³Surv addresses this through adaptive weighting:
$$w_{ff} = \frac{\exp(W_{ff}P_{ff})}{\exp(W_{ff}P_{ff}) + \exp(W_{ffpe}P_{ffpe})}, \qquad w_{ffpe} = \frac{\exp(W_{ffpe}P_{ffpe})}{\exp(W_{ff}P_{ff}) + \exp(W_{ffpe}P_{ffpe})}$$
Trainable weight matrices automatically learn which slides matter most for each patient, capturing disease heterogeneity at the individual level.
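This is an ordinary two-way softmax over learned slide scores. A minimal sketch, assuming the patient-level FF and FFPE vectors `P_ff` and `P_ffpe` have already been pooled:

```python
import torch

d = 768
W_ff, W_ffpe = torch.randn(d, 1), torch.randn(d, 1)  # trainable score vectors
P_ff, P_ffpe = torch.randn(4, d), torch.randn(4, d)  # pooled slide features, batch of 4

scores = torch.cat([P_ff @ W_ff, P_ffpe @ W_ffpe], dim=1)  # (4, 2) raw slide scores
w = torch.softmax(scores, dim=1)                           # w_ff = w[:, 0], w_ffpe = w[:, 1]
fused = w[:, 0:1] * P_ff + w[:, 1:2] * P_ffpe              # per-patient weighted combination
print(w.sum(dim=1))  # each row of weights sums to 1
```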
Mutual Cross-Attention for Bidirectional Fusion
Multimodal fusion typically privileges one modality as “primary.” M³Surv instead employs mutual cross-attention, where pathology and omics modalities enhance each other bidirectionally:
- Pathology enhanced by omics: $P' = \text{softmax}\!\left(\frac{Q_P K_G^\top}{\sqrt{d}}\right) V_G$, with queries from pathology and keys/values from omics
- Omics enhanced by pathology: $G' = \text{softmax}\!\left(\frac{Q_G K_P^\top}{\sqrt{d}}\right) V_P$, with queries from omics and keys/values from pathology
Both enhanced representations are concatenated and processed together, ensuring neither modality dominates the final prediction.
Interpretability: Understanding Why M³Surv Works
Visualization of Pathological Attention
Heatmap visualizations reveal that M³Surv’s hypergraph learning identifies broader, more coherent activation regions compared to baseline approaches. Rather than scattered high-attention patches, the model recognizes meaningful histological patterns spanning multiple tissue regions.
Biological Pathway Interpretation
For breast cancer patients, the framework identified Complement System and Iron Transport pathways as highly influential—patterns directly reflecting BRCA’s complex immune microenvironment and nutrient-demanding proliferation. Conversely, suppressed Apoptotic Cleavage pathways indicated loss of cell death mechanisms, a known hallmark of malignancy.
This interpretability is crucial for clinical adoption. Physicians can understand not just that M³Surv predicts poor survival, but why, facilitating confident treatment decisions.
Kaplan-Meier Stratification Power
Across all five TCGA datasets, M³Surv achieved log-rank test p-values < 0.05, confirming it genuinely separates high-risk from low-risk populations—not merely achieving statistical discrimination on academic metrics.
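A common way to reproduce this kind of check is the `lifelines` package: split patients at the median predicted risk and run a log-rank test between the resulting groups. A sketch with synthetic data (the array names and sizes are placeholders):

```python
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
risk = rng.random(200)             # model-predicted risk scores (synthetic)
time = rng.exponential(50, 200)    # follow-up times in months (synthetic)
event = rng.binomial(1, 0.6, 200)  # 1 = death observed, 0 = censored

high = risk >= np.median(risk)     # median split into high-/low-risk groups
result = logrank_test(time[high], time[~high],
                      event_observed_A=event[high],
                      event_observed_B=event[~high])
print(f"log-rank p-value: {result.p_value:.4f}")
```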
Clinical Implementation and Future Directions
Computational Efficiency
A common concern with advanced AI: computational cost. M³Surv adds only 4.3% overhead to training time and 2.2% to inference time through its memory bank, making deployment in busy hospital workflows feasible.
Path to Clinical Deployment
For M³Surv to transform oncology practice, several steps remain:
- Prospective clinical trials validating that predicted risk scores actually drive better treatment decisions and improved outcomes
- Integration with hospital information systems, making predictions available alongside standard pathology reports
- Interpretability enhancements, developing tools that explain predictions in language clinicians understand
- Regulatory approval, navigating FDA or equivalent pathways for AI-based clinical decision support
Promising Extensions
Researchers suggest several promising directions:
- Multimodal expansion: Incorporating radiology images and clinical notes alongside pathology and genomics
- Longitudinal monitoring: Updating survival predictions dynamically as new patient data become available
- Temporal dynamics: Modeling how molecular features evolve throughout treatment, enabling mid-course prognosis adjustment
What This Means for Patients and Clinicians
For patients, M³Surv offers more accurate prognosis, enabling better-informed discussions about treatment intensity, clinical trial eligibility, and life planning decisions.
For oncologists, it provides a principled second opinion integrating vast amounts of complex data that no human could synthesize manually. The ability to function with incomplete data is particularly valuable—clinicians need not delay treatment while awaiting expensive molecular tests.
For researchers, it demonstrates that respecting biological complexity—through multi-slide integration and comprehensive omics—yields real performance gains while remaining computationally practical.
The Bottom Line: A Paradigm Shift in Precision Oncology
M³Surv represents a genuine advance in how artificial intelligence can enhance clinical cancer care. By thoughtfully integrating multiple data modalities, intelligently handling real-world incompleteness, and maintaining interpretability, it achieves what previous approaches could not: a robust, practical, accurate survival prediction system suitable for heterogeneous patient populations and imperfect clinical environments.
The 2.2% average improvement in predictive accuracy might seem modest in isolation. But in oncology—where each percentage point improvement in risk stratification translates to more targeted treatment, fewer unnecessary toxic therapies, and better patient outcomes—this represents meaningful clinical progress.
As this technology matures through clinical validation and integration into hospital workflows, it will likely join the arsenal of tools that drive increasingly personalized, data-driven cancer care.
Ready to Understand More About AI in Healthcare?
The intersection of artificial intelligence and clinical medicine is evolving rapidly. If you work in oncology, computational pathology, or healthcare AI, understanding frameworks like M³Surv is essential for staying current with the field’s trajectory.
Share your thoughts: Are you involved in developing or implementing AI for cancer prognostication? Have you encountered challenges with multimodal data integration in clinical settings? Leave a comment below or reach out to discuss how these technologies might apply to your specific clinical or research context.
Stay informed: Subscribe to receive updates on emerging developments in precision oncology and healthcare AI—the field advances faster than ever before.
If you want to read the full paper, click the link here.
Below is a complete, end-to-end implementation of the M³Surv framework with all major components.
"""
M³Surv: Memory-Augmented Multimodal Survival Prediction Framework
Complete end-to-end implementation with multi-slide hypergraph learning,
multi-omics integration, and prototype-based memory bank for handling missing modalities.
"""
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import Adam
from torch_geometric.nn import GCNConv, HypergraphConv
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from typing import Dict, List, Tuple, Optional
import warnings
warnings.filterwarnings('ignore')
# ============================================================================
# FEATURE EXTRACTION ENCODERS
# ============================================================================
class PathologyEncoder(nn.Module):
"""
Encodes pathology features from WSI patches using ResNet50 backbone.
Extracts d-dimensional features from each patch at 20x magnification.
"""
def __init__(self, feature_dim: int = 768, pretrained: bool = True):
super(PathologyEncoder, self).__init__()
import torchvision.models as models
resnet = models.resnet50(pretrained=pretrained)
self.backbone = nn.Sequential(*list(resnet.children())[:-1])
self.feature_dim = feature_dim
self.projection = nn.Linear(2048, feature_dim)
def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
"""
Args:
patch_features: (N_patches, 3, 256, 256) image tensor
Returns:
embeddings: (N_patches, feature_dim) feature embeddings
"""
x = self.backbone(patch_features) # (N_patches, 2048, 1, 1)
x = x.squeeze(-1).squeeze(-1) # (N_patches, 2048)
embeddings = self.projection(x) # (N_patches, feature_dim)
return embeddings
class GenomicsEncoder(nn.Module):
"""
Encodes genomic data grouped into 6 functional gene categories; a plain linear projection with batch normalization serves as a simple stand-in encoder.
Maps 6 functional gene groups to feature space.
"""
def __init__(self, feature_dim: int = 768):
super(GenomicsEncoder, self).__init__()
self.feature_dim = feature_dim
# 6 functional gene categories
self.genomics_proj = nn.Linear(6, feature_dim)
self.norm = nn.BatchNorm1d(feature_dim)
def forward(self, genomics: torch.Tensor) -> torch.Tensor:
"""
Args:
genomics: (batch_size, 6) functional gene features
Returns:
embeddings: (batch_size, feature_dim)
"""
x = self.genomics_proj(genomics)
x = self.norm(x)
return x
class TranscriptomicsEncoder(nn.Module):
"""
Encodes transcriptomic pathway data from Reactome and MSigDB.
Maps 331 biological pathways to feature space.
"""
def __init__(self, num_pathways: int = 331, feature_dim: int = 768):
super(TranscriptomicsEncoder, self).__init__()
self.feature_dim = feature_dim
self.pathway_embedding = nn.Linear(num_pathways, feature_dim)
self.norm = nn.BatchNorm1d(feature_dim)
def forward(self, transcriptomics: torch.Tensor) -> torch.Tensor:
"""
Args:
transcriptomics: (batch_size, 331) pathway expression
Returns:
embeddings: (batch_size, feature_dim)
"""
x = self.pathway_embedding(transcriptomics)
x = self.norm(x)
return x
class ProteomicsEncoder(nn.Module):
"""
Encodes protein expression and sequence information.
Uses protein sequence embeddings (ESM) combined with expression levels.
"""
def __init__(self, num_proteins: int = 100, feature_dim: int = 768):
super(ProteomicsEncoder, self).__init__()
self.feature_dim = feature_dim
self.protein_embedding = nn.Linear(num_proteins + 1280, feature_dim) # 1280 = ESM embed dim
self.norm = nn.BatchNorm1d(feature_dim)
def forward(self, protein_expr: torch.Tensor,
protein_seq_embed: torch.Tensor) -> torch.Tensor:
"""
Args:
protein_expr: (batch_size, 100) protein expression levels
protein_seq_embed: (batch_size, 100, 1280) sequence embeddings
Returns:
embeddings: (batch_size, feature_dim)
"""
# Average sequence embeddings across proteins
seq_avg = protein_seq_embed.mean(dim=1) # (batch_size, 1280)
combined = torch.cat([protein_expr, seq_avg], dim=1) # (batch_size, 1380)
x = self.protein_embedding(combined)
x = self.norm(x)
return x
# ============================================================================
# HYPERGRAPH LEARNING MODULES
# ============================================================================
class HypergraphConstruction(nn.Module):
"""
Constructs intra-slide hypergraphs with topological and structural hyperedges.
Topological: spatial proximity (Euclidean distance)
Structural: morphological similarity (cosine distance)
"""
def __init__(self, spatial_threshold: float = 100.0,
similarity_threshold: float = 0.7, k_neighbors: int = 9):
super(HypergraphConstruction, self).__init__()
self.spatial_threshold = spatial_threshold
self.similarity_threshold = similarity_threshold
self.k_neighbors = k_neighbors
def forward(self, patches: torch.Tensor,
coordinates: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Constructs hyperedge indices for topological and structural connections.
Args:
patches: (N_patches, feature_dim)
coordinates: (N_patches, 2) spatial coordinates
Returns:
topo_edges: topological hyperedge indices
struct_edges: structural hyperedge indices
"""
N = patches.shape[0]
device = patches.device
# Topological hyperedges (spatial proximity)
distances = torch.cdist(coordinates, coordinates) # (N, N)
topo_mask = distances <= self.spatial_threshold
topo_edges = torch.nonzero(topo_mask, as_tuple=False).t() # (2, num_edges)
# Structural hyperedges (feature similarity)
# Normalize patches for cosine similarity
patches_norm = F.normalize(patches, p=2, dim=1)
similarity = torch.mm(patches_norm, patches_norm.t()) # (N, N)
struct_mask = similarity >= self.similarity_threshold
struct_edges = torch.nonzero(struct_mask, as_tuple=False).t()
return topo_edges.to(device), struct_edges.to(device)
class IntraSlideHypergraphConv(nn.Module):
"""
Hypergraph convolution layer for intra-slide feature aggregation.
Captures higher-order cellular structures through hyperedges.
"""
def __init__(self, in_channels: int, out_channels: int, num_layers: int = 3):
super(IntraSlideHypergraphConv, self).__init__()
self.layers = nn.ModuleList([
HypergraphConv(in_channels if i == 0 else out_channels, out_channels)
for i in range(num_layers)
])
self.norm = nn.BatchNorm1d(out_channels)
self.relu = nn.ReLU()
def forward(self, x: torch.Tensor, hyperedge_index: torch.Tensor) -> torch.Tensor:
"""
Args:
x: (N_patches, in_channels) node features
hyperedge_index: (2, num_edges) hyperedge indices
Returns:
aggregated: (N_patches, out_channels)
"""
for layer in self.layers:
x = layer(x, hyperedge_index)
x = self.norm(x)
x = self.relu(x)
return x
class InterSlideHypergraph(nn.Module):
"""
Models relationships between FF and FFPE slides using inter-slide hyperedges.
Aligns semantically similar patches across slide types.
"""
def __init__(self, feature_dim: int = 768, similarity_threshold: float = 0.8):
super(InterSlideHypergraph, self).__init__()
self.similarity_threshold = similarity_threshold
self.feature_dim = feature_dim
def forward(self, ff_patches: torch.Tensor,
ffpe_patches: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Identifies inter-slide correspondences based on feature similarity.
Args:
ff_patches: (N_ff, feature_dim)
ffpe_patches: (N_ffpe, feature_dim)
Returns:
inter_edges: hyperedge indices
similarity_scores: alignment confidence scores
"""
# Normalize for cosine similarity
ff_norm = F.normalize(ff_patches, p=2, dim=1)
ffpe_norm = F.normalize(ffpe_patches, p=2, dim=1)
# Cross-slide similarity
cross_similarity = torch.mm(ff_norm, ffpe_norm.t()) # (N_ff, N_ffpe)
# Find high-confidence matches
mask = cross_similarity >= self.similarity_threshold
inter_edges = torch.nonzero(mask, as_tuple=False).t()
sim_scores = cross_similarity[mask]
return inter_edges, sim_scores
# ============================================================================
# MULTIMODAL FUSION
# ============================================================================
class CrossAttentionFusion(nn.Module):
"""
Bidirectional cross-attention fusion between pathology and omics modalities.
Enables rich inter-modality interactions.
"""
def __init__(self, feature_dim: int = 768, num_heads: int = 8):
super(CrossAttentionFusion, self).__init__()
self.feature_dim = feature_dim
self.num_heads = num_heads
# Query, Key, Value projections for pathology -> omics
self.W_q_p = nn.Linear(feature_dim, feature_dim)
self.W_k_g = nn.Linear(feature_dim, feature_dim)
self.W_v_g = nn.Linear(feature_dim, feature_dim)
# Query, Key, Value projections for omics -> pathology
self.W_q_g = nn.Linear(feature_dim, feature_dim)
self.W_k_p = nn.Linear(feature_dim, feature_dim)
self.W_v_p = nn.Linear(feature_dim, feature_dim)
self.scale = np.sqrt(feature_dim)
self.softmax = nn.Softmax(dim=-1)
def forward(self, pathology: torch.Tensor,
omics: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Args:
pathology: (batch_size, num_patches, feature_dim) or (batch_size, feature_dim)
omics: (batch_size, num_omic_features, feature_dim) or (batch_size, feature_dim)
Returns:
pathology_enhanced: pathology attended to by omics
omics_enhanced: omics attended to by pathology
"""
# Handle 2D inputs (flattened features)
if pathology.dim() == 2:
pathology = pathology.unsqueeze(1)
if omics.dim() == 2:
omics = omics.unsqueeze(1)
# Pathology attended to by omics (multi-head simulation)
Q_p = self.W_q_p(pathology) # (batch, seq_p, feature_dim)
K_g = self.W_k_g(omics) # (batch, seq_g, feature_dim)
V_g = self.W_v_g(omics) # (batch, seq_g, feature_dim)
scores_pg = torch.matmul(Q_p, K_g.transpose(-2, -1)) / self.scale
attn_pg = self.softmax(scores_pg)
pathology_enhanced = torch.matmul(attn_pg, V_g) # (batch, seq_p, feature_dim)
# Omics attended to by pathology
Q_g = self.W_q_g(omics)
K_p = self.W_k_p(pathology)
V_p = self.W_v_p(pathology)
scores_gp = torch.matmul(Q_g, K_p.transpose(-2, -1)) / self.scale
attn_gp = self.softmax(scores_gp)
omics_enhanced = torch.matmul(attn_gp, V_p) # (batch, seq_g, feature_dim)
# Flatten back to (batch, feature_dim)
pathology_enhanced = pathology_enhanced.mean(dim=1)
omics_enhanced = omics_enhanced.mean(dim=1)
return pathology_enhanced, omics_enhanced
# ============================================================================
# PROTOTYPE-BASED MEMORY BANK
# ============================================================================
class PrototypeMemoryBank(nn.Module):
"""
Memory bank storing representative pathology-omics feature prototypes.
Enables robust imputation for missing modalities during inference.
"""
def __init__(self, num_prototypes: int = 10, feature_dim: int = 768,
momentum: float = 0.75):
super(PrototypeMemoryBank, self).__init__()
self.num_prototypes = num_prototypes
self.feature_dim = feature_dim
self.momentum = momentum
# Initialize memory bank
self.register_buffer('proto_pathology',
torch.randn(num_prototypes, feature_dim) / np.sqrt(feature_dim))
self.register_buffer('proto_omics',
torch.randn(num_prototypes, feature_dim) / np.sqrt(feature_dim))
self.pathology_features_buffer = []
self.omics_features_buffer = []
self.cluster_assignments = []
def update_prototypes(self, pathology_feats: torch.Tensor,
omics_feats: torch.Tensor):
"""
Updates prototypes via momentum-based update using k-means clustering.
Args:
pathology_feats: (batch_size, feature_dim)
omics_feats: (batch_size, feature_dim)
"""
device = pathology_feats.device
# Store features
self.pathology_features_buffer.append(pathology_feats.detach().cpu().numpy())
self.omics_features_buffer.append(omics_feats.detach().cpu().numpy())
# Perform k-means clustering periodically (every N batches)
if len(self.pathology_features_buffer) >= 5:
# Concatenate all stored features
all_path = np.concatenate(self.pathology_features_buffer, axis=0)
all_omics = np.concatenate(self.omics_features_buffer, axis=0)
# Cluster pathology features
kmeans_p = KMeans(n_clusters=self.num_prototypes, random_state=42, n_init=10)
kmeans_p.fit(all_path)
proto_path_new = torch.tensor(kmeans_p.cluster_centers_,
dtype=torch.float32, device=device)
# Cluster omics features
kmeans_g = KMeans(n_clusters=self.num_prototypes, random_state=42, n_init=10)
kmeans_g.fit(all_omics)
proto_omics_new = torch.tensor(kmeans_g.cluster_centers_,
dtype=torch.float32, device=device)
# Momentum update
self.proto_pathology.data = (self.momentum * self.proto_pathology.data +
(1 - self.momentum) * proto_path_new)
self.proto_omics.data = (self.momentum * self.proto_omics.data +
(1 - self.momentum) * proto_omics_new)
# Clear buffers
self.pathology_features_buffer = []
self.omics_features_buffer = []
def retrieve_missing_modality(self, available_feats: torch.Tensor,
missing_type: str = 'omics',
top_k: int = 1) -> torch.Tensor:
"""
Retrieves missing modality features from memory bank using available modality.
Args:
available_feats: (batch_size, feature_dim) available modality features
missing_type: 'omics' or 'pathology'
top_k: number of prototypes to retrieve and average
Returns:
imputed_feats: (batch_size, feature_dim) imputed features
"""
available_feats_norm = F.normalize(available_feats, p=2, dim=1)
if missing_type == 'omics':
# Available: pathology, retrieve: omics
proto_available = F.normalize(self.proto_pathology, p=2, dim=1)
proto_missing = self.proto_omics
else:
# Available: omics, retrieve: pathology
proto_available = F.normalize(self.proto_omics, p=2, dim=1)
proto_missing = self.proto_pathology
# Similarity-weighted retrieval
similarity = torch.mm(available_feats_norm, proto_available.t()) # (batch, num_proto)
top_sim, top_indices = torch.topk(similarity, k=min(top_k, self.num_prototypes), dim=1)
# Softmax weights for weighted combination
weights = F.softmax(top_sim, dim=1) # (batch, top_k)
# Weighted sum of retrieved prototypes
imputed = torch.zeros_like(available_feats)
for b in range(available_feats.shape[0]):
imputed[b] = (weights[b].unsqueeze(1) *
proto_missing[top_indices[b]]).sum(dim=0)
return imputed
# ============================================================================
# SURVIVAL PREDICTION HEAD
# ============================================================================
class SurvivalHead(nn.Module):
"""
Predicts hazard and survival functions using negative log-likelihood loss.
Supports discrete-time survival analysis.
"""
def __init__(self, feature_dim: int = 768, num_time_bins: int = 50,
hidden_dim: int = 256):
super(SurvivalHead, self).__init__()
self.feature_dim = feature_dim
self.num_time_bins = num_time_bins
# Prediction network
self.fc1 = nn.Linear(feature_dim, hidden_dim)
self.dropout = nn.Dropout(0.2)
self.fc2 = nn.Linear(hidden_dim, hidden_dim)
# Hazard and survival outputs
self.hazard_head = nn.Linear(hidden_dim, num_time_bins)
self.survival_head = nn.Linear(hidden_dim, num_time_bins)
self.relu = nn.ReLU()
def forward(self, features: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Args:
features: (batch_size, feature_dim) fused features
Returns:
hazard: (batch_size, num_time_bins)
survival: (batch_size, num_time_bins)
"""
x = self.fc1(features)
x = self.relu(x)
x = self.dropout(x)
x = self.fc2(x)
x = self.relu(x)
hazard = self.hazard_head(x)
        survival = torch.sigmoid(self.survival_head(x))  # Constrain to [0, 1] (F.sigmoid is deprecated)
return hazard, survival
# ============================================================================
# MAIN M³SURV MODEL
# ============================================================================
class M3Surv(nn.Module):
"""
Complete M³Surv framework integrating:
- Multi-slide hypergraph learning (FF + FFPE)
- Multi-omics fusion (genomics + transcriptomics + proteomics)
- Prototype-based memory bank for missing modality imputation
"""
def __init__(self, feature_dim: int = 768, num_time_bins: int = 50,
num_prototypes: int = 10, momentum: float = 0.75,
spatial_threshold: float = 100.0,
similarity_threshold: float = 0.7):
super(M3Surv, self).__init__()
self.feature_dim = feature_dim
self.num_time_bins = num_time_bins
# Feature encoders
self.pathology_encoder = PathologyEncoder(feature_dim)
self.genomics_encoder = GenomicsEncoder(feature_dim)
self.transcriptomics_encoder = TranscriptomicsEncoder(feature_dim=feature_dim)
self.proteomics_encoder = ProteomicsEncoder(feature_dim=feature_dim)
# Hypergraph modules
self.hypergraph_construction = HypergraphConstruction(
spatial_threshold=spatial_threshold,
similarity_threshold=similarity_threshold
)
self.intra_slide_hypergraph = IntraSlideHypergraphConv(
feature_dim, feature_dim, num_layers=3
)
self.inter_slide_hypergraph = InterSlideHypergraph(
feature_dim=feature_dim,
similarity_threshold=0.8
)
# Adaptive slide weighting
self.slide_weight_ff = nn.Parameter(torch.ones(feature_dim, 1) * 0.5)
self.slide_weight_ffpe = nn.Parameter(torch.ones(feature_dim, 1) * 0.5)
# Cross-attention fusion
self.cross_attention = CrossAttentionFusion(feature_dim, num_heads=8)
# Memory bank for missing modality handling
self.memory_bank = PrototypeMemoryBank(
num_prototypes=num_prototypes,
feature_dim=feature_dim,
momentum=momentum
)
# Survival prediction head
self.survival_head = SurvivalHead(feature_dim * 2, num_time_bins)
def forward(self, ff_patches: torch.Tensor, ffpe_patches: torch.Tensor,
ff_coords: torch.Tensor, ffpe_coords: torch.Tensor,
genomics: torch.Tensor, transcriptomics: torch.Tensor,
proteomics: torch.Tensor, protein_seq_embed: torch.Tensor,
missing_modality: Optional[str] = None) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Complete forward pass through M³Surv.
Args:
            ff_patches: (N_ff_patches, 3, 256, 256) raw images or (N_ff_patches, feat_dim) pre-extracted features
            ffpe_patches: (N_ffpe_patches, 3, 256, 256) raw images or (N_ffpe_patches, feat_dim) pre-extracted features
            ff_coords: (N_ff_patches, 2) patch coordinates
            ffpe_coords: (N_ffpe_patches, 2) patch coordinates
genomics: (batch_size, 6)
transcriptomics: (batch_size, 331)
proteomics: (batch_size, 100)
protein_seq_embed: (batch_size, 100, 1280)
missing_modality: None, 'pathology', or 'omics'
Returns:
hazard: (batch_size, num_time_bins)
survival: (batch_size, num_time_bins)
"""
batch_size = genomics.shape[0]
device = genomics.device
# ===== PATHOLOGY FEATURE EXTRACTION =====
# Encode FF slides
if ff_patches.dim() == 4: # Raw images
ff_features = self.pathology_encoder(ff_patches)
else: # Pre-extracted features
ff_features = ff_patches
# Encode FFPE slides
if ffpe_patches.dim() == 4: # Raw images
ffpe_features = self.pathology_encoder(ffpe_patches)
else: # Pre-extracted features
ffpe_features = ffpe_patches
# ===== INTRA-SLIDE HYPERGRAPH PROCESSING =====
# FF slides
ff_topo_edges, ff_struct_edges = self.hypergraph_construction(
ff_features, ff_coords
)
ff_combined_edges = torch.cat([ff_topo_edges, ff_struct_edges], dim=1)
ff_agg = self.intra_slide_hypergraph(ff_features, ff_combined_edges)
# FFPE slides
ffpe_topo_edges, ffpe_struct_edges = self.hypergraph_construction(
ffpe_features, ffpe_coords
)
ffpe_combined_edges = torch.cat([ffpe_topo_edges, ffpe_struct_edges], dim=1)
ffpe_agg = self.intra_slide_hypergraph(ffpe_features, ffpe_combined_edges)
# Aggregate to patient level
ff_patient = ff_agg.mean(dim=0, keepdim=True) # (1, feature_dim)
ffpe_patient = ffpe_agg.mean(dim=0, keepdim=True) # (1, feature_dim)
# Expand to batch
ff_patient = ff_patient.expand(batch_size, -1)
ffpe_patient = ffpe_patient.expand(batch_size, -1)
# ===== INTER-SLIDE ALIGNMENT =====
# Adaptive weighting
        # Per-patient softmax over the two slide scores (w_ff = w[:, 0], w_ffpe = w[:, 1])
        w = torch.softmax(torch.stack([
            (ff_patient @ self.slide_weight_ff).squeeze(-1),
            (ffpe_patient @ self.slide_weight_ffpe).squeeze(-1)
        ], dim=1), dim=1)  # (batch_size, 2)
# Cross-attention alignment using inter-slide features
inter_edges, inter_sim = self.inter_slide_hypergraph(ff_patient, ffpe_patient)
inter_features = (ff_patient + ffpe_patient) / 2 # Shared guidance
# Align FF and FFPE using inter-slide guidance
        ff_aligned = ff_patient + w[:, 0:1] * (
            F.linear(ff_patient, self.cross_attention.W_q_p.weight) @
            F.linear(inter_features, self.cross_attention.W_k_p.weight).t() /
            np.sqrt(self.feature_dim)
        ).softmax(dim=-1) @ F.linear(inter_features, self.cross_attention.W_v_p.weight)
        ffpe_aligned = ffpe_patient + w[:, 1:2] * (
            F.linear(ffpe_patient, self.cross_attention.W_q_p.weight) @
            F.linear(inter_features, self.cross_attention.W_k_p.weight).t() /
            np.sqrt(self.feature_dim)
        ).softmax(dim=-1) @ F.linear(inter_features, self.cross_attention.W_v_p.weight)
# Concatenate aligned representations
pathology_feat = torch.cat([ff_aligned, ffpe_aligned], dim=1) # (batch, 2*feature_dim)
        pathology_feat_fused = pathology_feat[:, :self.feature_dim]  # Slice back to feature_dim for fusion and memory-bank lookups
# ===== OMICS FEATURE EXTRACTION =====
if missing_modality != 'omics':
genomics_feat = self.genomics_encoder(genomics)
transcriptomics_feat = self.transcriptomics_encoder(transcriptomics)
proteomics_feat = self.proteomics_encoder(proteomics, protein_seq_embed)
# Aggregate multi-omics
omics_feat = (genomics_feat + transcriptomics_feat + proteomics_feat) / 3
else:
# Impute missing omics from memory bank
omics_feat = self.memory_bank.retrieve_missing_modality(
pathology_feat_fused, missing_type='omics', top_k=1
)
# ===== MULTIMODAL FUSION =====
if missing_modality != 'pathology':
pathology_enhanced, omics_enhanced = self.cross_attention(
pathology_feat_fused, omics_feat
)
else:
# Impute missing pathology from memory bank
pathology_enhanced = self.memory_bank.retrieve_missing_modality(
omics_feat, missing_type='pathology', top_k=1
)
omics_enhanced = omics_feat
# Concatenate fused features
fused_features = torch.cat([pathology_enhanced, omics_enhanced], dim=1)
        # ===== UPDATE MEMORY BANK (training only, and only with complete data) =====
        if self.training and missing_modality is None:
            self.memory_bank.update_prototypes(pathology_feat_fused, omics_feat)
# ===== SURVIVAL PREDICTION =====
hazard, survival = self.survival_head(fused_features)
return hazard, survival
# ============================================================================
# TRAINING UTILITIES
# ============================================================================
class SurvivalLoss(nn.Module):
"""
Negative log-likelihood loss for survival prediction.
Handles both event and censoring terms.
"""
def __init__(self):
super(SurvivalLoss, self).__init__()
def forward(self, hazard: torch.Tensor, survival: torch.Tensor,
event: torch.Tensor, time: torch.Tensor) -> torch.Tensor:
"""
Args:
hazard: (batch_size, num_time_bins)
survival: (batch_size, num_time_bins)
event: (batch_size,) binary event indicator
time: (batch_size,) observed time points
Returns:
loss: scalar loss value
"""
        # Discretize time to bins (normalizing by the batch max is a simplification;
        # a fixed dataset-level horizon would give consistent bins across batches)
        batch_size = event.shape[0]
        num_bins = survival.shape[1]
        time_bins = (time.float() / time.max() * (num_bins - 1)).long()
time_bins = torch.clamp(time_bins, 0, num_bins - 1)
# Event log-likelihood: log(hazard * survival)
event_ll = torch.zeros(batch_size, device=hazard.device)
for i in range(batch_size):
t = time_bins[i]
if event[i] == 1: # Event occurred
event_ll[i] = torch.log(hazard[i, t] + 1e-8)
# Censoring log-likelihood: log(survival)
cens_ll = torch.zeros(batch_size, device=survival.device)
for i in range(batch_size):
t = time_bins[i]
if event[i] == 0: # Censored
cens_ll[i] = torch.log(survival[i, t] + 1e-8)
        # Combined negative log-likelihood, averaged over the batch
        # (sums avoid NaN when a batch contains only events or only censored cases)
        loss = -(event_ll.sum() + cens_ll.sum()) / batch_size
return loss
def train_m3surv(model: M3Surv, train_loader, val_loader,
num_epochs: int = 30, learning_rate: float = 1e-4,
weight_decay: float = 1e-5, device: str = 'cuda'):
"""
Training loop for M³Surv model.
Args:
model: M³Surv instance
train_loader: DataLoader for training data
val_loader: DataLoader for validation data
num_epochs: number of training epochs
learning_rate: optimization learning rate
weight_decay: L2 regularization weight
device: 'cuda' or 'cpu'
"""
model = model.to(device)
optimizer = Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
criterion = SurvivalLoss()
best_val_loss = float('inf')
for epoch in range(num_epochs):
# Training phase
model.train()
train_loss = 0.0
train_count = 0
for batch_idx, batch in enumerate(train_loader):
optimizer.zero_grad()
# Unpack batch
ff_patches = batch['ff_patches'].to(device)
ffpe_patches = batch['ffpe_patches'].to(device)
ff_coords = batch['ff_coords'].to(device)
ffpe_coords = batch['ffpe_coords'].to(device)
genomics = batch['genomics'].to(device)
transcriptomics = batch['transcriptomics'].to(device)
proteomics = batch['proteomics'].to(device)
protein_seq_embed = batch['protein_seq_embed'].to(device)
event = batch['event'].to(device)
time = batch['time'].to(device)
# Forward pass
hazard, survival = model(
ff_patches, ffpe_patches, ff_coords, ffpe_coords,
genomics, transcriptomics, proteomics, protein_seq_embed,
missing_modality=None
)
# Loss computation
loss = criterion(hazard, survival, event, time)
# Backward pass
loss.backward()
optimizer.step()
train_loss += loss.item()
train_count += 1
avg_train_loss = train_loss / train_count
# Validation phase
model.eval()
val_loss = 0.0
val_count = 0
with torch.no_grad():
for batch in val_loader:
ff_patches = batch['ff_patches'].to(device)
ffpe_patches = batch['ffpe_patches'].to(device)
ff_coords = batch['ff_coords'].to(device)
ffpe_coords = batch['ffpe_coords'].to(device)
genomics = batch['genomics'].to(device)
transcriptomics = batch['transcriptomics'].to(device)
proteomics = batch['proteomics'].to(device)
protein_seq_embed = batch['protein_seq_embed'].to(device)
event = batch['event'].to(device)
time = batch['time'].to(device)
hazard, survival = model(
ff_patches, ffpe_patches, ff_coords, ffpe_coords,
genomics, transcriptomics, proteomics, protein_seq_embed,
missing_modality=None
)
loss = criterion(hazard, survival, event, time)
val_loss += loss.item()
val_count += 1
avg_val_loss = val_loss / val_count
# Log progress
print(f"Epoch {epoch+1}/{num_epochs} | "
f"Train Loss: {avg_train_loss:.4f} | "
f"Val Loss: {avg_val_loss:.4f}")
        # Checkpoint the best model by validation loss (no early stopping is applied)
if avg_val_loss < best_val_loss:
best_val_loss = avg_val_loss
torch.save(model.state_dict(), 'best_m3surv_model.pth')
print(f" → Saved best model (val_loss: {best_val_loss:.4f})")
return model
def evaluate_concordance_index(model: M3Surv, test_loader, device: str = 'cuda') -> float:
"""
Evaluates model using Concordance Index (C-Index) metric.
Measures agreement between predicted risk and observed survival.
Args:
model: M³Surv instance
test_loader: DataLoader for test data
device: 'cuda' or 'cpu'
Returns:
c_index: Concordance Index score (0.5 to 1.0)
"""
model = model.to(device)
model.eval()
all_risk_scores = []
all_times = []
all_events = []
with torch.no_grad():
for batch in test_loader:
ff_patches = batch['ff_patches'].to(device)
ffpe_patches = batch['ffpe_patches'].to(device)
ff_coords = batch['ff_coords'].to(device)
ffpe_coords = batch['ffpe_coords'].to(device)
genomics = batch['genomics'].to(device)
transcriptomics = batch['transcriptomics'].to(device)
proteomics = batch['proteomics'].to(device)
protein_seq_embed = batch['protein_seq_embed'].to(device)
hazard, survival = model(
ff_patches, ffpe_patches, ff_coords, ffpe_coords,
genomics, transcriptomics, proteomics, protein_seq_embed,
missing_modality=None
)
# Risk score = cumulative hazard
risk_scores = hazard.sum(dim=1).cpu().numpy()
all_risk_scores.extend(risk_scores)
all_times.extend(batch['time'].numpy())
all_events.extend(batch['event'].numpy())
all_risk_scores = np.array(all_risk_scores)
all_times = np.array(all_times)
all_events = np.array(all_events)
# Compute C-Index
c_index = 0.0
pair_count = 0
    for i in range(len(all_times)):
        for j in range(len(all_times)):
            # Comparable pair: patient i had an event before patient j's observed time
            if all_events[i] == 1 and all_times[i] < all_times[j]:
                pair_count += 1
                if all_risk_scores[i] > all_risk_scores[j]:
                    c_index += 1
                elif all_risk_scores[i] == all_risk_scores[j]:
                    c_index += 0.5  # Ties in predicted risk count as half-concordant
if pair_count > 0:
c_index = c_index / pair_count
else:
c_index = 0.5
return c_index
# ============================================================================
# EXAMPLE USAGE
# ============================================================================
if __name__ == "__main__":
print("=" * 70)
print("M³SURV: Memory-Augmented Multimodal Survival Prediction")
print("=" * 70)
# Hyperparameters
feature_dim = 768
num_time_bins = 50
num_prototypes = 10
batch_size = 4
num_epochs = 30
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"\n✓ Device: {device}")
# Initialize model
model = M3Surv(
feature_dim=feature_dim,
num_time_bins=num_time_bins,
num_prototypes=num_prototypes,
momentum=0.75,
spatial_threshold=100.0,
similarity_threshold=0.7
)
print(f"✓ Model initialized with {sum(p.numel() for p in model.parameters()):,} parameters")
# Create dummy data for demonstration
print("\n" + "=" * 70)
print("DEMONSTRATION WITH DUMMY DATA")
print("=" * 70)
    # Simulate data for one patient's slides (the forward pass expects per-patient
    # patch tensors and pools them to a patient-level vector) plus batched omics
    batch = {
        'ff_patches': torch.randn(300, feature_dim),     # Pre-extracted FF patch features
        'ffpe_patches': torch.randn(4096, feature_dim),  # Pre-extracted FFPE patch features
        'ff_coords': torch.rand(300, 2) * 1000,          # Patch (x, y) positions on the slide
        'ffpe_coords': torch.rand(4096, 2) * 2000,
        'genomics': torch.randn(batch_size, 6),
        'transcriptomics': torch.randn(batch_size, 331),
        'proteomics': torch.randn(batch_size, 100),
        'protein_seq_embed': torch.randn(batch_size, 100, 1280),
        'event': torch.randint(0, 2, (batch_size,)).float(),
        'time': torch.randint(1, 100, (batch_size,)).float()
    }
# Move to device
for key in batch:
if isinstance(batch[key], torch.Tensor):
batch[key] = batch[key].to(device)
model = model.to(device)
# Forward pass
print("\n→ Forward pass with complete data...")
hazard, survival = model(
batch['ff_patches'], batch['ffpe_patches'],
batch['ff_coords'], batch['ffpe_coords'],
batch['genomics'], batch['transcriptomics'],
batch['proteomics'], batch['protein_seq_embed'],
missing_modality=None
)
print(f" Hazard shape: {hazard.shape}")
print(f" Survival shape: {survival.shape}")
print(f" Sample predictions - Hazard: {hazard[0, :5]}")
print(f" Sample predictions - Survival: {survival[0, :5]}")
# Forward pass with missing omics
print("\n→ Forward pass with MISSING OMICS (imputed from pathology)...")
hazard_missing, survival_missing = model(
batch['ff_patches'], batch['ffpe_patches'],
batch['ff_coords'], batch['ffpe_coords'],
batch['genomics'], batch['transcriptomics'],
batch['proteomics'], batch['protein_seq_embed'],
missing_modality='omics'
)
print(f" Hazard shape: {hazard_missing.shape}")
print(f" Model gracefully handled missing modality ✓")
# Forward pass with missing pathology
print("\n→ Forward pass with MISSING PATHOLOGY (imputed from omics)...")
hazard_missing_path, survival_missing_path = model(
batch['ff_patches'], batch['ffpe_patches'],
batch['ff_coords'], batch['ffpe_coords'],
batch['genomics'], batch['transcriptomics'],
batch['proteomics'], batch['protein_seq_embed'],
missing_modality='pathology'
)
print(f" Hazard shape: {hazard_missing_path.shape}")
print(f" Model gracefully handled missing modality ✓")
print("\n" + "=" * 70)
print("M³SURV implementation complete!")
print("=" * 70)