EG-VAN Transforms Skin Cancer Diagnosis

Fig: The complete workflow of the proposed EG-VAN model.

Skin cancer diagnosis faces critical challenges: subtle variations within the same cancer type, striking similarities between benign and malignant lesions, and limited access to specialist dermatologists. Traditional methods often struggle with accuracy and scalability, leading to delayed or missed diagnoses. Enter EG-VAN – a groundbreaking AI system achieving 98.20% accuracy in classifying nine skin cancer types. This breakthrough technology promises faster, more reliable detection, potentially saving thousands of lives annually.

Why Skin Cancer Diagnosis Needs an AI Revolution

Melanoma, the deadliest skin cancer, caused 57,000 global deaths in 2020 alone. By 2040, cases could surge by 50%. Current diagnostic hurdles include:

  • Intra-class Variation: Melanomas can look drastically different (size, color, texture) based on growth stage or patient factors.
  • Inter-class Similarity: Harmless moles often resemble dangerous carcinomas, confusing even experts.
  • Resource Limitations: Scarce specialists and expensive equipment delay screenings.
  • Image Quality Issues: Hair, lighting artifacts, and low contrast obscure critical lesion details.

Existing AI solutions often sacrifice accuracy for speed or vice versa. Most can’t handle diverse datasets or complex multi-class classification effectively.

EG-VAN: The Dual-Path Powerhouse

Researchers from Prince Sultan University and Northeastern University developed EG-VAN (Efficient Global-Vision Attention-Net) to overcome these limitations. Its core innovation lies in a dual-branch architecture that mimics how dermatologists analyze lesions:

  1. The Global Analyst (EfficientNetV2S Branch):
    • Processes images with high efficiency using optimized Fused-MBConv (fused mobile inverted bottleneck) layers.
    • Captures broad patterns and overall lesion structure rapidly.
  2. The Detail Detective (Modified ResNet50 Branch):
    • Enhanced with two specialized modules:
      • Spatial-Context Group Attention (SCGA): Scans lesions horizontally and vertically like a grid search, pinpointing fine-grained details (e.g., irregular borders, subtle color shifts).
      • Non-Local Block (NLB): Identifies long-range dependencies across the entire image – crucial for spotting relationships between distant lesion features.
    • Maintains computational efficiency despite enhanced capabilities.

EG-VAN’s architecture synergizes global context and local precision. (Source: https://ieeexplore.ieee.org/abstract/document/10965692)

The Fusion Advantage: Seeing the Whole Picture

EG-VAN doesn’t just run two models side-by-side. Its Multi-Scale Feature Fusion (MFF) Module intelligently combines outputs from both branches at different processing stages:

  • Integrates coarse-grained (global shape, location) and fine-grained (texture, micro-patterns) features.
  • Uses spatial attention and group-wise mechanisms to weight the most diagnostically relevant information.
  • Creates a comprehensive “feature map” far richer than any single model could produce.

This fusion is key to EG-VAN’s ability to distinguish between highly similar cancer types and recognize diverse presentations within a single class.

Seeing Clearly: Advanced Color Balancing

Dermoscopic images often suffer from color distortions due to lighting or equipment. EG-VAN incorporates a novel preprocessing step inspired by Gray World theory and Retinex theory (a minimal code sketch follows the two steps below):

  1. Gray World Adjustment: Corrects overall color casts by assuming the average image hue should be neutral.
  2. Retinex Enhancement: Separates lighting effects from true lesion reflectance, boosting contrast and revealing hidden details.
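The paper’s exact formulation is not reproduced here, but a minimal Python sketch of the two steps (Gray World correction followed by a single-scale Retinex pass, using NumPy and OpenCV) could look as follows; the sigma value and the rescaling back to the 0–255 range are illustrative assumptions:

import numpy as np
import cv2

def gray_world(image):
    """Gray World correction: scale each channel so its mean matches the global mean."""
    img = image.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / (channel_means + 1e-8)
    return np.clip(img * gains, 0, 255).astype(np.uint8)

def single_scale_retinex(image, sigma=80):
    """Retinex enhancement: subtract a Gaussian illumination estimate in log space."""
    img = image.astype(np.float64) + 1.0
    illumination = cv2.GaussianBlur(img, (0, 0), sigma)
    log_reflectance = np.log(img) - np.log(illumination + 1.0)
    # Rescale the log-reflectance back to a displayable 0-255 range (illustrative choice)
    rescaled = (log_reflectance - log_reflectance.min()) / (log_reflectance.max() - log_reflectance.min() + 1e-8)
    return (rescaled * 255).astype(np.uint8)

def color_balance(image):
    """Combined preprocessing: Gray World adjustment followed by Retinex enhancement."""
    return single_scale_retinex(gray_world(image))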

This combination significantly outperformed previous methods (Wiener filters, ESRGAN), achieving the figures below (a snippet showing how such metrics are computed follows the list):

  • 32.5 dB PSNR (vs. 28.1 dB) – Higher signal clarity.
  • 0.92 SSIM (vs. 0.85) – Better structural preservation.
  • 18.4 MSE (vs. 24.9) – Lower distortion.
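These metrics are straightforward to compute with TensorFlow; the generic snippet below (not the paper’s evaluation code) compares a reference image with its processed version:

import tensorflow as tf

def quality_metrics(reference, processed):
    """Return (PSNR, SSIM, MSE) for two uint8 images of shape (H, W, 3)."""
    ref = tf.convert_to_tensor(reference, dtype=tf.float32)
    proc = tf.convert_to_tensor(processed, dtype=tf.float32)
    psnr = tf.image.psnr(ref, proc, max_val=255.0)
    ssim = tf.image.ssim(ref, proc, max_val=255.0)
    mse = tf.reduce_mean(tf.square(ref - proc))
    return float(psnr), float(ssim), float(mse)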

 Color balancing and hair removal dramatically improve lesion visibility for AI analysis.

Fig: (A) Original image; (B) image generated by the gray world algorithm; (C) image generated by Retinex theory; (D) image generated by the combined gray world algorithm + Retinex theory.

Unmatched Performance: The Numbers Speak

EG-VAN was rigorously tested on combined datasets (HAM10000 + ISIC 2017), creating a robust 9-class benchmark:

  • Overall Accuracy: 98.20%
  • Recall (Sensitivity): 95.74%
  • F1-Score: 96.68%
  • ROC-AUC: 99.96%

Class-Specific Excellence:

  • Dermatofibroma (DF) & Vascular Lesions (VASC): 100% Accuracy, Precision, Recall, F1-Score
  • Basal Cell Carcinoma (BCC): 99.77% Accuracy, 98.04% Recall
  • Melanoma (MEL): 98.20% Accuracy, 92.79% Recall
  • Benign vs. Malignant Distinction: 99.85% Accuracy

Outperforming the Competition:

Model | Accuracy (9-class) | Recall (9-class) | Key Limitation
EG-VAN (Proposed) | 98.20% | 95.74% | –
MAFCNN-SCD [26] | 92.22% | 77.07% | Lower recall
Modified ResNet50 [28] | 86% | 86% | Modest accuracy
Ensemble + Max Voting [33] | 95.80% | 95.04% | Lower accuracy
Xception+ResNet50+… [45] | 96.50% | 92.60% | Lower recall

Ablation Studies Prove Value: Removing core components caused significant drops:

  • -MFF Module: Accuracy fell to 96.92%
  • -SCGA Module: Accuracy fell to 97.60%
  • -NLB Module: Accuracy fell to 97.75%

Real-World Impact: Beyond the Lab

EG-VAN isn’t just an academic exercise. Its design prioritizes clinical utility:

  • Computational Efficiency: Despite advanced features, parameter count remains optimized, enabling potential deployment on hospital-grade hardware (not just supercomputers).
  • Explainability: Grad-CAM visualizations (Figures 13/14 in the paper) show EG-VAN focuses precisely on clinically relevant lesion areas (e.g., asymmetric regions, color variegation), building trust with doctors; a minimal Grad-CAM sketch follows this list.
  • Handling Imbalance: Using Focal Loss instead of standard cross-entropy ensures the model doesn’t ignore rarer cancer types (like AKIEC) in favor of common ones (like NV).
  • Scalability: Successfully handles both 7-class and expanded 9-class datasets, showing adaptability to incorporate new lesion types.
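Grad-CAM itself is easy to reproduce for any Keras classifier. The sketch below is a generic implementation, not code released with the paper; conv_layer_name must be set to the name of a late convolutional layer in the trained model, which depends on how the network was built:

import tensorflow as tf

def grad_cam(model, image, conv_layer_name, class_index=None):
    """Compute a Grad-CAM heatmap (values in [0, 1]) for a single (224, 224, 3) image."""
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)      # (1, h, w, c)
    weights = tf.reduce_mean(grads, axis=(1, 2))      # per-channel importance, (1, c)
    cam = tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)[0]
    cam = tf.nn.relu(cam)                             # keep only positive evidence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

The resulting heatmap can be resized to 224 x 224 and overlaid on the input image to verify that the model attends to the lesion rather than to hair, rulers, or background skin.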

The Future of AI-Powered Dermatology

EG-VAN represents a paradigm shift. Its dual-path, attention-driven approach tackles the core challenges of skin cancer recognition that stymied earlier AI systems. By achieving near-human (and often super-human) accuracy while providing visual explanations, it paves the way for:

  1. Primary Care Augmentation: GPs could use EG-VAN-powered tools for initial screenings, referring only high-risk cases to dermatologists.
  2. Tele-Dermatology Enhancement: Enable reliable remote diagnosis, especially in underserved areas.
  3. Faster Specialist Workflows: Assist dermatologists by prioritizing urgent cases and providing second opinions.
  4. Early Detection Programs: Integrate into mobile or community screening initiatives for broader population coverage.

If you’re interested in skin cancer detection using Horizontal and Vertical Attention, you may also find this article helpful: Enhancing Skin Lesion Detection Accuracy

Ready to See the Future of Dermatology?

The battle against skin cancer demands faster, more accurate tools. EG-VAN demonstrates the immense potential of sophisticated, clinically-informed AI. Medical institutions and AI developers must collaborate to translate such breakthroughs from research papers into life-saving clinical applications.

  • For Healthcare Leaders: Explore pilot programs integrating advanced diagnostic AI like EG-VAN into your dermatology workflow.
  • For Researchers: Build upon this open-access work (licensed CC BY 4.0) – refine the models, test on broader datasets, and explore real-time deployment.
  • For Clinicians: Stay informed about validated AI tools that can enhance, not replace, your expert judgment.

Prioritize innovation in skin cancer diagnosis. Support the development, validation, and ethical deployment of AI systems like EG-VAN. The next leap in survival rates starts now. Explore the Full Research | Contact the Researchers

Based on the detailed information provided in the paper, the following TensorFlow/Keras code is a reconstruction sketch of the proposed model. It assumes TensorFlow 2.x, and the exact feature-extraction layers, channel widths, and fusion details are inferred where the paper does not fully specify them.

import tensorflow as tf
from tensorflow.keras import layers, Model

def SCGA_module(input_tensor, groups=8):
    """Spatial-Context Group Attention Module"""
    channels = input_tensor.shape[-1]
    
    # Spatial Attention Branch: squeeze the features down to a single spatial mask
    x = layers.Conv2D(64, (1, 1), activation='relu')(input_tensor)
    x = layers.Conv2D(16, (1, 1), dilation_rate=2, activation='relu')(x)
    x = layers.Conv2D(8, (1, 1), activation='relu')(x)
    attn_mask = layers.Conv2D(1, (1, 1), activation='sigmoid')(x)
    
    # Global Context Integration: modulate the mask by its global average
    pooled = layers.GlobalAveragePooling2D(keepdims=True)(attn_mask)
    attn_mask = layers.Multiply()([attn_mask, pooled])
    attn_mask = layers.Conv2D(channels, (1, 1), activation='linear')(attn_mask)
    
    # Apply spatial attention
    x = layers.Multiply()([input_tensor, attn_mask])
    
    # Group-wise Mean-Max Attention along the horizontal and vertical axes
    group_size = channels // groups
    group_outputs = []
    
    for g in range(groups):
        group = x[:, :, :, g * group_size:(g + 1) * group_size]
        
        # Horizontal context: pool across the height axis -> (B, 1, W, C_g)
        h_mean = tf.reduce_mean(group, axis=1, keepdims=True)
        h_max = tf.reduce_max(group, axis=1, keepdims=True)
        h_feat = tf.concat([h_mean, h_max], axis=-1)
        
        # Vertical context: pool across the width axis -> (B, H, 1, C_g)
        v_mean = tf.reduce_mean(group, axis=2, keepdims=True)
        v_max = tf.reduce_max(group, axis=2, keepdims=True)
        v_feat = tf.concat([v_mean, v_max], axis=-1)
        
        # Per-direction attention scores, combined by broadcasting to (B, H, W, 1)
        h_attn = layers.Conv2D(1, (1, 1))(h_feat)   # (B, 1, W, 1)
        v_attn = layers.Conv2D(1, (1, 1))(v_feat)   # (B, H, 1, 1)
        combined_attn = tf.sigmoid(h_attn + v_attn)
        
        # Apply attention to this channel group
        attended_group = layers.Multiply()([group, combined_attn])
        group_outputs.append(attended_group)
    
    return layers.Concatenate(axis=-1)(group_outputs)

def NonLocalBlock(input_tensor):
    """Non-Local Attention Block (self-attention over all spatial positions)"""
    channels = input_tensor.shape[-1]
    height, width = input_tensor.shape[1], input_tensor.shape[2]
    
    # Query, Key, Value transformations (channel-reduced for efficiency)
    theta = layers.Conv2D(channels // 8, (1, 1))(input_tensor)
    phi = layers.Conv2D(channels // 8, (1, 1))(input_tensor)
    g = layers.Conv2D(channels // 2, (1, 1))(input_tensor)
    
    # Flatten the spatial dimensions for matrix operations
    theta = layers.Reshape((height * width, channels // 8))(theta)
    phi = layers.Reshape((height * width, channels // 8))(phi)
    g = layers.Reshape((height * width, channels // 2))(g)
    
    # Attention map over all pairs of positions: (B, HW, HW)
    attn = tf.matmul(theta, phi, transpose_b=True)
    attn = tf.nn.softmax(attn, axis=-1)
    
    # Weighted sum of values, restored to the spatial grid
    y = tf.matmul(attn, g)
    y = layers.Reshape((height, width, channels // 2))(y)
    y = layers.Conv2D(channels, (1, 1))(y)
    
    # Residual connection
    return layers.Add()([input_tensor, y])

def MFF_module(efficientnet_feat, resnet_feat):
    """Multi-Scale Feature Fusion Module"""
    # EfficientNet branch processing
    eff_attn = layers.Conv2D(256, (1, 1), activation='relu')(efficientnet_feat)
    
    # ResNet branch processing
    res_attn = layers.Conv2D(256, (1, 1), activation='relu')(resnet_feat)
    
    # Concatenate and process
    x = layers.Concatenate()([eff_attn, res_attn])
    x = layers.SeparableConv2D(512, (3, 3), strides=2, padding='same')(x)
    x = layers.BatchNormalization()(x)
    
    return SCGA_module(x)

def build_modified_resnet50():
    """Build the ResNet50-style branch with SCGA and Non-Local blocks.
    
    Returns a model whose outputs are the feature maps after each stage,
    so they can be fused with the EfficientNetV2S branch.
    """
    input_tensor = layers.Input(shape=(224, 224, 3))
    
    def bottleneck(x, filters, out_filters, strides=1):
        """Simplified bottleneck block (BatchNormalization omitted for brevity)."""
        residual = x
        x = layers.Conv2D(filters, (1, 1), activation='relu')(x)
        x = layers.Conv2D(filters, (3, 3), strides=strides, padding='same', activation='relu')(x)
        x = layers.Conv2D(out_filters, (1, 1))(x)
        if residual.shape[-1] != out_filters or strides != 1:
            residual = layers.Conv2D(out_filters, (1, 1), strides=strides)(residual)
        return layers.Add()([x, residual])
    
    # Stage 0: stem (224 -> 56)
    x = layers.Conv2D(64, (7, 7), strides=2, padding='same', activation='relu')(input_tensor)
    x = layers.MaxPooling2D((3, 3), strides=2, padding='same')(x)
    
    # Stage 1 (56 x 56), followed by SCGA
    for _ in range(3):
        x = bottleneck(x, 64, 256)
    x = SCGA_module(x)
    stage1 = x
    
    # Stage 2 (28 x 28), followed by SCGA
    for i in range(4):
        x = bottleneck(x, 128, 512, strides=2 if i == 0 else 1)
    x = SCGA_module(x)
    stage2 = x
    
    # Stage 3 (14 x 14), followed by a Non-Local block
    for i in range(6):
        x = bottleneck(x, 256, 1024, strides=2 if i == 0 else 1)
    x = NonLocalBlock(x)
    stage3 = x
    
    # Stage 4 (7 x 7), followed by a Non-Local block
    for i in range(3):
        x = bottleneck(x, 512, 2048, strides=2 if i == 0 else 1)
    x = NonLocalBlock(x)
    stage4 = x
    
    return Model(input_tensor, [stage1, stage2, stage3, stage4])

def build_egvan(num_classes=9):
    """Build the complete EG-VAN model"""
    input_tensor = layers.Input(shape=(224, 224, 3))
    
    # Global branch: EfficientNetV2S backbone built on the shared input
    effnetv2s = tf.keras.applications.EfficientNetV2S(
        include_top=False, weights='imagenet', input_tensor=input_tensor)
    
    # Intermediate EfficientNetV2S feature maps (layer names from the Keras implementation)
    effnet_features = [
        effnetv2s.get_layer('block1a_project_conv').output,
        effnetv2s.get_layer('block2b_expand_conv').output,
        effnetv2s.get_layer('block4a_expand_conv').output,
        effnetv2s.get_layer('block6a_expand_conv').output,
    ]
    
    # Local branch: modified ResNet50 applied to the same input, returning
    # one feature map per stage so the graph stays connected
    resnet_features = build_modified_resnet50()(input_tensor)
    
    # Multi-Scale Feature Fusion at each of the four stages
    fused_features = []
    for eff_feat, res_feat in zip(effnet_features, resnet_features):
        # Project both branches to a common channel width
        eff_feat = layers.Conv2D(256, (1, 1))(eff_feat)
        res_feat = layers.Conv2D(256, (1, 1))(res_feat)
        # Align spatial resolutions before fusion
        target_h, target_w = eff_feat.shape[1], eff_feat.shape[2]
        if (res_feat.shape[1], res_feat.shape[2]) != (target_h, target_w):
            res_feat = layers.Resizing(target_h, target_w)(res_feat)
        fused_features.append(MFF_module(eff_feat, res_feat))
    
    # Feature aggregation
    x = layers.Concatenate()([
        layers.GlobalAveragePooling2D()(f) for f in fused_features
    ])
    
    # Classification head
    x = layers.Dense(512, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    output = layers.Dense(num_classes, activation='softmax')(x)
    
    return Model(input_tensor, output)

# Compile the model (CategoricalFocalCrossentropy requires TensorFlow/Keras >= 2.13)
model = build_egvan()
model.compile(
    optimizer=tf.keras.optimizers.Adamax(learning_rate=0.001),
    loss=tf.keras.losses.CategoricalFocalCrossentropy(alpha=0.25, gamma=2),
    metrics=['accuracy', tf.keras.metrics.Recall(name='recall')]
)

model.summary()
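As a quick sanity check with untrained weights (no claim about accuracy), a single forward pass on a random batch confirms that the dual-branch graph connects end to end and produces a 9-way probability vector:

# Forward pass on a random batch to verify the graph wiring and the output shape
dummy_batch = tf.random.uniform((1, 224, 224, 3))
predictions = model(dummy_batch, training=False)
print(predictions.shape)  # expected: (1, 9)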
