Transforming Diabetic Foot Ulcer Care with AI-Powered Healing Phase Classification

Revolutionizing Diabetic Foot Ulcer Management: How Machine Learning Classifies Healing Phases Using Clinical Metadata

Diabetic foot ulcers (DFUs) are one of the most severe and costly complications of diabetes, affecting up to 25% of people with the condition during their lifetime. Left untreated or mismanaged, DFUs can progress to infection, gangrene, and ultimately lead to lower-limb amputation—a devastating outcome that impacts millions globally. Despite advances in wound care, a critical gap remains: the inability to accurately assess the biological healing phase of a DFU in real time, especially in resource-limited settings.

Traditional classification systems like the Wagner or University of Texas (UT) scales focus on structural depth and presence of infection but offer little insight into the underlying cellular processes driving healing. This limitation hinders personalized treatment, as therapies effective in one healing phase may be ineffective—or even harmful—in another.

Now, groundbreaking research by Basiri et al., published in Computers in Biology and Medicine, introduces an accessible, automated machine learning (ML) model that classifies DFU healing phases—inflammation, proliferation, and remodeling—using only routinely collected clinical metadata. This innovation promises to democratize advanced wound assessment, enabling early triage, continuous monitoring, and optimized treatment without reliance on expensive lab tests or imaging.

In this comprehensive guide, we’ll unpack the study’s methodology, results, and clinical implications, presenting actionable insights for healthcare providers, researchers, and patients. We’ll also explore the key features that drive accurate classification and how this AI-powered approach can transform diabetic foot care worldwide.

Why Healing Phase Classification Matters in Diabetic Foot Ulcers

The Three Stages of Wound Healing: A Biological Roadmap

Wound healing is not a linear checklist—it’s a dynamic, overlapping process governed by specific cellular and molecular events. Understanding which phase a DFU is in allows clinicians to tailor treatments precisely:

Inflammatory Phase (I):
- Characterized by neutrophil infiltration, platelet aggregation, and initial hemostasis.
- Goal: Control inflammation without suppressing necessary immune activity.
- Treatments: Debridement, anti-inflammatory agents, infection control.
Proliferative Phase (P):
- Marked by angiogenesis, fibroblast proliferation, collagen synthesis, and re-epithelialization.
- Goal: Promote tissue growth and vascularization.
- Treatments: Growth factors, moist wound dressings, offloading.
Remodeling Phase (R):
- Focuses on collagen maturation, tissue strengthening, and scar formation.
- Goal: Restore functional integrity.
- Treatments: Compression therapy, physical rehabilitation, long-term monitoring.

Key Insight: Phase-specific interventions—such as growth factor therapy during proliferation or controlled inflammation modulation—have been shown to significantly improve healing outcomes over generic wound care.

Limitations of Traditional DFU Classification Systems

While widely used, systems like Wagner Grade and UT Classification have notable drawbacks:

SYSTEM	FOCUS	LIMITATION
Wagner	Depth of ulcer, presence of bone involvement, gangrene	Does not reflect biological healing status; static assessment
UT Classification	Infection, ischemia, depth	Lacks granularity on cellular progression; not designed for phase-based treatment

These systems are essential for risk stratification but fall short in guiding biologically informed therapy.

Bridging the Gap: Machine Learning Meets Clinical Metadata

The Vision: Accessible, Real-Time Healing Assessment

The study by Basiri et al. addresses a critical unmet need: Can we classify DFU healing phases using only data that’s already collected during routine clinic visits?

Unlike prior ML models that rely on lab biomarkers (e.g., C-reactive protein, IL-10) or medical imaging—both costly and inaccessible in many settings—this research leverages 80 readily available clinical features, including:

Patient demographics (age, sex, BMI)
Comorbidities (neuropathy, heart disease)
Wound characteristics (exudate, odor, tunneling)
Peri-ulcer conditions (erythema, maceration)
Offloading methods and dressing types

By focusing on electronic health record (EHR)-ready data, the model ensures scalability across diverse healthcare environments—from urban hospitals to rural clinics and home care.

Methodology: Building an Explainable AI Model for Healing Phase Prediction

Dataset Overview: The Zivot DFU Cohort

The model was trained and validated on the Zivot DFU dataset, a rich retrospective collection from 268 patients and 890 wound assessment events at the Zivot Limb Preservation Centre in Calgary, Canada.

Key Dataset Characteristics:

Healing Phase Labels: 9 detailed labels grouped into 3 main classes:
- Inflammation (I): 31%
- Proliferative (P): 55.7%
- Remodeling (R): 13.3%
Features: 58 original + 22 engineered = 80 total features
Exclusions: Patients with gangrene, eschar, or wounds too deep for visual inspection were excluded to ensure reliable phase labeling.

Feature Engineering: Enhancing Predictive Power

To maximize model performance, the researchers applied several transformations:

1. Body Mass Index (BMI) Calculation

\[ \text{BMI} = \frac{\text{Weight (kg)}}{\text{Height (m)}^2} \]

2. Binning Continuous Variables

Age: grouped in 20-year increments
Weight: 20 kg bins
Height: 10 cm bins
Additional binary flag: Age > 60 (a known DFU risk factor)

3. Temperature Normalization

Temperatures were normalized to each patient’s intact skin temperature to account for individual variability.

4. Composite Score Features

Several categorical variables were aggregated into summary scores:

SCORE TYPES	COMPONENTS
Foot Bone Score	Total toe/arch deformities
Foot General Score	All foot abnormalities
Offloading Score	Number of offloading methods used
Habit Score	Smoking + alcohol consumption ratings
Clinical Score	Diabetes type, heart condition, cancer, neuropathy
Ulcer Score	Number of peri-ulcer conditions present

These scores helped reduce dimensionality while preserving clinical relevance.

5. Categorical Encoding

One-hot encoding was applied to categorical variables (e.g., dressing type, location).

Machine Learning Models Evaluated

Five ML algorithms were tested under two classification frameworks:

Standard Multiclass Classification
Ordinal One-vs-Rest (OvR) – leveraging the natural order of healing phases

Algorithms Tested:

Support Vector Machine (SVM)
XGBoost
CatBoost
Neural Networks (NNs)
Random Forest (RF)

Evaluation Framework: Leave-One-Patient-Out Cross-Validation (LOPOCV)

To prevent data leakage and ensure generalizability, the team used LOPOCV:

At each fold, one entire patient (and all their follow-up visits) was held out for testing.
The model was trained on the remaining 267 patients.
This mimics real-world deployment where predictions are made on new patients.

Performance was measured using:

AUC-ROC (Area Under the Curve)
Accuracy
F1-Score (harmonic mean of precision and recall, ideal for imbalanced data)

Results: The Best-Performing Model and Key Insights

Top Model: OvR with Random Forest (OvR-RF)

MODEL	AVG. AUC	ACCURACY	AVG. F1
OvR-RF	0.66	0.65	0.59
CatBoost	0.64	0.62	0.56
RF (standard)	0.61	0.62	0.51
XGBoost	0.51	0.25	0.25

✅ OvR-RF outperformed all others, demonstrating that treating the problem as two sequential binary decisions improves accuracy:

clf1: Inflammation vs. (Proliferation + Remodeling)

clf2: Proliferation vs. Remodeling

Feature Importance Analysis Using SHAP

The team used SHapley Additive exPlanations (SHAP) to identify the most influential features. SHAP values quantify how much each feature contributes to pushing the prediction toward a specific class.

Top 22 Most Impactful Features (Prescription-Inclusive)

RANK	FEATURES	SHAPE VALUE	INTERPRETATION
1	Primary Dressing Category	141.1	Antiseptic dressings strongly linked to active healing phases
2	Peri-Ulcer Temperature	90.3	Higher temps correlate with inflammation
3	Age	28.8	Older age associated with slower progression
4	Exudate Appearance (Bloody)	22.5	Bright red exudate indicates active inflammation
5	Wound Tunneling	22.1	Severe tunneling delays progression to remodeling

🔍 Key Finding: Only 22 features were needed to achieve ~63% accuracy, statistically similar to the full 56-feature model. This minimal set enhances deployability in low-resource settings.

Prescription-Inclusive vs. Prescription-Exclusive Models

To avoid potential data leakage (e.g., dressings prescribed based on lab results), two scenarios were analyzed:

APPROACH	# FEATURES	ACCURACY	KEY FEATURES INCLUDED
Prescription-Inclusive	56	0.65	Dressing type, offloading method
Prescription-Exclusive	45	0.61	Only intrinsic wound/patient traits

Even without treatment data, the model maintained strong performance, proving its robustness.

What the Data Reveals: Healing Phase-Specific Clinical Signatures

Using a one-vs-all binary classification, the researchers identified unique predictors for each phase.

Table: Phase-Specific Predictive Features

Inflammation (I)	Proliferative (P)	Remodeling (R)
• Bloody exudate	• Moist exudate	• Low exudate
• Peri-ulcer erythema	• Normalized wound temp	• Age > 60
• Maceration	• No tunneling	• Arch deformity absence
• High peri-ulcer temp	• Absorbent dressing	• Therapeutic footwear use
• Severe tunneling	• Midfoot location	• Callus presence
• Pale peri-ulcer skin	• DFU count = 1	• Foot bone score low

💡 Insight: The remodeling phase was classified with higher accuracy than inflammation, likely because its clinical signs (stable tissue, reduced exudate, mature granulation) are more distinct and less variable than the dynamic, fluctuating markers of inflammation.

Why This Approach Is a Game-Changer for Diabetic Foot Care

1. Accessibility: No Labs, No Imaging, Just EHR Data

Unlike previous studies that required biomarkers like Interleukin-10 (IL-10) or C-reactive protein (CRP), this model uses only routinely documented clinical observations—making it viable for:

Community clinics
Telehealth platforms
Home nursing assessments
Low-income regions

2. Actionable Intelligence for Personalized Treatment

By identifying the current healing phase, clinicians can:

Avoid unnecessary antibiotics in non-infected inflammatory wounds
Initiate growth factor therapy during proliferation
Transition to scar management in remodeling
Adjust offloading strategies based on phase-specific needs

3. Continuous Medical Triage System

When integrated into EHRs, this model enables:

Automated alerts for stalled healing
Early intervention before complications arise
Longitudinal tracking of healing progression

Addressing Challenges and Limitations

Class Imbalance: Proliferative Phase Dominance

With 55.7% of cases in the proliferative phase, the model could bias toward this class. To mitigate:

Random oversampling was applied
F1-score prioritized over accuracy for evaluation

Missing Data: KNN Imputation Strategy

Average missing data per feature: 2.5%
Imputed using K-Nearest Neighbors (KNN) with k=5, trained only on the training set to prevent leakage.

Ethical and Privacy Considerations

Data cannot be publicly shared due to ethics restrictions.
However, preprocessing code and feature descriptions are available on GitHub: DFUMetaClassification
The team is working toward controlled dataset release for research validation.

Future Directions and Clinical Integration

Potential Applications:

Smart EHR dashboards with real-time healing phase indicators
Mobile apps for nurses to input wound data and receive AI-driven guidance
Integration with wearable sensors (e.g., smart bandages with temperature monitoring)

Research Opportunities:

Expand to multimodal models combining metadata with wound images
Validate in diverse populations across different healthcare systems
Develop phase-specific treatment recommendation engines

Conclusion: A New Era in Diabetic Wound Management

The study by Basiri et al. represents a paradigm shift in DFU care. By harnessing machine learning and routinely collected clinical metadata, it delivers a reliable, accessible, and automated method to classify healing phases—something previously reserved for specialized labs.

Its success lies not just in algorithmic performance, but in clinical relevance:

It identifies 22 essential features sufficient for accurate classification.
It highlights age, temperature, exudate, and tunneling as key drivers.
It proves that treatment-independent assessment is possible and effective.

This approach paves the way for early triage, personalized therapy, and improved outcomes, reducing the global burden of DFU-related amputations.

📣 Call to Action: Join the Conversation

Are you a clinician, researcher, or patient advocate working in diabetic foot care?
👉 Share your thoughts below: How could AI-powered healing phase classification impact your practice or community?

🔗 Explore the code: Visit the DFUMetaClassification GitHub repo to see the preprocessing and modeling pipeline.

📚 Stay updated: Subscribe for future insights on AI in wound care, diabetic management, and digital health innovation.

👉 Read the Complete Paper here.

Together, we can turn data into healing—and prevent amputations, one smart decision at a time.

Based on the paper “Accessible healing phase classification of diabetic foot ulcer” by Basiri et al., here is a complete Python implementation of the proposed Ordinal Random Forest model.

#
# ==============================================================================
# Implementation of "Accessible healing phase classification of diabetic foot ulcer"
# Paper by: Reza Basiri et al. (Computers in Biology and Medicine, 2025)
# Code Author: Gemini
# Date: September 16, 2025
#
# This script reconstructs the proposed Ordinal Random Forest model based on the
# methodology described in the paper. Since the Zivot DFU dataset is not public,
# a mock dataset is generated to demonstrate the model's functionality.
# ==============================================================================

import numpy as np
import pandas as pd
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import KNNImputer
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.metrics import classification_report, accuracy_score
from imblearn.over_sampling import RandomOverSampler
from imblearn.pipeline import Pipeline as ImbPipeline

# --- 1. Custom Ordinal Classifier ---
# Implements the One-versus-Rest (OvR) approach described in Section 2.3 and
# the probability calculation from Figure 3 of the paper[cite: 182, 188].

class OrdinalRandomForest(BaseEstimator, ClassifierMixin):
    """
    An ordinal classifier using two binary Random Forest models.

    This classifier follows the structure proposed in the paper where the
    3-class ordinal problem (Inflammation < Proliferation < Remodeling) is
    decomposed into two binary classification problems.
    """
    def __init__(self, **kwargs):
        # Using default RandomForestClassifier parameters as mentioned in the paper [cite: 184]
        self.clf1 = RandomForestClassifier(**kwargs, random_state=42)
        self.clf2 = RandomForestClassifier(**kwargs, random_state=42)
        self.classes_ = np.array([0, 1, 2]) # 0: I, 1: P, 2: R

    def fit(self, X, y):
        """
        Fits the two internal classifiers based on ordinal targets.
        Target y is expected to be encoded as 0 (Inflammation),
        1 (Proliferation), 2 (Remodeling).
        """
        # Create binary targets for the two classifiers
        # clf1 distinguishes Inflammation (0) from Proliferation/Remodeling (>0)
        y1 = (y > 0).astype(int)

        # clf2 distinguishes Inflammation/Proliferation (<2) from Remodeling (2)
        y2 = (y > 1).astype(int)

        self.clf1.fit(X, y1)
        self.clf2.fit(X, y2)
        return self

    def predict_proba(self, X):
        """
        Calculates class probabilities based on the formula from Figure 3[cite: 187].
        Note: The paper's formula `prob_P = clf1_prob * (1 - clf2_prob)` is non-standard
        and may not result in probabilities that sum perfectly to 1. A standard approach
        is `prob_P = prob(>I) - prob(>P)`. However, we implement the paper's specified method.
        """
        # Get probabilities from the binary classifiers
        # clf1_prob is the probability of the class being > Inflammation
        # clf2_prob is the probability of the class being > Proliferation (i.e., Remodeling)
        clf1_prob = self.clf1.predict_proba(X)[:, 1]
        clf2_prob = self.clf2.predict_proba(X)[:, 1]

        # Calculate probabilities for each of the three main classes
        prob_I = 1 - clf1_prob
        prob_P = clf1_prob * (1 - clf2_prob)
        prob_R = clf2_prob

        # Combine probabilities into a single array
        probs = np.vstack([prob_I, prob_P, prob_R]).T
        
        # Normalize probabilities to ensure they sum to 1
        return probs / probs.sum(axis=1, keepdims=True)


    def predict(self, X):
        """
        Predicts the class label by selecting the class with the highest probability.
        """
        return np.argmax(self.predict_proba(X), axis=1)

# --- 2. Data Simulation ---
# Generate a mock dataset as the real Zivot DFU dataset is not available.
# The features are the top 22 identified in the "prescription inclusive"
# scenario from Figure 5 (top)[cite: 478].

def create_mock_dfu_dataset(n_patients=268, n_records=890):
    """
    Creates a simulated DataFrame with the structure of the Zivot DFU dataset.
    """
    print(f"Generating mock dataset with {n_patients} patients and {n_records} records...")
    # List of the 22 essential features from Figure 5 (top) [cite: 478]
    features = [
        "Ulcer Dressing", "Temperature: Norm. Peri-ulcer", "Age", "Exudate Appearance",
        "Tunneling", "Temperature: Intact Skin", "BMI", "Ulcer Onset (Days)",
        "Height (cm)", "History of Cancer", "Peri-ulcer: Erythema",
        "Temperature: Norm. Ulcer Centre", "Side", "Weight (kg)", "Ulcer Score",
        "Offloading: Therapeutic Footwear", "Location", "Peri-ulcer: Maceration",
        "Sex", "Offloading Score", "Exudate Amount", "Type of Diabetes"
    ]

    patient_ids = np.sort(np.random.randint(1, n_patients + 1, n_records))

    data = {
        'patient_id': patient_ids,
        # Target variable with imbalance similar to paper: 31% I, 55.7% P, 13.3% R [cite: 126]
        'healing_phase': np.random.choice(['Inflammation', 'Proliferation', 'Remodeling'],
                                          size=n_records, p=[0.31, 0.557, 0.133]),
        # Numerical features
        "Temperature: Norm. Peri-ulcer": np.random.uniform(-0.5, 0.5, n_records),
        "Age": np.random.randint(40, 85, n_records),
        "Temperature: Intact Skin": np.random.uniform(28, 35, n_records),
        "BMI": np.random.uniform(20, 45, n_records),
        "Ulcer Onset (Days)": np.random.randint(10, 1500, n_records),
        "Height (cm)": np.random.uniform(150, 190, n_records),
        "Temperature: Norm. Ulcer Centre": np.random.uniform(-1, 0.5, n_records),
        "Weight (kg)": np.random.uniform(60, 130, n_records),
        "Ulcer Score": np.random.randint(0, 5, n_records),
        "Offloading Score": np.random.randint(0, 3, n_records),
        # Categorical features
        "Ulcer Dressing": np.random.choice(['None', 'Antiseptic', 'Advanced', 'Absorbent'], size=n_records),
        "Exudate Appearance": np.random.choice(['Serous', 'Bloody', 'Thick', 'Haemoserous'], size=n_records),
        "Tunneling": np.random.choice(['None', 'Minor', 'Medium', 'Severe'], size=n_records),
        "History of Cancer": np.random.choice(['Yes', 'No'], size=n_records, p=[0.11, 0.89]),
        "Peri-ulcer: Erythema": np.random.choice(['Yes', 'No'], size=n_records),
        "Side": np.random.choice(['Left', 'Right'], size=n_records),
        "Offloading: Therapeutic Footwear": np.random.choice(['Yes', 'No'], size=n_records),
        "Location": np.random.choice(['Hallux', 'Toes', 'Midfoot', 'Heel', 'Ankle'], size=n_records),
        "Peri-ulcer: Maceration": np.random.choice(['Yes', 'No'], size=n_records),
        "Sex": np.random.choice(['Male', 'Female'], size=n_records, p=[0.79, 0.21]),
        "Exudate Amount": np.random.choice(['None', 'Moist', 'Heavy'], size=n_records),
        "Type of Diabetes": np.random.choice([1, 2], size=n_records, p=[0.17, 0.83]),
    }

    df = pd.DataFrame(data)

    # Introduce missing values (~2.5% avg as per paper) [cite: 516]
    for col in df.columns:
        if col not in ['patient_id', 'healing_phase']:
            mask = np.random.rand(len(df)) < 0.025
            df.loc[mask, col] = np.nan
            
    return df

# --- 3. Preprocessing and Model Pipeline ---
# Define numerical and categorical features for preprocessing
df = create_mock_dfu_dataset()

numerical_features = df.select_dtypes(include=np.number).columns.drop('patient_id').tolist()
categorical_features = df.select_dtypes(include=['object', 'category']).columns.drop('healing_phase').tolist()

# The paper used KNN Imputer with k=5 [cite: 127] and one-hot encoding [cite: 162]
preprocessor = ColumnTransformer(
    transformers=[
        ('num', KNNImputer(n_neighbors=5), numerical_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)
    ],
    remainder='passthrough'
)

# --- 4. LOPOCV Evaluation ---
# Use Leave-One-Patient-Out Cross-Validation (LOPOCV) as described in Section 2.4 [cite: 194]

X = df.drop(columns=['healing_phase', 'patient_id'])
# Map target labels to numerical values for the ordinal classifier
y = df['healing_phase'].map({'Inflammation': 0, 'Proliferation': 1, 'Remodeling': 2})
groups = df['patient_id']

# Define the cross-validation strategy
lopocv = LeaveOneGroupOut()

# Lists to store results from each fold
all_true_labels = []
all_pred_labels = []

print("\nStarting Leave-One-Patient-Out Cross-Validation...")
for i, (train_index, test_index) in enumerate(lopocv.split(X, y, groups)):
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]

    # Create the full pipeline including preprocessing, oversampling, and classification
    # RandomOversampler is used to address class imbalance [cite: 185]
    # The OrdinalRandomForest is our custom classifier
    model_pipeline = ImbPipeline(steps=[
        ('preprocessor', preprocessor),
        ('sampler', RandomOverSampler(random_state=42)),
        ('classifier', OrdinalRandomForest())
    ])

    # Fit the pipeline on the training data of the current fold
    model_pipeline.fit(X_train, y_train)

    # Make predictions on the test data (the held-out patient)
    y_pred = model_pipeline.predict(X_test)

    all_true_labels.extend(y_test)
    all_pred_labels.extend(y_pred)
    
    # Provide progress update
    if (i + 1) % 25 == 0:
        print(f"  Completed fold {i+1}/{groups.nunique()}...")
        
print("Cross-validation finished.")

# --- 5. Final Results ---
print("\n--- Model Evaluation Results ---")
target_names = ['Inflammation', 'Proliferation', 'Remodeling']
# Calculate final accuracy
accuracy = accuracy_score(all_true_labels, all_pred_labels)
print(f"Overall Accuracy (similar to Table 3/4): {accuracy:.2f}")

# Display detailed classification report
print("\nClassification Report (F1 scores similar to Table 3):")
print(classification_report(all_true_labels, all_pred_labels, target_names=target_names))
print("="*40)
print("NOTE: These results are based on simulated data. Performance will differ on the real dataset.")
print("The code structure, however, faithfully represents the paper's methodology.")

Related posts, You May like to read

Share on Facebook

Post on X

Save