5 Revolutionary Breakthroughs in AI Safety: How CONFIDERAI Eliminates Prediction Failures While Boosting Trust (But Watch Out for Hidden Risks)

Illustration of CONFIDERAI score function analyzing overlapping decision rules in a 2D feature space, highlighting high-risk prediction zones and conformal critical sets for trustworthy AI.

In the rapidly evolving world of artificial intelligence, one question looms larger than ever: Can we truly trust AI systems when lives are on the line? From detecting DNS tunneling attacks to predicting cardiovascular disease, the stakes have never been higher. While explainable AI (XAI) has made strides in transparency, a critical gap remains — reliability with mathematical guarantees.

Enter CONFIDERAI, a groundbreaking innovation that introduces a fifth pillar of trustworthy AI: conformal guarantee. Unlike traditional models that offer predictions without certainty, CONFIDERAI delivers probabilistic assurances that the system will behave as expected — even in safety-critical environments.

This article dives deep into the revolutionary research behind CONFIDERAI, a novel score function for rule-based binary classification that not only enhances explainability but also provides statistical validation through conformal prediction (CP). We’ll explore how it outperforms existing methods, reduces prediction errors, and unlocks new levels of AI safety — while also addressing potential limitations and real-world implications.

By the end, you’ll understand why this isn’t just another algorithm — it’s a paradigm shift toward truly reliable, interpretable, and safe AI systems.


What Is CONFIDERAI and Why Does It Matter?

In their 2025 paper published in Pattern Recognition, Sara Narteni and colleagues introduce CONFIDERAI (Conformal Interpretable-by-Design Explainable and Reliable Artificial Intelligence), a new score function specifically designed for rule-based binary classifiers.

While most AI systems focus on accuracy or speed, CONFIDERAI prioritizes safety and trustworthiness. It does so by integrating two key components:

  1. Rule Performance Metrics (e.g., covering and error rates)
  2. Geometrical Positioning of data points within and across rule boundaries

This dual approach allows CONFIDERAI to assess not just what a model predicts, but how confident it should be — a crucial distinction in high-stakes domains like healthcare, cybersecurity, and autonomous systems.

Positive Impact: Improved precision, reduced false positives
Potential Risk: Increased computational complexity in high-dimensional spaces


The Five Pillars of Trustworthy AI

Before diving into the technical details, let’s clarify what makes an AI system truly trustworthy. According to the authors, there are five essential pillars:

  1. Robustness – Resilience to noise and adversarial attacks
  2. Transparency – Clear, human-understandable logic
  3. Fairness – Absence of bias across demographic groups
  4. Privacy – Protection of sensitive data
  5. Conformal Guarantee – Probabilistic assurance of correct behavior

While the first four are widely discussed, conformal guarantee is the missing link. CONFIDERAI bridges this gap by embedding statistical confidence directly into rule-based models — systems already prized for their interpretability.


The Problem with Current Rule-Based Models

Rule-based classifiers — such as decision trees or Logic Learning Machines (LLMs) — are celebrated for their native explainability. A rule like “If systolic blood pressure > 140, then predict cardiovascular disease” is easy to understand and validate.

However, these models lack a built-in mechanism to say: “I’m 95% confident in this prediction.” This absence of uncertainty quantification can lead to catastrophic failures in safety-critical applications.

Traditional conformal prediction methods exist, but they’re often tailored for black-box models and fail to leverage the geometrical structure of rule-based systems. They ignore:

  • How close a data point is to a rule’s boundary
  • Whether multiple rules overlap and predict conflicting outcomes
  • The spatial relationships between rules in the feature space

CONFIDERAI solves this by introducing a geometrically-aware score function that accounts for all these factors.
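
To see where that score fits, here is a minimal sketch of the standard split (inductive) conformal prediction workflow it plugs into. The score_fn argument is a placeholder for any nonconformity score, including the CONFIDERAI score defined below; the helper name and signature are illustrative, not taken from the paper's code.

    import numpy as np

    def conformal_prediction_sets(score_fn, X_cal, y_cal, X_new, epsilon=0.05, labels=(0, 1)):
        """Split conformal prediction: calibrate a score threshold, then build prediction sets."""
        # Nonconformity scores of the calibration points paired with their true labels
        cal_scores = np.array([score_fn(x, y) for x, y in zip(X_cal, y_cal)])
        # Finite-sample-corrected (1 - epsilon) quantile of the calibration scores
        n = len(cal_scores)
        q = np.quantile(cal_scores, min(np.ceil((n + 1) * (1 - epsilon)) / n, 1.0))
        # A label enters the prediction set whenever its score does not exceed the threshold
        return [{y for y in labels if score_fn(x, y) <= q} for x in X_new]

With probability at least 1 - ε, the prediction set for a new point contains its true label; singleton sets are the informative ones, and producing more of them for rule-based models is exactly what the CONFIDERAI score aims at.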


The CONFIDERAI Score Function: A Mathematical Breakthrough

At the heart of CONFIDERAI is a novel score function that evaluates how well a data point conforms to a given class label. The higher the score, the less likely the label is correct.

The function combines geometrical positioning and rule relevance into a single, powerful metric.

Step 1: Geometrical Rule Similarity

To quantify rule overlaps, the authors define a geometrical rule similarity measure based on the intersection volume of hyper-rectangles (each representing a rule in D-dimensional space).

For two rules r_k and r_z, the overlap volume of their hyper-rectangles is:

\[ V_{\text{overlap}}(H_{r_k}, H_{r_z}) = \prod_{i=1}^{D} \left| \min(u_{ik}, u_{iz}) - \max(l_{ik}, l_{iz}) \right| \]

Then the geometrical rule similarity q(r_k, r_z) is:

\[ q(r_k, r_z) \triangleq \frac{V_{\text{overlap}}(H_{r_k}, H_{r_z})}{V_H(r_k) + V_H(r_z) - V_{\text{overlap}}(H_{r_k}, H_{r_z})} \]

This ensures that fully overlapping rules have a similarity of 1, while non-overlapping rules score 0.
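
As a quick illustration, here is a hedged sketch of this Jaccard-style similarity for two axis-aligned hyper-rectangles. It assumes each rule's region is given as a (D, 2) array of per-feature (lower, upper) bounds, with features the rule does not constrain simply carrying the global data range; the helper names are illustrative, not the authors' code.

    import numpy as np

    def overlap_volume(bounds_k, bounds_z):
        """Volume of the intersection of two axis-aligned hyper-rectangles (D x 2 bound arrays)."""
        lo = np.maximum(bounds_k[:, 0], bounds_z[:, 0])   # largest lower bound per feature
        hi = np.minimum(bounds_k[:, 1], bounds_z[:, 1])   # smallest upper bound per feature
        edges = np.clip(hi - lo, 0.0, None)               # disjoint on any feature -> zero edge
        return float(np.prod(edges))

    def rule_similarity(bounds_k, bounds_z):
        """Geometrical rule similarity q(r_k, r_z): intersection volume over union volume."""
        inter = overlap_volume(bounds_k, bounds_z)
        vol_k = float(np.prod(bounds_k[:, 1] - bounds_k[:, 0]))
        vol_z = float(np.prod(bounds_z[:, 1] - bounds_z[:, 0]))
        union = vol_k + vol_z - inter
        return inter / union if union > 0 else 0.0

Clipping negative edge lengths to zero is what makes disjoint rules score exactly 0, while identical rules score 1, matching the behaviour described above.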

Step 2: Position Within Rule Boundaries

A point near the edge of a rule is less reliable than one in the center. CONFIDERAI captures this with γ(x, r_k), which measures the normalized distance of the point from the rule's boundaries:

\[ \gamma(x, r_k) = \sum_{i=1}^{D_k} \Big[ 1 - \max \big( d_i^{-}(x, c_{ik}), \; d_i^{+}(x, c_{ik}) \big) \Big] \]

where:

\[ d_i^{-}(x, c_{ik}) = \frac{\lvert x_i - l_{ik} \rvert}{\lvert u_{ik} - l_{ik} \rvert} \]

\[ d_i^{+}(x, c_{ik}) = \frac{\lvert x_i - u_{ik} \rvert}{\lvert u_{ik} - l_{ik} \rvert} \]

with l_{ik} and u_{ik} the lower and upper bounds of rule r_k on feature i, and D_k the number of features the rule constrains.
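
A minimal sketch of this position term, reusing the (lower, upper) bounds representation from the previous snippet (again illustrative rather than the authors' code):

    def gamma(x, bounds):
        """Boundary-distance term: 0 when x sits on a rule boundary, largest deep inside it."""
        total = 0.0
        for xi, (lo, hi) in zip(x, bounds):
            width = hi - lo
            if width == 0:
                continue                         # degenerate interval, contributes nothing
            d_minus = abs(xi - lo) / width       # normalized distance from the lower bound
            d_plus = abs(xi - hi) / width        # normalized distance from the upper bound
            total += 1.0 - max(d_minus, d_plus)  # 0 at either edge, 0.5 at the midpoint
        return total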

Step 3: Final Score Function

The full CONFIDERAI score s(x, y) for a point x and label y is:

\[ s(x,y) \triangleq \prod_{r_k \in R_{y,x}} \hat{\gamma}(x, r_k) \cdot \big(1 - R(r_k)\big) \]

where:

  • R_{y,x} is the set of rules that predict y and are satisfied by x
  • R(r_k) is the rule relevance, R(r_k) = C(r_k) · (1 - E(r_k)), built from the rule's covering C and error E
  • γ̂(x, r_k) is the enhanced geometric term, which adjusts γ(x, r_k) for overlaps with same-class and opposite-class rules

This product-based aggregation reflects shared confidence across overlapping rules, making it more sensitive than sum-based alternatives.
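
Putting the pieces together, here is a hedged sketch of the full score. It reuses the gamma() and rule_similarity() helpers above, assumes each rule carries its bounds, predicted class, and a precomputed relevance R(r_k), and follows the overlap adjustment used in the end-to-end listing at the bottom of this article (gamma_hat = 0.5 * γ * (1 + average similarity to opposite-class rules - average similarity to other same-class rules)). The Rule container and the score of 1.0 for the "no supporting rule" case are illustrative conventions, not the paper's exact API.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Rule:                    # illustrative container, not the paper's API
        bounds: np.ndarray         # (D, 2) per-feature (lower, upper) bounds
        consequence: int           # class label the rule predicts
        relevance: float           # R(r_k) = C(r_k) * (1 - E(r_k)), precomputed

    def covers(x, bounds):
        return all(lo <= xi <= hi for xi, (lo, hi) in zip(x, bounds))

    def confiderai_score(x, y, rules):
        """CONFIDERAI-style score for point x and candidate label y (sketch)."""
        active = [r for r in rules if covers(x, r.bounds)]
        same = [r for r in active if r.consequence == y]    # R_{y,x}
        other = [r for r in active if r.consequence != y]   # opposite-class rules covering x
        if not same:
            return 1.0                                      # no rule supports label y
        score = 1.0
        for r_k in same:
            q_opp = np.mean([rule_similarity(r_k.bounds, r.bounds) for r in other]) if other else 0.0
            rest = [r for r in same if r is not r_k]
            q_same = np.mean([rule_similarity(r_k.bounds, r.bounds) for r in rest]) if rest else 0.0
            # overlap with opposite-class rules pushes the term up (more uncertainty),
            # agreement with other same-class rules pulls it down (shared confidence)
            gamma_hat = 0.5 * gamma(x, r_k.bounds) * (1.0 + q_opp - q_same)
            score *= gamma_hat * (1.0 - r_k.relevance)
        return score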


      Real-World Testing: 10 Datasets, 1 Powerful Result

      The researchers tested CONFIDERAI on 10 real-world datasets, including:

Dataset            | Application                        | Target Class
P2P & SSH          | DNS Tunneling Detection            | Cyberattack
CHD                | Cardiovascular Disease Prediction  | Heart Disease
Vehicle Platooning | Collision Detection                | Collision
RUL                | Engine Failure Prediction          | Imminent Fault
Fire Alarm         | Smoke Detection                    | Fire Presence

      Results showed that CONFIDERAI achieved comparable efficiency to traditional scores (like Hinge and KNN), but with a crucial advantage: it adapts to rule geometry.

      For example, in regions where rules predicting different classes overlap, CONFIDERAI assigns higher scores, signaling uncertainty. In contrast, when multiple rules agree on the same class, the score decreases, boosting confidence.


      The Game-Changer: Conformal Critical Set (CCS)

      One of the most innovative contributions of the paper is the Conformal Critical Set (CCS) — a subset of the feature space where the model guarantees high-confidence predictions for a target (critical) class.

      Formally, the CCS at confidence level 1−ε is:

      \[ S_{\varepsilon} = \{\, x \mid s(x, +1) \leq s_{\varepsilon}, \; s(x, 0) > s_{\varepsilon} \,\} \]

This means: only the critical class (+1) is included in the prediction set, with error bounded by ε.
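
A hedged sketch of this membership test, assuming the confiderai_score() helper sketched earlier and a threshold s_epsilon taken as the finite-sample-corrected (1 - ε) quantile of the calibration scores:

    def conformal_critical_set(X, rules, s_epsilon):
        """Indices of points whose prediction set is exactly {+1} at the calibrated threshold."""
        ccs = []
        for i, x in enumerate(X):
            accepts_pos = confiderai_score(x, 1, rules) <= s_epsilon   # critical class stays in the set
            accepts_neg = confiderai_score(x, 0, rules) <= s_epsilon   # other class stays in the set
            if accepts_pos and not accepts_neg:
                ccs.append(i)
        return ccs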

      Why CCS Matters

      • 🔒 Safety Assurance: You know exactly where the model is reliable
      • 🛠 Model Refinement: You can retrain on CCS-labeled data to generate new rules with higher precision
      • 📉 Reduced False Alarms: Especially vital in medical or industrial monitoring

      In experiments, retraining rules on CCS-generated labels led to improved precision in 7 out of 10 datasets — a massive win for real-world deployment.
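
The refinement loop described above can be sketched as follows: points inside the CCS are relabeled as the critical class and a fresh rule model is fit on those labels. The rule_learner argument is a stand-in for whatever rule induction method you use (the paper uses the Logic Learning Machine; the mock classifier in the listing at the end of this article would also work).

    import numpy as np

    def refine_rules_on_ccs(X, rules, s_epsilon, rule_learner):
        """Relabel data by CCS membership and fit a new rule model on the CCS-generated labels."""
        ccs_idx = set(conformal_critical_set(X, rules, s_epsilon))
        y_ccs = np.array([1 if i in ccs_idx else 0 for i in range(len(X))])
        return rule_learner.fit(X, y_ccs)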


      Performance Comparison: CONFIDERAI vs. Hinge vs. KNN

      The study compared CONFIDERAI against two baselines:

      1. Hinge Score: Uses predicted class probabilities
      2. KNN Score: Based on nearest neighbors within rule boundaries

Metric                        | CONFIDERAI | Hinge | KNN
avgSingle (ε = 0.05)          | 0.75       | 0.73  | 0.58
avgDouble (ε = 0.05)          | 0.23       | 0.25  | 0.40
Computational time (s/sample) | 0.006      | 0.004 | 0.007

✅ CONFIDERAI wins on efficiency (a higher share of singleton predictions) and interpretability
⚠️ Slight speed trade-off due to the geometric calculations

      But here’s the key insight: CONFIDERAI detects uncertainty zones that others miss — especially near rule boundaries and in overlapping regions.


      Case Study: Cardiovascular Disease Prediction (CHD)

      Let’s look at the CHD dataset — a life-or-death scenario where false negatives can be fatal.

      • Original Rule: If systolic BP > 140, then disease
        → Precision: 82%, TPR: 67%
      • CCS-Derived Rule: If age > 39 ∧ weight ≤ 93.5 ∧ 148 ≤ BP ≤ 204 ∧ cholesterol ≤ 1.49, then disease
        → Precision: 83%, TPR: 35%, Error: 2% (vs. 5%)

      While recall dropped, precision and error improved significantly — a trade-off worth making when avoiding false alarms is critical.

      This shows how CCS-guided rule generation can refine models for safety-first applications.
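
If you want to reproduce this kind of comparison on your own data, here is a minimal sketch of scoring a single rule's precision and true positive rate. The DataFrame column names in the commented example are assumptions used only to restate the CCS-derived rule, not the dataset's actual field names.

    import numpy as np

    def rule_precision_tpr(pred, y):
        """Precision and TPR of a boolean rule prediction against binary 0/1 labels."""
        pred, y = np.asarray(pred, dtype=bool), np.asarray(y, dtype=bool)
        tp = np.sum(pred & y)
        precision = tp / pred.sum() if pred.sum() else 0.0
        tpr = tp / y.sum() if y.sum() else 0.0
        return precision, tpr

    # Example with a pandas DataFrame df (hypothetical column names):
    # pred = (df.age > 39) & (df.weight <= 93.5) & df.bp.between(148, 204) & (df.cholesterol <= 1.49)
    # precision, tpr = rule_precision_tpr(pred, df.disease)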


      Benefits of CONFIDERAI in Real-World AI

Benefit                      | Description
Explainability + Reliability | Combines human-readable rules with statistical guarantees
Error Control                | Bounded error rate ε for critical classes
Adaptive Refinement          | New rules from CCS improve precision on the target class
Domain Flexibility           | Works in cybersecurity, healthcare, IoT, and industrial AI
Open Source                  | Code and data available on GitHub

      Limitations and Future Work

      No solution is perfect. The authors acknowledge:

      • Computational Cost: Higher in high-dimensional or rule-dense models
      • Binary Focus: Current version limited to binary classification
      • Geometric Assumptions: Hyper-rectangular rules may not capture complex boundaries

      Future work includes:

      • Extending to multi-class problems
      • Incorporating semantic relationships between rules
      • Parallelization for faster inference
      • Integration with real-time safety monitoring systems

      Final Verdict: A Leap Toward Truly Trustworthy AI

      CONFIDERAI isn’t just a new algorithm — it’s a blueprint for the future of safe AI. By merging explainability, geometry, and probabilistic guarantees, it sets a new standard for rule-based systems.

      It proves that we don’t have to choose between transparency and reliability. With CONFIDERAI, we can have both — and more.

      As AI continues to infiltrate critical domains, tools like this will become non-negotiable for ethical, safe, and accountable deployment.


      Call to Action: Join the AI Safety Revolution

      Are you building AI systems where mistakes cost lives? Then you can’t afford to ignore conformal guarantees.

      👉 Download the code and datasets from GitHub
      👉 Read the full paper in Pattern Recognition
      👉 Experiment with CONFIDERAI on your own rule-based models

      And if you’re passionate about explainable, reliable AI, share this article with your team. The future of trustworthy AI starts now — and it’s conformal, interpretable, and safe.

      Don’t just predict — guarantee.

Below is an end-to-end Python walkthrough of the approach described in the paper. It is an illustrative re-implementation rather than the authors' code: the paper's rule learner (the Logic Learning Machine) is proprietary, so a simple mock rule generator stands in for it, and a few design choices flagged in the comments are assumptions.

      import numpy as np
      from sklearn.model_selection import train_test_split
      from sklearn.datasets import make_classification
      import pandas as pd
      
      # =============================================================================
      # 1. Rule-Based Classifier (Mock Implementation)
      # =============================================================================
      # The paper uses a proprietary Logic Learning Machine (LLM). Since we don't
      # have access to it, we'll create a simple, mock rule-based classifier for
      # demonstration purposes. This classifier generates overlapping hyper-rectangular
      # rules from the data.
      
      class SimpleRule:
          """
          Represents a single decision rule as a hyper-rectangle.
          A point is covered by the rule if it falls within the bounds for all features.
          """
          def __init__(self, conditions, consequence, X_train, y_train):
              """
              Args:
                  conditions (dict): A dictionary where keys are feature indices and
                                     values are tuples (lower_bound, upper_bound).
                  consequence (int): The predicted class label (0 or 1).
                  X_train (np.array): The training data to calculate performance.
                  y_train (np.array): The training labels.
              """
              self.conditions = conditions
              self.consequence = consequence
              self.feature_indices = list(conditions.keys())
              self.bounds = np.array(list(conditions.values()))
              self.relevance = self._calculate_relevance(X_train, y_train)
              self.volume = self._calculate_volume()
      
          def _calculate_volume(self):
              """Calculates the volume of the rule's hyper-rectangle (Eq. 5)."""
              vol = 1.0
              for lower, upper in self.conditions.values():
                  vol *= (upper - lower)
              return vol
      
          def covers(self, x):
              """Checks if a point x is covered by the rule's hyper-rectangle."""
              for i, (lower, upper) in self.conditions.items():
                  if not (lower <= x[i] <= upper):
                      return False
              return True
      
          def _calculate_relevance(self, X, y):
              """
              Calculates the rule's relevance based on covering and error (Eq. 6, 7, 8).
              """
              
              covered_indices = [idx for idx, sample in enumerate(X) if self.covers(sample)]
              uncovered_indices = [idx for idx in range(len(X)) if idx not in covered_indices]
      
              y_true_covered = y[covered_indices]
              y_true_uncovered = y[uncovered_indices]
      
              # True Positives (TP): Correctly classified by the rule
              tp = np.sum(y_true_covered == self.consequence)
              # False Positives (FP): Incorrectly classified by the rule
              fp = len(y_true_covered) - tp
      
              # Points not covered by the rule
              # True Negatives (TN): Correctly NOT classified as the rule's consequence
              tn = np.sum(y_true_uncovered != self.consequence)
              # False Negatives (FN): Incorrectly NOT classified as the rule's consequence
              fn = len(y_true_uncovered) - tn
      
              # Covering (Recall or True Positive Rate) (Eq. 6)
              covering = tp / (tp + fn) if (tp + fn) > 0 else 0
              # Error (False Positive Rate) (Eq. 7)
              error = fp / (tn + fp) if (tn + fp) > 0 else 0
              
              # Rule Relevance (Eq. 8)
              relevance = covering * (1 - error)  # article's definition: R(r_k) = C(r_k) * (1 - E(r_k))
              
              return relevance
      
          def __repr__(self):
              cond_str = " AND ".join([f"{l:.2f} <= X{i} <= {u:.2f}" for i, (l, u) in self.conditions.items()])
              return f"IF {cond_str} THEN y={self.consequence} (Relevance: {self.relevance:.2f})"
      
      
      class SimpleRuleBasedClassifier:
          """A mock classifier that generates a set of SimpleRule objects."""
          def __init__(self, n_rules=20, max_features_per_rule=2):
              self.n_rules = n_rules
              self.max_features_per_rule = max_features_per_rule
              self.ruleset = []
      
          def fit(self, X, y):
              """Generates a set of random, overlapping rules."""
              self.ruleset = []
              n_samples, n_features = X.shape
              
              for _ in range(self.n_rules):
                  # Pick a random point to be the center of the rule
                  center_idx = np.random.randint(n_samples)
                  center_point = X[center_idx]
                  consequence = y[center_idx]
      
                  # Define conditions around the center point
                  conditions = {}
                  num_active_features = np.random.randint(1, self.max_features_per_rule + 1)
                  active_features = np.random.choice(n_features, num_active_features, replace=False)
                  
                  for feature_idx in active_features:
                      width = np.random.uniform(0.1, 0.5) * (X[:, feature_idx].max() - X[:, feature_idx].min())
                      lower = max(X[:, feature_idx].min(), center_point[feature_idx] - width / 2)
                      upper = min(X[:, feature_idx].max(), center_point[feature_idx] + width / 2)
                      conditions[feature_idx] = (lower, upper)
                  
                  rule = SimpleRule(conditions, consequence, X, y)
                  self.ruleset.append(rule)
              
              print(f"Generated {len(self.ruleset)} rules.")
              return self
      
      # =============================================================================
      # 2. CONFIDERAI Score Function Implementation
      # =============================================================================
      
      class CONFIDERAI:
          """
          Implements the CONFIDERAI score function and conformal prediction workflow.
          """
          def __init__(self, ruleset):
              """
              Args:
                  ruleset (list): A list of SimpleRule objects from the classifier.
              """
              self.ruleset = ruleset
      
          def _geometrical_rule_similarity(self, r_k, r_z):
              """
              Calculates the geometrical similarity between two rules (Eq. 9, 10, 11).
              This is based on the volume of their intersection (Jaccard index).
              """
              # Check for overlap on all dimensions (Eq. 9)
              all_feature_indices = set(r_k.feature_indices) | set(r_z.feature_indices)
              
              intersection_volume = 1.0
              
              for i in all_feature_indices:
                  l_k, u_k = r_k.conditions.get(i, (np.inf, -np.inf))
                  l_z, u_z = r_z.conditions.get(i, (np.inf, -np.inf))
                  
                  max_l = max(l_k, l_z)
                  min_u = min(u_k, u_z)
                  
                  if max_l > min_u:
                      return 0.0 # No overlap
                      
                  # Note: the (np.inf, -np.inf) defaults above make rules that constrain
                  # different features count as non-overlapping; a finer-grained volume
                  # would require passing global feature bounds for unconstrained dimensions.
                  intersection_volume *= (min_u - max_l)
      
              # Jaccard Similarity for hyper-rectangles (Eq. 11)
              union_volume = r_k.volume + r_z.volume - intersection_volume
              if union_volume == 0:
                  return 1.0 if intersection_volume > 0 else 0.0
                  
              return intersection_volume / union_volume
      
          def _gamma(self, x, r_k):
              """
              Calculates the normalized distance of a point from a rule's boundary (Eq. 13).
              Value is 1 at the center, 0 at the boundary.
              """
              if not r_k.covers(x):
                  return 0.0
                  
              total_dist = 0
              n_dims = len(r_k.conditions)
              
              for i, (l_ik, u_ik) in r_k.conditions.items():
                  width = u_ik - l_ik
                  if width == 0: continue
                  
                  d_minus = abs(x[i] - l_ik) / width
                  d_plus = abs(x[i] - u_ik) / width
                  
                  # The term inside the sum is 1 - max_dist, which is min_dist
                  total_dist += (1 - max(d_minus, d_plus))
              
              return total_dist / n_dims if n_dims > 0 else 1.0
      
          def _gamma_hat(self, x, r_k, R_x_y, R_x_not_y):
              """
              Calculates the adjusted geometrical term (Eq. 12).
              This term incorporates the influence of overlapping rules.
              """
              gamma_val = self._gamma(x, r_k)
              
              # Average similarity with rules predicting the OPPOSITE class (Eq. 14)
              q_bar_not_y = 0.0
              if R_x_not_y:
                  q_bar_not_y = np.mean([self._geometrical_rule_similarity(r_k, r_z) for r_z in R_x_not_y])
      
              # Average similarity with OTHER rules predicting the SAME class (Eq. 15)
              R_x_y_without_rk = [r for r in R_x_y if r is not r_k]
              q_bar_y = 0.0
              if R_x_y_without_rk:
                  q_bar_y = np.mean([self._geometrical_rule_similarity(r_k, r_z) for r_z in R_x_y_without_rk])
                  
              # The final adjusted term (Eq. 12)
              gamma_hat_val = 0.5 * gamma_val * (1 + q_bar_not_y - q_bar_y)
              return gamma_hat_val
      
          def calculate_score(self, x, y):
              """
              Calculates the CONFIDERAI score for a single point x and a candidate label y (Eq. 16).
              A higher score means worse agreement between x and y.
              """
              # Find all rules that cover point x
              covering_rules = [r for r in self.ruleset if r.covers(x)]
              
              # Separate them by the class they predict
              R_x_y = [r for r in covering_rules if r.consequence == y]
              R_x_not_y = [r for r in covering_rules if r.consequence != y]
      
              # If no rules for the given class 'y' cover the point, conformity is worst.
              # The paper doesn't explicitly define this case. A high score is a reasonable default.
              if not R_x_y:
                  return 1.0 
      
              # Calculate the score as a product over all rules predicting 'y' (Eq. 16)
              score_product = 1.0
              for r_k in R_x_y:
                  gamma_hat_val = self._gamma_hat(x, r_k, R_x_y, R_x_not_y)
                  relevance_term = 1.0 - r_k.relevance
                  
                  # The paper uses gamma_hat in the final score formula s(x,y) as φ_tilde(x, r_k).
                  # We use gamma_hat directly.
                  term = gamma_hat_val * relevance_term
                  score_product *= term
                  
              # The paper's formula s(x,y) seems to produce smaller scores for worse fits.
              # Standard conformal scores are higher for worse fits. We invert it.
              return 1.0 - score_product
      
          def get_calibration_scores(self, X_cal, y_cal):
              """Calculates scores for all points in the calibration set."""
              scores = np.array([self.calculate_score(x, y) for x, y in zip(X_cal, y_cal)])
              return scores
      
          def get_quantile(self, cal_scores, epsilon):
              """Calculates the (1-epsilon) quantile from calibration scores."""
              n_cal = len(cal_scores)
              q_level = np.ceil((n_cal + 1) * (1 - epsilon)) / n_cal
              q_level = min(q_level, 1.0) # Ensure it doesn't exceed 1
              return np.quantile(cal_scores, q_level)
      
          def predict_set(self, x, s_epsilon):
              """Generates the prediction set for a new point x (Eq. 1)."""
              prediction_set = []
              # Check for each possible class (0 and 1 for binary classification)
              if self.calculate_score(x, 0) <= s_epsilon:
                  prediction_set.append(0)
              if self.calculate_score(x, 1) <= s_epsilon:
                  prediction_set.append(1)
              return prediction_set
              
          def get_conformal_critical_set_indices(self, X_test, s_epsilon):
              """
              Identifies the Conformal Critical Set (CCS) (Eq. 3).
              The CCS contains points where the prediction set is ONLY {1}.
              Args:
                  X_test (np.array): The test data to evaluate.
                  s_epsilon (float): The calculated quantile.
              Returns:
                  list: Indices of points in X_test that belong to the CCS.
              """
              ccs_indices = []
              for i, x in enumerate(X_test):
                  score_for_1 = self.calculate_score(x, 1)
                  score_for_0 = self.calculate_score(x, 0)
                  
                  # Condition for CCS: s(x, +1) <= s_epsilon AND s(x, 0) > s_epsilon
                  if score_for_1 <= s_epsilon and score_for_0 > s_epsilon:
                      ccs_indices.append(i)
              return ccs_indices
      
      
      # =============================================================================
      # 3. Main Execution and Demonstration
      # =============================================================================
      if __name__ == "__main__":
          # --- Configuration ---
          N_SAMPLES = 1000
          N_FEATURES = 2 # Using 2 for easy visualization
          EPSILON = 0.1 # Desired error rate (e.g., 10%)
          
          # --- Data Generation ---
          X, y = make_classification(
              n_samples=N_SAMPLES,
              n_features=N_FEATURES,
              n_informative=2,
              n_redundant=0,
              n_clusters_per_class=1,
              flip_y=0.05,
              random_state=42
          )
          
          # Split data: 60% train, 20% calibration, 20% test
          X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
          X_cal, X_test, y_cal, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
      
          print(f"Training set size: {len(X_train)}")
          print(f"Calibration set size: {len(X_cal)}")
          print(f"Test set size: {len(X_test)}\n")
      
          # --- Model Training ---
          print("1. Training the rule-based classifier...")
          classifier = SimpleRuleBasedClassifier(n_rules=30, max_features_per_rule=N_FEATURES)
          classifier.fit(X_train, y_train)
          
          # --- Conformal Prediction Workflow ---
          print("\n2. Initializing CONFIDERAI and calculating calibration scores...")
          confiderai_predictor = CONFIDERAI(classifier.ruleset)
          calibration_scores = confiderai_predictor.get_calibration_scores(X_cal, y_cal)
          
          print(f"Calculated {len(calibration_scores)} calibration scores.")
          print(f"Mean calibration score: {np.mean(calibration_scores):.3f}")
      
          print(f"\n3. Calculating quantile for epsilon = {EPSILON}...")
          s_epsilon = confiderai_predictor.get_quantile(calibration_scores, EPSILON)
          print(f"The score threshold (s_epsilon) is: {s_epsilon:.3f}")
      
          # --- Evaluation on Test Set ---
          print("\n4. Evaluating on the test set...")
          prediction_sets = [confiderai_predictor.predict_set(x, s_epsilon) for x in X_test]
          
          errors = 0
          empty_sets = 0
          single_sets = 0
          double_sets = 0
          
          for i, p_set in enumerate(prediction_sets):
              if y_test[i] not in p_set:
                  errors += 1
              if len(p_set) == 0:
                  empty_sets += 1
              elif len(p_set) == 1:
                  single_sets += 1
              else:
                  double_sets += 1
                  
          error_rate = errors / len(y_test)
          print(f"Observed error rate on test set: {error_rate:.3f} (Target was <= {EPSILON})")
          print(f"Coverage: {1 - error_rate:.3f}")
          print(f"Percentage of singleton sets (efficiency): {single_sets / len(y_test):.2%}")
          print(f"Percentage of empty sets: {empty_sets / len(y_test):.2%}")
          print(f"Percentage of double sets: {double_sets / len(y_test):.2%}")
      
          # --- Conformal Critical Set (CCS) Identification ---
          print("\n5. Identifying the Conformal Critical Set (CCS)...")
          ccs_indices = confiderai_predictor.get_conformal_critical_set_indices(X_test, s_epsilon)
          
          print(f"Found {len(ccs_indices)} points in the Conformal Critical Set.")
          if ccs_indices:
              # Check the precision of the CCS
              ccs_labels = y_test[ccs_indices]
              ccs_precision = np.sum(ccs_labels == 1) / len(ccs_labels)
              print(f"Precision of the CCS (fraction of true class 1): {ccs_precision:.2%}")
              print("This set represents a region where the model confidently predicts class 1.")
      
      

