Conformal Prediction¶

Overview¶

Conformal Prediction (CP) is a framework that provides statistically valid uncertainty quantification for the predictions of machine learning models. It was first proposed by Gammerman, Vovk, and Vapnik in 1998. Its practical value comes from being model-agnostic and distribution-free.

Property	Description
Proposed by	Gammerman, Vovk, Vapnik (1998)
Core assumption	Exchangeability
Output	Prediction Set (classification) / Prediction Interval (regression)
Guarantee	Coverage Guarantee

Core Concepts¶

Nonconformity Score¶

The nonconformity score quantifies how "different" a new data point is from the existing training data.

Regression:

alpha_i = |y_i - f(x_i)|

Classification:

alpha_i = 1 - f(x_i)[y_i]

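Here f(x_i)[y_i] is the predicted probability of the true class. Note that alpha_i denotes the score of example i and is unrelated to the significance level alpha used in the coverage guarantee.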
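The two score definitions above can be sketched in NumPy as follows. The function names and the toy numbers are illustrative assumptions, and the classification helper assumes that labels are integer column indices into the probability matrix.

```python
import numpy as np

def regression_scores(y_true, y_pred):
    """Absolute-residual score: alpha_i = |y_i - f(x_i)|."""
    return np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))

def classification_scores(proba, y_true):
    """Score alpha_i = 1 - f(x_i)[y_i]; y_true holds column indices into proba."""
    proba = np.asarray(proba, dtype=float)
    y_true = np.asarray(y_true, dtype=int)
    return 1.0 - proba[np.arange(len(y_true)), y_true]

# Toy values, invented for illustration only
reg = regression_scores([3.0, 5.0, 2.0], [2.5, 5.5, 4.0])
clf = classification_scores([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]], [0, 1])
print(reg)  # residuals of 0.5, 0.5 and 2.0
print(clf)  # approximately 0.3 and 0.7
```

A large score means the model's prediction disagrees strongly with the observed label, in both settings.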

Coverage Guarantee¶

The core theoretical guarantee of CP is:

P(Y_test in C(X_test)) >= 1 - alpha

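For a user-specified confidence level 1 - alpha, the true value is guaranteed to fall inside the prediction set or interval with probability at least 1 - alpha. This probability is marginal: it averages over both the calibration data and the test point.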

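Theorem (Vovk et al., 2005): If the data are exchangeable, with n calibration points and 1 test point, the lower bound below holds. The upper bound also holds when the nonconformity scores are almost surely distinct (no ties):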

1 - alpha <= P(Y_test in C(X_test)) <= 1 - alpha + 1/(n+1)
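A small Monte Carlo check can make the two-sided bound concrete. The sketch below assumes i.i.d. standard normal scores (continuous, so ties have probability zero) and uses the k-th smallest calibration score with k = ceil((n + 1)(1 - alpha)) as the threshold. The function name, n, alpha, and the trial count are arbitrary choices for illustration.

```python
import numpy as np

def simulate_coverage(n=99, alpha=0.1, trials=20000, seed=0):
    """Estimate P(test score <= conformal threshold) for i.i.d. continuous scores."""
    rng = np.random.default_rng(seed)
    k = int(np.ceil((n + 1) * (1 - alpha)))          # 1-indexed rank of the threshold
    calib = np.sort(rng.normal(size=(trials, n)), axis=1)
    test = rng.normal(size=trials)
    threshold = calib[:, k - 1]                      # k-th smallest calibration score
    return float(np.mean(test <= threshold))

n, alpha = 99, 0.1
estimate = simulate_coverage(n=n, alpha=alpha)
lower, upper = 1 - alpha, 1 - alpha + 1 / (n + 1)
print(f"bounds: [{lower:.3f}, {upper:.3f}], simulated coverage: {estimate:.3f}")
```

Because the estimate is itself random, it can land slightly outside the theoretical bounds; increasing the number of trials narrows that Monte Carlo error.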

Algorithm Variants¶

Split Conformal Prediction (SCP)¶

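The most practical and widely used variant.

Procedure:

Split the training data into a Proper Training Set and a Calibration Set
Train the model on the Proper Training Set
Compute Nonconformity Scores on the Calibration Set
At test time, use a quantile of those scores to build the prediction interval or set

Advantages:

Computationally efficient (the model is trained once)
Simple to implement
Suitable for large datasets

Disadvantages:

Loss of statistical efficiency, because the calibration data are withheld from training
Realized coverage varies with the Calibration Set size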

Inductive Conformal Prediction (ICP)¶

The earlier name for SCP; it refers to the same algorithm.

Cross-Conformal Prediction (CCP)¶

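Runs CP on several splits, in the manner of k-fold cross-validation, and aggregates the results. One simple aggregation takes medians across folds, with y_hat the per-fold point prediction and d the per-fold interval half-width:

Final Interval = (median(y_hat) - median(d), median(y_hat) + median(d))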

Advantages:

More efficient than a single split
Uses all of the data

Disadvantages:

Loses the automatic validity guarantee
Higher computational cost
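The median aggregation described for Cross-Conformal Prediction can be sketched as below. This is a minimal illustration under stated assumptions: the base model (LinearRegression), the fold count, the synthetic data, and the function name cross_conformal_interval are not from the original text, and the aggregation is a heuristic without an automatic validity guarantee.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def cross_conformal_interval(X, y, X_new, alpha=0.1, n_splits=5, seed=0):
    """Median-aggregated intervals across K folds (heuristic aggregation)."""
    preds, half_widths = [], []
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, calib_idx in folds.split(X):
        model = LinearRegression().fit(X[train_idx], y[train_idx])
        scores = np.sort(np.abs(y[calib_idx] - model.predict(X[calib_idx])))
        k = int(np.ceil((len(scores) + 1) * (1 - alpha)))   # finite-sample rank
        half_widths.append(scores[k - 1] if k <= len(scores) else np.inf)
        preds.append(model.predict(X_new))
    center = np.median(np.array(preds), axis=0)   # median(y_hat) over folds
    d = np.median(half_widths)                    # median(d) over folds
    return center - d, center + d

# Synthetic linear data, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=1.0, size=500)
lo, hi = cross_conformal_interval(X[:400], y[:400], X[400:])
print(f"empirical coverage: {np.mean((y[400:] >= lo) & (y[400:] <= hi)):.2f}")
```

Each fold trains one model, so the cost grows linearly with the number of folds, which matches the computational-cost disadvantage listed above.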

Mondrian Conformal Prediction¶

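Maintains a separate nonconformity score distribution for each class, so that the coverage guarantee holds within each class rather than only on average. Useful for class-imbalanced problems.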
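A minimal sketch of the class-conditional idea: compute one threshold per class from the calibration points of that class only, then test each candidate label against its own threshold. The helper names, the synthetic imbalanced data, and the assumption that labels are integer column indices are illustrative choices, not part of the original text.

```python
import numpy as np

def conformal_threshold(scores, alpha):
    """k-th smallest score with k = ceil((n + 1)(1 - alpha)); inf if k > n."""
    scores = np.sort(np.asarray(scores, dtype=float))
    k = int(np.ceil((len(scores) + 1) * (1 - alpha)))
    return scores[k - 1] if k <= len(scores) else np.inf

def mondrian_thresholds(proba_calib, y_calib, alpha=0.1):
    """One threshold per class, using only calibration points of that class."""
    scores = 1.0 - proba_calib[np.arange(len(y_calib)), y_calib]
    return {int(c): conformal_threshold(scores[y_calib == c], alpha)
            for c in np.unique(y_calib)}

def mondrian_predict_set(proba_row, thresholds):
    """Include class c when its score passes that class's own threshold."""
    return [c for c, q in thresholds.items() if 1.0 - proba_row[c] <= q]

# Synthetic, imbalanced calibration data (illustrative only)
rng = np.random.default_rng(0)
y_calib = np.array([0] * 180 + [1] * 20)
p1 = np.clip(0.2 + 0.5 * y_calib + rng.normal(scale=0.15, size=200), 0.01, 0.99)
proba_calib = np.column_stack([1 - p1, p1])
thresholds = mondrian_thresholds(proba_calib, y_calib)
print(thresholds)
print(mondrian_predict_set(np.array([0.55, 0.45]), thresholds))
```

The minority class has few calibration points, so its threshold is estimated from a small sample; with too few points the rank k exceeds n and the threshold becomes infinite.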

Mathematical Foundation¶

Quantile Function¶

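In Split CP, the prediction interval is computed from a quantile of the nonconformity scores:

k = ceil((n + 1) * (1 - alpha))    # 1-indexed rank among the n sorted calibration scores
q = sorted_scores[k - 1]           # 0-indexed lookup; if k > n, use q = +infinity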
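As a worked example of the finite-sample quantile, the sketch below picks the score at rank ceil((n + 1)(1 - alpha)) from a small list of made-up calibration scores. With 0-based indexing, rank k maps to index k - 1. The function names are hypothetical.

```python
import math

def conformal_rank(n, alpha):
    """1-indexed rank k = ceil((n + 1)(1 - alpha)) of the calibration score to use."""
    return math.ceil((n + 1) * (1 - alpha))

def conformal_quantile(scores, alpha):
    """Return the k-th smallest score, or infinity when k exceeds n."""
    ordered = sorted(scores)
    k = conformal_rank(len(ordered), alpha)
    return ordered[k - 1] if k <= len(ordered) else math.inf

scores = [0.4, 1.2, 0.1, 0.9, 2.5, 0.7, 0.3, 1.8, 0.6]   # n = 9, made-up values
print(conformal_rank(9, 0.2))            # ceil(10 * 0.8) = 8
print(conformal_quantile(scores, 0.2))   # 8th smallest score: 1.8
print(conformal_quantile(scores, 0.05))  # ceil(9.5) = 10 > 9, so inf
```

The last call shows why small calibration sets limit the achievable confidence level: with n = 9, any alpha below 0.1 forces an infinite threshold.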

Prediction Interval (Regression)¶

C(X_test) = [f(X_test) - q, f(X_test) + q]

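Here q is the ceil((n + 1)(1 - alpha))/n empirical quantile of the calibration set's nonconformity scores, which is slightly higher than the plain (1 - alpha) quantile.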

Prediction Set (Classification)¶

C(X_test) = {y : s(X_test, y) <= q}

When using a score based on softmax probabilities:

C(X_test) = {y : 1 - f(X_test)[y] <= q}
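The set rule above amounts to keeping every class whose predicted probability is at least 1 - q. A small sketch with hypothetical probability vectors and a hypothetical threshold q:

```python
import numpy as np

def prediction_set(proba, q):
    """Indices y with 1 - f(x)[y] <= q, i.e. f(x)[y] >= 1 - q."""
    proba = np.asarray(proba, dtype=float)
    return np.flatnonzero(1.0 - proba <= q).tolist()

confident = [0.90, 0.06, 0.04]   # hypothetical model outputs
ambiguous = [0.40, 0.35, 0.25]

print(prediction_set(confident, 0.7))    # [0]
print(prediction_set(ambiguous, 0.7))    # [0, 1]
print(prediction_set(confident, 0.05))   # []
```

Set size therefore acts as an uncertainty signal: ambiguous inputs yield larger sets, and with this score a very small q can even produce an empty set.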

Python Implementation¶

Basic Split Conformal Prediction (Regression)¶

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

class SplitConformalRegressor:
    """Split Conformal Prediction for Regression"""

    def __init__(self, base_model, alpha=0.1):
        """
        Args:
            base_model: sklearn-compatible regressor
            alpha: significance level (default 0.1 for 90% coverage)
        """
        self.base_model = base_model
        self.alpha = alpha
        self.calibration_scores = None

    def fit(self, X, y, calib_size=0.2, random_state=42):
        """
        Fit the model and compute calibration scores.

        Args:
            X: feature matrix
            y: target values
            calib_size: fraction for calibration set
            random_state: random seed
        """
        # Split into proper training and calibration sets
        X_train, X_calib, y_train, y_calib = train_test_split(
            X, y, test_size=calib_size, random_state=random_state
        )

        # Fit base model on proper training set
        self.base_model.fit(X_train, y_train)

        # Compute nonconformity scores on calibration set
        y_pred_calib = self.base_model.predict(X_calib)
        self.calibration_scores = np.abs(y_calib - y_pred_calib)

        return self

    def predict(self, X):
        """
        Generate prediction intervals.

        Args:
            X: feature matrix for prediction

        Returns:
            tuple: (lower_bounds, upper_bounds, point_predictions)
        """
        n = len(self.calibration_scores)

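        # Finite-sample-corrected rank: use the k-th smallest calibration score
        k = int(np.ceil((n + 1) * (1 - self.alpha)))
        if k > n:
            # Too few calibration points for this alpha: only an infinite interval is valid
            quantile = np.inf
        else:
            # Exact order statistic (np.quantile's default interpolation can undercover)
            quantile = np.sort(self.calibration_scores)[k - 1]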

        # Point predictions
        y_pred = self.base_model.predict(X)

        # Prediction intervals
        lower = y_pred - quantile
        upper = y_pred + quantile

        return lower, upper, y_pred

    def coverage(self, X_test, y_test):
        """Compute empirical coverage on test set."""
        lower, upper, _ = self.predict(X_test)
        covered = (y_test >= lower) & (y_test <= upper)
        return np.mean(covered)


# Usage Example
if __name__ == "__main__":
    from sklearn.datasets import make_regression

    # Generate synthetic data
    X, y = make_regression(n_samples=1000, n_features=10, noise=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Initialize and fit
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    cp = SplitConformalRegressor(model, alpha=0.1)
    cp.fit(X_train, y_train)

    # Predict and evaluate
    lower, upper, y_pred = cp.predict(X_test)
    coverage = cp.coverage(X_test, y_test)
    avg_width = np.mean(upper - lower)

    print(f"Target Coverage: {1 - cp.alpha:.1%}")
    print(f"Empirical Coverage: {coverage:.1%}")
    print(f"Average Interval Width: {avg_width:.2f}")

Split Conformal Prediction (Classification)¶

import numpy as np
from sklearn.model_selection import train_test_split

class SplitConformalClassifier:
    """Split Conformal Prediction for Classification"""

    def __init__(self, base_model, alpha=0.1):
        """
        Args:
            base_model: sklearn-compatible classifier with predict_proba
            alpha: significance level
        """
        self.base_model = base_model
        self.alpha = alpha
        self.calibration_scores = None

    def fit(self, X, y, calib_size=0.2, random_state=42):
        """Fit model and compute calibration scores."""
        X_train, X_calib, y_train, y_calib = train_test_split(
            X, y, test_size=calib_size, random_state=random_state
        )

        self.base_model.fit(X_train, y_train)

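        # Predicted class probabilities on the calibration set
        proba_calib = self.base_model.predict_proba(X_calib)

        # Map labels to predict_proba columns (classes_ is sorted), then
        # nonconformity: 1 - probability of true class
        col_idx = np.searchsorted(self.base_model.classes_, y_calib)
        self.calibration_scores = 1 - proba_calib[np.arange(len(y_calib)), col_idx]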

        return self

    def predict_set(self, X):
        """
        Generate prediction sets.

        Args:
            X: feature matrix

        Returns:
            list of prediction sets (one per sample)
        """
        n = len(self.calibration_scores)
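        # Finite-sample-corrected rank: use the k-th smallest calibration score
        k = int(np.ceil((n + 1) * (1 - self.alpha)))
        if k > n:
            # Too few calibration points for this alpha: every class must be included
            quantile = np.inf
        else:
            quantile = np.sort(self.calibration_scores)[k - 1]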

        proba = self.base_model.predict_proba(X)
        scores = 1 - proba  # nonconformity for each class

        classes = self.base_model.classes_
        prediction_sets = []
        for i in range(len(X)):
            # Report class labels rather than predict_proba column indices
            pred_set = classes[scores[i] <= quantile].tolist()
            prediction_sets.append(pred_set)

        return prediction_sets

    def coverage(self, X_test, y_test):
        """Compute empirical coverage."""
        pred_sets = self.predict_set(X_test)
        covered = [y_test[i] in pred_sets[i] for i in range(len(y_test))]
        return np.mean(covered)

    def avg_set_size(self, X_test):
        """Average prediction set size."""
        pred_sets = self.predict_set(X_test)
        return np.mean([len(s) for s in pred_sets])


# Usage Example
if __name__ == "__main__":
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=1000, n_classes=5, n_informative=10, 
                                random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    cp = SplitConformalClassifier(model, alpha=0.1)
    cp.fit(X_train, y_train)

    coverage = cp.coverage(X_test, y_test)
    avg_size = cp.avg_set_size(X_test)

    print(f"Target Coverage: {1 - cp.alpha:.1%}")
    print(f"Empirical Coverage: {coverage:.1%}")
    print(f"Average Set Size: {avg_size:.2f}")

Using MAPIE Library¶

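# pip install mapie
# Note: MapieRegressor / MapieClassifier belong to the MAPIE 0.x API; later releases may differ.
# X_train, y_train, X_calib, y_calib, X_test are assumed to be defined for the task at hand
# (regression targets for the first part, class labels for the second).

from mapie.regression import MapieRegressor
from mapie.classification import MapieClassifier
from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier

# Regression: cross-validation-based intervals (method="plus" with 5 folds)
reg_model = GradientBoostingRegressor()
mapie_reg = MapieRegressor(reg_model, method="plus", cv=5)
mapie_reg.fit(X_train, y_train)
y_pred, y_pis = mapie_reg.predict(X_test, alpha=0.1)

# Classification: cv="prefit" expects an already fitted estimator,
# so train it first and give MAPIE held-out calibration data
clf_model = GradientBoostingClassifier()
clf_model.fit(X_train, y_train)
mapie_clf = MapieClassifier(clf_model, method="score", cv="prefit")
mapie_clf.fit(X_calib, y_calib)
y_pred, y_ps = mapie_clf.predict(X_test, alpha=0.1)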

Advanced Topics¶

Conformalized Quantile Regression (CQR)¶

Standard CP does not guarantee conditional coverage. CQR combines quantile regression with CP, which gives intervals whose width adapts to the local noise level on heteroscedastic data.

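import numpy as np
from sklearn.linear_model import QuantileRegressor

# Lower and upper quantile models, fitted on the proper training set
# (alpha is the miscoverage level; X_train, y_train, X_calib, y_calib are assumed to exist)
model_lo = QuantileRegressor(quantile=alpha / 2).fit(X_train, y_train)
model_hi = QuantileRegressor(quantile=1 - alpha / 2).fit(X_train, y_train)

# Conformity score on the calibration set: positive when y falls outside [q_lo, q_hi]
q_lo, q_hi = model_lo.predict(X_calib), model_hi.predict(X_calib)
score = np.maximum(q_lo - y_calib, y_calib - q_hi)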

Adaptive Conformal Inference (ACI)¶

A method that adapts to distribution shift in time-series data. In an online learning setting, it adjusts alpha dynamically.
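The dynamic adjustment of alpha can be sketched with the commonly used update alpha_{t+1} = alpha_t + gamma * (alpha - err_t), where err_t is 1 when the interval at step t missed the observation. The step size gamma, the sliding window of past scores, the synthetic stream, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def adaptive_conformal(scores, alpha=0.1, gamma=0.01, window=200):
    """Online ACI-style loop over a stream of nonconformity scores.

    At step t the threshold is the empirical (1 - alpha_t) quantile of the last
    `window` scores; alpha_t is then updated from the observed miss indicator.
    """
    alpha_t, errors = alpha, []
    for t in range(window, len(scores)):
        past = scores[t - window:t]
        if alpha_t <= 0:
            threshold = np.inf                    # cover everything
        elif alpha_t >= 1:
            threshold = -np.inf                   # cover nothing
        else:
            threshold = np.quantile(past, 1 - alpha_t)
        err = float(scores[t] > threshold)        # 1 if the interval missed y_t
        errors.append(err)
        alpha_t += gamma * (alpha - err)          # a miss lowers alpha_t, widening later sets
    return np.array(errors)

# Synthetic stream whose score scale doubles halfway through (a simple shift)
rng = np.random.default_rng(0)
scale = np.where(np.arange(3000) < 1500, 1.0, 2.0)
stream = np.abs(rng.normal(size=3000)) * scale
errors = adaptive_conformal(stream)
print(f"long-run miscoverage: {errors.mean():.3f} (target 0.100)")
```

A larger gamma reacts faster to shifts but makes alpha_t, and hence the interval width, fluctuate more.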

Conformal Prediction for Deep Learning¶

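Considerations when applying CP to neural networks:

Use probabilities calibrated with Temperature Scaling
Estimate uncertainty with MC Dropout, then apply CP
Use a sufficiently large calibration set, since realized coverage varies more when it is small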
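The first consideration, temperature scaling followed by CP, can be sketched in NumPy as below. Synthetic logits stand in for a network's outputs, the temperature is chosen by a simple grid search on a split that is kept separate from the conformal calibration split, and all names and sizes are illustrative assumptions. MC Dropout is not covered by this sketch.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)          # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 46)):
    """Pick T from a grid by minimizing the negative log-likelihood."""
    rows = np.arange(len(labels))
    nll = [-np.mean(np.log(softmax(logits, T)[rows, labels] + 1e-12)) for T in grid]
    return float(grid[int(np.argmin(nll))])

# Synthetic overconfident logits (illustrative stand-in for a trained network)
rng = np.random.default_rng(0)
n, K = 1500, 4
labels = rng.integers(0, K, size=n)
logits = rng.normal(size=(n, K))
logits[np.arange(n), labels] += 1.5
logits *= 3.0                                      # inflate confidence

tune, calib, test = np.split(np.arange(n), [500, 1000])
T = fit_temperature(logits[tune], labels[tune])    # fitted on data not used for CP

alpha = 0.1
cal_scores = np.sort(1 - softmax(logits[calib], T)[np.arange(500), labels[calib]])
k = int(np.ceil((len(cal_scores) + 1) * (1 - alpha)))
q = cal_scores[k - 1]
sets = (1 - softmax(logits[test], T)) <= q         # boolean membership matrix
coverage = sets[np.arange(500), labels[test]].mean()
print(f"T = {T:.2f}, coverage = {coverage:.2f}, mean set size = {sets.sum(axis=1).mean():.2f}")
```

Fitting the temperature on its own split keeps the score function fixed with respect to the calibration data, which is what the split conformal guarantee relies on.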

Applications¶

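Domain	Application
Healthcare	Diagnostic uncertainty quantification, drug efficacy prediction
Finance	Risk assessment, credit scoring
Autonomous driving	Object detection confidence
Drug discovery	Prediction intervals for molecular properties
Cybersecurity	Anomaly detection, malware classification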

Recent Research (2025)¶

Paper/Talk	Venue	Main contribution
Reliable UQ via CP	AAAI 2025	Extension of CP to non-i.i.d. settings
Model UQ by CP in Continual Learning	ICML 2025	Addresses calibration issues in continual learning
Adaptive Quantum CP	arXiv 2025	CP in quantum computing settings
CP and Trustworthy AI	arXiv 2025	AI governance and bias identification
CP for Privacy-Preserving ML	arXiv 2025	CP in privacy-preserving learning

Summary¶

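Key properties of Conformal Prediction:

Distribution-free: no assumptions about the data distribution are needed
Model-agnostic: applicable to any ML model
Finite-sample guarantee: the guarantee is valid even for finite samples
Exchangeability assumption: a weaker assumption than i.i.d.
Practical trade-off: coverage vs. interval/set size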

References¶

Vovk, V., Gammerman, A., & Shafer, G. (2005). Algorithmic Learning in a Random World. Springer.
Shafer, G., & Vovk, V. (2008). A Tutorial on Conformal Prediction. JMLR, 9, 371-421.
Angelopoulos, A. N., & Bates, S. (2021). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. arXiv:2107.07511.
Romano, Y., Patterson, E., & Candes, E. (2019). Conformalized Quantile Regression. NeurIPS.
Barber, R. F., et al. (2023). Conformal Prediction Beyond Exchangeability. Annals of Statistics.