Normalizing Flows¶

A summary of generative models based on transforming probability distributions.

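Overview¶

Item	Description
Category	Generative model
Core idea	Learn a complex distribution through invertible transformations
Key property	Exact likelihood computation
Applications	Density estimation, generation, anomaly detection
Representative papers	RealNVP (2017), Glow (2018), TarFlow (2025)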

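Core Idea¶

An invertible transformation maps a simple base distribution (e.g., a Gaussian) to a complex data distribution.

z ~ N(0, I)     simple distribution (latent)
    │
    │  f (invertible transformation)
    ▼
x = f(z)        complex distribution (data)

Change-of-variables formula, with $z = f^{-1}(x)$: $$\log p(x) = \log p(z) - \log \left| \det \frac{\partial f}{\partial z} \right|$$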

Mathematical Background¶

Change of Variables¶

When $f: \mathbb{R}^d \to \mathbb{R}^d$ is invertible and both $f$ and $f^{-1}$ are differentiable:

\[p_X(x) = p_Z(f^{-1}(x)) \left| \det \frac{\partial f^{-1}}{\partial x} \right|\]
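As a sanity check on the formula above, here is a minimal one-dimensional sketch. It assumes a hypothetical affine flow x = a·z + b with a standard normal base, for which the resulting density is known in closed form as N(b, a²). The function names are illustrative and not part of any library.

```python
import math

def std_normal_logpdf(z):
    """Log-density of N(0, 1) at z."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def flow_logpdf(x, a, b):
    """log p_X(x) for x = f(z) = a*z + b, via the change-of-variables formula."""
    z = (x - b) / a                      # f^{-1}(x)
    log_abs_det_inv = -math.log(abs(a))  # log |d f^{-1} / dx| = -log|a|
    return std_normal_logpdf(z) + log_abs_det_inv

def normal_logpdf(x, mean, std):
    """Closed-form log-density of N(mean, std^2), used only as a reference."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))

a, b = 2.0, 1.0   # hypothetical scale and shift
x = 0.3
print(flow_logpdf(x, a, b), normal_logpdf(x, b, abs(a)))
```

The two printed values should agree up to floating-point rounding, since an affine map of a Gaussian is again Gaussian.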

Flow Chain¶

Composing several transformations:

\[z_K = f_K \circ f_{K-1} \circ \cdots \circ f_1(z_0)\]

With $x = z_K$ and each Jacobian $J_{f_k}$ evaluated at $z_{k-1}$:

\[\log p(x) = \log p(z_0) - \sum_{k=1}^{K} \log \left| \det J_{f_k} \right|\]
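The accumulation of log-determinants along a chain can be sketched with two scalar steps. The steps chosen here (an affine map followed by exp) are assumptions for illustration. They are convenient because the result is a log-normal variable, so the closed-form density serves as a reference.

```python
import math

def std_normal_logpdf(z):
    return -0.5 * (z * z + math.log(2.0 * math.pi))

# Each step returns (output, log|det J|), with the Jacobian evaluated at its input.
def affine(z, a=0.5, b=1.0):
    return a * z + b, math.log(abs(a))

def exponential(z):
    return math.exp(z), z   # d/dz exp(z) = exp(z), so log|det J| = z

def flow_forward(z0, steps):
    """Push z0 through the chain; return x = z_K and the summed log-dets."""
    z, total = z0, 0.0
    for step in steps:
        z, log_det = step(z)
        total += log_det
    return z, total

z0 = -0.4
x, sum_log_det = flow_forward(z0, [affine, exponential])
log_px = std_normal_logpdf(z0) - sum_log_det

# Reference: x = exp(0.5*z0 + 1.0) is log-normal with mu = 1.0, sigma = 0.5.
mu, sigma = 1.0, 0.5
log_px_ref = (-math.log(x) - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
              - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2))
print(log_px, log_px_ref)
```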

Main Architectures¶

1. RealNVP (2017)¶

Affine Coupling Layer:

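Input: x = [x_1, x_2]

y_1 = x_1
y_2 = x_2 ⊙ exp(s(x_1)) + t(x_1)

s, t: neural networks (scale, translation)

Properties:
- The Jacobian is triangular, so its log-determinant is the sum of s(x_1) and costs $O(d)$
- Alternating which half is transformed across layers mixes all dimensions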
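The coupling equations above can be checked with a dependency-free sketch. The `s_net` and `t_net` functions below are hypothetical stand-ins for the neural networks. Any function of x_1 works, because invertibility never requires inverting s or t. The input length is assumed to be even.

```python
import math

# Hypothetical stand-ins for the s and t networks.
def s_net(x1):
    return [math.tanh(0.5 * v) for v in x1]

def t_net(x1):
    return [v * v - 1.0 for v in x1]

def coupling_forward(x):
    """y1 = x1, y2 = x2 * exp(s(x1)) + t(x1); returns (y, log|det J|)."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = s_net(x1), t_net(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    return x1 + y2, sum(s)   # triangular Jacobian: log-det is the sum of s

def coupling_inverse(y):
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = s_net(y1), t_net(y1)   # y1 == x1, so s and t are recomputed exactly
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2

x = [0.3, -1.2, 0.7, 2.0]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
print(y, log_det, x_rec)
```

The round trip recovers x up to rounding, and the first half of y is an unchanged copy of the first half of x.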

2. Glow (2018)¶

Extends RealNVP:

┌─────────────────────────────┐
│        Glow Block           │
├─────────────────────────────┤
│  1. ActNorm                 │
│  2. 1x1 Invertible Conv     │
│  3. Affine Coupling         │
└─────────────────────────────┘
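     ↓ (repeated K times)

Improvements:
- ActNorm: replaces batch normalization
- 1x1 Conv: learns the channel ordering (a learned generalization of a fixed permutation)
- Multi-scale architecture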
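Both ActNorm and the 1x1 convolution apply the same channel-wise linear map at every spatial position, so their log-determinants scale with the feature-map size h·w. The sketch below illustrates this for a hypothetical 2-channel case in plain Python. The helper names are illustrative and do not come from the Glow codebase.

```python
import math

def actnorm_logdet(scale, height, width):
    """ActNorm y = scale * x + bias per channel: log-det = h * w * sum(log|scale_c|)."""
    return height * width * sum(math.log(abs(s)) for s in scale)

def conv1x1_2ch(pixel, weight):
    """Apply a 2x2 channel-mixing matrix to one pixel (a 2-channel vector)."""
    (a, b), (c, d) = weight
    return [a * pixel[0] + b * pixel[1], c * pixel[0] + d * pixel[1]]

def conv1x1_logdet_2ch(weight, height, width):
    """The same matrix acts on every pixel: log-det = h * w * log|det W|."""
    (a, b), (c, d) = weight
    return height * width * math.log(abs(a * d - b * c))

def inverse_2x2(weight):
    (a, b), (c, d) = weight
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

swap = [[0.0, 1.0], [1.0, 0.0]]              # a pure channel swap (a permutation)
swap_logdet = conv1x1_logdet_2ch(swap, 4, 4)  # a permutation has |det| = 1

weight = [[2.0, 1.0], [0.5, 1.5]]            # a general invertible mixing matrix
pixel = [0.3, -0.7]
mixed = conv1x1_2ch(pixel, weight)
restored = conv1x1_2ch(mixed, inverse_2x2(weight))
print(swap_logdet, conv1x1_logdet_2ch(weight, 4, 4), restored)
```

A fixed channel permutation contributes nothing to the log-likelihood, whereas a learned mixing matrix contributes h·w·log|det W|.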

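3. MAF (2017) / IAF (2016)¶

Autoregressive flows:

MAF (Masked Autoregressive Flow): efficient density evaluation (a single pass), slow sampling (one pass per dimension)
IAF (Inverse Autoregressive Flow): fast sampling, slow density evaluation for externally supplied data points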
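The asymmetry between the two directions can be seen in a small sketch of a MAF-style affine autoregressive transform. The `conditioner` function is a hypothetical stand-in for the masked network. In a real MAF the density direction is computed for all dimensions in one network pass, while the loop in the sampling direction cannot be parallelized.

```python
import math

# Hypothetical conditioner: (mu_i, alpha_i) depend only on the preceding entries.
def conditioner(prefix):
    total = sum(prefix)
    return 0.5 * total, math.tanh(0.3 * total)

def maf_to_latent(x):
    """Density direction x -> z. Every term depends only on the known x,
    so all dimensions could be computed in parallel."""
    z, log_det = [], 0.0
    for i in range(len(x)):
        mu, alpha = conditioner(x[:i])
        z.append((x[i] - mu) * math.exp(-alpha))
        log_det -= alpha                 # log|det dz/dx| = -sum(alpha)
    return z, log_det

def maf_sample(z):
    """Sampling direction z -> x. x_i needs x_1..x_{i-1}, so the loop is
    inherently sequential: d conditioner evaluations for d dimensions."""
    x = []
    for i in range(len(z)):
        mu, alpha = conditioner(x)       # uses the already generated prefix
        x.append(z[i] * math.exp(alpha) + mu)
    return x

z = [0.2, -1.0, 0.5]
x = maf_sample(z)
z_rec, log_det = maf_to_latent(x)
print(x, z_rec, log_det)
```

IAF uses the same affine form but conditions on the latent prefix instead, which swaps which direction is parallel and which is sequential.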

4. Continuous NF (Neural ODE 2018, FFJORD 2019)¶

Continuous-time transformation:

\[\frac{dz}{dt} = f(z(t), t; \theta)\]

\[\log p(z(t_1)) = \log p(z(t_0)) - \int_{t_0}^{t_1} \text{tr}\left(\frac{\partial f}{\partial z}\right) dt\]
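To make the two equations concrete, the sketch below integrates a hypothetical linear vector field dz/dt = A·z with fixed-step Euler, tracking the state and the log-density together. This is an illustration only: practical continuous flows use adaptive ODE solvers, and FFJORD estimates the trace stochastically rather than computing it exactly.

```python
import math

def std_normal_logpdf(z):
    return -0.5 * (z * z + math.log(2.0 * math.pi))

A = 0.7   # hypothetical dynamics: dz/dt = A * z, so tr(df/dz) = A

def integrate(z0, t0=0.0, t1=1.0, n_steps=1000):
    """Fixed-step Euler integration of the state and of the log-density."""
    dt = (t1 - t0) / n_steps
    z, log_p = z0, std_normal_logpdf(z0)
    for _ in range(n_steps):
        trace = A                    # trace of the Jacobian of the vector field
        z = z + dt * (A * z)         # dz/dt = f(z, t)
        log_p = log_p - dt * trace   # d(log p)/dt = -tr(df/dz)
    return z, log_p

z0 = 0.5
z1, log_p1 = integrate(z0)

# Reference: z(t1) = z0 * exp(A), which is distributed as N(0, exp(A)^2).
z1_exact = z0 * math.exp(A)
log_p1_exact = std_normal_logpdf(z0) - A
print(z1, z1_exact, log_p1, log_p1_exact)
```

The log-density matches the reference closely because the trace is constant here, while the state carries the usual Euler discretization error.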

5. TarFlow (2025)¶

Apple, ICML 2025. Addresses long-standing limitations of normalizing flows:

┌─────────────────────────────┐
│         TarFlow             │
├─────────────────────────────┤
│  - Scalable architecture    │
│  - High-quality generation  │
│  - Rivals GAN/diffusion     │
└─────────────────────────────┘

Key contributions:
- Shows empirically that normalizing flows are more capable than previously believed
- Competitive results in image generation

Implementation Examples¶

PyTorch (nflows library)¶

import torch
from nflows import flows, transforms, distributions

# Placeholder data (assumption): replace with real (N, 2) float tensors
data = torch.randn(512, 2)
test_data = torch.randn(128, 2)

# Base distribution
base_distribution = distributions.StandardNormal([2])

# Transform: two masked autoregressive layers with a permutation in between
transform = transforms.CompositeTransform([
    transforms.MaskedAffineAutoregressiveTransform(
        features=2,
        hidden_features=64
    ),
    transforms.RandomPermutation(features=2),
    transforms.MaskedAffineAutoregressiveTransform(
        features=2,
        hidden_features=64
    ),
])

# Flow model
flow = flows.Flow(transform, base_distribution)

# Training: minimize the negative log-likelihood (NLL)
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-3)

for epoch in range(1000):
    optimizer.zero_grad()
    loss = -flow.log_prob(data).mean()  # NLL
    loss.backward()
    optimizer.step()

# Sampling
samples = flow.sample(100)

# Density estimation
log_prob = flow.log_prob(test_data)

Implementing RealNVP from Scratch¶

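import math

import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim, hidden_dim=256):
        super().__init__()
        # dim must be even so that x splits into two equal halves
        self.net = nn.Sequential(
            nn.Linear(dim // 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, dim)  # outputs s and t (dim // 2 each)
        )

    def forward(self, x, reverse=False):
        x1, x2 = x.chunk(2, dim=-1)
        st = self.net(x1)
        s, t = st.chunk(2, dim=-1)
        s = torch.tanh(s)  # bound the log-scale for numerical stability

        if not reverse:
            y2 = x2 * torch.exp(s) + t
            log_det = s.sum(dim=-1)
        else:
            y2 = (x2 - t) * torch.exp(-s)
            log_det = -s.sum(dim=-1)

        return torch.cat([x1, y2], dim=-1), log_det

class RealNVP(nn.Module):
    def __init__(self, dim, n_flows=8):
        super().__init__()
        self.flows = nn.ModuleList([
            AffineCoupling(dim) for _ in range(n_flows)
        ])

    def forward(self, x):
        # Data -> latent (the normalizing direction)
        log_det_total = 0
        for flow in self.flows:
            x, log_det = flow(x)
            log_det_total = log_det_total + log_det
            x = x.flip(-1)  # reverse the dimensions so the other half is transformed next
        return x, log_det_total

    def inverse(self, z):
        # Latent -> data: undo each flip, then invert the coupling layer
        log_det_total = 0
        for flow in reversed(self.flows):
            z = z.flip(-1)
            z, log_det = flow(z, reverse=True)
            log_det_total = log_det_total + log_det
        return z, log_det_total

    def log_prob(self, x):
        z, log_det = self.forward(x)
        # Standard normal log-density; math.log replaces the unimported np.log
        log_pz = -0.5 * (z ** 2 + math.log(2 * math.pi)).sum(dim=-1)
        return log_pz + log_det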

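Comparison: Generative Models¶

Model	Likelihood	Sampling speed	Sample quality	Training stability
VAE	Approximate	Fast	Medium	High
GAN	None	Fast	High	Low
Diffusion	Approximate	Slow	Very high	High
NF	Exact	Fast (slow for MAF)	Medium to high	High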

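Applications¶

1. Density Estimation¶

# Anomaly detection: a low log-likelihood gives a high anomaly score
log_probs = flow.log_prob(test_data)
anomaly_scores = -log_probs
# threshold is user-chosen, e.g. a high quantile of scores on normal data
anomalies = anomaly_scores > threshold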
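The snippet above leaves `threshold` undefined. One common choice, sketched here as an assumption rather than a prescription, is a high quantile of the anomaly scores on held-out normal data. To stay dependency-free, `std_normal_nll` stands in for `-flow.log_prob` and `quantile_threshold` is a hypothetical helper.

```python
import math
import random

def quantile_threshold(scores, q=0.99):
    """Return the q-quantile of the scores (nearest-rank method)."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def std_normal_nll(x):
    """Stand-in for -flow.log_prob(x): negative log-density under N(0, 1)."""
    return 0.5 * (x * x + math.log(2.0 * math.pi))

rng = random.Random(0)
# Scores on held-out data assumed to be free of anomalies
validation_scores = [std_normal_nll(rng.gauss(0.0, 1.0)) for _ in range(1000)]
threshold = quantile_threshold(validation_scores, q=0.99)

test_points = [0.1, -0.5, 6.0]
flags = [std_normal_nll(x) > threshold for x in test_points]
print(threshold, flags)
```

With q = 0.99, roughly 1% of normal data is flagged by construction, so q directly sets the expected false-positive rate.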

2. Data Augmentation¶

# Generate samples that resemble the training data
augmented_samples = flow.sample(1000)

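3. Variational Inference¶

# Posterior approximation
class FlowPosterior(nn.Module):
    def __init__(self, flow, prior):
        super().__init__()  # required before assigning submodules
        self.flow = flow
        self.prior = prior

    def sample(self, n):
        # Assumes prior.sample(n) returns n latent draws and
        # flow.inverse(z) returns (x, log_det)
        z = self.prior.sample(n)
        x, _ = self.flow.inverse(z)
        return x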

4. Conditional Generation¶

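class ConditionalFlow(nn.Module):
    def __init__(self, x_dim, c_dim):
        super().__init__()  # required before assigning submodules
        self.embedding = nn.Linear(c_dim, 64)
        # Assumes a RealNVP variant whose coupling networks also take a
        # condition vector; the RealNVP class above does not accept these arguments
        self.flow = RealNVP(x_dim, condition_dim=64)

    def forward(self, x, c):
        c_emb = self.embedding(c)
        return self.flow(x, condition=c_emb)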
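One way to condition a coupling layer, sketched here in plain Python under stated assumptions, is to feed the condition embedding to the s and t networks alongside x_1. The layer stays invertible for any fixed condition, because the same (x_1, c) is available in both directions. The `conditioner` function is a hypothetical stand-in for those networks.

```python
import math

def conditioner(x1, c_emb):
    """Hypothetical s/t 'network' that sees both x1 and the condition embedding."""
    ctx = sum(c_emb)
    s = [math.tanh(0.5 * v + 0.1 * ctx) for v in x1]
    t = [v + ctx for v in x1]
    return s, t

def conditional_coupling(x, c_emb, reverse=False):
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = conditioner(x1, c_emb)    # identical inputs in both directions
    if not reverse:
        out = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
        log_det = sum(s)
    else:
        out = [(b - ti) * math.exp(-si) for b, si, ti in zip(x2, s, t)]
        log_det = -sum(s)
    return x1 + out, log_det

x = [0.3, -1.2, 0.7, 2.0]
c = [0.5, -0.2]
y, log_det = conditional_coupling(x, c)
x_rec, log_det_inv = conditional_coupling(y, c, reverse=True)
y_other, _ = conditional_coupling(x, [1.0, 1.0])   # a different condition
print(y, x_rec, y_other)
```

The same input maps to different outputs under different conditions, which is what lets the flow model p(x | c).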

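Pros and Cons¶

Pros¶

Exact likelihood: well suited to density estimation and anomaly detection
Invertible: every data point maps to a unique latent code, so the latent space can be inspected
Stable training: maximum-likelihood training avoids the mode collapse seen in GANs
Single pass: faster sampling than diffusion models (autoregressive flows such as MAF are the exception)

Cons¶

Dimensionality constraint: input and output dimensions must match
Limited expressiveness: invertible continuous maps preserve topology, which makes complex topological changes hard
Memory: standard backpropagation stores all intermediate activations of a deep flow
Architectural constraint: every layer must remain invertible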

Key Papers¶

Paper	Year	Contribution
NICE	2015	Additive coupling
RealNVP	2017	Affine coupling
MAF / IAF	2017 / 2016	Autoregressive flows
Glow	2018	1x1 conv, high-res
FFJORD	2019	Continuous NF (ODE)
TarFlow	2025	Scalable high-quality NF

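Summary¶

Normalizing flows learn complex distributions through invertible transformations, and their key advantage is exact likelihood computation. They are well suited to density estimation, anomaly detection, and variational inference, and recent work such as TarFlow (2025) has substantially improved their generation quality.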