Agentic AI¶

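AI agent systems that pursue and achieve goals autonomously.

Overview¶

Agentic AI refers to AI systems that go beyond simple question answering and carry out complex tasks autonomously through planning, tool use, and iterative reasoning. It emerged as a core trend in applied LLM work during 2024-2026.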

Chatbot vs Agent¶

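Feature	Chatbot	Agent
Interaction	Single-turn	Multi-turn
Goal	Generate a response	Complete a task
Tool use	Limited	Core capability
Autonomy	Passive	Proactive
Environment awareness	None	Yes
Memory	Limited to the session	Long-term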

Core Components¶

1. Planning¶

Decompose a complex goal into step-by-step tasks.

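# ReAct pattern: interleave Thought -> Action -> Observation
# (is_complete and execute_tool are helpers assumed to be defined elsewhere)
def react_loop(goal, llm, tools, max_steps=10):
    context = []  # history of (thought, action, result) triples

    # max_steps is an added safeguard so the loop always terminates
    for _ in range(max_steps):
        if is_complete(context):
            break

        # Thought: analyze the current situation with the goal in view
        thought = llm.generate(
            f"Goal: {goal}\nHistory so far: {context}\nWhat should I do next?"
        )

        # Action: select a tool and execute it
        action = llm.generate(f"Select tool for: {thought}")
        result = execute_tool(action, tools)

        # Observation: record the result
        context.append((thought, action, result))

    return context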

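Planning strategies:

Strategy	Description	Best suited for
Zero-shot	Act immediately without an explicit plan	Simple tasks
Chain-of-Thought	Step-by-step reasoning	Tasks that require logical reasoning
Plan-and-Execute	Plan first, then execute	Complex tasks
Tree-of-Thoughts	Explore multiple reasoning paths	Uncertain situations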
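
The Plan-and-Execute row can be illustrated with a small sketch. This is one possible shape under stated assumptions, not a reference implementation: `stub_planner` and `stub_executor` are hypothetical stand-ins for real LLM calls, and the "one step per line" plan format is an assumption made for the example.

```python
from typing import Callable, List, Tuple


def plan_and_execute(goal: str,
                     planner: Callable[[str], str],
                     executor: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Ask the planner for a full plan up front, then run each step in order."""
    plan_text = planner(f"List the steps needed to achieve: {goal}")
    # Assumed format: one step per non-empty line, optionally prefixed
    # with a list marker such as "1. " or "- " (stripped here).
    steps = [line.strip().lstrip("0123456789.- ").strip()
             for line in plan_text.splitlines() if line.strip()]
    return [(step, executor(step)) for step in steps]


# Hypothetical stubs standing in for real LLM calls
def stub_planner(prompt: str) -> str:
    return "1. Search for sources\n2. Summarize findings\n"


def stub_executor(step: str) -> str:
    return f"done: {step}"


trace = plan_and_execute("write a report", stub_planner, stub_executor)
```

Unlike the ReAct loop, the plan here is fixed before any step runs. A fuller version would typically re-plan when a step fails.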

2. Tool Use¶

Access external APIs, code execution, and databases.

tools = {
    "search": {
        "description": "Search the web for information",
        "parameters": {"query": "string"},
        "function": web_search
    },
    "calculator": {
        "description": "Perform mathematical calculations",
        "parameters": {"expression": "string"},
        "function": calculate
    },
    "code_executor": {
        "description": "Execute Python code",
        "parameters": {"code": "string"},
        "function": exec_python
    }
}
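
A registry in this shape still needs a dispatcher that checks a request before calling the function. The sketch below shows one way to do that. The JSON request format (`{"tool": ..., "args": ...}`) and the toy `demo_tools` registry are assumptions made for illustration, not part of any specific framework.

```python
import json


def call_tool(tools: dict, request_json: str) -> dict:
    """Validate a JSON tool request against the registry, then dispatch it."""
    request = json.loads(request_json)
    name, args = request["tool"], request.get("args", {})

    if name not in tools:
        return {"error": f"unknown tool: {name}"}

    # Require exactly the parameter names declared in the registry
    expected = set(tools[name]["parameters"])
    if set(args) != expected:
        return {"error": f"expected parameters {sorted(expected)}"}

    return {"result": tools[name]["function"](**args)}


# Hypothetical registry with a single toy tool, in the same shape as above
demo_tools = {
    "calculator": {
        "description": "Add integers written as 'a+b'",
        "parameters": {"expression": "string"},
        "function": lambda expression: sum(int(x) for x in expression.split("+")),
    }
}

ok = call_tool(demo_tools, '{"tool": "calculator", "args": {"expression": "2+3"}}')
bad = call_tool(demo_tools, '{"tool": "shell", "args": {}}')
```

Returning an error dictionary instead of raising lets the agent see the failure as an observation and try a different action.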

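3. Memory¶

Store past interactions and learned information.

Memory hierarchy
├── Short-term: current conversation context
├── Long-term: past interactions stored in a vector DB
├── Episodic: records of specific task executions
└── Semantic: learned knowledge and facts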
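
The short-term and long-term tiers can be sketched as follows. This is a toy illustration: the `AgentMemory` class is hypothetical, and plain word overlap stands in for the embedding similarity search a vector DB would provide.

```python
from collections import deque


class AgentMemory:
    """Toy two-tier memory: a bounded short-term buffer plus a searchable long-term store."""

    def __init__(self, short_term_size: int = 3):
        self.short_term = deque(maxlen=short_term_size)  # most recent turns only
        self.long_term = []                              # everything ever stored

    def add(self, text: str) -> None:
        self.short_term.append(text)
        self.long_term.append(text)

    def recall(self, query: str, k: int = 2) -> list:
        """Return up to k long-term entries ranked by word overlap with the query."""
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(t.lower().split())), t)
                  for t in self.long_term]
        # Keep only entries that share at least one word; highest overlap first
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [text for _, text in ranked[:k]]


memory = AgentMemory(short_term_size=2)
for turn in ["user likes python", "deploy failed on friday", "user asked about rust"]:
    memory.add(turn)
```

The oldest turn drops out of `short_term` once the buffer is full, but it stays retrievable through `recall`.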

4. Reflection¶

Evaluate and improve the agent's own output.

def reflection_loop(task, llm, max_iterations=3):
    response = llm.generate(task)

    for _ in range(max_iterations):
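        critique = llm.generate(
            f"Critique this response: {response}\n"
            f"What could be improved? "
            f"Reply 'looks good' if no changes are needed."
        )

        # Stop early once the critic has no further objections
        if "looks good" in critique.lower():
            break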

        response = llm.generate(
            f"Improve based on feedback: {critique}\n"
            f"Original: {response}"
        )

    return response

Agent Architectures¶

Single Agent¶

User → LLM ↔ Tools → Response
         ↓
      Memory

Multi-Agent¶

Multiple agents collaborate to carry out a complex task.

Orchestrator
├── Research Agent → Web Search
├── Analyst Agent → Data Processing
├── Writer Agent → Content Generation
└── Critic Agent → Quality Check

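Collaboration patterns:

Pattern	Description	Example
Sequential	Agents run one after another	Pipeline
Hierarchical	Supervisor-subordinate structure	Manager-Worker
Debate	Agents debate, then reach consensus	Decision-making
Voting	Majority vote	Ensemble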
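
Two of these patterns, Sequential and Voting, are simple enough to sketch directly. In this illustration the agents are plain functions standing in for LLM-backed agents. The names `voters` and `pipeline` are hypothetical.

```python
from collections import Counter
from typing import Callable, List

Agent = Callable[[str], str]


def sequential(agents: List[Agent], task: str) -> str:
    """Sequential pattern: pipe the output of each agent into the next one."""
    output = task
    for agent in agents:
        output = agent(output)
    return output


def majority_vote(agents: List[Agent], task: str) -> str:
    """Voting pattern: run every agent on the same task, return the most common answer."""
    answers = [agent(task) for agent in agents]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Hypothetical agents for demonstration
voters = [lambda t: "approve", lambda t: "reject", lambda t: "approve"]
pipeline = [str.strip, str.upper, lambda t: t + "!"]

decision = majority_vote(voters, "ship the release?")
shouted = sequential(pipeline, "  hello ")
```

Voting assumes that answers can be compared for equality. Free-form text usually needs normalizing or a judging step before votes are counted.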

Agentic RAG¶

Integrate RAG into an agent framework.

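def agentic_rag(query, agent):
    # 1. Analyze the query
    analysis = agent.analyze(query)

    # 2. Choose a retrieval strategy
    if analysis.needs_multi_step:
        sub_queries = agent.decompose(query)
        # Flatten so `results` has the same shape on both branches
        # (assumes retrieve() returns a list of documents)
        results = [doc for q in sub_queries for doc in retrieve(q)]
    else:
        results = retrieve(query)

    # 3. Check whether the retrieved evidence is sufficient
    if not agent.is_sufficient(results):
        # Search further or fall back to other tools
        results = agent.expand_search(query, results)

    # 4. Generate the answer and verify it against the evidence
    answer = agent.generate(query, results)

    if not agent.verify(answer, results):
        answer = agent.refine(answer)

    return answer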

Frameworks¶

LangGraph¶

State-based agent workflows.

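from langgraph.graph import StateGraph, END

# AgentState (the state schema), the three node functions, and
# should_continue are assumed to be defined elsewhere
workflow = StateGraph(AgentState)

# Define nodes
workflow.add_node("research", research_agent)
workflow.add_node("analyze", analyze_agent)
workflow.add_node("write", write_agent)

# Define edges; the graph needs an entry point and a path to END
workflow.set_entry_point("research")
workflow.add_edge("research", "analyze")
workflow.add_conditional_edges(
    "analyze",
    should_continue,  # returns "continue" or "end"
    {"continue": "write", "end": END}
)
workflow.add_edge("write", END)

app = workflow.compile()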

AutoGen (Microsoft)¶

Multi-agent conversation.

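# Classic AutoGen (pyautogen 0.2-style) API; `config` is an LLM config dict assumed to be defined elsewhere
from autogen import AssistantAgent, UserProxyAgent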

assistant = AssistantAgent("assistant", llm_config=config)
user_proxy = UserProxyAgent("user", code_execution_config={"work_dir": "coding"})

user_proxy.initiate_chat(assistant, message="Create a data visualization")

CrewAI¶

Role-based agent teams.

from crewai import Agent, Task, Crew

researcher = Agent(
    role="Research Analyst",
    goal="Find relevant information",
    backstory="Expert at finding and synthesizing information"
)

analyst = Agent(
    role="Data Analyst", 
    goal="Analyze and interpret data",
    backstory="Expert at statistical analysis"
)

crew = Crew(agents=[researcher, analyst], tasks=[...])
result = crew.kickoff()

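Evaluation¶

Benchmarks¶

Benchmark	What it measures
GAIA	General-purpose agent capability
AgentBench	Task performance across diverse environments
WebArena	Web automation
SWE-bench	Software engineering
MINT	Tool use + reasoning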

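Evaluation Metrics¶

Metric	Description
Task Success Rate	Fraction of tasks completed successfully
Steps to Completion	Number of steps taken to finish a task
Tool Accuracy	Rate of selecting the correct tool
Reasoning Quality	Logical soundness of the reasoning
Safety	Avoidance of risky actions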
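
The three quantitative metrics can be computed from logged runs, as in the sketch below. The `Episode` record and its field names are illustrative assumptions, and the sample log is made-up data used only to exercise the function. Reasoning Quality and Safety usually need human or model-based judging, so they are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    """One logged agent run (field names are illustrative)."""
    success: bool
    steps: int
    tool_calls: int
    correct_tool_calls: int


def summarize(episodes: List[Episode]) -> dict:
    """Aggregate success rate, steps to completion, and tool accuracy.

    Assumes `episodes` is non-empty.
    """
    successes = [e for e in episodes if e.success]
    total_calls = sum(e.tool_calls for e in episodes)
    return {
        "task_success_rate": len(successes) / len(episodes),
        # Averaged over successful runs only; None if nothing succeeded
        "avg_steps_to_completion": (
            sum(e.steps for e in successes) / len(successes) if successes else None
        ),
        "tool_accuracy": (
            sum(e.correct_tool_calls for e in episodes) / total_calls
            if total_calls else None
        ),
    }


# Made-up sample log
log = [
    Episode(success=True, steps=4, tool_calls=3, correct_tool_calls=3),
    Episode(success=False, steps=10, tool_calls=5, correct_tool_calls=2),
    Episode(success=True, steps=6, tool_calls=2, correct_tool_calls=1),
    Episode(success=True, steps=5, tool_calls=2, correct_tool_calls=2),
]
report = summarize(log)
```

Steps are averaged over successful runs only, so that failed runs which hit a step cap do not distort the figure.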

Practical Considerations¶

Safety¶

# Restrict which tools may be executed
ALLOWED_TOOLS = ["search", "calculator", "read_file"]
FORBIDDEN_ACTIONS = ["delete", "execute_code", "send_email"]

def safe_execute(action, tool):
    if tool not in ALLOWED_TOOLS:
        return "Tool not allowed"
    if any(f in action for f in FORBIDDEN_ACTIONS):
        return "Action forbidden"
    return execute(action, tool)

Human-in-the-Loop¶

Require human approval for important decisions.

def agent_with_approval(task, agent, approval_threshold=0.7):
    plan = agent.plan(task)

    for step in plan:
        if step.risk_score > approval_threshold:
            if not get_human_approval(step):
                return "Aborted by human"

        agent.execute(step)

    return agent.result()

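Cost Management¶

Strategy	Description
Token Budget	Cap the maximum number of tokens
Step Limit	Cap the maximum number of steps
Model Routing	Use smaller models for simple tasks
Caching	Cache repeated queries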
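
The four strategies can be combined in a small amount of code. The sketch below is illustrative only: the `Budget` class, the length-based routing heuristic, and the model names `small-model` and `large-model` are hypothetical placeholders, and the cached function body stands in for a real LLM call.

```python
from functools import lru_cache


class BudgetExceeded(Exception):
    """Raised when a run would exceed its token or step cap."""


class Budget:
    """Tracks token and step usage against hard caps."""

    def __init__(self, max_tokens: int, max_steps: int):
        self.max_tokens, self.max_steps = max_tokens, max_steps
        self.tokens_used = 0
        self.steps_used = 0

    def charge(self, tokens: int) -> None:
        """Record one step; raise before either cap would be exceeded."""
        if self.steps_used + 1 > self.max_steps:
            raise BudgetExceeded("step limit reached")
        if self.tokens_used + tokens > self.max_tokens:
            raise BudgetExceeded("token budget exhausted")
        self.steps_used += 1
        self.tokens_used += tokens


def route_model(prompt: str, threshold_words: int = 20) -> str:
    """Model routing by a crude length heuristic (model names are placeholders)."""
    return "small-model" if len(prompt.split()) <= threshold_words else "large-model"


@lru_cache(maxsize=256)
def cached_answer(query: str) -> str:
    """Caching: identical queries are computed once (body stands in for an LLM call)."""
    return f"answer to: {query}"


budget = Budget(max_tokens=100, max_steps=2)
budget.charge(40)
budget.charge(50)
```

Because `charge` checks the caps before updating the counters, a rejected step leaves the recorded usage unchanged.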

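Use Cases¶

Domain	Examples
Software development	Devin, GitHub Copilot Workspace
Customer support	Autonomous ticket resolution
Research	Literature review, experiment design
Data analysis	Automated report generation
Operations	IT automation, incident response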

References¶

Papers¶

"ReAct: Synergizing Reasoning and Acting in Language Models" (2023)
"Toolformer: Language Models Can Teach Themselves to Use Tools" (2023)
"AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation" (2023)