πŸ“„ AutoAgents: A Framework for Automatic Agent Generation

πŸ“‹ Metadata

| Item | Content |
|---|---|
| Paper title | AutoAgents: A Framework for Automatic Agent Generation |
| Authors | Guangyao Chen, Siwei Dong, Yu Shu, Ge Zhang, Jaward Sesay, BΓΆrje F. Karlsson, Jie Fu, Yemin Shi |
| Affiliations | Peking University, Hong Kong University of Science and Technology (HKUST), Beijing Academy of Artificial Intelligence, University of Waterloo |
| Venue | IJCAI 2024 (Main Track), pages 22-30 |
| Year | 2024 |
| arXiv | arXiv:2309.17288 |
| DOI | 10.24963/ijcai.2024/3 |
| GitHub | Link-AGI/AutoAgents |
| Corresponding authors | gy.chen@pku.edu.cn, ymshi@pku.edu.cn, jiefu@ust.hk |

🎯 One-Line Summary

νƒœμŠ€ν¬μ— λ§žλŠ” μ „λ¬Έ μ—μ΄μ „νŠΈ νŒ€μ„ μžλ™μœΌλ‘œ μƒμ„±ν•˜κ³  μ‘°μœ¨ν•˜λŠ” μ μ‘ν˜• λ©€ν‹°μ—μ΄μ „νŠΈ ν”„λ ˆμž„μ›Œν¬λ‘œ, 사전 μ •μ˜λœ μ—μ΄μ „νŠΈμ— μ˜μ‘΄ν•˜μ§€ μ•Šκ³  λ™μ μœΌλ‘œ 역할을 μƒμ„±ν•˜μ—¬ λ³΅μž‘ν•œ νƒœμŠ€ν¬λ₯Ό ν•΄κ²°ν•œλ‹€.


πŸ” 연ꡬ λ°°κ²½ 및 동기

기쑴 문제점

  1. 사전 μ •μ˜λœ μ—μ΄μ „νŠΈμ˜ ν•œκ³„
    • λŒ€λΆ€λΆ„μ˜ κΈ°μ‘΄ LLM 기반 λ©€ν‹°μ—μ΄μ „νŠΈ μ‹œμŠ€ν…œμ€ 사전에 μ •μ˜λœ(predefined) μ—μ΄μ „νŠΈμ— 의쑴
    • λ‹¨μˆœν•œ νƒœμŠ€ν¬ μ²˜λ¦¬μ—λŠ” μ ν•©ν•˜λ‚˜, λ‹€μ–‘ν•œ μ‹œλ‚˜λ¦¬μ˜€μ— λŒ€ν•œ 적응성이 λΆ€μ‘±
    • μˆ˜λ™μœΌλ‘œ λ‹€μˆ˜μ˜ μ „λ¬Έκ°€ μ—μ΄μ „νŠΈλ₯Ό μƒμ„±ν•˜λŠ” 것은 λ§Žμ€ λ¦¬μ†ŒμŠ€ μ†Œλͺ¨
  2. ν˜‘μ—… λ²”μœ„μ˜ μ œν•œ
    • νŠΉμ • μ—­ν• κ³Ό 인간 감독이 ν•„μš”ν•œ μˆ˜λ™ 섀계 μ—μ΄μ „νŠΈ
    • ν˜‘μ—… μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ λ²”μœ„κ°€ μ œν•œλ¨
    • νƒœμŠ€ν¬λ³„ 졜적의 μ—μ΄μ „νŠΈ ꡬ성을 μ°ΎκΈ° 어렀움
  3. ν™•μž₯μ„± 문제
    • κΈ°μ‘΄ 연ꡬ듀은 인간이 μ„€κ³„ν•œ ν”„λ ˆμž„μ›Œν¬μ— 크게 의쑴
    • μ—μ΄μ „νŠΈ μ‹œμŠ€ν…œμ˜ κΈ°λŠ₯ λ²”μœ„μ™€ ν™•μž₯성이 μ œν•œλ¨

연ꡬ 동기

  • 인간 νŒ€μ²˜λŸΌ λ‹€μ–‘ν•œ μ „λ¬Έκ°€κ°€ ν˜‘λ ₯ν•˜μ—¬ λ³΅μž‘ν•œ 문제λ₯Ό ν•΄κ²°ν•˜λŠ” AI μ‹œμŠ€ν…œ ꡬ좕
  • νƒœμŠ€ν¬ λ‚΄μš©μ— 따라 μžλ™μœΌλ‘œ μ μ ˆν•œ μ—μ΄μ „νŠΈ νŒ€μ„ κ΅¬μ„±ν•˜λŠ” λ©”μ»€λ‹ˆμ¦˜ ν•„μš”
  • 인간 κ·Έλ£Ή λ‚΄ 닀양성이 λ‹€μ–‘ν•œ 관점을 μ΄‰μ§„ν•˜κ³  κ·Έλ£Ή μ„±κ³Όλ₯Ό ν–₯μƒμ‹œν‚¨λ‹€λŠ” κ²½ν—˜μ  증거에 기반

πŸ’‘ Key Ideas

1. Dynamic Agent Generation

Analyze the task content and dynamically generate the required expert agents:

Task Input β†’ Agent Generation β†’ Specialized Agent Team
  • Couples tasks with the roles needed to solve them
  • Automatically derives the required agents from the task content
  • Builds an execution plan around the generated expert agents

2. Two-Stage Process

Drafting Stage

  • Three predefined agents (Planner, Agent Observer, Plan Observer) hold a collaborative discussion
  • They synthesize a customized agent team for the input problem/task
  • They produce an execution plan suited to the task

Execution Stage

  • The plan is refined through inter-agent collaboration and feedback
  • Self-refinement and collaborative refinement are performed
  • The final output is produced

3. Observer Mechanism

  • Agent Observer: reviews the suitability of the generated agents
  • Plan Observer: reviews the soundness of the execution plan
  • Action Observer: reviews actions and results during execution
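
The algorithm flow below assumes that each observer returns structured feedback the Planner can act on (fields such as needs_revision and approved). A minimal sketch of such a structure; the names are assumptions made for this post, not the paper's actual implementation:

# Hypothetical observer feedback object used in the drafting loop.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ObserverFeedback:
    needs_revision: bool                                # Planner should revise if True
    approved: bool                                      # observer has signed off
    comments: List[str] = field(default_factory=list)   # concrete issues found

# Example: the Agent Observer flags a missing role.
feedback = ObserverFeedback(
    needs_revision=True,
    approved=False,
    comments=["No agent is responsible for testing and debugging"],
)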

4. Refinement Mechanisms

Self-Refinement

  • A single agent improves its own ability to perform its specialized task
  • Continuous improvement through a plan β†’ execute β†’ feedback loop

Collaborative Refinement

  • Knowledge is shared among multiple agents
  • Enables tasks that require interdisciplinary expertise
  • Collaboration proceeds in a sequential, turn-taking fashion

πŸ—οΈ μ•„ν‚€ν…μ²˜ / 방법둠

전체 μ‹œμŠ€ν…œ μ•„ν‚€ν…μ²˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        AutoAgents Framework                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚                    DRAFTING STAGE                            β”‚ β”‚
β”‚  β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”‚ β”‚
β”‚  β”‚  β”‚ Planner │◄─►│ Agent Observer │◄─►│ Plan Observer  β”‚      β”‚ β”‚
β”‚  β”‚  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚ β”‚
β”‚  β”‚       β”‚                β”‚                     β”‚               β”‚ β”‚
β”‚  β”‚       β–Ό                β–Ό                     β–Ό               β”‚ β”‚
β”‚  β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚ β”‚
β”‚  β”‚  β”‚        Customized Agent Team + Execution Plan           β”‚β”‚ β”‚
β”‚  β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                              β”‚                                    β”‚
β”‚                              β–Ό                                    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚                   EXECUTION STAGE                            β”‚ β”‚
β”‚  β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚ β”‚
β”‚  β”‚  β”‚              Generated Agent Team                        β”‚β”‚ β”‚
β”‚  β”‚  β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚β”‚ β”‚
β”‚  β”‚  β”‚  β”‚Agent 1 β”‚ β”‚Agent 2 β”‚ β”‚Agent 3 β”‚ β”‚Agent N β”‚           β”‚β”‚ β”‚
β”‚  β”‚  β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜           β”‚β”‚ β”‚
β”‚  β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚ β”‚
β”‚  β”‚                              β”‚                               β”‚ β”‚
β”‚  β”‚       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚ β”‚
β”‚  β”‚       β–Ό                      β–Ό                      β–Ό       β”‚ β”‚
β”‚  β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚
β”‚  β”‚  β”‚Self-Refinementβ”‚    β”‚Collaborative     β”‚    β”‚  Action   β”‚ β”‚ β”‚
β”‚  β”‚  β”‚              β”‚    β”‚Refinement        β”‚    β”‚ Observer  β”‚ β”‚ β”‚
β”‚  β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                              β”‚                                    β”‚
β”‚                              β–Ό                                    β”‚
β”‚                       Final Output                                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

핡심 μ»΄ν¬λ„ŒνŠΈ

1. Planner (κ³„νšμž)

# Role: decides the expert roles and draws up the execution plan
class Planner:
    def __init__(self, llm):
        self.llm = llm

    def generate_agents(self, task):
        """νƒœμŠ€ν¬ 뢄석 ν›„ ν•„μš”ν•œ μ—μ΄μ „νŠΈ μ—­ν•  생성"""
        prompt = f"""
        Task: {task}
        Analyze the task and determine:
        1. Required expert roles
        2. Each role's responsibilities
        3. Execution plan
        """
        return self.llm.generate(prompt)

    def create_execution_plan(self, agents, task):
        """μƒμ„±λœ μ—μ΄μ „νŠΈ 기반 μ‹€ν–‰ κ³„νš 수립"""
        pass

2. Observer Components

# Agent Observer: validates the generated agents
class AgentObserver:
    def validate_agents(self, agents, task):
        """μ—μ΄μ „νŠΈμ˜ 적합성 및 μ™„μ „μ„± κ²€ν† """
        # check for duplicate roles
        # check for missing required roles
        # check that role definitions are clear
        pass

# Plan Observer: validates the execution plan
class PlanObserver:
    def validate_plan(self, plan, agents, task):
        """μ‹€ν–‰ κ³„νšμ˜ μ‹€ν˜„ κ°€λŠ₯μ„± κ²€ν† """
        # feasibility of each step
        # match between plan steps and agent roles
        # logical ordering of the steps
        pass

# Action Observer: validates execution results
class ActionObserver:
    def validate_action(self, action, result):
        """μ‹€ν–‰ 결과의 ν’ˆμ§ˆ κ²€ν† """
        pass

3. Refinement Mechanisms

# Self-Refinement: a single agent iteratively improves its own result
def self_refinement(agent, task, max_iterations=5):
    result = agent.execute(task)
    for i in range(max_iterations):
        feedback = agent.evaluate(result)
        if feedback.is_satisfactory:
            break
        result = agent.improve(result, feedback)
    return result

# Collaborative Refinement: multiple agents improve the result together
def collaborative_refinement(agents, task, max_rounds=5):
    chat_history = []
    current_result = None

    for _ in range(max_rounds):
        for agent in agents:
            # generate a response based on the previous agents' utterances
            response = agent.generate(
                task=task,
                chat_history=chat_history,
                current_result=current_result
            )
            chat_history.append(response)
            current_result = response.result

        # ν•©μ˜ 도달 μ—¬λΆ€ 확인
        if check_consensus(agents, current_result):
            break

    return current_result

Algorithm Flow

Algorithm: AutoAgents Framework
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Input: Task T, LLM M, max_draft_iterations=3, max_refine_iterations=5
Output: Final Result R

# Phase 1: Drafting Stage
1. Initialize Planner, AgentObserver, PlanObserver with M
2. for i = 1 to max_draft_iterations:
   a. agents_list ← Planner.generate_agents(T)
   b. agent_feedback ← AgentObserver.validate(agents_list)
   c. if agent_feedback.needs_revision:
      agents_list ← Planner.revise_agents(agents_list, agent_feedback)
   d. execution_plan ← Planner.create_plan(agents_list, T)
   e. plan_feedback ← PlanObserver.validate(execution_plan)
   f. if plan_feedback.needs_revision:
      execution_plan ← Planner.revise_plan(execution_plan, plan_feedback)
   g. if AgentObserver.approved AND PlanObserver.approved:
      break

# Phase 2: Execution Stage
3. Initialize ActionObserver with M
4. for each step in execution_plan:
   a. assigned_agents ← get_agents_for_step(step, agents_list)
   b. if step.requires_collaboration:
      result ← collaborative_refinement(assigned_agents, step)
   else:
      result ← self_refinement(assigned_agents[0], step)
   c. action_feedback ← ActionObserver.validate(result)
   d. if action_feedback.needs_revision:
      result ← refine_result(result, action_feedback)
   e. update_shared_memory(result)

5. R ← aggregate_results()
6. return R

ν•˜μ΄νΌνŒŒλΌλ―Έν„° μ„€μ •

νŒŒλΌλ―Έν„° κ°’ μ„€λͺ…
max_draft_discussions 3 Drafting 단계 μ΅œλŒ€ ν† λ‘  횟수
max_self_refinement 5 자기 κ°œμ„  μ΅œλŒ€ 반볡 횟수
max_collaborative_refinement 5 ν˜‘λ ₯적 κ°œμ„  μ΅œλŒ€ 반볡 횟수
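
If you reproduce the setup, it is convenient to keep these limits in a single configuration object. A minimal sketch; the parameter names follow the table above, while the grouping into a dataclass is my own:

# Hypothetical configuration object for the limits listed above.
from dataclasses import dataclass

@dataclass
class AutoAgentsConfig:
    max_draft_discussions: int = 3         # drafting-stage discussion rounds
    max_self_refinement: int = 5           # self-refinement iterations per agent
    max_collaborative_refinement: int = 5  # collaborative-refinement rounds

config = AutoAgentsConfig()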

πŸ“Š Experiments and Results

Experimental Setup

Evaluation Tasks

  1. Open-ended Question Answering: answering open-ended questions
  2. Trivia Creative Writing: creative writing grounded in trivia knowledge
  3. Software Development (Case Study): a software development case study (a Tetris game)

Baselines

  • Standard LLM: a single LLM without agent generation
  • SPP (Solo Performance Prompting): prompts a single LLM with example agent personas
  • AgentVerse: generates execution plans through agent discussion
  • MetaGPT: a multi-agent system built on predefined roles

Evaluation Metrics

  • FairEval (LLM-based): fair evaluation performed by an LLM
  • Human Evaluation: quality assessment by human evaluators
  • Accuracy: accuracy measurement

Main Results

Quantitative Results

| Method | Open-ended QA | Knowledge Acquisition |
|---|---|---|
| Standard | Baseline | Baseline |
| SPP | +5% | – |
| AgentVerse | +7% | – |
| AutoAgents | +10% | +10% |

Key findings:
1. AutoAgents achieves roughly a 10% improvement over the Standard setup across the experiments
2. It outperforms SPP, which also leverages generated agent personas but with a different approach
3. It surpasses individual LLM models under both FairEval-based LLM evaluation and human evaluation

Qualitative Results

Response quality on open-ended questions:
– Synthesizing multiple expert personas yields more comprehensive and nuanced answers
– Produces more consistent and accurate solutions than existing multi-agent methods

Case Study: Tetris Game Development

Agent team generated by AutoAgents:
1. Game Design Expert: designs the game mechanics
2. UI Design Expert: designs the user interface
3. Programmer: implements the core code
4. Debugging Expert: handles testing and debugging

Outcome:
– More thorough documentation and program code
– Deliverables that are easier for users to understand
– A more polished game built through collaboration

Ablation Study Results

| Component removed | Performance change |
|---|---|
| Agent Observer | Performance drop |
| Plan Observer | Performance drop |
| Self-Refinement | Substantial performance drop |
| Collaborative Refinement | Substantial performance drop |

πŸ’ͺ Strengths and Contributions

Academic Contributions

  1. A new framework
    • Dynamically synthesizes and coordinates customized AI teams for diverse tasks
    • Effectively couples tasks with the roles required to solve them
  2. Quantitative experimental validation
    • Rigorous quantitative experiments on two challenging tasks
    • Meaningfully improves both the knowledge acquisition and the reasoning ability of LLMs
  3. Demonstrated practical applicability
    • Showcases applicability to complex tasks such as software development

Technical Strengths

  1. Adaptability
    • Automatically composes the agent team to match the characteristics of the task
    • Generates roles dynamically, with no predefinition required
  2. Reliability
    • Quality assurance through the observer mechanism
    • Emphasizes the reliability of generated agents and plans compared with SPP and AgentVerse
  3. Refinement
    • Self-refinement strengthens individual agent capability
    • Collaborative refinement maximizes the benefit of team collaboration
  4. Scalability
    • Supports generating an unrestricted number of agents
    • Applicable across diverse domains

Differentiators (vs. Existing Methods)

| Feature | MetaGPT | SPP | AgentVerse | AutoAgents |
|---|---|---|---|---|
| Dynamic agent generation | βœ— | βœ“ | βœ“ | βœ“ |
| Self-refinement | βœ— | βœ— | βœ— | βœ“ |
| Collaborative refinement | βœ— | βœ— | βœ— | βœ“ |
| Observer mechanism | βœ— | βœ— | βœ— | βœ“ |

⚠️ Limitations and Future Work

Limitations Noted in the Paper

  1. Dependence on LLM quality
    • Performance depends heavily on the quality and capability of the underlying LLM
    • The performance and reliability of the generated multi-agent system hinge on the LLM
  2. Training data bias
    • Biases in the training data and model carry over into agent performance
  3. Early-stage research
    • Needs validation and scalability testing in real-world environments
    • Needs evaluation on more real-world scenarios

Challenges in Practical Implementation

  1. Token limits
    • Token limits in LLMs such as GPT-4 make long conversations hard to manage
    • Context management grows more complex as the history accumulates
  2. Inter-agent communication design
    • The execution loop must be designed carefully to keep inter-agent communication smooth
  3. Cost
    • Using a model like GPT-4 across multiple agents increases cost
    • Prompt optimization is needed to minimize token usage
  4. Limits of the linear structure
    • MetaGPT, AutoAgents, SPP, and similar systems are linear multi-agent pipelines
    • They lack the ability to loop back the way a finite state machine can

Future Research Directions

  1. Improving system robustness
    • Better stability across diverse environments and tasks
    • Stronger error handling and recovery mechanisms
  2. Expanding application domains
    • Applications beyond software development
    • Domain-specific agent generation strategies
  3. Integrating advanced AI techniques
    • Reinforcement-learning-based agent improvement
    • More sophisticated collaboration mechanisms
  4. Ethics and security considerations
    • Reviewing the ethical implications of autonomously generated systems
    • Analyzing and mitigating security vulnerabilities

πŸ”— Related Papers

Prior Work

| Paper | Year | Relationship |
|---|---|---|
| MetaGPT: Meta Programming for Multi-Agent Collaborative Framework | 2023 | Multi-agent system with predefined roles |
| AgentVerse: Facilitating Multi-Agent Collaboration | 2023 | Dynamic agent generation plus discussion-based planning |
| Solo Performance Prompting (SPP) | 2023 | Generation guided by example agent personas |
| CAMEL: Communicative Agents for Mind Exploration | 2023 | Role-playing-based agent communication |
| AutoGen: Enabling Next-Gen LLM Applications | 2023 | Multi-agent conversation framework |

Related Concepts

| Concept | Description |
|---|---|
| LLM-based Agents | Autonomous agent systems built on top of LLMs |
| Multi-Agent Systems | Systems in which multiple agents collaborate |
| Centralized Planning, Decentralized Execution (CPDE) | Pattern of planning centrally while executing in a distributed manner |
| Self-Refinement | Mechanism by which an agent improves its own output |

Follow-up Work

  • AutoAgent (2025): fully automated, zero-code LLM agent framework
  • AutoGenesisAgent: self-generating multi-agent system
  • MegaAgent: large-scale autonomous multi-agent system without predefined SOPs

πŸ’» Practical Application Notes

Implementation Considerations

1. Basic Project Structure (MetaGPT-based)

autoagents/
β”œβ”€β”€ agents/
β”‚   β”œβ”€β”€ planner.py          # Planner agent
β”‚   β”œβ”€β”€ observers/
β”‚   β”‚   β”œβ”€β”€ agent_observer.py
β”‚   β”‚   β”œβ”€β”€ plan_observer.py
β”‚   β”‚   └── action_observer.py
β”‚   └── generated/           # dynamically generated agents
β”œβ”€β”€ actions/
β”‚   β”œβ”€β”€ self_refinement.py
β”‚   └── collaborative_refinement.py
β”œβ”€β”€ roles/
β”‚   └── role_bank.py         # role template repository
β”œβ”€β”€ memory/
β”‚   └── shared_memory.py     # shared memory between agents
└── main.py
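
For orientation, main.py could wire these pieces together roughly as follows. This is a minimal sketch assuming the module and class names from the tree above (Planner, AgentObserver, PlanObserver, CollaborativeRefinement); it is not the official AutoAgents entry point:

# main.py -- hypothetical wiring of the components above (not the official code)
import asyncio

from agents.planner import Planner
from agents.observers.agent_observer import AgentObserver
from agents.observers.plan_observer import PlanObserver
from actions.collaborative_refinement import CollaborativeRefinement

async def run(task: str):
    planner = Planner()
    agent_observer = AgentObserver()
    plan_observer = PlanObserver()

    # Drafting stage: generate the agent team and plan, then let the observers review them.
    draft = await planner._act()  # assumed to return {"agents": [...], "plan": ...}
    agent_observer.validate_agents(draft["agents"], task)
    plan_observer.validate_plan(draft["plan"], draft["agents"], task)

    # Execution stage: run the generated team with collaborative refinement.
    refinement = CollaborativeRefinement(draft["agents"])
    return await refinement.execute(task)

if __name__ == "__main__":
    print(asyncio.run(run("Build a simple Tetris game")))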

2. Core Implementation Example

# agents/planner.py
from metagpt.roles import Role
from metagpt.actions import Action

class AgentGenerationAction(Action):
    """νƒœμŠ€ν¬ 뢄석 및 μ—μ΄μ „νŠΈ 생성 μ•‘μ…˜"""

    PROMPT_TEMPLATE = """
    Analyze the following task and generate a list of required expert agents:

    Task: {task}

    For each agent, provide:
    1. Role name
    2. Role description
    3. Specific responsibilities
    4. Required skills/knowledge

    Output format: JSON
    """

    async def run(self, task: str):
        prompt = self.PROMPT_TEMPLATE.format(task=task)
        response = await self._aask(prompt)
        return self.parse_agents(response)

class Planner(Role):
    """쀑앙 κ³„νšμž μ—­ν• """

    def __init__(self):
        super().__init__()
        self._init_actions([AgentGenerationAction])

    async def _act(self):
        task = self.get_current_task()
        agents = await AgentGenerationAction().run(task)
        plan = await self.create_execution_plan(agents)
        return {"agents": agents, "plan": plan}

# actions/collaborative_refinement.py
class CollaborativeRefinement:
    """닀쀑 μ—μ΄μ „νŠΈ ν˜‘λ ₯적 κ°œμ„ """

    def __init__(self, agents, max_rounds=5):
        self.agents = agents
        self.max_rounds = max_rounds
        self.chat_history = []

    async def execute(self, task):
        current_result = None

        for round_num in range(self.max_rounds):
            for agent in self.agents:
                # μ»¨ν…μŠ€νŠΈ ꡬ성
                context = self.build_context(
                    task,
                    self.chat_history,
                    current_result
                )

                # μ—μ΄μ „νŠΈ 응닡 생성
                response = await agent.respond(context)

                # update the chat history
                self.chat_history.append({
                    "agent": agent.name,
                    "response": response
                })

                current_result = response.result

            # ν•©μ˜ 체크
            if self.check_consensus():
                break

        return current_result

    def check_consensus(self):
        """μ—μ΄μ „νŠΈλ“€μ΄ ν•©μ˜μ— λ„λ‹¬ν–ˆλŠ”μ§€ 확인"""
        # implementation logic goes here
        pass

3. Dynamic Agent Generation

# roles/dynamic_agent_factory.py
from metagpt.roles import Role

class DynamicAgentFactory:
    """Factory that creates agent classes dynamically"""

    AGENT_TEMPLATE = """
    You are a {role_name}.

    Description: {description}
    Responsibilities: {responsibilities}

    Your goal is to {goal}

    Guidelines:
    - Focus on your specialized area
    - Collaborate with other team members
    - Provide detailed and actionable outputs
    """

    def create_agent(self, agent_spec):
        """μŠ€νŽ™μ— 따라 μ—μ΄μ „νŠΈ 동적 생성"""

        class GeneratedAgent(Role):
            def __init__(self, spec):
                super().__init__()
                self.name = spec["role_name"]
                self.description = spec["description"]
                self.system_prompt = DynamicAgentFactory.AGENT_TEMPLATE.format(
                    **spec
                )

        return GeneratedAgent(agent_spec)
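
A usage sketch with a made-up agent spec; the keys simply mirror the placeholders in AGENT_TEMPLATE above:

# Hypothetical usage of the factory; the spec below is illustrative only.
factory = DynamicAgentFactory()
reviewer = factory.create_agent({
    "role_name": "Code Review Expert",
    "description": "Reviews generated code for correctness and style",
    "responsibilities": "find bugs, suggest improvements, enforce conventions",
    "goal": "ensure the delivered code is correct and maintainable",
})
print(reviewer.name)  # -> "Code Review Expert"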

Practical Tips

  1. Prompt optimization
    • Design concise prompts to minimize token usage
    • Keep role definitions down to the essentials
  2. Memory management
    • Managing the context window is essential in long conversations
    • Consider introducing a summarization mechanism (see the sketch after this list)
  3. Cost optimization
    • Use cheaper models for tasks that are not complex
    • Apply a caching strategy
  4. Error handling
    • Provide a fallback mechanism for when agent generation fails
    • Define an error recovery strategy for the execution stage
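
For the memory management point above, one common approach is to fold older turns of the shared chat history into a running summary so the context stays within the window. A minimal sketch, assuming a generic llm.generate(prompt) call and a rough character budget (both are assumptions, not part of AutoAgents):

# Hypothetical history compaction; `llm` is any object exposing generate(prompt) -> str,
# and max_chars is a crude stand-in for a real token budget.
def compact_history(llm, chat_history, keep_last=4, max_chars=4000):
    """Summarize older turns and keep only the most recent ones verbatim."""
    def render(msgs):
        return "\n".join(f'{m["agent"]}: {m["response"]}' for m in msgs)

    if len(render(chat_history)) <= max_chars:
        return chat_history  # still fits, nothing to do

    older, recent = chat_history[:-keep_last], chat_history[-keep_last:]
    summary = llm.generate(
        "Summarize the following multi-agent discussion in a few sentences, "
        "keeping decisions and open questions:\n" + render(older)
    )
    return [{"agent": "summary", "response": summary}] + recent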

Applicable Domains

| Domain | Example Application |
|---|---|
| Software development | Code generation, review, and test automation |
| Content creation | Writing, editing, and fact-checking teams |
| Research assistance | Literature survey, analysis, and summarization |
| Customer service | Multi-tier support systems |
| Education | Personalized tutoring systems |

🏷️ Tags

#AIAgent #MultiAgent #LLM #AutoGeneration #IJCAI2024 #DynamicAgentGeneration #MetaGPT #AgentVerse #CollaborativeAI #SelfRefinement #TaskPlanning #PekingUniversity #HKUST


πŸ“š References

  • IJCAI 2024 Proceedings
  • arXiv Paper
  • GitHub Repository
  • Semantic Scholar