[AI Paper] 📄 Gorilla: LLM Connected with Massive APIs

By skycave
January 25, 2026, 5 Min Read


📋 Metadata

Item | Details
Title | Gorilla: Large Language Model Connected with Massive APIs
Authors | Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
Affiliation | UC Berkeley, Microsoft Research
Venue | NeurIPS 2024 (Advances in Neural Information Processing Systems)
Published | 2024 (arXiv: May 2023)
arXiv | 2305.15334
Official page | gorilla.cs.berkeley.edu
GitHub | ShishirPatil/gorilla
License | Apache 2.0

🎯 One-Line Summary

This work introduces Retriever-Aware Training (RAT) so that an LLM can accurately call more than 1,600 APIs. It reports API-call accuracy more than 20% higher than GPT-4 and substantially reduces hallucination.


🔍 Background and Motivation

Problem Statement

  • Limits of LLM tool use: even state-of-the-art LLMs such as GPT-4 fail to generate accurate input arguments for API calls and tend to hallucinate incorrect API usage
  • Rapidly changing API documentation: API documentation is updated far faster than LLMs are retrained (e.g., AWS APIs alone saw 31 modifications in a single day)
  • Limitations of existing approaches:
    • Toolformer: focuses on only a small set of tools
    • ReAct: calls the LLM at every step, which incurs high inference cost

Research Goals

  1. Train an LLM to select and call the correct API from a large set of APIs
  2. Build a system that can adapt to API documentation changes in real time
  3. Quantitatively measure and mitigate hallucination in API calls

💡 Key Ideas

1. Learning from API Documentation (Self-Instruct Fine-tuning)

  • Building the APIBench dataset: about 1,640 API documents collected in total from HuggingFace, TorchHub, and TensorHub
  • Quality filtering: of the 203,681 models on HuggingFace, only the top well-documented models were selected
    • 7 multimodal domains, 8 CV, 12 NLP, 5 audio, 2 tabular data, and 2 reinforcement learning domains
    • 925 HuggingFace models selected in the end
  • Instruction-response pair generation: training data that maps user instructions (e.g., “Build me a medical image classifier”) to API calls
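
To make the pair format concrete, here is a minimal sketch of how one API documentation entry could be paired with a user instruction to form a training record. The field names (`domain`, `api_name`, `api_call`, `description`) and the `make_training_record` helper are illustrative assumptions, not the schema used by the authors.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ApiDoc:
    """One API documentation entry (field names are hypothetical)."""
    domain: str
    api_name: str
    api_call: str
    description: str


def make_training_record(instruction: str, doc: ApiDoc) -> dict:
    """Pair a natural-language instruction with the API call that answers it."""
    return {
        "instruction": instruction,
        "response": {"domain": doc.domain, "api_call": doc.api_call},
        "source_doc": asdict(doc),
    }


doc = ApiDoc(
    domain="Image Classification",
    api_name="resnet50",
    api_call="torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    description="ResNet-50 image classifier from TorchHub.",
)
record = make_training_record("Build me a medical image classifier", doc)
print(json.dumps(record, indent=2))
```

Keeping the source document alongside each pair is a design choice of this sketch: it makes it easy to attach the same document later as retrieval context.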

2. Retriever-Aware Training (RAT)

  • Core concept: retrieved API documentation is included as additional context during training
  • How it works:
    1. The retriever supplies relevant API documentation
    2. The LLM judges whether the documentation is relevant
    3. If relevant → generate a response grounded in the documentation
    4. If not relevant → respond from internalized domain knowledge
  • Advantages:
    • Adapts to documentation changes at test time
    • Is not confused by irrelevant retrieval results
    • Handles version changes and user updates flexibly
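
The steps above can be sketched as a small prompt builder. This is a minimal illustration under stated assumptions: the prefix string mirrors the phrasing described in the paper, but its exact wording, the `build_prompt` and `build_rat_example` helpers, and the `prompt`/`completion` record layout are all assumptions made for this example.

```python
import json
from typing import Optional

# Assumed wording of the instruction that introduces the retrieved document.
REFERENCE_PREFIX = "Use this API documentation for reference:"


def build_prompt(instruction: str, retrieved_doc: Optional[dict] = None) -> str:
    """Return the bare instruction, or the instruction plus a retrieved doc."""
    if retrieved_doc is None:
        return instruction
    return f"{instruction}\n{REFERENCE_PREFIX} {json.dumps(retrieved_doc)}"


def build_rat_example(instruction: str, target_api_call: str,
                      retrieved_doc: Optional[dict]) -> dict:
    """Build one training example.

    The target is always the ground-truth API call, even when the retrieved
    document is unrelated, so the model has to learn when to rely on the
    document and when to ignore it.
    """
    return {
        "prompt": build_prompt(instruction, retrieved_doc),
        "completion": target_api_call,
    }


example = build_rat_example(
    "Build me a medical image classifier",
    "torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    {"api_name": "resnet50", "framework": "PyTorch"},
)
print(example["prompt"])
```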

🏗️ Architecture / Methodology

Overall Pipeline

┌──────────────────────────────────────────────────────────────────┐
│  User input                                                      │
│  "Build a model for medical image classification"                │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  Retriever                                                       │
│  BM25 or GPT-Index based retrieval                               │
│  Retrieves relevant documents from the API database              │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  Gorilla LLM (LLaMA-based)                                       │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Input: user prompt + retrieved API docs (RAT mode)        │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                │                                 │
│                                ▼                                 │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Relevance check → doc-based or internal-knowledge answer  │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  Output (API call)                                               │
│  torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)   │
│  + step-by-step explanation and required package info            │
└──────────────────────────────────────────────────────────────────┘

Inference Modes

Mode | Description | Characteristics
Zero-shot | The prompt is fed directly to Gorilla | Uses only internalized knowledge, with no retriever
Retrieval | The retriever fetches the latest API documentation, which is fed in with the prompt | Can respond to documentation changes in real time
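
The difference between the two modes can be sketched with a toy retriever. The code below is a from-scratch Okapi BM25 scorer over three made-up documentation strings. It is not Gorilla's retriever: the `BM25` class, the `make_input` helper, the sample documents, and the reference prefix wording are all assumptions for illustration.

```python
import math
from collections import Counter
from typing import List, Optional


def tokenize(text: str) -> List[str]:
    return text.lower().split()


class BM25:
    """Tiny Okapi BM25 index over whitespace-tokenized documents."""

    def __init__(self, docs: List[str], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.tokens = [tokenize(d) for d in docs]
        self.avg_len = sum(len(t) for t in self.tokens) / len(self.tokens)
        n = len(docs)
        doc_freq = Counter(term for t in self.tokens for term in set(t))
        self.idf = {
            term: math.log(1 + (n - f + 0.5) / (f + 0.5))
            for term, f in doc_freq.items()
        }

    def score(self, query: str, index: int) -> float:
        tf = Counter(self.tokens[index])
        length_norm = 1 - self.b + self.b * len(self.tokens[index]) / self.avg_len
        total = 0.0
        for term in tokenize(query):
            if term in tf:
                total += self.idf[term] * tf[term] * (self.k1 + 1) / (
                    tf[term] + self.k1 * length_norm
                )
        return total

    def top1(self, query: str) -> int:
        """Index of the highest-scoring document for the query."""
        return max(range(len(self.tokens)), key=lambda i: self.score(query, i))


def make_input(prompt: str, mode: str, retriever: Optional[BM25] = None,
               docs: Optional[List[str]] = None) -> str:
    """Zero-shot passes the prompt through; retrieval appends the best doc."""
    if mode == "zero-shot":
        return prompt
    best = docs[retriever.top1(prompt)]
    return f"{prompt}\nUse this API documentation for reference: {best}"


docs = [
    "resnet50 image classification model torch hub",
    "bert text classification transformers pipeline",
    "wav2vec2 speech recognition audio model",
]
retriever = BM25(docs)
query = "classify an image with a pretrained model"
print(make_input(query, "retrieval", retriever, docs))
```

Because the documents are looked up at inference time, swapping an entry in `docs` changes what the model sees without any retraining, which is the property the Retrieval row describes.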

AST-Based Evaluation Framework

# Evaluation process
1. Parse the generated code into an AST (Abstract Syntax Tree)
2. Extract the subtree rooted at the API call
3. Match the subtree against the APIs in the database
4. Measure accuracy and hallucination from the match result

Definition of hallucination:
– An API call that matches none of the APIs in the database
– The model invokes an entirely imagined tool
– Distinguished from calling the wrong API (an error)


📊 Experiments and Results

Key Performance Metrics

Model | Accuracy (Acc) | Hallucination rate (Hall) | Notes
Gorilla (Zero-shot) | +20.43% vs. GPT-4 | Lowest | SOTA
Gorilla (Zero-shot) | +10.75% vs. ChatGPT | – | –
Gorilla (Zero-shot) | +83% vs. LLaMA | – | –
GPT-4 (3-shot) | On par with Gorilla on TorchHub | – | –

RAT Effect Analysis

Setting | TorchHub improvement | HuggingFace improvement
RAT vs. Non-RAT | +12.37% | +23.46%

Dataset Composition (APIBench)

Platform | Number of APIs | Domain
HuggingFace | 925 | Multimodal, CV, NLP, audio, etc.
TorchHub | 94 | PyTorch models
TensorHub | 626 | TensorFlow models
Total | 1,645 | ML/DL APIs

Key Findings

  1. Effect of fine-tuning: the lightly fine-tuned Gorilla achieves state-of-the-art performance against all compared models
  2. Limits of 3-shot ICL: in-context examples improve syntactic correctness for GPT-family models but are not a fundamental fix
  3. Importance of retriever integration: integrating the retriever through RAT is key to both performance and hallucination reduction

💪 Strengths and Contributions

Academic Contributions

  1. Retriever-Aware Training (RAT): presents a new paradigm that integrates the retriever into the training pipeline
  2. AST-based hallucination measurement: proposes a methodology for quantitatively measuring hallucination in LLM generations, presented as the first of its kind
  3. APIBench: builds a standard benchmark for evaluating large-scale API calling

Technical Strengths

  1. Scalability: supports more than 1,600 APIs (far more than Toolformer)
  2. Adaptability: can respond to API documentation changes in real time
  3. Accuracy: more than 20% higher accuracy than GPT-4
  4. Hallucination reduction: markedly lower hallucination rate than existing LLMs
  5. Open source: Apache 2.0 license permits commercial use

Practical Contributions

  • Berkeley Function Calling Leaderboard: a standard leaderboard for evaluating the function-calling ability of LLMs
  • GoEX (Gorilla Execution Engine): a runtime for safely executing LLM-generated actions

⚠️ Limitations

Constraints of the Current System

Limitation | Description | Impact
Single-domain constraint | Difficulty handling prompts that span multiple domains | Degraded performance on compound tasks
Output format | Output is produced only as Python code | Inconvenient for users with hardware constraints
API coverage | The number of supported APIs is limited | Weak support for adding or training on custom APIs
Dependence on documentation quality | Requires high-quality API documentation | Poorly documented APIs are excluded
No execution verification | The actual execution result of an API call is not verified | Runtime errors cannot be detected in advance

Fundamental Limitations

  • Pace of API evolution: APIs change faster than models are updated
  • Domain specialization: the focus on ML/DL APIs leaves general-purpose API support lacking

🔗 Related Papers

Prior Work

Paper | Core idea | Relationship to Gorilla
Toolformer (2023) | Teaching LLMs to use tools | Focuses on a few tools; Gorilla scales to a large API set
ReAct (2022) | Thought-Act-Observe cycle | High inference cost; Gorilla resolves the request in a single call
Self-Instruct (2022) | Self-generated instruction data | Used to generate Gorilla's training data

Follow-up Work and Extensions

Work | Description
BFCL (Berkeley Function Calling Leaderboard) | Function-calling benchmark built on Gorilla
GoEX | Execution engine for LLM-generated actions
RAFT | Domain-specific RAG fine-tuning technique
Gorilla OpenFunctions | Extension adding RESTful API support

Comparative Work

  • AutoTool: a lightweight approach to efficient tool selection
  • LangChain/MetaGPT: agent frameworks based on the ReAct paradigm

💻 Practical Application Points

Applicable Scenarios

1. Cloud Infrastructure Automation

Supported APIs: Kubernetes, AWS, GCP, Azure
Use: translate natural-language commands into cloud CLI commands
Example: "Find A100 GPU instances in US East"

2. Building ML/AI Pipelines

Supported APIs: HuggingFace, PyTorch Hub, TensorFlow Hub
Use: generate model-loading and inference code from natural language
Example: "Load a pretrained model for medical image classification"

3. Web Service Integration

Supported APIs: Slack, PayPal, Stripe (OpenFunctions v2)
Use: automate business workflows
Example: "Schedule a meeting and send the attendees a Slack notification"

Adoption Checklist

  • [ ] Check API documentation quality: review how well the APIs you plan to use are documented
  • [ ] Choose a retriever: run a comparison test of BM25 against an embedding-based retriever
  • [ ] Enable RAT mode: essential in environments where APIs change dynamically
  • [ ] Evaluate the GoEX runtime: adopt it if you need execution verification and rollback
  • [ ] Monitor hallucination: continuously verify the quality of generated output with AST matching

Caveats

  1. Prefer single-domain prompts: split multi-domain requests and handle them separately
  2. Verify before execution: review generated code before running it
  3. Manage API versions: keep the retriever's API documentation up to date
  4. Consider cost: estimate LLM inference cost before large-scale deployment

🏷️ Tags

#LLM #API #ToolUse #RAT #Retriever #FunctionCalling #NeurIPS2024 #Berkeley #Hallucination #APIBench #FineTuning #SelfInstruct #AgentAI #MLOps #CloudAutomation
