Seri Belajar LLM Part 3

Prompt Engineering: The Art of Talking to LLMs

The right prompt can make an LLM 10x more useful, without changing the model at all. Part 3 teaches the full toolkit: zero-shot, few-shot, Chain-of-Thought, system prompts, structured output, ReAct agents, and Tree-of-Thought. Plus: the anatomy of a perfect prompt (6 components) and common mistakes to avoid.

March 2026 • 30 min read • Prompt Engineering • CoT • Few-shot • System Prompt • ReAct

Table of Contents — Part 3

  1. A Prompt Is a Program for the LLM
  2. Zero-shot vs Few-shot
  3. Chain-of-Thought (CoT) — "Think step by step"
  4. System Prompts — Set Role and Personality
  5. Anatomy of a Perfect Prompt — 6 Components
  6. Structured Output — Force JSON/XML
  7. Advanced: ReAct, ToT, Self-Consistency
  8. Common Mistakes & Anti-patterns
  9. Summary & Part 4 Preview
🎯 1. A Prompt Is a Program for the LLM

An LLM has no knobs: the prompt is the only way to control its output.

Unlike traditional software, which is controlled via a UI or API parameters, an LLM is controlled entirely through text (the prompt). A good prompt supplies context, format, examples, and constraints. A skilled prompt engineer can get results 10x better than a casual user, with exactly the same model. This is not trial and error: it is a skill that can be learned systematically.

Prompt Engineering = Programming in Natural Language

A programmer writes Python code to control a computer. A prompt engineer writes natural-language instructions to control an LLM. The difference: natural language is ambiguous. "Write a summary" can mean one sentence or one page. "Formal" can mean academic or business. A good prompt engineer eliminates this ambiguity with specific, structured instructions.
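To make the ambiguity concrete, compare a vague instruction with a fully specified one. Both prompt strings below are illustrative examples, not taken from any particular API:

```python
# Ambiguous: length, format, and audience are all left to the model.
vague = "Summarize this article."

# Specific: every decision the model would otherwise guess is pinned down.
specific = (
    "Summarize this article in exactly 3 bullet points, "
    "max 20 words each, focusing on the key findings. "
    "Use plain language a high-school student can follow."
)
```

The second prompt is longer, but every extra clause removes one degree of freedom the model would otherwise fill in unpredictably.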

📋 2. Zero-shot vs Few-shot

Ask directly vs give examples first
05_zero_vs_few_shot.py
# ===== ZERO-SHOT: ask directly, no examples =====
prompt_zero = """Classify this review as POSITIVE or NEGATIVE.

Review: "The food was incredible, best pasta I've ever had!"
Classification:"""
# Output: "POSITIVE"
# Accuracy: ~85% (depends on model and task complexity)

# ===== FEW-SHOT: give 2-5 examples first =====
prompt_few = """Classify reviews as POSITIVE or NEGATIVE.

Review: "Loved it! Amazing experience."
Classification: POSITIVE

Review: "Terrible service, cold food."
Classification: NEGATIVE

Review: "Decent, nothing special but okay."
Classification: POSITIVE

Review: "The food was incredible, best pasta I've ever had!"
Classification:"""
# Output: "POSITIVE"
# Accuracy: ~92% (much better!)
# Few-shot teaches the LLM the expected "format" and "boundary"

# ===== PRO TIP: include an edge-case example =====
# The "Decent, nothing special" = POSITIVE example teaches the model
# that neutral reviews lean positive. Without it, the model might
# classify them as NEGATIVE.
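The few-shot prompt above can also be assembled programmatically from example pairs, which keeps the format consistent as you add or swap examples. A minimal sketch (`build_few_shot_prompt` is a hypothetical helper, not a library function):

```python
def build_few_shot_prompt(examples, query,
                          task="Classify reviews as POSITIVE or NEGATIVE."):
    """Assemble a few-shot prompt from (review, label) example pairs."""
    parts = [task, ""]
    for review, label in examples:
        parts.append(f'Review: "{review}"')
        parts.append(f"Classification: {label}")
        parts.append("")
    # The query repeats the same format with the label left blank,
    # so the model completes it.
    parts.append(f'Review: "{query}"')
    parts.append("Classification:")
    return "\n".join(parts)

examples = [
    ("Loved it! Amazing experience.", "POSITIVE"),
    ("Terrible service, cold food.", "NEGATIVE"),
    ("Decent, nothing special but okay.", "POSITIVE"),  # edge case
]
prompt = build_few_shot_prompt(examples, "The food was incredible!")
```

Because every example goes through the same template, the model never sees an inconsistent "Review:/Classification:" pattern, which is one of the main ways few-shot prompts silently degrade.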
🧠 3. Chain-of-Thought — "Think Step by Step"

Add four magic words and accuracy on math tasks jumps from ~40% to 90%+

Chain-of-Thought (CoT) prompting was a 2022 breakthrough from Google Research: by asking the model to "think step by step", accuracy on math and reasoning tasks rises dramatically. Why? Because the model produces intermediate reasoning steps that help keep its logic coherent, instead of "guessing" the final answer directly.

06_chain_of_thought.py
# WITHOUT CoT: the model jumps straight to an answer (often wrong)
prompt_bad = "If 3 shirts cost $45 and each shirt has 20% tax, what's the total for 7 shirts?"
# Output: "$105" (WRONG -- the model ignored the tax)

# WITH CoT: the model reasons step by step
prompt_cot = """If 3 shirts cost $45 and each shirt has 20% tax,
what's the total for 7 shirts?

Let's think step by step:
1. Price per shirt = $45 / 3 = $15
2. Tax per shirt = $15 x 20% = $3
3. Total per shirt with tax = $15 + $3 = $18
4. Total for 7 shirts = 7 x $18 = $126

Now solve this problem step by step:
If 5 books cost $60 and each has a 10% discount,
what's the total for 12 books?"""
# Output:
# 1. Price per book = $60 / 5 = $12
# 2. Discount per book = $12 x 10% = $1.20
# 3. Price with discount = $12 - $1.20 = $10.80
# 4. Total for 12 books = 12 x $10.80 = $129.60 (CORRECT!)
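The zero-shot variant of CoT skips the worked example entirely and simply appends the trigger phrase. A minimal sketch (`with_cot` is an illustrative helper, not an API), plus a plain-arithmetic check of the chain of reasoning above:

```python
# Zero-shot CoT: no worked example, just the trigger phrase.
COT_TRIGGER = "Let's think step by step."

def with_cot(question: str) -> str:
    """Append the zero-shot CoT trigger to a question."""
    return f"{question}\n\n{COT_TRIGGER}"

q = "If 5 books cost $60 and each has a 10% discount, what's the total for 12 books?"
prompt = with_cot(q)

# Checking the expected chain of reasoning with plain arithmetic:
per_book = 60 / 5              # $12.00
discounted = per_book * 0.9    # $10.80 after the 10% discount
total = 12 * discounted        # $129.60
```

The zero-shot trigger is cheaper (no example tokens) but usually weaker than a full worked example, so it is a good first thing to try before investing in hand-written chains.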
💻 4. System Prompts — Set Role and Personality

The system prompt is the conversation's "DNA": it defines who the LLM is and what it may and may not do.
07_system_prompt.py
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": """You are a senior Python developer with 15 years of experience.
Follow these rules:
1. Always write type hints and docstrings
2. Use clean code principles (single responsibility, DRY)
3. Explain your reasoning BEFORE writing code
4. Include error handling and edge cases
5. Suggest tests for the code
6. If the request is ambiguous, ask clarifying questions
7. Format: Markdown with code blocks"""},
        {"role": "user", "content": "Write a function to merge two sorted lists"},
    ],
)
# Output: detailed explanation + clean Python code with
# type hints, docstring, error handling, and test suggestions
🏗 5. Anatomy of a Perfect Prompt — 6 Components

Role + Context + Task + Format + Examples + Constraints
| Component | What It Is | Example | Impact |
|---|---|---|---|
| Role | Who the LLM should be | "You are a senior data scientist at Google" | Sets expertise level and tone |
| Context | Relevant background info | "Given this dataset of 10K sales records..." | Grounding, reduces hallucination |
| Task | What to do (be SPECIFIC) | "Analyze Q3 trends and identify top 3 risks" | Core instruction |
| Format | Desired output shape | "Return as JSON: {summary, risks[], recommendations[]}" | Parseable, consistent output |
| Examples | Input/output examples (few-shot) | "Example: Input: ... Output: ..." | +7-15% accuracy improvement |
| Constraints | Limits and rules | "Max 200 words. No jargon. Cite sources." | Prevents over/under-generation |
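The six components can be assembled mechanically. A minimal sketch (`build_prompt` and its section labels are a convention of this article's table, not any library's API):

```python
def build_prompt(role, context, task, fmt, examples, constraints):
    """Assemble a prompt from the six components in a fixed order."""
    sections = [
        role,
        f"Context: {context}",
        f"Task: {task}",
        f"Output format: {fmt}",
        f"Examples:\n{examples}",
        f"Constraints: {constraints}",
    ]
    # Blank lines between sections keep the structure visually obvious
    # to both humans and the model.
    return "\n\n".join(sections)

prompt = build_prompt(
    role="You are a senior data scientist at Google.",
    context="Given this dataset of 10K sales records...",
    task="Analyze Q3 trends and identify the top 3 risks.",
    fmt='JSON: {"summary": str, "risks": [str], "recommendations": [str]}',
    examples="Input: ... Output: ...",
    constraints="Max 200 words. No jargon. Cite sources.",
)
```

Treating the prompt as a function of six named arguments also makes A/B testing easy: you can vary one component at a time and measure the effect.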
📦 6. Structured Output — Force JSON/XML

Make sure the LLM outputs data your code can parse
08_structured_output.py
# OpenAI structured output (response_format)
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[{
        "role": "user",
        "content": """Extract entities from this text as JSON:
"Elon Musk announced that Tesla will open a factory in Jakarta by 2027."
Return format: {"entities": [{"name": str, "type": str, "role": str}]}"""
    }],
)
# {"entities": [
#   {"name": "Elon Musk", "type": "PERSON", "role": "announcer"},
#   {"name": "Tesla", "type": "ORG", "role": "company"},
#   {"name": "Jakarta", "type": "LOCATION", "role": "factory location"}
# ]}
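Even when `response_format` forces valid JSON, it pays to validate the shape before your code uses it. A minimal sketch of the parsing side, assuming the reply shape shown above (`parse_entities` is a hypothetical helper; in practice a validation failure would trigger a retry):

```python
import json

def parse_entities(raw: str):
    """Parse the model's JSON reply and validate the expected shape.

    Raises ValueError if the reply is not valid JSON or is missing
    the "entities" list.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"Model did not return valid JSON: {e}")
    if not isinstance(data.get("entities"), list):
        raise ValueError('Reply is missing the "entities" list')
    return data["entities"]

# Example with a reply like the one shown above:
raw = '{"entities": [{"name": "Elon Musk", "type": "PERSON", "role": "announcer"}]}'
entities = parse_entities(raw)
```

The `json_object` mode guarantees syntactically valid JSON but not your schema, so the shape check (and a retry loop around it) is still your code's responsibility.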
🔬 7. Advanced Techniques

ReAct, Tree-of-Thought, Self-Consistency, Meta-prompting
| Technique | How It Works | When to Use | Improvement |
|---|---|---|---|
| Chain-of-Thought | "Think step by step" before answering | Math, logic, multi-step reasoning | +20-50% accuracy |
| Few-shot | Give 3-5 input/output examples | Classification, extraction, formatting | +7-15% accuracy |
| Self-Consistency | Generate N answers, majority vote | Complex reasoning (high stakes) | +10-20% accuracy |
| Tree-of-Thought | Explore multiple reasoning paths, prune bad ones | Creative problem solving, planning | +15-30% on complex tasks |
| ReAct | Reason + Act: think, then use tools | Agents that need search/calc/code | Enables tool use |
| Meta-prompting | Ask the LLM to write its own prompt | Prompt optimization | Variable (sometimes great) |
| Persona Pattern | Give the LLM a detailed character | Consistent tone/style | Better consistency |
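Self-consistency from the table is easy to sketch: sample N answers at temperature > 0 and keep the majority. In the snippet below the sampled answers are hard-coded for illustration; in practice they would come from N separate model calls:

```python
from collections import Counter

def self_consistency(answers):
    """Majority vote over N sampled answers (self-consistency).

    Returns the most common answer and its vote share.
    """
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)

# Five sampled chains of thought ended with these final answers:
samples = ["$129.60", "$129.60", "$126.00", "$129.60", "$129.60"]
best, confidence = self_consistency(samples)
# best == "$129.60", confidence == 0.8
```

The vote share doubles as a rough confidence signal: a 3-2 split suggests the problem is genuinely hard, while 5-0 agreement is usually safe to trust.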

8. Common Mistakes & Anti-patterns

Things that often go wrong when prompting
| Mistake | Bad Example | Good Example |
|---|---|---|
| Too vague | "Write a summary" | "Summarize in 3 bullet points, max 20 words each, focusing on the key findings" |
| No format specified | "Analyze this data" | "Analyze and return JSON: {trend, top_3_insights, recommendation}" |
| Prompt too long | 2,000 words of confusing instructions | Structured instructions with numbering and examples |
| Negative instructions | "Don't use technical jargon" | "Use language a high-school student can understand" (positive framing) |
| No examples | A long description with no examples | 1-3 concrete input/output examples |
📝 9. Part 3 Summary

The prompt-engineering toolkit
| Concept | Key Takeaway |
|---|---|
| Zero-shot | Ask directly. Simple, but accuracy is limited. |
| Few-shot | Give 3-5 examples. +7-15% accuracy. Best bang for the buck. |
| Chain-of-Thought | "Think step by step". +20-50% on reasoning tasks. |
| System Prompt | The conversation's DNA: role, rules, constraints. |
| 6 Components | Role + Context + Task + Format + Examples + Constraints. |
| Structured Output | Force JSON/XML for parseable output. |
| ReAct | Reason + Act: the foundation of agentic AI. |
Tech Review Desk — Seri Belajar LLM
Sources: Wei et al., "Chain-of-Thought" (2022); Anthropic Prompt Engineering docs; OpenAI Cookbook; Yao et al., "ReAct" (2023).
rominur@gmail.com  •  t.me/Jekardah_AI