πŸ“ Artikel ini ditulis dalam Bahasa Indonesia & English
πŸ“ This article is available in English & Bahasa Indonesia

πŸ† Belajar Hugging Face β€” Page 10 (FINAL!)Learn Hugging Face β€” Page 10 (FINAL!)

Capstone Project:
End-to-End LLM


Grand finale! Combine EVERYTHING learned from Pages 1-9 in one production project: load model 4-bit (Page 8) β†’ QLoRA SFT instruction tuning (Pages 3+8) β†’ DPO alignment (Page 9) β†’ RAG knowledge base (Page 6) β†’ Gradio chatbot app (Page 7) β†’ deploy to HF Spaces. Plus: complete 10-page journey visualized, advanced roadmap (agents, multi-modal, production MLOps), career paths in AI/ML, relevant certifications, and closing message.

πŸ“… March 2026 ⏱ 35 min read
🏷 Capstone Β· End-to-End Β· Full Pipeline Β· RAG Chatbot Β· Production Β· Roadmap Β· Career
πŸ“š Learn Hugging Face Series:

πŸ“‘ Table of Contents β€” Page 10 (Final!)

  1. Our Journey β€” 10 Pages at a glance
  2. Capstone: Production RAG Chatbot β€” Full pipeline code
  3. Step 1: QLoRA SFT β€” Instruction tuning (Pages 3+8)
  4. Step 2: DPO Alignment β€” Human preferences (Page 9)
  5. Step 3: RAG Knowledge Base β€” Embeddings + FAISS (Page 6)
  6. Step 4: Gradio Chatbot App β€” Interactive UI (Page 7)
  7. Step 5: Deploy to HF Spaces β€” Free public URL
  8. Roadmap: What's Next? β€” Agents, multi-modal, MLOps
  9. Career Paths in AI/ML
  10. Closing β€” Congratulations! πŸŽ‰πŸ†
πŸ—ΊοΈ

1. Our Journey β€” 10 Pages at a Glance

πŸ† Your Hugging Face Journey β€” All 10 Pages β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ FOUNDATION (Pages 1-4) β”‚ β”‚ Page 1 β–Έ Pipeline, Hub, AutoModel, Tokenizer, Cara Pakai HF β”‚ β”‚ Page 2 β–Έ Fine-Tune BERT, Trainer, Datasets, VRAM/OOM β”‚ β”‚ Page 3 β–Έ Fine-Tune GPT, Text Generation, Sampling, Chatbot β”‚ β”‚ Page 4 β–Έ NER, BIO Tagging, Subword Alignment, seqeval β”‚ β”‚ β”‚ β”‚ ADVANCED (Pages 5-7) β”‚ β”‚ Page 5 β–Έ QA (SQuAD), T5/BART Seq2Seq, Translation, BLEU/ROUGE β”‚ β”‚ Page 6 β–Έ Embeddings, Cosine Sim, FAISS, Semantic Search, RAG β”‚ β”‚ Page 7 β–Έ Gradio, Spaces, ChatInterface, Demo Apps, Deploy β”‚ β”‚ β”‚ β”‚ LLM MASTERY (Pages 8-10) β”‚ β”‚ Page 8 β–Έ LoRA, QLoRA, 4-bit, PEFT, Fine-Tune 7B on Colab β”‚ β”‚ Page 9 β–Έ RLHF, DPO, Alignment, TRL, Safety β”‚ β”‚ Page 10 β–Έ Capstone: RAG Chatbot + Roadmap (ANDA DI SINI!) β”‚ β”‚ β”‚ β”‚ Skills Acquired: β”‚ β”‚ βœ… Pipeline API untuk 20+ NLP tasks (inference in 1 line) β”‚ β”‚ βœ… Fine-tune BERT untuk classification & NER (93%+ acc) β”‚ β”‚ βœ… Fine-tune GPT untuk text generation & chatbot β”‚ β”‚ βœ… T5/BART untuk translation & summarization β”‚ β”‚ βœ… Sentence embeddings, FAISS, semantic search, RAG β”‚ β”‚ βœ… Gradio apps deployed to HF Spaces (free!) β”‚ β”‚ βœ… QLoRA: fine-tune 7B LLM di Colab gratis (previously $256/hr)β”‚ β”‚ βœ… DPO alignment: ChatGPT-style training β”‚ β”‚ βœ… 70+ code files, 10 complete projects β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ πŸ† GRAND FINALE β€” Capstone Project!
πŸ”₯

2-6. Capstone: Production RAG Chatbot β€” Full Pipeline

Combine Pages 3+6+7+8+9 in one production-grade script
70_capstone_rag_chatbot.py β€” End-to-End RAG Chatbot πŸ† (Python)
#!/usr/bin/env python3
"""
πŸ† CAPSTONE: End-to-End RAG Chatbot
Combines ALL techniques from Pages 1-9:
  Page 1: Pipeline, Hub, AutoModel
  Page 3: Text generation, prompt format
  Page 6: Sentence embeddings, FAISS, RAG
  Page 7: Gradio ChatInterface, Spaces
  Page 8: QLoRA, 4-bit quantization, PEFT
  Page 9: DPO alignment concepts

This is a DEPLOYABLE app for HF Spaces!
"""

import gradio as gr
import faiss
import numpy as np
import torch
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# ═══════════════════════════════════════════════════
# COMPONENT 1: RETRIEVER (Page 6 β€” Embeddings + FAISS)
# ═══════════════════════════════════════════════════
print("πŸ“š Loading retriever...")
retriever = SentenceTransformer("all-MiniLM-L6-v2")

# Knowledge base β€” replace with YOUR documents!
KNOWLEDGE_BASE = [
    "Hugging Face was founded in 2016 by ClΓ©ment Delangue, Julien Chaumond, and Thomas Wolf.",
    "The Transformers library supports over 200 model architectures including BERT, GPT, T5, and LLaMA.",
    "Fine-tuning BERT on IMDB sentiment analysis typically achieves 93%+ accuracy.",
    "LoRA (Low-Rank Adaptation) allows fine-tuning large models by training only 0.1-1% of parameters.",
    "QLoRA combines 4-bit quantization with LoRA, enabling 7B model fine-tuning on a single T4 GPU.",
    "DPO (Direct Preference Optimization) is a simpler alternative to RLHF for model alignment.",
    "Gradio allows creating ML demo web apps with just Python, deployable to HF Spaces for free.",
    "FAISS is Facebook's library for efficient similarity search across millions of vectors in milliseconds.",
    "The Trainer API handles training loops, evaluation, logging, checkpointing, and multi-GPU automatically.",
    "Sentence Transformers encode text into dense vectors for semantic similarity and search tasks.",
    "Named Entity Recognition (NER) identifies people, locations, organizations in text using BIO tagging.",
    "T5 treats all NLP tasks as text-to-text: summarize, translate, classify β€” all with the same model.",
    "BERT uses bidirectional attention for understanding. GPT uses causal attention for generation.",
    "The Model Hub hosts over 500,000 pre-trained models for NLP, vision, audio, and multimodal tasks.",
    "Jakarta is the capital of Indonesia with a population of approximately 10.56 million people.",
    "Indonesia has over 17,000 islands and declared independence on August 17, 1945.",
    "Python is the most popular programming language for machine learning and data science.",
    "TensorFlow and PyTorch are the two most popular deep learning frameworks.",
]

# Build FAISS index
kb_embeddings = retriever.encode(KNOWLEDGE_BASE, convert_to_numpy=True)
faiss.normalize_L2(kb_embeddings)
index = faiss.IndexFlatIP(kb_embeddings.shape[1])
index.add(kb_embeddings)
print(f"  Indexed {index.ntotal} documents")

# ═══════════════════════════════════════════════════
# COMPONENT 2: GENERATOR (Page 3 β€” Text Generation)
# ═══════════════════════════════════════════════════
print("πŸ€– Loading generator...")
device = 0 if torch.cuda.is_available() else -1  # GPU if available, else CPU (free Spaces are CPU-only)
generator = pipeline("text2text-generation", model="google/flan-t5-base", device=device)
# FLAN-T5 base: 250M params, fits CPU/GPU easily
# For better quality: use fine-tuned model from Page 8!

# ═══════════════════════════════════════════════════
# COMPONENT 3: RAG PIPELINE (Page 6 β€” Retrieve + Generate)
# ═══════════════════════════════════════════════════
def retrieve(query, top_k=3):
    """Retrieve top-k relevant documents."""
    q_emb = retriever.encode([query], convert_to_numpy=True)
    faiss.normalize_L2(q_emb)
    scores, indices = index.search(q_emb, top_k)
    docs = [(KNOWLEDGE_BASE[i], float(s)) for i, s in zip(indices[0], scores[0]) if i >= 0]
    return docs

def generate_answer(question, context):
    """Generate answer using retrieved context."""
    prompt = f"""Answer the question based on the context below. If the answer is not in the context, say "I don't have information about that."

Context: {context}

Question: {question}

Answer:"""
    result = generator(prompt, max_new_tokens=200)  # max_new_tokens preferred over deprecated max_length
    return result[0]["generated_text"]

# ═══════════════════════════════════════════════════
# COMPONENT 4: GRADIO CHATBOT (Page 7 β€” ChatInterface)
# ═══════════════════════════════════════════════════
def chat(message, history):
    """RAG chatbot: retrieve β†’ generate β†’ respond with sources."""
    # Retrieve
    docs = retrieve(message, top_k=3)
    context = " ".join([doc for doc, score in docs])

    # Generate
    answer = generate_answer(message, context)

    # Format with sources
    sources = "\n\nπŸ“š **Sources:**\n" + "\n".join(
        [f"- _{doc[:80]}..._ (relevance: {score:.0%})" for doc, score in docs])

    return answer + sources

# ═══════════════════════════════════════════════════
# COMPONENT 5: LAUNCH APP (Page 7 β€” Deploy to Spaces!)
# ═══════════════════════════════════════════════════
demo = gr.ChatInterface(
    fn=chat,
    title="πŸ† RAG Chatbot β€” Hugging Face Knowledge Assistant",
    description="""Ask me anything about Hugging Face, Transformers, fine-tuning, NLP, or Indonesia!
Powered by: Sentence Transformers (retrieval) + FLAN-T5 (generation) + FAISS (vector search).
Built with techniques from the entire Learn Hugging Face series (Pages 1-9).""",
    examples=[
        "What is LoRA and how does it work?",
        "How accurate is BERT on IMDB sentiment analysis?",
        "What is the capital of Indonesia?",
        "What is the difference between BERT and GPT?",
        "How do I deploy a model to Hugging Face Spaces?",
        "What is DPO?",
    ],
    # (retry_btn / undo_btn / clear_btn kwargs existed in Gradio 3-4 but were removed in Gradio 5)
    theme=gr.themes.Soft(),
)

print("πŸš€ Launching RAG Chatbot...")
demo.launch()
# β†’ Deploy to HF Spaces: upload app.py + requirements.txt
# β†’ Free public URL: https://username-rag-chatbot.hf.space
# β†’ ANYONE can chat with your knowledge-grounded AI! πŸŽ‰
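The comments in the script point to a quality upgrade: swap FLAN-T5 for your own QLoRA-fine-tuned, DPO-aligned model (Pages 8-9). A minimal sketch of that upgrade path follows, assuming `trl`, `peft`, `bitsandbytes`, and `datasets` are installed; the dataset repo names and output directories are placeholders you must supply, and exact keyword names vary between `trl` versions, so treat this as a sketch rather than a drop-in script:

```python
# Sketch of the upgrade path: QLoRA SFT (Page 8), then DPO (Page 9).
# Dataset names and output dirs below are PLACEHOLDERS, not real resources.

def format_instruction(example):
    """Alpaca-style prompt formatting for SFT (pure Python, framework-agnostic)."""
    return (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['output']}")

def train_aligned_model():
    import torch
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig
    from trl import SFTTrainer, SFTConfig, DPOTrainer, DPOConfig

    base = "mistralai/Mistral-7B-v0.1"  # any causal LM you can fit on your GPU
    tok = AutoTokenizer.from_pretrained(base)

    # Page 8: load the base model in 4-bit (NF4) and define LoRA adapters
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                             bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(
        base, quantization_config=bnb, device_map="auto")
    lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM",
                      target_modules=["q_proj", "v_proj"])

    # Step 1: supervised fine-tuning on instruction data
    sft = SFTTrainer(
        model=model,
        processing_class=tok,  # older trl versions call this tokenizer=
        peft_config=lora,
        formatting_func=format_instruction,
        train_dataset=load_dataset("your-org/sft-data", split="train"),  # placeholder
        args=SFTConfig(output_dir="sft-out", max_steps=500),
    )
    sft.train()

    # Step 2: DPO on (prompt, chosen, rejected) preference pairs
    dpo = DPOTrainer(
        model=sft.model,
        processing_class=tok,
        train_dataset=load_dataset("your-org/pref-data", split="train"),  # placeholder
        args=DPOConfig(output_dir="dpo-out", beta=0.1, max_steps=300),
    )
    dpo.train()
    dpo.save_model("rag-chatbot-aligned")
```

If you plug the aligned model into the app, remember that FLAN-T5 is seq2seq while Mistral-style models are causal: change the pipeline task from "text2text-generation" to "text-generation".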

πŸ† Ini Adalah Proyek Production Anda!
Script di atas menggabungkan teknik dari 6 pages berbeda:
β€’ Page 1: Pipeline API, model loading
β€’ Page 3: Text generation, prompt formatting
β€’ Page 6: Sentence embeddings, FAISS vector search, RAG
β€’ Page 7: Gradio ChatInterface, HF Spaces deployment
β€’ Page 8: Bisa upgrade ke QLoRA fine-tuned model
β€’ Page 9: Bisa upgrade ke DPO-aligned model
Upload ke HF Spaces β†’ chatbot AI Anda live di internet dalam 5 menit, gratis!

πŸ† This Is Your Production Project!
The script above combines techniques from 6 different pages:
β€’ Page 1: Pipeline API, model loading
β€’ Page 3: Text generation, prompt formatting
β€’ Page 6: Sentence embeddings, FAISS vector search, RAG
β€’ Page 7: Gradio ChatInterface, HF Spaces deployment
β€’ Page 8: Can upgrade to QLoRA fine-tuned model
β€’ Page 9: Can upgrade to DPO-aligned model
Upload to HF Spaces β†’ your AI chatbot live on the internet in 5 minutes, free!
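For the upload itself, a Space needs the script saved as app.py plus a requirements.txt listing its dependencies. A minimal one for the script above might look like this (unpinned for brevity; pin versions for reproducible builds, and note that free Spaces run on CPU-only hardware by default):

```text
# requirements.txt for the Space
gradio
faiss-cpu
sentence-transformers
transformers
torch
numpy
```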

πŸ—ΊοΈ

7. Roadmap: What's Next?

Level | Topic | What It Is | Tools
🟒 | AI Agents | LLMs that can use tools (search, code, API calls) | LangChain, CrewAI, AutoGen, Smolagents
🟒 | Advanced RAG | Chunking strategies, re-ranking, hybrid search, evaluation | LlamaIndex, LangChain, Ragas
🟒 | Multi-modal | Vision-language models (LLaVA, GPT-4V, Gemini) | HF Transformers, OpenAI API
🟑 | Structured Output | LLMs generating valid JSON/SQL/code | Outlines, Instructor, LMQL
🟑 | Model Serving | Production inference engines | vLLM, TGI, NVIDIA Triton
🟑 | Evaluation | Benchmarking LLMs: MT-Bench, AlpacaEval, MMLU | lm-evaluation-harness, HELM
πŸ”΄ | Pre-training | Training an LLM from scratch (hundreds of GPUs, millions of dollars) | Megatron-LM, DeepSpeed
πŸ”΄ | MLOps | CI/CD for ML, monitoring, retraining pipelines | MLflow, Weights & Biases, Kubeflow
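To make the first roadmap row concrete: agent frameworks all implement some variant of a tool-use loop, in which the model emits a tool call, the runtime executes it, and the result is fed back for a final answer. Here is a dependency-free sketch with a stubbed "model"; everything in it is illustrative and does not correspond to any framework's real API:

```python
# Minimal tool-use "agent" loop with a stubbed model β€” illustrative only.

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy; never eval untrusted input
    "search": lambda q: f"(pretend search results for: {q})",
}

def fake_llm(prompt):
    """Stub standing in for a real LLM: decides whether to call a tool."""
    if "TOOL RESULT" in prompt:             # second turn: answer from the tool output
        return "FINAL: " + prompt.split("TOOL RESULT:")[-1].strip()
    if any(ch.isdigit() for ch in prompt):  # crude routing: math goes to the calculator
        return "CALL calculator: 2+2"
    return "CALL search: " + prompt

def run_agent(question, max_turns=3):
    prompt = question
    for _ in range(max_turns):
        reply = fake_llm(prompt)
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        tool, arg = reply[len("CALL "):].split(": ", 1)  # parse "CALL <tool>: <arg>"
        prompt = f"{question}\nTOOL RESULT: {TOOLS[tool](arg)}"
    return "gave up"

print(run_agent("What is 2+2?"))  # β†’ 4
```

Real frameworks replace `fake_llm` with an actual model, add structured tool schemas, and handle errors and multi-step plans, but the loop structure is the same.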
πŸ’Ό

8. Career Paths in AI/ML

Role | Focus | Skills from This Series | Additional Skills
NLP Engineer | Text processing systems | P1-6: BERT, GPT, NER, QA, T5, embeddings | LangChain, production RAG, evaluation
LLM Engineer | Fine-tune & deploy LLMs | P8-9: QLoRA, DPO, SFT + P7: Gradio deploy | vLLM, TGI, agents, prompt engineering
ML Engineer | Build & deploy ML systems | P1-10: everything! End-to-end pipeline | MLOps, Kubernetes, CI/CD, monitoring
AI Researcher | Novel methods & papers | P8-9: LoRA math, DPO theory, alignment | Paper reading, JAX, deep math/stats
Full-Stack AI | Complete AI applications | P7: Gradio + P6: RAG + P3: generation | React/Next.js, databases, API design
πŸŽ‰

9. Closing β€” Congratulations! πŸŽ‰πŸ†

πŸŽ‰πŸ† SELAMAT! Anda telah menyelesaikan SELURUH seri Belajar Hugging Face β€” 10 Pages!

Dari pipeline pertama di Page 1 hingga DPO alignment di Page 9 dan capstone project di Page 10, Anda sekarang menguasai ekosistem Hugging Face secara menyeluruh:

βœ… Inference: Pipeline API untuk 20+ tasks dalam 1 baris kode
βœ… Fine-Tune BERT: Classification (93%+), NER (92%+ F1), QA
βœ… Fine-Tune GPT: Text generation, instruction tuning, chatbot
βœ… Seq2Seq: T5/BART untuk translation dan summarization
βœ… Embeddings: Sentence similarity, FAISS vector search, RAG
βœ… Deploy: Gradio apps, HF Spaces (free public URL!)
βœ… QLoRA: Fine-tune 7B LLM di Colab gratis (yang sebelumnya $256/jam)
βœ… DPO Alignment: ChatGPT-style training technique
βœ… 70+ code files, 10 complete projects, 1 production RAG chatbot

Anda juga sudah menyelesaikan seri Neural Network (10 pages) dan seri TensorFlow (10 pages) sebelumnya β€” total 30 pages, ~2 MB konten tutorial!

Ini bukan akhir β€” ini baru awal perjalanan AI Anda. Gunakan roadmap di atas, terus eksperimen, dan bangun sesuatu yang luar biasa! πŸš€

"The best time to start learning AI was yesterday. The second best time is now."

πŸŽ‰πŸ† CONGRATULATIONS! You've completed the ENTIRE Learn Hugging Face series β€” all 10 Pages!

From your first pipeline in Page 1 to DPO alignment in Page 9 and this capstone in Page 10, you now have comprehensive mastery of the Hugging Face ecosystem:

βœ… Inference: Pipeline API for 20+ tasks in 1 line of code
βœ… Fine-Tune BERT: Classification (93%+), NER (92%+ F1), QA
βœ… Fine-Tune GPT: Text generation, instruction tuning, chatbot
βœ… Seq2Seq: T5/BART for translation and summarization
βœ… Embeddings: Sentence similarity, FAISS vector search, RAG
βœ… Deploy: Gradio apps, HF Spaces (free public URL!)
βœ… QLoRA: Fine-tune 7B LLM on free Colab (previously $256/hr)
βœ… DPO Alignment: ChatGPT-style training technique
βœ… 70+ code files, 10 complete projects, 1 production RAG chatbot

You also completed the Neural Network series (10 pages) and TensorFlow series (10 pages) before β€” total 30 pages, ~2 MB of tutorial content!

This is not the end β€” it's just the beginning of your AI journey. Use the roadmap above, keep experimenting, and build something extraordinary! πŸš€

"The best time to start learning AI was yesterday. The second best time is now."

← Previous Page

Page 9 β€” RLHF, DPO & Alignment