Table of Contents – Page 10 (Final!)
- Our Journey – 10 Pages at a Glance
- Capstone: Production RAG Chatbot – Full pipeline code
- Step 1: QLoRA SFT – Instruction tuning (Pages 3+8)
- Step 2: DPO Alignment – Human preferences (Page 9)
- Step 3: RAG Knowledge Base – Embeddings + FAISS (Page 6)
- Step 4: Gradio Chatbot App – Interactive UI (Page 7)
- Step 5: Deploy to HF Spaces – Free public URL
- Roadmap: What's Next? – Agents, multi-modal, MLOps
- Career Paths in AI/ML
- Closing – Congratulations!
1. Our Journey – 10 Pages at a Glance
2-6. Capstone: Production RAG Chatbot – Full Pipeline
```python
#!/usr/bin/env python3
"""
CAPSTONE: End-to-End RAG Chatbot

Combines ALL techniques from Pages 1-9:
  Page 1: Pipeline, Hub, AutoModel
  Page 3: Text generation, prompt format
  Page 6: Sentence embeddings, FAISS, RAG
  Page 7: Gradio ChatInterface, Spaces
  Page 8: QLoRA, 4-bit quantization, PEFT
  Page 9: DPO alignment concepts

This is a DEPLOYABLE app for HF Spaces!
"""
import gradio as gr
import faiss
import numpy as np
import torch
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# ----------------------------------------------------
# COMPONENT 1: RETRIEVER (Page 6 - Embeddings + FAISS)
# ----------------------------------------------------
print("Loading retriever...")
retriever = SentenceTransformer("all-MiniLM-L6-v2")

# Knowledge base - replace with YOUR documents!
KNOWLEDGE_BASE = [
    "Hugging Face was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf.",
    "The Transformers library supports over 200 model architectures including BERT, GPT, T5, and LLaMA.",
    "Fine-tuning BERT on IMDB sentiment analysis typically achieves 93%+ accuracy.",
    "LoRA (Low-Rank Adaptation) allows fine-tuning large models by training only 0.1-1% of parameters.",
    "QLoRA combines 4-bit quantization with LoRA, enabling 7B model fine-tuning on a single T4 GPU.",
    "DPO (Direct Preference Optimization) is a simpler alternative to RLHF for model alignment.",
    "Gradio allows creating ML demo web apps with just Python, deployable to HF Spaces for free.",
    "FAISS is Facebook's library for efficient similarity search across millions of vectors in milliseconds.",
    "The Trainer API handles training loops, evaluation, logging, checkpointing, and multi-GPU automatically.",
    "Sentence Transformers encode text into dense vectors for semantic similarity and search tasks.",
    "Named Entity Recognition (NER) identifies people, locations, organizations in text using BIO tagging.",
    "T5 treats all NLP tasks as text-to-text: summarize, translate, classify - all with the same model.",
    "BERT uses bidirectional attention for understanding. GPT uses causal attention for generation.",
    "The Model Hub hosts over 500,000 pre-trained models for NLP, vision, audio, and multimodal tasks.",
    "Jakarta is the capital of Indonesia with a population of approximately 10.56 million people.",
    "Indonesia has over 17,000 islands and declared independence on August 17, 1945.",
    "Python is the most popular programming language for machine learning and data science.",
    "TensorFlow and PyTorch are the two most popular deep learning frameworks.",
]

# Build the FAISS index: normalize, then index by inner product
kb_embeddings = retriever.encode(KNOWLEDGE_BASE, convert_to_numpy=True)
faiss.normalize_L2(kb_embeddings)
index = faiss.IndexFlatIP(kb_embeddings.shape[1])
index.add(kb_embeddings)
print(f"Indexed {index.ntotal} documents")

# ----------------------------------------------------
# COMPONENT 2: GENERATOR (Page 3 - Text Generation)
# ----------------------------------------------------
print("Loading generator...")
# FLAN-T5 base: 250M params, fits CPU/GPU easily.
# For better quality: use your fine-tuned model from Page 8!
generator = pipeline(
    "text2text-generation",
    model="google/flan-t5-base",
    device=0 if torch.cuda.is_available() else -1,  # fall back to CPU
)

# ----------------------------------------------------
# COMPONENT 3: RAG PIPELINE (Page 6 - Retrieve + Generate)
# ----------------------------------------------------
def retrieve(query, top_k=3):
    """Retrieve the top-k most relevant documents."""
    q_emb = retriever.encode([query], convert_to_numpy=True)
    faiss.normalize_L2(q_emb)
    scores, indices = index.search(q_emb, top_k)
    return [(KNOWLEDGE_BASE[i], float(s))
            for i, s in zip(indices[0], scores[0]) if i >= 0]

def generate_answer(question, context):
    """Generate an answer grounded in the retrieved context."""
    prompt = f"""Answer the question based on the context below. If the answer is not in the context, say "I don't have information about that."

Context: {context}

Question: {question}

Answer:"""
    result = generator(prompt, max_length=200)
    return result[0]["generated_text"]

# ----------------------------------------------------
# COMPONENT 4: GRADIO CHATBOT (Page 7 - ChatInterface)
# ----------------------------------------------------
def chat(message, history):
    """RAG chatbot: retrieve -> generate -> respond with sources."""
    docs = retrieve(message, top_k=3)
    context = " ".join(doc for doc, score in docs)
    answer = generate_answer(message, context)
    sources = "\n\n**Sources:**\n" + "\n".join(
        f"- _{doc[:80]}..._ (relevance: {score:.0%})" for doc, score in docs)
    return answer + sources

# ----------------------------------------------------
# COMPONENT 5: LAUNCH APP (Page 7 - Deploy to Spaces!)
# ----------------------------------------------------
demo = gr.ChatInterface(
    fn=chat,
    title="RAG Chatbot - Hugging Face Knowledge Assistant",
    description="""Ask me anything about Hugging Face, Transformers, fine-tuning, NLP, or Indonesia!
Powered by: Sentence Transformers (retrieval) + FLAN-T5 (generation) + FAISS (vector search).
Built with techniques from the entire Learn Hugging Face series (Pages 1-9).""",
    examples=[
        "What is LoRA and how does it work?",
        "How accurate is BERT on IMDB sentiment analysis?",
        "What is the capital of Indonesia?",
        "What is the difference between BERT and GPT?",
        "How do I deploy a model to Hugging Face Spaces?",
        "What is DPO?",
    ],
    retry_btn="Retry",
    undo_btn="Undo",
    clear_btn="Clear",
    theme=gr.themes.Soft(),
)

print("Launching RAG Chatbot...")
demo.launch()
# → Deploy to HF Spaces: upload app.py + requirements.txt
# → Free public URL: https://username-rag-chatbot.hf.space
# → ANYONE can chat with your knowledge-grounded AI!
```
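One detail worth understanding in the retriever: calling `faiss.normalize_L2` and then searching with an inner-product index (`IndexFlatIP`) is a standard trick, because the inner product of two L2-normalized vectors equals the cosine similarity of the raw vectors. A minimal pure-Python sketch of that equivalence, using toy 3-dimensional vectors in place of the real 384-dimensional MiniLM embeddings:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b)))

a = [1.0, 2.0, 3.0]
b = [4.0, 0.5, -1.0]

# Inner product after normalization == cosine similarity of the raw vectors
print(math.isclose(inner_product(l2_normalize(a), l2_normalize(b)),
                   cosine_similarity(a, b)))  # True
```

This is why the relevance scores shown to the user fall in [-1, 1] and can be formatted as percentages.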
This Is Your Production Project!
The script above combines techniques from 6 different pages:
• Page 1: Pipeline API, model loading
• Page 3: Text generation, prompt formatting
• Page 6: Sentence embeddings, FAISS vector search, RAG
• Page 7: Gradio ChatInterface, HF Spaces deployment
• Page 8: Can be upgraded to a QLoRA fine-tuned model
• Page 9: Can be upgraded to a DPO-aligned model
Upload to HF Spaces → your AI chatbot is live on the internet in 5 minutes, for free!
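For the upload step, the `requirements.txt` just needs to list the libraries the script imports (note the FAISS pip package is named `faiss-cpu`). Exact version pins are up to you; a minimal, unpinned sketch:

```text
gradio
transformers
sentence-transformers
faiss-cpu
torch
numpy
```

With `app.py` and this file in the Space repository, Spaces installs the dependencies and launches the app automatically.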
7. Roadmap: What's Next?
| Level | Topic | What It Is | Tools |
|---|---|---|---|
| 🟢 | AI Agents | LLMs that can use tools (search, code, API calls) | LangChain, CrewAI, AutoGen, Smolagents |
| 🟢 | Advanced RAG | Chunking strategies, re-ranking, hybrid search, evaluation | LlamaIndex, LangChain, Ragas |
| 🟢 | Multi-modal | Vision-language models (LLaVA, GPT-4V, Gemini) | HF Transformers, OpenAI API |
| 🟡 | Structured Output | Getting LLMs to generate valid JSON/SQL/code | Outlines, Instructor, LMQL |
| 🟡 | Model Serving | Production inference: vLLM, TGI, Triton | vLLM, TGI, NVIDIA Triton |
| 🟡 | Evaluation | Benchmarking LLMs: MT-Bench, AlpacaEval, MMLU | lm-evaluation-harness, HELM |
| 🔴 | Pre-training | Training LLMs from scratch (hundreds of GPUs, millions of dollars) | Megatron-LM, DeepSpeed |
| 🔴 | MLOps | CI/CD for ML, monitoring, retraining pipelines | MLflow, Weights & Biases, Kubeflow |
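As a small taste of the "Structured Output" row in the table above: libraries like Outlines constrain generation itself, but the simplest baseline is a validate-and-retry loop using only the standard library. A hedged sketch, where the `generate` stub (and its canned outputs) are made up for illustration and would be replaced by a real model call:

```python
import json

def generate(prompt, attempt):
    """Stub LLM call: returns broken JSON first, valid JSON on retry.
    Replace with a real model call (e.g., the FLAN-T5 pipeline from the capstone)."""
    outputs = [
        '{"city": "Jakarta", "population":',           # truncated, invalid
        '{"city": "Jakarta", "population": 10560000}',  # valid
    ]
    return outputs[min(attempt, len(outputs) - 1)]

def generate_json(prompt, max_retries=3):
    """Validate-and-retry: re-ask the model until the output parses as JSON."""
    for attempt in range(max_retries):
        raw = generate(prompt, attempt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: retry (optionally feed the error back into the prompt)
    raise ValueError("No valid JSON after retries")

result = generate_json("Return the capital of Indonesia as JSON.")
print(result["city"])  # Jakarta
```

Dedicated tools improve on this by constraining the decoder so invalid output can never be produced, but the retry loop is often enough for prototypes.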
8. Career Paths in AI/ML
| Role | Focus | Skills from This Series | Additional |
|---|---|---|---|
| NLP Engineer | Text processing systems | P1-6: BERT, GPT, NER, QA, T5, embeddings | LangChain, RAG production, evaluation |
| LLM Engineer | Fine-tune & deploy LLMs | P8-9: QLoRA, DPO, SFT + P7: Gradio deploy | vLLM, TGI, agents, prompt engineering |
| ML Engineer | Build & deploy ML systems | P1-10: everything! End-to-end pipeline | MLOps, Kubernetes, CI/CD, monitoring |
| AI Researcher | Novel methods & papers | P8-9: LoRA math, DPO theory, alignment | Paper reading, JAX, deep math/stats |
| Full-Stack AI | Complete AI applications | P7: Gradio + P6: RAG + P3: generation | React/Next.js, databases, API design |
9. Closing – Congratulations!
CONGRATULATIONS! You've completed the ENTIRE Learn Hugging Face series – all 10 Pages!
From your first pipeline in Page 1 to DPO alignment in Page 9 and this capstone in Page 10, you now have comprehensive mastery of the Hugging Face ecosystem:
✅ Inference: Pipeline API for 20+ tasks in 1 line of code
✅ Fine-Tune BERT: Classification (93%+), NER (92%+ F1), QA
✅ Fine-Tune GPT: Text generation, instruction tuning, chatbot
✅ Seq2Seq: T5/BART for translation and summarization
✅ Embeddings: Sentence similarity, FAISS vector search, RAG
✅ Deploy: Gradio apps, HF Spaces (free public URL!)
✅ QLoRA: Fine-tune a 7B LLM on free Colab (previously $256/hr)
✅ DPO Alignment: ChatGPT-style training technique
✅ 70+ code files, 10 complete projects, 1 production RAG chatbot
You also completed the Neural Network series (10 pages) and the TensorFlow series (10 pages) before this – a total of 30 pages and ~2 MB of tutorial content!
This is not the end – it's just the beginning of your AI journey. Use the roadmap above, keep experimenting, and build something extraordinary!
"The best time to start learning AI was yesterday. The second best time is now."