πŸ“ Artikel ini ditulis dalam Bahasa Indonesia & English
πŸ“ This article is available in English & Bahasa Indonesia

πŸ€— Belajar Hugging Face β€” Page 1
πŸ€— Learn Hugging Face β€” Page 1

Pengenalan Hugging Face
Transformers & Pipeline

Introduction to Hugging Face
Transformers & Pipeline

Ekosistem open-source terbesar untuk NLP, Computer Vision, dan Generative AI. Page 1 membahas secara mendalam: apa itu Hugging Face dan kenapa ia merevolusi AI, instalasi library transformers/datasets/tokenizers/accelerate, Pipeline API untuk inference 1-baris (sentiment, NER, translation, summarization, text generation, image classification, zero-shot), arsitektur model Hub (500k+ models), memahami Auto Classes (AutoModel, AutoTokenizer, AutoConfig), tokenisasi mendalam (WordPiece, BPE, SentencePiece), dan first look fine-tuning BERT untuk text classification.

The largest open-source ecosystem for NLP, Computer Vision, and Generative AI. Page 1 covers in depth: what Hugging Face is and why it revolutionized AI, installing transformers/datasets/tokenizers/accelerate libraries, Pipeline API for 1-line inference (sentiment, NER, translation, summarization, text generation, image classification, zero-shot), the Model Hub architecture (500k+ models), understanding Auto Classes (AutoModel, AutoTokenizer, AutoConfig), deep dive into tokenization (WordPiece, BPE, SentencePiece), and first look at fine-tuning BERT for text classification.

πŸ“… Maret 2026 / March 2026 Β· ⏱ 40 menit baca / 40 min read
🏷 Hugging Face Β· Transformers Β· Pipeline Β· AutoModel Β· Tokenizer Β· BERT Β· GPT Β· Model Hub Β· Fine-Tuning
πŸ“š Seri Belajar Hugging Face / Learn Hugging Face Series

πŸ“‘ Daftar Isi β€” Page 1

πŸ“‘ Table of Contents β€” Page 1

  1. Apa Itu Hugging Face? β€” Ekosistem yang merevolusi AI
  2. Instalasi β€” transformers, datasets, tokenizers, accelerate
  3. Cara Pakai HF β€” Colab, lokal, Inference API, Spaces, self-hosting
  4. Pipeline API β€” Inference 1 baris untuk 20+ tugas
  5. Pipeline: NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization
  6. Pipeline: Beyond NLP β€” Image Classification, Object Detection, Zero-Shot
  7. Model Hub β€” 500k+ models, cara memilih yang tepat
  8. Auto Classes β€” AutoModel, AutoTokenizer, AutoConfig
  9. Tokenisasi Mendalam β€” WordPiece, BPE, SentencePiece, encoding
  10. Dari Tokenizer ke Model β€” Full forward pass manual
  11. First Look: Fine-Tuning BERT β€” Text classification preview
  12. Ringkasan & Preview Page 2
  1. What Is Hugging Face? β€” The ecosystem that revolutionized AI
  2. Installation β€” transformers, datasets, tokenizers, accelerate
  3. How to Use HF β€” Colab, local, Inference API, Spaces, self-hosting
  4. Pipeline API β€” 1-line inference for 20+ tasks
  5. Pipeline: NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization
  6. Pipeline: Beyond NLP β€” Image Classification, Object Detection, Zero-Shot
  7. Model Hub β€” 500k+ models, choosing the right one
  8. Auto Classes β€” AutoModel, AutoTokenizer, AutoConfig
  9. Deep Dive: Tokenization β€” WordPiece, BPE, SentencePiece, encoding
  10. From Tokenizer to Model β€” Full manual forward pass
  11. First Look: Fine-Tuning BERT β€” Text classification preview
  12. Summary & Page 2 Preview
πŸ€—

1. Apa Itu Hugging Face? β€” Revolusi AI Open-Source

1. What Is Hugging Face? β€” The Open-Source AI Revolution

Dari startup chatbot kecil menjadi "GitHub of AI" β€” platform terpenting di dunia ML
From a small chatbot startup to the "GitHub of AI" β€” the most important platform in ML

Hugging Face (πŸ€—) adalah perusahaan dan platform open-source yang menyediakan ekosistem lengkap untuk machine learning modern. Bayangkan GitHub, tapi khusus untuk model AI: Anda bisa menemukan, menggunakan, dan berbagi model dari BERT sampai LLaMA, dari Stable Diffusion sampai Whisper β€” semuanya gratis. Lebih dari 500,000 model dan 100,000 dataset tersedia di Hub mereka.

Hugging Face (πŸ€—) is a company and open-source platform providing a complete ecosystem for modern machine learning. Imagine GitHub, but specifically for AI models: you can find, use, and share models from BERT to LLaMA, from Stable Diffusion to Whisper β€” all for free. Over 500,000 models and 100,000 datasets are available on their Hub.

Kenapa Hugging Face begitu penting? Karena ia mendemokratisasi AI. Sebelum HF, menggunakan BERT membutuhkan ratusan baris kode boilerplate dan pengetahuan mendalam tentang arsitektur model. Sekarang: pipeline("sentiment-analysis")("I love this!") β€” selesai, satu baris.

Why is Hugging Face so important? Because it democratizes AI. Before HF, using BERT required hundreds of lines of boilerplate code and deep knowledge of model architecture. Now: pipeline("sentiment-analysis")("I love this!") β€” done, one line.
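That one line, in full runnable form (the first run downloads the default checkpoint, roughly 270 MB, then caches it):

```python
from transformers import pipeline

# The "hundreds of lines of boilerplate" replaced by a single call
classifier = pipeline("sentiment-analysis")
result = classifier("I love this!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.999...}]
```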

πŸ€— Hugging Face Ecosystem β€” Everything You Need for AI

πŸ“š Libraries:
- transformers β†’ BERT, GPT, T5, LLaMA Β· Pipeline, AutoModel Β· fine-tuning, training
- datasets β†’ load any dataset Β· GLUE, SQuAD, ImageNet Β· streaming, preprocessing
- tokenizers β†’ fast Rust-based Β· BPE, WordPiece, Unigram
- accelerate β†’ multi-GPU, TPU Β· mixed precision Β· DeepSpeed integration

🌐 Hub (huggingface.co):
- 500k+ Models β†’ NLP, CV, Audio, Multimodal Β· download in 1 line
- 100k+ Datasets β†’ community uploads
- Spaces (100k+ apps) β†’ Gradio, Streamlit Β· try models live!

πŸ›  Tools:
- Spaces (demo apps) Β· Inference API (free!) Β· Inference Endpoints (production)
- AutoTrain (no-code) Β· Evaluate (benchmarks) Β· Optimum (optimization)
- PEFT (LoRA, QLoRA) Β· TRL (RLHF training) Β· safetensors (safe format) Β· huggingface_hub (API)

Semua GRATIS dan open-source! MIT / Apache 2.0 license. Used by: Google, Meta, Microsoft, NVIDIA, Amazon, 50k+ companies

πŸ’‘ Analogi: Hugging Face = App Store untuk AI
Model Hub = App Store β†’ download model siap pakai dalam 1 baris kode
Datasets Hub = Data marketplace β†’ dataset berkualitas untuk training
Spaces = Demo gallery β†’ coba model langsung di browser
transformers library = SDK β†’ unified API untuk 200+ arsitektur model
Anda tidak perlu implementasi BERT dari nol β€” cukup from transformers import dan mulai bekerja.

πŸ’‘ Analogy: Hugging Face = App Store for AI
Model Hub = App Store β†’ download ready-to-use models in 1 line of code
Datasets Hub = Data marketplace β†’ quality datasets for training
Spaces = Demo gallery β†’ try models directly in browser
transformers library = SDK β†’ unified API for 200+ model architectures
You don't need to implement BERT from scratch β€” just from transformers import and start working.
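As a small illustration of the "SDK" point, loading a full pretrained model and its tokenizer takes one line each (bert-base-uncased is about 420 MB on first download, then cached):

```python
from transformers import AutoModel, AutoTokenizer

# Config + weights + vocab all come from the Hub, one line each
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

print(model.config.hidden_size)           # 768 β€” BERT-base hidden dimension
print(tokenizer.tokenize("Hugging Face")) # subword tokens for the input text
```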

πŸ“¦

2. Instalasi β€” 4 Library Inti Hugging Face

2. Installation β€” 4 Core Hugging Face Libraries

transformers + datasets + tokenizers + accelerate β€” fondasi lengkap
transformers + datasets + tokenizers + accelerate β€” complete foundation
Terminal β€” Install Hugging Face Stack (bash)
# ===========================
# Core libraries
# ===========================
pip install transformers        # models, pipelines, Auto classes
pip install datasets            # dataset loading & processing
pip install tokenizers          # fast Rust-based tokenizers
pip install accelerate          # multi-GPU, mixed precision

# Or install everything at once:
pip install transformers[torch] datasets accelerate

# ===========================
# Backend: PyTorch or TensorFlow
# ===========================
pip install torch               # PyTorch (RECOMMENDED β€” community default)
# pip install tensorflow        # TensorFlow (also supported)
# HF Transformers supports BOTH backends!
# This series uses PyTorch (90% of HF community uses PyTorch)

# ===========================
# Optional but useful
# ===========================
pip install evaluate            # evaluation metrics
pip install peft                # LoRA, QLoRA (efficient fine-tuning)
pip install trl                 # RLHF training (ChatGPT-style)
pip install bitsandbytes        # 4-bit/8-bit quantization
pip install sentencepiece       # for T5, LLaMA tokenizers

# ===========================
# Verify installation
# ===========================
python -c "import transformers; print(f'transformers {transformers.__version__}')"
python -c "import datasets; print(f'datasets {datasets.__version__}')"
python -c "import torch; print(f'PyTorch {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
# transformers 4.47.x
# datasets 3.2.x
# PyTorch 2.5.x, CUDA: True
Library | Fungsi | Size | Wajib?
transformers | Model, tokenizer, pipeline, training | ~30 MB | βœ… Ya
datasets | Load & proses dataset | ~5 MB | βœ… Ya (training)
tokenizers | Fast Rust tokenizer (auto-installed) | ~5 MB | Auto
accelerate | Multi-GPU, mixed precision | ~3 MB | βœ… Ya (training)
evaluate | Metrics (accuracy, F1, BLEU) | ~2 MB | Recommended
peft | LoRA, QLoRA efficient fine-tuning | ~3 MB | Optional
torch | PyTorch backend | ~2 GB | βœ… Ya (1 backend)

Library | Purpose | Size | Required?
transformers | Models, tokenizer, pipeline, training | ~30 MB | βœ… Yes
datasets | Load & process datasets | ~5 MB | βœ… Yes (training)
tokenizers | Fast Rust tokenizer (auto-installed) | ~5 MB | Auto
accelerate | Multi-GPU, mixed precision | ~3 MB | βœ… Yes (training)
evaluate | Metrics (accuracy, F1, BLEU) | ~2 MB | Recommended
peft | LoRA, QLoRA efficient fine-tuning | ~3 MB | Optional
torch | PyTorch backend | ~2 GB | βœ… Yes (1 backend)

πŸ’‘ Google Colab: Semua library HF sudah pre-installed di Colab! Cukup !pip install -q transformers datasets accelerate untuk update ke versi terbaru. GPU T4 gratis sudah cukup untuk fine-tuning BERT dan model medium lainnya.

πŸ’‘ Google Colab: All HF libraries come pre-installed on Colab! Just !pip install -q transformers datasets accelerate to update to the latest version. The free T4 GPU is sufficient for fine-tuning BERT and other medium models.

πŸ›€οΈ

2b. Bagaimana Cara Pakai Hugging Face? β€” 6 Cara dari Gratisan sampai Production

2b. How Do You Actually Use Hugging Face? β€” 6 Ways from Free to Production

Pertanyaan paling penting: apakah saya jalankan di komputer saya, di cloud, atau di server HF?
The most important question: do I run it on my computer, on the cloud, or on HF's servers?

Banyak yang bingung saat pertama kali mengenal Hugging Face: "Ini dijalankan di mana? Di website HF? Di komputer saya? Di cloud?" Jawabannya: semua bisa! Hugging Face bukan satu platform tunggal β€” ia adalah ekosistem yang bisa dipakai dengan berbagai cara. Berikut 6 cara menggunakan HF, dari yang paling mudah sampai production-grade:

Many people get confused when first encountering Hugging Face: "Where does this run? On HF's website? On my computer? On the cloud?" The answer: all of the above! Hugging Face isn't a single platform β€” it's an ecosystem that can be used in various ways. Here are 6 ways to use HF, from easiest to production-grade:

πŸ›€οΈ 6 Cara Menggunakan Hugging Face / 6 Ways to Use Hugging Face GRATIS / FREE: β‘  Google Colab (RECOMMENDED untuk belajar!) ⭐ BEST FOR BEGINNERS β†’ pip install transformers di Colab notebook β†’ GPU T4 gratis (cukup untuk fine-tuning BERT!) β†’ Tidak perlu install apapun di komputer lokal β†’ File: colab.research.google.com β‘‘ Komputer Lokal (laptop/desktop Anda) β†’ pip install transformers di terminal β†’ Model di-download ke ~/.cache/huggingface/ β†’ CPU: inference OK, training lambat β†’ GPU NVIDIA: training cepat (RTX 3060+ recommended) β‘’ HF Inference API (serverless, gratis rate-limited) β†’ Kirim HTTP request ke api-inference.huggingface.co β†’ Tidak perlu download model β€” HF yang jalankan β†’ Rate limit: ~30k tokens/hari (gratis) β†’ Bagus untuk: prototyping, demo kecil β‘£ HF Spaces (hosting demo apps gratis) β†’ Upload Gradio/Streamlit app ke huggingface.co/spaces β†’ Dapat URL publik gratis (username.hf.space) β†’ CPU gratis, GPU mulai $0.60/jam β†’ Bagus untuk: demo, portfolio, sharing BERBAYAR / PAID: β‘€ HF Inference Endpoints (production hosting di HF) β†’ Deploy model ke dedicated server di HF β†’ Auto-scaling, GPU, monitoring β†’ Mulai ~$0.06/jam (CPU) sampai $4.50/jam (A100) β†’ Bagus untuk: production API tanpa manage server β‘₯ Self-Hosting (server Anda sendiri / cloud) β†’ Download model β†’ deploy di AWS/GCP/Azure/VPS β†’ Full control: Docker, Kubernetes, custom infra β†’ Biaya: tergantung server ($5-$1000+/bulan) β†’ Bagus untuk: enterprise, data privacy, custom scaling

β‘  Google Colab β€” REKOMENDASI #1 untuk Belajar

β‘  Google Colab β€” #1 RECOMMENDATION for Learning

Google Colab adalah cara termudah dan tercepat untuk mulai menggunakan Hugging Face. Anda tidak perlu install apapun di komputer β€” cukup buka browser, tulis kode Python, dan jalankan di GPU gratis. Seluruh seri ini bisa diikuti 100% di Colab.

Google Colab is the easiest and fastest way to start using Hugging Face. You don't need to install anything on your computer β€” just open a browser, write Python code, and run it on a free GPU. This entire series can be followed 100% on Colab.

Google Colab β€” Setup dalam 30 Detik (python)
# ===========================
# 1. Buka colab.research.google.com
# 2. Runtime β†’ Change runtime type β†’ GPU (T4)
# 3. Jalankan cell berikut:
# ===========================

# Install/update HF libraries (sudah pre-installed, tapi update)
!pip install -q transformers datasets accelerate evaluate

# Verify GPU
import torch
print(f"GPU: {torch.cuda.get_device_name(0)}")
# GPU: Tesla T4

# Test pipeline
from transformers import pipeline
classifier = pipeline("sentiment-analysis", device=0)  # GPU!
print(classifier("Hugging Face is amazing!"))
# [{'label': 'POSITIVE', 'score': 0.9998}]

# βœ… Selesai! Siap fine-tuning BERT dengan GPU gratis!
# Colab T4 = 16GB VRAM β†’ cukup untuk BERT, DistilBERT, RoBERTa
# Tidak cukup untuk: fine-tuning LLaMA 7B+ atau training Stable Diffusion (butuh A100)

β‘‘ Komputer Lokal β€” Untuk Development Sehari-hari

β‘‘ Local Computer β€” For Daily Development

Terminal β€” Local Setup (bash)
# ===========================
# Setup di laptop/desktop Anda
# ===========================

# 1. Buat virtual environment (recommended!)
python -m venv hf-env
source hf-env/bin/activate  # Linux/Mac
# hf-env\Scripts\activate    # Windows

# 2. Install PyTorch (pilih sesuai GPU Anda)
# CPU only:
pip install torch torchvision torchaudio
# NVIDIA GPU (CUDA 12.x):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# 3. Install Hugging Face stack
pip install transformers datasets accelerate evaluate

# 4. Test
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('Hello!'))"

# ===========================
# Di mana model disimpan?
# ===========================
# Model di-download ke cache folder:
# Linux/Mac: ~/.cache/huggingface/hub/
# Windows:   C:\Users\<username>\.cache\huggingface\hub\
#
# BERT base: ~420 MB
# DistilBERT: ~250 MB
# GPT-2 small: ~550 MB
# LLaMA 3.2 1B: ~2.5 GB
# LLaMA 3.2 8B: ~16 GB
#
# Pertama kali download β†’ lambat
# Kedua kali β†’ instant (cached!)
#
# Hapus cache: rm -rf ~/.cache/huggingface/hub/
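The same cache folder can be inspected from Python with huggingface_hub's built-in scanner. A small sketch (the manual path check is an approximation of the default cache location; it prints nothing useful until you have downloaded at least one model):

```python
from pathlib import Path
from huggingface_hub import scan_cache_dir

# Default cache location on Linux/Mac (see the comments above)
cache_dir = Path.home() / ".cache" / "huggingface" / "hub"
if cache_dir.exists():
    cache = scan_cache_dir()
    print(f"Total cache: {cache.size_on_disk / 1e6:.0f} MB")
    for repo in cache.repos:
        print(f"  {repo.repo_id}: {repo.size_on_disk / 1e6:.0f} MB")
else:
    print("Cache folder not created yet β€” nothing downloaded")
```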

β‘’ HF Inference API β€” Pakai Model Tanpa Download

β‘’ HF Inference API β€” Use Models Without Downloading

Tidak mau download model besar ke komputer? Gunakan Inference API β€” kirim request HTTP ke server HF, mereka yang jalankan model. Gratis untuk prototyping (rate-limited).

Don't want to download large models to your computer? Use the Inference API β€” send HTTP requests to HF servers, they run the model. Free for prototyping (rate-limited).

hf_inference_api.py β€” Serverless Inference (python)
import requests

# ===========================
# Method 1: Direct HTTP request (no library needed!)
# ===========================
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer hf_YOUR_TOKEN_HERE"}
# Get free token: huggingface.co/settings/tokens

response = requests.post(API_URL, headers=headers,
    json={"inputs": "I love this product!"})
print(response.json())
# [[{'label': 'POSITIVE', 'score': 0.9998}]]

# ===========================
# Method 2: huggingface_hub library (easier)
# ===========================
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_YOUR_TOKEN")

# Text classification
result = client.text_classification("I love this!")
print(result)  # [TextClassificationOutput(label='POSITIVE', score=0.9998)]

# Text generation
result = client.text_generation(
    "The meaning of life is",
    model="gpt2",
    max_new_tokens=50
)
print(result)

# Translation
result = client.translation("I am learning AI",
    model="Helsinki-NLP/opus-mt-en-id")
print(result)  # "Saya sedang belajar AI"

# ===========================
# Kapan pakai Inference API?
# ===========================
# βœ… Prototyping cepat (tidak perlu GPU lokal)
# βœ… Demo kecil (< 1000 requests/hari)
# βœ… Test model baru sebelum download
# ❌ Training / fine-tuning (hanya inference!)
# ❌ Production (rate limited, cold starts)
# ❌ Data sensitif (data dikirim ke server HF)

β‘£ HF Spaces β€” Buat Demo App Gratis

β‘£ HF Spaces β€” Build Free Demo Apps

Spaces = hosting gratis untuk demo ML app. Anda bisa membuat app dengan Gradio atau Streamlit, push ke HF, dan mendapat URL publik. Sempurna untuk portfolio dan sharing.

Spaces = free hosting for ML demo apps. You can build apps with Gradio or Streamlit, push to HF, and get a public URL. Perfect for portfolios and sharing.

app.py β€” Gradio Demo App, upload ke HF Spaces (python)
import gradio as gr
from transformers import pipeline

# Load model
classifier = pipeline("sentiment-analysis")

# Define interface
def analyze(text):
    result = classifier(text)[0]
    return f"{result['label']} ({result['score']:.1%})"

# Create Gradio app
demo = gr.Interface(
    fn=analyze,
    inputs=gr.Textbox(placeholder="Type your text here..."),
    outputs="text",
    title="πŸ€— Sentiment Analyzer",
    description="Analyze sentiment of any English text",
)
demo.launch()

# ===========================
# Deploy ke HF Spaces:
# 1. Buat repo di huggingface.co/new-space
# 2. Pilih "Gradio" sebagai SDK
# 3. Upload app.py + requirements.txt
# 4. Otomatis deploy β†’ dapat URL publik: huggingface.co/spaces/username/space-name
# 5. GRATIS untuk CPU! GPU mulai $0.60/jam
# ===========================
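A Space also needs a requirements.txt next to app.py, listing the pip dependencies (gradio itself is provided by the Spaces SDK). A minimal one for the demo above might be:

```text
transformers
torch
```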

β‘€ & β‘₯ Production Deployment β€” Inference Endpoints & Self-Hosting

β‘€ & β‘₯ Production Deployment β€” Inference Endpoints & Self-Hosting

production_options.py β€” Production Deployment (python)
# ===========================
# Option 5: HF Inference Endpoints (managed hosting)
# β†’ huggingface.co/inference-endpoints
# ===========================
# 1. Pilih model dari Hub
# 2. Pilih hardware (CPU/GPU/A100)
# 3. Pilih region (US, EU, Asia)
# 4. Deploy β†’ dapat production API URL
# 5. Auto-scaling, monitoring, HTTPS included
# 
# Pricing:
# CPU (2 vCPU):     ~$0.06/jam  (~$43/bulan)
# GPU T4 (16GB):    ~$0.60/jam  (~$432/bulan)
# GPU A10G (24GB):  ~$1.30/jam  (~$936/bulan)
# GPU A100 (80GB):  ~$4.50/jam  (~$3,240/bulan)
# 
# Best for: production API tanpa manage infrastructure

# ===========================
# Option 6: Self-Hosting (Docker on your server)
# ===========================
# A. Simple: FastAPI + model
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()
classifier = pipeline("sentiment-analysis", device=0)

@app.post("/predict")
async def predict(text: str):
    result = classifier(text)
    return result

# uvicorn app:app --host 0.0.0.0 --port 8000
# Deploy dengan Docker β†’ AWS EC2, GCP VM, DigitalOcean, dll.

# B. Optimized: Text Generation Inference (TGI)
# Docker container dari HF untuk LLM serving
# docker run --gpus all -p 8080:80 \
#   ghcr.io/huggingface/text-generation-inference \
#   --model-id meta-llama/Llama-3.2-1B
# 
# Optimized: continuous batching, flash attention, quantization
# Best for: high-throughput LLM serving

# C. vLLM (alternative to TGI)
# pip install vllm
# python -m vllm.entrypoints.openai.api_server \
#   --model meta-llama/Llama-3.2-1B --port 8000
# β†’ OpenAI-compatible API for any HF model!
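Once option 6A is running under uvicorn, any HTTP client can call it. A quick sketch with requests (note that the /predict route above binds its bare `text: str` argument as a query parameter):

```python
import requests

# Call the self-hosted FastAPI service from option 6A
# (assumes `uvicorn app:app --port 8000` is already running)
try:
    resp = requests.post(
        "http://localhost:8000/predict",
        params={"text": "Self-hosting works great!"},  # query param, not JSON body
        timeout=5,
    )
    print(resp.json())
except requests.exceptions.ConnectionError:
    print("Server not running β€” start it with: uvicorn app:app --port 8000")
```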

πŸŽ“ Rekomendasi Berdasarkan Situasi:
Belajar / ikut seri ini: β†’ β‘  Google Colab (gratis, GPU T4, zero setup) ⭐
Development sehari-hari: β†’ β‘‘ Lokal + Colab untuk training berat
Demo / portfolio: β†’ β‘£ HF Spaces (Gradio app, URL publik gratis)
Prototyping cepat: β†’ β‘’ Inference API (HTTP request, tanpa download)
Production API (startup): β†’ β‘€ Inference Endpoints (managed, auto-scale)
Production API (enterprise): β†’ β‘₯ Self-hosting Docker/Kubernetes (full control)

Penting: Hugging Face Hub = "tempat model disimpan" (seperti GitHub). Model di-download dari Hub ke tempat Anda menjalankannya (Colab, laptop, server). HF Hub BUKAN tempat menjalankan kode β€” kode berjalan di device Anda!

πŸŽ“ Recommendation Based on Situation:
Learning / following this series: β†’ β‘  Google Colab (free, T4 GPU, zero setup) ⭐
Daily development: β†’ β‘‘ Local + Colab for heavy training
Demo / portfolio: β†’ β‘£ HF Spaces (Gradio app, free public URL)
Quick prototyping: β†’ β‘’ Inference API (HTTP request, no download)
Production API (startup): β†’ β‘€ Inference Endpoints (managed, auto-scale)
Production API (enterprise): β†’ β‘₯ Self-hosting Docker/Kubernetes (full control)

Important: Hugging Face Hub = "where models are stored" (like GitHub). Models are downloaded FROM the Hub TO wherever you run them (Colab, laptop, server). The Hub is NOT where code runs β€” code runs on YOUR device!
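That download step can also be done explicitly with huggingface_hub's snapshot_download. A sketch using hf-internal-testing/tiny-random-bert, a few-MB test model, so the example stays fast:

```python
from huggingface_hub import snapshot_download

# Pull an entire model repo from the Hub into the local cache
local_path = snapshot_download("hf-internal-testing/tiny-random-bert")
print(local_path)
# a folder under ~/.cache/huggingface/hub/ containing config, weights, tokenizer files
```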

Flow: Bagaimana Model Sampai ke Anda / How Models Reach You
- Hugging Face Hub (huggingface.co): menyimpan 500k+ models (bert-base, gpt2, llama-3.2, whisper, ...)
- Download (pertama kali, mis. ~420MB untuk BERT; selanjutnya cached) β†’ ke environment Anda: β‘  Google Colab Β· β‘‘ Laptop/Desktop Β· β‘’ AWS/GCP Server Β· β‘£ HF Spaces Β· β‘€ HF Endpoints Β· β‘₯ Docker container
- Kode Anda (from transformers import pipeline; pipe = pipeline("sentiment-analysis"); pipe("Hello!")) β†’ inference berjalan di device Anda, bukan di website HF
- Hub = storage (seperti GitHub untuk code); inference = di device Anda (Colab, laptop, server)
- Exception: Inference API (β‘’) & Endpoints (β‘€) β†’ HF servers yang menjalankan model
Cara | Biaya | GPU | Setup | Best For
β‘  Colab | Gratis | T4 (16GB) gratis | 0 menit | Belajar, fine-tuning BERT ⭐
β‘‘ Lokal | Listrik | GPU Anda (jika ada) | 10 menit | Development harian
β‘’ Inference API | Gratis (rate-limit) | HF servers | 0 menit | Prototyping, demo kecil
β‘£ Spaces | Gratis (CPU) | Opsional ($0.60/jam) | 5 menit | Demo apps, portfolio
β‘€ Endpoints | $0.06-4.50/jam | T4/A10/A100 | 5 menit | Production API
β‘₯ Self-host | $5-1000+/bln | Your choice | 30-60 menit | Enterprise, privacy

Method | Cost | GPU | Setup | Best For
β‘  Colab | Free | T4 (16GB) free | 0 min | Learning, BERT fine-tuning ⭐
β‘‘ Local | Electricity | Your GPU (if any) | 10 min | Daily development
β‘’ Inference API | Free (rate-limited) | HF servers | 0 min | Prototyping, small demos
β‘£ Spaces | Free (CPU) | Optional ($0.60/hr) | 5 min | Demo apps, portfolio
β‘€ Endpoints | $0.06-4.50/hr | T4/A10/A100 | 5 min | Production API
β‘₯ Self-host | $5-1000+/mo | Your choice | 30-60 min | Enterprise, privacy

πŸŽ‰ TL;DR untuk Pemula:
1. Buka colab.research.google.com
2. Aktifkan GPU: Runtime β†’ Change runtime type β†’ T4 GPU
3. Ketik: !pip install -q transformers datasets accelerate
4. Ketik: from transformers import pipeline
5. Selesai! Anda sudah bisa menjalankan BERT, GPT-2, Whisper, dll. di cloud gratis.
Model di-download dari Hub ke Colab server β†’ berjalan di GPU T4 Colab β†’ Anda dapat hasil di notebook. Tidak perlu install apapun di laptop Anda.

πŸŽ‰ TL;DR for Beginners:
1. Open colab.research.google.com
2. Enable GPU: Runtime β†’ Change runtime type β†’ T4 GPU
3. Type: !pip install -q transformers datasets accelerate
4. Type: from transformers import pipeline
5. Done! You can now run BERT, GPT-2, Whisper, etc. on a free cloud GPU.
Models are downloaded from the Hub to Colab server β†’ run on Colab's T4 GPU β†’ you get results in the notebook. No need to install anything on your laptop.

πŸš€

3. Pipeline API β€” Inference 1 Baris untuk 20+ Tugas

3. Pipeline API β€” 1-Line Inference for 20+ Tasks

API paling powerful di dunia ML: satu baris kode = download model + tokenize + inference + postprocess
The most powerful API in ML: one line of code = download model + tokenize + inference + postprocess

Pipeline adalah API tertinggi (highest-level) di Hugging Face. Satu function call melakukan segalanya: download model dari Hub, tokenize input, jalankan inference, dan format output. Anda bahkan tidak perlu tahu arsitektur model yang digunakan.

Pipeline is the highest-level API in Hugging Face. One function call does everything: download model from Hub, tokenize input, run inference, and format output. You don't even need to know the model architecture being used.

01_pipeline_basics.py β€” Pipeline Magic ✨ (python)
from transformers import pipeline

# ===========================
# 1. Sentiment Analysis β€” one line!
# ===========================
classifier = pipeline("sentiment-analysis")
# First run: downloads model (~270MB) β€” cached for future use

result = classifier("I absolutely love this product! Best purchase ever.")
print(result)
# [{'label': 'POSITIVE', 'score': 0.9998}]

# Multiple texts at once (batched!)
results = classifier([
    "This movie was fantastic!",
    "Terrible experience, waste of money.",
    "It was okay, nothing special."
])
for r in results:
    print(f"  {r['label']:8s} ({r['score']:.1%})")
# POSITIVE (99.9%)
# NEGATIVE (99.8%)
# POSITIVE (63.1%)  ← uncertain β†’ neutral-ish

# ===========================
# 2. Specify a different model
# ===========================
classifier_multi = pipeline(
    "sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment"
)
# Now supports 6 languages! (EN, DE, NL, ES, FR, IT)
result = classifier_multi("Film ini sangat bagus!")  # Indonesian!
print(result)
# [{'label': '5 stars', 'score': 0.73}]

# ===========================
# 3. GPU acceleration
# ===========================
classifier_gpu = pipeline("sentiment-analysis", device=0)  # GPU:0
# device=0 β†’ first GPU
# device=-1 β†’ CPU (default)
# device="mps" β†’ Apple Silicon

# ===========================
# 4. How pipeline() works internally
# ===========================
# pipeline("sentiment-analysis") is equivalent to:
# 1. tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
# 2. model = AutoModelForSequenceClassification.from_pretrained("...")
# 3. inputs = tokenizer(text, return_tensors="pt")
# 4. outputs = model(**inputs)
# 5. predictions = softmax(outputs.logits)
# 6. label = model.config.id2label[predicted_class]
# Pipeline wraps ALL of this in one call!

πŸŽ“ Pipeline: Apa yang Terjadi di Balik Layar?
Satu panggilan pipeline("sentiment-analysis")("text") melakukan 6 langkah:
1. Download model dari Hugging Face Hub (pertama kali saja, lalu di-cache)
2. Tokenize input β€” text β†’ subword tokens β†’ integer IDs + attention mask
3. Forward pass β€” jalankan model Transformer (BERT/DistilBERT/etc.)
4. Post-process β€” logits β†’ softmax β†’ probabilities
5. Map to labels β€” index β†’ "POSITIVE"/"NEGATIVE"
6. Format output β€” return list of dicts dengan label dan score
Anda akan belajar SEMUA langkah ini secara manual di section 7-9!

πŸŽ“ Pipeline: What Happens Behind the Scenes?
One call to pipeline("sentiment-analysis")("text") performs 6 steps:
1. Download model from Hugging Face Hub (first time only, then cached)
2. Tokenize input β€” text β†’ subword tokens β†’ integer IDs + attention mask
3. Forward pass β€” run Transformer model (BERT/DistilBERT/etc.)
4. Post-process β€” logits β†’ softmax β†’ probabilities
5. Map to labels β€” index β†’ "POSITIVE"/"NEGATIVE"
6. Format output β€” return list of dicts with label and score
You'll learn ALL of these steps manually in sections 7-9!
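The six steps, done by hand β€” a sketch of what pipeline() wraps, using the same checkpoint pipeline("sentiment-analysis") loads by default:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)                  # step 1 (downloads, then cached)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("I love this!", return_tensors="pt")          # step 2: tokenize
with torch.no_grad():
    outputs = model(**inputs)                                    # step 3: forward pass
probs = torch.softmax(outputs.logits, dim=-1)                    # step 4: logits β†’ probabilities
pred = probs.argmax(dim=-1).item()
label = model.config.id2label[pred]                              # step 5: index β†’ label
print({"label": label, "score": probs[0, pred].item()})          # step 6: format output
# {'label': 'POSITIVE', 'score': 0.999...}
```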

πŸ“

4. Pipeline NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization

4. Pipeline NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization

Satu API untuk semua tugas NLP β€” ganti nama task, dapat model baru
One API for all NLP tasks β€” change the task name, get a new model
02_nlp_pipelines.py β€” Semua NLP Pipeline (python)
from transformers import pipeline

# ===========================
# 1. Named Entity Recognition (NER)
# Identifikasi entitas: orang, tempat, organisasi
# ===========================
ner = pipeline("ner", grouped_entities=True)
result = ner("Joko Widodo visited Google headquarters in Mountain View, California.")
for entity in result:
    print(f"  {entity['word']:20s} β†’ {entity['entity_group']:5s} ({entity['score']:.1%})")
# Joko Widodo          β†’ PER   (99.8%)
# Google               β†’ ORG   (99.6%)
# Mountain View        β†’ LOC   (99.9%)
# California           β†’ LOC   (99.9%)

# ===========================
# 2. Question Answering (extractive)
# Jawab pertanyaan berdasarkan konteks
# ===========================
qa = pipeline("question-answering")
result = qa(
    question="What is the capital of France?",
    context="France is a country in Europe. Its capital is Paris, a city known for the Eiffel Tower."
)
print(f"Answer: {result['answer']} (score: {result['score']:.1%})")
# Answer: Paris (score: 98.7%)

# ===========================
# 3. Text Summarization
# ===========================
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
article = """
Hugging Face has raised $235 million in a Series D funding round, 
bringing the company's valuation to $4.5 billion. The round was led 
by Salesforce Ventures, with participation from Google, Amazon, NVIDIA, 
Intel, AMD, and Qualcomm. The company plans to use the funding to 
expand its open-source AI platform and hire more researchers.
"""
summary = summarizer(article, max_length=50, min_length=20)
print(summary[0]['summary_text'])
# "Hugging Face raised $235M at $4.5B valuation, led by Salesforce..."

# ===========================
# 4. Translation
# ===========================
translator = pipeline("translation_en_to_fr")
result = translator("Hugging Face is the best AI platform.")
print(result[0]['translation_text'])
# "Hugging Face est la meilleure plateforme d'IA."

# Multi-language: Helsinki-NLP models
id_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-id-en")
result = id_to_en("Saya sedang belajar kecerdasan buatan.")
print(result[0]['translation_text'])
# "I'm learning artificial intelligence."

# ===========================
# 5. Text Generation (GPT-style)
# ===========================
generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Artificial intelligence will",
    max_length=50,
    num_return_sequences=2,    # generate 2 variations
    temperature=0.7,           # creativity (0=deterministic, 1=random)
    do_sample=True
)
for i, r in enumerate(result):
    print(f"  Variation {i+1}: {r['generated_text'][:80]}...")

# ===========================
# 6. Fill-Mask (BERT-style)
# ===========================
fill = pipeline("fill-mask", model="bert-base-uncased")
# Note: the default fill-mask model (distilroberta-base) uses <mask>,
# while BERT models use [MASK] β€” match the mask token to the model!
results = fill("The capital of Indonesia is [MASK].")
for r in results[:3]:
    print(f"  {r['token_str']:10s} ({r['score']:.1%})")
# Jakarta    (92.3%)
# Bandung    (2.1%)
# Surabaya   (1.4%)
πŸ–ΌοΈ

5. Pipeline Beyond NLP β€” Image, Audio, Zero-Shot

5. Pipeline Beyond NLP β€” Image, Audio, Zero-Shot

Hugging Face bukan hanya untuk teks β€” juga gambar, audio, dan multimodal
Hugging Face isn't just for text β€” also images, audio, and multimodal
03_beyond_nlp.py β€” Image, Audio & Zero-Shot Pipelines (python)
from transformers import pipeline

# ===========================
# 1. Image Classification
# ===========================
img_classifier = pipeline("image-classification")
result = img_classifier("https://upload.wikimedia.org/wikipedia/commons/4/4d/Cat_November_2010-1a.jpg")
for r in result[:3]:
    print(f"  {r['label']:30s} ({r['score']:.1%})")
# tabby, tabby cat              (43.2%)
# Egyptian cat                  (22.1%)
# tiger cat                     (13.8%)

# ===========================
# 2. Object Detection
# ===========================
detector = pipeline("object-detection")
results = detector("https://example.com/street_scene.jpg")
for r in results:
    print(f"  {r['label']:10s} ({r['score']:.1%}) at {r['box']}")
# car        (97.2%) at {'xmin': 12, 'ymin': 50, ...}
# person     (95.1%) at {'xmin': 200, 'ymin': 30, ...}

# ===========================
# 3. Zero-Shot Classification (NO TRAINING NEEDED!)
# Classify text into ANY categories β€” even ones the model never saw!
# ===========================
zero_shot = pipeline("zero-shot-classification")
result = zero_shot(
    "Harga saham Tesla naik 15% setelah pengumuman earnings Q4.",
    candidate_labels=["finance", "sports", "technology", "politics", "health"]
)
for label, score in zip(result['labels'], result['scores']):
    print(f"  {label:12s}: {score:.1%}")
# finance     : 78.3%
# technology  : 15.2%
# politics    :  3.8%
# sports      :  1.5%
# health      :  1.2%

# ===========================
# 4. Automatic Speech Recognition
# ===========================
# asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")
# result = asr("audio_file.mp3")
# print(result["text"])  # "Hello, how are you today?"
# Whisper supports 99 languages including Indonesian!

# ===========================
# 5. Text-to-Speech
# ===========================
# tts = pipeline("text-to-speech", model="microsoft/speecht5_tts")
# audio = tts("Hello, welcome to the Hugging Face tutorial!")
# # Returns audio array that can be saved as .wav


πŸŽ‰ Zero-Shot Classification β€” Superpower!
Zero-shot = classification without any training. You just provide desired categories as text, and the model matches input to categories using natural language understanding. Great for: rapid prototyping, label discovery, classification with frequently changing categories.
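Under the hood, the zero-shot pipeline frames classification as NLI: each candidate label is turned into a hypothesis via a template (the transformers default is "This example is {}."), and an entailment model scores the premise against each hypothesis. The pair construction can be sketched in plain Python (`build_nli_pairs` is a hypothetical helper, not part of the library):

```python
# Hypothetical helper mirroring how zero-shot classification is framed as NLI.
def build_nli_pairs(premise, candidate_labels,
                    hypothesis_template="This example is {}."):
    """One (premise, hypothesis) pair per candidate label; the NLI model
    then scores how strongly the premise entails each hypothesis."""
    return [(premise, hypothesis_template.format(label))
            for label in candidate_labels]

pairs = build_nli_pairs(
    "Harga saham Tesla naik 15% setelah pengumuman earnings Q4.",
    ["finance", "sports", "technology"],
)
for _, hypothesis in pairs:
    print(hypothesis)
# This example is finance.
# This example is sports.
# This example is technology.
```

The per-label entailment scores are then normalized into the probabilities the pipeline returns, which is why any label you can phrase in natural language works without training.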

Pipeline Task                | Description                 | Default Model    | Input β†’ Output
-----------------------------|-----------------------------|------------------|-------------------------------
sentiment-analysis           | Positive/negative sentiment | DistilBERT SST-2 | text β†’ label + score
ner                          | Named Entity Recognition    | BERT NER         | text β†’ entities + types
question-answering           | Answer from context         | DistilBERT SQuAD | question + context β†’ answer
summarization                | Summarize long text         | BART CNN         | long text β†’ summary
translation_xx_to_yy         | Translation                 | Helsinki-NLP     | language A text β†’ language B
text-generation              | Generate text (GPT-style)   | GPT-2            | prompt β†’ continuation
fill-mask                    | Predict missing word        | BERT base        | text + [MASK] β†’ word
zero-shot-classification     | Classify without training   | BART MNLI        | text + labels β†’ scores
image-classification         | Classify images             | ViT ImageNet     | image β†’ label + score
object-detection             | Detect objects              | DETR             | image β†’ boxes + labels
automatic-speech-recognition | Speech to text              | Whisper          | audio β†’ text
🌐

6. Model Hub β€” 500k+ Models, Choosing the Right One

huggingface.co/models β€” filter by task, language, size, license

With 500k+ models on the Hub, how do you choose the right one? Use the filters: task (sentiment, NER, etc.), language (Indonesian, English), library (PyTorch, TensorFlow), dataset (what data the model was trained on), and license (open vs. restricted). Sort by downloads or likes to surface the most popular models.

04_model_hub.py β€” Browse & Download Models (Python)
from huggingface_hub import HfApi

# ===========================
# 1. Search models programmatically
# ===========================
api = HfApi()
models = api.list_models(
    filter="text-classification",
    sort="downloads",
    direction=-1,
    limit=5
)
for m in models:
    print(f"  {m.id:50s} ↓{m.downloads:>10,}")
# distilbert-base-uncased-finetuned-sst-2-english  ↓ 85,432,100
# nlptown/bert-base-multilingual-uncased-sentiment  ↓ 12,345,000
# cardiffnlp/twitter-roberta-base-sentiment-latest  ↓  8,765,000

# ===========================
# 2. Indonesian NLP models
# ===========================
id_models = api.list_models(
    filter="text-classification",
    search="indonesian",
    sort="downloads",
    direction=-1,
    limit=5
)
for m in id_models:
    print(f"  {m.id}")
# indobenchmark/indobert-base-p1
# indolem/indobert-base-uncased
# cahya/bert-base-indonesian-522M

# ===========================
# 3. Model naming convention
# ===========================
# Format: organization/model-name
# Examples:
# google-bert/bert-base-uncased         ← Google's BERT
# meta-llama/Llama-3.2-1B              ← Meta's LLaMA
# openai-community/gpt2                ← OpenAI's GPT-2
# facebook/bart-large-cnn              ← Meta's BART
# sentence-transformers/all-MiniLM-L6-v2 ← sentence embeddings

# ===========================
# 4. Download model manually (for offline use)
# ===========================
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Downloads to ~/.cache/huggingface/ (~420MB for BERT base)

# Save locally
model.save_pretrained("./my_bert")
tokenizer.save_pretrained("./my_bert")

# Load from local
model = AutoModel.from_pretrained("./my_bert")
tokenizer = AutoTokenizer.from_pretrained("./my_bert")


πŸŽ“ Tips for Choosing Models:
Prototyping: Start with default pipeline (usually DistilBERT β€” fast and good).
Production English: roberta-base or deberta-v3-base (more accurate than BERT).
Production Indonesian: indobert-base or cahya/bert-base-indonesian.
Multilingual: xlm-roberta-base (100+ languages including Indonesian).
Speed priority: DistilBERT (40% faster, 97% of BERT accuracy).
LLM/Chat: meta-llama/Llama-3.2, Qwen/Qwen2.5, mistralai/Mistral.

πŸ”§

7. Auto Classes β€” AutoModel, AutoTokenizer, AutoConfig

One universal API that automatically selects the right class based on the model name

Auto Classes are a brilliant abstraction from Hugging Face: you don't need to know whether a model is BERT, RoBERTa, GPT-2, or T5 β€” just use AutoModel and it automatically instantiates the right class. This lets you swap models without changing your code.

05_auto_classes.py β€” Auto Classes Deep Dive (Python)
from transformers import (
    AutoModel, AutoTokenizer, AutoConfig,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoModelForQuestionAnswering,
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
)

# ===========================
# 1. AutoTokenizer β€” universal tokenizer loader
# ===========================
# Doesn't matter if model uses WordPiece, BPE, or SentencePiece!
tokenizer_bert = AutoTokenizer.from_pretrained("bert-base-uncased")      # WordPiece
tokenizer_gpt = AutoTokenizer.from_pretrained("gpt2")                    # BPE
tokenizer_t5 = AutoTokenizer.from_pretrained("google-t5/t5-small")      # SentencePiece
tokenizer_llama = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")  # BPE

# All have the SAME interface!
for name, tok in [("BERT", tokenizer_bert), ("GPT-2", tokenizer_gpt), ("T5", tokenizer_t5)]:
    encoded = tok("Hello world", return_tensors="pt")
    print(f"  {name:6s}: {encoded['input_ids'][0].tolist()}")
# BERT  : [101, 7592, 2088, 102]           ← [CLS] hello world [SEP]
# GPT-2 : [15496, 995]                      ← Hello Δ world (no special tokens)
# T5    : [8774, 296, 1]                    ← ▁Hello ▁world </s>

# ===========================
# 2. AutoModel β€” base model (no head)
# ===========================
model = AutoModel.from_pretrained("bert-base-uncased")
print(f"Type: {type(model).__name__}")  # BertModel
print(f"Params: {model.num_parameters():,}")  # 109,482,240
# Output: last_hidden_state (batch, seq_len, hidden_size)
# β†’ Raw embeddings, NO classification head

# ===========================
# 3. AutoModelForSequenceClassification β€” with classifier head
# ===========================
model_cls = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3  # positive, negative, neutral
)
print(f"Type: {type(model_cls).__name__}")  # BertForSequenceClassification
# Output: logits (batch, num_labels) β†’ ready for classification!

# ===========================
# 4. Task-specific Auto Classes
# ===========================
# AutoModelForSequenceClassification  β†’ sentiment, topic classification
# AutoModelForTokenClassification     β†’ NER, POS tagging
# AutoModelForQuestionAnswering       β†’ extractive QA
# AutoModelForCausalLM                β†’ text generation (GPT-style)
# AutoModelForSeq2SeqLM               β†’ translation, summarization (T5-style)
# AutoModelForMaskedLM                β†’ fill-mask (BERT-style)
# AutoModelForImageClassification     β†’ image classification (ViT)
# AutoModelForObjectDetection         β†’ object detection (DETR)

# ===========================
# 5. AutoConfig β€” model configuration
# ===========================
config = AutoConfig.from_pretrained("bert-base-uncased")
print(f"Hidden size: {config.hidden_size}")       # 768
print(f"Num layers:  {config.num_hidden_layers}")  # 12
print(f"Num heads:   {config.num_attention_heads}") # 12
print(f"Vocab size:  {config.vocab_size}")         # 30522
Auto Classes β€” One API for All Models

AutoTokenizer.from_pretrained("model_name")
β”‚
β”œβ”€β”€ bert-base-uncased      β†’ BertTokenizerFast   (WordPiece)
β”œβ”€β”€ gpt2                   β†’ GPT2TokenizerFast   (BPE)
β”œβ”€β”€ google-t5/t5-small     β†’ T5TokenizerFast     (SentencePiece)
└── meta-llama/Llama-3.2   β†’ LlamaTokenizerFast  (BPE)

AutoModelForSequenceClassification.from_pretrained("model_name")
β”‚
β”œβ”€β”€ bert-base-uncased      β†’ BertForSequenceClassification
β”œβ”€β”€ roberta-base           β†’ RobertaForSequenceClassification
β”œβ”€β”€ distilbert-base        β†’ DistilBertForSequenceClassification
└── xlm-roberta-base       β†’ XLMRobertaForSequenceClassification

Same code, different model β€” just change the model name string!
model_name = "bert-base-uncased"   # β†’ BERT
model_name = "roberta-base"        # β†’ RoBERTa (same code!)
model_name = "xlm-roberta-base"    # β†’ XLM-R multilingual (same code!)
βœ‚οΈ

8. Deep Dive: Tokenization β€” WordPiece, BPE, SentencePiece

Understanding EVERY step: text β†’ tokens β†’ IDs β†’ attention mask β†’ model input
06_tokenization.py β€” Tokenization Deep Dive (Python)
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# ===========================
# 1. Step by step tokenization
# ===========================
text = "Hugging Face's tokenizers are incredibly fast!"

# Step 1: Tokenize (split into subwords)
tokens = tokenizer.tokenize(text)
print(f"Tokens: {tokens}")
# ['hugging', 'face', "'", 's', 'token', '##ize', '##rs', 'are', 'incredibly', 'fast', '!']
# Note: "tokenizers" β†’ ["token", "##ize", "##rs"] (WordPiece subwords!)
# "##" prefix means "continuation of previous word"

# Step 2: Convert to IDs
ids = tokenizer.convert_tokens_to_ids(tokens)
print(f"IDs: {ids}")
# [17662, 2227, 1005, 1055, 19204, 4697, 2869, 2024, 12978, 3435, 999]

# Step 3: Add special tokens + create attention mask
encoded = tokenizer(text, return_tensors="pt")
print(f"input_ids:      {encoded['input_ids'][0].tolist()}")
print(f"attention_mask: {encoded['attention_mask'][0].tolist()}")
print(f"token_type_ids: {encoded['token_type_ids'][0].tolist()}")
# input_ids:      [101, 17662, 2227, ..., 999, 102]    ← [CLS] ... [SEP]
# attention_mask: [1, 1, 1, ..., 1, 1]                  ← all real tokens
# token_type_ids: [0, 0, 0, ..., 0, 0]                  ← single sentence

# ===========================
# 2. Decode back to text
# ===========================
decoded = tokenizer.decode(encoded['input_ids'][0])
print(f"Decoded: {decoded}")
# "[CLS] hugging face's tokenizers are incredibly fast! [SEP]"

decoded_skip = tokenizer.decode(encoded['input_ids'][0], skip_special_tokens=True)
print(f"Clean:   {decoded_skip}")
# "hugging face's tokenizers are incredibly fast!"

# ===========================
# 3. Padding & Truncation
# ===========================
texts = ["Short text.", "This is a much longer sentence that has more words in it."]

# Without padding: different lengths β†’ can't batch!
for t in texts:
    enc = tokenizer(t)
    print(f"  Length: {len(enc['input_ids'])}")
# Length: 4
# Length: 14  ← different! Can't make a tensor

# With padding + truncation: same length β†’ can batch!
batch = tokenizer(texts,
    padding=True,           # pad shorter sequences
    truncation=True,         # truncate if too long
    max_length=128,          # max sequence length
    return_tensors="pt"     # return PyTorch tensors
)
print(f"Batch shape: {batch['input_ids'].shape}")
# Batch shape: torch.Size([2, 14])  ← padded to longest!
print(f"Attention mask: {batch['attention_mask'][0].tolist()}")
# [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# 1=real token, 0=padding β†’ model IGNORES padding!

# ===========================
# 4. Special tokens per model
# ===========================
print(f"BERT special tokens:")
print(f"  CLS: {tokenizer.cls_token} (ID: {tokenizer.cls_token_id})")  # [CLS] = 101
print(f"  SEP: {tokenizer.sep_token} (ID: {tokenizer.sep_token_id})")  # [SEP] = 102
print(f"  PAD: {tokenizer.pad_token} (ID: {tokenizer.pad_token_id})")  # [PAD] = 0
print(f"  UNK: {tokenizer.unk_token} (ID: {tokenizer.unk_token_id})")  # [UNK] = 100
print(f"  Vocab size: {tokenizer.vocab_size}")  # 30522

# ===========================
# 5. Sentence pairs (for NLI, QA, etc.)
# ===========================
encoded_pair = tokenizer(
    "What is the capital?",     # sentence A
    "The capital of France is Paris.",  # sentence B
    return_tensors="pt"
)
print(encoded_pair['token_type_ids'][0].tolist())
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
# 0=sentence A, 1=sentence B
# [CLS] What is the capital ? [SEP] The capital of France is Paris . [SEP]


πŸŽ“ WordPiece vs BPE vs SentencePiece:
WordPiece (BERT): Split unknown words into subwords. "tokenizers" β†’ ["token", "##ize", "##rs"]. ## prefix = continuation.
BPE (GPT-2, RoBERTa): Byte Pair Encoding β€” merge most frequent byte pairs. "lower" β†’ ["low", "er"]. Δ  prefix = new word start.
SentencePiece (T5, LLaMA): Language-agnostic, treats all input as byte sequence. ▁ = space/word boundary. Works for ALL languages without preprocessing.
You don't need to choose β€” AutoTokenizer automatically loads the right tokenizer for each model!
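To make the two main ideas concrete, here is a toy sketch of WordPiece-style greedy longest-match and one BPE training step. The mini-vocabulary and corpus are made up for illustration, not the real trained ones:

```python
from collections import Counter

# --- WordPiece-style greedy longest-match (toy vocabulary) ---
VOCAB = {"token", "##ize", "##rs", "fast", "##er"}

def wordpiece(word):
    """Greedily match the longest vocab piece from the left;
    continuation pieces carry the '##' prefix."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in VOCAB:
                pieces.append(piece)
                break
            end -= 1
        else:
            return ["[UNK]"]  # no piece matched β†’ unknown token
        start = end
    return pieces

print(wordpiece("tokenizers"))  # ['token', '##ize', '##rs']
print(wordpiece("faster"))      # ['fast', '##er']

# --- One BPE training step: find the most frequent adjacent pair ---
def most_frequent_pair(words):
    pairs = Counter()
    for symbols in words:
        pairs.update(zip(symbols, symbols[1:]))  # count adjacent symbol pairs
    return pairs.most_common(1)[0][0]

corpus = [list("hug"), list("pug"), list("hugs")]
print(most_frequent_pair(corpus))  # ('u', 'g') β†’ merged into new symbol 'ug'
```

Real BPE training repeats the merge step thousands of times, each merge adding one new symbol to the vocabulary; WordPiece training is similar but picks merges by likelihood gain rather than raw frequency.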

πŸ”¬

9. From Tokenizer to Model β€” Full Manual Forward Pass

Understanding what pipeline() does behind the scenes β€” step by step
07_manual_forward.py β€” Full Manual Inference (Python)
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# ===========================
# Step 1: Load tokenizer & model
# ===========================
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# ===========================
# Step 2: Tokenize input
# ===========================
text = "I absolutely love learning about Hugging Face!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
print(f"Input IDs shape: {inputs['input_ids'].shape}")
print(f"Tokens: {tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])}")
# ['[CLS]', 'i', 'absolutely', 'love', 'learning', 'about', 'hugging', 'face', '!', '[SEP]']

# ===========================
# Step 3: Forward pass (no gradient needed for inference!)
# ===========================
with torch.no_grad():  # disable gradient computation β†’ faster + less memory
    outputs = model(**inputs)

print(f"Output type: {type(outputs)}")
# SequenceClassifierOutput
print(f"Logits: {outputs.logits}")
# tensor([[-4.2532,  4.5687]])  ← raw scores (NOT probabilities!)

# ===========================
# Step 4: Post-process β€” logits β†’ probabilities
# ===========================
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(f"Probabilities: {probabilities}")
# tensor([[0.0001, 0.9999]])  ← [NEGATIVE, POSITIVE]

# ===========================
# Step 5: Map to label
# ===========================
predicted_class = torch.argmax(probabilities, dim=-1).item()
label = model.config.id2label[predicted_class]
confidence = probabilities[0][predicted_class].item()

print(f"\n🎯 Prediction: {label} ({confidence:.1%})")
# 🎯 Prediction: POSITIVE (99.99%)

# ===========================
# Compare with pipeline (should be identical!)
# ===========================
from transformers import pipeline
pipe = pipeline("sentiment-analysis", model=model_name)
print(f"Pipeline: {pipe(text)}")
# [{'label': 'POSITIVE', 'score': 0.9999}] ← identical! βœ“


πŸŽ‰ Now You Understand the Entire Flow!
Pipeline = Steps 1-5 above combined into one line. But understanding each step matters because: (1) you can customize preprocessing, (2) you can customize postprocessing, (3) you can debug issues, and (4) Fine-tuning (Pages 2-3) requires understanding tokenizer + model separately.
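As a sanity check on Step 4, the softmax can be reproduced in plain Python for the logits shown above:

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([-4.2532, 4.5687])  # the example logits from Step 3
print([round(p, 4) for p in probs])
# [0.0001, 0.9999]  ← matches the pipeline's POSITIVE score
```

The logit gap of about 8.8 is what drives the near-certain prediction: each unit of logit difference multiplies the odds by e.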

🎯

10. First Look: Fine-Tuning BERT β€” Page 2 Preview

Sneak peek: from pre-trained model to your custom classifier β€” in 20 lines
08_finetuning_preview.py β€” First Taste of Fine-Tuning πŸ”₯ (Python)
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# ===========================
# Fine-tune BERT on IMDB β€” PREVIEW (Page 2 = full version)
# ===========================

# 1. Load dataset
dataset = load_dataset("imdb")
print(dataset)
# DatasetDict({'train': 25000 rows, 'test': 25000 rows, 'unsupervised': 50000 rows})

# 2. Load tokenizer & model
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# 3. Tokenize dataset
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

# 4. Training arguments
args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    eval_strategy="epoch",
    learning_rate=2e-5,
    weight_decay=0.01,
    fp16=True,             # mixed precision!
)

# 5. Create Trainer & train!
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,
)

trainer.train()
# β†’ 93%+ accuracy on IMDB in ~15 minutes on Google Colab T4!
# Compare: BiLSTM from TF series = 87%. BERT = 93%+. That's the power!

# Page 2 will cover: full Trainer API, custom metrics, hyperparameter
# tuning, data collators, and pushing models to the Hub.


🎯 Preview: 93%+ IMDB Accuracy in 15 Minutes!
Compare with previous series:
β€’ NN Series (manual NumPy): ~80% (hundreds of lines, hours of training)
β€’ TF Series Page 5 (BiLSTM): ~87% (25 lines, 30 min training)
β€’ TF Series Page 6 (BERT TF Hub): ~95% (more complex setup)
β€’ Hugging Face (Trainer API): 93%+ (20 lines, 15 minutes!) πŸ†
Page 2 will cover this in depth β€” stay tuned!
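As a small preview of the custom metrics Page 2 covers: Trainer accepts a compute_metrics callable that maps (logits, labels) to a dict of scores. A numpy-only accuracy sketch, using made-up logits for the quick check:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Trainer calls this with a (logits, labels) tuple at evaluation time."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return {"accuracy": float((predictions == labels).mean())}

# Quick check with fake logits for 4 examples (the last one is misclassified):
logits = np.array([[0.1, 2.0], [1.5, 0.2], [0.0, 3.0], [2.0, 0.1]])
labels = np.array([1, 0, 1, 1])
print(compute_metrics((logits, labels)))  # {'accuracy': 0.75}
```

Pass it as Trainer(..., compute_metrics=compute_metrics) and the returned dict shows up in every evaluation log alongside the loss.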

πŸ“

11. Page 1 Summary

Everything we learned
Concept            | What It Is                     | Key Code
-------------------|--------------------------------|----------------------------------------
Pipeline           | 1-line inference for 20+ tasks | pipeline("sentiment-analysis")(text)
Model Hub          | 500k+ ready-to-download models | huggingface.co/models
AutoTokenizer      | Universal tokenizer loader     | AutoTokenizer.from_pretrained(name)
AutoModel          | Universal model loader         | AutoModelForXxx.from_pretrained(name)
Tokenization       | Text β†’ tokens β†’ IDs β†’ tensors  | tokenizer(text, return_tensors="pt")
Padding/Truncation | Fixed-length batching          | padding=True, truncation=True
Forward Pass       | model(**inputs) β†’ logits       | outputs = model(**inputs)
Post-process       | logits β†’ softmax β†’ label       | softmax(logits) β†’ argmax β†’ id2label
Zero-Shot          | Classify without training      | pipeline("zero-shot-classification")
Trainer (preview)  | Fine-tuning API                | Trainer(model, args, train_dataset)
πŸ“˜

Coming Next: Page 2 β€” Fine-Tuning BERT & Trainer API


Deep dive into fine-tuning! Page 2 covers: Datasets library (load, preprocess, tokenize), complete Trainer API (TrainingArguments, callbacks, logging), fine-tuning BERT/DistilBERT/RoBERTa for text classification, custom metrics (F1, precision, recall), data collator and dynamic padding, pushing models to Hugging Face Hub, and hyperparameter tuning. From IMDB sentiment to your own custom datasets!