πŸ“ Artikel ini ditulis dalam Bahasa Indonesia & English
πŸ“ This article is available in English & Bahasa Indonesia

πŸ€— Belajar Hugging Face β€” Page 1
πŸ€— Learn Hugging Face β€” Page 1

Pengenalan Hugging Face
Transformers & Pipeline

Introduction to Hugging Face
Transformers & Pipeline

Ekosistem open-source terbesar untuk NLP, Computer Vision, dan Generative AI. Page 1 membahas secara mendalam: apa itu Hugging Face dan kenapa ia merevolusi AI, instalasi library transformers/datasets/tokenizers/accelerate, Pipeline API untuk inference 1-baris (sentiment, NER, translation, summarization, text generation, image classification, zero-shot), arsitektur model Hub (500k+ models), memahami Auto Classes (AutoModel, AutoTokenizer, AutoConfig), tokenisasi mendalam (WordPiece, BPE, SentencePiece), dan first look fine-tuning BERT untuk text classification.

The largest open-source ecosystem for NLP, Computer Vision, and Generative AI. Page 1 covers in depth: what Hugging Face is and why it revolutionized AI, installing transformers/datasets/tokenizers/accelerate libraries, Pipeline API for 1-line inference (sentiment, NER, translation, summarization, text generation, image classification, zero-shot), the Model Hub architecture (500k+ models), understanding Auto Classes (AutoModel, AutoTokenizer, AutoConfig), deep dive into tokenization (WordPiece, BPE, SentencePiece), and first look at fine-tuning BERT for text classification.

πŸ“… Maret 2026 / March 2026 Β· ⏱ 40 menit baca / 40 min read
🏷 Hugging Face Β· Transformers Β· Pipeline Β· AutoModel Β· Tokenizer Β· BERT Β· GPT Β· Model Hub Β· Fine-Tuning
πŸ“š Seri Belajar Hugging Face / Learn Hugging Face Series

πŸ“‘ Daftar Isi β€” Page 1

πŸ“‘ Table of Contents β€” Page 1

  1. Apa Itu Hugging Face? β€” Ekosistem yang merevolusi AI
  2. Instalasi β€” transformers, datasets, tokenizers, accelerate
  3. Cara Pakai HF β€” Colab, lokal, Inference API, Spaces, self-hosting
  4. Pipeline API β€” Inference 1 baris untuk 20+ tugas
  5. Pipeline: NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization
  6. Pipeline: Beyond NLP β€” Image Classification, Object Detection, Zero-Shot
  7. Model Hub β€” 500k+ models, cara memilih yang tepat
  8. Auto Classes β€” AutoModel, AutoTokenizer, AutoConfig
  9. Tokenisasi Mendalam β€” WordPiece, BPE, SentencePiece, encoding
  10. Dari Tokenizer ke Model β€” Full forward pass manual
  11. First Look: Fine-Tuning BERT β€” Text classification preview
  12. Ringkasan & Preview Page 2
  1. What Is Hugging Face? β€” The ecosystem that revolutionized AI
  2. Installation β€” transformers, datasets, tokenizers, accelerate
  3. How to Use HF β€” Colab, local, Inference API, Spaces, self-hosting
  4. Pipeline API β€” 1-line inference for 20+ tasks
  5. Pipeline: NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization
  6. Pipeline: Beyond NLP β€” Image Classification, Object Detection, Zero-Shot
  7. Model Hub β€” 500k+ models, choosing the right one
  8. Auto Classes β€” AutoModel, AutoTokenizer, AutoConfig
  9. Deep Dive: Tokenization β€” WordPiece, BPE, SentencePiece, encoding
  10. From Tokenizer to Model β€” Full manual forward pass
  11. First Look: Fine-Tuning BERT β€” Text classification preview
  12. Summary & Page 2 Preview
πŸ€—

1. Apa Itu Hugging Face? β€” Revolusi AI Open-Source

1. What Is Hugging Face? β€” The Open-Source AI Revolution

Dari startup chatbot kecil menjadi "GitHub of AI" β€” platform terpenting di dunia ML
From a small chatbot startup to the "GitHub of AI" β€” the most important platform in ML

Hugging Face (πŸ€—) adalah perusahaan dan platform open-source yang menyediakan ekosistem lengkap untuk machine learning modern. Bayangkan GitHub, tapi khusus untuk model AI: Anda bisa menemukan, menggunakan, dan berbagi model dari BERT sampai LLaMA, dari Stable Diffusion sampai Whisper β€” semuanya gratis. Lebih dari 500,000 model dan 100,000 dataset tersedia di Hub mereka.

Hugging Face (πŸ€—) is a company and open-source platform providing a complete ecosystem for modern machine learning. Imagine GitHub, but specifically for AI models: you can find, use, and share models from BERT to LLaMA, from Stable Diffusion to Whisper β€” all for free. Over 500,000 models and 100,000 datasets are available on their Hub.

Kenapa Hugging Face begitu penting? Karena ia mendemokratisasi AI. Sebelum HF, menggunakan BERT membutuhkan ratusan baris kode boilerplate dan pengetahuan mendalam tentang arsitektur model. Sekarang: pipeline("sentiment-analysis")("I love this!") β€” selesai, satu baris.

Why is Hugging Face so important? Because it democratizes AI. Before HF, using BERT required hundreds of lines of boilerplate code and deep knowledge of model architecture. Now: pipeline("sentiment-analysis")("I love this!") β€” done, one line.
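That one line, in full runnable form (the first run downloads the default checkpoint, roughly 270 MB, then caches it):

```python
from transformers import pipeline

# The "hundreds of lines of boilerplate" replaced by a single call
classifier = pipeline("sentiment-analysis")
result = classifier("I love this!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.999...}]
```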

πŸ€— Hugging Face Ecosystem β€” Everything You Need for AI

πŸ“š Libraries:
- transformers β†’ BERT, GPT, T5, LLaMA Β· Pipeline, AutoModel Β· fine-tuning, training
- datasets β†’ load any dataset Β· GLUE, SQuAD, ImageNet Β· streaming, preprocessing
- tokenizers β†’ fast Rust-based Β· BPE, WordPiece, Unigram
- accelerate β†’ multi-GPU, TPU Β· mixed precision Β· DeepSpeed integration

🌐 Hub (huggingface.co):
- 500k+ Models β†’ NLP, CV, Audio, Multimodal Β· download in 1 line
- 100k+ Datasets β†’ community uploads
- Spaces (100k+ apps) β†’ Gradio, Streamlit Β· try models live!

πŸ›  Tools:
- Spaces (demo apps) Β· Inference API (free!) Β· Inference Endpoints (production)
- AutoTrain (no-code) Β· Evaluate (benchmarks) Β· Optimum (optimization)
- PEFT (LoRA, QLoRA) Β· TRL (RLHF training) Β· safetensors (safe format) Β· huggingface_hub (API)

Semua GRATIS dan open-source! MIT / Apache 2.0 license. Used by: Google, Meta, Microsoft, NVIDIA, Amazon, 50k+ companies

πŸ’‘ Analogi: Hugging Face = App Store untuk AI
Model Hub = App Store β†’ download model siap pakai dalam 1 baris kode
Datasets Hub = Data marketplace β†’ dataset berkualitas untuk training
Spaces = Demo gallery β†’ coba model langsung di browser
transformers library = SDK β†’ unified API untuk 200+ arsitektur model
Anda tidak perlu implementasi BERT dari nol β€” cukup from transformers import dan mulai bekerja.

πŸ’‘ Analogy: Hugging Face = App Store for AI
Model Hub = App Store β†’ download ready-to-use models in 1 line of code
Datasets Hub = Data marketplace β†’ quality datasets for training
Spaces = Demo gallery β†’ try models directly in browser
transformers library = SDK β†’ unified API for 200+ model architectures
You don't need to implement BERT from scratch β€” just from transformers import and start working.
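As a small illustration of the "SDK" point, loading a full pretrained model and its tokenizer takes one line each (bert-base-uncased is about 420 MB on first download, then cached):

```python
from transformers import AutoModel, AutoTokenizer

# Config + weights + vocab all come from the Hub, one line each
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

print(model.config.hidden_size)           # 768 β€” BERT-base hidden dimension
print(tokenizer.tokenize("Hugging Face")) # subword tokens for the input text
```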

πŸ“¦

2. Instalasi β€” 4 Library Inti Hugging Face

2. Installation β€” 4 Core Hugging Face Libraries

transformers + datasets + tokenizers + accelerate β€” fondasi lengkap
transformers + datasets + tokenizers + accelerate β€” complete foundation
Terminal β€” Install Hugging Face Stack (bash)
# ===========================
# Core libraries
# ===========================
pip install transformers        # models, pipelines, Auto classes
pip install datasets            # dataset loading & processing
pip install tokenizers          # fast Rust-based tokenizers
pip install accelerate          # multi-GPU, mixed precision

# Or install everything at once:
pip install transformers[torch] datasets accelerate

# ===========================
# Backend: PyTorch or TensorFlow
# ===========================
pip install torch               # PyTorch (RECOMMENDED β€” community default)
# pip install tensorflow        # TensorFlow (also supported)
# HF Transformers supports BOTH backends!
# This series uses PyTorch (90% of HF community uses PyTorch)

# ===========================
# Optional but useful
# ===========================
pip install evaluate            # evaluation metrics
pip install peft                # LoRA, QLoRA (efficient fine-tuning)
pip install trl                 # RLHF training (ChatGPT-style)
pip install bitsandbytes        # 4-bit/8-bit quantization
pip install sentencepiece       # for T5, LLaMA tokenizers

# ===========================
# Verify installation
# ===========================
python -c "import transformers; print(f'transformers {transformers.__version__}')"
python -c "import datasets; print(f'datasets {datasets.__version__}')"
python -c "import torch; print(f'PyTorch {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
# transformers 4.47.x
# datasets 3.2.x
# PyTorch 2.5.x, CUDA: True
Library | Fungsi | Size | Wajib?
transformers | Model, tokenizer, pipeline, training | ~30 MB | βœ… Ya
datasets | Load & proses dataset | ~5 MB | βœ… Ya (training)
tokenizers | Fast Rust tokenizer (auto-installed) | ~5 MB | Auto
accelerate | Multi-GPU, mixed precision | ~3 MB | βœ… Ya (training)
evaluate | Metrics (accuracy, F1, BLEU) | ~2 MB | Recommended
peft | LoRA, QLoRA efficient fine-tuning | ~3 MB | Optional
torch | PyTorch backend | ~2 GB | βœ… Ya (1 backend)

Library | Purpose | Size | Required?
transformers | Models, tokenizer, pipeline, training | ~30 MB | βœ… Yes
datasets | Load & process datasets | ~5 MB | βœ… Yes (training)
tokenizers | Fast Rust tokenizer (auto-installed) | ~5 MB | Auto
accelerate | Multi-GPU, mixed precision | ~3 MB | βœ… Yes (training)
evaluate | Metrics (accuracy, F1, BLEU) | ~2 MB | Recommended
peft | LoRA, QLoRA efficient fine-tuning | ~3 MB | Optional
torch | PyTorch backend | ~2 GB | βœ… Yes (1 backend)

πŸ’‘ Google Colab: Semua library HF sudah pre-installed di Colab! Cukup !pip install -q transformers datasets accelerate untuk update ke versi terbaru. GPU T4 gratis sudah cukup untuk fine-tuning BERT dan model medium lainnya.

πŸ’‘ Google Colab: All HF libraries come pre-installed on Colab! Just !pip install -q transformers datasets accelerate to update to the latest version. The free T4 GPU is sufficient for fine-tuning BERT and other medium models.

πŸ›€οΈ

2b. Bagaimana Cara Pakai Hugging Face? β€” 6 Cara dari Gratisan sampai Production

2b. How Do You Actually Use Hugging Face? β€” 6 Ways from Free to Production

Pertanyaan paling penting: apakah saya jalankan di komputer saya, di cloud, atau di server HF?
The most important question: do I run it on my computer, on the cloud, or on HF's servers?

Banyak yang bingung saat pertama kali mengenal Hugging Face: "Ini dijalankan di mana? Di website HF? Di komputer saya? Di cloud?" Jawabannya: semua bisa! Hugging Face bukan satu platform tunggal β€” ia adalah ekosistem yang bisa dipakai dengan berbagai cara. Berikut 6 cara menggunakan HF, dari yang paling mudah sampai production-grade:

Many people get confused when first encountering Hugging Face: "Where does this run? On HF's website? On my computer? On the cloud?" The answer: all of the above! Hugging Face isn't a single platform β€” it's an ecosystem that can be used in various ways. Here are 6 ways to use HF, from easiest to production-grade:

πŸ›€οΈ 6 Cara Menggunakan Hugging Face / 6 Ways to Use Hugging Face GRATIS / FREE: β‘  Google Colab (RECOMMENDED untuk belajar!) ⭐ BEST FOR BEGINNERS β†’ pip install transformers di Colab notebook β†’ GPU T4 gratis (cukup untuk fine-tuning BERT!) β†’ Tidak perlu install apapun di komputer lokal β†’ File: colab.research.google.com β‘‘ Komputer Lokal (laptop/desktop Anda) β†’ pip install transformers di terminal β†’ Model di-download ke ~/.cache/huggingface/ β†’ CPU: inference OK, training lambat β†’ GPU NVIDIA: training cepat (RTX 3060+ recommended) β‘’ HF Inference API (serverless, gratis rate-limited) β†’ Kirim HTTP request ke api-inference.huggingface.co β†’ Tidak perlu download model β€” HF yang jalankan β†’ Rate limit: ~30k tokens/hari (gratis) β†’ Bagus untuk: prototyping, demo kecil β‘£ HF Spaces (hosting demo apps gratis) β†’ Upload Gradio/Streamlit app ke huggingface.co/spaces β†’ Dapat URL publik gratis (username.hf.space) β†’ CPU gratis, GPU mulai $0.60/jam β†’ Bagus untuk: demo, portfolio, sharing BERBAYAR / PAID: β‘€ HF Inference Endpoints (production hosting di HF) β†’ Deploy model ke dedicated server di HF β†’ Auto-scaling, GPU, monitoring β†’ Mulai ~$0.06/jam (CPU) sampai $4.50/jam (A100) β†’ Bagus untuk: production API tanpa manage server β‘₯ Self-Hosting (server Anda sendiri / cloud) β†’ Download model β†’ deploy di AWS/GCP/Azure/VPS β†’ Full control: Docker, Kubernetes, custom infra β†’ Biaya: tergantung server ($5-$1000+/bulan) β†’ Bagus untuk: enterprise, data privacy, custom scaling

β‘  Google Colab β€” REKOMENDASI #1 untuk Belajar

β‘  Google Colab β€” #1 RECOMMENDATION for Learning

Google Colab adalah cara termudah dan tercepat untuk mulai menggunakan Hugging Face. Anda tidak perlu install apapun di komputer β€” cukup buka browser, tulis kode Python, dan jalankan di GPU gratis. Seluruh seri ini bisa diikuti 100% di Colab.

Google Colab is the easiest and fastest way to start using Hugging Face. You don't need to install anything on your computer β€” just open a browser, write Python code, and run it on a free GPU. This entire series can be followed 100% on Colab.

Google Colab β€” Setup dalam 30 Detik (python)
# ===========================
# 1. Buka colab.research.google.com
# 2. Runtime β†’ Change runtime type β†’ GPU (T4)
# 3. Jalankan cell berikut:
# ===========================

# Install/update HF libraries (sudah pre-installed, tapi update)
!pip install -q transformers datasets accelerate evaluate

# Verify GPU
import torch
print(f"GPU: {torch.cuda.get_device_name(0)}")
# GPU: Tesla T4

# Test pipeline
from transformers import pipeline
classifier = pipeline("sentiment-analysis", device=0)  # GPU!
print(classifier("Hugging Face is amazing!"))
# [{'label': 'POSITIVE', 'score': 0.9998}]

# βœ… Selesai! Siap fine-tuning BERT dengan GPU gratis!
# Colab T4 = 16GB VRAM β†’ cukup untuk BERT, DistilBERT, RoBERTa
# Tidak cukup untuk: fine-tuning LLaMA 7B+ atau training Stable Diffusion (butuh A100)

β‘‘ Komputer Lokal β€” Untuk Development Sehari-hari

β‘‘ Local Computer β€” For Daily Development

Terminal β€” Local Setup (bash)
# ===========================
# Setup di laptop/desktop Anda
# ===========================

# 1. Buat virtual environment (recommended!)
python -m venv hf-env
source hf-env/bin/activate  # Linux/Mac
# hf-env\Scripts\activate    # Windows

# 2. Install PyTorch (pilih sesuai GPU Anda)
# CPU only:
pip install torch torchvision torchaudio
# NVIDIA GPU (CUDA 12.x):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# 3. Install Hugging Face stack
pip install transformers datasets accelerate evaluate

# 4. Test
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('Hello!'))"

# ===========================
# Di mana model disimpan?
# ===========================
# Model di-download ke cache folder:
# Linux/Mac: ~/.cache/huggingface/hub/
# Windows:   C:\Users\<username>\.cache\huggingface\hub\
#
# BERT base: ~420 MB
# DistilBERT: ~250 MB
# GPT-2 small: ~550 MB
# LLaMA 3.2 1B: ~2.5 GB
# LLaMA 3.2 8B: ~16 GB
#
# Pertama kali download β†’ lambat
# Kedua kali β†’ instant (cached!)
#
# Hapus cache: rm -rf ~/.cache/huggingface/hub/
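The same cache folder can be inspected from Python with huggingface_hub's built-in scanner. A small sketch (the manual path check is an approximation of the default cache location; it prints nothing useful until you have downloaded at least one model):

```python
from pathlib import Path
from huggingface_hub import scan_cache_dir

# Default cache location on Linux/Mac (see the comments above)
cache_dir = Path.home() / ".cache" / "huggingface" / "hub"
if cache_dir.exists():
    cache = scan_cache_dir()
    print(f"Total cache: {cache.size_on_disk / 1e6:.0f} MB")
    for repo in cache.repos:
        print(f"  {repo.repo_id}: {repo.size_on_disk / 1e6:.0f} MB")
else:
    print("Cache folder not created yet β€” nothing downloaded")
```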

β‘’ HF Inference API β€” Pakai Model Tanpa Download

β‘’ HF Inference API β€” Use Models Without Downloading

Tidak mau download model besar ke komputer? Gunakan Inference API β€” kirim request HTTP ke server HF, mereka yang jalankan model. Gratis untuk prototyping (rate-limited).

Don't want to download large models to your computer? Use the Inference API β€” send HTTP requests to HF servers, they run the model. Free for prototyping (rate-limited).

hf_inference_api.py β€” Serverless Inference (python)
import requests

# ===========================
# Method 1: Direct HTTP request (no library needed!)
# ===========================
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer hf_YOUR_TOKEN_HERE"}
# Get free token: huggingface.co/settings/tokens

response = requests.post(API_URL, headers=headers,
    json={"inputs": "I love this product!"})
print(response.json())
# [[{'label': 'POSITIVE', 'score': 0.9998}]]

# ===========================
# Method 2: huggingface_hub library (easier)
# ===========================
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_YOUR_TOKEN")

# Text classification
result = client.text_classification("I love this!")
print(result)  # [TextClassificationOutput(label='POSITIVE', score=0.9998)]

# Text generation
result = client.text_generation(
    "The meaning of life is",
    model="gpt2",
    max_new_tokens=50
)
print(result)

# Translation
result = client.translation("I am learning AI",
    model="Helsinki-NLP/opus-mt-en-id")
print(result)  # "Saya sedang belajar AI"

# ===========================
# Kapan pakai Inference API?
# ===========================
# βœ… Prototyping cepat (tidak perlu GPU lokal)
# βœ… Demo kecil (< 1000 requests/hari)
# βœ… Test model baru sebelum download
# ❌ Training / fine-tuning (hanya inference!)
# ❌ Production (rate limited, cold starts)
# ❌ Data sensitif (data dikirim ke server HF)

β‘£ HF Spaces β€” Buat Demo App Gratis

β‘£ HF Spaces β€” Build Free Demo Apps

Spaces = hosting gratis untuk demo ML app. Anda bisa membuat app dengan Gradio atau Streamlit, push ke HF, dan mendapat URL publik. Sempurna untuk portfolio dan sharing.

Spaces = free hosting for ML demo apps. You can build apps with Gradio or Streamlit, push to HF, and get a public URL. Perfect for portfolios and sharing.

app.py β€” Gradio Demo App, upload ke HF Spaces (python)
import gradio as gr
from transformers import pipeline

# Load model
classifier = pipeline("sentiment-analysis")

# Define interface
def analyze(text):
    result = classifier(text)[0]
    return f"{result['label']} ({result['score']:.1%})"

# Create Gradio app
demo = gr.Interface(
    fn=analyze,
    inputs=gr.Textbox(placeholder="Type your text here..."),
    outputs="text",
    title="πŸ€— Sentiment Analyzer",
    description="Analyze sentiment of any English text",
)
demo.launch()

# ===========================
# Deploy ke HF Spaces:
# 1. Buat repo di huggingface.co/new-space
# 2. Pilih "Gradio" sebagai SDK
# 3. Upload app.py + requirements.txt
# 4. Otomatis deploy β†’ dapat URL publik: huggingface.co/spaces/username/space-name
# 5. GRATIS untuk CPU! GPU mulai $0.60/jam
# ===========================
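A Space also needs a requirements.txt next to app.py, listing the pip dependencies (gradio itself is provided by the Spaces SDK). A minimal one for the demo above might be:

```text
transformers
torch
```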

β‘€ & β‘₯ Production Deployment β€” Inference Endpoints & Self-Hosting

β‘€ & β‘₯ Production Deployment β€” Inference Endpoints & Self-Hosting

production_options.py β€” Production Deployment (python)
# ===========================
# Option 5: HF Inference Endpoints (managed hosting)
# β†’ huggingface.co/inference-endpoints
# ===========================
# 1. Pilih model dari Hub
# 2. Pilih hardware (CPU/GPU/A100)
# 3. Pilih region (US, EU, Asia)
# 4. Deploy β†’ dapat production API URL
# 5. Auto-scaling, monitoring, HTTPS included
# 
# Pricing:
# CPU (2 vCPU):     ~$0.06/jam  (~$43/bulan)
# GPU T4 (16GB):    ~$0.60/jam  (~$432/bulan)
# GPU A10G (24GB):  ~$1.30/jam  (~$936/bulan)
# GPU A100 (80GB):  ~$4.50/jam  (~$3,240/bulan)
# 
# Best for: production API tanpa manage infrastructure

# ===========================
# Option 6: Self-Hosting (Docker on your server)
# ===========================
# A. Simple: FastAPI + model
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()
classifier = pipeline("sentiment-analysis", device=0)

@app.post("/predict")
async def predict(text: str):
    result = classifier(text)
    return result

# uvicorn app:app --host 0.0.0.0 --port 8000
# Deploy dengan Docker β†’ AWS EC2, GCP VM, DigitalOcean, dll.

# B. Optimized: Text Generation Inference (TGI)
# Docker container dari HF untuk LLM serving
# docker run --gpus all -p 8080:80 \
#   ghcr.io/huggingface/text-generation-inference \
#   --model-id meta-llama/Llama-3.2-1B
# 
# Optimized: continuous batching, flash attention, quantization
# Best for: high-throughput LLM serving

# C. vLLM (alternative to TGI)
# pip install vllm
# python -m vllm.entrypoints.openai.api_server \
#   --model meta-llama/Llama-3.2-1B --port 8000
# β†’ OpenAI-compatible API for any HF model!
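Once option 6A is running under uvicorn, any HTTP client can call it. A quick sketch with requests (note that the /predict route above binds its bare `text: str` argument as a query parameter):

```python
import requests

# Call the self-hosted FastAPI service from option 6A
# (assumes `uvicorn app:app --port 8000` is already running)
try:
    resp = requests.post(
        "http://localhost:8000/predict",
        params={"text": "Self-hosting works great!"},  # query param, not JSON body
        timeout=5,
    )
    print(resp.json())
except requests.exceptions.ConnectionError:
    print("Server not running β€” start it with: uvicorn app:app --port 8000")
```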

πŸŽ“ Rekomendasi Berdasarkan Situasi:
Belajar / ikut seri ini: β†’ β‘  Google Colab (gratis, GPU T4, zero setup) ⭐
Development sehari-hari: β†’ β‘‘ Lokal + Colab untuk training berat
Demo / portfolio: β†’ β‘£ HF Spaces (Gradio app, URL publik gratis)
Prototyping cepat: β†’ β‘’ Inference API (HTTP request, tanpa download)
Production API (startup): β†’ β‘€ Inference Endpoints (managed, auto-scale)
Production API (enterprise): β†’ β‘₯ Self-hosting Docker/Kubernetes (full control)

Penting: Hugging Face Hub = "tempat model disimpan" (seperti GitHub). Model di-download dari Hub ke tempat Anda menjalankannya (Colab, laptop, server). HF Hub BUKAN tempat menjalankan kode β€” kode berjalan di device Anda!

πŸŽ“ Recommendation Based on Situation:
Learning / following this series: β†’ β‘  Google Colab (free, T4 GPU, zero setup) ⭐
Daily development: β†’ β‘‘ Local + Colab for heavy training
Demo / portfolio: β†’ β‘£ HF Spaces (Gradio app, free public URL)
Quick prototyping: β†’ β‘’ Inference API (HTTP request, no download)
Production API (startup): β†’ β‘€ Inference Endpoints (managed, auto-scale)
Production API (enterprise): β†’ β‘₯ Self-hosting Docker/Kubernetes (full control)

Important: Hugging Face Hub = "where models are stored" (like GitHub). Models are downloaded FROM the Hub TO wherever you run them (Colab, laptop, server). The Hub is NOT where code runs β€” code runs on YOUR device!
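That download step can also be done explicitly with huggingface_hub's snapshot_download. A sketch using hf-internal-testing/tiny-random-bert, a few-MB test model, so the example stays fast:

```python
from huggingface_hub import snapshot_download

# Pull an entire model repo from the Hub into the local cache
local_path = snapshot_download("hf-internal-testing/tiny-random-bert")
print(local_path)
# a folder under ~/.cache/huggingface/hub/ containing config, weights, tokenizer files
```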

Flow: Bagaimana Model Sampai ke Anda / How Models Reach You
- Hugging Face Hub (huggingface.co): menyimpan 500k+ models (bert-base, gpt2, llama-3.2, whisper, ...)
- Download (pertama kali, mis. ~420MB untuk BERT; selanjutnya cached) β†’ ke environment Anda: β‘  Google Colab Β· β‘‘ Laptop/Desktop Β· β‘’ AWS/GCP Server Β· β‘£ HF Spaces Β· β‘€ HF Endpoints Β· β‘₯ Docker container
- Kode Anda (from transformers import pipeline; pipe = pipeline("sentiment-analysis"); pipe("Hello!")) β†’ inference berjalan di device Anda, bukan di website HF
- Hub = storage (seperti GitHub untuk code); inference = di device Anda (Colab, laptop, server)
- Exception: Inference API (β‘’) & Endpoints (β‘€) β†’ HF servers yang menjalankan model
Cara | Biaya | GPU | Setup | Best For
β‘  Colab | Gratis | T4 (16GB) gratis | 0 menit | Belajar, fine-tuning BERT ⭐
β‘‘ Lokal | Listrik | GPU Anda (jika ada) | 10 menit | Development harian
β‘’ Inference API | Gratis (rate-limit) | HF servers | 0 menit | Prototyping, demo kecil
β‘£ Spaces | Gratis (CPU) | Opsional ($0.60/jam) | 5 menit | Demo apps, portfolio
β‘€ Endpoints | $0.06-4.50/jam | T4/A10/A100 | 5 menit | Production API
β‘₯ Self-host | $5-1000+/bln | Your choice | 30-60 menit | Enterprise, privacy

Method | Cost | GPU | Setup | Best For
β‘  Colab | Free | T4 (16GB) free | 0 min | Learning, BERT fine-tuning ⭐
β‘‘ Local | Electricity | Your GPU (if any) | 10 min | Daily development
β‘’ Inference API | Free (rate-limited) | HF servers | 0 min | Prototyping, small demos
β‘£ Spaces | Free (CPU) | Optional ($0.60/hr) | 5 min | Demo apps, portfolio
β‘€ Endpoints | $0.06-4.50/hr | T4/A10/A100 | 5 min | Production API
β‘₯ Self-host | $5-1000+/mo | Your choice | 30-60 min | Enterprise, privacy

πŸŽ‰ TL;DR untuk Pemula:
1. Buka colab.research.google.com
2. Aktifkan GPU: Runtime β†’ Change runtime type β†’ T4 GPU
3. Ketik: !pip install -q transformers datasets accelerate
4. Ketik: from transformers import pipeline
5. Selesai! Anda sudah bisa menjalankan BERT, GPT-2, Whisper, dll. di cloud gratis.
Model di-download dari Hub ke Colab server β†’ berjalan di GPU T4 Colab β†’ Anda dapat hasil di notebook. Tidak perlu install apapun di laptop Anda.

πŸŽ‰ TL;DR for Beginners:
1. Open colab.research.google.com
2. Enable GPU: Runtime β†’ Change runtime type β†’ T4 GPU
3. Type: !pip install -q transformers datasets accelerate
4. Type: from transformers import pipeline
5. Done! You can now run BERT, GPT-2, Whisper, etc. on a free cloud GPU.
Models are downloaded from the Hub to Colab server β†’ run on Colab's T4 GPU β†’ you get results in the notebook. No need to install anything on your laptop.

πŸš€

3. Pipeline API β€” Inference 1 Baris untuk 20+ Tugas

3. Pipeline API β€” 1-Line Inference for 20+ Tasks

API paling powerful di dunia ML: satu baris kode = download model + tokenize + inference + postprocess
The most powerful API in ML: one line of code = download model + tokenize + inference + postprocess

Pipeline adalah API tertinggi (highest-level) di Hugging Face. Satu function call melakukan segalanya: download model dari Hub, tokenize input, jalankan inference, dan format output. Anda bahkan tidak perlu tahu arsitektur model yang digunakan.

Pipeline is the highest-level API in Hugging Face. One function call does everything: download model from Hub, tokenize input, run inference, and format output. You don't even need to know the model architecture being used.

01_pipeline_basics.py β€” Pipeline Magic ✨ (python)
from transformers import pipeline

# ===========================
# 1. Sentiment Analysis β€” one line!
# ===========================
classifier = pipeline("sentiment-analysis")
# First run: downloads model (~270MB) β€” cached for future use

result = classifier("I absolutely love this product! Best purchase ever.")
print(result)
# [{'label': 'POSITIVE', 'score': 0.9998}]

# Multiple texts at once (batched!)
results = classifier([
    "This movie was fantastic!",
    "Terrible experience, waste of money.",
    "It was okay, nothing special."
])
for r in results:
    print(f"  {r['label']:8s} ({r['score']:.1%})")
# POSITIVE (99.9%)
# NEGATIVE (99.8%)
# POSITIVE (63.1%)  ← uncertain β†’ neutral-ish

# ===========================
# 2. Specify a different model
# ===========================
classifier_multi = pipeline(
    "sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment"
)
# Now supports 6 languages! (EN, DE, NL, ES, FR, IT)
result = classifier_multi("Film ini sangat bagus!")  # Indonesian!
print(result)
# [{'label': '5 stars', 'score': 0.73}]

# ===========================
# 3. GPU acceleration
# ===========================
classifier_gpu = pipeline("sentiment-analysis", device=0)  # GPU:0
# device=0 β†’ first GPU
# device=-1 β†’ CPU (default)
# device="mps" β†’ Apple Silicon

# ===========================
# 4. How pipeline() works internally
# ===========================
# pipeline("sentiment-analysis") is equivalent to:
# 1. tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
# 2. model = AutoModelForSequenceClassification.from_pretrained("...")
# 3. inputs = tokenizer(text, return_tensors="pt")
# 4. outputs = model(**inputs)
# 5. predictions = softmax(outputs.logits)
# 6. label = model.config.id2label[predicted_class]
# Pipeline wraps ALL of this in one call!

πŸŽ“ Pipeline: Apa yang Terjadi di Balik Layar?
Satu panggilan pipeline("sentiment-analysis")("text") melakukan 6 langkah:
1. Download model dari Hugging Face Hub (pertama kali saja, lalu di-cache)
2. Tokenize input β€” text β†’ subword tokens β†’ integer IDs + attention mask
3. Forward pass β€” jalankan model Transformer (BERT/DistilBERT/etc.)
4. Post-process β€” logits β†’ softmax β†’ probabilities
5. Map to labels β€” index β†’ "POSITIVE"/"NEGATIVE"
6. Format output β€” return list of dicts dengan label dan score
Anda akan belajar SEMUA langkah ini secara manual di section 7-9!

πŸŽ“ Pipeline: What Happens Behind the Scenes?
One call to pipeline("sentiment-analysis")("text") performs 6 steps:
1. Download model from Hugging Face Hub (first time only, then cached)
2. Tokenize input β€” text β†’ subword tokens β†’ integer IDs + attention mask
3. Forward pass β€” run Transformer model (BERT/DistilBERT/etc.)
4. Post-process β€” logits β†’ softmax β†’ probabilities
5. Map to labels β€” index β†’ "POSITIVE"/"NEGATIVE"
6. Format output β€” return list of dicts with label and score
You'll learn ALL of these steps manually in sections 7-9!
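The six steps, done by hand β€” a sketch of what pipeline() wraps, using the same checkpoint pipeline("sentiment-analysis") loads by default:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)                  # step 1 (downloads, then cached)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("I love this!", return_tensors="pt")          # step 2: tokenize
with torch.no_grad():
    outputs = model(**inputs)                                    # step 3: forward pass
probs = torch.softmax(outputs.logits, dim=-1)                    # step 4: logits β†’ probabilities
pred = probs.argmax(dim=-1).item()
label = model.config.id2label[pred]                              # step 5: index β†’ label
print({"label": label, "score": probs[0, pred].item()})          # step 6: format output
# {'label': 'POSITIVE', 'score': 0.999...}
```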

πŸ“

4. Pipeline NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization

4. Pipeline NLP Tasks β€” Sentiment, NER, QA, Translation, Summarization

Satu API untuk semua tugas NLP β€” ganti nama task, dapat model baru
One API for all NLP tasks β€” change the task name, get a new model
02_nlp_pipelines.py β€” Semua NLP Pipeline (python)
from transformers import pipeline

# ===========================
# 1. Named Entity Recognition (NER)
# Identifikasi entitas: orang, tempat, organisasi
# ===========================
ner = pipeline("ner", grouped_entities=True)
result = ner("Joko Widodo visited Google headquarters in Mountain View, California.")
for entity in result:
    print(f"  {entity['word']:20s} β†’ {entity['entity_group']:5s} ({entity['score']:.1%})")
# Joko Widodo          β†’ PER   (99.8%)
# Google               β†’ ORG   (99.6%)
# Mountain View        β†’ LOC   (99.9%)
# California           β†’ LOC   (99.9%)

# ===========================
# 2. Question Answering (extractive)
# Jawab pertanyaan berdasarkan konteks
# ===========================
qa = pipeline("question-answering")
result = qa(
    question="What is the capital of France?",
    context="France is a country in Europe. Its capital is Paris, a city known for the Eiffel Tower."
)
print(f"Answer: {result['answer']} (score: {result['score']:.1%})")
# Answer: Paris (score: 98.7%)

# ===========================
# 3. Text Summarization
# ===========================
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
article = """
Hugging Face has raised $235 million in a Series D funding round, 
bringing the company's valuation to $4.5 billion. The round was led 
by Salesforce Ventures, with participation from Google, Amazon, NVIDIA, 
Intel, AMD, and Qualcomm. The company plans to use the funding to 
expand its open-source AI platform and hire more researchers.
"""
summary = summarizer(article, max_length=50, min_length=20)
print(summary[0]['summary_text'])
# "Hugging Face raised $235M at $4.5B valuation, led by Salesforce..."

# ===========================
# 4. Translation
# ===========================
translator = pipeline("translation_en_to_fr")
result = translator("Hugging Face is the best AI platform.")
print(result[0]['translation_text'])
# "Hugging Face est la meilleure plateforme d'IA."

# Multi-language: Helsinki-NLP models
id_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-id-en")
result = id_to_en("Saya sedang belajar kecerdasan buatan.")
print(result[0]['translation_text'])
# "I'm learning artificial intelligence."

# ===========================
# 5. Text Generation (GPT-style)
# ===========================
generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Artificial intelligence will",
    max_length=50,
    num_return_sequences=2,    # generate 2 variations
    temperature=0.7,           # creativity (0=deterministic, 1=random)
    do_sample=True
)
for i, r in enumerate(result):
    print(f"  Variation {i+1}: {r['generated_text'][:80]}...")

# ===========================
# 6. Fill-Mask (BERT-style)
# ===========================
fill = pipeline("fill-mask", model="bert-base-uncased")
# Note: the default fill-mask model (distilroberta-base) uses <mask>,
# while BERT models use [MASK] β€” match the mask token to the model!
results = fill("The capital of Indonesia is [MASK].")
for r in results[:3]:
    print(f"  {r['token_str']:10s} ({r['score']:.1%})")
# Jakarta    (92.3%)
# Bandung    (2.1%)
# Surabaya   (1.4%)
πŸ–ΌοΈ

5. Pipeline Beyond NLP β€” Image, Audio, Zero-Shot

5. Pipeline Beyond NLP β€” Image, Audio, Zero-Shot

Hugging Face bukan hanya untuk teks β€” juga gambar, audio, dan multimodal
Hugging Face isn't just for text β€” also images, audio, and multimodal
03_beyond_nlp.py β€” Image, Audio & Zero-Shot Pipelines (python)
from transformers import pipeline

# ===========================
# 1. Image Classification
# ===========================
img_classifier = pipeline("image-classification")
result = img_classifier("https://upload.wikimedia.org/wikipedia/commons/4/4d/Cat_November_2010-1a.jpg")
for r in result[:3]:
    print(f"  {r['label']:30s} ({r['score']:.1%})")
# tabby, tabby cat              (43.2%)
# Egyptian cat                  (22.1%)
# tiger cat                     (13.8%)

# ===========================
# 2. Object Detection
# ===========================
detector = pipeline("object-detection")
results = detector("https://example.com/street_scene.jpg")
for r in results:
    print(f"  {r['label']:10s} ({r['score']:.1%}) at {r['box']}")
# car        (97.2%) at {'xmin': 12, 'ymin': 50, ...}
# person     (95.1%) at {'xmin': 200, 'ymin': 30, ...}

# ===========================
# 3. Zero-Shot Classification (NO TRAINING NEEDED!)
# Classify text into ANY categories β€” even ones the model never saw!
# ===========================
zero_shot = pipeline("zero-shot-classification")
result = zero_shot(
    "Harga saham Tesla naik 15% setelah pengumuman earnings Q4.",
    candidate_labels=["finance", "sports", "technology", "politics", "health"]
)
for label, score in zip(result['labels'], result['scores']):
    print(f"  {label:12s}: {score:.1%}")
# finance     : 78.3%
# technology  : 15.2%
# politics    :  3.8%
# sports      :  1.5%
# health      :  1.2%

# ===========================
# 4. Automatic Speech Recognition
# ===========================
# asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")
# result = asr("audio_file.mp3")
# print(result["text"])  # "Hello, how are you today?"
# Whisper supports 99 languages including Indonesian!

# ===========================
# 5. Text-to-Speech
# ===========================
# tts = pipeline("text-to-speech", model="microsoft/speecht5_tts")
# audio = tts("Hello, welcome to the Hugging Face tutorial!")
# # Returns audio array that can be saved as .wav


πŸŽ‰ Zero-Shot Classification β€” Superpower!
Zero-shot = classification without any training. You just provide desired categories as text, and the model matches input to categories using natural language understanding. Great for: rapid prototyping, label discovery, classification with frequently changing categories.
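Under the hood, the zero-shot pipeline frames classification as NLI: each candidate label is turned into a hypothesis via a template (the transformers default is "This example is {}."), and an entailment model scores the premise against each hypothesis. The pair construction can be sketched in plain Python (`build_nli_pairs` is a hypothetical helper, not part of the library):

```python
# Hypothetical helper mirroring how zero-shot classification is framed as NLI.
def build_nli_pairs(premise, candidate_labels,
                    hypothesis_template="This example is {}."):
    """One (premise, hypothesis) pair per candidate label; the NLI model
    then scores how strongly the premise entails each hypothesis."""
    return [(premise, hypothesis_template.format(label))
            for label in candidate_labels]

pairs = build_nli_pairs(
    "Harga saham Tesla naik 15% setelah pengumuman earnings Q4.",
    ["finance", "sports", "technology"],
)
for _, hypothesis in pairs:
    print(hypothesis)
# This example is finance.
# This example is sports.
# This example is technology.
```

The per-label entailment scores are then normalized into the probabilities the pipeline returns, which is why any label you can phrase in natural language works without training.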

Pipeline Task                | Description                 | Default Model    | Input β†’ Output
-----------------------------|-----------------------------|------------------|-------------------------------
sentiment-analysis           | Positive/negative sentiment | DistilBERT SST-2 | text β†’ label + score
ner                          | Named Entity Recognition    | BERT NER         | text β†’ entities + types
question-answering           | Answer from context         | DistilBERT SQuAD | question + context β†’ answer
summarization                | Summarize long text         | BART CNN         | long text β†’ summary
translation_xx_to_yy         | Translation                 | Helsinki-NLP     | language A text β†’ language B
text-generation              | Generate text (GPT-style)   | GPT-2            | prompt β†’ continuation
fill-mask                    | Predict missing word        | BERT base        | text + [MASK] β†’ word
zero-shot-classification     | Classify without training   | BART MNLI        | text + labels β†’ scores
image-classification         | Classify images             | ViT ImageNet     | image β†’ label + score
object-detection             | Detect objects              | DETR             | image β†’ boxes + labels
automatic-speech-recognition | Speech to text              | Whisper          | audio β†’ text
🌐

6. Model Hub β€” 500k+ Models, Choosing the Right One

huggingface.co/models β€” filter by task, language, size, license

With 500k+ models on the Hub, how do you choose the right one? Use the filters: task (sentiment, NER, etc.), language (Indonesian, English), library (PyTorch, TensorFlow), dataset (what data the model was trained on), and license (open vs. restricted). Sort by downloads or likes to surface the most popular models.

04_model_hub.py β€” Browse & Download Models (Python)
from huggingface_hub import HfApi

# ===========================
# 1. Search models programmatically
# ===========================
api = HfApi()
models = api.list_models(
    filter="text-classification",
    sort="downloads",
    direction=-1,
    limit=5
)
for m in models:
    print(f"  {m.id:50s} ↓{m.downloads:>10,}")
# distilbert-base-uncased-finetuned-sst-2-english  ↓ 85,432,100
# nlptown/bert-base-multilingual-uncased-sentiment  ↓ 12,345,000
# cardiffnlp/twitter-roberta-base-sentiment-latest  ↓  8,765,000

# ===========================
# 2. Indonesian NLP models
# ===========================
id_models = api.list_models(
    filter="text-classification",
    search="indonesian",
    sort="downloads",
    direction=-1,
    limit=5
)
for m in id_models:
    print(f"  {m.id}")
# indobenchmark/indobert-base-p1
# indolem/indobert-base-uncased
# cahya/bert-base-indonesian-522M

# ===========================
# 3. Model naming convention
# ===========================
# Format: organization/model-name
# Examples:
# google-bert/bert-base-uncased         ← Google's BERT
# meta-llama/Llama-3.2-1B              ← Meta's LLaMA
# openai-community/gpt2                ← OpenAI's GPT-2
# facebook/bart-large-cnn              ← Meta's BART
# sentence-transformers/all-MiniLM-L6-v2 ← sentence embeddings

# ===========================
# 4. Download model manually (for offline use)
# ===========================
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Downloads to ~/.cache/huggingface/ (~420MB for BERT base)

# Save locally
model.save_pretrained("./my_bert")
tokenizer.save_pretrained("./my_bert")

# Load from local
model = AutoModel.from_pretrained("./my_bert")
tokenizer = AutoTokenizer.from_pretrained("./my_bert")


πŸŽ“ Tips for Choosing Models:
Prototyping: Start with default pipeline (usually DistilBERT β€” fast and good).
Production English: roberta-base or deberta-v3-base (more accurate than BERT).
Production Indonesian: indobert-base or cahya/bert-base-indonesian.
Multilingual: xlm-roberta-base (100+ languages including Indonesian).
Speed priority: DistilBERT (40% faster, 97% of BERT accuracy).
LLM/Chat: meta-llama/Llama-3.2, Qwen/Qwen2.5, mistralai/Mistral.

πŸ”§

7. Auto Classes β€” AutoModel, AutoTokenizer, AutoConfig

One universal API that automatically selects the right class based on the model name

Auto Classes are a brilliant abstraction from Hugging Face: you don't need to know whether a model is BERT, RoBERTa, GPT-2, or T5 β€” just use AutoModel and it automatically instantiates the right class. This lets you swap models without changing your code.

05_auto_classes.py β€” Auto Classes Deep Dive (Python)
from transformers import (
    AutoModel, AutoTokenizer, AutoConfig,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoModelForQuestionAnswering,
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
)

# ===========================
# 1. AutoTokenizer β€” universal tokenizer loader
# ===========================
# Doesn't matter if model uses WordPiece, BPE, or SentencePiece!
tokenizer_bert = AutoTokenizer.from_pretrained("bert-base-uncased")      # WordPiece
tokenizer_gpt = AutoTokenizer.from_pretrained("gpt2")                    # BPE
tokenizer_t5 = AutoTokenizer.from_pretrained("google-t5/t5-small")      # SentencePiece
tokenizer_llama = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")  # BPE

# All have the SAME interface!
for name, tok in [("BERT", tokenizer_bert), ("GPT-2", tokenizer_gpt), ("T5", tokenizer_t5)]:
    encoded = tok("Hello world", return_tensors="pt")
    print(f"  {name:6s}: {encoded['input_ids'][0].tolist()}")
# BERT  : [101, 7592, 2088, 102]           ← [CLS] hello world [SEP]
# GPT-2 : [15496, 995]                      ← Hello Δ world (no special tokens)
# T5    : [8774, 296, 1]                    ← ▁Hello ▁world </s>

# ===========================
# 2. AutoModel β€” base model (no head)
# ===========================
model = AutoModel.from_pretrained("bert-base-uncased")
print(f"Type: {type(model).__name__}")  # BertModel
print(f"Params: {model.num_parameters():,}")  # 109,482,240
# Output: last_hidden_state (batch, seq_len, hidden_size)
# β†’ Raw embeddings, NO classification head

# ===========================
# 3. AutoModelForSequenceClassification β€” with classifier head
# ===========================
model_cls = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3  # positive, negative, neutral
)
print(f"Type: {type(model_cls).__name__}")  # BertForSequenceClassification
# Output: logits (batch, num_labels) β†’ ready for classification!

# ===========================
# 4. Task-specific Auto Classes
# ===========================
# AutoModelForSequenceClassification  β†’ sentiment, topic classification
# AutoModelForTokenClassification     β†’ NER, POS tagging
# AutoModelForQuestionAnswering       β†’ extractive QA
# AutoModelForCausalLM                β†’ text generation (GPT-style)
# AutoModelForSeq2SeqLM               β†’ translation, summarization (T5-style)
# AutoModelForMaskedLM                β†’ fill-mask (BERT-style)
# AutoModelForImageClassification     β†’ image classification (ViT)
# AutoModelForObjectDetection         β†’ object detection (DETR)

# ===========================
# 5. AutoConfig β€” model configuration
# ===========================
config = AutoConfig.from_pretrained("bert-base-uncased")
print(f"Hidden size: {config.hidden_size}")       # 768
print(f"Num layers:  {config.num_hidden_layers}")  # 12
print(f"Num heads:   {config.num_attention_heads}") # 12
print(f"Vocab size:  {config.vocab_size}")         # 30522
Auto Classes β€” One API for All Models

AutoTokenizer.from_pretrained("model_name")
β”‚
β”œβ”€β”€ bert-base-uncased      β†’ BertTokenizerFast   (WordPiece)
β”œβ”€β”€ gpt2                   β†’ GPT2TokenizerFast   (BPE)
β”œβ”€β”€ google-t5/t5-small     β†’ T5TokenizerFast     (SentencePiece)
└── meta-llama/Llama-3.2   β†’ LlamaTokenizerFast  (BPE)

AutoModelForSequenceClassification.from_pretrained("model_name")
β”‚
β”œβ”€β”€ bert-base-uncased      β†’ BertForSequenceClassification
β”œβ”€β”€ roberta-base           β†’ RobertaForSequenceClassification
β”œβ”€β”€ distilbert-base        β†’ DistilBertForSequenceClassification
└── xlm-roberta-base       β†’ XLMRobertaForSequenceClassification

Same code, different model β€” just change the model name string!
model_name = "bert-base-uncased"   # β†’ BERT
model_name = "roberta-base"        # β†’ RoBERTa (same code!)
model_name = "xlm-roberta-base"    # β†’ XLM-R multilingual (same code!)
βœ‚οΈ

8. Deep Dive: Tokenization β€” WordPiece, BPE, SentencePiece

Understanding EVERY step: text β†’ tokens β†’ IDs β†’ attention mask β†’ model input
06_tokenization.py β€” Tokenization Deep Dive (Python)
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# ===========================
# 1. Step by step tokenization
# ===========================
text = "Hugging Face's tokenizers are incredibly fast!"

# Step 1: Tokenize (split into subwords)
tokens = tokenizer.tokenize(text)
print(f"Tokens: {tokens}")
# ['hugging', 'face', "'", 's', 'token', '##ize', '##rs', 'are', 'incredibly', 'fast', '!']
# Note: "tokenizers" β†’ ["token", "##ize", "##rs"] (WordPiece subwords!)
# "##" prefix means "continuation of previous word"

# Step 2: Convert to IDs
ids = tokenizer.convert_tokens_to_ids(tokens)
print(f"IDs: {ids}")
# [17662, 2227, 1005, 1055, 19204, 4697, 2869, 2024, 12978, 3435, 999]

# Step 3: Add special tokens + create attention mask
encoded = tokenizer(text, return_tensors="pt")
print(f"input_ids:      {encoded['input_ids'][0].tolist()}")
print(f"attention_mask: {encoded['attention_mask'][0].tolist()}")
print(f"token_type_ids: {encoded['token_type_ids'][0].tolist()}")
# input_ids:      [101, 17662, 2227, ..., 999, 102]    ← [CLS] ... [SEP]
# attention_mask: [1, 1, 1, ..., 1, 1]                  ← all real tokens
# token_type_ids: [0, 0, 0, ..., 0, 0]                  ← single sentence

# ===========================
# 2. Decode back to text
# ===========================
decoded = tokenizer.decode(encoded['input_ids'][0])
print(f"Decoded: {decoded}")
# "[CLS] hugging face's tokenizers are incredibly fast! [SEP]"

decoded_skip = tokenizer.decode(encoded['input_ids'][0], skip_special_tokens=True)
print(f"Clean:   {decoded_skip}")
# "hugging face's tokenizers are incredibly fast!"

# ===========================
# 3. Padding & Truncation
# ===========================
texts = ["Short text.", "This is a much longer sentence that has more words in it."]

# Without padding: different lengths β†’ can't batch!
for t in texts:
    enc = tokenizer(t)
    print(f"  Length: {len(enc['input_ids'])}")
# Length: 4
# Length: 14  ← different! Can't make a tensor

# With padding + truncation: same length β†’ can batch!
batch = tokenizer(texts,
    padding=True,           # pad shorter sequences
    truncation=True,         # truncate if too long
    max_length=128,          # max sequence length
    return_tensors="pt"     # return PyTorch tensors
)
print(f"Batch shape: {batch['input_ids'].shape}")
# Batch shape: torch.Size([2, 14])  ← padded to longest!
print(f"Attention mask: {batch['attention_mask'][0].tolist()}")
# [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# 1=real token, 0=padding β†’ model IGNORES padding!

# ===========================
# 4. Special tokens per model
# ===========================
print(f"BERT special tokens:")
print(f"  CLS: {tokenizer.cls_token} (ID: {tokenizer.cls_token_id})")  # [CLS] = 101
print(f"  SEP: {tokenizer.sep_token} (ID: {tokenizer.sep_token_id})")  # [SEP] = 102
print(f"  PAD: {tokenizer.pad_token} (ID: {tokenizer.pad_token_id})")  # [PAD] = 0
print(f"  UNK: {tokenizer.unk_token} (ID: {tokenizer.unk_token_id})")  # [UNK] = 100
print(f"  Vocab size: {tokenizer.vocab_size}")  # 30522

# ===========================
# 5. Sentence pairs (for NLI, QA, etc.)
# ===========================
encoded_pair = tokenizer(
    "What is the capital?",     # sentence A
    "The capital of France is Paris.",  # sentence B
    return_tensors="pt"
)
print(encoded_pair['token_type_ids'][0].tolist())
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
# 0=sentence A, 1=sentence B
# [CLS] What is the capital ? [SEP] The capital of France is Paris . [SEP]


πŸŽ“ WordPiece vs BPE vs SentencePiece:
WordPiece (BERT): Split unknown words into subwords. "tokenizers" β†’ ["token", "##ize", "##rs"]. ## prefix = continuation.
BPE (GPT-2, RoBERTa): Byte Pair Encoding β€” merge most frequent byte pairs. "lower" β†’ ["low", "er"]. Δ  prefix = new word start.
SentencePiece (T5, LLaMA): Language-agnostic, treats all input as byte sequence. ▁ = space/word boundary. Works for ALL languages without preprocessing.
You don't need to choose β€” AutoTokenizer automatically loads the right tokenizer for each model!
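To make the two main ideas concrete, here is a toy sketch of WordPiece-style greedy longest-match and one BPE training step. The mini-vocabulary and corpus are made up for illustration, not the real trained ones:

```python
from collections import Counter

# --- WordPiece-style greedy longest-match (toy vocabulary) ---
VOCAB = {"token", "##ize", "##rs", "fast", "##er"}

def wordpiece(word):
    """Greedily match the longest vocab piece from the left;
    continuation pieces carry the '##' prefix."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in VOCAB:
                pieces.append(piece)
                break
            end -= 1
        else:
            return ["[UNK]"]  # no piece matched β†’ unknown token
        start = end
    return pieces

print(wordpiece("tokenizers"))  # ['token', '##ize', '##rs']
print(wordpiece("faster"))      # ['fast', '##er']

# --- One BPE training step: find the most frequent adjacent pair ---
def most_frequent_pair(words):
    pairs = Counter()
    for symbols in words:
        pairs.update(zip(symbols, symbols[1:]))  # count adjacent symbol pairs
    return pairs.most_common(1)[0][0]

corpus = [list("hug"), list("pug"), list("hugs")]
print(most_frequent_pair(corpus))  # ('u', 'g') β†’ merged into new symbol 'ug'
```

Real BPE training repeats the merge step thousands of times, each merge adding one new symbol to the vocabulary; WordPiece training is similar but picks merges by likelihood gain rather than raw frequency.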

πŸ”¬

9. From Tokenizer to Model β€” Full Manual Forward Pass

Understanding what pipeline() does behind the scenes β€” step by step
07_manual_forward.py β€” Full Manual Inference (Python)
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# ===========================
# Step 1: Load tokenizer & model
# ===========================
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# ===========================
# Step 2: Tokenize input
# ===========================
text = "I absolutely love learning about Hugging Face!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
print(f"Input IDs shape: {inputs['input_ids'].shape}")
print(f"Tokens: {tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])}")
# ['[CLS]', 'i', 'absolutely', 'love', 'learning', 'about', 'hugging', 'face', '!', '[SEP]']

# ===========================
# Step 3: Forward pass (no gradient needed for inference!)
# ===========================
with torch.no_grad():  # disable gradient computation β†’ faster + less memory
    outputs = model(**inputs)

print(f"Output type: {type(outputs)}")
# SequenceClassifierOutput
print(f"Logits: {outputs.logits}")
# tensor([[-4.2532,  4.5687]])  ← raw scores (NOT probabilities!)

# ===========================
# Step 4: Post-process β€” logits β†’ probabilities
# ===========================
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(f"Probabilities: {probabilities}")
# tensor([[0.0001, 0.9999]])  ← [NEGATIVE, POSITIVE]

# ===========================
# Step 5: Map to label
# ===========================
predicted_class = torch.argmax(probabilities, dim=-1).item()
label = model.config.id2label[predicted_class]
confidence = probabilities[0][predicted_class].item()

print(f"\n🎯 Prediction: {label} ({confidence:.1%})")
# 🎯 Prediction: POSITIVE (99.99%)

# ===========================
# Compare with pipeline (should be identical!)
# ===========================
from transformers import pipeline
pipe = pipeline("sentiment-analysis", model=model_name)
print(f"Pipeline: {pipe(text)}")
# [{'label': 'POSITIVE', 'score': 0.9999}] ← identical! βœ“


πŸŽ‰ Now You Understand the Entire Flow!
Pipeline = Steps 1-5 above combined into one line. But understanding each step matters because: (1) you can customize preprocessing, (2) you can customize postprocessing, (3) you can debug issues, and (4) Fine-tuning (Pages 2-3) requires understanding tokenizer + model separately.
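As a sanity check on Step 4, the softmax can be reproduced in plain Python for the logits shown above:

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([-4.2532, 4.5687])  # the example logits from Step 3
print([round(p, 4) for p in probs])
# [0.0001, 0.9999]  ← matches the pipeline's POSITIVE score
```

The logit gap of about 8.8 is what drives the near-certain prediction: each unit of logit difference multiplies the odds by e.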

🎯

10. First Look: Fine-Tuning BERT β€” Page 2 Preview

Sneak peek: from pre-trained model to your custom classifier β€” in 20 lines
08_finetuning_preview.py β€” First Taste of Fine-Tuning πŸ”₯ (Python)
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# ===========================
# Fine-tune BERT on IMDB β€” PREVIEW (Page 2 = full version)
# ===========================

# 1. Load dataset
dataset = load_dataset("imdb")
print(dataset)
# DatasetDict({'train': 25000 rows, 'test': 25000 rows, 'unsupervised': 50000 rows})

# 2. Load tokenizer & model
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# 3. Tokenize dataset
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

# 4. Training arguments
args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    eval_strategy="epoch",
    learning_rate=2e-5,
    weight_decay=0.01,
    fp16=True,             # mixed precision!
)

# 5. Create Trainer & train!
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,
)

trainer.train()
# β†’ 93%+ accuracy on IMDB in ~15 minutes on Google Colab T4!
# Compare: BiLSTM from TF series = 87%. BERT = 93%+. That's the power!

# Page 2 will cover: full Trainer API, custom metrics, hyperparameter
# tuning, data collators, and pushing models to the Hub.


🎯 Preview: 93%+ IMDB Accuracy in 15 Minutes!
Compare with previous series:
β€’ NN Series (manual NumPy): ~80% (hundreds of lines, hours of training)
β€’ TF Series Page 5 (BiLSTM): ~87% (25 lines, 30 min training)
β€’ TF Series Page 6 (BERT TF Hub): ~95% (more complex setup)
β€’ Hugging Face (Trainer API): 93%+ (20 lines, 15 minutes!) πŸ†
Page 2 will cover this in depth β€” stay tuned!
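As a small preview of the custom metrics Page 2 covers: Trainer accepts a compute_metrics callable that maps (logits, labels) to a dict of scores. A numpy-only accuracy sketch, using made-up logits for the quick check:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Trainer calls this with a (logits, labels) tuple at evaluation time."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return {"accuracy": float((predictions == labels).mean())}

# Quick check with fake logits for 4 examples (the last one is misclassified):
logits = np.array([[0.1, 2.0], [1.5, 0.2], [0.0, 3.0], [2.0, 0.1]])
labels = np.array([1, 0, 1, 1])
print(compute_metrics((logits, labels)))  # {'accuracy': 0.75}
```

Pass it as Trainer(..., compute_metrics=compute_metrics) and the returned dict shows up in every evaluation log alongside the loss.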

πŸ“

11. Page 1 Summary

Everything we learned
Concept            | What It Is                     | Key Code
-------------------|--------------------------------|----------------------------------------
Pipeline           | 1-line inference for 20+ tasks | pipeline("sentiment-analysis")(text)
Model Hub          | 500k+ ready-to-download models | huggingface.co/models
AutoTokenizer      | Universal tokenizer loader     | AutoTokenizer.from_pretrained(name)
AutoModel          | Universal model loader         | AutoModelForXxx.from_pretrained(name)
Tokenization       | Text β†’ tokens β†’ IDs β†’ tensors  | tokenizer(text, return_tensors="pt")
Padding/Truncation | Fixed-length batching          | padding=True, truncation=True
Forward Pass       | model(**inputs) β†’ logits       | outputs = model(**inputs)
Post-process       | logits β†’ softmax β†’ label       | softmax(logits) β†’ argmax β†’ id2label
Zero-Shot          | Classify without training      | pipeline("zero-shot-classification")
Trainer (preview)  | Fine-tuning API                | Trainer(model, args, train_dataset)
πŸ“˜

Coming Next: Page 2 β€” Fine-Tuning BERT & Trainer API


Deep dive into fine-tuning! Page 2 covers: Datasets library (load, preprocess, tokenize), complete Trainer API (TrainingArguments, callbacks, logging), fine-tuning BERT/DistilBERT/RoBERTa for text classification, custom metrics (F1, precision, recall), data collator and dynamic padding, pushing models to Hugging Face Hub, and hyperparameter tuning. From IMDB sentiment to your own custom datasets!