Table of Contents – Page 1
- What Is a Neural Network? – An artificial brain in code
- Installation – Only NumPy needed
- Perceptron – The smallest unit of a neural network
- Activation Functions – Neuron decision gates
- Forward Propagation – Data flows forward
- Backpropagation – The learning engine: error flows backward
- Your First Neural Network – Built from scratch, learns on its own
- Summary & Page 2 Preview
1. What Is a Neural Network?
A Neural Network is a computational model inspired by how the human brain works. Just like our brain is made of billions of interconnected neurons, a neural network consists of nodes (artificial neurons) arranged in layers and connected to each other.
Each connection has a weight – a number that determines how important that signal is. A neural network learns by adjusting these weights based on data, until it can make accurate predictions.
Simple Analogy
Imagine you're a cooking judge. Each dish has several aspects: taste, appearance, aroma. You assign different weights to each (e.g. taste 60%, appearance 25%, aroma 15%), then combine them for a final score. Weight = the importance of each aspect. Training = the process of learning from experience to adjust these weights for increasingly accurate scoring.
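The judge's scoring rule above is exactly a weighted sum, the same operation a neuron performs. A minimal sketch (the scores below are made-up numbers for one hypothetical dish; the 60/25/15 weights come from the analogy):

```python
import numpy as np

# Aspect scores for one dish on a 0-10 scale: taste, appearance, aroma
scores = np.array([8.0, 6.0, 9.0])

# How much the judge cares about each aspect (sums to 1)
weights = np.array([0.60, 0.25, 0.15])

# Final score = weighted sum, exactly a neuron's z = x . w
final_score = np.dot(scores, weights)
print(round(final_score, 2))  # 7.65  (0.6*8 + 0.25*6 + 0.15*9)
```

Training, in this picture, is nothing more than nudging those three weights whenever the judge's score disagrees with the "true" quality of the dish.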
Basic Neural Network Architecture
Three Main Components:
Input Layer – receives raw data (features).
Hidden Layer – processes data, discovers patterns. Can be more than one layer (this is the "deep" in deep learning).
Output Layer – produces the final prediction.
2. Installation – Only NumPy Needed
In this tutorial, we won't use TensorFlow or PyTorch. We'll build a neural network from scratch using only pure Python + NumPy. This is the best way to truly understand what happens under the hood.
```shell
# Install NumPy
pip install numpy

# Verify
python -c "import numpy; print(numpy.__version__)"
# Output: 2.2.x (or latest)

# Optional: matplotlib for visualization
pip install matplotlib
```
Tip: Use Google Colab if you want to start right away without installing anything – NumPy comes pre-installed.
3. Perceptron – The Smallest Unit of a Neural Network
A Perceptron is the simplest neuron. It takes several inputs, multiplies each by a weight, sums them up, adds a bias, then passes the result through an activation function.
Formula: z = (x₁·w₁ + x₂·w₂ + ... + xₙ·wₙ) + bias, then output = activation(z)
```python
import numpy as np

# ===========================
# Simple Perceptron
# ===========================
def perceptron(inputs, weights, bias):
    """Single neuron: weighted sum + bias"""
    z = np.dot(inputs, weights) + bias
    # Activation: step function (1 if z >= 0, else 0)
    return 1 if z >= 0 else 0

# Example: AND gate
# Input: 2 boolean values (0 or 1)
# Output: 1 only if BOTH inputs = 1
weights = np.array([1.0, 1.0])
bias = -1.5  # threshold

print("=== AND Gate with Perceptron ===")
print(f"0 AND 0 = {perceptron(np.array([0, 0]), weights, bias)}")  # 0
print(f"0 AND 1 = {perceptron(np.array([0, 1]), weights, bias)}")  # 0
print(f"1 AND 0 = {perceptron(np.array([1, 0]), weights, bias)}")  # 0
print(f"1 AND 1 = {perceptron(np.array([1, 1]), weights, bias)}")  # 1 ✓

# ===========================
# OR Gate -- just change the bias!
# ===========================
bias_or = -0.5
print("\n=== OR Gate with Perceptron ===")
print(f"0 OR 0 = {perceptron(np.array([0, 0]), weights, bias_or)}")  # 0
print(f"0 OR 1 = {perceptron(np.array([0, 1]), weights, bias_or)}")  # 1
print(f"1 OR 0 = {perceptron(np.array([1, 0]), weights, bias_or)}")  # 1
print(f"1 OR 1 = {perceptron(np.array([1, 1]), weights, bias_or)}")  # 1 ✓
```
Why do we need a bias?
The bias allows the neuron to "shift" its decision boundary. Without a bias, the decision line always passes through the origin (0,0). With a bias, the neuron is more flexible – it can decide when to "fire" even when its inputs are small.
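This shift is easy to see with the step-function perceptron from the code above. The sketch below (the inputs are illustrative numbers) feeds the same small input to the same weights, once with zero bias and once with the AND gate's bias of -1.5:

```python
import numpy as np

def perceptron(inputs, weights, bias):
    # Weighted sum + bias, then step activation
    z = np.dot(inputs, weights) + bias
    return 1 if z >= 0 else 0

weights = np.array([1.0, 1.0])
x = np.array([0.3, 0.4])  # small inputs: weighted sum = 0.7

# Without a bias, any non-negative weighted sum fires the neuron
print(perceptron(x, weights, bias=0.0))   # 1 -> fires

# With bias -1.5 the threshold moves: 0.7 is no longer enough
print(perceptron(x, weights, bias=-1.5))  # 0 -> stays silent
```

Same inputs, same weights – only the bias changed, and with it the neuron's decision.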
4. Activation Functions – Decision Gates
An activation function determines whether a neuron "fires" or not. Without activation functions, a neural network can only model linear relationships – it can't solve complex problems like image recognition or language understanding.
```python
import numpy as np

# ===========================
# 1. Sigmoid -- output between 0 and 1
#    Great for probabilities
# ===========================
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    """Derivative: σ(z) * (1 - σ(z))"""
    s = sigmoid(z)
    return s * (1 - s)

print(sigmoid(0))   # 0.5   -> midpoint
print(sigmoid(5))   # 0.993 -> close to 1
print(sigmoid(-5))  # 0.007 -> close to 0

# ===========================
# 2. ReLU -- Rectified Linear Unit
#    Most popular for hidden layers
# ===========================
def relu(z):
    return np.maximum(0, z)

def relu_derivative(z):
    return (z > 0).astype(float)

print(relu(np.array([-2, -1, 0, 1, 2])))
# [0 0 0 1 2] -> negatives become 0, positives stay

# ===========================
# 3. Tanh -- output between -1 and 1
#    A "shifted" sigmoid
# ===========================
def tanh(z):
    return np.tanh(z)

def tanh_derivative(z):
    return 1 - np.tanh(z) ** 2
```
When to Use Which?
Sigmoid – output layer for binary classification (yes/no).
ReLU – hidden layers (fast, simple, most commonly used).
Tanh – hidden layers when negative outputs are needed.
Softmax – output layer for multi-class classification (covered on Page 2).
5. Forward Propagation – Data Flows Forward
Forward propagation is the process of data flowing from the input layer, through the hidden layer(s), to produce an output (the prediction). Each layer performs the same operation: z = W·x + b, then a = activation(z).
```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# ===========================
# Network: 2 input -> 3 hidden -> 1 output
# ===========================
# Random weights & biases (will be trained later)
np.random.seed(42)

# Hidden layer: 2 input -> 3 neurons
W1 = np.random.randn(2, 3) * 0.5  # shape: (2, 3)
b1 = np.zeros((1, 3))             # shape: (1, 3)

# Output layer: 3 hidden -> 1 output
W2 = np.random.randn(3, 1) * 0.5  # shape: (3, 1)
b2 = np.zeros((1, 1))             # shape: (1, 1)

# ===========================
# Forward pass
# ===========================
X = np.array([[0.5, 0.8]])  # 1 sample, 2 features

# Layer 1: hidden
z1 = X @ W1 + b1  # linear transformation
a1 = sigmoid(z1)  # activation
print(f"Hidden layer output: {a1}")

# Layer 2: output
z2 = a1 @ W2 + b2  # linear transformation
a2 = sigmoid(z2)   # activation (= final prediction)
print(f"Prediction: {a2}")
# Output: a number between 0 and 1 (a probability)
```
6. Backpropagation – The Learning Engine
Backpropagation is the core algorithm that enables neural networks to "learn." After the forward pass produces a prediction, we compute the error (loss), then propagate that error backward through each layer using the chain rule from calculus to compute gradients – how much each weight should change.
Analogy: Restaurant Feedback
A customer says the soup is too salty (loss). The manager traces back: who added the salt? How much? (gradient). Then the instruction: "reduce salt by 20%" (weight update). The next dish tastes better. That's backpropagation.
```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)

# ===========================
# Setup: 2 input -> 3 hidden -> 1 output
# ===========================
np.random.seed(42)
W1 = np.random.randn(2, 3) * 0.5
b1 = np.zeros((1, 3))
W2 = np.random.randn(3, 1) * 0.5
b2 = np.zeros((1, 1))

# Data: XOR problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])  # XOR!

lr = 1.0  # learning rate

# ===========================
# TRAINING LOOP
# ===========================
for epoch in range(10000):
    # --- FORWARD PASS ---
    z1 = X @ W1 + b1   # (4, 3)
    a1 = sigmoid(z1)   # (4, 3)
    z2 = a1 @ W2 + b2  # (4, 1)
    a2 = sigmoid(z2)   # (4, 1) -> prediction

    # --- COMPUTE LOSS ---
    loss = np.mean((y - a2) ** 2)  # MSE

    # --- BACKWARD PASS ---
    # Output layer gradients
    dL_da2 = -2 * (y - a2) / 4        # dLoss/da2
    da2_dz2 = sigmoid_derivative(z2)  # da2/dz2
    dz2 = dL_da2 * da2_dz2            # chain rule!
    dW2 = a1.T @ dz2                  # dLoss/dW2
    db2 = np.sum(dz2, axis=0, keepdims=True)

    # Hidden layer gradients
    da1 = dz2 @ W2.T
    dz1 = da1 * sigmoid_derivative(z1)
    dW1 = X.T @ dz1                   # dLoss/dW1
    db1 = np.sum(dz1, axis=0, keepdims=True)

    # --- UPDATE WEIGHTS ---
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

    if (epoch + 1) % 2000 == 0:
        print(f"Epoch {epoch+1:>5}, Loss: {loss:.6f}")

# ===========================
# CHECK RESULTS
# ===========================
print("\n=== XOR Predictions ===")
print(np.round(a2, 3))
# Approximately:
# [[0.019]   -> 0 XOR 0 = ~0 ✓
#  [0.981]   -> 0 XOR 1 = ~1 ✓
#  [0.981]   -> 1 XOR 0 = ~1 ✓
#  [0.023]]  -> 1 XOR 1 = ~0 ✓
```
XOR Problem Solved!
A single perceptron cannot solve XOR – this was mathematically proven by Minsky & Papert (1969). But by adding a hidden layer, our neural network succeeds! This is the power of deep learning: hidden layers capture non-linear patterns that are invisible to single neurons.
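The failure is easy to demonstrate empirically. The sketch below (an illustration added here, not part of the tutorial's own code) trains a single sigmoid neuron with no hidden layer on XOR, using the same MSE loss and chain-rule gradient as above. One linear decision boundary can never separate all four XOR points, so at least one input always stays misclassified, no matter how long it trains:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

np.random.seed(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# A single neuron: one weight vector, one bias, NO hidden layer
W = np.random.randn(2, 1) * 0.5
b = np.zeros((1, 1))
lr = 1.0

for _ in range(10000):
    a = sigmoid(X @ W + b)
    # Same MSE-through-sigmoid gradient as the training loop above
    dz = (-2 / 4) * (y - a) * a * (1 - a)
    W -= lr * (X.T @ dz)
    b -= lr * np.sum(dz, axis=0, keepdims=True)

correct = int(np.sum((a > 0.5) == y))
print(f"Correct after training: {correct}/4")  # never 4/4 -- XOR is not linearly separable
```

The proof of impossibility is one line: the pre-activations satisfy z(0,1) + z(1,0) = z(0,0) + z(1,1), so the two "on" cases and the two "off" cases can never sit on opposite sides of the boundary at once.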
How Backpropagation Works
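A standard way to convince yourself the chain-rule gradients are correct is a numerical gradient check (a verification sketch added here, not part of the tutorial's network): nudge one weight by a tiny ε, measure how the loss changes, and compare that slope against the analytic gradient from backpropagation. The two should agree to many decimal places.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

np.random.seed(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = np.random.randn(2, 3) * 0.5
b1 = np.zeros((1, 3))
W2 = np.random.randn(3, 1) * 0.5
b2 = np.zeros((1, 1))

def loss_fn(W):
    """MSE loss of the 2->3->1 network as a function of W1."""
    a1 = sigmoid(X @ W + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    return np.mean((y - a2) ** 2)

# Analytic gradient w.r.t. W1 -- same chain rule as the training loop
a1 = sigmoid(X @ W1 + b1)
a2 = sigmoid(a1 @ W2 + b2)
dz2 = (-2 / 4) * (y - a2) * a2 * (1 - a2)
dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
dW1_analytic = X.T @ dz1

# Numerical gradient for one entry: central difference
eps = 1e-5
W1_plus = W1.copy();  W1_plus[0, 0] += eps
W1_minus = W1.copy(); W1_minus[0, 0] -= eps
numeric = (loss_fn(W1_plus) - loss_fn(W1_minus)) / (2 * eps)

print(f"analytic: {dW1_analytic[0, 0]:.8f}")
print(f"numeric:  {numeric:.8f}")  # should match to ~7 decimal places
```

If backpropagation had a sign error or a missing factor anywhere in the chain, these two numbers would disagree immediately, which is why this check is a common debugging step for hand-written networks.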
7. Your First Neural Network – Complete Class
Now let's combine everything into a clean Python class. This neural network can be trained on any data – we'll test it on XOR and also on custom data.
```python
import numpy as np

# =====================================================
# NEURAL NETWORK CLASS -- from scratch, no framework
# =====================================================
class NeuralNetwork:
    """
    Simple Neural Network:
    - 1 Hidden Layer
    - Sigmoid activation
    - MSE loss
    - Gradient Descent optimizer
    """

    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights randomly (He-style sqrt(2/n) scaling)
        self.W1 = np.random.randn(input_size, hidden_size) * np.sqrt(2 / input_size)
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size) * np.sqrt(2 / hidden_size)
        self.b2 = np.zeros((1, output_size))

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-np.clip(z, -500, 500)))

    def sigmoid_deriv(self, z):
        s = self.sigmoid(z)
        return s * (1 - s)

    def forward(self, X):
        """Forward pass: input -> hidden -> output"""
        self.z1 = X @ self.W1 + self.b1
        self.a1 = self.sigmoid(self.z1)
        self.z2 = self.a1 @ self.W2 + self.b2
        self.a2 = self.sigmoid(self.z2)
        return self.a2

    def backward(self, X, y, lr=0.1):
        """Backward pass: compute gradients, update weights"""
        m = X.shape[0]  # number of samples

        # Output layer
        dz2 = (-2 / m) * (y - self.a2) * self.sigmoid_deriv(self.z2)
        dW2 = self.a1.T @ dz2
        db2 = np.sum(dz2, axis=0, keepdims=True)

        # Hidden layer
        da1 = dz2 @ self.W2.T
        dz1 = da1 * self.sigmoid_deriv(self.z1)
        dW1 = X.T @ dz1
        db1 = np.sum(dz1, axis=0, keepdims=True)

        # Update!
        self.W2 -= lr * dW2
        self.b2 -= lr * db2
        self.W1 -= lr * dW1
        self.b1 -= lr * db1

    def train(self, X, y, epochs=5000, lr=1.0, verbose=True):
        """Complete training loop"""
        for epoch in range(epochs):
            pred = self.forward(X)
            loss = np.mean((y - pred) ** 2)
            self.backward(X, y, lr)
            if verbose and (epoch + 1) % 1000 == 0:
                print(f"  Epoch {epoch+1:>5} -- Loss: {loss:.6f}")
        return loss

    def predict(self, X):
        """Predict on new data"""
        return self.forward(X)

# =====================================================
# TEST 1: XOR Problem
# =====================================================
print("Test 1: XOR Problem")
print("=" * 40)

X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([[0], [1], [1], [0]])

nn_xor = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
nn_xor.train(X_xor, y_xor, epochs=10000, lr=2.0)

print("\nXOR Predictions:")
pred = nn_xor.predict(X_xor)
for i in range(4):
    print(f"  {X_xor[i]} -> {pred[i][0]:.4f} (target: {y_xor[i][0]})")

# =====================================================
# TEST 2: Learn the pattern y = sin(x) > 0
# =====================================================
print("\nTest 2: sin(x) > 0 classifier")
print("=" * 40)

X_sin = np.random.uniform(-3, 3, (200, 1))
y_sin = (np.sin(X_sin) > 0).astype(float)

nn_sin = NeuralNetwork(input_size=1, hidden_size=8, output_size=1)
final_loss = nn_sin.train(X_sin, y_sin, epochs=5000, lr=1.0)

accuracy = np.mean((nn_sin.predict(X_sin) > 0.5) == y_sin) * 100
print(f"\nAccuracy: {accuracy:.1f}%")
# Typically ~95%+ -- the network captures the non-linear pattern
```
What Happens During Training?
1. Forward: data enters → the model makes a prediction.
2. Loss: compute the error (how far the prediction is from the target).
3. Backward: chain rule → compute the gradient for each weight.
4. Update: shift each weight in the opposite direction of its gradient (gradient descent).
Repeat thousands of times → the model discovers patterns in the data. This is the entire core of neural networks.
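Those four steps, in their most compact form, fit in a dozen lines. This is a distilled sketch of the same loop (not new machinery – just a single neuron fitting a deliberately trivial task, mapping input 0 to output 0 and input 1 to output 1):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

np.random.seed(3)
X = np.array([[0.0], [1.0]])  # one feature, two samples
y = np.array([[0.0], [1.0]])  # identity mapping: trivially learnable
W = np.random.randn(1, 1)
b = np.zeros((1, 1))

for _ in range(5000):
    a = sigmoid(X @ W + b)                 # 1. Forward
    loss = np.mean((y - a) ** 2)           # 2. Loss
    dz = (-2 / 2) * (y - a) * a * (1 - a)  # 3. Backward (chain rule)
    W -= X.T @ dz                          # 4. Update (gradient descent, lr = 1)
    b -= np.sum(dz, axis=0, keepdims=True)

print(f"Final loss: {loss:.4f}")  # close to 0 -- the neuron has fit both points
```

Everything else in this tutorial – more neurons, more layers, better initialization – is elaboration on exactly this loop.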
Congratulations! You just built a neural network from scratch without TensorFlow, without PyTorch – just Python and NumPy. This model discovered the XOR and sin(x) patterns without being told the formula. This is the essence of machine learning.
8. Page 1 Summary
| Concept | What It Is | Key Code |
|---|---|---|
| Neuron / Perceptron | Smallest NN unit: weighted sum + activation | `z = X @ W + b` |
| Weight & Bias | Parameters adjusted during training | `np.random.randn()` |
| Activation Function | Gives non-linear capability to neurons | `sigmoid(z)`, `relu(z)` |
| Forward Pass | Data flows from input to output | `a = sigmoid(X @ W + b)` |
| Loss / Error | Measures how far the prediction is from the target | `np.mean((y - pred) ** 2)` |
| Backpropagation | Compute gradients backward via the chain rule | `dW = X.T @ dz` |
| Gradient Descent | Update weights against the gradient | `W -= lr * dW` |
| Learning Rate | How large each update step is | `lr = 0.01` |
| Training Loop | Forward → Loss → Backward → Update → Repeat | `for epoch in range(N)` |
Coming Next: Page 2 – Multi-Layer Networks & Real Datasets
Adding multiple hidden layers, using real datasets (Iris, MNIST), implementing Mini-Batch Gradient Descent, and visualizing decision boundaries. Stay tuned!