Table of Contents – Page 2
- What Is Keras? – High-level API inside TensorFlow
- Sequential API – Stack layers: simplest, most commonly used
- Compile & Fit – Optimizer, loss, metrics, and training
- Evaluate & Predict – Testing and inference
- Functional API – Multi-input, skip connections, complex architectures
- Callbacks – EarlyStopping, ModelCheckpoint, TensorBoard, ReduceLR
- Custom Layers & Models – Create your own layers with subclassing
- Project: MNIST Classifier 98%+ – End-to-end in 30 lines
- Summary & Page 3 Preview
1. What Is Keras? – Three Ways to Build Models
Keras is a high-level API integrated into TensorFlow (tf.keras). It provides building blocks (layers, optimizers, losses, metrics) that can be assembled into models. There are three ways to build a model in Keras (the Sequential API, the Functional API, and model subclassing); choose based on complexity.
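The three approaches can be sketched with toy models (the layer sizes and input shapes below are illustrative, not from this guide):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 1. Sequential API -- a plain stack of layers
seq = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])

# 2. Functional API -- layers called as functions on tensors
inp = keras.Input(shape=(4,))
h = layers.Dense(16, activation="relu")(inp)
out = layers.Dense(3, activation="softmax")(h)
func = keras.Model(inp, out)

# 3. Subclassing -- full control over the forward pass
class TinyNet(keras.Model):
    def __init__(self):
        super().__init__()
        self.hidden = layers.Dense(16, activation="relu")
        self.head = layers.Dense(3, activation="softmax")

    def call(self, x):
        return self.head(self.hidden(x))

sub = TinyNet()

# All three behave the same way: tensor in, predictions out
x = tf.zeros((2, 4))
print(seq(x).shape, func(x).shape, sub(x).shape)  # (2, 3) (2, 3) (2, 3)
```

All three produce objects with the same `compile()` / `fit()` / `predict()` interface; they differ only in how the architecture is declared.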
2. Sequential API – Simple Layer Stacking
Sequential = a linear stack of layers from input to output. Perfect for models where data flows straight through without branching. This is the fastest way to build a model in Keras.
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# ===========================
# Way 1: List in the constructor
# ===========================
model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=(784,)),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(128, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.2),
    layers.Dense(10, activation='softmax')
])

# ===========================
# Way 2: .add() one at a time
# ===========================
model2 = keras.Sequential()
model2.add(layers.Dense(256, activation='relu', input_shape=(784,)))
model2.add(layers.BatchNormalization())
model2.add(layers.Dropout(0.3))
model2.add(layers.Dense(128, activation='relu'))
model2.add(layers.Dense(10, activation='softmax'))

# ===========================
# Inspect model
# ===========================
model.summary()
# ┌─────────────────────┬──────────────────┬───────────┐
# │ Layer (type)        │ Output Shape     │ Param #   │
# ├─────────────────────┼──────────────────┼───────────┤
# │ dense (Dense)       │ (None, 256)      │ 200,960   │
# │ batch_norm          │ (None, 256)      │ 1,024     │
# │ dropout             │ (None, 256)      │ 0         │
# │ dense_1 (Dense)     │ (None, 128)      │ 32,896    │
# │ ...                 │                  │           │
# └─────────────────────┴──────────────────┴───────────┘
# Total params: ~236k

# Common layers cheat sheet:
# Dense(units, activation)  -> fully connected
# Conv2D(filters, kernel)   -> convolution (Page 3)
# LSTM(units)               -> recurrent (Page 5)
# BatchNormalization()      -> normalize activations
# Dropout(rate)             -> regularization
# Flatten()                 -> 3D -> 1D
# Embedding(vocab, dim)     -> word -> vector
```
Activation functions in Keras:
- 'relu' – hidden layers (best default).
- 'softmax' – multi-class output layer (probabilities sum to 1).
- 'sigmoid' – binary output layer (0 or 1).
- 'linear' – regression output layer (continuous values). Or None = no activation.
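A quick numeric sanity check of these activations (toy values, illustrative only):

```python
import numpy as np
import tensorflow as tf

relu_out = tf.nn.relu(tf.constant([-1.0, 0.0, 3.0]))
print(relu_out.numpy())                   # [0. 0. 3.] -- negatives clipped to zero

softmax_out = tf.nn.softmax(tf.constant([[2.0, 1.0, 0.1]]))
print(float(tf.reduce_sum(softmax_out)))  # ≈ 1.0 -- probabilities sum to one

sigmoid_out = tf.math.sigmoid(tf.constant(0.0))
print(float(sigmoid_out))                 # 0.5 -- any real number squashed into (0, 1)
```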
3. Compile & Fit – Training in 3 Lines
```python
import tensorflow as tf
from tensorflow import keras

# ===========================
# 1. COMPILE -- configure training
# ===========================
model.compile(
    optimizer='adam',                        # or keras.optimizers.Adam(learning_rate=1e-3)
    loss='sparse_categorical_crossentropy',  # labels: integers [0, 1, 2, ...]
    # loss='categorical_crossentropy',       # labels: one-hot [[1,0,0], ...]
    # loss='binary_crossentropy',            # binary classification
    # loss='mse',                            # regression
    metrics=['accuracy']                     # track during training
)

# ===========================
# 2. FIT -- train the model!
# ===========================
history = model.fit(
    X_train, y_train,      # training data
    epochs=20,             # number of full passes
    batch_size=32,         # samples per gradient update
    validation_split=0.2,  # 20% for validation
    # validation_data=(X_val, y_val),  # or provide an explicit val set
    verbose=1,             # progress bar
)
# Epoch 1/20 - loss: 0.3421 - accuracy: 0.9012 - val_loss: 0.1542
# Epoch 2/20 - loss: 0.1523 - accuracy: 0.9543 - val_loss: 0.1123
# ...

# ===========================
# 3. History -- training curves
# ===========================
print(history.history.keys())
# dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])

# Plot training curves
import matplotlib.pyplot as plt
plt.plot(history.history['accuracy'], label='Train')
plt.plot(history.history['val_accuracy'], label='Val')
plt.xlabel('Epoch'); plt.ylabel('Accuracy'); plt.legend()
plt.show()
```
Loss function cheat sheet:
- sparse_categorical_crossentropy – integer labels (0, 1, 2, ...) – most common.
- categorical_crossentropy – one-hot labels ([1,0,0], [0,1,0], ...).
- binary_crossentropy – binary (0 or 1).
- mse – regression (predict continuous numbers).

Tip: If unsure between sparse vs categorical, use sparse (easier; no need to one-hot encode labels).
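One way to see the sparse vs categorical relationship: the two losses agree once the integer labels are one-hot encoded (toy probabilities below, illustrative only):

```python
import numpy as np
from tensorflow import keras

y_int = np.array([0, 2, 1])                      # integer labels
y_onehot = keras.utils.to_categorical(y_int, 3)  # [[1,0,0], [0,0,1], [0,1,0]]

# Fake model outputs: one probability row per sample
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.2, 0.6, 0.2]], dtype="float32")

sparse = keras.losses.SparseCategoricalCrossentropy()(y_int, probs)
dense = keras.losses.CategoricalCrossentropy()(y_onehot, probs)
print(float(sparse), float(dense))  # same value (≈ 0.3635) for both
```

Sparse skips the `to_categorical` step entirely, which is why it is the easier default.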
4. Evaluate & Predict – Testing and Inference
```python
# ===========================
# Evaluate on test set
# ===========================
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_acc:.1%}")  # 98.1%

# ===========================
# Predict on new data
# ===========================
predictions = model.predict(X_test[:5])
print(predictions.shape)  # (5, 10) -> probabilities per class
print(predictions[0])     # [0.00, 0.00, 0.01, 0.97, ...] -> class 3

# Get predicted class
import numpy as np
predicted_classes = np.argmax(predictions, axis=1)
print(f"Predicted: {predicted_classes}")  # [3, 1, 7, 2, 4]
print(f"Actual:    {y_test[:5]}")         # [3, 1, 7, 2, 4] ✓

# ===========================
# Save & Load model
# ===========================
model.save("my_model.keras")                        # save entire model
loaded = keras.models.load_model("my_model.keras")  # load back
# loaded.predict(X_test[:5]) -> same results ✓
```
5. Functional API – Complex Architectures
For architectures more complex than a "straight stack", use the Functional API. You define the data flow as a graph: each layer is a function called on the output of the previous layer.
```python
from tensorflow import keras
from tensorflow.keras import layers

# ===========================
# 1. Basic Functional API
# ===========================
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(128, activation="relu")(inputs)
x = layers.Dropout(0.3)(x)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="digit_classifier")

# ===========================
# 2. Residual Connection (Skip Connection)
#    Just like in ResNet!
# ===========================
inputs = keras.Input(shape=(256,))
x = layers.Dense(256, activation="relu")(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dense(256)(x)          # no activation yet!
x = layers.Add()([x, inputs])     # <- SKIP CONNECTION!
x = layers.Activation("relu")(x)  # activate after add
outputs = layers.Dense(10, activation="softmax")(x)
residual_model = keras.Model(inputs, outputs, name="residual_net")

# ===========================
# 3. Multi-Input Model
#    e.g., Image + Text -> Classification
# ===========================
img_input = keras.Input(shape=(224, 224, 3), name="image")
txt_input = keras.Input(shape=(100,), name="text")

# Image branch
img_features = layers.Conv2D(32, 3, activation="relu")(img_input)
img_features = layers.GlobalAveragePooling2D()(img_features)
img_features = layers.Dense(64)(img_features)

# Text branch
txt_features = layers.Dense(64, activation="relu")(txt_input)

# Merge branches
merged = layers.Concatenate()([img_features, txt_features])
merged = layers.Dense(64, activation="relu")(merged)
output = layers.Dense(5, activation="softmax")(merged)

multi_model = keras.Model(
    inputs=[img_input, txt_input],
    outputs=output,
    name="multimodal"
)
multi_model.summary()

# Train with dict inputs:
# multi_model.fit({"image": img_data, "text": txt_data}, labels)
```
Sequential vs Functional – When to Use Which?
- Sequential: straight-line models (Dense → Dense → Output). ~80% of cases.
- Functional: branching, merging, skip connections, multi-input/output. ResNet, Inception, multi-modal models.
- Subclassing: dynamic logic in the forward pass (if/else, loops). Research / custom architectures.
6. Callbacks – Automatic Training Control
Callbacks are functions called at specific points during training (end of epoch, end of batch, etc.). They let you control training automatically without manual monitoring.
```python
from tensorflow import keras

callbacks = [
    # 1. EarlyStopping -- stop if val_loss stops improving
    keras.callbacks.EarlyStopping(
        monitor='val_loss',        # watch validation loss
        patience=5,                # wait 5 epochs before stopping
        restore_best_weights=True  # roll back to the best epoch!
    ),
    # 2. ModelCheckpoint -- save the best model
    keras.callbacks.ModelCheckpoint(
        filepath='best_model.keras',
        monitor='val_accuracy',
        save_best_only=True,       # only save if improved
        verbose=1
    ),
    # 3. ReduceLROnPlateau -- lower the LR when stuck
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,                # LR × 0.5
        patience=3,                # wait 3 epochs
        min_lr=1e-6,               # floor
        verbose=1
    ),
    # 4. TensorBoard -- training visualization
    keras.callbacks.TensorBoard(
        log_dir='./logs',
        histogram_freq=1,          # log weight histograms
    ),
]

# Use in training:
model.fit(X_train, y_train,
          epochs=100,
          callbacks=callbacks,
          validation_split=0.2)

# Training might stop at epoch 15 if val_loss plateaus!
# Best model auto-saved to best_model.keras
# View in TensorBoard: tensorboard --logdir ./logs
```
Best Practice: Always use at least EarlyStopping + ModelCheckpoint in every training run. Set epochs to a large number (100-1000) and let EarlyStopping decide when to stop. This prevents overfitting while ensuring you always have the best model saved.
7. Custom Layers & Models – Subclassing
```python
import tensorflow as tf
from tensorflow import keras

# ===========================
# 1. Custom Layer
# ===========================
class ResidualBlock(keras.layers.Layer):
    """Residual block: x + F(x)"""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.dense1 = keras.layers.Dense(units, activation='relu')
        self.dense2 = keras.layers.Dense(units)  # no activation!
        self.bn = keras.layers.BatchNormalization()
        self.add = keras.layers.Add()

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.bn(x, training=training)
        x = self.dense2(x)
        return tf.nn.relu(self.add([x, inputs]))  # residual!

# ===========================
# 2. Custom Model
# ===========================
class MyClassifier(keras.Model):
    def __init__(self, num_classes):
        super().__init__()
        self.flatten = keras.layers.Flatten()
        # Define ALL layers in __init__: creating a layer inside call()
        # would make a fresh, untrained layer on every forward pass.
        self.project = keras.layers.Dense(128, activation='relu')  # project to 128
        self.block1 = ResidualBlock(128)
        self.block2 = ResidualBlock(128)
        self.dropout = keras.layers.Dropout(0.3)
        self.classifier = keras.layers.Dense(num_classes, activation='softmax')

    def call(self, inputs, training=False):
        x = self.flatten(inputs)
        x = self.project(x)
        x = self.block1(x, training=training)
        x = self.block2(x, training=training)
        x = self.dropout(x, training=training)
        return self.classifier(x)

# Use exactly like any Keras model!
model = MyClassifier(num_classes=10)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(X_train, y_train, ...)
```
Important: training=True/False
Some layers behave differently during training vs inference: Dropout is active during training and off during inference; BatchNorm uses batch statistics during training and running statistics during inference. Always forward the training argument to these layers!
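The difference is easy to observe directly; a minimal check of Dropout's two modes (a sketch, not from the original):

```python
import tensorflow as tf
from tensorflow import keras

drop = keras.layers.Dropout(0.5)
x = tf.ones((1, 1000))

infer_out = drop(x, training=False)  # identity: nothing is dropped
train_out = drop(x, training=True)   # ~half the units zeroed, survivors scaled by 1/(1-rate)

print(int(tf.reduce_sum(tf.cast(infer_out == 0.0, tf.int32))))  # 0 zeros at inference
print(int(tf.reduce_sum(tf.cast(train_out == 0.0, tf.int32))))  # roughly 500 zeros in training mode
```

The surviving units are scaled by 1/(1-rate) = 2 ("inverted dropout"), so the expected activation stays the same in both modes.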
8. Project: MNIST Classifier 98%+ – End to End
```python
import tensorflow as tf
from tensorflow import keras
import numpy as np

# ===========================
# 1. LOAD & PREPROCESS DATA
# ===========================
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

# Normalize: [0, 255] -> [0, 1], flatten 28x28 -> 784
X_train = X_train.reshape(-1, 784).astype('float32') / 255.0
X_test = X_test.reshape(-1, 784).astype('float32') / 255.0
print(f"Train: {X_train.shape}, Test: {X_test.shape}")
# Train: (60000, 784), Test: (10000, 784)

# ===========================
# 2. BUILD MODEL
# ===========================
model = keras.Sequential([
    keras.layers.Dense(256, activation='relu', input_shape=(784,)),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

# ===========================
# 3. COMPILE
# ===========================
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# ===========================
# 4. TRAIN with callbacks
# ===========================
callbacks = [
    keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=3)
]
history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=64,
    validation_split=0.1,
    callbacks=callbacks,
    verbose=1
)

# ===========================
# 5. EVALUATE
# ===========================
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"\nTest Accuracy: {test_acc:.1%}")
# Test Accuracy: 98.2% -- with just Dense layers!
# With a CNN (Page 3) -> 99%+

# ===========================
# 6. SAVE
# ===========================
model.save("mnist_classifier.keras")
print("Model saved!")
```
98.2% MNIST Accuracy – Just Dense Layers!
Compare: in our Neural Network series, we needed hundreds of lines of manual NumPy to reach 97%. With Keras: 30 lines of code, complete with BatchNorm, Dropout, the Adam optimizer, EarlyStopping, and model saving. And this doesn't even use a CNN yet (Page 3)!
9. Page 2 Summary
| Concept | What It Is | Key Code |
|---|---|---|
| Sequential API | Simple linear layer stack | Sequential([Dense(), ...]) |
| Functional API | Complex architectures (multi-input, skip) | Model(inputs, outputs) |
| compile() | Choose optimizer, loss, metrics | model.compile('adam', ...) |
| fit() | Automatic training loop | model.fit(X, y, epochs=N) |
| evaluate() | Test performance on test set | model.evaluate(X_test, y) |
| predict() | Inference on new data | model.predict(X_new) |
| EarlyStopping | Stop if val_loss doesn't improve | patience=5 |
| ModelCheckpoint | Auto-save best model | save_best_only=True |
| Custom Layer | Create own layer via subclass | class MyLayer(Layer) |
Page 1 – Introduction to TensorFlow & Tensor Operations
Coming Next: Page 3 – CNN & Image Classification
Computer vision with TensorFlow: Conv2D, MaxPooling, data augmentation (tf.image & Keras layers), transfer learning with MobileNet & ResNet, CIFAR-10 and custom dataset classification, TensorBoard visualization. Stay tuned!