πŸ“ Artikel ini ditulis dalam Bahasa Indonesia & English
πŸ“ This article is available in English & Bahasa Indonesia

🎨 Learn TensorFlow — Page 8

GAN & Generative Models
in TensorFlow

Creating images from scratch. Page 8 covers in depth: GAN concept (Generator vs Discriminator), complete DCGAN architecture with Conv2DTranspose, adversarial training loop (from Page 7), Variational Autoencoder (VAE) and reparameterization trick, conditional GAN for specific class generation, Wasserstein GAN (WGAN) for stable training, latent space interpolation and exploration, tips to avoid mode collapse, and comparison of GAN vs VAE vs Diffusion Models.

πŸ“… MaretMarch 2026⏱ 32 menit baca32 min read
🏷 DCGAN · VAE · Conditional GAN · WGAN · Image Generation · Latent Space
πŸ“š Seri Belajar TensorFlow:Learn TensorFlow Series:

πŸ“‘ Daftar Isi β€” Page 8

πŸ“‘ Table of Contents β€” Page 8

  1. GAN Concept β€” Generator vs Discriminator: adversarial game
  2. DCGAN Generator β€” Conv2DTranspose: noise β†’ image
  3. DCGAN Discriminator β€” Conv2D: image β†’ real/fake
  4. GAN Training Loop β€” Alternating D & G updates
  5. VAE β€” Variational Autoencoder & reparameterization
  6. Conditional GAN β€” Generate specific classes
  7. WGAN β€” Wasserstein loss for stability
  8. Latent Space β€” Interpolation and exploration
  9. GAN Training Tips β€” Avoiding mode collapse
  10. Summary & Page 9 Preview
🎭

1. GAN Concept β€” The Adversarial Game

Generator creates fake images, Discriminator tries to tell real from fake

In Neural Network series Page 7, we discussed GAN from scratch. Now we implement it in TensorFlow with production-grade architecture. The concept remains the same: Generator (G) creates fake images from random noise, Discriminator (D) tries to distinguish real from fake. They compete against each other β€” G gets better at making realistic images, D gets better at detecting fakes.

GAN Architecture β€” Generator vs Discriminator Random Noise Generator (G) Fake Image z ~ N(0, 1) Dense β†’ Reshape β†’ ConvT β†’ ConvT 28Γ—28Γ—1 (100-dim) "Pemalsu uang / Counterfeiter" ↓ fake images β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” Real Images β†’ β”‚ Discriminator (D) β”‚ β†’ Real (1) or Fake (0) (from dataset) β”‚ Conv β†’ Conv β†’ β”‚ β”‚ Dense β†’ Sigmoid β”‚ β”‚ "Polisi / Detective" β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ Training: 1. Train D: maximize P(real=1) + P(fake=0) β†’ "catch the counterfeiter" 2. Train G: maximize P(D(fake)=1) β†’ "fool the detective" 3. Nash Equilibrium: D can't tell β†’ G generates perfect images! D_loss = -[log(D(real)) + log(1 - D(G(z)))] G_loss = -log(D(G(z))) ← maximize D's confusion
πŸ—οΈ

2. DCGAN Generator β€” Noise β†’ Image

Conv2DTranspose: learned upsampling β€” the reverse of Conv2D
53_dcgan_generator.py — Generator Architecture
import tensorflow as tf
from tensorflow.keras import layers

# ===========================
# DCGAN Generator: noise (100,) β†’ image (28, 28, 1)
# Uses Conv2DTranspose for learned upsampling
# ===========================
def make_generator(noise_dim=100):
    model = tf.keras.Sequential([
        # Step 1: Dense projection β€” noise β†’ feature volume
        layers.Dense(7 * 7 * 256, use_bias=False, input_shape=(noise_dim,)),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Reshape((7, 7, 256)),    # β†’ (7, 7, 256)

        # Step 2: Conv2DTranspose β€” upsample 7β†’7 (strides=1)
        layers.Conv2DTranspose(128, (5, 5), strides=(1, 1),
                               padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),           # β†’ (7, 7, 128)

        # Step 3: Conv2DTranspose β€” upsample 7β†’14
        layers.Conv2DTranspose(64, (5, 5), strides=(2, 2),
                               padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),           # β†’ (14, 14, 64)

        # Step 4: Conv2DTranspose β€” upsample 14β†’28 (output!)
        layers.Conv2DTranspose(1, (5, 5), strides=(2, 2),
                               padding='same', use_bias=False,
                               activation='tanh'),  # β†’ (28, 28, 1)
    ], name="generator")
    return model

G = make_generator()
G.summary()
# Total params: ~2.3M
# Input:  (batch, 100) ← random noise
# Output: (batch, 28, 28, 1) ← generated MNIST-sized image
# Activation tanh β†’ output range [-1, 1] (normalize real data to match!)

# Test
noise = tf.random.normal([1, 100])
fake_image = G(noise, training=False)
print(fake_image.shape)  # (1, 28, 28, 1) βœ“

πŸŽ“ Conv2D vs Conv2DTranspose:
Conv2D: Downsampling β€” gambar menjadi lebih kecil. (32Γ—32 β†’ 16Γ—16 dengan stride=2)
Conv2DTranspose: Upsampling β€” gambar menjadi lebih besar. (7Γ—7 β†’ 14Γ—14 dengan stride=2)
Conv2DTranspose = "kebalikan" dari Conv2D β€” tapi dengan learnable weights (bukan hanya interpolasi biasa). Inilah yang memungkinkan Generator membuat gambar detail dari noise.

πŸŽ“ Conv2D vs Conv2DTranspose:
Conv2D: Downsampling β€” image gets smaller. (32Γ—32 β†’ 16Γ—16 with stride=2)
Conv2DTranspose: Upsampling β€” image gets larger. (7Γ—7 β†’ 14Γ—14 with stride=2)
Conv2DTranspose = the "reverse" of Conv2D β€” but with learnable weights (not just simple interpolation). This is what allows the Generator to create detailed images from noise.
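The shape arithmetic is easy to verify directly. A minimal sketch with made-up channel counts, showing the stride-2 halving/doubling on the 7×7 feature maps used by the generator:

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal([1, 7, 7, 16])  # dummy feature volume

# Conv2D with strides=2 downsamples: 7×7 → 4×4 (ceil(7/2) with 'same' padding)
down = layers.Conv2D(8, 3, strides=2, padding='same')(x)
print(down.shape)  # (1, 4, 4, 8)

# Conv2DTranspose with strides=2 upsamples: 7×7 → 14×14
up = layers.Conv2DTranspose(8, 3, strides=2, padding='same')(x)
print(up.shape)  # (1, 14, 14, 8)
```

Stacking two such transposed convolutions is exactly how the generator above goes 7 → 14 → 28.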

πŸ”

3. DCGAN Discriminator β€” Image β†’ Real/Fake

Standard CNN classifier β€” persis seperti Page 3 tapi binary output
Standard CNN classifier β€” exactly like Page 3 but with binary output
54_dcgan_discriminator.py — Discriminator Architecture
import tensorflow as tf
from tensorflow.keras import layers

def make_discriminator():
    model = tf.keras.Sequential([
        # Conv Block 1: 28Γ—28 β†’ 14Γ—14
        layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same',
                      input_shape=(28, 28, 1)),
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),

        # Conv Block 2: 14Γ—14 β†’ 7Γ—7
        layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),

        # Flatten & classify
        layers.Flatten(),                # β†’ (7*7*128) = 6272
        layers.Dense(1),                 # β†’ logit (NO sigmoid!)
    ], name="discriminator")
    return model

D = make_discriminator()
D.summary()
# Total params: ~213k
# Input:  (batch, 28, 28, 1) ← real or fake image
# Output: (batch, 1) ← logit (use from_logits=True in loss!)

# IMPORTANT design choices:
# 1. LeakyReLU (not ReLU) β€” prevents dead neurons in D
# 2. NO BatchNorm in D (or use SpectralNorm) β€” BN can destabilize GAN
# 3. NO sigmoid at output β€” use from_logits=True (numerically stable)
# 4. Dropout in D β€” prevents D from getting too strong too fast
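Design choice 3 deserves a concrete check. The two formulations below are mathematically equivalent, but the `from_logits=True` path computes the loss directly from raw scores instead of going through a saturating sigmoid:

```python
import tensorflow as tf
from tensorflow import keras

logits = tf.constant([[3.0], [-2.0], [0.5]])  # raw D outputs (no sigmoid)
labels = tf.constant([[1.0], [0.0], [1.0]])

# Numerically stable: BCE applied to raw logits
loss_logits = keras.losses.BinaryCrossentropy(from_logits=True)(labels, logits)

# Equivalent but less stable: sigmoid first, then plain BCE
loss_probs = keras.losses.BinaryCrossentropy()(labels, tf.sigmoid(logits))

print(float(loss_logits), float(loss_probs))  # nearly identical here
# With extreme logits (e.g. ±100) the sigmoid saturates to exactly 0 or 1
# in float32, so the probability path loses precision, while the
# from_logits path still computes the exact loss value.
```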
πŸ”„

4. GAN Training Loop β€” Adversarial Training

Train D β†’ Train G β†’ repeat β€” menggunakan custom loop dari Page 7
Train D β†’ Train G β†’ repeat β€” using custom loop from Page 7
55_gan_training.py — Complete DCGAN Training 🔥
import tensorflow as tf
from tensorflow import keras
import numpy as np
import time

# ===========================
# 1. Setup
# ===========================
NOISE_DIM = 100
BATCH_SIZE = 256
EPOCHS = 50

G = make_generator()
D = make_discriminator()
g_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)  # beta_1=0.5 for GANs!
d_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)
bce = keras.losses.BinaryCrossentropy(from_logits=True)

# Load MNIST
(train_images, _), (_, _) = keras.datasets.mnist.load_data()
train_images = train_images.reshape(-1, 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5  # normalize to [-1, 1]!

train_ds = (tf.data.Dataset.from_tensor_slices(train_images)
    .shuffle(60000).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE))

# ===========================
# 2. Training step
# ===========================
@tf.function
def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], NOISE_DIM])  # actual batch size (last batch may be smaller)

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        # Generate fake images
        fake_images = G(noise, training=True)

        # Discriminator predictions
        real_output = D(real_images, training=True)
        fake_output = D(fake_images, training=True)

        # Discriminator loss: real→1, fake→0
        d_loss_real = bce(tf.ones_like(real_output), real_output)
        d_loss_fake = bce(tf.zeros_like(fake_output), fake_output)
        d_loss = d_loss_real + d_loss_fake

        # Generator loss: fool D β†’ make D think fake is real
        g_loss = bce(tf.ones_like(fake_output), fake_output)

    # Update D
    d_grads = disc_tape.gradient(d_loss, D.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, D.trainable_variables))

    # Update G
    g_grads = gen_tape.gradient(g_loss, G.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, G.trainable_variables))

    return d_loss, g_loss

# ===========================
# 3. Training loop
# ===========================
seed_noise = tf.random.normal([16, NOISE_DIM])  # fixed noise for visualization

for epoch in range(EPOCHS):
    start = time.time()
    for batch in train_ds:
        d_loss, g_loss = train_step(batch)

    elapsed = time.time() - start
    print(f"Epoch {epoch+1:3d}/{EPOCHS} ({elapsed:.1f}s) | "
          f"D_loss: {d_loss:.4f} | G_loss: {g_loss:.4f}")

    # Generate sample images every 5 epochs
    if (epoch + 1) % 5 == 0:
        generated = G(seed_noise, training=False)
        # save_grid(generated, f"gen_epoch_{epoch+1}.png")
        print(f"  β†’ Sample images saved")

# Save models
G.save("generator.keras")
D.save("discriminator.keras")
print("🎨 GAN training complete!")
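The loop above calls a `save_grid` helper that is left commented out. One possible implementation (a sketch — the name and signature are assumptions, using matplotlib):

```python
import matplotlib
matplotlib.use('Agg')  # render off-screen, no display needed
import matplotlib.pyplot as plt
import tensorflow as tf

def save_grid(images, path, rows=4, cols=4):
    """Save a batch of generated images as a grid PNG.

    Expects at least rows*cols images with tanh-range values in [-1, 1].
    """
    images = (images + 1.0) / 2.0  # rescale [-1, 1] → [0, 1] for display
    fig, axes = plt.subplots(rows, cols, figsize=(cols, rows))
    for i, ax in enumerate(axes.flat):
        ax.imshow(images[i, :, :, 0], cmap='gray', vmin=0, vmax=1)
        ax.axis('off')
    fig.savefig(path, bbox_inches='tight')
    plt.close(fig)

# Usage with any (16, 28, 28, 1) batch, e.g. G(seed_noise):
fake = tf.random.normal([16, 28, 28, 1])
save_grid(fake, 'sample_grid.png')
```

Using a fixed `seed_noise` for every snapshot (as the loop does) is what makes the saved grids comparable across epochs.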
🧬

5. Variational Autoencoder (VAE)

Learn smooth latent distribution β€” can generate and interpolate

VAE differs from GAN: instead of adversarial training, VAE learns to compress data to a latent distribution (encoding) then reconstruct (decoding). Loss = reconstruction loss + KL divergence (keeps latent distribution close to Gaussian).
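The KL term has a closed form for a diagonal Gaussian against N(0, I): KL = -0.5 · Σ(1 + log σ² - μ² - σ²). A quick numeric sanity check (example values are mine):

```python
import tensorflow as tf

# KL( N(mu, sigma^2) || N(0, 1) ) for a diagonal Gaussian, summed over dims
def kl_to_standard_normal(z_mean, z_log_var):
    return -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)

# When the encoder outputs exactly N(0, 1), the KL penalty is zero
zero = kl_to_standard_normal(tf.zeros([1, 16]), tf.zeros([1, 16]))
print(float(zero[0]))  # 0.0

# Any deviation from N(0, 1) is penalized: mu=0.5, sigma=1 in 16 dims
off = kl_to_standard_normal(tf.ones([1, 16]) * 0.5, tf.zeros([1, 16]))
print(float(off[0]))  # -0.5 * 16 * (1 + 0 - 0.25 - 1) = 2.0
```

This is the same expression the `VAE.train_step` below computes (there averaged rather than summed over dimensions).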

56_vae.py — Complete Variational Autoencoder
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# ===========================
# 1. Sampling layer (reparameterization trick)
# ===========================
class Sampling(layers.Layer):
    """z = mu + sigma * epsilon, where epsilon ~ N(0,1)
    This trick allows gradients to flow through the sampling step!
    """
    def call(self, inputs):
        z_mean, z_log_var = inputs
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon

# ===========================
# 2. Encoder: image β†’ z_mean, z_log_var, z
# ===========================
LATENT_DIM = 16

encoder_input = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(32, 3, strides=2, padding='same', activation='relu')(encoder_input)
x = layers.Conv2D(64, 3, strides=2, padding='same', activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(256, activation='relu')(x)
z_mean = layers.Dense(LATENT_DIM, name='z_mean')(x)
z_log_var = layers.Dense(LATENT_DIM, name='z_log_var')(x)
z = Sampling()([z_mean, z_log_var])

encoder = keras.Model(encoder_input, [z_mean, z_log_var, z], name='encoder')

# ===========================
# 3. Decoder: z β†’ image
# ===========================
decoder_input = keras.Input(shape=(LATENT_DIM,))
x = layers.Dense(7 * 7 * 64, activation='relu')(decoder_input)
x = layers.Reshape((7, 7, 64))(x)
x = layers.Conv2DTranspose(64, 3, strides=2, padding='same', activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, padding='same', activation='sigmoid')(x)

decoder = keras.Model(decoder_input, decoder_output, name='decoder')

# ===========================
# 4. VAE Model with custom train_step
# ===========================
class VAE(keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder

    def train_step(self, data):
        with tf.GradientTape() as tape:
            z_mean, z_log_var, z = self.encoder(data)
            reconstruction = self.decoder(z)

            # Reconstruction loss (how good is the reconstruction?)
            recon_loss = tf.reduce_mean(
                keras.losses.binary_crossentropy(data, reconstruction)
            ) * 28 * 28

            # KL divergence (how close is latent to N(0,1)?)
            kl_loss = -0.5 * tf.reduce_mean(
                1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))

            total_loss = recon_loss + kl_loss

        grads = tape.gradient(total_loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": total_loss, "recon": recon_loss, "kl": kl_loss}

# Train (train_images_01: MNIST images reshaped to (N, 28, 28, 1), scaled to [0, 1]
# to match the sigmoid decoder output)
vae = VAE(encoder, decoder)
vae.compile(optimizer=keras.optimizers.Adam(1e-3))
vae.fit(train_images_01, epochs=30, batch_size=128)

# Generate new images from random latent vectors!
z_sample = tf.random.normal([16, LATENT_DIM])
generated = decoder(z_sample)
# β†’ 16 new MNIST-like images! 🎨
Aspect             | GAN                        | VAE
Training           | Adversarial (2 models)     | Reconstruction + KL (1 model)
Image Quality      | Sharper, more realistic    | Blurrier, smoother
Training Stability | Difficult (mode collapse)  | Stable and consistent
Latent Space       | Unstructured               | Smooth, interpolatable
Likelihood         | Cannot compute             | Can compute (ELBO)
Best For           | Realistic image generation | Reconstruction, interpolation
🎯

6. Conditional GAN β€” Generate Kelas Tertentu

6. Conditional GAN β€” Generate Specific Classes

Tell the GAN to make "number 7" or "smiling face" β€” control what gets generated
57_conditional_gan.py — cGAN Concept
import tensorflow as tf
from tensorflow.keras import layers

# ===========================
# Conditional GAN: noise + label β†’ specific class image
# ===========================

# Generator: (noise, label) β†’ image
def make_cgan_generator(noise_dim=100, num_classes=10):
    # Noise input
    noise_in = layers.Input(shape=(noise_dim,))
    # Label input (one-hot or embedding)
    label_in = layers.Input(shape=(num_classes,))

    # Concatenate noise + label
    combined = layers.Concatenate()([noise_in, label_in])  # (110,)

    x = layers.Dense(7*7*128, activation='relu')(combined)
    x = layers.Reshape((7, 7, 128))(x)
    x = layers.Conv2DTranspose(64, 5, strides=2, padding='same', activation='relu')(x)
    x = layers.BatchNormalization()(x)
    img = layers.Conv2DTranspose(1, 5, strides=2, padding='same', activation='tanh')(x)

    return tf.keras.Model([noise_in, label_in], img, name='cgan_gen')

# Generate specific digit:
# noise = tf.random.normal([1, 100])
# label_7 = tf.one_hot([7], 10)  # "generate a 7"
# fake_7 = generator([noise, label_7])
# β†’ Generates an image that looks like the digit 7! 🎯
πŸ“Š

7. Wasserstein GAN (WGAN) β€” Stable Training

Replace BCE loss with Wasserstein distance β€” solves mode collapse
58_wgan.py — Wasserstein GAN
import tensorflow as tf

# ===========================
# WGAN: replace BCE with Wasserstein distance
# Critic (not "discriminator") outputs unbounded score
# ===========================

# Critic loss: maximize E[C(real)] - E[C(fake)]
def critic_loss(real_output, fake_output):
    return tf.reduce_mean(fake_output) - tf.reduce_mean(real_output)

# Generator loss: minimize -E[C(fake)]
def generator_loss(fake_output):
    return -tf.reduce_mean(fake_output)

# WGAN-GP: gradient penalty (replaces weight clipping)
def gradient_penalty(critic, real, fake, batch_size):
    alpha = tf.random.uniform([batch_size, 1, 1, 1])
    interpolated = alpha * real + (1 - alpha) * fake

    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        pred = critic(interpolated, training=True)

    grads = tape.gradient(pred, interpolated)
    norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-8)
    penalty = tf.reduce_mean((norm - 1.0) ** 2)
    return penalty

# Training: update Critic 5Γ— per Generator update
# for each batch:
#   for _ in range(5):  # train critic more!
#     c_loss = critic_loss + 10 * gradient_penalty
#   g_loss = generator_loss

# WGAN advantages over standard GAN:
# βœ… Meaningful loss curve (correlates with image quality!)
# βœ… No mode collapse
# βœ… More stable training
# βœ… No need to carefully balance G and D
🌌

8. Latent Space β€” Interpolasi dan Eksplorasi

8. Latent Space β€” Interpolation and Exploration

Explore latent space: morphing between images, image arithmetic
59_latent_space.py — Latent Space Exploration
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# ===========================
# 1. Linear interpolation between two points
# ===========================
z1 = tf.random.normal([1, 100])  # point A in latent space
z2 = tf.random.normal([1, 100])  # point B in latent space

# Generate images along the path from A to B
steps = 10
images = []
for alpha in np.linspace(0, 1, steps):
    z = z1 * (1 - alpha) + z2 * alpha  # linear interpolation (LERP)
    img = generator(z, training=False)
    images.append(img[0])

# Show: smooth transition from image A to image B!
fig, axes = plt.subplots(1, steps, figsize=(20, 2))
for i, ax in enumerate(axes):
    ax.imshow(images[i][:, :, 0], cmap='gray')
    ax.axis('off')
plt.suptitle('Latent Space Interpolation')
plt.show()

# ===========================
# 2. Spherical interpolation (SLERP β€” better for high-dim)
# ===========================
def slerp(z1, z2, alpha):
    """Spherical linear interpolation β€” better than LERP for latent space"""
    z1_norm = z1 / (tf.norm(z1) + 1e-8)
    z2_norm = z2 / (tf.norm(z2) + 1e-8)
    omega = tf.acos(tf.clip_by_value(
        tf.reduce_sum(z1_norm * z2_norm), -1.0, 1.0))
    so = tf.sin(omega)
    return tf.sin((1 - alpha) * omega) / so * z1 + tf.sin(alpha * omega) / so * z2

# ===========================
# 3. Latent space arithmetic (like Word2Vec for images!)
# ===========================
# With a face GAN:
# z_man_glasses = encoder(man_with_glasses)
# z_man = encoder(man_without_glasses)
# z_woman = encoder(woman_without_glasses)
# z_woman_glasses = z_woman + (z_man_glasses - z_man)
# result = decoder(z_woman_glasses)  β†’ woman with glasses! ✨

# ===========================
# 4. 2D latent grid visualization (for VAE with latent_dim=2)
# ===========================
# grid_x = np.linspace(-3, 3, 20)
# grid_y = np.linspace(-3, 3, 20)
# for i, xi in enumerate(grid_x):
#     for j, yi in enumerate(grid_y):
#         z = np.array([[xi, yi]])
#         img = decoder.predict(z)  β†’ shows entire manifold of digits!
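Two properties of `slerp` worth verifying: it recovers the endpoints exactly, and its midpoints keep a norm comparable to the endpoints, unlike LERP, which shrinks toward the origin. This matters because high-dimensional Gaussian samples concentrate on a shell of radius ≈ √dim, and LERP midpoints fall off that shell. A standalone check (the function is repeated so the snippet runs on its own):

```python
import tensorflow as tf

def slerp(z1, z2, alpha):
    """Spherical linear interpolation, as defined above."""
    z1n = z1 / (tf.norm(z1) + 1e-8)
    z2n = z2 / (tf.norm(z2) + 1e-8)
    omega = tf.acos(tf.clip_by_value(tf.reduce_sum(z1n * z2n), -1.0, 1.0))
    so = tf.sin(omega)
    return tf.sin((1 - alpha) * omega) / so * z1 + tf.sin(alpha * omega) / so * z2

z1 = tf.random.normal([1, 100])
z2 = tf.random.normal([1, 100])

# Endpoints are recovered (up to float error)
print(float(tf.reduce_max(tf.abs(slerp(z1, z2, 0.0) - z1))))  # ~0
print(float(tf.reduce_max(tf.abs(slerp(z1, z2, 1.0) - z2))))  # ~0

# SLERP midpoint keeps a larger norm than the LERP midpoint
mid_slerp = slerp(z1, z2, 0.5)
mid_lerp = 0.5 * z1 + 0.5 * z2
print(float(tf.norm(mid_slerp)), float(tf.norm(mid_lerp)))
```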
πŸ’‘

9. GAN Training Tips β€” Avoiding Mode Collapse

GANs are notoriously hard to train β€” here are practical tips

πŸŽ“ 10 Tips Training GAN Stabil:
1. Normalize images ke [-1, 1], gunakan tanh di output Generator
2. Gunakan LeakyReLU (0.2) di Discriminator, bukan ReLU
3. Jangan pakai BatchNorm di D (atau pakai SpectralNorm)
4. from_logits=True di loss β€” numerically lebih stabil
5. Label smoothing: real labels = 0.9 bukan 1.0
6. Adam beta_1=0.5 (bukan default 0.9) β€” empirically better
7. LR = 2e-4 untuk kedua model β€” jangan terlalu beda
8. Train D lebih banyak jika G loss collapse (WGAN: 5Γ—)
9. Monitor kedua loss β€” jika D_loss β†’ 0, D terlalu kuat
10. Gunakan WGAN-GP jika standard GAN tidak stabil

πŸŽ“ 10 Tips for Stable GAN Training:
1. Normalize images to [-1, 1], use tanh in Generator output
2. Use LeakyReLU (0.2) in Discriminator, not ReLU
3. No BatchNorm in D (or use SpectralNorm)
4. from_logits=True in loss β€” numerically more stable
5. Label smoothing: real labels = 0.9 not 1.0
6. Adam beta_1=0.5 (not default 0.9) β€” empirically better
7. LR = 2e-4 for both models β€” don't make them too different
8. Train D more if G loss collapses (WGAN: 5Γ—)
9. Monitor both losses β€” if D_loss β†’ 0, D is too strong
10. Use WGAN-GP if standard GAN is unstable
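Tip 5 (one-sided label smoothing) is a one-line change in the training step. A small sketch with made-up logits:

```python
import tensorflow as tf
from tensorflow import keras

bce = keras.losses.BinaryCrossentropy(from_logits=True)
real_output = tf.random.normal([8, 1])  # hypothetical D logits on real images

# One-sided label smoothing: real targets 0.9 instead of 1.0
# (fake targets stay at 0.0 — only the real side is smoothed)
smooth_loss = bce(tf.ones_like(real_output) * 0.9, real_output)
hard_loss = bce(tf.ones_like(real_output), real_output)
print(float(smooth_loss), float(hard_loss))
# With smoothed targets, D is never rewarded for being 100% confident
# on real images, which keeps it from overpowering G early in training.
```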

Generative Models Landscape — GAN vs VAE vs Diffusion

GAN (2014)            VAE (2013)            Diffusion (2020)
──────────            ──────────            ────────────────
Sharp images          Blurry images         Best quality
Unstable training     Stable training       Slow generation
No likelihood         Likelihood (ELBO)     Likelihood
Mode collapse risk    No mode collapse      No mode collapse

Famous models:        Famous models:        Famous models:
StyleGAN, BigGAN,     β-VAE, VQ-VAE,        DALL-E 2, Stable
ProGAN, CycleGAN      NVAE                  Diffusion, Midjourney

2024+ trend: Diffusion Models dominate image generation,
but GANs remain relevant for real-time generation (faster!).
πŸ“

10. Page 8 Summary

Everything we learned
Concept       | What It Is                              | Key Code
Generator     | Noise → fake image                      | Conv2DTranspose, tanh output
Discriminator | Image → real/fake score                 | Conv2D, LeakyReLU, no sigmoid
DCGAN         | CNN-based GAN                           | ConvT(strides=2) + Conv(strides=2)
VAE           | Encoder + Decoder + KL                  | Sampling(z_mean, z_log_var)
cGAN          | GAN + class conditioning                | Concatenate([noise, label])
WGAN-GP       | Wasserstein distance + gradient penalty | critic_loss + 10 * gp
Interpolation | Smooth morphing between images          | z = z1*(1-α) + z2*α
← Previous Page

Page 7 β€” Custom Training & Advanced Keras

πŸ“˜

Coming Next: Page 9 β€” TF Serving & Deployment

Model di notebook tidak berguna sampai di-deploy! Page 9 membahas: SavedModel format, TF Serving REST & gRPC API, TFLite untuk Android/iOS, TF.js untuk browser, Docker containerization, model versioning & A/B testing, dan monitoring production models.

πŸ“˜

Coming Next: Page 9 β€” TF Serving & Deployment

A model in a notebook is useless until deployed! Page 9 covers: SavedModel format, TF Serving REST & gRPC API, TFLite for Android/iOS, TF.js for browser, Docker containerization, model versioning & A/B testing, and production model monitoring.