GenAIHub
Intermediate

Stable Diffusion Image Generation


Generate AI images using Stable Diffusion. Run locally on Mac (Apple Silicon), via Docker, or on Kaggle with a free GPU.

Why This Lab?

Run Locally on Mac

Native Apple Silicon support with MPS acceleration

Free GPU on Kaggle

Use Kaggle's T4/P100 GPUs at no cost

Gradio Web UI

Beautiful browser-based interface for image generation

Docker & Poetry Support

Multiple deployment options for any environment

How It Works

graph LR
    A[Text Prompt] --> B[Diffusers Pipeline]
    B --> C[Stable Diffusion Model]
    C --> D[Generated Image]
    D --> E[Save to Disk]

Diffusers Library: HuggingFace's library for diffusion models - simpler and lighter than Automatic1111 for local use.

Environment Setup

Recommended for Mac: Uses MPS (Metal Performance Shaders) for GPU acceleration on M1/M2/M3/M4 chips.

# Navigate to the project directory
cd Handson/Image_generation

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the WebUI (browser-based)
python webui.py
# Open http://127.0.0.1:7860

# OR run CLI version
python stable_diffusion_mac.py

First run: Downloads ~5GB model. After that, it's cached in ~/.cache/huggingface

1 Load Model

Click "πŸ“₯ Load Model" button. First time downloads ~5GB.

2 Enter Prompt

Describe the image you want in English.

3 Adjust Settings

Steps (quality), Guidance (creativity), Size.

4 Generate!

Click "🎨 Generate" and wait ~15-30 seconds.

Pro Tips:

  • Use descriptive prompts: "A majestic lion in a field of flowers, digital painting, highly detailed"
  • Add style keywords: "artstation, 8k, photorealistic, cinematic lighting"
  • Use negative prompt to exclude: "blurry, bad quality, distorted"
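The sliders in the Web UI map onto standard diffusers pipeline arguments. A sketch of that mapping (the `build_generation_kwargs` helper is illustrative; the dictionary keys are the real `StableDiffusionPipeline` keyword arguments):

```python
def build_generation_kwargs(prompt,
                            negative_prompt="blurry, bad quality, distorted",
                            steps=25, guidance=7.5, width=512, height=512):
    """Collect Web UI settings as diffusers pipeline keyword arguments."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,   # what to exclude from the image
        "num_inference_steps": steps,         # more steps = higher quality, slower
        "guidance_scale": guidance,           # higher = follows the prompt more literally
        "width": width,
        "height": height,
    }

kwargs = build_generation_kwargs(
    "A majestic lion in a field of flowers, digital painting, highly detailed"
)
# image = pipe(**kwargs).images[0]
```

At the default 25 steps and guidance 7.5, a 512x512 image takes roughly the 15-30 seconds quoted above on Apple Silicon.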

Kaggle Setup (Detailed Steps)

# Clone the Automatic1111 repository
!git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui /kaggle/working/stable-diffusion-webui

# Navigate to the project folder
%cd /kaggle/working/stable-diffusion-webui
# Create directories
!mkdir -p models/Stable-diffusion models/VAE

# Download Stable Diffusion 1.5 checkpoint (~4GB)
!wget -O models/Stable-diffusion/v1-5-pruned-emaonly.safetensors \
"https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors"

# Download VAE (improves colors)
!wget -O models/VAE/vae-ft-mse-840000-ema-pruned.safetensors \
"https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors"
# Remove conflicting packages
!pip uninstall -y torch torchvision torchaudio xformers -q

# Install PyTorch with CUDA 12.4
!pip install -q torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 \
--index-url https://download.pytorch.org/whl/cu124

# Install xformers
!pip install -q xformers==0.0.29.post3 --index-url https://download.pytorch.org/whl/cu124

# Install WebUI dependencies
!pip install -q -r requirements_versions.txt

# Install pyngrok
!pip install -q pyngrok

RESTART RUNTIME after this step!

Go to Run β†’ Restart Session. Then continue from Step 4.

from pyngrok import ngrok
import threading, subprocess, time, requests

# Set your ngrok token
NGROK_AUTH_TOKEN = "YOUR_NGROK_TOKEN_HERE"
ngrok.set_auth_token(NGROK_AUTH_TOKEN)

# Start WebUI in background
def run_webui():
    subprocess.run(["python", "launch.py", "--listen", "--port", "7860",
                    "--enable-insecure-extension-access", "--xformers"])

threading.Thread(target=run_webui, daemon=True).start()

# Poll until the WebUI answers instead of sleeping blindly
# (the first launch can take a few minutes)
for _ in range(60):
    try:
        requests.get("http://127.0.0.1:7860", timeout=5)
        break
    except requests.exceptions.RequestException:
        time.sleep(10)

# Create ngrok tunnel
tunnel = ngrok.connect(7860)
print(f"🎉 Access WebUI at: {tunnel.public_url}")

Expected Output

πŸš€ Starting Stable Diffusion Web UI...
   Open http://127.0.0.1:7860 in your browser

βœ… Model loaded on MPS!
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 25/25 [00:18<00:00, 1.38it/s]

βœ… Generated! Saved to generated_images/img_random.png

Important Notes

First Run Download

The Stable Diffusion model (~5GB) is downloaded on first use. Subsequent runs use the cached model.

Mac Compatibility

M1/M2/M3/M4 Macs use MPS (Metal). Intel Macs fall back to CPU (slow). The code uses float32 for MPS stability.
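The device and dtype choice described above can be sketched as follows (a minimal illustration of the policy, not the lab's exact code):

```python
import torch

# Pick the best available device; use float32 on MPS for numerical
# stability, as noted above, and float16 on CUDA where half precision is safe.
if torch.backends.mps.is_available():
    device, dtype = "mps", torch.float32     # Apple Silicon GPU
elif torch.cuda.is_available():
    device, dtype = "cuda", torch.float16    # NVIDIA GPU
else:
    device, dtype = "cpu", torch.float32     # slow fallback

print(device, dtype)
```

The chosen dtype would then be passed as `torch_dtype=dtype` when loading the pipeline.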

Kaggle Session Limits

Kaggle sessions disconnect after a period of inactivity, so this setup is suited to experimentation, not production.

Learning Checklist