GenAIHub
Intermediate

Stable Diffusion Image Generation


Generate AI images using Stable Diffusion. Run locally on Mac (Apple Silicon), via Docker, or on Kaggle with a free GPU.

Why This Lab?

Run Locally on Mac

Native Apple Silicon support with MPS acceleration

Free GPU on Kaggle

Use Kaggle's T4/P100 GPUs at no cost

Gradio Web UI

Beautiful browser-based interface for image generation

Docker & Poetry Support

Multiple deployment options for any environment

How It Works

graph LR
    A[Text Prompt] --> B[Diffusers Pipeline]
    B --> C[Stable Diffusion Model]
    C --> D[Generated Image]
    D --> E[Save to Disk]

Diffusers Library: HuggingFace's library for diffusion models - simpler and lighter than Automatic1111 for local use.

Environment Setup

Recommended for Mac: Uses MPS (Metal Performance Shaders) for GPU acceleration on M1/M2/M3/M4 chips.

# Navigate to the project directory
cd Handson/Image_generation

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the WebUI (browser-based)
python webui.py
# Open http://127.0.0.1:7860

# OR run CLI version
python stable_diffusion_mac.py

First run: Downloads ~5GB model. After that, it's cached in ~/.cache/huggingface

1 Load Model

Click "πŸ“₯ Load Model" button. First time downloads ~5GB.

2 Enter Prompt

Describe the image you want in English.

3 Adjust Settings

Steps (quality), Guidance (creativity), Size.

4 Generate!

Click "🎨 Generate" and wait ~15-30 seconds.

Pro Tips:

  • Use descriptive prompts: "A majestic lion in a field of flowers, digital painting, highly detailed"
  • Add style keywords: "artstation, 8k, photorealistic, cinematic lighting"
  • Use negative prompt to exclude: "blurry, bad quality, distorted"
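The sliders in the Web UI map onto standard diffusers pipeline arguments. A sketch of that mapping (the `build_generation_kwargs` helper is illustrative; the dictionary keys are the real `StableDiffusionPipeline` keyword arguments):

```python
def build_generation_kwargs(prompt,
                            negative_prompt="blurry, bad quality, distorted",
                            steps=25, guidance=7.5, width=512, height=512):
    """Collect Web UI settings as diffusers pipeline keyword arguments."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,   # what to exclude from the image
        "num_inference_steps": steps,         # more steps = higher quality, slower
        "guidance_scale": guidance,           # higher = follows the prompt more literally
        "width": width,
        "height": height,
    }

kwargs = build_generation_kwargs(
    "A majestic lion in a field of flowers, digital painting, highly detailed"
)
# image = pipe(**kwargs).images[0]
```

At the default 25 steps and guidance 7.5, a 512x512 image takes roughly the 15-30 seconds quoted above on Apple Silicon.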

Kaggle Setup (Detailed Steps)

# Clone the Automatic1111 repository
!git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui /kaggle/working/stable-diffusion-webui

# Navigate to the project folder
%cd /kaggle/working/stable-diffusion-webui
# Create directories
!mkdir -p models/Stable-diffusion models/VAE

# Download Stable Diffusion 1.5 checkpoint (~4GB)
!wget -O models/Stable-diffusion/v1-5-pruned-emaonly.safetensors \
"https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors"

# Download VAE (improves colors)
!wget -O models/VAE/vae-ft-mse-840000-ema-pruned.safetensors \
"https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors"
# Remove conflicting packages
!pip uninstall -y torch torchvision torchaudio xformers -q

# Install PyTorch with CUDA 12.4
!pip install -q torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 \
--index-url https://download.pytorch.org/whl/cu124

# Install xformers
!pip install -q xformers==0.0.29.post3 --index-url https://download.pytorch.org/whl/cu124

# Install WebUI dependencies
!pip install -q -r requirements_versions.txt

# Install pyngrok
!pip install -q pyngrok

RESTART RUNTIME after this step!

Go to Run β†’ Restart Session. Then continue from Step 4.

from pyngrok import ngrok
import threading, subprocess, time, requests

# Set your ngrok token
NGROK_AUTH_TOKEN = "YOUR_NGROK_TOKEN_HERE"
ngrok.set_auth_token(NGROK_AUTH_TOKEN)

# Start WebUI in background
def run_webui():
    subprocess.run(["python", "launch.py", "--listen", "--port", "7860",
                    "--enable-insecure-extension-access", "--xformers"])

threading.Thread(target=run_webui, daemon=True).start()

# Poll until the WebUI answers instead of sleeping blindly
# (the first launch can take a few minutes)
for _ in range(60):
    try:
        requests.get("http://127.0.0.1:7860", timeout=5)
        break
    except requests.exceptions.RequestException:
        time.sleep(10)

# Create ngrok tunnel
tunnel = ngrok.connect(7860)
print(f"🎉 Access WebUI at: {tunnel.public_url}")

Expected Output

πŸš€ Starting Stable Diffusion Web UI...
   Open http://127.0.0.1:7860 in your browser

βœ… Model loaded on MPS!
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 25/25 [00:18<00:00, 1.38it/s]

βœ… Generated! Saved to generated_images/img_random.png

Important Notes

First Run Download

The Stable Diffusion model (~5GB) is downloaded on first use. Subsequent runs use the cached model.

Mac Compatibility

M1/M2/M3/M4 Macs use MPS (Metal). Intel Macs fall back to CPU (slow). The code uses float32 for MPS stability.
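The device and dtype choice described above can be sketched as follows (a minimal illustration of the policy, not the lab's exact code):

```python
import torch

# Pick the best available device; use float32 on MPS for numerical
# stability, as noted above, and float16 on CUDA where half precision is safe.
if torch.backends.mps.is_available():
    device, dtype = "mps", torch.float32     # Apple Silicon GPU
elif torch.cuda.is_available():
    device, dtype = "cuda", torch.float16    # NVIDIA GPU
else:
    device, dtype = "cpu", torch.float32     # slow fallback

print(device, dtype)
```

The chosen dtype would then be passed as `torch_dtype=dtype` when loading the pipeline.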

Kaggle Session Limits

Kaggle sessions disconnect after a period of inactivity, so this setup is suited to experimentation, not production.

Learning Checklist