Modal Not Working?

Cold start too slow, image build failing, function timeout, volume writes not persisting, secrets missing, or GPU quota exceeded? Check live status and fix it fast.

Modal live status

Updated every 5 minutes. Full history at prismix.dev/service/modal.

What's wrong? Diagnose fast

🥶

Cold start taking 10-30+ seconds

A GPU cold start is the sum of image pull, container boot, and Python imports. Set keep_warm=1 to keep one container warm (recent Modal releases rename this parameter to min_containers). Load models once per container, at module level or in a method decorated with @modal.enter(), so that later calls on a warm container reuse the loaded model instead of reloading the weights.
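The load-once idea does not depend on Modal and can be sketched in plain Python. The snippet below is a minimal illustration, assuming a hypothetical load_model function as a stand-in for an expensive weight load; it is not a Modal API. On Modal, the same role is played by a method decorated with @modal.enter().

```python
import functools
import time

LOAD_CALLS = 0  # counts how many times the expensive load actually runs


@functools.lru_cache(maxsize=1)
def load_model() -> dict:
    """Hypothetical stand-in for an expensive model load (not a Modal API)."""
    global LOAD_CALLS
    LOAD_CALLS += 1
    time.sleep(0.05)  # simulate reading weights from disk
    return {"name": "demo-model"}


def generate(prompt: str) -> str:
    # Every call reuses the cached model; only the first call pays the load cost.
    model = load_model()
    return f"{model['name']}: {prompt}"


first = generate("hello")
second = generate("world")
```

After both calls, LOAD_CALLS is 1: the second request skipped the load entirely, which is the behaviour a warm container gives you.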

🚧

Image build failed

Check the Builds tab in the Modal dashboard for the error. Common cause: pip package needs a system library. Add apt_install() before pip_install(). Examples: libgl1 (OpenCV), libsndfile1 (audio), ffmpeg (video processing). Use image.run_commands("...") for complex setup.
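When it is unclear which system package is missing, a small check run inside the container can narrow it down. The sketch below uses only the Python standard library; the helper name missing_system_deps and the library short names in the demo are illustrative assumptions, not part of Modal.

```python
import ctypes.util
import shutil


def missing_system_deps(libraries: list[str], executables: list[str]) -> list[str]:
    """Return the shared libraries and executables that cannot be found."""
    missing = []
    for name in libraries:
        # find_library("GL") searches for libGL and returns None if it is absent.
        if ctypes.util.find_library(name) is None:
            missing.append(f"lib{name}")
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(exe) is None:
            missing.append(exe)
    return missing


if __name__ == "__main__":
    # Short names are illustrative; adjust them to your own dependencies.
    print(missing_system_deps(["GL", "sndfile"], ["ffmpeg"]))
```

Anything the helper reports is a candidate for .apt_install() in the image definition.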

Function times out

Default timeout is 300 s (5 min). Long ML jobs need an explicit timeout: @app.function(timeout=3600). The maximum is 86400 s (24 h). Checkpoint to a Modal Volume periodically inside long-running functions, because a timed-out function loses everything it held in memory. To monitor progress, print progress lines from the function and follow its logs in the Modal dashboard.
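Checkpointing only helps if the next run picks up where the last one stopped. The sketch below shows one way to do that with the standard library, assuming progress fits in a small JSON file; the function names and the stop_after parameter (which simulates a timeout) are hypothetical. On Modal, the checkpoint path would sit under a Volume mount and each save would be followed by vol.commit().

```python
import json
import os
import tempfile
from typing import Optional


def save_checkpoint(path: str, state: dict) -> None:
    """Write state via a temp file so an interrupted write cannot corrupt it."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # rename over the old checkpoint


def load_checkpoint(path: str) -> dict:
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def run(path: str, total_steps: int, stop_after: Optional[int] = None) -> dict:
    """Resume from the checkpoint; stop_after simulates a timeout for the demo."""
    state = load_checkpoint(path)
    for step in range(state["next_step"], total_steps):
        if stop_after is not None and step >= stop_after:
            break  # simulated timeout: earlier progress is already on disk
        state["total"] += step  # placeholder for real work
        state["next_step"] = step + 1
        save_checkpoint(path, state)
    return state


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "progress.json")
    partial = run(ckpt, total_steps=10, stop_after=4)  # interrupted run
    resumed = run(ckpt, total_steps=10)  # second run continues at step 4
```

The second run does not repeat steps 0 to 3, so a retry after a timeout costs only the work since the last checkpoint.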

💾

Volume writes not persisting

Call vol.commit() inside the function before it returns. Uncommitted writes can be discarded when the container exits. For long training jobs, commit every N steps. To upload files from your local machine rather than from inside a function, use: with vol.batch_upload() as batch: batch.put_file(...).
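The "commit every N steps" cadence can be sketched without Modal by passing the commit action in as a callable. This is a minimal illustration; run_steps is a hypothetical helper, and in real Modal code the commit argument would be vol.commit.

```python
from typing import Callable


def run_steps(total_steps: int, commit_every: int, commit: Callable[[], None]) -> int:
    """Run total_steps units of work, committing every commit_every steps.

    Returns how many times commit() was called.
    """
    commits = 0
    for step in range(1, total_steps + 1):
        # ... one unit of work that writes files under the Volume mount ...
        if step % commit_every == 0:
            commit()
            commits += 1
    if total_steps % commit_every != 0:
        commit()  # final commit so the trailing steps are not lost
        commits += 1
    return commits


calls = []
n = run_steps(total_steps=25, commit_every=10, commit=lambda: calls.append("commit"))
```

With 25 steps and a cadence of 10, commit runs after steps 10 and 20 and once more at the end, so the last five steps are also persisted.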

🔐

Secret not accessible / os.environ missing

Modal Secrets are not available in local code — only inside functions running on Modal. Add to decorator: @app.function(secrets=[modal.Secret.from_name("MY_SECRET")]). Inside the function: os.environ["MY_VAR"]. To create: modal secret create MY_SECRET MY_VAR=value.
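A bare os.environ["MY_VAR"] fails with a KeyError that says little about the cause. A small guard at the top of the function gives a clearer message. The sketch below uses only the standard library; require_env is a hypothetical helper, and MY_VAR is the placeholder name used above.

```python
import os


def require_env(*names: str) -> dict[str, str]:
    """Return the requested variables, or raise naming every missing one."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
            + ". Is the secret attached to the function decorator?"
        )
    return {name: os.environ[name] for name in names}


# Demo: set the variable by hand to stand in for an attached secret.
os.environ["MY_VAR"] = "value"
values = require_env("MY_VAR")
```

Calling the guard in local code, where the secret is not injected, raises immediately and points at the likely cause.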

💳

GPU quota exceeded or billing

Free tier: $30 credit at signup. Check usage at modal.com/billing. If GPU requests fail, the account may have reached its concurrency limit; request an increase at modal.com/support. Approximate GPU pricing: A10G $0.30/hr, A100 $2.50/hr, H100 $4.50/hr. For multi-GPU, add a count to the gpu argument, for example gpu="A10G:2".

Modal patterns quick reference

Warm-start GPU function with model pre-loading

import modal

app = modal.App("my-app")

image = (
    modal.Image.debian_slim()
    .apt_install("libgl1", "ffmpeg")  # system deps FIRST
    .pip_install("torch", "transformers")
)

@app.cls(gpu="A10G", image=image, keep_warm=1, timeout=600)
class Model:
    @modal.enter()
    def load(self):
        # runs once when container starts — model stays warm
        from transformers import pipeline
        self.pipe = pipeline("text-generation", model="gpt2")

    @modal.method()
    def generate(self, prompt: str) -> str:
        return self.pipe(prompt)[0]["generated_text"]

Volume with explicit commit

vol = modal.Volume.from_name("my-volume", create_if_missing=True)

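# image is the one defined in the previous snippet (it includes torch)
@app.function(image=image, volumes={"/data": vol}, timeout=3600)
def train():
    import torch
    # MyModel and train_epoch are placeholders for your own model and training step
    model = MyModel()
    for epoch in range(100):
        train_epoch(model)
        # checkpoint after every 10th epoch, including the last one
        if (epoch + 1) % 10 == 0:
            torch.save(model.state_dict(), f"/data/checkpoint_{epoch + 1}.pt")
            vol.commit()  # REQUIRED: writes lost without this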

Secrets usage

# Create secret: modal secret create openai-secret OPENAI_API_KEY=sk-...

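@app.function(
    image=modal.Image.debian_slim().pip_install("openai"),  # openai must be in the image
    secrets=[modal.Secret.from_name("openai-secret")],
)
def call_openai():
    import os
    import openai
    # the secret's key is exposed as an environment variable inside the container
    client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    return client.chat.completions.create(...)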

Modal GPU options

| GPU | VRAM | Price/hr (approx.) | Use case |
| --- | --- | --- | --- |
| T4 | 16 GB | ~$0.06 | Light inference, dev/test |
| L4 | 24 GB | ~$0.20 | Mid-tier inference, fine-tuning small models |
| A10G | 24 GB | ~$0.30 | General inference, fine-tuning 7B-13B |
| A100 (40GB) | 40 GB | ~$2.50 | Training, large model inference |
| A100 (80GB) | 80 GB | ~$3.50 | Large models (Llama 70B, fine-tune 30B+) |
| H100 | 80 GB | ~$4.50 | Fastest training, cutting-edge models |
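To turn the hourly prices into a budget estimate, a few lines of arithmetic are enough. The sketch below copies the approximate prices from the table above; the dictionary keys and function names are illustrative, and the figures should be checked against current pricing before you rely on them.

```python
# Approximate hourly prices copied from the table above (USD per GPU-hour).
PRICE_PER_HOUR = {
    "T4": 0.06,
    "L4": 0.20,
    "A10G": 0.30,
    "A100-40GB": 2.50,
    "A100-80GB": 3.50,
    "H100": 4.50,
}


def estimate_cost(gpu: str, hours: float, count: int = 1) -> float:
    """Estimated cost in USD for running `count` GPUs for `hours` hours."""
    return PRICE_PER_HOUR[gpu] * hours * count


def hours_for_budget(gpu: str, budget: float, count: int = 1) -> float:
    """How many hours `budget` USD buys on `count` GPUs of the given type."""
    return budget / (PRICE_PER_HOUR[gpu] * count)


a10g_two_hours = estimate_cost("A10G", hours=2)
free_credit_hours = hours_for_budget("A10G", budget=30.0)
```

At the listed A10G rate, the $30 signup credit covers roughly 100 GPU-hours; the same credit on an H100 lasts under 7 hours.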

Step-by-step fix

  1. Check live Modal status

    Visit prismix.dev/service/modal. Modal tracks scheduling, builds, and dashboard independently.

  2. Fix cold starts

    Add keep_warm=1 to your function/class decorator. Move model loading to a @modal.enter() method inside a @app.cls class. This loads the model once when the container starts, not on every call.

  3. Fix image build failures

    Open the Builds tab in the Modal dashboard. Read the error message. Add system dependencies with .apt_install("pkg-name") BEFORE .pip_install("...") in your image definition. Use .run_commands("bash -c ...") for arbitrary build steps.

  4. Fix function timeout

    Add timeout=3600 (or higher) to your function decorator. Maximum is 86400 (24h). For long training jobs: add periodic vol.commit() calls to checkpoint progress to a Volume.

  5. Fix volume writes / secrets

    Volume writes: call vol.commit() before the function returns. Secrets: add secrets=[modal.Secret.from_name("MY_SECRET")] to the decorator, then access inside the function via os.environ["MY_VAR"].

Frequently asked questions

Why is Modal not working?

Common Modal issues: (1) slow cold starts (GPU functions take 5-30 s; use keep_warm=1 and @modal.enter() for model loading); (2) failed image builds (apt_install system deps before pip_install, and check the Builds tab); (3) function timeouts (the default is 300 s; add timeout=3600 or higher); (4) lost volume writes (call vol.commit() before the function returns); (5) missing secrets (add secrets=[modal.Secret.from_name("NAME")] to the decorator); (6) an outage (check prismix.dev/service/modal).

Is Modal down right now?

Check prismix.dev/service/modal for live Modal status, or Modal's own status page at status.modal.com. Modal may have partial outages that affect scheduling, image builds, or storage independently.

Modal cold start slow — how to make it faster?

Modal cold start fix: (1) add keep_warm=1 to keep one container warm at all times; (2) move expensive imports and model loading into an @modal.enter() method on an @app.cls() class, so they run once per container start rather than on every call; (3) try memory snapshots (enable_memory_snapshot=True) for large Python dependency trees; (4) for low-latency tasks, consider a smaller GPU (a T4 may start faster than an A100).
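Before moving imports around, it helps to know which ones are actually slow. The sketch below times each import with the standard library; time_imports is a hypothetical helper, and the module names in the demo are placeholders chosen so the snippet runs anywhere.

```python
import importlib
import time


def time_imports(module_names: list[str]) -> dict[str, float]:
    """Import each module and return how many seconds each import took."""
    timings = {}
    for name in module_names:
        start = time.perf_counter()
        importlib.import_module(name)
        timings[name] = time.perf_counter() - start
    return timings


# Standard-library modules keep the demo portable; inside a container you
# would list your heavy dependencies instead. A module that is already
# imported reports a time close to zero.
timings = time_imports(["json", "decimal"])
slowest_first = sorted(timings, key=timings.get, reverse=True)
```

For a finer breakdown, running the interpreter with python -X importtime prints the cumulative import time of every module it loads.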

Modal volume writes not persisting — why?

Modal Volumes require explicit commit. Call vol.commit() inside the function before it exits. Uncommitted writes are discarded. For long training loops: commit every N steps. Pattern: torch.save(model.state_dict(), "/data/checkpoint.pt"); vol.commit().

Modal image build failing — how to fix?

Check the Builds tab in the Modal dashboard for the full error log. Most common cause: pip package needs a system library not in the base image. Fix: add .apt_install("package-name") before .pip_install("...") in your image chain. Common system deps: libgl1 (OpenCV/cv2), libsndfile1 (torchaudio), ffmpeg (video), libssl-dev (cryptography).
