Modal cold start taking too long — how to fix?

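Three fixes for Modal cold starts: (1) set keep_warm=1 in the function decorator to keep one container warm, e.g. @app.function(gpu='A10G', keep_warm=1) (newer Modal releases name this parameter min_containers); (2) download weights at build time with @modal.build() and load the model in a method decorated with @modal.enter(), so each container loads it once and reuses it across calls; (3) enable memory snapshots with enable_memory_snapshot=True on the function or class, so new containers restore from a snapshot taken after the heavy Python imports instead of repeating them.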

Modal function timeout — how to increase it?

Modal's default timeout is 300 seconds (5 minutes). Increase it with the timeout parameter: @app.function(timeout=3600). The maximum is 86400 seconds (24 hours). If your function is a long-running training job, also checkpoint to a Modal Volume and call vol.commit() periodically, because a function that times out loses any output it has not persisted.

Modal Not Working?

Cold start too slow, image build failing, function timeout, volume writes not persisting, secrets missing, or GPU quota exceeded? Check live status and fix it fast.

What's wrong? Diagnose fast

🥶

Cold start taking 10-30+ seconds

A GPU cold start is the sum of image pull, container boot, and Python imports. Set keep_warm=1 (min_containers in newer Modal releases) to keep one container warm. Load models outside the function body, at module level or in a @modal.enter() method, so each container loads the model once and reuses it across calls instead of re-downloading weights.
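
To see how much of a cold start comes from Python imports, one option is to time them in a fresh interpreter. The sketch below is a generic diagnostic, not a Modal API; the helper name time_import is made up for illustration, and the module names in the loop are stand-ins for whatever heavy packages your function imports.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken by importing module_name.

    Only the first import in a process does real work, so run this in a
    fresh interpreter to approximate what a cold container pays.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in module names; replace with your heavy imports (e.g. "torch").
    for name in ("json", "decimal"):
        print(f"{name}: {time_import(name):.3f}s")
```

If one import dominates, that is the work worth moving into a @modal.enter() method or behind a memory snapshot.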

🚧

Image build failed

Check the Builds tab in the Modal dashboard for the error. Common cause: pip package needs a system library. Add apt_install() before pip_install(). Examples: libgl1 (OpenCV), libsndfile1 (audio), ffmpeg (video processing). Use image.run_commands("...") for complex setup.

⏱

Function times out

The default timeout is 300 s (5 min). Long ML jobs need an explicit timeout: @app.function(timeout=3600). The maximum is 86400 s (24 h). Checkpoint to a Modal Volume periodically inside long-running functions, because a timed-out function loses everything it held in memory. Follow progress through the app's logs in the Modal dashboard.
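
A small helper can turn a runtime estimate into a timeout value that stays inside the limits quoted above. This is a sketch under assumptions: pick_timeout and the 1.5x safety factor are illustrative choices, not part of Modal, and the floor at the 300 s default is a convenience rather than a platform requirement.

```python
import math

MODAL_DEFAULT_TIMEOUT_S = 300   # default quoted above
MODAL_MAX_TIMEOUT_S = 86_400    # 24 h maximum quoted above


def pick_timeout(estimated_runtime_s: float, safety_factor: float = 1.5) -> int:
    """Return a timeout in seconds with headroom, clamped to [300, 86400]."""
    if estimated_runtime_s <= 0:
        raise ValueError("estimated_runtime_s must be positive")
    # Pad the estimate so a slightly slow run does not get killed.
    padded = math.ceil(estimated_runtime_s * safety_factor)
    # Never go below the default or above the documented maximum.
    return max(MODAL_DEFAULT_TIMEOUT_S, min(padded, MODAL_MAX_TIMEOUT_S))
```

The result can be passed straight to the decorator, e.g. @app.function(timeout=pick_timeout(2400)). A job whose padded estimate exceeds the maximum needs checkpointing and a resume step rather than a larger timeout.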

💾

Volume writes not persisting

Call vol.commit() inside the function before it returns; writes that are never committed can be lost when the container exits. For long training jobs, commit every N steps. To upload files from your local machine rather than from inside a function, use the pattern: with vol.batch_upload() as batch: batch.put_file(...).

🔐

Secret not accessible / os.environ missing

Modal Secrets are not available in local code — only inside functions running on Modal. Add to decorator: @app.function(secrets=[modal.Secret.from_name("MY_SECRET")]). Inside the function: os.environ["MY_VAR"]. To create: modal secret create MY_SECRET MY_VAR=value.
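
A missing secret usually surfaces as a bare KeyError deep inside the function. One way to get a clearer message is a small lookup helper; require_env below is a hypothetical name, uses only the standard library, and assumes the secret's keys arrive as environment variables as described above.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise with a hint."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set. "
            "Is the Secret attached via secrets=[...] on the decorator, "
            "and is this code running on Modal rather than locally?"
        )
    return value
```

Inside a function you would call require_env("MY_VAR") in place of os.environ["MY_VAR"].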

💳

GPU quota exceeded or billing

Free tier: $30 credit at signup. Check usage at modal.com/billing. If GPU requests fail, the account may have reached its concurrency limit; request an increase at modal.com/support. GPU pricing (approximate; rates change, so confirm on Modal's pricing page): A10G $0.30/hr, A100 $2.50/hr, H100 $4.50/hr. For multi-GPU, put the count in the gpu argument, e.g. gpu="A10G:2".

Modal patterns quick reference

Warm-start GPU function with model pre-loading

import modal

app = modal.App("my-app")

image = (
    modal.Image.debian_slim()
    .apt_install("libgl1", "ffmpeg")  # system deps FIRST
    .pip_install("torch", "transformers")
)

# keep_warm is named min_containers in newer Modal releases
@app.cls(gpu="A10G", image=image, keep_warm=1, timeout=600)
class Model:
    @modal.enter()
    def load(self):
        # runs once when container starts — model stays warm
        from transformers import pipeline
        self.pipe = pipeline("text-generation", model="gpt2")

    @modal.method()
    def generate(self, prompt: str) -> str:
        return self.pipe(prompt)[0]["generated_text"]

Volume with explicit commit

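# Reuses `app` and `image` (which installs torch) from the snippet above.
vol = modal.Volume.from_name("my-volume", create_if_missing=True)

@app.function(image=image, volumes={"/data": vol}, timeout=3600)
def train():
    import torch
    # MyModel and train_epoch are placeholders for your own model and training step
    model = MyModel()
    for epoch in range(100):
        train_epoch(model)
        # checkpoint after every 10th epoch, so the final epoch is saved too
        if (epoch + 1) % 10 == 0:
            torch.save(model.state_dict(), f"/data/checkpoint_{epoch}.pt")
            vol.commit()  # REQUIRED: persists the checkpoint to the Volume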
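
Checkpoints only help after a timeout if the next run picks up the newest one. The sketch below finds it by epoch number; latest_checkpoint is a hypothetical helper, and it assumes the checkpoint_<epoch>.pt naming used in the snippet above.

```python
import re
from pathlib import Path
from typing import Optional

_CHECKPOINT_RE = re.compile(r"checkpoint_(\d+)\.pt$")


def latest_checkpoint(directory: str) -> Optional[Path]:
    """Return the checkpoint_<epoch>.pt file with the highest epoch, or None."""
    best_epoch, best_path = -1, None
    for path in Path(directory).glob("checkpoint_*.pt"):
        match = _CHECKPOINT_RE.search(path.name)
        # Compare epochs as integers so that 10 sorts after 9.
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path
```

At the start of train(), calling latest_checkpoint("/data") and loading the returned file, if there is one, lets a rerun continue instead of starting from epoch 0.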

Secrets usage

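# Create the secret first: modal secret create openai-secret OPENAI_API_KEY=sk-...

# The default image does not ship the openai package, so install it explicitly.
# `app` is the modal.App defined in the first snippet.
openai_image = modal.Image.debian_slim().pip_install("openai")

@app.function(image=openai_image, secrets=[modal.Secret.from_name("openai-secret")])
def call_openai():
    import os
    import openai
    # The secret's keys are exposed as environment variables inside the container
    client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    return client.chat.completions.create(...)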

Modal GPU options

GPU	VRAM	Price/hr	Use case
T4	16 GB	~$0.06	Light inference, dev/test
L4	24 GB	~$0.20	Mid-tier inference, fine-tuning small models
A10G	24 GB	~$0.30	General inference, fine-tuning 7B-13B
A100 (40GB)	40 GB	~$2.50	Training, large model inference
A100 (80GB)	80 GB	~$3.50	Large models (Llama 70B, fine-tune 30B+)
H100	80 GB	~$4.50	Fastest training, cutting-edge models
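
The hourly figures translate into a budget with simple arithmetic. The sketch below takes the rate as a parameter rather than hard-coding it, since the prices in the table are approximate; estimate_cost and hours_for_budget are illustrative names, not Modal functions.

```python
def estimate_cost(rate_per_hour: float, hours: float, gpu_count: int = 1) -> float:
    """Return the estimated GPU spend in USD for a run."""
    if rate_per_hour < 0 or hours < 0 or gpu_count < 1:
        raise ValueError("rate and hours must be non-negative, gpu_count >= 1")
    return rate_per_hour * hours * gpu_count


def hours_for_budget(rate_per_hour: float, budget_usd: float, gpu_count: int = 1) -> float:
    """Return how many hours a budget covers at the given hourly rate."""
    if rate_per_hour <= 0 or gpu_count < 1:
        raise ValueError("rate must be positive, gpu_count >= 1")
    return budget_usd / (rate_per_hour * gpu_count)


if __name__ == "__main__":
    a10g_rate = 0.30  # approximate A10G figure from the table above
    print(f"10 h on one A10G: ${estimate_cost(a10g_rate, 10):.2f}")
    print(f"Hours covered by a $30 credit: {hours_for_budget(a10g_rate, 30):.0f}")
```

Substitute the current rate for your GPU before relying on the result; a warm container held by keep_warm=1 is billed while idle, so count its hours too.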

Step-by-step fix

1. Check live Modal status

Visit prismix.dev/service/modal. Modal tracks scheduling, builds, and dashboard independently.

2. Fix cold starts

Add keep_warm=1 to your function/class decorator. Move model loading to a @modal.enter() method inside a @app.cls class. This loads the model once when the container starts, not on every call.

3. Fix image build failures

Open the Builds tab in the Modal dashboard. Read the error message. Add system dependencies with .apt_install("pkg-name") BEFORE .pip_install("...") in your image definition. Use .run_commands("bash -c ...") for arbitrary build steps.

4. Fix function timeout

Add timeout=3600 (or higher) to your function decorator. Maximum is 86400 (24h). For long training jobs: add periodic vol.commit() calls to checkpoint progress to a Volume.

5. Fix volume writes / secrets

Volume writes: call vol.commit() before the function returns. Secrets: add secrets=[modal.Secret.from_name("MY_SECRET")] to the decorator, then access inside the function via os.environ["MY_VAR"].
