How fast is 10 tokens per second really?
Mirrored from Simon Willison for archival readability. Support the source by reading on the original site.
Neat little HTML app by Mike Veerman (source code here) which simulates LLM token output speeds from 5/second to 800/second. Useful if you see a model advertised as "30 tokens/second" and want to get a feel for what that actually looks like.
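For a rough sense of the arithmetic, here is a minimal Python sketch of the same idea. It is not the app's code: the function names `seconds_for` and `stream` are made up for illustration, and it assumes a perfectly steady rate, which real model output rarely has.

```python
import time


def seconds_for(num_tokens, tokens_per_second):
    """Time in seconds to emit num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def stream(tokens, tokens_per_second, sleep=time.sleep, emit=None):
    """Emit tokens one at a time, pausing 1/rate seconds after each.

    `sleep` and `emit` are injectable so the pacing can be checked
    without actually waiting.
    """
    if emit is None:
        # Flush after every token so the pacing is visible in a terminal.
        def emit(token):
            print(token, end="", flush=True)

    delay = seconds_for(1, tokens_per_second)
    for token in tokens:
        emit(token)
        sleep(delay)
```

At an advertised 30 tokens/second, `seconds_for(300, 30)` works out to 10 seconds for a 300-token reply. Calling `stream(text.split(" "), 10)` on some sample text gives a crude terminal version of the effect, with the caveat that whitespace-separated words are only a stand-in for real tokens.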
Via Hacker News
Tags: ai, generative-ai, llms