llama.cpp releases

b9260

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

opencl: refactor backend initialization (#23318)

  • opencl: refactor initialization

  • opencl: refactor GPU identification

  • opencl: rename for consistency

  • opencl: cache global mem size in dev_ctx

  • opencl: adjust log level

  • opencl: load argsort and flash_attn kernels in supports_op

      • the argsort kernel must be built in supports_op so that the max workgroups can be queried

      • the flash_attn kernel has many variants, so they are loaded only when needed
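
The caching and on-demand loading described in the items above can be illustrated with a small, self-contained sketch. It does not use OpenCL or any llama.cpp code: DevCtx, FakeKernel, Op, get_or_build and this supports_op are hypothetical names invented for illustration, the 8 GiB memory size and the workgroup limit of 256 are made-up values, and keying flash_attn variants by a single size parameter is an assumption. The real backend may be organized quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a compiled kernel; not a llama.cpp or OpenCL type.
struct FakeKernel {
    std::string name;
    std::size_t max_workgroup_size;
};

// Hypothetical device context, loosely inspired by the "dev_ctx" named in the notes.
struct DevCtx {
    // Pretend value that a driver query would return (assumption: 8 GiB).
    std::uint64_t driver_global_mem = 8ull << 30;

    std::uint64_t cached_global_mem = 0;
    bool has_cached_global_mem = false;

    std::map<std::string, FakeKernel> kernels;  // kernels built so far
    int driver_queries = 0;                     // counts simulated driver calls
    int kernel_builds = 0;                      // counts simulated kernel builds

    // Ask the "driver" once, then serve the cached value on later calls.
    std::uint64_t global_mem_size() {
        if (!has_cached_global_mem) {
            ++driver_queries;
            cached_global_mem = driver_global_mem;
            has_cached_global_mem = true;
        }
        return cached_global_mem;
    }

    // Build a kernel the first time it is requested and reuse it afterwards.
    const FakeKernel & get_or_build(const std::string & name, std::size_t max_wg) {
        std::map<std::string, FakeKernel>::iterator it = kernels.find(name);
        if (it == kernels.end()) {
            ++kernel_builds;
            it = kernels.emplace(name, FakeKernel{name, max_wg}).first;
        }
        return it->second;
    }
};

enum class Op { Argsort, FlashAttn, Add };

// Sketch of a capability check that loads kernels on demand.
bool supports_op(DevCtx & ctx, Op op, std::size_t size) {
    switch (op) {
        case Op::Argsort: {
            // The kernel must exist before its workgroup limit can be read.
            const FakeKernel & k = ctx.get_or_build("argsort", 256);
            return size <= k.max_workgroup_size;
        }
        case Op::FlashAttn: {
            // Only the variant actually asked about is built.
            ctx.get_or_build("flash_attn_" + std::to_string(size), 256);
            return true;
        }
        default:
            // Other ops need no kernel build in this sketch.
            return true;
    }
}
```

In this sketch, repeated calls to global_mem_size() trigger a single simulated driver query, and calling supports_op for one flash_attn variant leaves all other variants unbuilt, which is the kind of startup saving the notes appear to be after.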
