r/LocalLLaMA · 2 min read

RAG on Snapdragon X2 Laptop, 200K documents.

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Qualcomm recently released the new Snapdragon X2 laptop chipset. I immediately ordered one: ASUS Zenbook A16 16" 3K OLED Touchscreen Laptop, Snapdragon X2 Elite Extreme (2026)

A few things I really like about this machine:

  1. Extremely light.
    I recently carried it one-handed across Hong Kong Airport, from customs all the way to Gate G46, with programs still running before boarding. It felt like holding a big cell phone.

  2. Very portable power adaptor.
    The adaptor is dramatically lighter than the heavy power brick that RTX laptops require. Even so, its power draw still exceeds the in-flight charging limit on United.

  3. Strong NPU performance.
    When the NPU is properly utilized, performance is good: embedding/indexing speed reaches roughly 50% of an RTX 5060 laptop's, in a much lighter and quieter form factor.

The attached video demonstrates VecML's AI-PC software running on this laptop.

Highlights:

• Massive document collection: ~200,000 files being indexed (~100,000 completed in this run)

• Low-token retrieval: only ~1200 retrieval tokens used in this experiment

• Low-memory RAG: most data offloaded to disk with only a 128-shard active buffer

• Fast and accurate RAG performance on-device
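
For readers wondering what a "128-shard active buffer" could look like in practice, here is a minimal Python sketch of the general idea: index shards live on disk, and at most 128 of them are held in RAM at once, with the least recently used shard evicted first. The post does not say how VecML implements this, so the `ShardBuffer` class, the pickle file layout, and the LRU eviction policy are all assumptions made purely for illustration.

```python
from collections import OrderedDict
from pathlib import Path
import pickle
import tempfile


class ShardBuffer:
    """Hypothetical disk-backed shard store with a bounded in-RAM buffer."""

    def __init__(self, shard_dir, capacity=128):
        self.shard_dir = Path(shard_dir)
        self.capacity = capacity
        self._active = OrderedDict()  # shard_id -> data, least recent first
        self.disk_loads = 0           # number of reads that went to disk

    def _path(self, shard_id):
        return self.shard_dir / f"shard_{shard_id}.pkl"

    def write(self, shard_id, data):
        # Persist the shard to disk and drop any stale in-memory copy.
        with open(self._path(shard_id), "wb") as f:
            pickle.dump(data, f)
        self._active.pop(shard_id, None)

    def get(self, shard_id):
        if shard_id in self._active:
            self._active.move_to_end(shard_id)  # mark as most recently used
            return self._active[shard_id]
        with open(self._path(shard_id), "rb") as f:
            data = pickle.load(f)
        self.disk_loads += 1
        self._active[shard_id] = data
        if len(self._active) > self.capacity:
            self._active.popitem(last=False)    # evict least recently used
        return data

    def __len__(self):
        return len(self._active)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        buf = ShardBuffer(tmp, capacity=128)
        for i in range(300):
            buf.write(i, [float(i)] * 4)  # toy "vectors"
        for i in range(300):
            buf.get(i)
        resident = len(buf)
        buf.get(299)                      # still buffered: no disk read
        loads_after_hit = buf.disk_loads
        buf.get(0)                        # evicted earlier: read from disk
        return resident, loads_after_hit, buf.disk_loads
```

With this policy, memory use is bounded by the buffer size rather than by the collection size, at the cost of a disk read whenever a query touches a shard that is not currently buffered.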

Behind the scenes, VecML's all-in-one AI database plays a key role.

Enterprise-scale AI systems typically require multiple databases working together:
• Vector database
• Graph database
• Relational database
• Key-value store
• Search database
• Document database

We developed an in-house AI database platform that integrates the core functionality of all six systems into a unified architecture for enterprise AI and agent systems.

This enables joint optimization across indexing, retrieval, graph traversal, storage, and memory management, helping achieve low-token, low-memory, fast, and accurate AI systems on both cloud and AI-PC deployments.
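
As a rough illustration of what "low-token" retrieval means at the prompt-building stage, the sketch below packs the highest-scoring retrieved chunks into a fixed token budget (1200 here, matching the figure quoted for this experiment). It is a generic greedy heuristic, not VecML's method: the `pack_context` function, the `(score, text)` chunk format, and the whitespace-based token count are assumptions for illustration, and a real system would count tokens with the model's own tokenizer.

```python
def pack_context(chunks, budget=1200, count_tokens=None):
    """Greedily select retrieved chunks that fit within a token budget.

    chunks: list of (score, text) pairs; a higher score means more relevant.
    Returns (selected_texts, tokens_used).
    """
    if count_tokens is None:
        # Crude stand-in for a tokenizer: count whitespace-separated words.
        count_tokens = lambda text: len(text.split())

    selected, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost > budget:
            continue  # too large for the remaining budget; try smaller chunks
        selected.append(text)
        used += cost
    return selected, used
```

The budget caps how much retrieved text reaches the LLM regardless of how many documents are indexed, which is why retrieval quality matters more as the budget shrinks.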

The demo shown here runs on a Snapdragon X2 Windows laptop. Our macOS AI-PC software is now open for controlled testing.

submitted by /u/DueKitchen3102