LLM Phone Home: Reliable apps that can deliver inference from a local backend
Mirrored from r/LocalLLaMA.
Hello all,
I'm wondering what suggestions there are for an iOS app that can connect to an OpenAI-compatible endpoint. I am using 3 Sparks, which works GREAT for that specific use, BUT there is no MCP, no web search, etc. I want to show people that a local model with web search on your phone is very impressive, but I can't find an app that can mimic OWUI/LMS/etc.
Texting Hermes works, but I was hoping to find a solution that doesn't rely on a slow agent, just direct requests to the local server.
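For context, what such an app would need to send is just a standard OpenAI-style chat completion request with a `tools` array; anything beyond that (actually executing the search, feeding results back) is on the client. A minimal sketch below, where the server address, model name, and `web_search` tool definition are all placeholders for whatever your local server (LM Studio, etc.) actually exposes:

```python
import json
from urllib import request

# Hypothetical local server address; adjust host/port to your setup
# (LM Studio's local server defaults to port 1234, for example).
BASE_URL = "http://192.168.1.10:1234/v1/chat/completions"

def build_payload(user_message: str) -> dict:
    """Build an OpenAI-compatible chat request that advertises a
    web-search tool for the model to call."""
    return {
        "model": "local-model",  # placeholder; many local servers ignore this
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "web_search",  # hypothetical tool name
                    "description": "Search the web and return result snippets.",
                    "parameters": {
                        "type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"],
                    },
                },
            }
        ],
    }

def send(payload: dict) -> bytes:
    """POST the payload to the local endpoint (requires the server running)."""
    req = request.Request(
        BASE_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read()
```

The point is that there's no agent framework needed on the phone: the app only has to POST this JSON, notice a `tool_calls` entry in the response, run the search, and append the result as a `tool` message in a follow-up request.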
So far, I tried:
Apollo, Locally AI, Noema, and 3 Sparks. Previously I went through other apps that run models in situ (on the iPhone itself), but they don't support remote endpoints. Noema seemed promising, but DeepSeek V4 Flash served from my Mac Studio never makes it through a request (it works great with 3 Sparks, but that app has no web search or MCP capability).