Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124


For many people, the most attractive reason to use a local chatbot will be the amount of money they can save. Currently, running a local model on your iPhone involves, at most, a one-time purchase of $5.
Compare that to a subscription from one of the big AI labs. For example, if you want to use ChatGPT without ads, you’ll need to spend at least $20 per month on OpenAI’s Plus plan. You could get away with the more affordable Go tier, or even stick with the free offering if you only plan to use ChatGPT sporadically, but then you also need to consider the rate limits. Similarly, Google’s AI plans start at $8 per month, but you can spend up to $100 each month on its Ultra subscription. When you run an AI chatbot locally on your iPhone, you can use it as much as you want. If you’re a power user, by contrast, you are very likely to hit the daily usage limits of ChatGPT, Claude or Gemini.
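To make the comparison concrete, here is a minimal sketch of the arithmetic using only the prices quoted above. The plan labels, the function name and the 12-month horizon are illustrative assumptions, and real prices may differ by region or change over time.

```python
# Prices as quoted in the article; the labels are illustrative only.
LOCAL_APP_ONE_TIME = 5.0  # maximum one-time purchase for a local chatbot app

MONTHLY_PLANS = {
    "ChatGPT Plus": 20.0,
    "Google AI (entry plan)": 8.0,
    "Google AI Ultra": 100.0,
}


def savings_over(months: int, monthly_fee: float, one_time_cost: float) -> float:
    """Return how much less the one-time purchase costs than a
    subscription held for the given number of months."""
    if months < 0:
        raise ValueError("months must be non-negative")
    return monthly_fee * months - one_time_cost


# Assumed horizon: one year of continuous subscription.
yearly_savings = {
    name: savings_over(12, fee, LOCAL_APP_ONE_TIME)
    for name, fee in MONTHLY_PLANS.items()
}

for name, saved in yearly_savings.items():
    print(f"{name}: ${saved:,.2f} saved over 12 months")
```

Because every plan listed costs more per month than the one-time $5 purchase, the local option is cheaper from the first month onward under these assumptions.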
Local chatbots offer another advantage: privacy. None of the options I’ll recommend in this article require you to log in or share your data with the labs that trained the models you want to run. The apps’ developers also say they don’t collect usage information. With proprietary models, you must accept that your prompts and any information, images, audio or video you share may be used to train future models. There are rare exceptions. Proton’s Lumo chatbot, for example, is completely private by default. For most chatbots, including ChatGPT, you’ll have to do some digging to opt out of sharing your data for model training.
Finally, you can’t use ChatGPT, Claude or Gemini without an internet connection, whereas local chatbots work even when you’re offline.
However, there are a few drawbacks worth noting. As capable as the latest open-weight models are, they are not as sophisticated as the latest proprietary models from Anthropic, OpenAI and other for-profit AI labs. For example, closed models, thanks to the powerful cloud hardware they run on, typically offer longer context windows that allow them to reference information from past chats. In practice, this translates to chatbots that feel more intelligent and conversational, as you won’t need to repeat yourself often, if ever.
What’s more, both ChatGPT and Claude offer robust “memory” features that allow them to tailor their responses to each user. My version of ChatGPT knows that my main ax is a 1993 Fender Stratocaster and will often mention that fact when I ask it guitar-related questions. For some people, this is something that can make using a chatbot addictive, as the system seems to know them.
If you need a chatbot that can provide timely information, a local model probably won’t cut it. All LLMs have a knowledge cutoff: the point in time beyond which their training data does not extend. GPT-5.5 Instant, for example, can’t refer to events after August 2024. For Llama 3.2, that date is December 2023.
To answer questions beyond its knowledge cutoff, a model will ideally turn to a robust web search tool. Proprietary models offer two advantages when it comes to timeliness. First, the current pace at which companies like OpenAI are releasing new models means that these systems inherently incorporate more recent data, simply because they are newer. Second, since you need an internet connection to use ChatGPT, Claude or Gemini, these chatbots can easily search the web to supplement their responses. Open-weight models can use web search tools, but not without third-party extensions.
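As a rough illustration of why a knowledge cutoff matters, the sketch below checks whether a question about a given date falls outside what a model could have seen in training. The cutoff months come from the article; treating each cutoff as the last day of that month, and the function name `needs_web_search`, are assumptions made for the example and do not describe how any particular app works.

```python
from datetime import date

# Cutoff months as stated in the article. Using the last day of each
# month is an assumption made for this sketch.
KNOWLEDGE_CUTOFFS = {
    "GPT-5.5 Instant": date(2024, 8, 31),
    "Llama 3.2": date(2023, 12, 31),
}


def needs_web_search(model: str, event_date: date) -> bool:
    """Return True when the event happened after the model's knowledge
    cutoff, meaning the model can only answer with outside help."""
    return event_date > KNOWLEDGE_CUTOFFS[model]


# An event from March 2024 is after Llama 3.2's cutoff but before
# the cutoff quoted for GPT-5.5 Instant.
event = date(2024, 3, 1)
for model in KNOWLEDGE_CUTOFFS:
    print(model, "needs web search:", needs_web_search(model, event))
```

A local model without a search extension has no fallback in the `True` case, which is the practical gap described above.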