With a 2B-4B model loaded, the remaining memory supports context windows in the 4K-8K token range: enough for a long article, a meeting transcript, or an email chain. Push further and the app either truncates silently or slows to a crawl. Phone AI apps rarely tell you which, so test with a document of known length.
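One way to run that test is to build a probe document of roughly known size with numbered checkpoints spread from the first sentence to the last, then ask the app to list every checkpoint code. The sketch below is a minimal illustration, not a calibrated benchmark: `build_probe` and `missing_markers` are hypothetical helper names, and the 0.75 words-per-token ratio is a rough rule of thumb for English text that varies by tokenizer.

```python
import random

FILLER = "The quick brown fox jumps over the lazy dog."  # 9 words of padding


def build_probe(target_tokens, words_per_token=0.75, n_markers=4, seed=0):
    """Return (document, markers) sized to roughly target_tokens.

    words_per_token is an assumed rule of thumb, not a tokenizer measurement.
    markers is a list of (checkpoint_number, code) pairs, in document order.
    """
    if n_markers < 2:
        raise ValueError("need at least two markers (start and end)")
    rng = random.Random(seed)
    target_words = int(target_tokens * words_per_token)
    n_sentences = max(n_markers, target_words // len(FILLER.split()))
    sentences = [FILLER] * n_sentences
    codes = rng.sample(range(10000), n_markers)  # unique 4-digit codes
    markers = []
    for i, code in enumerate(codes):
        # Spread checkpoints evenly, first sentence through last sentence.
        pos = i * (n_sentences - 1) // (n_markers - 1)
        code_str = f"{code:04d}"
        sentences[pos] = f"Checkpoint {i + 1} has the code {code_str}."
        markers.append((i + 1, code_str))
    return " ".join(sentences), markers


def missing_markers(reply, markers):
    """Return the checkpoint numbers whose codes do not appear in the reply."""
    return [number for number, code in markers if code not in reply]


if __name__ == "__main__":
    document, markers = build_probe(target_tokens=8000)
    print(f"{len(document.split())} words, checkpoints: {markers}")
```

Paste the document into the app with a prompt such as "List every checkpoint code", then pass the reply to `missing_markers`. Checkpoints missing from one end of the document suggest truncation at that end; scattered misses are more likely a recall problem than a window limit. Repeating the run at several `target_tokens` values shows roughly where the app stops coping.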
The workable mobile pattern is summarize-and-carry: condense on the phone, accumulate the summaries in notes, and save corpus-scale questions for a Mac later. For genuine document analysis from your pocket, remote-control a home Mac (Tailscale plus Open WebUI) instead of fighting the phone.
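The "carry" half of summarize-and-carry can be as simple as an append-only notes file that the Mac reads later. The following is a minimal sketch under stated assumptions: the JSON Lines format, the `notes.jsonl` file name, and the `carry` and `build_corpus_prompt` helpers are all illustrative choices, and the summaries themselves are assumed to come from whatever app runs on the phone.

```python
import json
from datetime import date
from pathlib import Path


def carry(notes_path, source, summary, day=None):
    """Append one phone-made summary to the running notes file (one JSON object per line)."""
    record = {
        "date": (day or date.today()).isoformat(),
        "source": source,
        "summary": summary.strip(),
    }
    with Path(notes_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def build_corpus_prompt(notes_path, question):
    """Later, on the Mac: join every carried summary into one prompt for a larger model."""
    lines = Path(notes_path).read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line.strip()]
    body = "\n\n".join(
        f"[{r['date']}] {r['source']}\n{r['summary']}" for r in records
    )
    return f"{body}\n\nQuestion: {question}"


if __name__ == "__main__":
    carry("notes.jsonl", "Q3 planning call", "Budget frozen until October.")
    print(build_corpus_prompt("notes.jsonl", "What deadlines are coming up?"))
```

Keeping one record per line means a partially synced file still parses line by line, and the date and source fields let the larger model attribute an answer to a specific summary.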