Chat with PDF without uploading your PDF anywhere
The category leader for "chat with PDF" is ChatPDF, and it works fine, but every PDF you ask about gets uploaded to a third-party server, parsed there, embedded there, and stored on someone else's disk. For a class reading or a public arXiv paper that's nothing. For a contract, a medical record, an internal report, or a draft you haven't published yet, that's a problem. This tool flips it: the PDF is parsed inside your browser using pdf.js (the same library Firefox ships), chunked locally, and only the relevant chunks plus your question are sent, under your own API key, directly to OpenAI, Anthropic, or Google. Your provider sees the question and the chunks; nobody else does, including us. Our server is a static HTML page on a CDN; there's no backend to leak from.
How "chat with PDF" actually works under the hood
A naive approach would stuff the entire PDF text into the LLM prompt every time you ask a question. That works for short documents but blows through your context window and burns dollars on long ones. The standard approach is RAG (retrieval-augmented generation): chunk the document into ~500-word passages, score each chunk against the question, and send only the top few to the model. This tool does a lightweight in-browser version of that: it chunks by page with some overlap, ranks chunks by TF-IDF-style keyword overlap against your question, sends the top 6 chunks plus your question to the model, and asks it to cite which chunks it used. Sending only a handful of chunks per question is what keeps the cost low: a 60-page PDF doesn't cost 60 pages' worth of tokens per question, it costs about 3,000 tokens, which is pennies on gpt-4o-mini and fractions of a cent on Gemini Flash.
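To make the ranking step concrete, here is a minimal sketch of that kind of keyword-overlap scoring. The Chunk shape, the function names, and the exact weighting are illustrative assumptions, one plausible way to do it, not the tool's actual source:

```typescript
// Illustrative sketch of TF-IDF-style keyword ranking; the Chunk shape
// and the weighting below are assumptions, not the tool's actual code.
interface Chunk {
  page: number; // 1-based page the chunk came from
  text: string; // ~500 words, overlapping with its neighbors
}

function tokenize(s: string): Set<string> {
  return new Set(s.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

function rankChunks(chunks: Chunk[], question: string, topK = 6): Chunk[] {
  const chunkTerms = chunks.map((c) => tokenize(c.text));
  const qTerms = tokenize(question);

  // Document frequency: in how many chunks does each question term appear?
  const df = new Map<string, number>();
  for (const term of qTerms) {
    df.set(term, chunkTerms.filter((t) => t.has(term)).length);
  }

  // Score a chunk by the summed rarity of the question terms it contains:
  // a term found in few chunks is a stronger signal than one found everywhere.
  const scored = chunks.map((chunk, i) => {
    let score = 0;
    for (const term of qTerms) {
      const n = df.get(term)!;
      if (n > 0 && chunkTerms[i].has(term)) {
        score += Math.log(1 + chunks.length / n);
      }
    }
    return { chunk, score };
  });

  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((s) => s.chunk);
}
```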
BYOK: bring your own key, get the real pricing
"Bring your own key" (BYOK) means you paste your API key from OpenAI, Anthropic, or Google, and the browser sends requests directly to that provider. No middleman, no markup, no monthly subscription gating your usage. A 30-page document on gpt-4o-mini typically costs under a penny per question. The same document on Claude Sonnet costs a few cents per question. Gemini Flash is roughly free at this scale. The downside of BYOK is that you have to sign up for an API key once β links are in the "How to get a key" panel. The upside is you only pay for what you ask, and your data goes straight to your provider, not to a SaaS in the middle.
What this Chat with PDF tool supports
Any text-based PDF (selectable text in your viewer). The tool handles multi-column papers, financial reports, contracts, manuals, and books up to about 500 pages. It does not OCR scanned PDFs; for those, run them through an OCR tool like Tesseract.js first. It supports page-cited answers, follow-up questions in the same chat (history is included up to a token budget), and instant provider switching mid-conversation. Your key is never sent to our domain; open dev tools and check the network tab. The only outbound requests are to api.openai.com, api.anthropic.com, or generativelanguage.googleapis.com.
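For the curious, the in-browser extraction step with pdf.js looks roughly like this. The function name and worker path are illustrative assumptions; the getDocument/getPage/getTextContent calls are the library's documented API:

```typescript
// Sketch of per-page text extraction with pdf.js (the pdfjs-dist package).
import * as pdfjsLib from "pdfjs-dist";

// pdf.js parses in a web worker; the exact path depends on your bundler.
pdfjsLib.GlobalWorkerOptions.workerSrc = "pdf.worker.min.mjs";

async function extractPages(data: ArrayBuffer): Promise<string[]> {
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const pages: string[] = [];
  for (let pageNo = 1; pageNo <= pdf.numPages; pageNo++) {
    const page = await pdf.getPage(pageNo);
    const content = await page.getTextContent();
    // Each item is a positioned text run; join runs into one string per page.
    const text = content.items
      .map((item) => ("str" in item ? item.str : ""))
      .join(" ");
    pages.push(text);
  }
  return pages;
}
```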
Chat with PDF: FAQ
Is the chat actually free? The tool is free. You pay your LLM provider directly for tokens β usually pennies per long document.
Does the PDF get uploaded? No. Open your browser's dev tools and watch the network tab. The PDF never leaves your machine.
Will it work for huge PDFs? Up to about 500 pages / 50 MB. Above that, browsers run out of memory and chunk ranking gets slow.
What about scanned PDFs? They have no extractable text. Run them through an OCR tool first.
Can I save my key? Yes, tick "Remember key on this device"; the key is stored in this browser's localStorage and only used by this page (see the sketch after this FAQ).
Which model is best? Gemini Flash is the cheapest, gpt-4o-mini is the sweet spot, Claude Sonnet is the most accurate on long reasoning. For most "chat with PDF" use cases, gpt-4o-mini is plenty.
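The key-remembering behavior mentioned above amounts to roughly the following; the storage key name is an illustrative assumption, not necessarily what the page uses:

```typescript
// Sketch of "Remember key on this device"; the name "byok_api_key" is illustrative.
const STORAGE_KEY = "byok_api_key";

function rememberKey(apiKey: string): void {
  // localStorage is scoped to this origin: only pages on this domain can read it.
  localStorage.setItem(STORAGE_KEY, apiKey);
}

function recallKey(): string | null {
  return localStorage.getItem(STORAGE_KEY);
}
```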