Repair PDF

Best-effort repair of a damaged PDF — re-save normalised or extract text.

This is a best-effort tool. Severely truncated files, files with corrupt content streams, and files encrypted with a lost password cannot be fully recovered. The tool reports how far it got and offers a downloadable normalised PDF if the re-save succeeds, or the extracted text content as a fallback.

Processed on your device. We never see your files.

How to use Repair PDF

What this tool does, and what it doesn’t

“Repair PDF” is a best-effort operation. PDFs go wrong in a wide variety of ways and no single tool fixes all of them. This page sets out which kinds of damage the tool fixes, which it doesn’t, and what you get back in either case.

The strategy is two-tier:

Tier 1 — pdf-lib re-save. The damaged file is opened with pdf-lib’s permissive parser, parsed in full, and re-serialised to a clean PDF. This rebuilds the cross-reference table, drops orphaned objects, normalises object numbering, and writes a fresh %%EOF. For the most common kinds of damage, this alone produces a viewer-friendly file.
Tier 2 — pdf.js text fallback. If Tier 1 throws, the tool loads the file in pdf.js (which uses a different parser with different tolerances) and extracts whatever text and page structure it can. The output of this tier is a text-only PDF — you get the words back, but images, vector graphics, fonts and layout are lost. It’s the file equivalent of “this contract is shredded; here’s a transcript”.

You get to see which tier produced the result so you know what you’re holding.
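
The two-tier flow can be sketched as a small fallback loop. This is a minimal illustration, not the tool's actual code: the Tier type, the repairWithFallback function and the tier names are hypothetical, and in practice each run function would wrap the pdf-lib re-save or the pdf.js text extraction described above.

```typescript
// Hypothetical shape for one repair strategy; all names here are illustrative.
type Tier = {
  name: string;
  run: (bytes: Uint8Array) => Promise<Uint8Array>;
};

type RepairOutcome =
  | { ok: true; tier: string; output: Uint8Array }
  | { ok: false; errors: string[] };

// Try each tier in order and return the first success, labelled with its tier.
async function repairWithFallback(
  bytes: Uint8Array,
  tiers: Tier[],
): Promise<RepairOutcome> {
  const errors: string[] = [];
  for (const tier of tiers) {
    try {
      const output = await tier.run(bytes);
      return { ok: true, tier: tier.name, output };
    } catch (err) {
      // Remember why this tier failed, then fall through to the next one.
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${tier.name}: ${reason}`);
    }
  }
  return { ok: false, errors };
}
```

Returning the tier name alongside the bytes is what lets a page label the download as either a full re-save or a text-only reconstruction.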

What counts as repairable

A PDF file ends with a cross-reference (xref) table, a trailer, and a %%EOF marker. Most “PDF won’t open” errors come from one of the following:

A truncated or corrupt xref table — the index at the end of the file got partially overwritten, lost, or appended after another %%EOF. pdf-lib scans the body for object headers (N M obj), rebuilds the xref from the objects it finds, and writes the file out. Highly recoverable.
Dangling indirect references — an object claims to point to another object that doesn’t exist, or to a wrong byte offset. The re-serialisation simply omits the dead pointer.
Missing %%EOF — common when a transfer was interrupted, an email client truncated trailing bytes, or a generating program crashed. The parser scans for the last good object and writes a fresh trailer.
Multiple appended revisions where the last one is broken — PDFs allow incremental updates appended to the end of the file. If the last revision is corrupt, the parser can often roll back to the previous good revision and re-save.

These are the bulk of real-world PDF damage and they’re what this tool is good at.
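
To make the "scan the body for object headers" idea concrete, here is a rough sketch of how an index of byte offsets could be rebuilt from raw bytes. It is an assumption-laden illustration, not pdf-lib's implementation: the function name findObjectOffsets is hypothetical, and a real parser also has to ignore header-like text that happens to appear inside stream data.

```typescript
// Scan raw PDF bytes for "N M obj" headers and record each object's byte offset.
// Illustrative only: a real parser must also skip matches inside stream data.
function findObjectOffsets(bytes: Uint8Array): Map<string, number> {
  // Decode one byte per character so string indices equal byte offsets.
  let text = "";
  for (let i = 0; i < bytes.length; i++) text += String.fromCharCode(bytes[i]);

  const offsets = new Map<string, number>();
  const header = /(\d+)[ \t]+(\d+)[ \t]+obj\b/g;
  let m: RegExpExecArray | null;
  while ((m = header.exec(text)) !== null) {
    // A later definition replaces an earlier one, the way an appended
    // revision overrides the object it updates.
    offsets.set(`${m[1]} ${m[2]}`, m.index);
  }
  return offsets;
}
```

The resulting map of "object number, generation" to offset is the information a fresh xref table needs, which is why this class of damage is so often recoverable.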

What this tool cannot repair

Severely truncated files. If most of the body is gone (e.g. 95% of the file was lost in transit), there’s nothing to rebuild. pdf-lib will fail; the fallback may extract some text from whatever fragment remains, but a 200-page document reconstructed from a 4 KB tail is going to be mostly empty.
Destroyed content streams. Page content lives in compressed streams (/FlateDecode, sometimes others). If those streams are corrupt rather than the xref pointing to them, the parser can read the page object but can’t render its contents. pdf.js may recover partial text via the streams it can decode; the rest is lost.
Encrypted files with a lost password. Encryption is not damage. No tool — local, cloud, or commercial — can decrypt a modern PDF without its password.
Password recovery by brute force. Out of scope. This is a parser-level repair tool, not a cryptanalysis tool.
Files corrupted by being re-saved as something else. A Word document or JPEG that has merely been renamed to .pdf isn’t a PDF at all. Use the relevant convert tool, not repair.
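
Two of the cases above (encrypted files and files that were never PDFs) can be spotted before any repair is attempted. The sketch below is a heuristic pre-flight check under stated assumptions: the names preflight and indexOfAscii are hypothetical, the 1024-byte window for the "%PDF-" signature is an assumption modelled on lenient readers, and a raw search for /Encrypt can produce false positives.

```typescript
type Preflight = "not-a-pdf" | "encrypted" | "candidate";

// Find an ASCII needle in a byte array within [from, to); returns -1 when absent.
function indexOfAscii(
  bytes: Uint8Array,
  needle: string,
  from = 0,
  to = bytes.length,
): number {
  const last = Math.min(to, bytes.length) - needle.length;
  for (let i = Math.max(0, from); i <= last; i++) {
    let j = 0;
    while (j < needle.length && bytes[i + j] === needle.charCodeAt(j)) j++;
    if (j === needle.length) return i;
  }
  return -1;
}

function preflight(bytes: Uint8Array): Preflight {
  // Assumption: the "%PDF-" signature should appear within the first 1024 bytes.
  if (indexOfAscii(bytes, "%PDF-", 0, 1024) === -1) return "not-a-pdf";
  // Heuristic: encrypted files normally carry an /Encrypt key in the trailer.
  // A raw byte search can false-positive, e.g. the name inside a text string.
  if (indexOfAscii(bytes, "/Encrypt") !== -1) return "encrypted";
  return "candidate";
}
```

A "candidate" result only means the file is worth handing to the repair tiers; it says nothing about whether the repair will succeed.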

Common use cases

A PDF that “opens but is blank” or “fails to load” in Chrome, Acrobat, or Preview — usually a Tier 1 repair fixes it in one pass.
An email attachment that downloaded with the wrong content length — Tier 1 rebuilds the trailer.
A file from an old archive that worked years ago and now doesn’t — modern parsers have got stricter; the lax re-serialisation here often makes it readable again.
Recovering text from a “mostly broken” PDF — Tier 2 gives you what’s left as searchable text, even if the layout is gone.

How to use this Repair PDF tool

Drop the damaged PDF onto the dropzone.
Click Repair. The tool tries Tier 1 first.
If Tier 1 succeeds, a Download button appears with the re-saved file. Open it in your usual reader to confirm.
If Tier 1 fails, the tool falls back to Tier 2 and offers you the text-only reconstruction. The download is labelled clearly so you know it isn’t a full restoration.
If both tiers fail, the file’s damage is beyond what’s repairable from the browser. Adobe Acrobat or a specialised recovery service is the next step.

Security and limits

A “successful” Tier 1 repair means the output is a structurally valid PDF that standard readers can open. It does not mean every page renders identically to the original: if a content stream was corrupt and the parser quietly skipped the damaged region, you can end up with a missing image or a blank patch on a page. Compare the output to whatever you remember of the original before relying on it.

A “successful” Tier 2 repair is text only by design. Treat it as a transcript, not as the document itself. Fonts, page layout, images, signatures, form fields and annotations are gone.

Privacy

Both repair tiers run entirely in this browser tab. The damaged file is read into memory, parsed locally, re-serialised locally, and offered back to you as a Blob. There is no upload, no temporary cloud storage, and no telemetry on file contents. The only network requests this page makes are for its initial JavaScript bundle.

Compatibility notes

The repaired file is a standard PDF 1.7 document. Every modern reader opens it: Adobe Acrobat, Apple Preview, the browser viewers in Chrome / Edge / Firefox / Safari, and the system viewers on iOS and Android. The text-only fallback is also a standard PDF, just with no images or fonts beyond the default sans-serif.

Frequently asked questions

What kinds of damage can this tool actually repair?

Three common classes of damage are well within reach: a missing or corrupt cross-reference (xref) table (the index at the end of the file that says where each object lives), dangling object references (pointers to objects that don't exist or to slightly wrong offsets), and a missing or malformed %%EOF marker. pdf-lib's parser is tolerant of all three: it scans the body for objects, rebuilds the index, and writes a clean file. Repairs are increasingly likely to fail for damage outside those classes; see “What this tool cannot repair”.
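
If you are curious which of these classes a particular file falls into, the end of the file can be inspected directly. The following is a rough diagnostic sketch, not part of the tool: the inspectTail function and the TailReport type are hypothetical, and the 1024-byte tail window is an assumption.

```typescript
type TailReport = {
  hasEof: boolean;           // "%%EOF" found in the last 1024 bytes
  startxref: number | null;  // offset declared after the last "startxref", if any
  xrefLooksValid: boolean;   // declared offset lands on "xref" or an "N M obj" header
};

function inspectTail(bytes: Uint8Array): TailReport {
  // Decode one byte per character so string indices equal byte offsets.
  let text = "";
  for (let i = 0; i < bytes.length; i++) text += String.fromCharCode(bytes[i]);

  const hasEof = text.slice(-1024).includes("%%EOF");

  const at = text.lastIndexOf("startxref");
  const m = at === -1 ? null : /^startxref\s+(\d+)/.exec(text.slice(at, at + 64));
  const startxref = m ? Number(m[1]) : null;

  // A classic table starts with "xref"; newer files may point at an
  // xref stream, which begins with an ordinary "N M obj" header.
  const xrefLooksValid =
    startxref !== null &&
    startxref < text.length &&
    /^(xref\b|\d+[ \t]+\d+[ \t]+obj\b)/.test(text.slice(startxref, startxref + 32));

  return { hasEof, startxref, xrefLooksValid };
}
```

A file with hasEof false or xrefLooksValid false but an intact body is the kind of damage the re-save tier is aimed at.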

Can it repair a password-protected file I've lost the password to?

No. Encryption is not damage — it is the file working correctly to keep someone out. This tool refuses encrypted PDFs the same way every legitimate PDF tool does. If you have the password, unlock with PDF Password Remover first and then repair. If you don't, no repair tool — including Adobe's — will recover the contents without it. Lost-password recovery is a brute-force problem, not a parsing problem.

How does this compare to Adobe Acrobat's repair or Adobe Document Cloud?

Acrobat has more aggressive repair heuristics for some kinds of damage (especially partial content-stream corruption) because it ships with Adobe's own internal PDF parser, which is more permissive than pdf-lib's. For the common cases — broken xref, missing EOF, dangling references — the result here is comparable. The honest difference: Acrobat sometimes succeeds where this tool falls back to text-only extraction, and very occasionally vice versa. Try this tool first (it runs locally); if it can't rebuild the file, Adobe's tools are the next step.

Will the repaired file be smaller than the original?

Often yes, sometimes by a noticeable margin. The repair re-serialises the file with a fresh xref table and discards orphaned objects (objects that no page or catalog references — common residue from earlier broken edits). The visible content is unchanged, but the file is structurally tidier. Don't rely on this as a compressor — for genuine size reduction, use Compress PDF, which targets image streams.
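
Discarding orphaned objects amounts to a reachability walk: start from the document's root objects, follow every reference, and drop whatever was never visited. The sketch below shows the idea on a toy object graph. The ObjectGraph type and findOrphans function are hypothetical, and the graph is supplied by hand here rather than parsed from a real file.

```typescript
// Object graph: object number -> object numbers it references.
// In a real PDF these edges come from indirect references; here they are given.
type ObjectGraph = Map<number, number[]>;

// Return the objects that cannot be reached from any root, in ascending order.
function findOrphans(graph: ObjectGraph, roots: number[]): number[] {
  const reachable = new Set<number>();
  const stack = roots.slice();
  while (stack.length > 0) {
    const id = stack.pop() as number;
    // Skip objects already visited and dangling references to missing objects.
    if (reachable.has(id) || !graph.has(id)) continue;
    reachable.add(id);
    for (const next of graph.get(id) ?? []) stack.push(next);
  }
  return Array.from(graph.keys())
    .filter((id) => !reachable.has(id))
    .sort((a, b) => a - b);
}

// Usage: object 1 plays the catalog; 5 and 6 are left over from earlier edits.
const demoGraph: ObjectGraph = new Map([
  [1, [2]],
  [2, [3, 4]],
  [3, []],
  [4, [2]],
  [5, [3]],
  [6, [99]],
]);
const demoOrphans = findOrphans(demoGraph, [1]); // [5, 6]
```

Note that object 5 references a live object but is still an orphan, because nothing reachable from the root points to it.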

Does the broken PDF get uploaded anywhere?

No. The damaged file is opened by pdf-lib (and, in the fallback case, pdf.js) in this browser tab. Both libraries run as JavaScript on your device. Nothing about the file's contents — visible or recovered — leaves the page. The only network traffic this page generates is for its initial JavaScript bundle; once loaded, you can disconnect from the network and the repair still works.

Related tools

PDF Compressor

Reduce PDF file size for easier sharing.

PDF Flatten

Flatten PDF layers and form fields into static content.

PDF Text Extractor

Pull the plain text content out of a PDF.

PDF to TXT

Extract a PDF's text and download it as a .txt file.

PDF Merger

Combine multiple PDF files into one document.

PDF Metadata Editor

Edit the title, author, and subject of a PDF.