Edit PDF metadata (2025): Info & XMP, safely
Learn how to view, edit, and remove PDF metadata (Info dictionary & XMP) without damaging files. Clean authorship, software traces, timestamps, and keywords. Verify with multiple tools and automate at scale.
Privacy risk: authors, software, dates
Lossless: metadata-only
Automation: CLI friendly
PDF metadata: Info vs XMP
PDFs keep metadata in two places: the classic Info dictionary (Title, Author, Subject, Keywords, Creator, Producer, CreationDate, ModDate) and embedded XMP (XML packet). Tools may update only one; keep them consistent.
Inspect metadata (GUI/CLI)
ExifTool
Deep view (Info + XMP)
exiftool -G -a -s "doc.pdf"
Shows both Info and XMP, grouped by origin.
Edit common fields (Title, Author, Keywords)
ExifTool (preferred)
Updates Info & XMP
# Set basic fields
exiftool -Title="Project Plan" -Author="ACME" -Subject="Q4" -Keywords="plan,Q4" doc.pdf
# Ensure XMP and Info match by writing both groups explicitly
exiftool -overwrite_original -PDF:Title="Project Plan" -PDF:Author="ACME" -PDF:Subject="Q4" -PDF:Keywords="plan,Q4" -XMP:Title="Project Plan" -XMP:Author="ACME" -XMP:Subject="Q4" -XMP:Keywords="plan,Q4" doc.pdf
Use -overwrite_original only when confident; without it, ExifTool keeps a backup as doc.pdf_original.
pdftk (Info only)
Legacy tool
# Write Info via a metadata file (UTF-8); recent pdftk versions expect InfoBegin before each key/value pair
printf 'InfoBegin\nInfoKey: Title\nInfoValue: Project Plan\n' > meta.txt
pdftk doc.pdf update_info_utf8 meta.txt output out.pdf
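When several Info fields need setting, the metadata file can be generated instead of typed by hand. The following is a minimal sketch, assuming a pdftk version that reads the InfoBegin/InfoKey/InfoValue record format; `make_info_file` is a hypothetical helper name, not something pdftk provides.

```shell
# make_info_file (hypothetical helper): print one pdftk Info record
# (InfoBegin / InfoKey / InfoValue) for each KEY=VALUE argument.
make_info_file() {
  local pair
  for pair in "$@"; do
    # Split on the first "=" only, so values may themselves contain "=".
    printf 'InfoBegin\nInfoKey: %s\nInfoValue: %s\n' "${pair%%=*}" "${pair#*=}"
  done
}

# Usage (assumes pdftk is installed):
#   make_info_file "Title=Project Plan" "Author=ACME" > meta.txt
#   pdftk doc.pdf update_info_utf8 meta.txt output out.pdf
```

Keeping the generator separate from the pdftk call makes it easy to review meta.txt before anything is written to the PDF.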
Remove metadata (XMP & Info)
# 1) Wipe everything with ExifTool (keeps PDF content)
# Note: ExifTool edits PDFs by incremental update, so deleted tags remain recoverable
# until the file is rewritten (for example with the qpdf step below).
exiftool -all= -overwrite_original "doc.pdf"
# 2) Alternatively, clear specific fields only
exiftool -Title= -Author= -Subject= -Keywords= -Creator= -Producer= -CreateDate= -ModifyDate= "doc.pdf"
# 3) Verify with multiple tools
exiftool -G -a -s "doc.pdf"
mutool info "doc.pdf"
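The verification step can also be turned into a pass/fail check rather than a listing to read by eye. This is a rough sketch: `find_leftover_tags` and `assert_clean` are hypothetical helpers, the watch list of tag names is an assumption to adjust, and the pattern assumes the `[Group] TagName : value` layout that `exiftool -G -a -s` prints.

```shell
# find_leftover_tags (hypothetical): read `exiftool -G -a -s` output on stdin
# and print only PDF/XMP lines whose tag name is on the watch list.
find_leftover_tags() {
  grep -E '^\[(PDF|XMP)\][[:space:]]+(Title|Author|Subject|Keywords|Creator|Producer|CreateDate|ModifyDate|MetadataDate)[[:space:]]*:'
}

# assert_clean FILE (hypothetical; needs exiftool):
#   returns 0 if no watched tag remains, 1 if any is found,
#   2 if exiftool itself failed (missing tool or unreadable file).
assert_clean() {
  local out
  out="$(exiftool -G -a -s "$1")" || return 2
  if printf '%s\n' "$out" | find_leftover_tags; then
    echo "metadata still present in $1" >&2
    return 1
  fi
}

# Usage:
#   assert_clean doc.pdf && echo "doc.pdf looks clean"
```

A passing check only says the current revision exposes none of the watched tags; it does not prove that older revisions inside the file are gone.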
Keep Info and XMP in sync
# Copy XMP values into Info explicitly; quote each argument so the shell does not treat < as redirection
exiftool "-PDF:Title<XMP:Title" "-PDF:Author<XMP:Author" "-PDF:Subject<XMP:Subject" "-PDF:Keywords<XMP:Keywords" -overwrite_original "doc.pdf"
# Or copy Info into XMP if you trust Info more
exiftool "-XMP:Title<PDF:Title" "-XMP:Author<PDF:Author" "-XMP:Subject<PDF:Subject" "-XMP:Keywords<PDF:Keywords" -overwrite_original "doc.pdf"
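Before copying in either direction, it helps to see which fields actually disagree. The sketch below is one possible way to do that; `diff_field` and `report_drift` are hypothetical helpers, and the comparison is a plain string match on what `exiftool -s3` prints. Subject and Keywords can be stored differently in XMP than in Info (for example as a list), so a reported difference there may be representational rather than a real conflict.

```shell
# diff_field NAME INFO_VALUE XMP_VALUE (hypothetical):
# print a line and return 1 when the two values differ.
diff_field() {
  if [ "$2" != "$3" ]; then
    printf '%s differs: Info=[%s] XMP=[%s]\n' "$1" "$2" "$3"
    return 1
  fi
}

# report_drift FILE (hypothetical; needs exiftool):
# compare four common fields and return 1 if any of them differ.
report_drift() {
  local tag info xmp status=0
  for tag in Title Author Subject Keywords; do
    info="$(exiftool -s3 "-PDF:$tag" "$1")"
    xmp="$(exiftool -s3 "-XMP:$tag" "$1")"
    diff_field "$tag" "$info" "$xmp" || status=1
  done
  return "$status"
}

# Usage:
#   report_drift doc.pdf || echo "Info and XMP are out of sync"
```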
Optimize & linearize (delivery)
qpdf
Linearize + sanitize
qpdf --linearize --object-streams=generate --stream-data=preserve clean.pdf clean_linear.pdf
Good for web delivery (fast first page).
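Because ExifTool appends its changes to a PDF instead of rewriting it, the earlier metadata stays inside the file until something regenerates it. Chaining the wipe with a qpdf rewrite addresses that. This is a minimal sketch under stated assumptions: `sanitize_pdf` is a hypothetical wrapper, the temporary-directory handling is illustrative, and both exiftool and qpdf must be installed.

```shell
# sanitize_pdf SRC DST (hypothetical wrapper; needs exiftool and qpdf).
# Wipes metadata on a temporary copy, then rewrites it with qpdf so the
# pre-edit revision is not carried into DST. SRC is left untouched.
sanitize_pdf() {
  local src="$1" dst="$2" tmp rc
  [ -f "$src" ] || { echo "no such file: $src" >&2; return 2; }
  tmp="$(mktemp -d)" || return 1
  cp -- "$src" "$tmp/work.pdf" &&
    exiftool -q -all= -overwrite_original "$tmp/work.pdf" &&
    qpdf --linearize "$tmp/work.pdf" "$dst"
  rc=$?
  rm -rf -- "$tmp"
  return "$rc"
}

# Usage:
#   sanitize_pdf doc.pdf clean_linear.pdf
```

Note that qpdf exits with status 3 when it succeeds with warnings, so a non-zero return here does not always mean the output is missing; inspect the result before discarding it.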
Ghostscript
Rebuild (optional)
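gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.7 -dNOPAUSE -dBATCH -sOutputFile=rebuilt.pdf clean.pdf
Ghostscript regenerates the file and writes its own Producer value, so inspect rebuilt.pdf again before shipping.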
CI/Batch workflows (GitHub Actions)
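name: sanitize-pdf
on:
  push:
    paths: ["docs/**/*.pdf"]
jobs:
  clean:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ExifTool may not be preinstalled on the runner
      - name: Install ExifTool
        run: sudo apt-get update && sudo apt-get install -y libimage-exiftool-perl
      - name: Wipe metadata
        run: |
          # globstar makes ** recurse; nullglob skips the loop when nothing matches
          shopt -s globstar nullglob
          for f in docs/**/*.pdf; do
            exiftool -all= -overwrite_original "$f"
          done
      - name: Verify
        run: |
          shopt -s globstar nullglob
          for f in docs/**/*.pdf; do
            exiftool -G -a -s "$f" > "${f%.pdf}.meta.txt"
          done
Redaction vs metadata: quick privacy tip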
Removing metadata does not remove sensitive text/images from pages. Proper redaction deletes page content, not just overlays. Validate by copying text, searching, and inspecting objects with mutool show.
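The copy-and-search check can be scripted for a list of terms that should no longer be extractable. The following is a sketch only: `count_hits` and `check_redaction` are hypothetical helpers, and it assumes poppler's `pdftotext` is installed, which is not a tool this page otherwise uses. It will not catch text that survives only inside images, or terms broken across lines, so a pass is a sanity check and not proof of redaction.

```shell
# count_hits TERM (hypothetical): read extracted text on stdin and print how
# many lines contain TERM (case-insensitive, fixed string rather than regex).
count_hits() {
  grep -i -F -c -- "$1" || true
}

# check_redaction FILE TERM... (hypothetical; assumes pdftotext is installed):
# returns 1 if any term is still extractable, 2 if text extraction failed.
check_redaction() {
  local file="$1" term text hits status=0
  shift
  text="$(pdftotext "$file" -)" || return 2
  for term in "$@"; do
    hits="$(printf '%s\n' "$text" | count_hits "$term")"
    if [ "$hits" -gt 0 ]; then
      echo "still extractable: $term ($hits lines)" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_redaction redacted.pdf "Jane Doe" "123-45-6789"
```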
Ship clean PDFs: keep teams private
Edit or wipe PDF metadata safely. For embedded images, scrub EXIF/GPS first to avoid accidental leaks.