
Use PikePDF and split into pages before feeding to Wand

Hugo Kerstens requested to merge 322-wand-memory-usage into master

This MR:

  • Removes PyPDF2 as a dependency
  • Instead uses pikepdf for image extraction
  • Feeds only single pages to Wand instead of full PDFs
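
For reviewers, a minimal sketch of the single-page approach, not the actual code in this MR. The helper names (`iter_single_page_pdfs`, `page_to_png`, `process_pages`), the PNG output and the 150 DPI resolution are assumptions made for illustration. The idea is that Wand only ever rasterizes a one-page PDF, so its memory footprint should stay bounded regardless of document length.

```python
import io
from typing import Callable, Iterable, Iterator


def iter_single_page_pdfs(pdf_path: str) -> Iterator[bytes]:
    """Yield each page of the PDF at pdf_path as its own one-page PDF (bytes)."""
    import pikepdf  # third-party, imported lazily

    with pikepdf.open(pdf_path) as src:
        for page in src.pages:
            dst = pikepdf.Pdf.new()
            dst.pages.append(page)  # copies the page into the new document
            buf = io.BytesIO()
            dst.save(buf)
            yield buf.getvalue()


def page_to_png(page_pdf: bytes, resolution: int = 150) -> bytes:
    """Rasterize a one-page PDF with Wand and return PNG bytes.

    The resolution of 150 DPI is an illustrative default, not a project setting.
    """
    from wand.image import Image  # third-party, imported lazily

    with Image(blob=page_pdf, format="pdf", resolution=resolution) as img:
        return img.make_blob("png")


def process_pages(
    page_blobs: Iterable[bytes],
    render: Callable[[bytes], bytes],
    handle: Callable[[int, bytes], None],
) -> int:
    """Render pages one at a time so only one page's raster is held at once.

    Returns the number of pages processed.
    """
    count = 0
    for index, blob in enumerate(page_blobs):
        handle(index, render(blob))
        count = index + 1
    return count
```

With a hypothetical `save_page(index, png_bytes)` callback, a task could call `process_pages(iter_single_page_pdfs(path), page_to_png, save_page)`. Because the splitter is a generator, each one-page PDF is built only when the previous page has been handled.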

A Celery worker uses ~100 MB of memory when idle and 150-250 MB while processing a PDF (in either the pikepdf or the Wand step). After processing, usage returns to ~100 MB.

Fixes #322 (closed)
