From a security standpoint, PDF is literally one of the worst formats I deal with daily. It's very useful and predictable for the end user, yes, but very dangerous because of the capabilities it allows.
Dangerzone works like this: You give it a document that you don’t know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn’t already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, in a separate sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
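To make the "raw pixel data" step concrete, here is a minimal sketch of the hand-off between the two sandboxes. This is not Dangerzone's actual code: `PagePixels`, `accept_page`, `to_ppm`, and the `max_side` limit are hypothetical names made up for illustration, and the sketch writes a PPM image rather than a PDF to stay within the standard library. The point is that the second stage only ever receives a width, a height, and a flat run of RGB bytes, so there is no document structure left to parse.

```python
from dataclasses import dataclass

BYTES_PER_PIXEL = 3  # one byte each for R, G, B


@dataclass(frozen=True)
class PagePixels:
    """Hypothetical container for one rasterized page."""
    width: int
    height: int
    rgb: bytes


def accept_page(width: int, height: int, rgb: bytes,
                max_side: int = 10_000) -> PagePixels:
    """Validate what the second stage receives from the first.

    The only things checked are the dimensions and that the buffer
    length matches them exactly. max_side is an assumed sanity limit.
    """
    if not (0 < width <= max_side and 0 < height <= max_side):
        raise ValueError("page dimensions out of range")
    if len(rgb) != width * height * BYTES_PER_PIXEL:
        raise ValueError("pixel buffer length does not match dimensions")
    return PagePixels(width, height, bytes(rgb))


def to_ppm(page: PagePixels) -> bytes:
    """Serialize a page as binary PPM (P6): a short header plus raw RGB."""
    header = f"P6\n{page.width} {page.height}\n255\n".encode("ascii")
    return header + page.rgb
```

For example, `to_ppm(accept_page(2, 1, bytes([255, 0, 0, 0, 0, 255])))` gives a two-pixel image, one red and one blue. A buffer of the wrong length is rejected before anything is written.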
This is looking like it’ll be a valuable tool I’ll use frequently.
So it basically rasterizes it? I wonder how that affects file size.
Yeah, it definitely increases the size and removes some functionality that others may rely on. But for presenting content, which is what a PDF SHOULD BE for, it has typically worked fine. I’ve been using pandoc and some homegrown scripts to do this sort of thing for a while.
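For a rough sense of the size question, here is a back-of-the-envelope calculation of the uncompressed pixel data for one page. The 150 DPI figure and the US Letter page size are assumptions for illustration, not necessarily what Dangerzone uses, and `raw_rgb_bytes` is a made-up helper. The final PDF will normally be smaller than this, since the page images are compressed when they are written back out.

```python
def raw_rgb_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Uncompressed size of one page at 3 bytes (R, G, B) per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * 3


# US Letter (8.5 x 11 in) at an assumed 150 DPI: 1275 x 1650 pixels.
size = raw_rgb_bytes(8.5, 11, 150)
print(f"{size:,} bytes (~{size / 1_000_000:.1f} MB) per page")
# prints "6,311,250 bytes (~6.3 MB) per page"
```

So a text-only page that was a few kilobytes as vector data becomes several megabytes of raw pixels before compression, which is why the output tends to be larger than the input.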
Oh, I think you already know.
No mention of OCR? Copy-pasting links or data will be a joy…
There is an optional OCR pass, from what I understand.
I don’t know the PDF format very well. Is it possible to just drop the few commands that make it vulnerable?
Cool concept.