The Problem
You have an untrusted PDF file. It may contain malware that could infect your computer and do terrible things to it.
You want to convert this PDF into a "trusted" PDF that is free of malware, without endangering your Ubuntu computer.
The Method
The idea is to install Multipass on your Ubuntu computer and use the default primary Virtual Machine (VM) to "flatten" the untrusted PDF file. Flattening means converting the PDF file to a PostScript (PS) file and then converting the PS file back to PDF. The resulting PDF is "trusted" because any malware in the original PDF is not expected to survive the double conversion.
Finally, once the conversion is complete, the VM is destroyed, so that any changes the malware in the original PDF may have made to the VM are destroyed with it.
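The double conversion itself can be sketched as a small shell function. This is only an illustration of the idea, meant to run inside the VM and never on the host: flatten_pdf is a name I made up, and it assumes the Ghostscript wrappers pdf2ps and ps2pdf (the same two commands the script further down uses) are installed.

```shell
#!/bin/bash
# flatten_pdf is a hypothetical helper, not part of the original script.
# Usage: flatten_pdf INPUT.pdf OUTPUT.pdf
flatten_pdf() {
    local input="$1" output="$2" tmp_ps
    # Intermediate PostScript file, removed once we are done.
    tmp_ps="$(mktemp)" || return 1
    # PDF -> PS, then PS -> PDF; the second step runs only if the first succeeds.
    if pdf2ps "$input" "$tmp_ps" && ps2pdf "$tmp_ps" "$output"; then
        rm -f "$tmp_ps"
        return 0
    fi
    # One of the conversions failed: clean up and report the failure.
    rm -f "$tmp_ps"
    return 1
}
```

Inside the VM this would be called as flatten_pdf test.pdf Trusted-test.pdf. A non-zero return value means no trustworthy output was produced.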
A Proof of Concept
This solution is command-line based: we will type (or paste) commands in the terminal.
First, install Multipass on your computer with the following command:
sudo snap install multipass
You only have to do this once.
The rest of the work is done by a bash script, which I call flatten.sh. Save the script below in your home folder as flatten.sh and make it executable (for example, with chmod +x flatten.sh).
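#!/bin/bash
# Use the first argument as the PDF name, or prompt for one if it is missing.
if [ -z "$1" ]; then
    echo "No argument set. Valid argument is a PDF filename.pdf in the $HOME folder"
    read -rep "Enter filename: " FULLNAME
else
    FULLNAME=$1
fi
# Stop early if the file does not exist.
if [ ! -f "$FULLNAME" ]; then
    echo "The file $FULLNAME was not found."
    echo "Valid argument is a PDF filename.pdf in the $HOME folder"
    echo "exiting..."
    exit 1
fi
INPNAME=$(basename "$FULLNAME")
OUTNAME="Trusted-$INPNAME"
# Create (or start) the primary VM and install Ghostscript in it.
multipass start
multipass exec primary -- sudo apt update
multipass exec primary -- sudo apt install ghostscript -y
# Copy the PDF out of the mounted home folder into the VM's own storage.
multipass exec primary -- cp "Home/$INPNAME" .
# Flatten: PDF -> PS -> PDF.
multipass exec primary -- pdf2ps "$INPNAME" temp
multipass exec primary -- ps2pdf temp "$OUTNAME"
# Move the trusted PDF back to the home folder on the host.
multipass exec primary -- mv "$OUTNAME" Home/
# Destroy the VM along with anything the PDF may have changed in it.
multipass stop primary
multipass delete primary
multipass purge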
Let us say you have a file called test.pdf that you don't trust. Use the following command to run the script:
./flatten.sh test.pdf
The file test.pdf should be in your $HOME folder. If your PDF file is in a different folder, the script (as it is written) won't find it.
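If you want to handle a PDF that lives somewhere else, one possible direction is to work out the output path from the input path and copy the files in and out of the VM explicitly. The sketch below is untested against a real VM: trusted_path is a hypothetical helper, and the multipass transfer lines in the comments are my assumption of how the cp and mv steps of the script could be replaced.

```shell
#!/bin/bash
# trusted_path is a hypothetical helper: it prints the path of the
# "Trusted-" copy, placed in the same folder as the input file.
trusted_path() {
    local full="$1"
    printf '%s/Trusted-%s\n' "$(dirname "$full")" "$(basename "$full")"
}

# Assumed replacement for the cp/mv steps in flatten.sh (not verified here):
#   multipass transfer "$FULLNAME" primary:"$INPNAME"
#   multipass transfer primary:"$OUTNAME" "$(trusted_path "$FULLNAME")"
```

For example, trusted_path /tmp/docs/test.pdf prints /tmp/docs/Trusted-test.pdf, so the trusted copy ends up next to the original instead of in the home folder.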
Here is the list of things that will happen once you start this script:
- A VM will be created
- A minimal version of Ubuntu will be installed in the VM
- The script will install ghostscript, needed for the conversion.
- The untrusted PDF file will be copied to the VM's virtual storage.
- The untrusted PDF will be converted to a temp PS file.
- The temp PS file will be converted to a "trusted" PDF with the "Trusted-" prefix.
- The trusted PDF will be moved back to your home folder.
- The VM will be stopped, deleted, and purged.
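One thing the list above takes for granted is that every step succeeds. If a conversion fails halfway, you probably still want the VM destroyed. A minimal sketch of that idea follows; destroy_vm and run_in_throwaway_vm are hypothetical names of my own, wrapping the same multipass commands the script uses.

```shell
#!/bin/bash
# destroy_vm (hypothetical) bundles the three cleanup commands from flatten.sh.
destroy_vm() {
    multipass stop primary
    multipass delete primary
    multipass purge
}

# run_in_throwaway_vm (hypothetical) starts the primary VM, runs the given
# command, and destroys the VM afterwards whether the command succeeded or not.
# It returns the exit status of the command that was run.
run_in_throwaway_vm() {
    local status
    multipass start
    "$@"
    status=$?
    destroy_vm
    return "$status"
}
```

You could put the conversion steps in a function of their own and call run_in_throwaway_vm with that function's name. The VM is then purged even when Ghostscript chokes on the file.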
This whole process will take some time, particularly the initialization of the VM and the installation of ghostscript.
Note: if the untrusted PDF file is very big, the Multipass VM may run out of the memory allocated to it by default. See the Multipass documentation on how to allocate more memory to the VM.
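As a rough sketch of what that could look like, the primary VM can be created with explicit sizes before the script runs. launch_primary is a hypothetical helper and the 4G and 10G defaults are arbitrary values I picked for illustration. I am assuming a recent Multipass release here; older releases spell the memory option --mem, so check multipass help launch on your machine.

```shell
#!/bin/bash
# launch_primary is a hypothetical helper.
# Usage: launch_primary [MEMORY] [DISK]   e.g. launch_primary 8G 20G
launch_primary() {
    # Illustrative defaults, not values recommended by Multipass.
    local memory="${1:-4G}" disk="${2:-10G}"
    multipass launch --name primary --memory "$memory" --disk "$disk"
}
```

If the primary VM already exists when flatten.sh reaches multipass start, it should simply be started instead of being created with the defaults. It is worth checking that your home folder is still mounted as Home inside a VM created this way, since the script relies on it.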
Downside
As far as I can tell, at the time of writing there is no way to take a snapshot of the primary VM in Multipass after installing Ghostscript and spin up that stored VM the next time you need to sanitize a PDF. If this were possible, the process would take a little less time.
Another Way
Another way to achieve similar results may be to use LXD/LXC containers. LXD supports snapshots, and a custom container with just Ghostscript may be a little lighter than a full VM. However, I don't have any experience with LXD/LXC.
Hope this helps