Score:4

Would compressing encrypted data and compressing digital signatures be bad for security?

I understand that compressing encrypted data and digital signatures is not efficient, because they are most likely incompressible. But in my application, encrypted data and digital signatures (post-quantum crypto) are stored in an SQLite database along with metadata such as the crypto algorithm name and IV. If LZMA compression is applied, the SQLite database itself is compressible. I actually use the SQLite database as a file format and transfer it across a hostile network. I am wondering whether compressing encrypted data or digital signatures would be a security disaster.

Whether compressing encrypted data or digital signatures can be a security disaster
Score:4

Applying lossless compression (like LZMA) to a database with encrypted and/or signed data is mostly neutral from a security standpoint: it's neither good, nor necessarily bad (but see second point below).

  • It provides no serious protection when the database is transferred across a hostile network. Argument: an attack that works on the uncompressed database can most likely be adapted to work on the compressed one, by decompressing, performing the attack, and re-compressing the database if the attack modified it.
  • If compression introduces a vulnerability that did not exist without compression, and under the assumption that the hostile network has full access to the data (rather than e.g. limited to the data length), said vulnerability can only be due to an error in the decompression code. Such errors can lead to denial of service, or even code injection. That has happened many times in many decompression implementations, see e.g. Security and Crash Bugs there, because decompression code is complex. Depending on your taste, that can be mitigated by one or more of:
    • Using a language and/or execution environment less likely to lead to exploitable bugs than bare C/C++ code is.
    • Reviewing the decompression code, perhaps even proving it correct.
    • Testing, perhaps by fuzzing, coverage analysis…
    • Authentication (including authenticated encryption) of the compressed data.
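
As an illustration of the last mitigation, here is a minimal Python sketch in which the decompressor only ever sees input that has passed an authenticity check. It assumes sender and receiver share a secret key and uses HMAC-SHA-256 from the standard library as a stand-in for whatever authentication scheme is actually used (a digital signature over the compressed bytes would play the same role). The names `pack`, `unpack` and `TAG_LEN` are hypothetical, chosen for this example only.

```python
import hashlib
import hmac
import lzma
import os

TAG_LEN = 32  # size of an HMAC-SHA-256 tag in bytes


def pack(db_bytes: bytes, key: bytes) -> bytes:
    """Compress, then append a MAC computed over the compressed bytes."""
    compressed = lzma.compress(db_bytes)
    tag = hmac.new(key, compressed, hashlib.sha256).digest()
    return compressed + tag


def unpack(blob: bytes, key: bytes) -> bytes:
    """Check the MAC first; only authenticated input reaches the decompressor."""
    if len(blob) < TAG_LEN:
        raise ValueError("blob too short")
    compressed, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, compressed, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return lzma.decompress(compressed)


if __name__ == "__main__":
    key = os.urandom(32)                      # assumed shared secret
    blob = pack(b"example database bytes" * 100, key)
    print(len(unpack(blob, key)))
```

The point of the ordering (compress, then authenticate; verify, then decompress) is that a network adversary who alters the transferred bytes is rejected before any decompression code runs on them.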

From a functional standpoint, the main drawback of compressing a database is that the database is unusable while compressed, and decompressing/re-compressing is costly in CPU and, to some degree, in space and SSD/disk wear. But this is justifiable during transit, to reduce the amount of data transferred.

A simple approach to ensuring database security while in transit in a hostile network is to send it PGP/GPG encrypted. This has built-in compression, and can give both confidentiality and integrity, if keys can be safely stored. As pointed out in the question, what's stored encrypted in the database won't compress (unless it's re-encoded to e.g. Base64). I don't know of an alternative with post-quantum crypto. But then Cryptographically Relevant Quantum Computers remain highly hypothetical.
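
The remark about Base64 can be checked with a short Python experiment. This is only a sketch: random bytes from `os.urandom` stand in for real ciphertext, and the 64 KiB size is an arbitrary choice.

```python
import base64
import lzma
import os

ciphertext = os.urandom(64 * 1024)        # stand-in for encrypted data
encoded = base64.b64encode(ciphertext)    # about 4/3 the size, 6 bits of entropy per byte

raw_packed = lzma.compress(ciphertext)
b64_packed = lzma.compress(encoded)

print("raw:   ", len(ciphertext), "->", len(raw_packed))
print("base64:", len(encoded), "->", len(b64_packed))
```

The raw input should come out no smaller than it went in (container overhead makes it slightly larger), while the Base64 text shrinks. Compression of the Base64 form only recovers the expansion the encoding introduced; it cannot go below the size of the underlying ciphertext.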

Maarten Bodewes
I very much like the point that vulnerabilities in the code are dangerous especially if the encrypted data has not been authenticated. This happens a lot with security, e.g. there were serious vulnerabilities in the ASN.1 decoder that Microsoft used before it was even able to authenticate signatures. The larger the attack surface before a message can be authenticated the higher the chance that code vulnerabilities can exist.
Score:0

Compression algorithms have exactly the same security property as identity functions in terms of confidentiality. So the only security disaster would be if you don't apply checksums or error-resilient redundancy.

ANISH M 18CS006
Do you mean I should checksum the compressed SQLite database containing digital signatures or encrypted data, using a hash function like SHA-512?
DannyNiu
That's totally up to you; these are all possible options. Each has both benefits and hassles, and what fits one application may not necessarily fit others.
fgrieu
The answer's first sentence is about _lossless_ compression algorithms, which includes correct implementations of the LZMA mentioned in the question. About the second sentence: in a crypto context _"checksums or error-resilient redundancy"_ applied to a whole database containing encrypted and/or signed data are ___not___ to be considered helpful against manipulations by adversaries.
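
A small Python sketch of why an unkeyed checksum does not help against an adversary, whereas a keyed MAC does. It assumes the checksum travels alongside the data over the same hostile network, and that only sender and receiver know `key`. The helper names `checksum` and `mac` are hypothetical.

```python
import hashlib
import hmac
import os


def checksum(data: bytes) -> bytes:
    """Unkeyed hash: anyone can compute it."""
    return hashlib.sha512(data).digest()


def mac(key: bytes, data: bytes) -> bytes:
    """Keyed MAC: computing a valid tag requires the secret key."""
    return hmac.new(key, data, hashlib.sha512).digest()


key = os.urandom(32)                 # assumed secret, unknown to the adversary
original = b"database contents"
sent_mac = mac(key, original)

# The adversary swaps the data and simply recomputes the unkeyed checksum.
forged = b"tampered contents"
forged_checksum = checksum(forged)

# Receiver-side checks on what arrived.
checksum_accepts = hmac.compare_digest(checksum(forged), forged_checksum)
mac_accepts = hmac.compare_digest(mac(key, forged), sent_mac)

print("checksum accepts forgery:", checksum_accepts)
print("MAC accepts forgery:", mac_accepts)
```

The checksum still detects accidental corruption; it just offers nothing against someone who can rewrite both the data and the checksum.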