Can knowledge of the algorithm be used to reduce anonymity?

This is a bit of a crazy hypothetical, but I think it best illustrates what I'm looking to ask.

Scenario + Question

There's a distributed & decentralized cloud storage network that's starting to get popular. One thing people like about it is the fact that it allegedly grants users tons of privacy they can't get from alternatives like Dropbox.

The company's technical docs state that their solution is more private than the others because their app encrypts everyone's files (client side) before they're sent to the network. Thus the nodes (distributed & decentralized; let's say like BitTorrent) have no clue what the content is, since it's encrypted.

Obvious Potential Dilemma

Accepting what was said above, the obvious question here becomes, 'How is the network able to retrieve the content a user requests?'.

This must be answered because:

  1. The network needs to be able to identify the content in question

  2. No auth is needed here, though; if a user does not have the keys to decrypt, they're left with an opaque blob (there's the issue of DDoS, but that's outside the scope here).

The common solution is to make the data content-addressable: the (encrypted) blob is hashed with an algorithm like BLAKE3, and the resulting digest serves as its address on the network. For the sake of brevity, I'll let you all fill in the blanks on all the other little things that need to be set up so that the network can retrieve said data.
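A minimal sketch of what content addressing looks like, in case it helps. Everything here is an assumption for illustration: `ToyNode`, `content_address`, and the sample blob are hypothetical names, and BLAKE2b from Python's standard library stands in for BLAKE3 (which needs a third-party package).

```python
import hashlib


def content_address(encrypted_blob: bytes) -> str:
    """Derive the blob's network address from its bytes alone.

    Assumption: BLAKE2b (stdlib) is used here as a stand-in for BLAKE3.
    """
    return hashlib.blake2b(encrypted_blob, digest_size=32).hexdigest()


class ToyNode:
    """Hypothetical storage node: it sees only ciphertext and its hash."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        addr = content_address(blob)
        self._store[addr] = blob
        return addr

    def get(self, addr: str) -> bytes:
        # No auth: anyone who knows the address gets the (still encrypted) blob.
        return self._store[addr]


node = ToyNode()
blob = b"\x8f\x02 pretend these are ciphertext bytes"
address = node.put(blob)
retrieved = node.get(address)
```

Note that the node never needs a key or a filename, but it does necessarily learn the address being requested and the number of bytes stored under it.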

Let's assume those measures have been implemented for this network.

Meet 'Raina' the Pornstar

note for mods: This is not a troll. I know the idea of "porn" itself is still taboo in the mainstream to a large extent, but that's part of the reason why this hypothetical was crafted for this question. We live in a world where the consequences for participating in the porn industry can range from being stigmatized socially to being legally punished or even stoned.

Real-Life Example: A few years ago, an adult film actress by the name of "Belle Knox" made headlines all over U.S. media when one of her classmates at Duke University decided to 'dox' her to the student body & anyone else that cared after seeing her on campus and recognizing that she was the same woman known only [at that time] by the stage name, 'Belle Knox' in adult films. For some reason, the doxxing of her identity on Duke's campus led to a media firestorm, which of course made her collegiate experience (and life, overall), a nightmare from that point onward. Shortly after her exposure, it was reported that, "Knox began to receive threats of violence and death in person and via social media sites such as Twitter and Facebook", on top of the nationwide scrutiny, public embarrassment and humiliation as well as fear of being victimized by other unsavory individuals after online campaigns began cropping up that advocated 'stalking' her down to commit various sexual crimes against her.

Hypothetical Scenario

don't worry this is brief, I'll get to the point; I am only using this hypothetical because I found it extremely difficult to craft this question outright with enough specificity to allow the reader to entirely understand what it is I was asking

Raina is an adult film star. Raina is a professional. She's shot a ton of content over her career which she'd like to keep for portfolio purposes / share with agencies for future work.

Raina's films typically run 30-60 minutes or more, so her personal computer doesn't have the capacity, and she doesn't want to store everything on an external hard drive for fear of losing that device. So she elects to use the distributed, decentralized cloud storage network I described at the beginning of this post to store her data.

Given Raina's profession, privacy is important to her. In this hypothetical universe, if it were discovered that she was actively archiving & storing voluminous amounts of content produced from her tenure in the adult film industry, her personal life, finances, and even general safety would be compromised entirely.

Getting to the Point

The idea behind encrypting data client-side before sending it to the distributed cloud is to provide a simple workflow for the network's users that can protect them from prying eyes. However, suppose Raina's identity on the network (as the requesting entity) becomes known through an information leak.

Since she's publicly known under her stage name ('Raina') in the capacity of an adult film star, the automatic assumption made by anyone who sees the leak is that she must be using this distributed cloud network to store her videos. Of course, she could be storing other content (e.g., photos, documents, etc.).

Main Question

Let's assume this storage network is open source with code published on GitHub along with meticulous documentation / wikis breaking down all the granular details of the network.

These docs contain the encryption scheme used with the network (let's say AES-256 [whatever-mode-fits-best]).

Let's say the hacker is able to ascertain what content the (now-unmasked) Raina is requesting on the network. It's still an encrypted blob. But I'm assuming once that blob is discovered, the snoop would be able to determine approximately how large the underlying file is, right?
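To make the size question concrete: under common AES modes, ciphertext length is a simple function of plaintext length. The sketch below assumes either AES-GCM with a 12-byte nonce and a 16-byte tag stored alongside the ciphertext, or AES-CBC with PKCS#7 padding and a 16-byte IV. The post doesn't pin down the mode or the container layout, so those layouts and the function names are assumptions.

```python
def gcm_ciphertext_size(plaintext_len: int, nonce_len: int = 12, tag_len: int = 16) -> int:
    """GCM behaves like a stream cipher: constant overhead, no padding."""
    return plaintext_len + nonce_len + tag_len


def cbc_ciphertext_size(plaintext_len: int, block: int = 16, iv_len: int = 16) -> int:
    """PKCS#7 always adds between 1 and `block` bytes of padding."""
    padded = (plaintext_len // block + 1) * block
    return iv_len + padded


def plaintext_range_from_cbc(ciphertext_len: int, block: int = 16, iv_len: int = 16) -> tuple[int, int]:
    """What an observer can infer about the plaintext length (inclusive range)."""
    padded = ciphertext_len - iv_len
    return padded - block, padded - 1


video_len = 40 * 10**9  # a 40 GB file, as in the scenario
gcm_size = gcm_ciphertext_size(video_len)
cbc_size = cbc_ciphertext_size(video_len)
low, high = plaintext_range_from_cbc(cbc_size)
```

Under these assumptions the observer learns the plaintext length exactly (GCM) or to within 16 bytes (CBC), so "approximately" is generous to the defender.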

Let's say it's a 40GB encrypted blob. In my opinion, this would be a devastating blow to Raina's opsec because:

A) Her identity (as a requester) is now known.

B) The fact she is an actress in the porn industry quickly follows from her de-masked identity (let's assume in this hypothetical world that no normal woman is ever named 'Raina'; or her full "porn name" is, 'Raina Racks').

C) Given 'A' and 'B', if someone is able to determine the size of the content requested by 'Raina', I believe the observer could reasonably deduce that the uploaded content is likely her filmed content, which she's either archiving or storing for later release.

Sure, this may not reveal exactly what Raina uploaded to the network, but if we assume that Raina did not want anyone to even know the nature of what she was uploading, her privacy has been 100% exposed in this scenario.

I'm wondering whether cryptographers reading this scenario will agree with my conclusion here. If so, I'm wondering if there are any encryption / obfuscation schemes that could potentially account for this (admittedly very fringe) case.
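One family of mitigations I can think of is length padding: before encryption, the client pads every file up to one of a small set of bucket sizes, so many different plaintext lengths map to the same ciphertext length. Below is a rough sketch, assuming power-of-two buckets and an 8-byte length header so the padding can be removed after decryption. The names `padded_size`, `pad`, and `unpad` are hypothetical, and the encryption step itself is omitted.

```python
HEADER_LEN = 8  # big-endian original length, stored in front of the data


def padded_size(n: int, minimum: int = 4096) -> int:
    """Smallest power-of-two bucket (at least `minimum`) that holds n bytes."""
    size = minimum
    while size < n:
        size *= 2
    return size


def pad(plaintext: bytes) -> bytes:
    """Pad to a bucket size; the result is what would then be encrypted."""
    total = padded_size(len(plaintext) + HEADER_LEN)
    header = len(plaintext).to_bytes(HEADER_LEN, "big")
    filler = b"\x00" * (total - len(plaintext) - HEADER_LEN)
    return header + plaintext + filler


def unpad(padded: bytes) -> bytes:
    """Recover the original bytes after decryption."""
    n = int.from_bytes(padded[:HEADER_LEN], "big")
    return padded[HEADER_LEN:HEADER_LEN + n]


small = b"a" * 5000
large = b"b" * 7000
```

Here `small` and `large` both pad to 8192 bytes, so an observer can no longer tell them apart by size. The trade-offs are real, though: power-of-two buckets can nearly double storage, and padding only blurs the size within a bucket. It would not hide the fact that someone is storing tens of gigabytes rather than a few documents.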

In Closing: 'Explaining Why I Believe This Question / Case is Worthy of Attention'

I feared that the porn star example would make readers dismiss this post or perhaps even report it as spam, but after giving it further consideration, I decided this was an unfair indictment of this Stack Exchange site's participants.

Over the past few months, I learned that Visa, Mastercard, and other major transaction / payment processors can (and often do) refuse to allow access to their services if it is discovered that the vendor is engaging in activities that the company finds reprehensible. For those in the sex industry (United States) (i.e., adult film actress), it does not matter whether their activities are 100% regulated and legal under the law.

For women that generate most of their income in this manner, this action could cripple their ability to survive, entirely. Thus, in our hypothetical, it is not sufficient for Raina's content to be 'hidden' / unintelligible alone.

Digging into the 'Human' Element of Cryptography

I wanted to ask this question specifically because (in my opinion), despite the mathematical rigor involved in crafting, benchmarking, testing, and/or attempting to 'crack' a cryptographic scheme, we can't look at these algorithms / functions in a "vacuum". While not always explicitly emphasized in research, there's a very real, human element we must consider when it comes to cryptography and assessing its use / effectiveness. For example, I've seen various studies, write-ups etc. from cryptographers and security analysts that factor in a cryptographic algorithm's 'ease of deployment' in their overall assessment of the scheme's security.

Getting to the Point

I could be entirely misguided in this belief - but when we dig into the core purpose of encryption itself (as an idea), it is to completely mask / hide whatever data is being encrypted.

Applying my (potentially misguided) understanding of the (social / civilian; not military or gov) purpose of encryption, I wanted to pose the hypothetical above to everyone to see what their opinions were of the situation.

Please feel free to address / respond to this question in any manner you feel appropriate. You can answer the questions I posed above or just give general feedback or thoughts on what I wrote about above.

Note For Moderators: If this post is deleted, I understand and will not re-ask.
