Score:3

Would sending audio fragments over a phone call be considered a form of cryptology?


I have been wondering if sending audio fragments over a phone call would be considered a form of cryptology.

Let's say that you own two mobile phones and say that one of your phones is on the Verizon network and the other is on the AT&T network. You have a friend who also owns two mobile phones. Say that one of his phones is on the T-Mobile network and the other is on the U.S. Cellular network.

One day, you decide to call your friend with your Verizon smart phone and he answers on his T-Mobile phone. Without hanging up on the phone call you just made, you grab your AT&T mobile phone and call your friend again and he answers on his U.S. Cellular mobile phone.

Next, you put both of your mobile phones down on a table and you then put each of them on speaker. You then tell your friend to put both of his mobile phones down on a table and to turn on the speaker on each one.

You then turn on Mute on one of your smartphones and begin speaking to your friend. Every two seconds you press the Mute button on both of your smart phones at the same time, which turns off the Mute on one phone and turns on the Mute on the other phone.

As you are speaking, your friend should hear everything you are saying because he should hear your voice coming over on one of his smart phones. When your friend speaks to you, he does the same thing you did and you should hear everything he is saying because his voice is coming over on one of your mobile phones.

Now, let's say that your phone conversation happened to be listened to or recorded by an outside source. As they are listening, or when they play back the recording, they should hear two-second audio fragments followed by two seconds of silence, which should make your phone conversation rather difficult to comprehend.

Moreover, say that more mobile phones are used, creating more concurrent phone calls between the two parties. On each individual phone call, there would be longer periods of silence between the two-second audio fragments. This should make it almost impossible for an outside source, whether listening in or playing back any one of these recorded phone calls, to understand the overall phone conversation.

Would sending audio fragments over a phone call be considered a form of cryptology?

Dan
This is a kind of trunking, and not a form of encryption, no. https://en.wikipedia.org/wiki/Trunking As other answers point out, it's possible to piece the fragments together if you can gain access to them. Cryptography is designed so that even if you have all of the pieces, then you still don't get the data. It's the difference between cutting a letter up into pieces (shredding) and using some sort of cipher (e.g. a one-time pad) to scramble the letters.
Cort Ammon
As a rationale for why this would not be crypto related, consider that the entire purpose of the "packet switched" internet was to permit messages to take different paths and be re-assembled on the other side. If this counted as crypto, then quite literally the entirety of the internet would count as crypto.
Paul Uszak
@CortAmmon Err, Packet Switching guarantees that **one** stream is _"re-assembled on the other side."_ This question is about multiple independent streams, and the (random) phase difference between them. Then imagine reducing the 2 seconds down to mere tens of milliseconds and upping the stream count...
Y cn ftn fgr t th mnng f smthng wth mny prts mssng lk ll th vwls
R.. GitHub STOP HELPING ICE
While it had some historical usage, the word "cryptology" is so out-of-use and disconnected from modern senses of what cryptography means that I would treat any proposal to do something as a form of "cryptology" as **extremely suspect**. It's a red flag that the person proposing it has no domain expertise and likely no idea what they're talking about.
ZOMVID-21
You don't even need all the pieces. Voice communication is highly redundant, so 50% of the stream will be enough to recover 90-99% of the text with manual audio processing. With machine learning, probably better.
Score:12

According to Wikipedia, cryptography, or cryptology, is “the practice and study of techniques for secure communication in the presence of adversarial behavior.” With that definition in mind, this would not be a form of cryptography because it doesn't in any way constitute secure communication.

In most contexts and threat models, we want communication between two parties to provide privacy (that is, an attacker cannot read or interpret the data), integrity (an attacker cannot modify or tamper with the data without being detected), and authenticity (one has reasonable confidence that the other party is the intended party). For each of these, defeating the property should be no easier than brute force, given the security claims of the algorithms in question.

It is very easy for an adversary to set up a fake cell site and route all local cell traffic to it, acquiring all of the data. In addition, because it's not encrypted in any way, it's possible for adversarial governments to acquire all the telephonic communications (because this ability is required in most countries) and piece them together with software. Finally, there's no integrity protection here, so anyone can tamper with the data without being detected, and there's no authentication, so I can pretend to be your friend with an AI voice generator after doing a SIM swap on the number to my phone.

A proper cryptographic solution would address all of these with some sort of cryptographic algorithm which provides a much lower risk of attack (on the order of $ 2^{-128} $).
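
As a rough illustration of how little work "piecing them together with software" takes, here is a toy Python sketch. It assumes the attacker has time-aligned recordings of every call and models a muted line as zeros; the function names and the fragment length in samples are made up for the example.

```python
# Toy model: each "sample" is an integer; 0 stands for silence on a muted line.
SAMPLES_PER_FRAGMENT = 4  # hypothetical stand-in for the two-second fragments


def split_across_phones(signal, phones=2):
    """Route each fragment to one phone in turn; the other phones stay silent."""
    channels = [[0] * len(signal) for _ in range(phones)]
    for i, sample in enumerate(signal):
        active = (i // SAMPLES_PER_FRAGMENT) % phones
        channels[active][i] = sample
    return channels


def eavesdropper_recombine(channels):
    """Sum the recordings sample by sample; no secret is needed."""
    return [sum(samples) for samples in zip(*channels)]


signal = list(range(1, 17))            # placeholder "speech"
channels = split_across_phones(signal)
recovered = eavesdropper_recombine(channels)
print(channels[0])   # one call alone: fragments separated by silence
print(recovered)     # all calls summed: the original signal
```

Each call on its own has gaps, but adding the calls together gives back exactly what was spoken, which is also what the friend's ears are doing at the other end.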

Score:8

This overly complicated idea is at best a form of communication link obfuscation against eavesdroppers. It is at the transmission/network level, albeit with multiple networks and a manual and complicated way of buying some physical security.

No encryption of any strength is performed. If you want to call it steganography because it hides the transmission, fine.

Also, if a human were actually monitoring that voice conversation, as you assume, they would sense right away that something was fishy, drawing attention to the owner of the mobile phone service.

Lucas
For this to be considered steganography, there should be a message concealed in another message: a sentence inside a bigger text or an image, an image inside a sound, etc. And it should not be evident to human inspection. A fragmented phone conversation is not concealed, and it is evident that there is a message.
Score:7

Yes, it's starting to be.

You're starting along the lines of a manual version of Frequency Hopping Spread Spectrum techniques. You're using phones in this case, whereas in reality radios would be used. But the effect is the same, and is used for both civilian and military applications.

Now imagine instead of two phones (as you suggest), you had 50 or more. And imagine that you can un-mute the phones for less than 0.4 seconds. So each fragment of conversation can be on any phone for less than 0.4 seconds ($t_{ch}$). And no phone is reused for 20 seconds ($T$). These numbers are from US FCC guidelines, § 2.2 FHSS equipment. I can't speak for military specifications.

Now lastly imagine that the mute / un-mute order is pseudo random as determined by a preshared key. TCP/IP, cellular latency and propagation delays would mess up the relative packet order / phase given a short $t_{ch}$. Even if all the phone conversations could be intercepted, the assailant would only have a series of short disjoint throaty sounds that would be very difficult to re-order into a sensible conversation without knowledge of the key.

It's not nearly as effective as using multiple frequencies on a radio, and appears frivolous with phones. For teaching purposes, though, you can see how the principle works under the control of a secret key.

That's cryptography.


From § 2.2 FHSS equipment:-

$t_{ch}$ = average time of occupancy.

$T$ = period.
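
To make the "pseudo random order determined by a preshared key" idea concrete, here is a minimal Python sketch. The 50 phones, 0.4 s and 20 s figures are the ones quoted above; everything else (the `hop_order` name, ranking channels by an HMAC-SHA256 tag, the example keys) is my own assumption for illustration and not a vetted design. Note that a fresh permutation per period only guarantees each phone is used once per period; it does not enforce the full 20 s gap across a period boundary.

```python
import hashlib
import hmac

CHANNELS = 50      # number of phones, as in the answer
DWELL_S = 0.4      # t_ch: time spent on one phone
PERIOD_S = 20.0    # T: period before a phone comes around again


def hop_order(key: bytes, period_index: int) -> list[int]:
    """Keyed permutation of the phones for one period.

    Channels are ranked by an HMAC tag, so both parties holding the same
    key derive the same order. Illustrative assumption only.
    """
    def tag(channel: int) -> bytes:
        msg = period_index.to_bytes(8, "big") + channel.to_bytes(2, "big")
        return hmac.new(key, msg, hashlib.sha256).digest()

    return sorted(range(CHANNELS), key=tag)


slots_per_period = round(PERIOD_S / DWELL_S)     # 20 s / 0.4 s slots
order_a = hop_order(b"preshared key", 0)         # sender
order_b = hop_order(b"preshared key", 0)         # receiver, same key
order_c = hop_order(b"some other key", 0)        # outsider guessing a key
```

The arithmetic lines up: 20 s divided into 0.4 s slots gives 50 slots, one per phone. As the comments below point out, though, a secret order alone does not hide anything if the adversary can simply sum the lines.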

Gilles 'SO- stop being evil'
No, something is missing to provide any confidentiality against a passive adversary (e.g. recording traffic at the local cell tower). The adversary can sum the signals from all the phones and will get the conversation; that's exactly what the other party in the conversation is doing. To achieve confidentiality, you'd have to send some kind of traffic on the non-active conversation(s) that the adversary can't just cancel out, and you'd need some method for the legitimate party to know which line to tune out when.
kodlu
FH Spread spectrum is most definitely not cryptography on its own, at the level that it is a more advanced version of the "channel hopping" the OP is proposing. If you are getting into generating FH sequences cryptographically, that is in no way similar to what the OP is proposing.
Paul Uszak
@Gilles'SO-stopbeingevil' "Yes, it's starting to be." & "TCP/IP, cellular latency and propagation delays would mess up the relative packet order / phase given a short $t_{ch}$"...
Paul Uszak
@kodlu "It's not nearly as effective as using multiple frequencies on a radio, and appears frivolous with phones. For teaching purposes though you can see how the principle works"...
ZOMVID-21
Restoring the fragments' order is a simple matter of doing an FFT and matching ends and beginnings of fragments by spectrum similarity.
Paul Uszak
@Therac Well done. And if there are spectroscopic similarities between independent randomised millisecond audio chunks? NATO might be interested in your theory...
ZOMVID-21
@PaulUszak NATO uses this and a lot more techniques for SIGINT. This is no more secure than an office shredder - time-consuming, but straightforward to reverse. It's cryptographically insecure, vulnerable to KPA, and could be used as a teaching example of what *not* to do - a good homework would be for each student to come up with their own attack to defeat this obfuscation.
Paul Uszak
@Therac "It's not nearly as effective as using multiple frequencies on a radio, and appears frivolous with phones. For teaching purposes though you can see how the principle works"...
ZOMVID-21
@PaulUszak It can be used as an example of what doesn't work. For cryptographic use, you'd have to switch phones 8,000 times per second - the equivalent of single-pixel jigsaw pieces.
ZOMVID-21
A cryptographer should assume worst-case plaintexts. If a siren sounds during the conversation (CPA/KPA), it provides a perfect cue for recovery, giving the attacker the key.
Score:3

You're mixing up encoding with encryption, which is a common fallacy because lay people think of "code" as something obscure and (intentionally) incomprehensible. But this is really the opposite of what a code/encoding means. An encoding is a protocol for ensuring that any party who knows the protocol can recover the message. Encryption is a protocol for ensuring (among other properties) that any party who does not know a particular secret (even if they do know the protocol!) cannot recover the message.

What you have is a way of encoding information. It does not prevent anyone from recovering the message; it gives them a means for doing so.

Score:2

Your idea is a bit like stream ciphers, but much less efficient.

In a stream cipher, an algorithm generates a stream of bits from a key, and those bits toggle the bits of the plaintext message (assuming it's bit-encoded, like MP3 or AAC) using the XOR operation. In your case, the bits are used to select the communication channel (carrier).

Stream ciphers work at the bit level, which maximizes encryption efficiency and security; switching channels quickly, on the other hand, isn't efficient, and allows for message re-construction by any party that had harvested your communication.
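
A small Python sketch of the two uses of the same key-derived bits. The keystream here (SHA-256 over a counter) is a toy I made up for the example, not a recommended cipher, and the message, key, and function names are placeholders.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key plus counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Stream-cipher style: every byte is XORed with a keystream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


def route_by_keystream(key: bytes, data: bytes) -> list[tuple[int, int]]:
    """Channel-hopping style: the keystream only picks channel 0 or 1."""
    ks = keystream(key, len(data))
    return [(k & 1, d) for d, k in zip(data, ks)]


message = b"meet at noon"
key = b"shared secret"

ciphertext = xor_cipher(key, message)          # bytes on the wire are changed
decrypted = xor_cipher(key, ciphertext)        # XOR again with the same key

routed = route_by_keystream(key, message)      # (channel, byte) in time order
on_the_wire = bytes(d for _, d in routed)      # what a tap on both lines sees
```

With XOR, what travels is different from the message and the key is needed to undo it. With channel selection, every byte that travels is still a plaintext byte; only the line it travels on depends on the key.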

jcaron
The big difference here is that the stream cipher will modify the stream. Here the stream is just spread out over different channels, and given the other channels are muted when not in use, one just needs to add the different channels together to retrieve the original data.
DannyNiu
@jcaron Yes. And that's why I said "*allows for message re-construction by any party that had harvested your communication*" towards the end of my answer.
Score:1

What you just described is in theory similar to what we used in the military with frequency-hopping radios (HAVE QUICK is the name of the system). Essentially, the radio would hop among many different frequencies per second while transmitting and receiving, so each fragment is just a millisecond of audio. It relied on each radio being keyed to a set of tables and the radios being synchronized through a tone or time slice. On top of the hopping in the clear, you could layer secure crypto to prevent anyone from collecting the small millisecond fragments without having the same key. FYI, talking on secure radio is annoying because you have a slight delay, as you can imagine, for technical reasons :)
