Score:0

One-time-pad Encryption in C

vn flag

I have been messing around with cryptography (for recreational use), and I created my own one-time-pad encryption script in C. Now, with that being said, I freely admit that I am by no means a cryptography expert. I know that Rule No. 1 of cryptography is not to do it yourself. However, I am genuinely interested in whether my encryption script is (theoretically) secure.

First, here is a brief overview of what I am trying to achieve with my encryption method:

The goal is one-time-pad encryption that uses both (1) a hash table and (2) an encryption key. In this case the hash table is hard-coded with the values 01-98.

The encryption key is calculated as follows:

  1. Get random user input (at least the same length as the message)
  2. Get corresponding index for char in validChars[] (source code below)
  3. Get the number in defaultHashTable that corresponds to #2's index

The message is encrypted as follows:

  1. Take the message and convert it into the corresponding defaultHashTable values
  2. Take the previously generated key and add it to the converted message
  3. Now it is encrypted (assuming the same key is never used again)

For example:

  1. Message: hello
  2. Convert to corresponding defaultHashTable values: hello -> 0805121215
  3. Get random chars for key: abkdh
  4. Convert to corresponding defaultHashTable values: abkdh -> 0102110408
  5. Add the key: 0805121215 + 0102110408 = 0907231623
  6. Encrypted message: 0907231623
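
The worked example above can be reproduced with a few lines of C. This is only a sketch, not the code from the question: `table_value` and `add_pairs` are hypothetical names, the table is reduced to a-z mapped to 01-26, and sums above 99 are simply rejected because the post does not say how `otpeAdd()` handles them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: maps 'a'..'z' to 1..26, mirroring the first 26
   entries of the default table; returns -1 for anything else. */
static int table_value(char c) {
    if (c >= 'a' && c <= 'z') return c - 'a' + 1;
    return -1;
}

/* Adds message and key values pair by pair and writes two digits per
   character into out (which needs 2 * strlen(msg) + 1 bytes).
   Returns 0 on success, -1 on invalid input or a sum above 99. */
static int add_pairs(const char *msg, const char *key, char *out) {
    assert(msg != NULL && key != NULL && out != NULL);
    size_t n = strlen(msg);
    if (strlen(key) < n) return -1;          /* key must cover the message */
    for (size_t i = 0; i < n; i++) {
        int m = table_value(msg[i]);
        int k = table_value(key[i]);
        if (m < 0 || k < 0 || m + k > 99) return -1;
        snprintf(out + 2 * i, 3, "%02d", m + k); /* two digits plus '\0' */
    }
    out[2 * n] = '\0';
    return 0;
}
```

Calling `add_pairs("hello", "abkdh", out)` with an 11-byte `out` buffer fills it with `0907231623`, matching step 6.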

Here is the source code (NOTE: This is a combination of functions that were in separate C header files, which is why I am not posting a main() function):

// The key (for the message) will be maximum of [50,000][3] (so an array where each is 2 characters):
typedef struct {
    char key[MAX_SIZE][3];
} Key;
Key globalKey;

// The encrypted message will be returned in this struct (because it is just an easy way to return a two dimensional array from a function in C):
typedef struct {
    char encryptedMessage[MAX_SIZE][3];
} EncryptedMessage;

// The hash table is a pre-coded two dimensional array (which is loaded in initDefaultHashTable()), which will be used in conjunction with the key to encrypt messages:
// NOTE: There is also another encryption mode that I am not including (in an attempt to make this concise) where you have to manually type in random two digit numbers (97 of them) for the hash table
typedef struct {
    char hashtable[HASHTABLE_CAPACITY][3];
} DefaultHashTable;
DefaultHashTable defaultHashTable; // Declare a global defaultHashTable that will store this hashtable

// Load the defaultHashTable with 1-98 respectively:
void initDefaultHashTable(){
    for (int i = 0; i < HASHTABLE_CAPACITY; i++){
        char s[3];
        // "%02d" zero-pads single-digit values, so 1 becomes "01" and 10 stays "10":
        snprintf(s, sizeof s, "%02d", (i+1));

        for (int j = 0; j < 2; j++){
            defaultHashTable.hashtable[i][j] = s[j];
        }
    }
}

// Valid chars that the message can contain (97 of them):
char validChars[] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','!','@','#','$','%','^','&','*','(',')','-','+',' ',',','.',':',';','\'','\"','[',']','{','}','_','=','|','\\','/','<','>','?','`','~','\n','\t','0','1','2','3','4','5','6','7','8','9'};

// For char return fails (I feel like there is a better way to do this part):
char FAILED = (char)255;

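// Find the index of a valid char (from validChars) or FALSE if it doesn't exist.
// NOTE: if FALSE is defined as 0, a failed lookup cannot be told apart from index 0 ('a'); a negative sentinel such as -1 would avoid this.
int findChar(char c){
    // validChars has no terminating '\0', so use sizeof instead of strlen():
    for (int i = 0; i < (int)sizeof(validChars); i++){
        if (validChars[i] == c){
            return i;
        }
    }
    return FALSE;
}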

// Return char from validChars given index:
char returnChar(int index){
    return validChars[index];
}

// Get the index of a given hash table value (from defaultHashTable) and then, if applicable, the corresponding char in validChars:
char findHashTableValue(char n[3], char hashtable[][3]){
    for (int i = 0; i < HASHTABLE_CAPACITY; i++){
        if (hashtable[i][0] == n[0] && hashtable[i][1] == n[1])
            return returnChar(i);
    }
    return FAILED;
}

// MAIN part of the code (the main c function would call this): Encrypts using one time pad encryption, but using a default hash table to save time:
void goThroughLightEnryptionProcess(char * str, char * write_to_file){
     // Load the defaultHashTable:
    initDefaultHashTable();

    // Uses function to create random key (based off of random user input):
    generateRandomKey(strlen(str), MAX_SIZE, FALSE);

    // Encrypt the message:
    EncryptedMessage encryptMsg = otpEncrypt(defaultHashTable.hashtable, str, globalKey.key);

    // Loop through and print the contents (if not write to file):
    if (write_to_file == NULL){
        for (int i = 0; i < strlen(str); i++){
            for (int j = 0; j < 2; j++){
                printf("%c", encryptMsg.encryptedMessage[i][j]);
            }
        }
        printf("\n");
    } else {
        // Write encrypted message to file:
        // NOTE: this is another param you can pass in the actual program (which is much larger than this) where you can write it to a file instead of displaying it in the terminal:
        // NOTE: I did not include the writeFileWithTwoDimensionalArray() code here because it is irrelevant, as it simply writes to a file.
        writeFileWithTwoDimensionalArray(encryptMsg.encryptedMessage, HASHTABLE_CAPACITY, write_to_file);
    }
}

// (Helper Function) Load the two char array into the key:
void loadIntoKeyForRandoKey(int at, char n[3]){
    for (int i = 0; i < 2; i++){
        globalKey.key[at][i] = n[i];
    }
}

// Generate a key based off of random user input:
void generateRandomKey(int password_length, int max_size, bool use_global){
    // @Uses globalHashTable | defaultHashTable
    // @Uses globalKey
    char response[max_size];

    printf("Enter random characters for the key (a-z,A-Z,!@#$%%^&*()-+=, min length of %d): ", password_length);
    fgets(response, max_size, stdin);

    // Remove '\n' character at the end:
    char * p;
    if ((p = strchr(response, '\n')) != NULL){
        *p = '\0';
    } else {
        scanf("%*[^\n]");
        scanf("%*c");
    }

    // Make sure user input is >= password_length:
    if (strlen(response) < password_length){
        printf("\n[ ERROR ] : Random characters must be greater than or equal to %d.\n", password_length);
        return generateRandomKey(password_length, max_size, use_global);
    }

    // Convert the random chars into their equivalence in the hashtable respectively:
    for (int i = 0; i < password_length; i++){
        int getCharIndex = findChar(response[i]);
        
        // Make sure it was successfully found:
        if (getCharIndex == FALSE){
            printf("\n[ ERROR ] Character '%c' is invalid. Try again.\n", response[i]);
            return generateRandomKey(password_length, max_size, use_global); // Do it again
        }

        // Load corresponding hashtable value into key:
        if (use_global == TRUE){
            loadIntoKeyForRandoKey(i, globalHashTable.hashtable[getCharIndex]);
        } else {
            loadIntoKeyForRandoKey(i, defaultHashTable.hashtable[getCharIndex]);
        }
    }

    // Write the random key to a file:
    createFileWithTwoDimensionalArray(globalKey.key, password_length, "key");
}

// (Helper Function) For loading into the EncryptedMessage struct:
void loadIntoEncryptedMessage(int at, char n[3], EncryptedMessage *encryptedMsg){
    if (strlen(n) == 1){
        // Prepend a '0':
        char tmp = n[0];
        n[0] = '0';
        n[1] = tmp;
    }

    for (int i = 0; i < 2; i++){
        encryptedMsg->encryptedMessage[at][i] = n[i];
    }
}

/*
    Encrypts a message given a valid hashtable and key
*/
EncryptedMessage otpEncrypt(char hashtable[][3], char * msg, char key[MAX_SIZE][3]){
    EncryptedMessage encryptedMsg;

    for (int i = 0; i < strlen(msg); i++){
        // Convert the key value into integer:
        int convertedKeyValueIntoInt = safelyConvertToInt(key[i]);

        // Make sure it converted correctly:
        if (convertedKeyValueIntoInt == FALSE){
            printf("[ ERROR ] : The key is corrupted at %d (value = %s).\n", i, key[i]);
            exit(1);
        }

        // Convert the user's message into its equivalence in the hashtable:
        int indexOfMsgChar = findChar(msg[i]);

        // Make sure that findChar() found the value correctly:
        if (indexOfMsgChar == FALSE){
            printf("[ ERROR ] : The password (msg) is corrupted at %d (value = %s). This may have occurred because a char '%c' is not allowed.\n", i, msg, msg[i]);
            exit(1);
        }

        char * correspondingEncryptMsgChars = hashtable[indexOfMsgChar];

        // Convert corresponding encryptMsg chars into int:
        int convertedEncryptMsgCharsIntoInt = safelyConvertToInt(correspondingEncryptMsgChars);

        // Make sure it converted correctly:
        if (convertedEncryptMsgCharsIntoInt == FALSE){
            printf("[ ERROR ] : Hash table is corrupted at %d (value = %s).\n", indexOfMsgChar, correspondingEncryptMsgChars);
            exit(1);
        } 

        // Make the calculation:
        int encryptedFrag = otpeAdd(convertedEncryptMsgCharsIntoInt, convertedKeyValueIntoInt);
        
        // Convert it into a string:
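        char encryptedFragStr[3];
        // snprintf() cannot write past the 3-byte buffer; a sum of 100 or more would be truncated, so otpeAdd() must keep results below 100:
        snprintf(encryptedFragStr, sizeof encryptedFragStr, "%d", encryptedFrag);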
        
        loadIntoEncryptedMessage(i, encryptedFragStr, &encryptedMsg);
    }
    return encryptedMsg;
}

My immediate question would be: if I am using a pre-coded hash table (that anyone could infer), does that make the encryption insecure (even though the key that corresponds to the hash table values is completely random via user input)? Is it only secure if I randomize the hash table numbers (01-98) (and potentially randomize validChars[])?

I am honestly interested whether my logic holds, so any comments, suggestions, or criticisms would be much appreciated.

Paul Uszak avatar
cn flag
Hiya. Excellent choice to use OTPs. But, _"(1) Get random user input (at least the same length as the message)"_. How? If I encode a Tweet-sized message, where does the random input come from? Remember that an OTP **has** to be generated physically via mechanical or biological hardware, **not** software.
vn flag
@PaulUszak Hi! I am currently getting random input by having the user spontaneously type out random characters, at least as many as the message is long. Is that secure?
zw flag
Humans are indisputably terrible at creating random input. OTP *requires, by definition* truly random input as its keys.
Score:1
cn flag

Three problems immediately surface:

  1. The old chestnut of message integrity. Pure one-time pads do not have any means of authenticating themselves, i.e. they are malleable.

  2. _"The key (for the message) will be maximum of [50,000]"_. I specifically talked about Tweet-sized messages for a reason. OTPs were originally created on typewriters, and very successfully. But they were small and there would have been many typists. There is no statistical test that can disprove the randomness of 160 characters (sensibly typed). But there are such tests for 50,000. It is highly unlikely that 50,000 characters typed at random on a keyboard will be just that. Frequency analysis and heuristics will severely downgrade the presumed information-theoretic security of the OTP. And are users really going to type 50,000 letters? A hardware device (TRNG) is necessary for keys of that size.

  3. How will the recipient decrypt the message, unless they have the same key that was used to encrypt it? So how will it get there?

3½. The hash table is pretty much unnecessary. Just use ASCII values, as the security within an OTP comes from the key. It's worthwhile reading some of the OTP-tagged questions here.
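
To make point 1 concrete, here is a minimal sketch of malleability, assuming a plain XOR pad rather than the digit-pair scheme from the question. The names `otp_xor` and `tamper` are made up for illustration. An attacker who guesses the plaintext can rewrite it without ever seeing the key:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR one-time pad: the same call encrypts and decrypts in place. */
static void otp_xor(uint8_t *buf, const uint8_t *key, size_t len) {
    assert(buf != NULL && key != NULL);
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[i];
    }
}

/* Attacker's step (no key needed): XORing the ciphertext with
   guessed ^ wanted makes it decrypt to `wanted`, provided `guessed`
   really was the original plaintext. */
static void tamper(uint8_t *ciphertext, const uint8_t *guessed,
                   const uint8_t *wanted, size_t len) {
    assert(ciphertext != NULL && guessed != NULL && wanted != NULL);
    for (size_t i = 0; i < len; i++) {
        ciphertext[i] ^= (uint8_t)(guessed[i] ^ wanted[i]);
    }
}
```

If the sender encrypts `PAY 100` and the attacker calls `tamper()` with the guess `PAY 100` and the target `PAY 900`, the recipient decrypts `PAY 900` with nothing to show the message was altered. Detecting this needs a separate authentication mechanism, which the pad alone does not provide.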

Score:0
zw flag

This is a lot of code for something that boils down to:

void otp(size_t len, uint8_t *key, uint8_t *message) {
    for (size_t i = 0; i < len; i++) {
        message[i] ^= key[i];
    }
}

Everything else (other than ensuring len(key) == len(message)) is not just unnecessary, but actively detracts from it being an actual implementation of a one-time pad. Even saying "boils down to" here is probably misleading: this isn't just a simplified version; other than grabbing an appropriate-length key from a truly random source, it is entirely complete. That omitted part is likely little more than

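char *key = malloc(len);

assert(key != NULL);
// Keep read() outside assert(): with NDEBUG defined, assert() and any call inside it are compiled out.
ssize_t n = read(fd, key, len);
assert(n >= 0 && (size_t)n == len);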
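
One practical caveat: a single `read()` may legitimately return fewer bytes than requested, so a slightly more defensive version loops until the key is complete. The following is a sketch under that assumption. `read_exact` is a hypothetical helper name, and where `fd` comes from (a hardware RNG device, a pre-shared key file) is left open, as it is above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Reads exactly len bytes from fd into buf, retrying on short reads
   and on EINTR. Returns 0 on success, -1 on error or if the source
   ends before len bytes have arrived. */
static int read_exact(int fd, uint8_t *buf, size_t len) {
    assert(buf != NULL);
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0) {
            if (errno == EINTR) continue;   /* interrupted, try again */
            return -1;
        }
        if (n == 0) return -1;              /* key source ran dry */
        got += (size_t)n;
    }
    return 0;
}
```

The caller should treat a `-1` return as fatal and never encrypt with a partially filled key buffer.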

I haven't looked at what your "hash table" does because whatever it's doing is wholly unnecessary, extremely likely to be a source of bugs causing catastrophic loss of security, and almost certainly violates the fundamental definition of what constitutes a one-time pad.

A one-time pad must have truly random keys. Characters typed on a keyboard are not truly random. The output of /dev/random is not truly random. Getting truly random bits is hard, but there do exist external devices that will collect them for you (assuming you trust the device). Those keys must be the same length as your message (or longer, I suppose).
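
As a rough illustration of why typed keys are suspect, here is a sketch of a chi-squared frequency check. It assumes each key symbol has already been mapped to an index below `alphabet`, and `chi_squared_uniform` is a hypothetical name. A large statistic on a long key suggests bias, but a small one does not prove the key is random.

```c
#include <assert.h>
#include <stddef.h>

/* Pearson chi-squared statistic of the symbol indices in buf against
   a uniform distribution over `alphabet` symbols. Every value in buf
   must be below `alphabet`. */
static double chi_squared_uniform(const unsigned char *buf, size_t len,
                                  unsigned alphabet) {
    assert(buf != NULL && len > 0 && alphabet > 0 && alphabet <= 256);
    size_t counts[256] = {0};
    for (size_t i = 0; i < len; i++) {
        assert(buf[i] < alphabet);
        counts[buf[i]]++;
    }
    double expected = (double)len / (double)alphabet;
    double chi2 = 0.0;
    for (unsigned s = 0; s < alphabet; s++) {
        double d = (double)counts[s] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}
```

A perfectly even key scores 0; the more a typist favours certain keys, the higher the score climbs relative to the `alphabet - 1` degrees of freedom.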

And lastly, while it's not a strict necessity, cryptography is generally best performed on bits and bytes and not some notion of "characters". Attempting to do the latter is setting yourself up for serious bugs leading to catastrophic failure.

Mark avatar
ng flag
It is worth mentioning that "the one-time pad must have truly random keys" is not strictly true --- you can replace the random source with a PRG (in say counter mode) to get $\mathsf{CTR}\$$ encryption. This is of course not perfectly secure, but is perhaps worth mentioning to a newcomer for particularly simple encryption implementations.
zw flag
At that point it's a stream cipher and not a one-time pad. This isn't just an academic distinction, it means the crux of the entire task is now fundamentally different. It's not just XORing bits, the hard part is designing and writing a CSPRNG.
Mark avatar
ng flag
Yes, but it is still conceptually useful to understand stream ciphers. The OTP is conceptually simple, but key management is hard. CTR$ simply replaces the (hard) key management with the (hard) problem of designing a PRG. Fortunately, while both are hard, we seem to actually be able to design PRGs, while actually doing OTP key management at scale seems (mostly) hopeless.
Paul Uszak avatar
cn flag
Interesting. Why do you believe that `/dev/random` is not truly random? I'm referring to the one that block(ed)(s). Are you talking about the new one? Plus, it's not _that_ hard as you can do it without any external kit at all - cpu jitter - `System.nanoTime()` or `haveged`, or the Arduino entropy library.
zw flag
`/dev/random` and `/dev/urandom` [are both output from the exact same CSPRNG](https://www.2uo.de/myths-about-urandom/). Even if you wanted to argue that the kernel estimates a 1:1 ratio between input entropy and output entropy from the former, that guarantee goes out the window once you send it through a CSPRNG which only guarantees computational security and doesn't have a proof of perfect security. `haveged` adds entropy to the kernel entropy estimator, but this still runs into the same fundamental problem.
zw flag
Either way, the point isn't that it's impossible to get truly random numbers. The point is that OTP itself is fundamentally trivial, the "hard" part is getting numbers that are actually truly random and not "looks good to me" random. And then of course, if you have a channel by which you can share those securely with a third party, you might as well just exchange the message itself over that channel.