Score:4

Is there a standard for LFSRs to test against for use in a stream cipher?

wf flag

I am trying to implement a stream cipher that uses an LFSR PRNG. I have found resources online that give good primitive polynomials, but I am struggling to find resources with the initial states as well.

I really need to be able to have something to test against so I know my code is working as intended. I would really like something like the example below but with a "bigger" polynomial.

Example 3.4. Let us consider the first $31$ bits of the binary sequence produced by the LFSR of length $5$ with primitive feedback polynomial $X^5 + X^3 + 1$ from initial state $10000$: $$1000010101110110001111100110100$$ It can be checked that this sequence consists of $16$ ones and $15$ zeroes. It then satisfies the balance property.

kodlu avatar
sa flag
Did you see my answer? Was it useful?
Score:2
sa flag

One test is to pass the generated sequence through the Berlekamp-Massey algorithm and confirm that you obtain the correct LFSR polynomial.

Since you mentioned the balance property, you might be interested in all the Golomb's randomness postulates. Let $d$ be the length of the LFSR which is the degree of the connection polynomial and let $(s_0,s_1,\ldots)$ be the output bit sequence.

R1: In the one period output of length $N=2^d-1$ the number of $1$s is one more than the number of zeroes.

R2: At least half the runs have length $1$, at least one-fourth have length $2$, at least one-eighth have length $3$, etc., as long as the number of runs so indicated exceeds $1$. Moreover, for each of these lengths, there are (almost) equally many gaps (runs of zeroes) and blocks (runs of ones).

R3: The autocorrelation function $$C(t)=\sum_{i=0}^{N-1}(-1)^{s_i \oplus s_{i+t}}$$ is two-valued with $C(0)=N,$ and $C(t)=-1$ otherwise.

Actually R1 and R2 are special cases of the following property. Consider the windows $W_i=(s_i,\ldots,s_{i+k-1})$ of length $k$ observed in the output sequence. Then for $k\leq d-1,$ each nonzero window occurs $2^{d-k}$ times and the all zero window occurs $2^{d-k}-1$ times as $i$ ranges over a period, say $i=0,\ldots,N-1$.

kelalaka avatar
in flag
Your answer, actually, is more than what OP needed. It is the last paragraph they needed to find the connection polynomial and the initial state with the help of BM.
Score:1
ng flag

Is there a standard for LFSRs to test against?

I'm afraid that if there were, there would be several incompatible ones, because for a given initial state vector and feedback polynomial the output differs according to multiple LFSR definitions.

There is only one universal convention in LFSRs: the number $n$ of bits in the shift register is the number of bits in the initial state, and the highest exponent in the feedback polynomial (excluding sign if any). In the question's example, that's $n=5$. The output of the LFSR has period at most $2^n-1$ bits.

For the rest there are many different conventions: Fibonacci or Galois structure, shift direction (relative to reading order of initial state), bit used as output. From only the output of an LFSR, it's impossible to discern any of these details. However, in the question's example, we have additionally the initial state $10000$, and we notice that it matches the left/start of the output $1000010101110110001111100110100$. This is highly characteristic of a Fibonacci LFSR, shifting the state towards the left (opposite direction of reading order of the initial state), with output the leftmost bit; that's also the most natural convention for Fibonacci LFSRs.

Another convention matters: how the polynomial maps to the bits of the state that are XORed to form a new bit. Since that polynomial $X^5+X^3+1$ is non-symmetrical, we can ascertain this convention: the low-order term $1=X^0$ maps to the left of the register/input state; for $i<n$ the term $X^i$ maps $i$ bits to the right of that; the term $X^n$ maps to the bit to be shifted in computed as XOR of the other bits. LFSR

The initial state is $Q_0=1$, $Q_1=0$, $Q_2=0$, $Q_3=0$, $Q_4=0$. The successive outputs are the $Q_j$ starting from $j=0$ onward. The recurrence is $Q_{j+5}=Q_{j+3}\oplus Q_{j+0}$, and the additive constants for indexes in that formula match the exponents in the polynomial $X^5+X^3+X^0$.

The following Python program implements the question's LFSR, and produces the question's output:

p = [5,3,0]      # X**5 + X**3 + X**0
s = '10000'      # initial state
k = 31           # number of outputs
if len(s)==p[0]: # sanity check
    for j in range(k):
        print(end=s[0])
        b = False
        for i in p[1:]:
            b ^= s[i]!='0'
        s = s[1:]+('1' if b else '0')

Try it online! modified for the polynomial $X^{19}+X^{13}+X^9+X^4+X^0$ to get the first 140 bits of the output (out of $2^{19}-1$ for the full period)

10000000000000000001000001000101001001110101011001100110010001101000101111001000000101001011111011111111101010110110011111100101110011010111

This is the degree-19 primitive polynomial given by Janusz Rajski, Jerzy Tyszer: Primitive Polynomials Over GF(2) of Degree up to 660 with Uniformly Distributed Coefficients, in Journal of Electronic Testing: Theory and Applications, vol.19, pp.645-657, 2003. The coefficients are also in Jörg Arndt's Tables of mathematical data, eq-primpoly-w5.txt.

Notice that an LFSR alone is NOT a safe source of keystream in a stream cipher, even if the polynomial and initial state are secret! The Berlekamp-Massey algorithm breaks that cipher with $\lesssim 2n$ bits of known plaintext.

I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.