I would like to understand the algebra behind GCM's security.
Well, you wrote out the mechanics of how the authentication piece of GCM works; however to understand why those mechanics work, it'd be helpful to approach it in a different way.
GCM uses arithmetic in $GF(2^{128})$; this is a field; the important point is that, in a field, any nonzero polynomial of degree at most $n$ has at most $n$ zeros; that is, for any equation:
$$a_n x^n + a_{n-1} x^{n-1} + ... + a_0 x^0 = 0$$
has at most $n$ solutions for $x$ (assuming that at least one of the $a_i$ values is not zero).
Why this is critical will be clear later.
Now, what GCM does to authenticate is that it takes the encrypted data (and the AAD), and translate that into a polynomial; it then converts the nonce as the constant value of the polynomial, and then evaluates that polynomial at a secret value $k$, and the result of that evaluation is the tag; that is:
$$tag := a_n k^n + a_{n-1} k^{k-1} + ... + a_1 k^1 + \text{hash}(nonce)$$
Now, suppose we had a different message/tag pair for the same nonce; that message/tag would pass the GCM integrity check if:
$$tag' := a'_n k^n + a'_{n-1} k^{k-1} + ... + a'_1 k^1 + \text{hash}(nonce)$$
What we can do is subtract the two polynomials (the one corresponding to the good ciphertext that was generated by the valid encryptor, and the one corresponding to the guess by the attacker); we get then [1]:
$$0 = (a_n-a'_n)k^n + (a_{n-1} - a'_{n-1})k^{n-1} + ... + (a_1 - a'_1)k^1 + (tag - tag')k^0$$
This equation holds only if the second ciphertext/tag pair is valid.
Now, we go back to our original observation: this is a polynomial of degree at most $n$ and so there are at most $n$ different values of $k$ for which this equation is true. We also know that at least one coefficient of the polynomial is nonzero (the all zero polynomial is true if the 'forged' ciphertext is precisely the same as the valid one, which isn't considered a forgery). And, if $k$ is selected unpredictably, then there are $2^{128}$ possible values, and so the probability that it just happens to be one of those $n$ values is at most $n2^{-128}$, which is truly tiny.
Now, to turn this argument into an actual proof (which has been done), we need to assume that a) $k$ is selected uniformly at random, and b) $\text{hash}(nonce)$ also is selected randomly (because, if not, the attacker can deduce some things about the value of the polynomial, which would be bad); it turns out that we can tie both into the cryptographical strength of AES.
Now, to answer your questions:
- Is the above correct?
Close; you did get one detail wrong; using your notation, we get the tag computation as $tag_k(c_1, c_2,...c_n) = c_0 + \sum k^nc_n$ (note that $c_0$ is not multiplied by $k$):
- How does AES-GCM chose $k$?
It encrypts the all-zero block using AES using the GCM key.
- If $k=0$, $tag(\text{any_ciphertext})=0$
No, $c_0$ doesn't get multiplied by $k$, so we have in this case:
$$tag(\text{any_ciphertext}) = c_0$$
It still means that, in this specific case, the ciphertext can be modified arbitrarily without modifying the tag, however it isn't obvious.
The obvious question is: "isn't this still bad?". Well, with a 128 bit tag, the attacker always has a probability $2^{-128}$ of just blinding guessing a valid ciphertext/tag pair - $k=0$ has the same probability of occurring, and hence doesn't increase the forgery probability.
- Does $c_n=0$ pose a problem?
No; if you go through the above logic, $c_n=0$ is not a special case.
- But if $k$ happens to be of low order, the scheme will fail.
This specific failure scenario still falls within the probability $n2^{-128}$ as shown above; if you (say) test if $k$ had a small order and reject those values, you won't actually decrease the forgery probability.
[1]: In an even characteristic field such as $GF(2^{128})$, addition and subtraction are the same operation, and so we usually write both as $+$; I kept it as $-$ to make is somewhat clearer to someone unused to such conventions.