Proposition aimed at 1/2/3/4 of the last section of the question: chained hybrid encryption.
We'll use
- A fast symmetric authenticated encryption scheme, such as AES-GCM or ChaCha20-Poly1305 with 256-bit secret key, encryption with key $K$ noted $C=E_K(M)$ and decryption $M=D_K(C)$.
- An asymmetric encryption scheme capable of encrypting 256-bit messages, with encryption noted $C=\mathcal E_P(M)$ and decryption $M=\mathcal D_S(C)$, where $(P,S)$ is a (public, private/secret) key pair. RSAES-OAEP or ECIES will do.
We assume Bob and Carol have generated key pairs $(P_B,S_B)$ and $(P_C,S_C)$, and transmitted public keys $P_B$ and $P_C$ to Alice in a manner insuring integrity and proof of origin (perhaps, by way of digital certificates).
To encrypt towards Carol by proxy of Bob, Alice
- draws two random 256-bit keys $K_B$ and $K_C$
- computes $C_B=\mathcal E_{P_B}(K_B)$
- computes $C_C=\mathcal E_{P_C}(K_C)$
- computes $C_0=E_{K_B}(C_C)$
- computes $C_1=E_{K_C}(M)$
- sends to Bob the message $C=C_B\mathbin\|C_0\mathbin\|C_1$.
Bob receives and stores $C$. When Carol requests the encrypted message, Bob
- extracts $C_B$, $C_0$ and $C_1$ from $C$
- computes $K_B=\mathcal D_{S_B}(C_B)$
- computes $C_C=D_{K_B}(C_0)$
- forms and sends to Carol $C'=C_C\mathbin\|C_1$.
Carol gets $C'$ and
- extracts $C_C$ and $C_1$ from $C'$
- computes $K_C=\mathcal D_{S_C}(C_C)$
- computes $M=D_{K_C}(C_1)$
Whenever a decryption fails, Bob or Carol abort.
Notice that the bulk message $M$ is encrypted only once, meeting the performance requirement.
We can replace the asymmetric encryption scheme with a symmetric one, $(P_B,S_B)$ with the question's $K_2$ and $(P_C,S_C)$ with the question's $K_3$ (and then as an aside the system can be simplified to get rid of $K_B$). But by going symmetric we loose the benefit of asymmetric cryptography: that Alice needs not keep anything secret, and can't use such secret to try decipher other communication sent to Bob or Carol.