I have an API that is supposed to accept a website header supplied by the client and respond differently depending on which website the request is for.
I have whitelisted this header in AWS Cloudfront. To my understanding this should mean that Cloudfront includes it in the cache key.
When I repeat identical curl calls to my endpoint I get different results back from Cloudfront.
The response headers always indicate a cache hit from Cloudfront, but the response body is sometimes for the wrong website. In other words, Cloudfront does not appear to be including the website header in the cache key, and is returning a response body that was cached for a different website header.
Here is a sample output from the script below:
$ tail -f one.txt
23245 - x-cache: Hit from cloudfront
56138 - x-cache: Hit from cloudfront
56138 - x-cache: Hit from cloudfront
56138 - x-cache: Hit from cloudfront
23245 - x-cache: Hit from cloudfront
Notice that the "total" (a JSON key in the response body) differs between calls, even though every line in this file comes from a request with the same website header.
I'm using a script to repeat the calls, so I expect the requests to be identical.
Why is Cloudfront sometimes returning the wrong response?
I am certain that the origin is always returning the correct response for the website header. I've verified this by running my script against the origin without Cloudfront in front of it, and I've also verified that my origin is not being hit when I run this script against Cloudfront.
How can I debug this further? I thought that perhaps I could use "via" to see if one particular edge node was always returning the wrong response, but that didn't work.
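One possible next step, as a rough sketch rather than anything I have confirmed: log the other cache-related response headers next to each total, so a wrong body can be tied to a specific edge location and request ID. The helper names header_value and logCacheDetails are hypothetical, and this assumes the distribution returns the x-amz-cf-pop, x-amz-cf-id and age headers and that no redirect was followed (only the first header block is read).

```shell
#!/bin/bash
# Print the value of one response header from a saved `curl -i` dump.
# Names are matched case-insensitively; only the first header block is read.
header_value () {
    local name="$1" file="$2"
    awk -v name="$name" '
        BEGIN { FS = ":"; name = tolower(name) }
        /^\r?$/ { exit }                  # blank line: end of the headers
        tolower($1) == name {
            sub(/^[^:]*:[ \t]*/, "")      # drop the "name: " prefix
            sub(/\r$/, "")                # drop the trailing carriage return
            print
            exit
        }
    ' "$file"
}

# Hypothetical helper: print the total next to the cache-related headers.
# Assumes the response was saved to temp.txt, as in the script in this post,
# and that x-amz-cf-pop / x-amz-cf-id are present (an assumption).
logCacheDetails () {
    local total
    total=$(sed 's/[^{]*//' temp.txt | jq -r .total)
    echo "$total | x-cache=$(header_value x-cache temp.txt)" \
         "| age=$(header_value age temp.txt)" \
         "| pop=$(header_value x-amz-cf-pop temp.txt)" \
         "| id=$(header_value x-amz-cf-id temp.txt)"
}
```

Calling logCacheDetails right after each curl call (before temp.txt is removed) and appending its output to a log would show whether the wrong totals cluster on one edge location or share an age.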
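#!/bin/bash
# Files that collect the "total" value returned for each website header.
files=("one-totals.txt" "two-totals.txt")
# Start from a clean slate; -f ignores files that do not exist yet.
for i in "${files[@]}"
do
    rm -f "$i"
done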
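callWebsiteOne () {
    # Save the status line, headers and body of a request sent with "website: one".
    curl --location --request GET 'https://my-api.example.com' \
        --header 'Authorization: Bearer 123abc' \
        --header 'website: one' \
        -i > temp.txt
    # Drop everything before the first "{" on each line so only the JSON body reaches jq.
    total=$(sed 's/[^{]*//' temp.txt | jq -r .total)
    # Header lines end in CR; strip it so the log stays clean.
    edgenode=$(grep -i '^via:' temp.txt | tr -d '\r')
    echo "$total - $edgenode" >> one.txt
    echo "$total" >> one-totals.txt
    rm temp.txt
}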
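callWebsiteTwo () {
    # Save the status line, headers and body of a request sent with "website: two".
    curl --location --request GET 'https://my-api.example.com' \
        --header 'Authorization: Bearer 123abc' \
        --header 'website: two' \
        -i > temp.txt
    # Drop everything before the first "{" on each line so only the JSON body reaches jq.
    total=$(sed 's/[^{]*//' temp.txt | jq -r .total)
    # Header lines end in CR; strip it so the log stays clean.
    edgenode=$(grep -i '^via:' temp.txt | tr -d '\r')
    echo "$total - $edgenode" >> two.txt
    echo "$total" >> two-totals.txt
    rm temp.txt
}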
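callRandomWebsite () {
    # Pick 1 or 2 at random so that every iteration actually makes a request
    # (RANDOM%4+1 also produced 3 and 4, which matched no case and did nothing).
    random=$((RANDOM % 2 + 1))
    case $random in
        1)
            callWebsiteOne
            ;;
        2)
            callWebsiteTwo
            ;;
    esac
}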
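for value in {1..100}
do
    callRandomWebsite
    sleep 0.25
done
# Each file is expected to hold a single unique value if every response matched its website header.
for i in "${files[@]}"
do
    unique=$(sort -u "$i" | wc -l | tr -d ' ')
    total=$(wc -l < "$i" | tr -d ' ')
    echo "$i has $unique unique values in $total total lines"
done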
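To see how the responses split, rather than only how many unique values there are, the totals files could also be tallied per value. This is a minimal sketch; tally is a hypothetical helper name, and it only assumes one value per line, as in one-totals.txt and two-totals.txt.

```shell
#!/bin/bash
# Print "count<TAB>value" for each distinct line in a file, most frequent first.
tally () {
    sort "$1" | uniq -c | sort -rn |
        awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count "\t" $0 }'
}
```

Running `tally one-totals.txt` after the loop would show whether the wrong total appears occasionally or about as often as the right one.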