I created an EKS cluster (1.24) via CloudFormation. It works fine without a CNI plugin, but fails when I add the vpc-cni addon:
AddonCNI:
  Type: 'AWS::EKS::Addon'
  Properties:
    AddonName: vpc-cni
    AddonVersion: v1.12.0-eksbuild.1
    ClusterName: !Ref ControlPlane
    ResolveConflicts: OVERWRITE
    ServiceAccountRoleArn: !GetAtt
      - CNIRole
      - Arn
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}/AddonCNI'
  DependsOn:
    - CNIRole
CNIRole:
  Type: 'AWS::IAM::Role'
  Properties:
    AssumeRolePolicyDocument:
      Statement:
        - Action:
            - 'sts:AssumeRole'
          Effect: Allow
          Principal:
            Service:
              - !FindInMap
                - ServicePrincipalPartitionMap
                - !Ref 'AWS::Partition'
                - EKS
      Version: 2012-10-17
    ManagedPolicyArns:
      - !Sub 'arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy'
      - !Sub 'arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly'
    Tags:
      - Key: Name
        Value: !Sub '${AWS::StackName}/CNIRole'
I also added AmazonEKS_CNI_Policy to the node roles.
Node status is NotReady:
container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Node pod logs:
Defaulted container "aws-node" out of: aws-node, aws-vpc-cni-init (init)
{"level":"info","ts":"2023-01-06T16:24:55.411Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"info","ts":"2023-01-06T16:24:55.412Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
{"level":"info","ts":"2023-01-06T16:24:55.424Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2023-01-06T16:24:55.425Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2023-01-06T16:24:56.430Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2023-01-06T16:24:57.435Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
Logs from /host/var/log/aws-routed-eni/ipamd.log inside the node pod container (partial):
Defaulted container "aws-node" out of: aws-node, aws-vpc-cni-init (init)
{"level":"info","ts":"2023-01-06T13:19:47.759Z","caller":"logger/logger.go:52","msg":"Constructed new logger instance"}
{"level":"info","ts":"2023-01-06T13:19:47.759Z","caller":"eniconfig/eniconfig.go:61","msg":"Initialized new logger as an existing instance was not found"}
{"level":"info","ts":"2023-01-06T13:19:48.012Z","caller":"aws-k8s-agent/main.go:28","msg":"Starting L-IPAMD ..."}
{"level":"info","ts":"2023-01-06T13:19:48.020Z","caller":"aws-k8s-agent/main.go:39","msg":"Testing communication with server"}
{"level":"info","ts":"2023-01-06T13:19:48.063Z","caller":"wait/wait.go:211","msg":"Successful communication with the Cluster! Cluster Version is: v1.24+. git version: v1.24.8-eks-ffeb93d. git tree state: clean. commit: abb98ec0631dfe573ec5eae40dc48fd8f2017424. platform: linux/amd64"}
{"level":"warn","ts":"2023-01-06T13:19:48.083Z","caller":"awssession/session.go:64","msg":"HTTP_TIMEOUT env is not set or set to less than 10 seconds, defaulting to httpTimeout to 10sec"}
{"level":"debug","ts":"2023-01-06T13:19:48.085Z","caller":"ipamd/ipamd.go:379","msg":"Discovered region: us-east-1"}
{"level":"info","ts":"2023-01-06T13:19:48.085Z","caller":"ipamd/ipamd.go:379","msg":"Custom networking enabled false"}
{"level":"debug","ts":"2023-01-06T13:19:48.085Z","caller":"awsutils/awsutils.go:415","msg":"Found availability zone: us-east-1c "}
{"level":"debug","ts":"2023-01-06T13:19:48.086Z","caller":"awsutils/awsutils.go:415","msg":"Discovered the instance primary IPv4 address: 10.0.66.216"}
{"level":"debug","ts":"2023-01-06T13:19:48.086Z","caller":"awsutils/awsutils.go:415","msg":"Found instance-id: i-06b7496334df06d96 "}
{"level":"debug","ts":"2023-01-06T13:19:48.087Z","caller":"awsutils/awsutils.go:415","msg":"Found instance-type: c5.xlarge "}
{"level":"debug","ts":"2023-01-06T13:19:48.088Z","caller":"awsutils/awsutils.go:415","msg":"Found primary interface's MAC address: 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.088Z","caller":"awsutils/awsutils.go:415","msg":"eni-05097e0aa87b119d5 is the primary ENI of this instance"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"awsutils/awsutils.go:415","msg":"Found subnet-id: subnet-0e9870d3f07c0c322 "}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:388","msg":"Using WARM_ENI_TARGET 1"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:391","msg":"Using WARM_PREFIX_TARGET 1"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:409","msg":"Prefix Delegation enabled false"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:414","msg":"Start node init"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:446","msg":"Max ip per ENI 14 and max prefixes per ENI 0"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:456","msg":"Setting up host network... "}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Trying to find primary interface that has mac : 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Discovered interface: lo, mac: "}
{"level":"debug","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Discovered interface: eth0, mac: 0a:3f:bd:93:2a:8d"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"networkutils/network.go:280","msg":"Discovered primary interface: eth0"}
{"level":"info","ts":"2023-01-06T13:19:48.089Z","caller":"ipamd/ipamd.go:456","msg":"Skip updating RPF for primary interface: net/ipv4/conf/eth0/rp_filter"}
{"level":"info","ts":"2023-01-06T13:19:48.090Z","caller":"awsutils/awsutils.go:1643","msg":"Will attempt to clean up AWS CNI leaked ENIs after waiting 4m41s."}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:307","msg":"Found the Link that uses mac address 0a:3f:bd:93:2a:8d and its index is 2 (attempt 1/5)"}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Trying to find primary interface that has mac : 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Discovered interface: lo, mac: "}
{"level":"debug","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Discovered interface: eth0, mac: 0a:3f:bd:93:2a:8d"}
{"level":"info","ts":"2023-01-06T13:19:48.090Z","caller":"networkutils/network.go:383","msg":"Discovered primary interface: eth0"}
{"level":"debug","ts":"2023-01-06T13:19:48.187Z","caller":"networkutils/network.go:403","msg":"Adding 10.0.0.0/16 CIDR to NAT chain"}
{"level":"debug","ts":"2023-01-06T13:19:48.187Z","caller":"networkutils/network.go:403","msg":"Total CIDRs to program - 1"}
{"level":"debug","ts":"2023-01-06T13:19:48.187Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -N AWS-SNAT-CHAIN-0 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.189Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -N AWS-SNAT-CHAIN-1 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.190Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -A POSTROUTING -m comment --comment \"AWS SNAT CHAIN\" -j AWS-SNAT-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.190Z","caller":"networkutils/network.go:403","msg":"Setup Host Network: iptables -A AWS-SNAT-CHAIN-0 ! -d {10.0.0.0/16 %!s(bool=false)} -t nat -j AWS-SNAT-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.190Z","caller":"networkutils/network.go:714","msg":"Setup Host Network: loading existing iptables nat rules with chain prefix AWS-SNAT-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.237Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-SNAT-CHAIN-0: [-N AWS-SNAT-CHAIN-0]"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-SNAT-CHAIN-1: [-N AWS-SNAT-CHAIN-1]"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:509","msg":"Setup Host Network: computing stale iptables rules for %s table with chain prefix %s"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:509","msg":"Setup Host Network: active chain found: AWS-SNAT-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:509","msg":"Setup Host Network: active chain found: AWS-SNAT-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:403","msg":"iptableRules: [nat/POSTROUTING rule first SNAT rules for non-VPC outbound traffic shouldExist true rule [-m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-0] nat/AWS-SNAT-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-1] nat/AWS-SNAT-CHAIN-1 rule last SNAT rule for non-VPC outbound traffic shouldExist true rule [! -o vlan+ -m comment --comment AWS, SNAT -m addrtype ! --dst-type LOCAL -j SNAT --to-source 10.0.66.216 --random-fully] mangle/PREROUTING rule connmark for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eth0 -m addrtype --dst-type LOCAL --limit-iface-in -j CONNMARK --set-mark 0x80/0x80] mangle/PREROUTING rule connmark restore for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eni+ -j CONNMARK --restore-mark --mask 0x80] mangle/PREROUTING rule connmark restore for primary ENI from vlan shouldExist true rule [-m comment --comment AWS, primary ENI -i vlan+ -j CONNMARK --restore-mark --mask 0x80]]"}
{"level":"debug","ts":"2023-01-06T13:19:48.238Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : first SNAT rules for non-VPC outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.240Z","caller":"networkutils/network.go:407","msg":"rule nat/POSTROUTING rule first SNAT rules for non-VPC outbound traffic shouldExist true rule [-m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-0] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.241Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : [0] AWS-SNAT-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.242Z","caller":"networkutils/network.go:407","msg":"rule nat/AWS-SNAT-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-1] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.243Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : last SNAT rule for non-VPC outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.244Z","caller":"networkutils/network.go:407","msg":"rule nat/AWS-SNAT-CHAIN-1 rule last SNAT rule for non-VPC outbound traffic shouldExist true rule [! -o vlan+ -m comment --comment AWS, SNAT -m addrtype ! --dst-type LOCAL -j SNAT --to-source 10.0.66.216 --random-fully] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.245Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : connmark for primary ENI"}
{"level":"debug","ts":"2023-01-06T13:19:48.250Z","caller":"networkutils/network.go:407","msg":"rule mangle/PREROUTING rule connmark for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eth0 -m addrtype --dst-type LOCAL --limit-iface-in -j CONNMARK --set-mark 0x80/0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.251Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : connmark restore for primary ENI"}
{"level":"debug","ts":"2023-01-06T13:19:48.252Z","caller":"networkutils/network.go:407","msg":"rule mangle/PREROUTING rule connmark restore for primary ENI shouldExist true rule [-m comment --comment AWS, primary ENI -i eni+ -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.253Z","caller":"networkutils/network.go:407","msg":"execute iptable rule : connmark restore for primary ENI from vlan"}
{"level":"debug","ts":"2023-01-06T13:19:48.254Z","caller":"networkutils/network.go:407","msg":"rule mangle/PREROUTING rule connmark restore for primary ENI from vlan shouldExist true rule [-m comment --comment AWS, primary ENI -i vlan+ -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.255Z","caller":"networkutils/network.go:411","msg":"Total CIDRs to exempt from connmark rules - 1"}
{"level":"debug","ts":"2023-01-06T13:19:48.255Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -N AWS-CONNMARK-CHAIN-0 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.256Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -N AWS-CONNMARK-CHAIN-1 -t nat"}
{"level":"debug","ts":"2023-01-06T13:19:48.257Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -t nat -A PREROUTING -i eni+ -m comment --comment \"AWS, outbound connections\" -m state --state NEW -j AWS-CONNMARK-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.257Z","caller":"networkutils/network.go:411","msg":"Setup Host Network: iptables -A AWS-CONNMARK-CHAIN-0 ! -d 10.0.0.0/16 -t nat -j AWS-CONNMARK-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.257Z","caller":"networkutils/network.go:714","msg":"Setup Host Network: loading existing iptables nat rules with chain prefix AWS-CONNMARK-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.259Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-CONNMARK-CHAIN-0: [-N AWS-CONNMARK-CHAIN-0]"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:714","msg":"host network setup: found potentially stale SNAT rule for chain AWS-CONNMARK-CHAIN-1: [-N AWS-CONNMARK-CHAIN-1]"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:639","msg":"Setup Host Network: computing stale iptables rules for %s table with chain prefix %s"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:639","msg":"Setup Host Network: active chain found: AWS-CONNMARK-CHAIN-0"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:639","msg":"Setup Host Network: active chain found: AWS-CONNMARK-CHAIN-1"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:411","msg":"iptableRules: [nat/PREROUTING rule connmark rule for non-VPC outbound traffic shouldExist true rule [-i eni+ -m comment --comment AWS, outbound connections -m state --state NEW -j AWS-CONNMARK-CHAIN-0] nat/AWS-CONNMARK-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS CONNMARK CHAIN, VPC CIDR -j AWS-CONNMARK-CHAIN-1] nat/AWS-CONNMARK-CHAIN-1 rule connmark rule for external outbound traffic shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --set-xmark 0x80/0x80] nat/PREROUTING rule connmark to fwmark copy shouldExist false rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80] nat/PREROUTING rule connmark to fwmark copy shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80]]"}
{"level":"debug","ts":"2023-01-06T13:19:48.260Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark rule for non-VPC outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.264Z","caller":"networkutils/network.go:415","msg":"rule nat/PREROUTING rule connmark rule for non-VPC outbound traffic shouldExist true rule [-i eni+ -m comment --comment AWS, outbound connections -m state --state NEW -j AWS-CONNMARK-CHAIN-0] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.266Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : [0] AWS-SNAT-CHAIN"}
{"level":"debug","ts":"2023-01-06T13:19:48.267Z","caller":"networkutils/network.go:415","msg":"rule nat/AWS-CONNMARK-CHAIN-0 rule [0] AWS-SNAT-CHAIN shouldExist true rule [! -d 10.0.0.0/16 -m comment --comment AWS CONNMARK CHAIN, VPC CIDR -j AWS-CONNMARK-CHAIN-1] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.267Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark rule for external outbound traffic"}
{"level":"debug","ts":"2023-01-06T13:19:48.269Z","caller":"networkutils/network.go:415","msg":"rule nat/AWS-CONNMARK-CHAIN-1 rule connmark rule for external outbound traffic shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --set-xmark 0x80/0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.270Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark to fwmark copy"}
{"level":"debug","ts":"2023-01-06T13:19:48.271Z","caller":"networkutils/network.go:415","msg":"rule nat/PREROUTING rule connmark to fwmark copy shouldExist false rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.271Z","caller":"networkutils/network.go:415","msg":"execute iptable rule : connmark to fwmark copy"}
{"level":"debug","ts":"2023-01-06T13:19:48.272Z","caller":"networkutils/network.go:415","msg":"rule nat/PREROUTING rule connmark to fwmark copy shouldExist true rule [-m comment --comment AWS, CONNMARK -j CONNMARK --restore-mark --mask 0x80] exists false, err <nil>"}
{"level":"debug","ts":"2023-01-06T13:19:48.273Z","caller":"awsutils/awsutils.go:1140","msg":"Total number of interfaces found: 1 "}
{"level":"debug","ts":"2023-01-06T13:19:48.273Z","caller":"awsutils/awsutils.go:592","msg":"Found ENI MAC address: 0a:3f:bd:93:2a:8d"}
{"level":"debug","ts":"2023-01-06T13:19:48.276Z","caller":"awsutils/awsutils.go:592","msg":"Found ENI: eni-05097e0aa87b119d5, MAC 0a:3f:bd:93:2a:8d, device 0"}
{"level":"error","ts":"2023-01-06T13:20:38.949Z","caller":"ipamd/ipamd.go:462","msg":"Failed to call ec2:DescribeNetworkInterfaces for [eni-05097e0aa87b119d5]: WebIdentityErr: failed to retrieve credentials\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.us-east-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\n\tstatus code: 400, request id: 5afe536f-4b18-4f21-a5ad-bc2d0341da2e"}
....
From the logs above, this error looks weird (the XXXXX part is my redaction of the real ID):
Failed to call ec2:DescribeNetworkInterfaces for [eni-05097e0aa87b119d5]: WebIdentityErr: failed to retrieve credentials\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.us-east-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\n\tstatus code: 400, request id: 5afe536f-4b18-4f21-a5ad-bc2d0341da2e
I guess it's related to the sts:AssumeRole action in the CNIRole trust policy.
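For context, the InvalidIdentityToken message suggests that STS is looking for an IAM OIDC identity provider matching the cluster's issuer URL, and the template above does not define one. Below is a minimal, untested sketch of what such a resource and a web-identity trust statement might look like. The logical name OIDCProvider and the thumbprint placeholder are assumptions and are not part of my stack:

```yaml
# Hypothetical resource: the logical name OIDCProvider is an assumption.
OIDCProvider:
  Type: 'AWS::IAM::OIDCProvider'
  Properties:
    # Issuer URL exposed by the AWS::EKS::Cluster resource (ControlPlane above).
    Url: !GetAtt ControlPlane.OpenIdConnectIssuerUrl
    ClientIdList:
      - sts.amazonaws.com
    ThumbprintList:
      - REPLACE_WITH_ISSUER_CA_THUMBPRINT  # placeholder, not a real value

# Sketch of a CNIRole trust policy that trusts the OIDC provider
# instead of the EKS service principal.
CNIRole:
  Type: 'AWS::IAM::Role'
  Properties:
    AssumeRolePolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Action:
            - 'sts:AssumeRoleWithWebIdentity'
          Principal:
            Federated: !Ref OIDCProvider
```

A real trust policy would normally also add a Condition that restricts the token's sub claim to system:serviceaccount:kube-system:aws-node. I left it out here because the condition key depends on the issuer URL.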