Problem
I have deployed an ECS cluster and am running a job orchestration platform on it. One of the platform's containers uses the Python Docker API to pull a container from our private ECR repo and execute a job inside it. Once the job is running, it eventually fails because it cannot find the assume-role credentials defined inside the container in /root/.aws/config as credential_source=EcsContainer. This happens when the code tries to make a call to S3.
Why might this be happening? The credential source is defined in the container, so why is it not found?
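For reference, the launch pattern described above might look roughly like the sketch below, using the Docker SDK for Python (docker.from_env, images.pull, containers.run). The image name, command, and helper names are placeholders, not the platform's actual code. The sketch also marks the place where environment variables could be forwarded to the job container. Whether forwarding AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is enough for the job container to reach the ECS credentials endpoint depends on the networking setup and is untested here.

```python
import os

# Variable read by botocore's container credential provider (assumed relevant here).
CREDENTIALS_VAR = "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"


def job_run_kwargs(image, command, environ=None):
    """Build keyword arguments for docker's containers.run() (hypothetical helper)."""
    env = os.environ if environ is None else environ
    job_env = {}
    # Forward the credentials variable only if the launching container has it.
    if env.get(CREDENTIALS_VAR):
        job_env[CREDENTIALS_VAR] = env[CREDENTIALS_VAR]
    return {
        "image": image,
        "command": command,
        "environment": job_env,
        "detach": True,
    }


def launch_job(image, command):
    """Pull the image and start the job container through the mounted Docker socket."""
    import docker  # imported lazily so job_run_kwargs works without the SDK installed

    client = docker.from_env()
    client.images.pull(image)
    return client.containers.run(**job_run_kwargs(image, command))


# Example call (placeholder image and command):
# launch_job("<account>.dkr.ecr.<region>.amazonaws.com/jobs:latest", ["python", "job.py"])
```

A container started this way is created by the Docker daemon directly rather than by the ECS agent, so it only sees the environment that the launching code passes in explicitly.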
Details
Error
......
The above exception was caused by the following exception:
botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from EcsContainer: No credentials found in credential_source referenced in profile default
  File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/utils.py", line 42, in solid_execution_error_boundary
    yield
  File "/usr/local/lib/python3.6/site-packages/dagster/utils/__init__.py", line 383, in iterate_with_context
    next_output = next(iterator)
  File "/usr/local/lib/python3.6/site-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/opt/dagster/app/solids/files.py", line 33, in stream_url_to_s3
    with smart.open(f's3://{s3_bucket}/{s3_key}', 'wb', transport_params=tp) as s3location:
  File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 235, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 398, in _open_binary_stream
    fobj = submodule.open_uri(uri, mode, transport_params)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 224, in open_uri
    return open(parsed_uri['bucket_id'], parsed_uri['key_id'], mode, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 308, in open
    writebuffer=writebuffer,
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 757, in __init__
    _initialize_boto3(self, client, client_kwargs, bucket, key)
  File "/usr/local/lib/python3.6/site-packages/smart_open/s3.py", line 528, in _initialize_boto3
    client = boto3.client('s3', **init_kwargs)
  File "/usr/local/lib/python3.6/site-packages/boto3/__init__.py", line 91, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 826, in create_client
    credentials = self.get_credentials()
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 431, in get_credentials
    'credential_provider').load_credentials()
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1962, in load_credentials
    creds = provider.load()
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1395, in load
    return self._load_creds_via_assume_role(self._profile_name)
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1410, in _load_creds_via_assume_role
    role_config, profile_name
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1566, in _resolve_source_credentials
    credential_source, profile_name
  File "/usr/local/lib/python3.6/site-packages/botocore/credentials.py", line 1623, in _resolve_credentials_from_source
    'in profile %s' % profile_name
Configuration
Container Role:
EcsTaskRole:
  Type: AWS::IAM::Role
  Properties:
    Description: The role assumed by the containers, allowing them to call AWS services.
    RoleName: !Sub ecs-task-trans-role-development
    AssumeRolePolicyDocument:
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - ecs-tasks.amazonaws.com
          Action:
            - sts:AssumeRole
    Policies:
      - PolicyName: !Sub 's3-access-${EnvironmentName}-${AWS::StackName}'
        PolicyDocument:
          Statement:
            - Effect: Allow
              Action:
                - s3:*
              Resource:
                - "*"
/root/.aws/config in the container:
[default]
role_arn = arn:aws:iam::<my account>:role/ecs-task-trans-role-development
credential_source = EcsContainer
There is no /root/.aws/credentials file because the point of assuming a role from the config file is to retrieve the temporary credentials.
https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-role.html
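One thing that may be worth checking from inside the job container: botocore resolves credential_source = EcsContainer through its container credential provider, which reads the credentials endpoint from the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI environment variable. The sketch below only reports whether either variable is set. The helper name container_credential_vars is made up for illustration, and a missing variable is a hypothesis for the error above, not a confirmed cause.

```python
import os

# Environment variables that botocore's container credential provider looks for.
CONTAINER_CREDENTIAL_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def container_credential_vars(environ=None):
    """Return the container-credential variables that are set and non-empty."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in CONTAINER_CREDENTIAL_VARS if env.get(name)}


if __name__ == "__main__":
    found = container_credential_vars()
    if found:
        for name, value in found.items():
            print(f"{name}={value}")
    else:
        # With neither variable set, the EcsContainer source has no endpoint to query.
        print("No container credential variables are set in this environment.")
```

Running this in both the orchestration container and the job container it launches would show whether the two environments differ.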
Partial TaskDefinition:
TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ...
    ContainerDefinitions:
      ...
      MountPoints:
        - ContainerPath: "/var/run/docker.sock"
          SourceVolume: docker_sock
          ReadOnly: true
        - ContainerPath: "/root/.docker"
          SourceVolume: docker_dir
          ReadOnly: true
        - ContainerPath: "/usr/bin/docker-credential-ecr-login"
          SourceVolume: docker_creds
          ReadOnly: true
What I have tried
- Using the taskExecutionRole rather than the container role.
- Exporting AWS_PROFILE=default in the container.