
AWS pcluster fails with MasterServerWaitCondition Received FAILURE signal, iptables and chef version error


I'm trying to create an AMI for ParallelCluster. I started from Amazon's stock AMI (ami-0436692c7b452bae4, for alinux in us-west-2, the region I am in) and modified it slightly by adding a few packages.

However, when I run pcluster create foo --norollback I get the error:

Beginning cluster creation for cluster: stockAWS
Creating stack named: parallelcluster-stockAWS
Status: parallelcluster-stockAWS - ROLLBACK_IN_PROGRESS                         
Cluster creation failed.  Failed events:
  - AWS::AutoScaling::AutoScalingGroup ComputeFleet Resource creation cancelled
  - AWS::CloudFormation::WaitCondition MasterServerWaitCondition Received FAILURE signal with UniqueId i-booyaa

I then run ssh foo and look at /var/log/cfncluster-init.log, which contains a long error log. The end of it is below:

2021-07-28 23:16:49,659 [ERROR] Command chef (chef-client --local-mode --config /etc/chef/client.rb --log_level auto --force-formatter --no-color --chef-zero-port 8889 --json-attributes /etc/chef/dna.json --override-runlist aws-parallelcluster::_prep_env) failed
2021-07-28 23:16:49,659 [DEBUG] Command chef output: Starting Chef Client, version 14.2.0
[2021-07-28T23:16:47+00:00] WARN: Run List override has been provided.
[2021-07-28T23:16:47+00:00] WARN: Run List override has been provided.
[2021-07-28T23:16:47+00:00] WARN: Original Run List: [recipe[aws-parallelcluster::slurm_config]]
[2021-07-28T23:16:47+00:00] WARN: Original Run List: [recipe[aws-parallelcluster::slurm_config]]
[2021-07-28T23:16:47+00:00] WARN: Overridden Run List: [recipe[aws-parallelcluster::_prep_env]]
[2021-07-28T23:16:47+00:00] WARN: Overridden Run List: [recipe[aws-parallelcluster::_prep_env]]
resolving cookbooks for run list: ["aws-parallelcluster::_prep_env"]
Synchronizing Cookbooks:
  - aws-parallelcluster (2.5.1)
  - poise-python (1.7.0)
  - tar (2.1.1)
  - selinux (2.1.1)
  - nfs (2.6.4)
  - yum (5.1.0)
  - yum-epel (3.1.0)
  - openssh (2.6.3)
  - apt (7.0.0)
  - hostname (0.4.2)
  - line (2.4.1)
  - ulimit (1.0.0)
  - pyenv (3.1.1)
  - kernel_module (1.1.2)
  - poise (2.8.2)
  - poise-languages (2.1.2)
  - iptables (8.0.0)
  - hostsfile (3.0.1)
  - poise-archive (1.5.0)

Running handlers:
[2021-07-28T23:16:49+00:00] ERROR: Running exception handlers
[2021-07-28T23:16:49+00:00] ERROR: Running exception handlers
Running handlers complete
[2021-07-28T23:16:49+00:00] ERROR: Exception handlers complete
[2021-07-28T23:16:49+00:00] ERROR: Exception handlers complete
Chef Client failed. 0 resources updated in 11 seconds
[2021-07-28T23:16:49+00:00] FATAL: Stacktrace dumped to /etc/chef/local-mode-cache/cache/chef-stacktrace.out
[2021-07-28T23:16:49+00:00] FATAL: Stacktrace dumped to /etc/chef/local-mode-cache/cache/chef-stacktrace.out
[2021-07-28T23:16:49+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2021-07-28T23:16:49+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2021-07-28T23:16:49+00:00] FATAL: Chef::Exceptions::CookbookChefVersionMismatch: Cookbook 'iptables' version '8.0.0' depends on chef version [">= 15.3"], but the running chef version is 14.2.0
[2021-07-28T23:16:49+00:00] FATAL: Chef::Exceptions::CookbookChefVersionMismatch: Cookbook 'iptables' version '8.0.0' depends on chef version [">= 15.3"], but the running chef version is 14.2.0

2021-07-28 23:16:49,659 [ERROR] Error encountered during build of chefPrepEnv: Command chef failed
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 573, in run_config
    CloudFormationCarpenter(config, self._auth_config).build(worklog)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 273, in build
    self._config.commands)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/command_tool.py", line 127, in apply
    raise ToolError(u"Command %s failed" % name)
cfnbootstrap.construction_errors.ToolError: Command chef failed
2021-07-28 23:16:49,661 [ERROR] -----------------------BUILD FAILED!------------------------

If I run iptables --version I get v1.8.4, with or without sudo. The Chef version is 14.2.0.

The frustrating thing is that if I create a ParallelCluster stack with the stock AWS AMI, I get exactly the same behavior. What's going on here?
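For reference, the FATAL line in the log names the Chef cookbook called iptables (8.0.0, also listed under "Synchronizing Cookbooks"), which is a separate thing from the iptables binary that reports v1.8.4. Here is a minimal sketch that pulls the relevant fields out of that line and checks the constraint. The regex is an assumption based only on the single log line shown above, the function names `parse_mismatch` and `as_tuple` are hypothetical, and only the `>=` operator is handled.

```python
import re

# Pattern for the CookbookChefVersionMismatch line quoted in the log above.
# Assumption: the format matches that one line; only ">=" constraints are handled.
MISMATCH_RE = re.compile(
    r"Cookbook '(?P<cookbook>[^']+)' version '(?P<cookbook_version>[^']+)' "
    r'depends on chef version \[">= (?P<required>\d+(?:\.\d+)*)"\], '
    r"but the running chef version is (?P<running>\d+(?:\.\d+)*)"
)


def as_tuple(version):
    """Turn '14.2.0' into (14, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def parse_mismatch(line):
    """Return the cookbook/Chef version details from a mismatch line, or None."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # True only if the running Chef client meets the cookbook's minimum version.
    info["satisfied"] = as_tuple(info["running"]) >= as_tuple(info["required"])
    return info


if __name__ == "__main__":
    log_line = (
        "[2021-07-28T23:16:49+00:00] FATAL: "
        "Chef::Exceptions::CookbookChefVersionMismatch: Cookbook 'iptables' "
        "version '8.0.0' depends on chef version [\">= 15.3\"], "
        "but the running chef version is 14.2.0"
    )
    info = parse_mismatch(log_line)
    print(
        f"cookbook {info['cookbook']} {info['cookbook_version']} "
        f"needs chef >= {info['required']}, running {info['running']}, "
        f"satisfied: {info['satisfied']}"
    )
```

Read this way, the numbers being compared are the Chef client version (14.2.0) and the minimum Chef version the cookbook declares (15.3); the output of `iptables --version` does not enter into the check.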

digijay
The error message says "CookbookChefVersionMismatch: Cookbook 'iptables' version '8.0.0' depends on chef version [">= 15.3"], but the running chef version is 14.2.0"
Joe B
@digijay Thanks, I did notice that bit, but `iptables --version` shows v1.8.4, with or without sudo. The Chef version is 14.2.0.
digijay
Yes, a version 8.0 of iptables doesn't even exist; I wonder where that might be defined. Does the cluster start if you leave out the iptables cookbook?