I installed a new domain controller running Samba 4.11 on Ubuntu 20.04 (using the Ubuntu packages). I configured system authentication to use AD credentials through winbind. On other (member) servers I usually use sssd instead of winbind, but my understanding is that I can't use sssd on a server running Samba as a DC.
At first, it works fine. The server operates as a DC as it should, and I can log on to the server over SSH with my AD credentials.
The problem appears when I connect to this machine with Ansible. Ansible opens many SSH connections one after the other (one for each task) and elevates to root with sudo almost every time. When I run my playbook, it fails at a random task with this error: Timeout (62s) waiting for privilege escalation prompt. If I try to connect manually over SSH at that moment, the connection just times out. After a few seconds, I can log on again.
The problem doesn't occur when using a local account instead of a domain account.
The only relevant logs I found are in log.winbindd:
[2021/10/12 16:06:57.312913, 5] ../../source3/winbindd/winbindd.c:1204(remove_timed_out_clients)
Idle client timed out, shutting down sock 38, pid 568531
[2021/10/12 16:07:09.933655, 3] ../../source3/winbindd/winbindd_getpwnam.c:59(winbindd_getpwnam_send)
winbindd_getpwnam_send: [nss_winbind (567486)] getpwnam EXAMPLE\myuserid
[2021/10/12 16:07:12.326288, 5] ../../source3/winbindd/winbindd.c:1204(remove_timed_out_clients)
Idle client timed out, shutting down sock 34, pid 568806
[2021/10/12 16:07:12.326413, 5] ../../source3/winbindd/winbindd.c:1209(remove_timed_out_clients)
Client request timed out, shutting down sock 39, pid 568806
[2021/10/12 16:07:12.326451, 1] ../../source3/winbindd/winbindd_dual.c:337(wb_child_request_cleanup)
wb_child_request_cleanup: keep orphaned subreq[0x563394622e40]
[2021/10/12 16:07:14.044393, 3] ../../source3/winbindd/winbindd_misc.c:429(winbindd_interface_version)
winbindd_interface_version: [nss_winbind (568806)]: request interface version (version = 31)
[2021/10/12 16:07:14.044847, 3] ../../source3/winbindd/winbindd_getpwnam.c:59(winbindd_getpwnam_send)
winbindd_getpwnam_send: [nss_winbind (568806)] getpwnam EXAMPLE\myuserid
^C
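To get a rough idea of how often these timeouts happen during a playbook run, the two messages could be counted in the log. This is only a minimal sketch: the function name count_winbind_timeouts is made up for illustration, and the log path in the commented example is an assumption (the usual location with the Ubuntu packages), so adjust it to your setup.

```shell
#!/bin/bash
# Sketch: count the two winbindd timeout messages seen in the log excerpt above.
# count_winbind_timeouts is a hypothetical helper name, not part of Samba.
count_winbind_timeouts() {
    local log=$1
    # grep -c prints the number of matching lines (0 if there are none)
    printf 'idle client timeouts: %s\n' "$(grep -c 'Idle client timed out' "$log")"
    printf 'request timeouts: %s\n' "$(grep -c 'Client request timed out' "$log")"
}

# Example call; the path is an assumption, not taken from the configuration above:
# count_winbind_timeouts /var/log/samba/log.winbindd
```

Comparing the counts before and after a failing playbook run may show whether the "Client request timed out" entries line up with the Ansible failures.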
I wrote a little script that just runs sudo every 1000 ms, then 500 ms, then 100 ms, trying to emulate what Ansible does. I noticed that the samba process takes about 30% CPU when the script loops every 1000 ms, 50% at 500 ms, and 70% at 100 ms. Unfortunately, I was unable to reproduce the authentication timeout with the script.
#!/bin/bash
while true; do
time sudo echo 1
sleep 0.1
done
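A variant of the loop above that times each call and flags the slow ones might make stalls easier to spot. It is a sketch under assumptions: probe and run_probes are hypothetical helper names, the 1000 ms threshold is arbitrary, and date +%s%N relies on GNU date (as shipped with Ubuntu).

```shell
#!/bin/bash
# Sketch: run a command repeatedly, print the elapsed time of each run,
# and mark runs that exceed a threshold. Helper names are hypothetical.

# probe CMD...: run the command once and print the elapsed milliseconds.
probe() {
    local start end
    start=$(date +%s%N)            # nanoseconds since the epoch (GNU date)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# run_probes COUNT THRESHOLD_MS CMD...: one output line per iteration.
run_probes() {
    local count=$1 threshold=$2 i ms
    shift 2
    for (( i = 1; i <= count; i++ )); do
        ms=$(probe "$@")
        if (( ms > threshold )); then
            echo "$i ${ms}ms SLOW"
        else
            echo "$i ${ms}ms"
        fi
    done
}

# Example calls (the user name is the placeholder from the log above):
# run_probes 100 1000 getent passwd 'EXAMPLE\myuserid'
# run_probes 100 1000 sudo echo 1
```

Timing getent passwd separately from sudo may help tell whether the delay is in the winbind name lookup (the getpwnam requests in the log) or in authentication. One possible reason the simple loop does not reproduce the timeout is that sudo caches credentials for a while, so most iterations may never ask for a password; running sudo -k first, which discards the cached credentials, could bring the test closer to what Ansible does, but that is an assumption I have not verified.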
Samba config:
[global]
netbios name = DC4
realm = AD.EXAMPLE.CA
workgroup = EXAMPLE
dns forwarder = 10.3.0.3
server role = active directory domain controller
idmap_ldb:use rfc2307 = yes
pid directory = /run/samba
state directory = /data/samba
binddns dir = /data/samba/bind-dns
ntp signd socket directory = /data/samba/ntp_signd
private dir = /data/samba/private
usershare path = /data/samba/usershares
tls enabled = yes
tls keyfile = tls/key.pem
tls certfile = tls/cert.pem
tls cafile = /usr/local/share/ca-certificates/internal_ca.crt
log level = 5
max log size = 1000000
check password script = python3 /usr/local/bin/check_password_hibpwnd.py
allow dns updates = nonsecure
ntlm auth = yes
template shell = /bin/bash
template homedir = /home/ad.example.ca/%u
# tried this to reduce load, no effect
winbind offline logon = yes
winbind enum users = no
winbind enum groups = no
winbind nested groups = false
[netlogon]
path = /data/samba/sysvol/ad.example.ca/scripts
read only = No
[sysvol]
path = /data/samba/sysvol
read only = No
nsswitch config (I just added winbind to the passwd and group entries):
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.
passwd: files systemd winbind
group: files systemd winbind
shadow: files
gshadow: files
hosts: files dns
networks: files
protocols: db files
services: db files
ethers: db files
rpc: db files
netgroup: nis
Update: I found that everything works fine when sudo is configured not to require the user's password for privilege elevation. This is not an acceptable workaround, but it may help someone diagnose the problem.