Score:1

Format SFTP Logs to have Username on Each Entry

cn flag

I'm running a production server (Debian 10, standard OpenSSH package) that has Pure-FTPD running for legacy connections and SFTP for all our current connections. The SFTP server is setup with a chroot jail that logs via a bound device in the user's chroot jail. This is picked up by rsyslog and sent to /var/log/sftp.log, after which I am using logstash to parse that file and forward everything to a visualization server for our superusers. The superusers log into the visualization to see all SFTP and FTP/FTPS logs in one place.

The pure-ftpd logs are formatted in a way that our superusers like:

pure-ftpd: (testuser@hostname) [NOTICE] /home/ftpusers/testuser//outbound/testfile.pdf downloaded  (1765060 bytes, 5989.55KB/sec)

This is great, because in a single line it shows the exact user and exact file that they uploaded or downloaded. However, for SFTP the situation is not as nice:

internal-sftp[8848]: session opened for local user testuser from [{ip_address}]
internal-sftp[8848]: opendir "/inbound"
internal-sftp[8848]: realpath "/inbound/."
internal-sftp[8848]: open "/inbound/testfile.pdf" flags WRITE,CREATE,TRUNCATE mode 0666
internal-sftp[8848]: close "/inbound/testfile.pdf" bytes read 0 written 1734445

In this case, the logs are easy to follow. testuser logs in, writes file, done. BUT we have many users logging in at a time, and logs from multiple instances of internal-sftp can occur at once. If this happens, the only way to track the activity for a user is to search for the username testuser, find the process ID that is logged (8848 in the example above) then find any messages that have that process ID. Many users are logging in via a cronjob, so this happens every 2 minutes or so...when we have 300 users logging in at regular intervals, you can imagine that hunting through this many process ID's would be a pain.

My Question

Is there a way to preface each log message from sftp-internal with the name of the user that is generating the log? This would have to work in a chroot jail. I cannot find anything about how to modify the message that rsyslog generates to include the username.

I would like to see something similar from my SFTP logs:

internal-sftp[8848]: (testuser) open "/inbound/testfile.pdf" flags WRITE,CREATE,TRUNCATE mode 0666
internal-sftp[8848]: (testuser) close "/inbound/testfile.pdf" bytes read 0 written 1734445

Current State of Configuration

My process chain goes:

ssh -> sftp-internal -> rsyslog (on local3.*) -> file /var/log/sftp.log -> logstash -> export to visualization server

Excerpt from my chroot group in /etc/ssh/sshd_config

Match Group sftpusers 
        ChrootDirectory %h
        AuthorizedKeysFile %h/.ssh/authorized_keys
        ForceCommand internal-sftp -f local3 -l INFO
        # ForceCommand internal-sftp -l VERBOSE
        AllowTcpForwarding no
        X11Forwarding no

and my /etc/rsyslog.d/sftp.conf

local3.*        -/var/log/sftp.log

Similar Questions:

This question is about SFTP logging to separate files, but it mentions this waybackmachine entry for an old article that included pretty-formatting SFTP log entries to look like standard xferlogs. The article mentions a Perl script (holy grail) that will format it for you, but alas, the link is dead. I could write a Python or Perl script that finds the specific message for transfers, grabs the process id, and reverse searches to find the user, then prints a reformatted xfer message with the username out to a file. But surely someone has solved this problem before and has a better solution.

Thank you for any help.

Score:0
cn flag

I was able to build a solution with Python and systemd. This is very quick-and-dirty but works for my purposes. I take in a sftp-internal log file and dump it to a reformatted one. I do not modify the original in case this formatter makes any errors.

The Python Script

This logs out to rsyslog for monitoring and responds to SEGINT from systemd. Yes, this should use something better than a list, but python has no ring buffers or formal queueing system built in (send me a comment if there's something I'm missing). Either way...this ain't C!

#!/usr/bin/python3

import logging
import re
import sys
import time


class SFTPLogFormatter:

    def __init__(self, infile: str, outfile: str):
        self.logger = logging.getLogger(__name__)
        self.logger.setLevel(logging.DEBUG)
        stdout_handler = logging.StreamHandler()
        stdout_handler.setLevel(logging.DEBUG)
        stdout_handler.setFormatter(logging.Formatter('%(levelname)8s | %(message)s'))
        self.logger.addHandler(stdout_handler)

        self.infile = open(infile, 'r')

        # append to file and keep only 1 lines in a write buffer (write almost
        # immediately)
        self.outfile = open(outfile, 'a', 1)

    def start(self):
        try:
            self.logger.info('starting formatter')
            self.run()
        except KeyboardInterrupt:
            self.logger.warning('SIGINT received, gracefully exiting')
            self.stop()

    @staticmethod
    def tail_file(file_obj):
        while True:
            line = file_obj.readline()
            # sleep if file hasn't been updated
            if not line:
                time.sleep(1)
                continue

            yield line

    def run(self):
        self.infile.seek(0, 2)  # jump to end of file for `tail -f` type behavior
        lines_read = []
        for line in self.tail_file(self.infile): # tail a file like `tail -f`
            lines_read.insert(0, line) # treat the list like a stack
            lines_read = lines_read[:2000] # trim stack since python does not have ring buffers

            modifyline_match = re.match(r'(.*)\[(\d+)\]: (open|close|remove name) (.*)', line) 
            if not modifyline_match:
                self.logger.info(line)
                self.outfile.write(line)
                continue

            modify_line_procid = modifyline_match.group(2)

            self.logger.debug(f'searching for session open statement for open|close file match string: \"{modifyline_match.group(0)}\"')
            open_session_regex = rf'.*\[{modify_line_procid}\]: session opened for local user (.*) from.*'
            open_session_match = None
            for prevline in lines_read[1:]:
                open_session_match = re.match(open_session_regex, prevline)
                if open_session_match:
                    self.logger.debug(f'found session open string: \"{open_session_match.group(0)}\"')
                    break
            else:
                # we found nothing
                self.logger.debug('could not find open session string for: \"{modifyline_match.group(0)}\"')
                continue

            modify_line_start = modifyline_match.group(1)
            modify_line_operator = modifyline_match.group(3)
            modify_line_details = modifyline_match.group(4)

            username = open_session_match.group(1)

            log_str = f'{modify_line_start}[{modify_line_procid}]: (user={username}) {modify_line_operator} {modify_line_details}\n'
            self.logger.info(log_str)
            self.outfile.write(log_str)

    def stop(self):
        self.logger.info('cleaning up')
        try:
            self.infile.close()
        except Exception as e:
            self.logger.error(f'failure while closing infile: {e}')

        try:
            self.outfile.close()
        except Exception as e:
            self.logger.error(f'failure while closing outfile: {e}')

        self.logger.info('exit')
        sys.exit(0)


if __name__ == '__main__':
    infile = sys.argv[1]
    outfile = sys.argv[2]
    service = SFTPLogFormatter(infile, outfile)
    service.start()

Service File

The following service file was created and enabled in systemd.

[Unit]
Description=Format log messages from sftp to have the username on any file reads, writes, and deletes, making multi-user logs much easier to read.
After=network.target

[Service]
User=root
Type=simple
ExecStart=/usr/bin/python3 /home/admin/services/format_sftp_logs_with_username.py /var/log/sftp.log /var/log/sftp_with_usernames.log
KillSignal=SIGINT

[Install]
WantedBy=multi-user.target

Results

This results in the following log messages. Notice the (user=XYZ) additions.

Feb 11 21:22:01 ip-10-20-0-96 internal-sftp[18241]: session opened for local user testuser from [127.0.0.1]
Feb 11 21:22:02 ip-10-20-0-96 internal-sftp[18241]: opendir "/"
Feb 11 21:22:02 ip-10-20-0-96 internal-sftp[18241]: closedir "/"
Feb 11 21:22:05 ip-10-20-0-96 internal-sftp[18241]: opendir "/inbound"
Feb 11 21:22:05 ip-10-20-0-96 internal-sftp[18241]: closedir "/inbound"
Feb 11 21:22:10 ip-10-20-0-96 internal-sftp[18241]: opendir "/inbound/"
Feb 11 21:22:10 ip-10-20-0-96 internal-sftp[18241]: closedir "/inbound/"
Feb 11 21:22:12 ip-10-20-0-96 internal-sftp[18241]: (user=testuser) open "/inbound/mailhog-deployment.yaml" flags READ mode 0666
Feb 11 21:22:12 ip-10-20-0-96 internal-sftp[18241]: (user=testuser) close "/inbound/mailhog-deployment.yaml" bytes read 815 written 0
Feb 11 21:22:13 ip-10-20-0-96 internal-sftp[18241]: opendir "/inbound/"
Feb 11 21:22:13 ip-10-20-0-96 internal-sftp[18241]: closedir "/inbound/"
Feb 11 21:22:14 ip-10-20-0-96 internal-sftp[18241]: opendir "/inbound/"
Feb 11 21:22:14 ip-10-20-0-96 internal-sftp[18241]: closedir "/inbound/"
Feb 11 21:22:14 ip-10-20-0-96 internal-sftp[18241]: (user=testuser) remove name "/inbound/mailhog-deployment.yaml"
Feb 11 21:22:18 ip-10-20-0-96 internal-sftp[18241]: (user=testuser) open "/inbound/mailhog-deployment.yaml" flags WRITE,CREATE,TRUNCATE mode 0644
Feb 11 21:22:18 ip-10-20-0-96 internal-sftp[18241]: (user=testuser) close "/inbound/mailhog-deployment.yaml" bytes read 0 written 815
Feb 11 21:22:19 ip-10-20-0-96 internal-sftp[18241]: session closed for local user testuser from [127.0.0.1]

Limitations

The buffer has a 2000 line look-behind to hunt for the process id. Bump that up if you have dozens or hundreds of users logging in at a given instant. Otherwise this should cover most server's needs.

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.