Score:7

Server hardening of Ubuntu 22.04 by analyzing and removing unnecessary packages

cn flag

I have read Jay Lacroix's book "Mastering Ubuntu Server", in which he recommends removing all unnecessary packages in order to reduce the attack surface. Specifically, he advises running apt-cache rdepends <package> to find out whether other packages depend on a package we are considering removing.

I wrote a bash script that lists dependent packages for all installed packages, but it takes a very long time (>30 mins on a Raspberry Pi 4, 8GB), and I was wondering if there is a better, faster solution to this.

#!/bin/bash

readarray -t packages < <(dpkg --get-selections | cut -f1)

for package in "${packages[@]}";
do
    # Skip the two header lines of the rdepends output and strip leading whitespace
    readarray -t dependents < <(apt-cache rdepends "$package" | sed -n '3,$s/^\s*//p')
    echo "---------------------------------------------" | tee -a packages_and_depents.txt
    echo "${package} has these dependents on the system of max ${#dependents[@]}:" | tee -a packages_and_depents.txt
    echo "---------------------------------------------" | tee -a packages_and_depents.txt
    for dependent in "${dependents[@]}";
    do
        # Prints a line only if the dependent is known to dpkg on this system
        dpkg --get-selections "$dependent" 2>/dev/null | tee -a packages_and_depents.txt
    done
done
user535733 avatar
cn flag
A better way to print all dependencies? Or a better way to locate unnecessary packages?
Thomas Grusz avatar
cn flag
@user535733 It's not the **dependencies** I'm after; I need the **dependents**, i.e. the packages that depend on the package I am considering removing. And I was hoping to find a faster solution than mine.
ru flag
I would note that "removing unnecessary packages" is not a surefire way to harden a system. If you are looking for hardening guidance, you should probably work from the CIS benchmarks rather than only removing unnecessary packages.
cn flag
I'd ponder the alternative - starting with a minimal server or debootstrap system and building *up* from there. You're starting with as small a footprint as possible and adding from there
Score:12
vn flag

The dpkg package system has a field for each package indicating its Priority.

You could use this as an initial filter, and only run your script on packages that are categorized as optional and extra (and leaving out required, important and standard).
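As a rough illustration of this filter, here is a minimal sketch that works on sample text instead of the real package database. The sample lines and the function names sample_status and removal_scope are made up for illustration; the line shape assumes the dpkg-query format '${Package} ${Status;-26}${Priority}\n'. It compares whole fields with awk rather than using grep, which is a swapped-in variation, not the approach used in the scripts below.

```shell
#!/bin/bash

# Hypothetical sample data, shaped like the output of:
#   dpkg-query -Wf '${Package} ${Status;-26}${Priority}\n'
sample_status() {
    cat <<'EOF'
bash install ok installed       required
example-standard-pkg install ok installed       standard
htop install ok installed       optional
oldpkg deinstall ok config-files optional
network-manager-config-connectivity-ubuntu install ok installed       optional
EOF
}

# Keep only fully installed packages whose priority is optional or extra.
# Field 1 is the package name, field 4 the install state, field 5 the priority.
removal_scope() {
    awk '$4 == "installed" && ($5 == "optional" || $5 == "extra") { print $1 }'
}

sample_status | removal_scope
# prints:
#   htop
#   network-manager-config-connectivity-ubuntu
```

On a real system, the sample function would be replaced by the dpkg-query call named in the comment.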

Also, creating an extra array and running an extra for loop for each package seems unnecessary, and will definitely take more computing power.

So I've removed the 2nd for loop and instead added --installed directly to the apt-cache rdepends command.

The script could be modified like this:

#!/bin/bash

# The command for this line is changed
# A space after ${Package} keeps the name separate from the status, so awk's $1 is the bare package name
readarray -t packages < <(dpkg-query -Wf '${Package} ${Status;-26}${Priority}\n' | sort -b -k5,5 -k1,1 | grep -v 'required\|important\|standard' | grep 'installed' | awk '{ print $1 }')

for package in "${packages[@]}";
do
    echo "---------------------------------------------" | tee -a packages_and_depents.txt
    echo "${package} has these dependents installed on the system:" | tee -a packages_and_depents.txt
    echo "---------------------------------------------" | tee -a packages_and_depents.txt

    # 2nd for loop removed and replaced with `--installed` option
    apt-cache --installed rdepends "$package" | tail -n +3 | tee -a packages_and_depents.txt
done

Another option is to change the entire script so that, instead of showing all reverse dependencies, it shows only the names of packages that have NO installed reverse dependencies (the candidates for removal).

Also, I think you could add an additional grep exclusion for all packages whose names start with lib (adding grep -v '^lib').

Finally, the presentation can be improved so that the script gives visual feedback while it's running, but the final report is written only to the output file.

Here is my final version of the script:

#!/bin/bash

# The command for this line is changed
# A space after ${Package} keeps the name separate from the status, so awk's $1 is the bare package name
readarray -t packages < <(dpkg-query -Wf '${Package} ${Status;-26}${Priority}\n' | sort -b -k5,5 -k1,1 | grep -v 'required\|important\|standard' | grep -v '^lib' | grep 'installed' | awk '{ print $1 }')

# Write to file
echo "The following packages are not a dependency to any installed package:" > packages_no_depends.txt
# Write to screen
echo "Number of packages: [ ${#packages[@]} ] (priority optional/extra)"
echo ""

i=0
j=0

# Loop that only prints package names with NO reverse dependencies
for package in "${packages[@]}";
do

    (( j++ ))
    echo -e "\033[1AProcessed packages: [ $i/$j ]"
    if [[ $(apt-cache --installed rdepends "$package" | tail -n +3 | wc -l) -eq 0 ]]
    then
        # Write to file
        echo "  $package" >> packages_no_depends.txt
        # Write to screen
        echo -e "\033[K  Package $package added to the list of non-dependencies\033[1A"
        (( i++ ))
    fi

done

# Final overview
echo -e "\033[K"
echo "STATUS"
echo "======"
echo "  Total packages scanned : $j"
echo "  Candidates for removal : $i"
echo "  Script execution time  : $SECONDS seconds"

Reference to escape codes for cursor movement.

EDIT: This solution should mostly be considered for layout and presentation - Raffa's solution is much more effective, so do your own combination of the two according to preference.
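One likely reason for the difference is process startup: the loop versions start apt-cache once per package, while an xargs pipeline passes many package names to a single invocation. The following is a minimal sketch of that batching effect only. It uses a hypothetical stand-in command (fake_tool) instead of apt-cache, and the function names loop_calls and batch_calls are made up for illustration.

```shell
#!/bin/bash

# Hypothetical stand-in for an expensive command such as apt-cache:
# each invocation reports how many arguments it received.
fake_tool='echo "invoked with $# argument(s)"'

# Loop style: one process per package name
loop_calls() {
    for p in bash coreutils nano; do
        sh -c "$fake_tool" _ "$p"
    done
}

# xargs style: all names are handed to a single process
batch_calls() {
    printf '%s\n' bash coreutils nano | xargs sh -c "$fake_tool" _
}

loop_calls
# prints "invoked with 1 argument(s)" three times

batch_calls
# prints "invoked with 3 argument(s)" once
```

With a very long package list, xargs may split the names across a few invocations to respect the system's command-line length limit, which is still far fewer processes than one per package.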

With input from Raffa, this is my final version of the script:

#!/bin/bash

# Change /path/to
dpkg_file="/path/to/packages_no_depends.txt"

# Write to file
echo "The following packages are not a dependency to any installed package:" > "$dpkg_file"
# Write to screen
echo "Scanning packages ..."

# Pipeline that writes non-dependencies to file
dpkg-query -Wf '${Package} ${Status}${Priority}\n' |
  grep -v 'required\|important\|standard' |
  grep 'installed' |
  awk '{ print $1 }' |
xargs apt-cache rdepends --installed |
  awk '! /Reverse Depends:/ {
    tp = $0
    n++
  }
  /Reverse Depends:/ {
    if (n == 1 && NR != 2) {
        print "  " p
    }
    n = 0
    p = tp
  }
  END {
    if (n == 0) {
        print "  " p
    }
  }' >> "$dpkg_file"

# Final overview
echo -e "\nSTATUS\n======"
echo "  Total packages scanned : $(dpkg-query -Wf '${Package} ${Status;-26}${Priority}\n' | grep -v 'required\|important\|standard' | grep 'installed' | wc -l)"
echo "  Candidates for removal : $(tail -n +2 "$dpkg_file" | wc -l)"
echo "  Script execution time  : $SECONDS seconds"
Raffa avatar
jp flag
Newer APT versions can take [a search pattern](https://manpages.ubuntu.com/manpages/jammy/man7/apt-patterns.7.html) `?reverse-depends(PATTERN)` that can be negated by prepending `!` to it, like `!?reverse-depends(PATTERN)` … Yet, I don’t know if it will work for this use case.
Thomas Grusz avatar
cn flag
@artur-meinild Thanks! Both scripts work great and reduce execution time drastically. I stick with your second script, which is exactly what I was looking for. I simplified line #9 in the second script to: `echo "Number of packages: [ ${#packages[@]} ] (priority optional/extra)"`
LogicalBranch avatar
rs flag
Genuinely appreciate this contribution, saved me a lot of time and when paired with Synaptic to double check, ensured I wasn't breaking anything by accidentally uninstalling an "unimportant" package that was required by an important one. Thank you.
Artur Meinild avatar
vn flag
@ThomasGrusz happy it works, and I implemented your improvement, thanks!
Artur Meinild avatar
vn flag
Additional comment: You might be able to further reduce the runtime by excluding packages where the name is starting with `lib` (add an additional `grep -v '^lib'`). In my experience, those packages will almost always be a dependency of some other package.
jp flag
Dan
FWIW, if the package name is longer than 40 characters, the script will complain that a package with a weird name does not exist. It can be fixed by removing the package name limit and forcing a whitespace character in the format: `'${Package} ${Status;-26}${Priority}\n'`. For example, for me, this package was causing the weird output: `network-manager-config-connectivity-ubuntu`
Artur Meinild avatar
vn flag
@Dan thanks, will fix that
Score:7
jp flag

In line with the tools you implement in your script, this should be as fast as it gets:

dpkg --get-selections |
cut -f1 |
xargs apt-cache rdepends --installed |
awk '! /Reverse Depends:/ {
    tp = $0
    n++
}

/Reverse Depends:/ {
    if (n == 1 && NR != 2) {
        print p
    }
    n = 0
    p = tp
}

END {
    if (n == 0) {
        print p
    }
}
'

It should list packages in the output of dpkg --get-selections | cut -f1 that have no installed reverse dependencies on the system.
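To see how the awk program picks out those packages, here is a small walk-through on sample text. The sample lines and the function names sample_rdepends and no_dependents are made up for illustration; the sample assumes the usual shape of apt-cache rdepends --installed output (package name, a "Reverse Depends:" line, then indented dependents). The awk body is the same as above.

```shell
#!/bin/bash

# Hypothetical sample, shaped like `apt-cache rdepends --installed` output
# for four packages; htop and tree have no installed dependents.
sample_rdepends() {
    cat <<'EOF'
libfoo1
Reverse Depends:
  foo-tools
  bar
htop
Reverse Depends:
nano
Reverse Depends:
  editor-meta
tree
Reverse Depends:
EOF
}

# n counts the lines seen since the last "Reverse Depends:" header.
# If only one line (the next package name) was seen, the previous
# package p had no dependents and is printed.
no_dependents() {
    awk '! /Reverse Depends:/ {
        tp = $0
        n++
    }
    /Reverse Depends:/ {
        if (n == 1 && NR != 2) {
            print p
        }
        n = 0
        p = tp
    }
    END {
        if (n == 0) {
            print p
        }
    }'
}

sample_rdepends | no_dependents
# prints:
#   htop
#   tree
```

The NR != 2 test skips the very first header, where no previous package exists yet, and the END block covers the last package in the list.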

!!WARNING!!

Never feed the output of the above command to a package removal tool ... Inspect the output and handle it manually in all cases.

Thomas Grusz avatar
cn flag
Your implementation is awesome. Runtime on my Raspberry Pi 4, 8GB is about 1 second! I will combine your script with parts of Artur Meinild's script as the final implementation. Cool team effort.
Artur Meinild avatar
vn flag
Looks really cool. I can see I need to look more into `xargs` etc.
user535733 avatar
cn flag
+1 for the clear warning that thoughtful human review is essential.
Score:4
cn flag

This is my final implementation with input from @Raffa, @artur-meinild, and @Dan.

I tested the script on a Raspberry Pi 4 (8GB) and a VM on my iMac (8GB, i5, SSD, 2015), and runtime was about 1 second on both systems. A big improvement over my initial script, which took more than 30 minutes.

Thanks to everyone!

#!/bin/bash

# Define output filename
filename=packages_no_dependents.txt

# Write heading to file
echo "The following packages are not a dependency to any installed package:" > $filename

# Get all installed packages that do not have a 'Priority' of 'required', 'important' or 'standard' and do not have packages that depend on them
dpkg-query -Wf '${Package} ${Status;-26}${Priority}\n' | grep -v 'required\|important\|standard' | grep 'installed' | awk '{ print $1 }' |
xargs apt-cache rdepends --installed |
awk '! /Reverse Depends:/ {
    tp = $0
    n++
}

/Reverse Depends:/ {
    if (n == 1 && NR != 2) {
        print p
    }
    n = 0
    p = tp
}

END {
    if (n == 0) {
        print p
    }
}
' | tee -a $filename

# Remove leading white-space in file
sed -i '3,$s/\s*//' $filename
Artur Meinild avatar
vn flag
My own final version also ended up with something similar to this! Also, the `sort` command is totally irrelevant here, so that part can be removed as well.
Thomas Grusz avatar
cn flag
@ArturMeinild Thanks for pointing this out, I will remove it in my solution. Can you share your final script here or send it to me via email? I am learning so much here. My email: [email protected]
Artur Meinild avatar
vn flag
I added my final version to my answer.
Score:0
cn flag

I generally go the other way: In aptitude I mark everything as "automatically installed", which will mark all but essential packages for deletion. Then I go through the list of packages to be uninstalled, and manually add those that I know I will directly need.

If something in the list looks useful, I can press r to get the list of reverse dependencies, and if something in that list looks useful, that gets marked manually installed and the package I was looking at earlier remains installed as a dependency.

Artur Meinild avatar
vn flag
This sounds like a very dangerous practice - I wouldn't recommend this - especially not marking everything as "automatically installed".