I use EC2 on Amazon Web Services. The t2.micro instance (1 GiB RAM, 1 vCPU) runs a customized “Amazon Linux”. When I access this instance via the Cloud9 IDE, 73% of the available disk space (7.8G on /dev/xvda1) is already occupied by default, leaving me only about 2.2G.

My requirements:

  • I need to execute a Python script and write output data locally.
  • I can do without a GUI since I work on the command line.

Which components of the OS can be safely removed to free up some space?

marianoju
  • Is running a different distribution an option? I run an Ubuntu Server on EC2 with less than 4.5 GB for the OS. – Philip Couling Mar 19 '19 at 16:15
  • You could a) add an extra empty volume to the instance, b) copy the existing image to a bigger volume and extend the file system (this is a bit tricky but still possible), or c) create your own image with software packages of your choice. – Serge Mar 20 '19 at 14:38
  • @PhilipCouling That might be a workaround, but would require me to set up the instance from scratch. Can you confirm that this actually yields you more (remaining) free space? – marianoju Mar 20 '19 at 19:43
  • @PhilipCouling I had to find out by myself and started two t2.micro instances with Amazon Linux and Ubuntu Server 18.04 LTS respectively. OS occupies 5.7G/9.8G (60%) and 6.4G/9.7G (66%), and default available file space on startup is 4.0G (40%) and 3.3G (34%). How come your Ubuntu Server on EC2 weighs less than 4.5G? – marianoju Aug 08 '19 at 12:19
  • @marianoju Similar to the accepted answer. I start with a minimal image; it's been a while, so I can't say which one. Then I strip things out. On a fresh install I examine the output of apt-mark showmanual and mark as much as I can for possible removal: apt-mark auto <module>. E.g. apt-mark auto $(apt-mark showmanual | grep ^lib) sets all libs to auto, meaning they will only be kept if something else needs them. Finally I run apt-get autoremove to remove anything no longer needed. – Philip Couling Aug 08 '19 at 13:01
  • Please note that I am using the Cloud9 IDE. An instance initiated via the Launch Instance Wizard with an Amazon Linux 2 AMI (HVM) to be accessed via SSH has 6.8G default available file space on startup. – marianoju Aug 08 '19 at 13:04

1 Answer

1. remove dispensable packages

Amazon Linux instances manage their software using the yum package manager. The yum package manager can install, remove, and update software, as well as manage all of the dependencies for each package. – Managing Software on Your Linux Instance

I have executed the following to produce a list of the 20 largest packages in the system:

rpm -qa --queryformat '%10{size} - %-25{name} \t %{version}\n' | sort -nr | head -n 20
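A minimal sketch of the same listing with sizes shown in MiB, which can be easier to scan than raw byte counts. It assumes input lines of the form "<bytes> <name>"; top_packages is a hypothetical helper name, not part of rpm or yum.

```shell
# top_packages (hypothetical helper): read "<bytes> <name>" lines on stdin
# and print the N largest entries (default 20) with sizes converted to MiB.
top_packages() {
    sort -nr | head -n "${1:-20}" |
        awk '{ printf "%8.1f MiB  %s\n", $1 / 1048576, $2 }'
}

# Intended use on the instance (not executed here):
#   rpm -qa --queryformat '%{size} %{name}\n' | top_packages 20
```

Keeping the conversion in a small filter means the rpm query itself stays as simple as the one above.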

To remove a package together with all of its dependencies, I installed the yum plugin remove-with-leaves and then repeatedly removed the largest packages (including their dependencies) that I deemed dispensable (see the list under Results):

sudo yum remove package_name --remove-leaves

2. remove obsolete kernel

  1. Identified current kernel: uname -mrs
  2. Listed all kernels: rpm -q kernel
  3. Manually removed obsolete Linux kernel: sudo yum remove kernel-4.9.76-3.78.amzn1.x86_64
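The three steps above can be sketched as a small filter that lists every installed kernel package except the running one. This assumes that the names printed by rpm -q kernel are exactly "kernel-" followed by the output of uname -r, as in the example above; old_kernels is a hypothetical helper name.

```shell
# old_kernels (hypothetical helper): read installed kernel package names on
# stdin (as printed by "rpm -q kernel") and print all of them except the one
# that matches the running release passed as $1 (as printed by "uname -r").
old_kernels() {
    # grep exits with 1 when nothing is left to print; treat that as success.
    grep -vxF "kernel-$1" || [ $? -eq 1 ]
}

# Intended use on the instance (not executed here):
#   rpm -q kernel | old_kernels "$(uname -r)"
# Review the printed names before passing any of them to "sudo yum remove".
```

The filter only prints candidates and removes nothing, so the list can be checked by hand first.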

3. remove unused packages

Identified packages that can be removed without affecting anything else (in Debian-speak these are called “orphaned packages”) and removed them quietly.

sudo package-cleanup --quiet --leaves | sudo xargs -L 1 yum -y remove

Findings

Although I only actively use Python 3.6.5, it is not possible to remove the default Python (2.7.14).

Python is required by many of the Linux distributions. Many system utilities the distro providers combine (both GUI based and not), are programmed in Python. The version of python the system utilities are programmed in I will call the "main" python. [...] Because of the system utilities that are written in python it is impossible to remove the main python without breaking the system. – How to yum remove Python gracefully?

Space occupied by python27 packages amounts to 115819035 bytes (~116 MB).
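One way such a total could be computed is to sum the size field of every package whose name starts with a given prefix. This is a sketch under that assumption, not necessarily how the figure above was obtained; sum_sizes is a hypothetical helper name.

```shell
# sum_sizes (hypothetical helper): read "<bytes> <name>" lines on stdin and
# print the total number of bytes for packages whose name starts with $1.
sum_sizes() {
    awk -v prefix="$1" '
        index($2, prefix) == 1 { total += $1 }
        END { printf "%.0f\n", total }
    '
}

# Intended use on the instance (not executed here):
#   rpm -qa --queryformat '%{size} %{name}\n' | sum_sizes python27
```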

Results

  • A total of ~0.5 GB was reclaimed (7% of disk space on /dev/xvda1).
  • 214 packages with a total of 633427867 bytes were removed:
    java-1.7.0-openjdk emacs-common mysql55-server java-1.7.0-openjdk-devel git 
    mysql55 vim-common perl compat-libicu4 aws-apitools-ec2 emacs v8 ruby20-libs 
    perl-Encode nodejs-devel aws-apitools-elb aws-apitools-as nodejs 
    aws-apitools-mon perl-DBD-SQLite dejavu-sans-fonts subversion subversion-libs 
    subversion-perl python36-devel dejavu-serif-fonts vim-enhanced libtool autoconf 
    perl-DBI rubygem20-rdoc automake libX11-common perl-libs gyp cvs libX11 git-svn 
    alsa-lib gnutls dejavu-sans-mono-fonts perl-Net-SSLeay npm libyaml-devel 
    xorg-x11-fonts-Type1 perl-IO-Compress rsync libxcb libpng perl-Test-Harness 
    rubygems20 perl-Pod-Simple fontconfig aws-amitools-ec2 lcms2 perl-DBD-MySQL55 
    git-cvs xorg-x11-font-utils libXfont perl-podlators perl-IO-Socket-SSL git-p4 
    v8-devel perl-YAML perl-Storable rubygem20-json perl-Git-SVN perl-PathTools 
    nodejs-hawk perl-Pod-Perldoc ruby20-irb perl-File-Temp libuv-devel libserf 
    system-rpm-config autogen-libopts perl-Getopt-Long perl-Compress-Raw-Zlib 
    perl-Filter perl-GSSAPI dejavu-fonts-common libuv perl-Net-Daemon libICE cvsps 
    perl-Socket rubygem20-psych perl-Digest-SHA git-email perl-Authen-SASL ttmkfdir 
    perl-HTTP-Tiny perl-Data-Dumper nodejs-ctype perl-threads emacs-git 
    perl-Time-HiRes perl-IO-Socket-IP libXext giflib rubygem20-bigdecimal libSM 
    nodejs-async perl-threads-shared perl-PlRPC nodejs-hoek node-gyp libXi perl-Git 
    nodejs-request nodejs-fstream perl-Scalar-List-Utils ruby20 nodejs-mime 
    perl-Exporter perl-TermReadKey perl-Compress-Raw-Bzip2 nodejs-tar 
    perl-Digest-MD5 perl-File-Path perl-Error http-parser perl-Net-LibIDN 
    perl-Pod-Usage perl-Time-Local libfontenc libXrender libXau 
    nodejs-npm-registry-client nodejs-minimatch nodejs-boom nodejs-http-signature 
    nodejs-semver libXcomposite nodejs-glob nodejs-nopt perl-Digest perl-Carp 
    libXtst perl-Thread-Queue nodejs-npmconf libffi-devel perl-constant gpm-libs 
    perl-Pod-Escapes nodejs-normalize-package-data nodejs-packaging 
    nodejs-read-package-json nodejs-promzard nodejs-lockfile nodejs-asn1 
    nodejs-ansi perl-Text-ParseWords copy-jdk-configs nodejs-form-data nodejs-sntp 
    nodejs-fstream-npm nodejs-node-uuid nodejs-config-chain perl-Digest-HMAC 
    nodejs-retry nodejs-graceful-fs nodejs-sigmund nodejs-npmlog http-parser-devel 
    nodejs-read-installed nodejs-lru-cache nodejs-init-package-json nodejs-qs 
    nodejs-slide nodejs-combined-stream nodejs-assert-plus nodejs-fstream-ignore 
    nodejs-block-stream perl-parent nodejs-delayed-stream nodejs-ini nodejs-sha 
    nodejs-cmd-shim nodejs-tunnel-agent nodejs-mute-stream nodejs-rimraf 
    nodejs-read nodejs-osenv nodejs-mkdirp perl-macros nodejs-which nodejs-abbrev 
    perl-Net-SMTP-SSL nodejs-archy nodejs-uid-number nodejs-aws-sign 
    nodejs-forever-agent nodejs-opener nodejs-json-stringify-safe nodejs-proto-list 
    nodejs-cryptiles nodejs-editor nodejs-child-process-close 
    nodejs-github-url-from-git nodejs-cookie-jar nodejs-npm-user-validate 
    nodejs-chmodr nodejs-chownr nodejs-once nodejs-inherits nodejs-oauth-sign 
    aws-apitools-common mysql-config vim-filesystem ruby git-all 
    fontpackages-filesystem 
    

Resources

  1. Amazon Linux AMI
  2. GAD3R's answer to how to remove all installed dependent packages while removing a package in centos 7?
  3. How to remove old unused kernels on CentOS Linux
  4. jtoscarson's answer to Remove unused packages
  5. Owen Fraser-Green's answer to How can I remove Orphan Packages in Fedora?
marianoju
  • Nice work and nice explanation. However, this seems like too much work for too small a gain (unless the whole process is automated on VM creation). – Archemar Jul 24 '19 at 08:35
  • Compressing my output remotely at regular intervals turned out to be the solution to my original problem which I failed to state: I was regularly running out of space. My output was highly redundant and could be compressed by >80% using gzip and >95% using xz. Runtime of the compression was negligible in this scenario. – marianoju Aug 08 '19 at 14:08
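
A minimal sketch of the periodic compression described in the last comment. It assumes the output files sit in a single directory and are no longer being written when the function runs; compress_outputs and its arguments are hypothetical names.

```shell
# compress_outputs (hypothetical helper): gzip every regular file directly
# inside directory $1 whose name matches the glob $2 (default: all files),
# skipping files that already end in .gz. gzip replaces each file with a
# compressed copy named <file>.gz.
compress_outputs() {
    find "$1" -maxdepth 1 -type f -name "${2:-*}" ! -name '*.gz' \
        -exec gzip -9 {} +
}

# Intended use (not executed here), for example from a cron job:
#   compress_outputs /path/to/output '*.log'
```

Swapping gzip -9 for xz -9 (and the .gz exclusion for .xz) trades longer runtime for the higher compression ratio mentioned in the comment.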