2013-12-30

Remove duplicate files

Poor man's deduplication: scan directories for duplicate files and remove all but one copy of each. This blog entry looks at two tools: fdupes and fslint.


Using fdupes (NOTE: only deletes duplicate files, leaving one intact; there is no option to hardlink):

  1. Install by going to http://software.opensuse.org/package/fdupes and clicking the "Direct install" button
  2. Read the manpage: man fdupes
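  3. Do it (WARNING: this deletes files without asking. -r recurses into subdirectories, -d deletes duplicates, and -N keeps the first file in each set and removes the rest without prompting):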
    $ fdupes -rdN [LIST_OF_DIRECTORIES]
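To get a rough idea of what the delete step would hit, duplicates can be listed with standard tools first. This is only a minimal sketch: it assumes GNU findutils and coreutils, list_dupes is a made-up helper name, and it groups files purely by SHA-256 checksum, whereas fdupes does its own comparison. It deletes nothing.

```shell
# list_dupes DIR...: print groups of files whose SHA-256 checksums match.
# Hypothetical helper for illustration; not part of fdupes or fslint.
list_dupes() {
    # Checksum every regular file under the given directories,
    # sort so equal checksums end up next to each other, then keep
    # only lines whose first 64 characters (the checksum) repeat.
    find "$@" -type f -print0 |
        xargs -0 -r sha256sum |
        sort |
        uniq -w64 --all-repeated=separate
}
```

Running list_dupes [LIST_OF_DIRECTORIES] prints each set of identical files as a block of lines, with a blank line between sets. Empty files all have the same checksum, so they show up together as one set.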
    

Using fslint:

  1. Unfortunately, there is no official openSUSE package. Also, this is "GUI-lovers" software similar to unison and recoll -- i.e. it implements a GUI over a command-line interface, and the packaging may not ensure easy access to the latter. So, I follow the openSUSE instructions at the project homepage:
    $ cd
    $ [ -f /etc/SuSE-release ] && pkg=packages
    $ wget http://www.pixelbeat.org/fslint/fslint-2.42.tar.gz
    $ sudo rpmbuild -ta fslint-2.42.tar.gz
    $ sudo rpm -Uvh /usr/src/$pkg/RPMS/noarch/fslint-2.42-1.*.noarch.rpm
    
  2. Read the manpage: man fslint
  3. Make the findup command accessible by putting the following in my .aliases
    alias findup=/usr/share/fslint/fslint/findup
    
    (and also run it from my current bash prompt so it takes effect immediately)
  4. Look at usage: findup --help
  5. Dry run: findup -mt --summary [LIST_OF_DIRECTORIES]
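
Going by findup's usage text, -m merges duplicates using hardlinks and -t makes it a test run that only reports what would be done. What a hardlink merge amounts to for a single pair of files can be sketched as below. merge_pair is a hypothetical helper, not findup's own code, and the sketch assumes both files are on the same filesystem, since hardlinks cannot cross filesystems.

```shell
# merge_pair KEEP DUP: replace DUP with a hardlink to KEEP, but only if
# the two files have identical contents.
# Hypothetical helper for illustration; this is not how findup is written.
merge_pair() {
    local keep=$1 dup=$2
    # cmp -s prints nothing and returns non-zero when the contents differ.
    cmp -s -- "$keep" "$dup" || { echo "contents differ: $dup" >&2; return 1; }
    # ln -f removes the existing DUP and recreates it as a hardlink to KEEP.
    ln -f -- "$keep" "$dup"
}
```

After merge_pair a b, both names point at the same data on disk, so the space is used once but neither name disappears. That is the difference from the fdupes -d approach, which removes the extra names altogether.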
