WebSVN – sysadmin_scripts – Diff – /trunk/fdupes/README.md

 # find greatest savings after fdupes run
 Using fdupes to report on duplicate files, or even remove them automatically, is excellent. However, sometimes just working on a subset of the files can be done in 10% of the time and result in a 90% cleanup.
-fdupesGreatestSavings is a perl script that takes the output of fdupes (on
-stdin) and determines which entries will result in the greatest savings,
-whether it is 100 copies of a 1 Meg file or 2 copies of a 20G file. The
+fdupesGreatestSavings is a perl script that takes the output of fdupes (on stdin) and determines which entries will result in the greatest savings, whether it is 100 copies of a 1 Meg file or 2 copies of a 20G file. The output is sorted by greatest savings to least.
-output is sorted by greatest savings to least.
-fdupesGreatestSavings takes one parameter, the number of entries to display.
+fdupesGreatestSavings takes one parameter, the number of entries to display. It accepts input from stdin and sends output to stdout, so it is a filter.
-It accepts input from stdin and sends output to stdout, so it is a filter.
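The ranking described above can be sketched roughly as follows. This is a Python illustration, not the Perl source of fdupesGreatestSavings. It makes three assumptions: the function name `rank_savings` is hypothetical, each group in the `fdupes --size` output starts with a header such as `1048576 bytes each:` and ends with a blank line, and "savings" means the space freed by keeping one copy per group, that is, size × (copies − 1).

```python
import itertools
import re

# Assumed header format printed by `fdupes --size`, e.g. "1048576 bytes each:"
HEADER = re.compile(r"^(\d+) bytes? each:$")


def rank_savings(lines, limit):
    """Return up to `limit` (savings, size, paths) tuples, largest savings first.

    Hypothetical helper: savings is taken to be size * (copies - 1),
    i.e. the bytes freed if only one copy of each group is kept.
    """
    groups = []
    size, paths = None, []
    # A trailing blank line is appended so the last group is always flushed.
    for raw in itertools.chain(lines, [""]):
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # Start of a new group of identical files.
            size, paths = int(match.group(1)), []
        elif line == "":
            # Blank line closes the current group, if any.
            if size is not None and len(paths) > 1:
                groups.append((size * (len(paths) - 1), size, paths))
            size, paths = None, []
        elif size is not None:
            paths.append(line)
    groups.sort(key=lambda group: group[0], reverse=True)
    return groups[:limit]
```

Under these assumptions, 100 copies of a 1 MiB file would rank below 2 copies of a 20 GiB file, since the first frees about 99 MiB and the second frees 20 GiB. Passing `sys.stdin` as `lines` and the first command-line argument as `limit` would make the sketch usable as a stdin-to-stdout filter in the same way.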
 > fdupes must be run with only the flags --recurse and --size, though
 > --recurse is optional.
-The following command will look through the entire file system on a Unix
+The following command will look through the entire file system on a Unix machine and report the top 10 duplicates it finds.
-machine and report the top 10 duplicates it finds.
     fdupes --recurse --size / | ./fdupesGreatestSavings 10
-If you want to save the results (for other processing), you could do
+If you want to save the results (for other processing), you could do something like
-something like
     fdupes --recurse --size /path/to/be/checked > /tmp/duplicate_files
     fdupesGreatestSavings 100 < /tmp/duplicate_files > /tmp/fdupe.savings
 #### Downloading
 Script is available via subversion at
     svn co http://svn.dailydata.net/svn/sysadmin_scripts/trunk/fdupes
 #### Bugs
-The only bug I have found so far is that the count of the number of files is
+The only bug I have found so far is that the count of the number of files is incorrect (always 2), and I haven't tracked it down. Total space used by an entry is correct, however.
-incorrect (always 2), and I haven't tracked it down. Total space used by an
-entry is correct, however.


(root)/trunk/fdupes/README.md – Rev 161 → 162