# find greatest savings after fdupes run

Using fdupes to report on duplicate files, or even remove them automatically, is excellent. However, sometimes just working on a subset of the files can be done in 10% of the time and result in a 90% cleanup.

fdupesGreatestSavings is a Perl script that takes the output of fdupes (on stdin) and determines which entries will result in the greatest savings, whether that is 100 copies of a 1 MB file or 2 copies of a 20 GB file. The output is sorted from greatest savings to least.

fdupesGreatestSavings takes one parameter, the number of entries to display. It accepts input from stdin and sends output to stdout, so it is a filter.

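The ranking idea can be sketched in a few lines of shell and awk. This is only a rough illustration, not the Perl script itself: `rank_savings` is a hypothetical name, and the sketch assumes `fdupes --size` prints each duplicate set as a `N bytes each:` header followed by one path per line and a blank line. The savings for a set are taken to be the file size times the number of copies beyond the first.

```shell
# rank_savings N: hypothetical helper, NOT the real fdupesGreatestSavings.
# Reads `fdupes --size` output on stdin and prints the N sets that would
# free the most space, one per line, as:
#   savings<TAB>copies<TAB>size<TAB>first path in the set
rank_savings() {
    awk '
        # A header such as "1048576 bytes each:" starts a new duplicate set.
        /^[0-9]+ bytes? each:$/ { flush(); size = $1; next }
        # A blank line ends the current set.
        /^$/ { flush(); next }
        # Any other line is a path; remember the first one as a label.
        { if (copies++ == 0) first = $0 }
        END { flush() }
        # Keeping one copy frees size * (copies - 1) bytes.
        function flush() {
            if (copies > 1)
                printf "%.0f\t%d\t%.0f\t%s\n", size * (copies - 1), copies, size, first
            copies = 0
        }
    ' | sort -rn | head -n "${1:-10}"
}
```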
> fdupes must be run with only the flags --recurse and --size, though
> --recurse is optional.

The following command scans the entire file system on a Unix machine and reports the 10 duplicate sets that offer the greatest savings.

    fdupes --recurse --size / | ./fdupesGreatestSavings 10

If you want to save the results (for other processing), you could do something like

    fdupes --recurse --size /path/to/be/checked > /tmp/duplicate_files
    fdupesGreatestSavings 100 < /tmp/duplicate_files > /tmp/fdupe.savings

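As one example of other processing, the saved fdupes output can be summed to estimate how much space would be freed if only one copy of every file were kept. This is a minimal sketch under the same assumption about the `fdupes --size` format (a `N bytes each:` header, then one path per line, then a blank line); `total_reclaimable` is a hypothetical name, not part of the script.

```shell
# total_reclaimable FILE: hypothetical helper for illustration only.
# Sums the bytes that would be freed by keeping a single copy of each
# duplicate set listed in saved `fdupes --size` output.
total_reclaimable() {
    awk '
        # Header line: remember the per-file size and start counting paths.
        /^[0-9]+ bytes? each:$/ { size = $1; copies = 0; next }
        # Blank lines only separate sets.
        /^$/ { next }
        # Every path after the first in a set is a redundant copy.
        { if (copies++ > 0) total += size }
        END { printf "%.0f\n", total }
    ' "$1"
}

# Usage, with the file saved above:
#   total_reclaimable /tmp/duplicate_files
```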
#### Downloading

The script is available via Subversion:

    svn co http://svn.dailydata.net/svn/sysadmin_scripts/trunk/fdupes

#### Bugs

The only bug I have found so far is that the count of the number of files is incorrect (always 2), and I haven't tracked it down. Total space used by an entry is correct, however.