# find greatest savings after fdupes run

Using fdupes to report on duplicate files, or even remove them
automatically, is excellent. However, sometimes working on just a subset of
the files takes 10% of the time and yields 90% of the cleanup.

fdupesGreatestSavings is a Perl script that takes the output of fdupes (on
stdin) and determines which entries will yield the greatest savings, whether
that is 100 copies of a 1 MB file or 2 copies of a 20 GB file. The output is
sorted from greatest savings to least.

fdupesGreatestSavings takes one parameter: the number of entries to display.
It accepts input on stdin and sends output to stdout, so it is a filter.

> fdupes must be run with only the flags --recurse and --size, though
> --recurse is optional.
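
To make the ranking concrete, here is a minimal Perl sketch of the idea (not
the actual script). It assumes each group in the fdupes --size output begins
with an "NNN bytes each:" header, followed by one file path per line, with a
blank line between groups; the savings for a group is then the file size
times one less than the number of copies, since one copy has to stay.

```perl
#!/usr/bin/perl
# Minimal sketch (not the real fdupesGreatestSavings): rank fdupes
# --size groups by potential savings. Assumes each group starts with
# an "NNN bytes each:" header, has one path per line, and is
# separated from the next group by a blank line.
use strict;
use warnings;

my $show = shift // 10;    # number of entries to display
my ( @groups, $size, @files );

sub flush {
    # savings = size of one copy * number of copies we could delete
    push @groups, { savings => $size * ( @files - 1 ), files => [@files] }
        if defined $size && @files > 1;
    @files = ();
}

while ( my $line = <STDIN> ) {
    chomp $line;
    if    ( $line =~ /^(\d+)\s+bytes?\s+each/ ) { flush(); $size = $1; }
    elsif ( $line eq '' )                       { flush(); }
    else                                        { push @files, $line; }
}
flush();    # the last group may not be followed by a blank line

# biggest savings first, print the top N
for my $g ( ( sort { $b->{savings} <=> $a->{savings} } @groups )[ 0 .. $show - 1 ] ) {
    last unless defined $g;
    printf "%d bytes recoverable (%d copies)\n", $g->{savings},
        scalar @{ $g->{files} };
    print "  $_\n" for @{ $g->{files} };
}
```

It would be used the same way, e.g. `fdupes --recurse --size /home | perl
rankSavings.pl 10`, where rankSavings.pl is just a hypothetical name for the
sketch above.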

The following command looks through the entire file system on a Unix
machine and reports the 10 entries with the greatest potential savings:

```sh
fdupes --recurse --size / | ./fdupesGreatestSavings 10
```

If you want to save the results (for other processing), you could do
something like:

```sh
fdupes --recurse --size /path/to/be/checked > /tmp/duplicate_files
fdupesGreatestSavings 100 < /tmp/duplicate_files > /tmp/fdupe.savings
```
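
Because fdupesGreatestSavings is a filter, the saved list can be re-ranked
later without re-running fdupes, for example with a different entry count:

```sh
# re-rank the saved duplicate list, showing only the top 10
fdupesGreatestSavings 10 < /tmp/duplicate_files
```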

#### Downloading

The script is available via Subversion:

```sh
svn co http://svn.dailydata.net/svn/sysadmin_scripts/trunk/fdupes
```

#### Bugs

The only bug I have found so far is that the file count reported for each
entry is incorrect (always 2); I haven't tracked it down yet. The total
space used by an entry is correct, however.