Step 1: Adjusting comparison settings
Before you can start a scan, a few settings have to be configured. The
first ones we will look at are the comparison settings.
The comparison criteria control how DoubleKiller determines whether two files
are equal. You can select any combination of the following four criteria on the
Comparison Options page by clicking their checkboxes. The "advanced..." buttons
open additional windows offering more detailed settings for each criterion.
The current settings are summarized in the white text boxes.
- file name
Compares the file names. In the advanced comparison
options you can specify if the complete name or only a part of it should
be compared and if the comparison is case sensitive.
- date
Compares the dates on which the files were last modified. In the advanced
comparison options you can limit the comparison to certain parts of the date,
e.g. year and month, or specify an allowed difference.
- size
Compares the file size / length. An allowed difference can be set in the advanced
comparison options.
- content
Compares the content of the files, which is often the most important criterion.
The advanced comparison options allow you to limit
the comparison to certain parts of the files and to specify the comparison
methods (fingerprint and/or byte-by-byte comparison).
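The four criteria can be pictured as four independent yes/no tests on a pair of files. The following is only a minimal sketch of that idea, not DoubleKiller's actual implementation: all function and parameter names are hypothetical, SHA-256 is an assumed stand-in for whatever fingerprint the program really uses, and the "part of the name", "part of the date" and "part of the file" options are left out.

```python
import filecmp
import hashlib
import os


def same_name(a, b, case_sensitive=False):
    """File name criterion: compare base names, optionally ignoring case."""
    name_a, name_b = os.path.basename(a), os.path.basename(b)
    if not case_sensitive:
        name_a, name_b = name_a.lower(), name_b.lower()
    return name_a == name_b


def same_date(a, b, tolerance_seconds=0.0):
    """Date criterion: last-modified times within an allowed difference."""
    return abs(os.path.getmtime(a) - os.path.getmtime(b)) <= tolerance_seconds


def same_size(a, b, tolerance_bytes=0):
    """Size criterion: file lengths within an allowed difference."""
    return abs(os.path.getsize(a) - os.path.getsize(b)) <= tolerance_bytes


def fingerprint(path, chunk_size=65536):
    """Assumed fingerprint: SHA-256 over the whole file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(a, b, use_fingerprint=True, use_bytewise=True):
    """Content criterion: fingerprint check and/or full byte comparison."""
    if use_fingerprint and fingerprint(a) != fingerprint(b):
        return False
    # shallow=False forces filecmp to read and compare the actual bytes.
    if use_bytewise and not filecmp.cmp(a, b, shallow=False):
        return False
    return True
```

Each function corresponds to one checkbox, and the keyword arguments play the role of the advanced options.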
If more than one criterion is selected, the files have to pass all of the
selected comparisons to be considered duplicates. The relatively slow content
comparison is always done last, so in general you should use anything you know
about the duplicates you are searching for to reduce the number of content
comparisons needed: minimize the number of files by choosing the right folders
and filters (discussed in the following steps), and select as many comparison
criteria as apply in your case.
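Why running the content comparison last saves time can be illustrated with a small sketch. It assumes a size plus content search and uses SHA-256 as a stand-in fingerprint; the function names are hypothetical and this is not a description of DoubleKiller's internals. Files are first grouped by the cheap size criterion, and only files that share their size with at least one other file are ever read.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Expensive step: read the whole file and fingerprint it (assumed SHA-256)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return (duplicate_groups, number_of_files_actually_read)."""
    # Cheap criterion first: group by size without opening any file.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    files_read = 0
    groups = []
    for candidates in by_size.values():
        if len(candidates) < 2:
            continue  # unique size, so no duplicate is possible: skip the read
        # Expensive criterion last, and only for the remaining candidates.
        by_digest = defaultdict(list)
        for path in candidates:
            by_digest[file_digest(path)].append(path)
            files_read += 1
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups, files_read
```

Every additional cheap criterion, folder restriction or filter shrinks the candidate groups further, which is what keeps the number of content comparisons low.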
Comparison criteria examples
The best combination of criteria always depends on the task you want to
perform. To help you, here are some hints on which combination to choose for
which purpose:
- If you know you have 1:1 copies of files or even complete folders
spread across your computer, you should select name, size and date comparison.
This will find files you have copied somewhere, e.g. for backup purposes, without
renaming or modifying them in any way. You can also select a content comparison
for additional safety, but this is mostly unnecessary.
- If the above applies, but you may have renamed the files, just deselect
name comparison to find identical files with different names. A content comparison
is recommended in this case to rule out false positives, i.e. different files
that have the same size and date just by coincidence.
- If you have different versions of the same files stored in different
folders, but not renamed, you should select a file name comparison only. In
this case it is strongly recommended to scan only the specific folders these
files are located in (not your complete hard disk) and to manually check the
scan results, as there might be files coincidentally sharing the same name.
You could also use a size comparison with a tolerance
of e.g. 50 kB to eliminate most false positives (the right value depends on the
file type: text files will not differ in size as much as multimedia files).
If your files follow a certain naming scheme, e.g. 'myfile[1].cad',
'myfile[2].dat', you can use the advanced
filename comparison options to exclude e.g. your numbering from the comparison.
- To scan for any files having exactly the same content, regardless of
file names or dates, select size and content. This lets you find
any intentional or unintentional duplicates, e.g. in your music archive, image
collection or downloaded files folder. If you plan to scan your complete disk,
it is strongly recommended that you manually check the results and do not
touch any files in the Windows folder or other files you are unsure about!
- To create a list of all files in the given directories, using the
given filters, deactivate all comparison criteria. This way you can use DoubleKiller
as a fast file searcher, which is especially useful when searching inside
zip archives or catalog files - or when you want to create
a catalog file.
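The "different versions" hint above combines a file name comparison that ignores a numbering scheme with a size comparison that allows a tolerance. A minimal sketch of both ideas follows. The regular expression, the function names, and the reading of 50 kB as 50 * 1024 bytes are assumptions made for illustration; in DoubleKiller itself these settings are made in the advanced comparison option windows.

```python
import re

# Assumed numbering scheme: a bracketed counter such as "[1]" or "[27]".
NUMBERING = re.compile(r"\[\d+\]")


def normalized_name(file_name, case_sensitive=False):
    """Remove the numbering part so it is excluded from the comparison."""
    stripped = NUMBERING.sub("", file_name)
    return stripped if case_sensitive else stripped.lower()


def names_match(name_a, name_b, case_sensitive=False):
    """Name criterion with the numbering excluded."""
    return normalized_name(name_a, case_sensitive) == normalized_name(
        name_b, case_sensitive
    )


def sizes_match(size_a, size_b, tolerance_bytes=50 * 1024):
    """Size criterion with an allowed difference (default: 50 kB)."""
    return abs(size_a - size_b) <= tolerance_bytes
```

With these definitions, `names_match("myfile[1].cad", "myfile[2].cad")` is true because only the counter differs, while `sizes_match(1_000_000, 1_060_000)` is false because the files differ by more than the tolerance.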
See also
- Page "Comparison Options"
- Advanced filename comparison options
- Advanced date comparison options
- Advanced size comparison options
- Advanced content comparison options