Deduplicator uses size comparison and fxhash (a fast, non-cryptographic hashing algorithm) to quickly scan through large numbers of files and find duplicates. It is also highly parallel, built on rayon and dashmap. I was able to scan through 120GB of files (videos, PDFs, images) in ~300ms. Check out the benchmarks below.

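The two-pass idea described above (compare sizes first, hash only the files that share a size) can be sketched roughly as follows. This is not deduplicator's actual code: the real tool uses fxhash and parallelizes with rayon and dashmap, while this dependency-free sketch substitutes std's `DefaultHasher` and runs sequentially.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::PathBuf;

// Sketch of the size-then-hash strategy. Files with a unique size cannot
// have a duplicate, so only files sharing a size are read and hashed.
fn find_duplicates(paths: &[PathBuf]) -> io::Result<Vec<Vec<PathBuf>>> {
    // Pass 1: group candidate files by size.
    let mut by_size: HashMap<u64, Vec<&PathBuf>> = HashMap::new();
    for p in paths {
        by_size.entry(fs::metadata(p)?.len()).or_default().push(p);
    }

    // Pass 2: hash contents, but only within size groups of 2+ files.
    let mut by_hash: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for group in by_size.values().filter(|g| g.len() > 1) {
        for p in group {
            let mut hasher = DefaultHasher::new();
            fs::read(p)?.hash(&mut hasher);
            by_hash.entry(hasher.finish()).or_default().push((*p).clone());
        }
    }

    // Any hash bucket with 2+ entries is a duplicate group.
    Ok(by_hash.into_values().filter(|g| g.len() > 1).collect())
}

fn main() -> io::Result<()> {
    // Demo on three small temp files: two identical, one different.
    let dir = std::env::temp_dir().join("dedup_sketch_demo");
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("a.txt"), b"same content")?;
    fs::write(dir.join("b.txt"), b"same content")?;
    fs::write(dir.join("c.txt"), b"different")?;
    let files = vec![dir.join("a.txt"), dir.join("b.txt"), dir.join("c.txt")];
    let groups = find_duplicates(&files)?;
    println!("{} duplicate group(s)", groups.len());
    Ok(())
}
```

Skipping the hash pass for files with unique sizes is what keeps scans fast even on large directories: only a small fraction of files typically need their contents read at all.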
## benchmarks
| Command | Dir Size | Mean [ms] | Min [ms] | Max [ms] | Relative |
* The last entry is slower because of the sheer number of files deduplicator had to scan (~660,895 files). The average size of the files rarely affects deduplicator's performance.
These benchmarks were run using [hyperfine](https://github.com/sharkdp/hyperfine). Here are the specs of the machine used to benchmark deduplicator: