This section explains how to integrate the Dedupeer library into a distributed storage system and how to use DeFS, which has Dedupeer integrated.

Using DeFS to save disk space

To use DeFS, follow the steps below:

1 - Download DeFS here.
2 - Download Apache Cassandra from the official page and follow the installation instructions.
3 - Add the location of Cassandra's "bin" folder to the "Path" environment variable.
4 - Start Cassandra by typing "cassandra" in the command prompt.
5 - Finally, run DedupeerFileStorage.jar.
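
Steps 4 and 5 depend on order: Cassandra has to be accepting connections before DedupeerFileStorage.jar is started. The sketch below is one possible way to check this from Java before launching the jar. The class name CassandraReadiness and its methods are hypothetical and are not part of DeFS. The default port 9160 is an assumption (the default Thrift port in Cassandra releases that ship the Thrift interface); use whatever port your cassandra.yaml configures.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of DeFS or Cassandra.
final class CassandraReadiness {

    // Returns true if something accepts a TCP connection on host:port.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or host unreachable
        }
    }

    // Polls the port until it answers or the attempts run out.
    static boolean waitUntilListening(String host, int port, int attempts, long pauseMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (isListening(host, port, 1000)) {
                return true;
            }
            Thread.sleep(pauseMillis);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: 9160 is the Thrift port; pass another port as the first argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9160;
        boolean up = waitUntilListening("127.0.0.1", port, 30, 1000);
        System.out.println(up
                ? "Cassandra is accepting connections on port " + port
                : "Nothing is listening on port " + port + "; start Cassandra first");
    }
}
```

A TCP connection only shows that a process is listening on the port. It does not prove that the process is Cassandra or that it has finished starting up.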

Using Dedupeer in your storage system

To use Dedupeer in your storage system, follow the steps below:

1 - Download the Dedupeer library here.
2 - Add the library to the storage system.
3 - Implement a Thrift client to call the deduplication service that runs in the library. See the example below.
4 - Start the Thrift server before calling the client method that performs the deduplication.
5 - Use the class com.dedupeer.checksum.Checksum32 to calculate the weak hash in your storage system. The strong hash used by Dedupeer is SHA-1.
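
To illustrate the weak/strong hash pairing mentioned in step 5, the sketch below shows a generic rsync-style 32-bit rolling checksum next to SHA-1 from the standard java.security.MessageDigest API. It is not the implementation of com.dedupeer.checksum.Checksum32, whose methods are not described here. The class HashSketch and its methods are hypothetical names used only for illustration, and the checksum formula is an assumption about what a 32-bit weak hash typically looks like.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration; NOT the Dedupeer Checksum32 class.
final class HashSketch {
    private static final int MOD = 1 << 16;

    // rsync-style weak checksum over data[offset, offset + len).
    static int weak(byte[] data, int offset, int len) {
        int a = 0;
        int b = 0;
        for (int i = 0; i < len; i++) {
            int v = data[offset + i] & 0xff;
            a = (a + v) % MOD;
            b = (b + (len - i) * v) % MOD;
        }
        return a | (b << 16);
    }

    // Slides the window one byte forward without rescanning it.
    static int roll(int sum, int len, byte out, byte in) {
        int a = sum & 0xffff;
        int b = sum >>> 16;
        a = Math.floorMod(a - (out & 0xff) + (in & 0xff), MOD);
        b = Math.floorMod(b - len * (out & 0xff) + a, MOD);
        return a | (b << 16);
    }

    // Strong hash: SHA-1 as a lowercase hex string.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte x : md.digest(data)) {
                hex.append(String.format("%02x", x));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "the quick brown fox".getBytes(StandardCharsets.UTF_8);
        int window = 4;
        int sum = weak(data, 0, window);
        for (int k = 1; k + window <= data.length; k++) {
            sum = roll(sum, window, data[k - 1], data[k + window - 1]);
            System.out.println(k + ": rolled=" + sum + " recomputed=" + weak(data, k, window));
        }
        System.out.println("SHA-1: " + sha1Hex(data));
    }
}
```

The weak checksum is cheap to update byte by byte, so it can be used to find candidate matches at every offset. The more expensive SHA-1 is then computed only for those candidates to confirm that the chunks are really identical.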

Example of a Thrift Client using Dedupeer