Merkle tree based on blake3

blake3 is built on a Merkle tree, but its public interface does not expose that tree.

bao implements blake3 streaming verification, but the size of the underlying chunks cannot be changed (see the issue: support larger "chunk groups" for reduced space overhead).

As a result, bao consumes about 6% extra storage space to record the Merkle tree, which is a significant overhead for a distributed content index.

So I implemented blake3_merkle, which derives one 32-byte hash for every 1MB of content, for an additional storage overhead of about 0.3‱.
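The two overhead figures can be checked with a little arithmetic. The sketch below is only an illustration, not part of blake3_merkle: the `overhead` helper and the constant names are hypothetical, 1MB is taken as 1 MiB, and the bao figure assumes roughly one 64-byte parent node (two 32-byte child hashes) per 1024-byte chunk.

```rust
/// blake3 hash output size in bytes.
const HASH_LEN: u64 = 32;
/// blake3 chunk size in bytes.
const CHUNK_LEN: u64 = 1024;
/// Content covered by one blake3_merkle hash (1MB, taken here as 1 MiB).
const NODE_LEN: u64 = 1024 * 1024;

/// Ratio of tree bytes stored to content bytes covered (hypothetical helper).
fn overhead(bytes_stored: u64, bytes_covered: u64) -> f64 {
    bytes_stored as f64 / bytes_covered as f64
}

fn main() {
    // Assumption: bao keeps about one 64-byte parent node per 1024-byte chunk.
    let bao = overhead(2 * HASH_LEN, CHUNK_LEN);
    // One 32-byte hash per 1 MiB of content.
    let merkle = overhead(HASH_LEN, NODE_LEN);

    println!("bao:           {:.2}%", bao * 100.0); // 6.25%
    println!("blake3_merkle: {:.3}‱", merkle * 10_000.0); // 0.305‱
}
```

Under these assumptions the stored tree shrinks by a factor of about 2048, which is where the gap between roughly 6% and roughly 0.3‱ comes from.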

The root hash computed from this Merkle tree is identical to the blake3 hash of the same content.

When the content is no larger than 1MB, the Merkle tree has only one node, and the hash of that node equals the blake3 hash of the content.
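To make the node layout concrete, here is a minimal sketch of how many 1MB nodes a given content length would produce. It is not the crate's API: `node_count` and `full_node_depth` are hypothetical helpers, 1MB is taken as 1 MiB, empty content is assumed to count as one node, and the depth of a full node is assumed to be the base-2 logarithm of the number of 1024-byte blake3 chunks it covers.

```rust
/// blake3 chunk size in bytes.
const CHUNK_LEN: u64 = 1024;
/// Content covered by one node (1MB, taken here as 1 MiB).
const NODE_LEN: u64 = 1024 * 1024;

/// Number of nodes for `len` bytes of content (hypothetical helper).
fn node_count(len: u64) -> u64 {
    // Ceiling division; empty content is assumed to still yield one node.
    ((len + NODE_LEN - 1) / NODE_LEN).max(1)
}

/// Assumed depth of a full node: log2 of the chunks it covers.
fn full_node_depth() -> u32 {
    // NODE_LEN / CHUNK_LEN is a power of two, so trailing_zeros is its log2.
    (NODE_LEN / CHUNK_LEN).trailing_zeros()
}

fn main() {
    for len in [1u64, NODE_LEN, NODE_LEN + 1, 3 * NODE_LEN + 50_000] {
        println!("{len} bytes -> {} node(s)", node_count(len));
    }
    println!("depth of a full node: {}", full_node_depth());
}
```

With these assumptions, anything up to 1 MiB maps to a single node, and a file slightly larger than 3 MiB maps to four nodes, three of them full.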

./examples/main.rs is as follows:

use blake3_merkle::Merkle;

use std::{env, error::Error, fs::File, io::copy};

fn main() -> Result<(), Box<dyn Error>> {
  let fpath = env::current_dir()?.join("test.pdf");

  let mut blake3 = blake3::Hasher::new();
  copy(&mut File::open(&fpath)?, &mut blake3)?;

  let mut merkle = Merkle::new();
  copy(&mut File::open(&fpath)?, &mut merkle)?;
  merkle.finalize();
  dbg!(&merkle.li);
  dbg!(merkle.blake3());
  dbg!(blake3.finalize());
  Ok(())
}

Run ./example.main.sh and the output is as follows:

[examples/main.rs:14] &merkle.li = [
    HashDepth {
        hash: Hash(
            "eb896f431b7ff8acb4749b54981d461359a01ded0261fa0da856dd28bf29d3b3",
        ),
        depth: 10,
    },
    HashDepth {
        hash: Hash(
            "4a84cc85f03f47a7c32755f8d9d81c5d3f3e04548ee8129fd480cb71c7dbc5b4",
        ),
        depth: 10,
    },
    HashDepth {
        hash: Hash(
            "fbfe78e550b355cb6775e324c4fed7eb987084b115dca599aaf40056bfb031c3",
        ),
        depth: 10,
    },
    HashDepth {
        hash: Hash(
            "392878c3bdc9c315d6cc8a1721d8cd0a39e49ac8716f4cb8cdf6cf83fbb666f5",
        ),
        depth: 6,
    },
]
[examples/main.rs:15] merkle.blake3() = Hash(
    "74a79d0bc37dcac64c493e872252f19e8bdb32dee306481a6827fa037b378c76",
)
[examples/main.rs:16] blake3.finalize() = Hash(
    "74a79d0bc37dcac64c493e872252f19e8bdb32dee306481a6827fa037b378c76",
)
Contributors: gcxfd