How state compression - the tech behind compressed NFTs - works, and how to implement it in your own Nexis Native Chain programs.
A concurrent Merkle tree's size is determined by three values: max depth, max buffer size, and canopy depth. The max depth is the maximum number of hops to get from any leaf to the root of the tree. Because the tree is binary, the max depth determines how many leaves the tree can hold: `2 ^ maxDepth`.
The max buffer size is effectively the maximum number of concurrent changes
that you can make to a tree within a single slot with the root hash still being
valid. When multiple transactions are submitted in the same slot, each of which
is competing to update leaves on a standard Merkle tree, only the first to run
will be valid. This is because that “write” operation will modify the hash
stored in the account. Subsequent transactions in the same slot will be trying
to validate their data against a now-outdated hash. A concurrent Merkle tree
instead keeps a running changelog of these modifications in a buffer. This
allows the State Compression Program to validate multiple data writes in the
same slot because it can look up what the previous hashes were in the buffer and
compare against the appropriate hash.
The canopy depth is the number of proof nodes that are stored onchain for
any given proof path. Verifying any leaf requires the complete proof path for
the tree. The complete proof path is made up of one proof node for every “layer”
of the tree, i.e. a max depth of 14 means there are 14 proof nodes. Every proof
node passed into the program adds 32 bytes to a transaction, so large trees
would quickly exceed the maximum transaction size limit. Caching proof nodes
onchain in the canopy means fewer proof nodes have to be passed in with each
transaction, which keeps transaction sizes down and improves program composability.
Each of these three values, max depth, max buffer size, and canopy depth, comes
with a tradeoff. Increasing any of these values increases the size
of the account used to store the tree, thus increasing the cost of creating the
tree.
Choosing the max depth is fairly straightforward as it directly relates to the
number of leaves and therefore the amount of data you can store. If you need 1
million cNFTs on a single tree where each cNFT is a leaf of the tree, find the
max depth that makes the following expression true: `2^maxDepth > 1 million`.
The answer is 20.
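To sanity-check that math, here is a tiny standalone Rust snippet (illustrative only, not part of any program in this lesson) that finds the smallest max depth for a target leaf count:

```rust
/// Smallest max depth whose capacity (2^depth leaves) can hold `required_leaves`.
fn min_max_depth(required_leaves: u64) -> u32 {
    let mut depth = 0;
    while (1u64 << depth) < required_leaves {
        depth += 1;
    }
    depth
}

fn main() {
    // 2^19 = 524,288 is too small, but 2^20 = 1,048,576 fits 1 million leaves.
    assert_eq!(min_max_depth(1_000_000), 20);
    println!("Max depth for 1M cNFTs: {}", min_max_depth(1_000_000));
}
```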
Choosing a max buffer size is effectively a question of throughput: how many
concurrent writes do you need? The larger the buffer, the higher the throughput.
Lastly, the canopy depth will determine your program’s composability. State
compression pioneers have made it clear that omitting a canopy is a bad idea.
Program A can’t call your state-compressed program B if doing so maxes out the
transaction size limits. Remember, program A also has required accounts and data
in addition to required proof paths, each of which take up transaction space.
When it comes to implementation, the first step is defining the data you plan to compress. For example, an app that stores messages between users might use a `Message` struct that looks as follows:
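Here is a minimal sketch of such a struct. The field names (`sender`, `recipient`, `message`) are illustrative assumptions, mirroring the `NoteLog` schema used later in the demo:

```rust
use anchor_lang::prelude::*;

// Illustrative message schema; field names are assumptions mirroring the
// NoteLog schema defined later in this lesson.
#[derive(AnchorSerialize)]
pub struct Message {
    pub leaf_node: [u8; 32], // the hash stored on the Merkle tree leaf
    pub sender: Pubkey,      // who sent the message
    pub recipient: Pubkey,   // who receives the message
    pub message: String,     // the message text itself
}
```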
Once a Merkle tree account has been allocated, you initialize it by building a CPI to the `init_empty_merkle_tree` instruction on the State Compression Program. Since this requires the max depth and max buffer size, these will need to be passed in as arguments to the instruction.
Remember, the max depth refers to the maximum number of hops to get from any
leaf to the root of the tree. Max buffer size refers to the amount of space
reserved for storing a changelog of tree updates. This changelog is used to
ensure that your tree can support concurrent updates within the same block.
For example, if we were initializing a tree for storing messages between users,
the instruction might look like this:
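The sketch below shows one way this could look. It assumes the `spl-account-compression` crate's Anchor CPI bindings (the `Initialize` accounts struct and the `init_empty_merkle_tree` helper, imported as shown later in the demo) and a hypothetical `MessageAccounts` context shaped like the `NoteAccounts` struct built later in this lesson:

```rust
pub fn create_messages_tree(
    ctx: Context<MessageAccounts>, // hypothetical context: tree_authority, merkle_tree, log_wrapper, compression_program
    max_depth: u32,                // max depth of the Merkle tree
    max_buffer_size: u32,          // max buffer size of the Merkle tree
) -> Result<()> {
    // The Merkle tree address doubles as the seed for the tree authority PDA
    let merkle_tree = ctx.accounts.merkle_tree.key();
    let signer_seeds: &[&[&[u8]]] = &[&[
        merkle_tree.as_ref(),
        &[ctx.bumps.tree_authority], // older Anchor versions: *ctx.bumps.get("tree_authority").unwrap()
    ]];

    // CPI to the State Compression Program's init_empty_merkle_tree instruction
    let cpi_ctx = CpiContext::new_with_signer(
        ctx.accounts.compression_program.to_account_info(),
        Initialize {
            authority: ctx.accounts.tree_authority.to_account_info(),
            merkle_tree: ctx.accounts.merkle_tree.to_account_info(),
            noop: ctx.accounts.log_wrapper.to_account_info(),
        },
        signer_seeds,
    );
    init_empty_merkle_tree(cpi_ctx, max_depth, max_buffer_size)?;

    Ok(())
}
```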
With the tree initialized, you can start adding data hashes. This involves passing the uncompressed data to an instruction on your program that hashes it, logs it to the Noop program, and uses the State Compression Program's `append` instruction to add the hash to the tree. The following discusses what your instruction needs to do in depth:

1. Use the `hashv` function from the `keccak` crate to hash the data. In most cases, you'll want to also hash the owner or authority of the data as well to ensure that it can only be modified by the proper authority.
2. Create an object representing the data you wish to log, then call `wrap_application_data_v1` to issue a CPI to the Noop program with this object. This ensures that the uncompressed data is readily available to any client looking for it. For broad use cases like cNFTs, that would be indexers. You might also create your own observing client to simulate what indexers are doing but specific to your application.
3. Build and issue a CPI to the State Compression Program's `append` instruction. This takes the hash computed in step 1 and adds it to the next available leaf on your Merkle tree. Just as before, this requires the Merkle tree address and the tree authority bump as signature seeds.
Updating an existing hash works much the same way, with a verification step added up front:

1. Verify that the signer of the update transaction is the true owner or authority of the leaf at the given index. Since the data is compressed as a hash on the leaf, we can't simply compare the `authority` public key to a stored value. Instead, we need to compute the previous hash using the old data and the `authority` listed in the account validation struct. We then build and issue a CPI to the State Compression Program's `verify_leaf` instruction using our computed hash.
2. Use the `hashv` function from the `keccak` crate to hash the new data and the update authority, each as their corresponding byte representation.
3. Create a log object for the new data and call `wrap_application_data_v1` to issue a CPI to the Noop program.
4. Build and issue a CPI to the State Compression Program's `replace_leaf` instruction. This uses the old hash, the new hash, and the leaf index to replace the data of the leaf at the given index with the new hash. Just as before, this requires the Merkle tree address and the tree authority bump as signature seeds.
The State Compression Program doesn't provide a `delete`
instruction. Instead, you'll want to overwrite leaf data with
data that marks it as “deleted.” The specific data will depend on your
use case and security concerns. Some may opt to set all data to 0, whereas
others might store a static string that all “deleted” items will have in common.
To read compressed data back from a client, you can use the `deserializeApplicationDataEvent` function from the `@nexis-network/spl-account-compression` JS package to get the logs, then deserialize them using Borsh. The `getNoteLog` helper in the demo's `utils.ts` file (described later) follows this same pattern.
Now let's put this into practice with a demo: a Compressed Notes program that stores note hashes on a Merkle tree. To call the State Compression Program from our Anchor program, we'll use the `spl-account-compression` crate with the `cpi` feature enabled. Let's add it as a dependency in `programs/compressed-notes/Cargo.toml`.
Both the Noop program and the State Compression Program also need to be available during tests, so add them to `Anchor.toml` in the root directory so they get cloned to our local cluster.
We'll do all of our work in the `lib.rs` file for the rest of the demo. Remove the `initialize` instruction and the `Initialize` accounts struct, then add the imports shown in the code snippet below (be sure to put in your program id):
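A sketch of the imports we'll need. The exact paths assume the `spl-account-compression` crate's `cpi` module layout; replace the placeholder with your own program id:

```rust
use anchor_lang::{
    prelude::*,
    solana_program::keccak, // used to hash the leaf data
};
use spl_account_compression::{
    cpi::{
        accounts::{Initialize, Modify, VerifyLeaf},
        append, init_empty_merkle_tree, replace_leaf, verify_leaf,
    },
    program::SplAccountCompression,
    wrap_application_data_v1,
    Noop,
};

declare_id!("YOUR_PROGRAM_ID_GOES_HERE");
```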
Note that we're putting everything in the `lib.rs` file. This simplifies the explanations a bit. You're welcome to modify the structure as you will.
Feel free to build before continuing. This ensures your environment is working
properly and shortens future build times.
Next, we'll define the `NoteLog` schema - the full note data we'll log to the Noop program each time a note is written. It needs the following fields:

- `leaf_node` - a 32-byte array representing the hash stored on the leaf node
- `owner` - the public key of the note owner
- `note` - the string representation of the note

Since this data is only logged and never stored in an account, we use the `AnchorSerialize` derive macro rather than the `account` macro.
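Putting that together, a sketch of the schema (the `new` constructor is an optional convenience, not a requirement):

```rust
// Schema for the full note data logged to the Noop program.
// It is only ever logged, never stored in an account, hence AnchorSerialize.
#[derive(AnchorSerialize)]
pub struct NoteLog {
    pub leaf_node: [u8; 32], // the hash stored on the leaf node
    pub owner: Pubkey,       // the public key of the note owner
    pub note: String,        // the plain-text note
}

impl NoteLog {
    // Constructs a new note log from a leaf hash, owner, and note
    pub fn new(leaf_node: [u8; 32], owner: Pubkey, note: String) -> Self {
        Self { leaf_node, owner, note }
    }
}
```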
Next, let's create the `NoteAccounts` struct for our account validation. It'll need the following accounts:

- `owner` - the creator and owner of the note; should be a signer on the transaction
- `tree_authority` - the authority for the Merkle tree; used for signing compression-related CPIs
- `merkle_tree` - the address of the Merkle tree used to store the note hashes; will be unchecked since it is validated by the State Compression Program
- `log_wrapper` - the address of the Noop program
- `compression_program` - the address of the State Compression Program
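A sketch of that struct. The exact constraints are assumptions; the `tree_authority` PDA is seeded with the Merkle tree address, matching the signer seeds used in the instructions that follow:

```rust
#[derive(Accounts)]
pub struct NoteAccounts<'info> {
    // The creator/owner of the note; must sign the transaction
    #[account(mut)]
    pub owner: Signer<'info>,

    // PDA acting as the Merkle tree authority, used only for signing compression CPIs
    #[account(
        seeds = [merkle_tree.key().as_ref()],
        bump,
    )]
    pub tree_authority: SystemAccount<'info>,

    // The Merkle tree account storing the note hashes
    /// CHECK: validated by the State Compression Program
    #[account(mut)]
    pub merkle_tree: UncheckedAccount<'info>,

    // The Noop program, used to log the uncompressed note data
    pub log_wrapper: Program<'info, Noop>,

    // The State Compression Program
    pub compression_program: Program<'info, SplAccountCompression>,
}
```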
Now let's build the `create_note_tree` instruction. Remember, clients will have already allocated the Merkle tree account but will use this instruction to initialize it.
All this instruction needs to do is build a CPI to invoke the `init_empty_merkle_tree` instruction on the State Compression Program. To do this, it needs the accounts listed in the `NoteAccounts` account validation struct. It also needs two additional arguments:

- `max_depth` - the max depth of the Merkle tree
- `max_buffer_size` - the max buffer size of the Merkle tree
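This mirrors the earlier messaging sketch. A minimal version, using the imports from above (bump access syntax may differ slightly by Anchor version):

```rust
pub fn create_note_tree(
    ctx: Context<NoteAccounts>,
    max_depth: u32,       // max depth of the Merkle tree
    max_buffer_size: u32, // max buffer size of the Merkle tree
) -> Result<()> {
    // Signer seeds for the tree authority PDA
    let merkle_tree = ctx.accounts.merkle_tree.key();
    let signer_seeds: &[&[&[u8]]] =
        &[&[merkle_tree.as_ref(), &[ctx.bumps.tree_authority]]];

    // CPI to initialize an empty Merkle tree with the given max depth and buffer size
    let cpi_ctx = CpiContext::new_with_signer(
        ctx.accounts.compression_program.to_account_info(),
        Initialize {
            authority: ctx.accounts.tree_authority.to_account_info(),
            merkle_tree: ctx.accounts.merkle_tree.to_account_info(),
            noop: ctx.accounts.log_wrapper.to_account_info(),
        },
        signer_seeds,
    );
    init_empty_merkle_tree(cpi_ctx, max_depth, max_buffer_size)?;

    Ok(())
}
```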
Next, let's create the `append_note` instruction. This instruction needs to take the raw note as a String and compress it into a hash that we'll store on the Merkle tree. We'll also log the note to the Noop program so the entirety of the data exists within the chain's state.
The steps here are as follows:
1. Use the `hashv` function from the `keccak` crate to hash the note and owner, each as their corresponding byte representation. It's crucial that you hash the owner as well as the note. This is how we'll verify note ownership before updates in the update instruction.
2. Create an instance of the `NoteLog` struct using the hash from step 1, the owner's public key, and the raw note as a String. Then call `wrap_application_data_v1` to issue a CPI to the Noop program, passing the instance of `NoteLog`. This ensures the entirety of the note (not just the hash) is readily available to any client looking for it. For broad use cases like cNFTs, that would be indexers. You might create your own observing client to simulate what indexers are doing but for your own application.
3. Build and issue a CPI to the State Compression Program's `append` instruction. This takes the hash computed in step 1 and adds it to the next available leaf on your Merkle tree. Just as before, this requires the Merkle tree address and the tree authority bump as signature seeds.
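A sketch of `append_note` following those three steps. It uses the imports shown earlier; the CPI account struct name (`Modify`) is an assumption based on the `spl-account-compression` crate:

```rust
pub fn append_note(ctx: Context<NoteAccounts>, note: String) -> Result<()> {
    // Step 1: hash the note + owner so only the owner can later modify the leaf
    let leaf_node =
        keccak::hashv(&[note.as_bytes(), ctx.accounts.owner.key().as_ref()]).to_bytes();

    // Step 2: log the full note (not just the hash) through the Noop program
    let note_log = NoteLog::new(leaf_node, ctx.accounts.owner.key(), note);
    wrap_application_data_v1(note_log.try_to_vec()?, &ctx.accounts.log_wrapper)?;

    // Step 3: append the hash to the next available leaf via CPI
    let merkle_tree = ctx.accounts.merkle_tree.key();
    let signer_seeds: &[&[&[u8]]] =
        &[&[merkle_tree.as_ref(), &[ctx.bumps.tree_authority]]];
    let cpi_ctx = CpiContext::new_with_signer(
        ctx.accounts.compression_program.to_account_info(),
        Modify {
            authority: ctx.accounts.tree_authority.to_account_info(),
            merkle_tree: ctx.accounts.merkle_tree.to_account_info(),
            noop: ctx.accounts.log_wrapper.to_account_info(),
        },
        signer_seeds,
    );
    append(cpi_ctx, leaf_node)?;

    Ok(())
}
```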
Next, let's create the `update_note` instruction. This should replace an existing leaf with a new hash representing the new updated note data. For this to work, we'll need the following parameters:
- `index` - the index of the leaf we are going to update
- `root` - the root hash of the Merkle tree
- `old_note` - the string representation of the old note we're updating
- `new_note` - the string representation of the new note we want to update to

The steps are similar to `append_note`, but with some minor additions and modifications:
1. The first step is new: verify that the `owner` calling this function is the true owner of the leaf at the given index. Since the data is compressed as a hash on the leaf, we can't simply compare the `owner` public key to a stored value. Instead, we need to compute the previous hash using the old note data and the `owner` listed in the account validation struct. We then build and issue a CPI to the State Compression Program's `verify_leaf` instruction using our computed hash.
2. This step is the same as the first step of the `append_note` instruction. Use the `hashv` function from the `keccak` crate to hash the new note and its owner, each as their corresponding byte representation.
3. This step is the same as the second step of the `append_note` instruction. Create an instance of the `NoteLog` struct using the hash from step 2, the owner's public key, and the new note as a string. Then call `wrap_application_data_v1` to issue a CPI to the Noop program, passing the instance of `NoteLog`.
4. This step is slightly different from the last step of the `append_note` instruction. Build and issue a CPI to the State Compression Program's `replace_leaf` instruction. This uses the old hash, the new hash, and the leaf index to replace the data of the leaf at the given index with the new hash. Just as before, this requires the Merkle tree address and the tree authority bump as signature seeds.
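A sketch of `update_note` covering those four steps. The proof path accounts are passed through as remaining accounts on both CPIs; the CPI helper names are assumptions based on the `spl-account-compression` crate:

```rust
pub fn update_note(
    ctx: Context<NoteAccounts>,
    index: u32,
    root: [u8; 32],
    old_note: String,
    new_note: String,
) -> Result<()> {
    let owner = ctx.accounts.owner.key();

    // Step 1: recompute the old leaf hash and verify it (and thus the owner)
    // against the tree. The proof path is supplied as remaining accounts.
    let old_leaf = keccak::hashv(&[old_note.as_bytes(), owner.as_ref()]).to_bytes();
    let verify_ctx = CpiContext::new(
        ctx.accounts.compression_program.to_account_info(),
        VerifyLeaf {
            merkle_tree: ctx.accounts.merkle_tree.to_account_info(),
        },
    )
    .with_remaining_accounts(ctx.remaining_accounts.to_vec());
    verify_leaf(verify_ctx, root, old_leaf, index)?;

    // Step 2: hash the new note + owner
    let new_leaf = keccak::hashv(&[new_note.as_bytes(), owner.as_ref()]).to_bytes();

    // Step 3: log the full new note through the Noop program
    let note_log = NoteLog::new(new_leaf, owner, new_note);
    wrap_application_data_v1(note_log.try_to_vec()?, &ctx.accounts.log_wrapper)?;

    // Step 4: replace the old leaf hash with the new one
    let merkle_tree = ctx.accounts.merkle_tree.key();
    let signer_seeds: &[&[&[u8]]] =
        &[&[merkle_tree.as_ref(), &[ctx.bumps.tree_authority]]];
    let replace_ctx = CpiContext::new_with_signer(
        ctx.accounts.compression_program.to_account_info(),
        Modify {
            authority: ctx.accounts.tree_authority.to_account_info(),
            merkle_tree: ctx.accounts.merkle_tree.to_account_info(),
            noop: ctx.accounts.log_wrapper.to_account_info(),
        },
        signer_seeds,
    )
    .with_remaining_accounts(ctx.remaining_accounts.to_vec());
    replace_leaf(replace_ctx, root, old_leaf, new_leaf, index)?;

    Ok(())
}
```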
For our tests, we'll need the `@nexis-network/spl-account-compression` package. Go ahead and install it.
To simplify testing, create a `utils.ts` file in the `tests` directory with three helpers:
- `NoteLog` - a class representing the note log we'll find in the Noop program logs. We've also added the borsh schema as `NoteLogBorshSchema` for deserialization.
- `getHash` - a function that creates a hash of the note and note owner so we can compare it to what we find on the Merkle tree
- `getNoteLog` - a function that looks through the provided transaction's logs, finds the Noop program logs, then deserializes and returns the corresponding note log

With the utilities in place, we can write the tests. We'll write four of them:

1. Create Note Tree - creates the Merkle tree for storing note hashes
2. Add Note - calls the `append_note` instruction
3. Add Max Size Note - calls the `append_note` instruction with a note that maxes out the 1232 bytes allowed in a single transaction
4. Update First Note - calls the `update_note` instruction to modify the first note we added

In the test file, start by importing what we need from `@nexis-network/web3.js`, `@nexis-network/spl-account-compression`, and our own utils file.
Start with the Create Note Tree test. This test will do two things:

1. Allocate a new account for the Merkle tree with the space needed for the chosen max depth and max buffer size
2. Initialize that account using our program's `createNoteTree` instruction
Next, write the Add Note test. It should call `append_note` with `firstNote`, then check that the onchain hash matches our computed hash and that the note log matches the text of the note we passed into the instruction.
Then write the Add Max Size Note test. It is the same as the previous
test, but with the second note.
Finally, write the Update First Note test. This is slightly more complex than adding a note. We'll do the following:

1. Fetch the Merkle tree's current root hash, since the instruction requires it
2. Call the `update_note` instruction of our program, passing in the index 0 (for the first note), the Merkle tree root, the first note, and the updated data. Remember, it needs the first note and the root because the program must verify the entire proof path for the note's leaf before it can be updated.

Run `anchor test` and you should get four passing tests.
If you’re running into issues, feel free to go back through some of the demo or
look at the full solution code in the
Compressed Notes repository (see the `solution` branch on GitHub).