Subbit takes the bench

Intro

We’ve been busy!

A late draft of the subbit validator, the backbone of Subbit.xyz, is now ready. In fact, by the time you’re reading this, it should have been released into the wild!

For the brave wanting to go deeper, you can also check out our

In this post we want to talk about some of the testing and benchmarking we’ve done.

Pssss! To our reviewers

According to our milestone M1, it must be demonstrated that:

The SC aligns with spec. SC builds. Tests succeed. Benchmarks run.
The output of benchmarks are numbers providing evidence of the feasibility and costing of “typical txs”
There are at least 10 tests, with some covering positive conditions, some negative.

There is some acknowledgement that this is a proxy, albeit a poor one, for a “sensible” amount of testing.

This post provides some commentary on how we think we’ve met these aims.

SC/ Spec alignment

The spec is structured in a way that roughly mirrors that of the implementation. This likely makes the spec harder to read and digest than some alternative presentation. However, once this obstacle has been overcome, it should be much easier to assure oneself that the code does all the things the spec indicates it should.

In fact, key parts of the spec are inlined in the code. Small divergences will creep in and will be fixed on an ongoing basis, particularly in M6. Large divergences should be fixed immediately.

Tests

We run some tests to check that things that we expect to succeed succeed, and the things we expect to fail fail.

Steps succeed

We fuzz each step individually:

$ aiken check -m "step.{..}"
    Compiling kompact-io/subbit-xyz 0.0.0 (.)
    Resolving kompact-io/subbit-xyz
      Fetched 1 package in 0.05s from cache
    Compiling aiken-lang/stdlib v2.2.0 (./build/packages/aiken-lang-stdlib)
    Compiling aiken-lang/fuzz main (./build/packages/aiken-lang-fuzz)
   Collecting all tests scenarios within module(s): *step*
      Testing ...

┍━ mark/steps ━━━━━━━━━━━━━━━━━━━
│ PASS [after 100 tests] test_add
│ PASS [after 100 tests] test_sub
│ PASS [after 100 tests] test_close
│ PASS [after 100 tests] test_settle
│ PASS [after 100 tests] test_end
│ PASS [after 100 tests] test_expire
┕ with --seed=2867413678 → 6 tests | 6 passed | 0 failed

The fuzzers for these aren’t the most elaborate, but we deem them good enough.

Steps fail

We do some sanity checks that things that ought to fail do. For every step we have a {{step}}_not_signed to check the expected party has signed. In addition we have the following:

add_less : An add step in which the continuing output has less funds.
sub_too_much : A sub in which the continuing output has less funds than should be allowed by the IOU.
sub_bad_id : A sub in which the IOU is for a different subbit.
sub_bad_sig : A sub in which the IOU has a bad signature.
close_bad_data : A close in which the continuing output datum has changed, other than the stage.
close_bad_expire : A close in which the expire at timestamp is too soon.
expire_too_soon : An expire in which the time lower bound is not after expire at.

┍━ mark/fails ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
│ PASS [mem: 14310, cpu:  4280720] add_less
│ PASS [mem:   905, cpu:  2340991] add_not_singed
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 34001, cpu: 70418501] sub_too_much
│ PASS [mem:  1355, cpu:  3311452] sub_not_signed
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 22268, cpu: 68692619] sub_bad_id
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 22458, cpu: 68912045] sub_bad_sig
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 21780, cpu: 10100087] close_bad_data
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 34998, cpu: 13621027] close_bad_expire
│ PASS [mem: 21518, cpu:  7014169] close_not_signed
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 21451, cpu:  6756441] settle_not_signed
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 23017, cpu:  7187471] expire_too_soon
│ PASS [mem:  1194, cpu:  2917665] expire_not_signed
│ · with traces
│ | the validator crashed / exited prematurely
│ PASS [mem: 15268, cpu:  4647455] end_not_signed
┕━━━━━━━━━━━━━━━━━━━ 13 tests | 13 passed | 0 failed

Mutual

A mutual step should succeed if both partners sign the tx, and should fail otherwise.

┍━ mark/mutual ━━━━━━━━━━━━━━━━━━━━━
│ PASS [after 100 tests] test_mutual
│ PASS [after 100 tests] fail_mutual
┕ with --seed=3505391875 → 2 tests | 2 passed | 0 failed

Bench

We want cost estimates of “typical” txs, namely txs in which steps are done in batch. For Subbit.xyz, each subbit spend is one invocation of the script.

We start with a preamble on Cardano txs, and the limits and costs we face.

Preamble on Cardano Txs

Cardano txs have fees that increase with size and complexity. Moreover, there are upper bounds on these attributes that cannot be exceeded.

See this guide for a description and further signposting on parameters, their meanings, and purpose.

The more inputs or outputs a tx has, the larger it will be. Each validator that is invoked adds to the complexity budget of the tx, and may also contribute to the tx size or to the additional reference input budget.

Bounds

maxTxSize - Max total bytes of a tx. This excludes the size of inputs beyond their output reference, and likewise for reference inputs.
maxTxExecutionUnits.memory (aka exUnitsMem) - Max mem units that can be accumulated in a single tx.
maxTxExecutionUnits.steps (aka exUnitsSteps) - Max cpu units that can be accumulated in a single tx.

At the time of writing:

maxTxSize = 16384
maxTxExecutionUnits.memory = 14000000
maxTxExecutionUnits.steps = 10000000000

Fees

The relevant params with their current values are:

txFeePerByte aka minFeeA := 44
txFeeFixed aka minFeeB := 155381
executionUnitPrices := { priceMemory: 5.77e-2, priceSteps: 7.21e-5 }
minFeeRefScriptCostPerByte := 15

Each time a script is executed as part of the transaction validation process, it adds to the memory and cpu budget.

Let size be the tx size (in bytes), and mem and cpu be the total units of memory and cpu usage. Let scriptSize be the total bytes of ref scripts in either the inputs or reference inputs. Then the fee computation is:

fees
  = txFeeFixed
  + txFeePerByte * size
  + mem * priceMemory
  + cpu * priceSteps
  + scriptSize * minFeeRefScriptCostPerByte

There are two simplifications in the above:

By first totalling the mem and cpu values we introduce a potential rounding error, so the above may give a very small underestimate. We should instead take the ceil of the potentially non-integer cost corresponding to each redeemer.
The reference script cost calculation is generally non-linear, but for our purposes only the linear part is of concern. A note on the matter is here
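As a rough illustration, the fee computation above can be sketched in Python. The parameter values are the ones quoted in this post; the function name `estimate_fee` and its argument layout are our own, purely for illustration. Following the first simplification note, each redeemer's cost is rounded up separately, and only the linear part of the reference script cost is modelled.

```python
from fractions import Fraction
from math import ceil

# Parameter values as quoted in this post (they may change over time).
TX_FEE_FIXED = 155381
TX_FEE_PER_BYTE = 44
PRICE_MEMORY = Fraction(577, 10_000)       # 5.77e-2
PRICE_STEPS = Fraction(721, 10_000_000)    # 7.21e-5
REF_SCRIPT_COST_PER_BYTE = 15              # linear part only (assumption)


def estimate_fee(size, redeemers, script_size=0):
    """Estimate the fee in lovelace (hypothetical helper, not a ledger API).

    size        -- tx size in bytes
    redeemers   -- list of (mem, cpu) pairs, one per script execution
    script_size -- total bytes of reference scripts
    """
    # Round each redeemer's cost up separately, as the post suggests.
    script_cost = sum(
        ceil(mem * PRICE_MEMORY + cpu * PRICE_STEPS) for mem, cpu in redeemers
    )
    return (
        TX_FEE_FIXED
        + TX_FEE_PER_BYTE * size
        + script_cost
        + script_size * REF_SCRIPT_COST_PER_BYTE
    )
```

Exact fractions are used instead of floats so that the rounding step is not disturbed by floating point error.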

Aiken Benchmarks

Aiken has an inbuilt benchmarking tool bench. To use it we need to define appropriate fuzzers for the function under consideration. It will return the cpu and mem usage.

Batched sub

The batch sub is the type of tx we are most interested in. It is the typical transaction of a provider. They take their latest set of IOUs from a set of their subscribers, and submit all of these to the L1.

Furthermore, the two most repeated steps of a subbit lifecycle are the add and sub. Of these two steps, the sub step costs more in each of the three budgets (size aka bytes, memory, and cpu). This is because it needs to include an IOU and verify it. Thus, over a typical life of a subbit, the sub step will have cost the most.

The following is the costing of our slightly crude “batch subbit simulator function”. The function invokes all but some preamble of the subbit validator for each subbit input. It therefore provides a reasonable guide to its cost in the wild.

┍━ mark/steps ━━━
│ test_multi_subs
│   memory units                                           cpu units
│   ⡁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⢠⠓⡁ 14161961.0   ⡁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⡠⠓⡁ 7878806528.0
│   ⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠤⠊⠁⠀⠄              ⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠔⠊⠀⠀⠄
│   ⠂⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⠔⠁⠀⠀⠀⠀⠂              ⠂⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠤⠊⠀⠀⠀⠀⠀⠂
│   ⡁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠃⠀⠀⠀⠀⠀⠀⠀⡁              ⡁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠤⠒⠁⠀⠀⠀⠀⠀⠀⠀⡁
│   ⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡔⠚⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠄              ⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠄
│   ⠂⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡠⠒⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠂              ⠂⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠔⠊⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠂
│   ⡁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡁              ⡁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠒⠊⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡁
│   ⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠊⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠄              ⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠤⠒⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠄
│   ⠂⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡠⠔⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠂              ⠂⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠔⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠂
│   ⡁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠤⠒⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡁              ⡁⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠔⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡁
│   ⠄⠀⠀⠀⠀⠀⠀⢀⠤⠒⠊⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠄              ⠄⠀⠀⠀⠀⠀⢀⡠⠒⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠄
│   ⠂⠀⠀⢀⠤⠒⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠂              ⠂⠀⢀⣀⠤⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠂
│   ⠥⠲⠊⠅⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠁ 238348.0     ⠥⠪⠁⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠠⠀⠄⠁ 142985712.0
│   1.0                                 50.0               1.0                                 50.0
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ with --seed=1817129587

The complexity/size measure on the x-axis corresponds to the number of subbits involved. Both memory and steps (aka cpu) grow essentially linearly.

In conclusion: a tx with 50 sub steps just exceeds the max memory limit (14161961 > 14000000). Below we check whether we expect to reach this many inputs before hitting the max tx size.
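For a rough sense of where the limits bite, one can interpolate linearly between the two labelled ends of each plot. The sketch below assumes the lower label is the cost at 1 subbit, the upper label is the cost at 50 subbits, and that growth in between is exactly linear. The helper name `max_batch` is ours, for illustration only.

```python
from fractions import Fraction


def max_batch(limit, cost_at_1, cost_at_50):
    """Largest batch size whose interpolated cost stays within `limit`.

    Assumes cost grows linearly between the measurements at 1 and 50 subbits.
    """
    per_extra_sub = Fraction(cost_at_50 - cost_at_1, 49)
    return 1 + (limit - cost_at_1) // per_extra_sub


# Values read off the benchmark plot and the tx limits quoted in this post.
mem_bound = max_batch(14_000_000, 238_348, 14_161_961)
cpu_bound = max_batch(10_000_000_000, 142_985_712, 7_878_806_528)
print(mem_bound, cpu_bound)
```

Under these assumptions memory is the tighter of the two execution budgets, allowing about 49 sub steps, while cpu would allow about 63.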

Counting Bytes:

It remains to finally count the bytes involved.

Validator

The following command outputs the size in bytes of the subbit validator:

echo $(( ($(cat plutus.json | jq '.validators[0] | .compiledCode' | wc -m) - 3) / 2))
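The same count can be sketched in Python, which avoids the manual correction for the quotes and newline. This assumes, as the shell command does, that `compiledCode` is a hex string and that the validator of interest is the first entry in `plutus.json`. The function names are ours.

```python
import json


def validator_size(blueprint, index=0):
    """Size in bytes of a validator's compiled code, from a parsed blueprint."""
    hex_code = blueprint["validators"][index]["compiledCode"]
    return len(hex_code) // 2  # two hex characters per byte


def validator_size_from_file(path="plutus.json", index=0):
    """Convenience wrapper that reads the blueprint from disk."""
    with open(path) as f:
        return validator_size(json.load(f), index)
```

Calling `validator_size_from_file()` in the project root after a build should report the same number as the shell command.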

The following numbers are given with the current code base and versions of aiken etc.

Aiken provides the ability to build the script with different levels of tracing. With:

no tracing - 3662 bytes.
compact, user-defined tracing - 4491 bytes.
full verbose tracing - 10057 bytes.

Without tracing, the validator is on par with validators of simple to moderate complexity found in the wild.

Even with full tracing, the script can be output as a reference script without exceeding the maxTxSize limit. A tx including the validator without tracing (3662 bytes) as a reference script will pay 3662 * 15 = 54930 lovelace more in fees than the identical tx without the reference script.
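To compare the three tracing levels, here is a small sketch of the reference script surcharge for each build. It uses the sizes listed above and only the linear part of the reference script cost (minFeeRefScriptCostPerByte = 15). The names are illustrative.

```python
MAX_TX_SIZE = 16384
REF_SCRIPT_COST_PER_BYTE = 15  # linear part only (assumption)

# Validator sizes in bytes per tracing level, as reported in this post.
SIZES = {"none": 3662, "compact": 4491, "verbose": 10057}


def ref_script_fee(script_size):
    """Extra lovelace paid for referencing a script of the given size."""
    return script_size * REF_SCRIPT_COST_PER_BYTE


for level, size in SIZES.items():
    fits = size < MAX_TX_SIZE  # can the script sit in a single tx output?
    print(f"{level}: {size} bytes, +{ref_script_fee(size)} lovelace, fits: {fits}")
```

The `fits` check is only a necessary condition, since the tx publishing the script also carries its own overhead.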

Tx

Three or four tx fields grow linearly in the number of subbit inputs. All other fields are constant. The three that always grow linearly are the inputs, outputs, and redeemers. The fourth is the “required signers” (cardano ledger/plutus) or “extra signatories” (aiken) field, and whether it grows depends on whether or not the provider reuses their key. Let’s assume here there is some key reuse.

Our numbers are very rough, back-of-the-envelope calculations.

Size of an input is the size of an output reference ~= 36 or 37 bytes.

Size of redeemer. The redeemer includes an IOU, consisting of an amount and a signature. An amount will likely be 5 or 9 bytes.

  amount ~= 9
  sig ~= 64
  TOTAL + WRAPPER ~= 80

A redeemer also includes an output index and the purpose index, and so is ~= 85 bytes. However, in reality one redeemer includes all the IOUs while the rest contain no additional content, i.e. are 3-byte empty constructors.

Size of the constants

  subbit_id ~= 32
  currency ~= 40
  iou_key ~= 32
  consumer ~= 28
  provider ~= 28
  close_period ~= 9
  TOTAL + WRAPPER ~= 173

An opened datum is then ~= 200

Size of the output

  address ~= 60
  value ~= 45
  datum ~= 200
  TOTAL = 305

Adding all this up, we estimate that including a subbit in the tx adds 36 + 85 + 305 = 426 bytes. The max tx size is 16384, and there are a few hundred bytes of “general overhead”. Thus we hope to be able to handle 15800/426 ~ 37 subs per tx. With more restricted cases (e.g. Ada only, no staking) we expect to see more subbits per tx.
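The byte budget above can be written out as a short sketch. All per-field sizes are the rough estimates from this section, maxTxSize = 16384 is taken from the Bounds section, and the overhead of 584 bytes is simply what is left after assuming 15800 usable bytes. The names are ours.

```python
MAX_TX_SIZE = 16384

# Rough per-subbit byte estimates from this section.
INPUT_BYTES = 36
REDEEMER_BYTES = 85
OUTPUT_BYTES = 60 + 45 + 200  # address + value + datum

BYTES_PER_SUBBIT = INPUT_BYTES + REDEEMER_BYTES + OUTPUT_BYTES


def max_subs_by_size(overhead):
    """Number of subbits that fit in one tx given a fixed byte overhead."""
    return (MAX_TX_SIZE - overhead) // BYTES_PER_SUBBIT


print(BYTES_PER_SUBBIT, max_subs_by_size(overhead=584))
```

With these estimates the tx size limit is reached at about 37 subs, well before the roughly 50 subs at which the benchmark reaches the memory limit, so tx size looks like the binding constraint.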

A final note: the datum is a substantial part of the byte budget. If the datum was embedded then we could see the bytes per subbit reduced. For example, restricting to a single currency might be an attractive optimisation for certain use cases.

Edit: There were some errors in the tx estimation section, mostly in basic arithmetic. These have been corrected.