Subnet 37: Incentive Mechanism
Subnet 37 rewards miners for producing finetuned models according to each competition’s defined parameters. It acts as a continuous benchmark in which miners are rewarded for scoring best on a competition’s evaluation criteria.
The reward mechanism works as follows:
1. Miners train and periodically publish competition-specific models to HuggingFace, committing the metadata for each model to the Bittensor chain (a miner-side sketch follows this list).
2. Validators download each miner’s model from HuggingFace based on the Bittensor chain metadata and continuously evaluate it. For each competition, only the top model receives incentive. Validators also log results to wandb (a validator-side sketch follows this list).
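To make the miner side of step 1 concrete, here is a minimal Python sketch of publishing a model and committing its metadata. The huggingface_hub upload and revision-lookup calls are real; the chain-commit call and the metadata format shown are assumptions for illustration, not the subnet’s actual code.

import bittensor as bt
from huggingface_hub import HfApi

def publish_model(wallet: "bt.wallet", repo_id: str, local_dir: str, netuid: int = 37):
    """Upload a finetuned model to HuggingFace, then commit its metadata on-chain."""
    # Push the model files to the miner's HuggingFace repository.
    api = HfApi()
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="model")
    revision = api.model_info(repo_id).sha  # pin the exact uploaded revision

    # Commit compact metadata (repo id + revision) to the Bittensor chain so
    # validators can discover and pin the precise model version submitted.
    metadata = f"{repo_id}:{revision}"  # hypothetical metadata format
    subtensor = bt.subtensor()
    subtensor.commit(wallet, netuid, metadata)  # assumed chain-commit call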
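And a sketch of the validator side of step 2: after evaluating every committed model, the full competition weight goes to the single best performer. How the losses are computed is left abstract here; set_weights is the standard Bittensor call, but its use below is illustrative.

import bittensor as bt

def set_winner_take_all_weights(wallet, losses: dict[int, float], netuid: int = 37):
    """Winner-take-all: the lowest-loss miner receives all incentive for the competition."""
    best_uid = min(losses, key=losses.get)
    uids = list(losses)
    weights = [1.0 if uid == best_uid else 0.0 for uid in uids]

    bt.subtensor().set_weights(wallet=wallet, netuid=netuid, uids=uids, weights=weights)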
Note that competitions are specified independently, each with a defined split of the subnet’s emissions. Each competition has unique parameters that define which model(s), tokenizer(s), size(s), and sequence length(s) miners are evaluated against. Validators will be able to use their bandwidth to fund any competition of their choosing, including both Macrocosmos ‘public good’ competitions and user-defined competitions.
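As an illustration of how such per-competition parameters and emission splits might be represented (the field names and values here are assumptions, not the subnet’s actual schema):

import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Competition:
    competition_id: str
    emission_split: float          # fraction of subnet emissions allocated to this competition
    allowed_models: list[str]      # model architecture(s) miners may submit
    allowed_tokenizers: list[str]  # tokenizer(s) used during evaluation
    max_model_bytes: int           # size limit enforced before evaluation
    max_sequence_length: int       # sequence length models are evaluated at

COMPETITIONS = [
    Competition("chatbot", 1.0, ["LlamaForCausalLM"], ["llama-tokenizer"],
                15 * 1024**3, 2048),
]
# Splits across all active competitions should account for the subnet's full emissions.
assert math.isclose(sum(c.emission_split for c in COMPETITIONS), 1.0)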
Additionally, each competition can define one or more Evaluation Tasks, each specifying a data source along with the method used to evaluate the resulting data. Per-task normalization and weighting are also supported, so the contribution of each task towards the competition’s incentive can be precisely tuned.
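A minimal sketch of how normalized, weighted tasks could combine into a single competition score (the structure and names are assumptions):

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class EvalTask:
    name: str
    load_data: Callable[[], list]            # the task's data source
    evaluate: Callable[[Any, list], float]   # raw score for a model on that data
    normalize: Callable[[float], float]      # map the raw score into [0, 1]
    weight: float                            # relative contribution to the competition

def competition_score(model: Any, tasks: list[EvalTask]) -> float:
    """Weighted average of normalized per-task scores."""
    total = sum(t.weight for t in tasks)
    return sum(t.weight * t.normalize(t.evaluate(model, t.load_data())) for t in tasks) / total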
The subnet launched with a competition to produce the best chatbot by finetuning the top pre-trained model from . That evaluation was performed on fresh, authenticated, synthetic data generated by subnet 18. We have since moved to using subnet 1 as the source of our evaluation data, as we find its synthetic data to be of consistently higher quality. We will expand to support additional data sources and other competitions, for example ones that allow an unlocked tokenizer.
We have also introduced logic to better synchronize evaluation data across validators, using the hash of a recent block on the chain as the seed for some of the random selection and generation of data. A delay before new submissions are picked up ensures this information cannot be abused by submitting a model with foreknowledge of the evaluation data.
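A sketch of the block-hash seeding idea, assuming standard Bittensor chain queries; the lookback distance and sampling details are illustrative:

import random
import bittensor as bt

def synced_rng(lookback: int = 5) -> random.Random:
    """Seed an RNG from a recent block hash so every validator samples the same data."""
    subtensor = bt.subtensor()
    block = subtensor.get_current_block() - lookback
    block_hash = subtensor.get_block_hash(block)  # identical on every validator
    return random.Random(block_hash)

# Validators drawing from this RNG at the same block select identical evaluation samples.
# The submission delay means a miner cannot see the seed before its model is locked in.
rng = synced_rng()
sample_indices = rng.sample(range(100_000), k=512)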