Chinchilla Scaling Calculator

Find the optimal model size for your training budget, or the optimal token count for your model. Based on Hoffmann et al. 2022.

Presets: Tiny (toy), GPT-2 Small, GPT-2 Medium, GPT-2 Large, LLaMA 7B, LLaMA 13B

Calculator

Model Parameters

Parameters 85.0M

Training Tokens 1.70B

Chinchilla-Optimal

Optimal Training Tokens 1.70B

Optimal Model for Token Budget 85.0M

Chinchilla-optimal ratio: 20 tokens per parameter.
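The two "optimal" values above follow directly from the 20:1 ratio. A minimal sketch in Python, assuming the calculator applies the ratio as a plain multiplication and division (the function and constant names are illustrative, not taken from the calculator's source):

```python
# Ratio stated on this page: 20 training tokens per model parameter.
TOKENS_PER_PARAM = 20


def optimal_tokens(n_params: float) -> float:
    """Token count that matches the 20:1 ratio for a given model size."""
    return TOKENS_PER_PARAM * n_params


def optimal_params(n_tokens: float) -> float:
    """Model size that matches the 20:1 ratio for a given token budget."""
    return n_tokens / TOKENS_PER_PARAM


if __name__ == "__main__":
    # The example shown in the calculator: 85.0M parameters, 1.70B tokens.
    print(f"{optimal_tokens(85e6):.3g} tokens")
    print(f"{optimal_params(1.7e9):.3g} parameters")
```

With the example inputs, the two functions are inverses of each other, which is why the calculator reports the same pair of numbers in both directions.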

Optimal

Your model size and training token count are well balanced according to the Chinchilla scaling laws. For an 85M-parameter model, the optimal training budget is 1.7B tokens.
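A status like the one above can be derived by comparing the actual token-to-parameter ratio with the 20:1 target. The sketch below is only an illustration: the 25% tolerance band and the "Under-trained" and "Over-trained" labels are assumptions, since this page shows only the "Optimal" case and does not state its thresholds.

```python
TOKENS_PER_PARAM = 20   # ratio stated on this page
TOLERANCE = 0.25        # hypothetical band; the page does not state one


def token_param_ratio(n_params: float, n_tokens: float) -> float:
    """Training tokens per model parameter."""
    return n_tokens / n_params


def status(n_params: float, n_tokens: float) -> str:
    """Classify a (model size, token count) pair against the 20:1 target."""
    ratio = token_param_ratio(n_params, n_tokens)
    if ratio < TOKENS_PER_PARAM * (1 - TOLERANCE):
        return "Under-trained"   # hypothetical label
    if ratio > TOKENS_PER_PARAM * (1 + TOLERANCE):
        return "Over-trained"    # hypothetical label
    return "Optimal"


if __name__ == "__main__":
    print(status(85e6, 1.7e9))
```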

Scaling Frontier

Chart legend: training FLOPs iso-lines, known models
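An iso-line on this chart joins every (parameters, tokens) pair that costs the same training compute. The page does not say which cost formula it uses; the sketch below assumes the widely used approximation C ≈ 6·N·D (FLOPs ≈ 6 × parameters × tokens) and combines it with the 20:1 ratio to find where an iso-line crosses the Chinchilla-optimal frontier. All names are illustrative.

```python
import math

FLOPS_PER_PARAM_TOKEN = 6   # assumption: C ~ 6 * N * D, not stated on this page
TOKENS_PER_PARAM = 20       # ratio stated on this page


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute for a model size and token count."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def tokens_on_iso_line(flops: float, n_params: float) -> float:
    """Token count that keeps total compute fixed for a given model size."""
    return flops / (FLOPS_PER_PARAM_TOKEN * n_params)


def optimal_point(flops: float) -> tuple[float, float]:
    """(parameters, tokens) where the iso-line meets the 20:1 frontier.

    Substituting D = 20 * N into C = 6 * N * D gives C = 120 * N**2.
    """
    n_params = math.sqrt(flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    return n_params, TOKENS_PER_PARAM * n_params


if __name__ == "__main__":
    budget = training_flops(85e6, 1.7e9)
    n, d = optimal_point(budget)
    print(f"budget={budget:.3g} FLOPs, N={n:.3g}, D={d:.3g}")
```

Under these assumptions, moving along a single iso-line trades model size for tokens at constant cost, and only one point on each line sits at the 20:1 ratio.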

Known Models Reference

Model	Parameters	Training Tokens	Token:Param Ratio	Status