Temperature & Sampling Playground

See how temperature, top-k, and top-p reshape a token probability distribution in real time. Understand the math behind every sampling knob.


Sampling Controls

Adjust temperature to reshape the distribution. Then layer on top-k or top-p filtering to see which tokens survive.

Temperature 0.80 (slider: 0.01 = near-greedy, 1.0 = raw logits, 3.0 = flattened toward uniform)

Top-k 40

Top-p (nucleus) 1.00 (slider: 0.01 = minimum, 0.5, 1.0 = off)

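How they combine

The pipeline is temperature → top-k → top-p → softmax. Temperature is applied to the raw logits first, the filters then restrict the candidate set, and a final softmax over the surviving tokens converts their logits to probabilities. Top-p is defined on cumulative probability, so applying it requires normalizing the current candidates first. The order affects the result; see the Order Matters section below.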
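The pipeline described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the playground's actual code: the function names `softmax` and `sampling_distribution` are hypothetical, and the conventions that `top_k=0` and `top_p=1.0` mean "off" are assumptions of this sketch.

```python
import math

def softmax(logits):
    """Convert logits to probabilities; masked (-inf) logits get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # math.exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]

def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature -> top-k -> top-p -> softmax and return probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Step 1: temperature divides the raw logits.
    scaled = [x / temperature for x in logits]

    # Step 2: top-k keeps the k largest scaled logits (assumed: 0 means off).
    if 0 < top_k < len(scaled):
        order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
        keep = set(order[:top_k])
        scaled = [x if i in keep else -math.inf for i, x in enumerate(scaled)]

    # Step 3: top-p keeps the smallest set of most-probable tokens whose
    # cumulative probability reaches top_p (assumed: 1.0 means off).
    if top_p < 1.0:
        probs = softmax(scaled)
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        keep, mass = set(), 0.0
        for i in order:
            keep.add(i)
            mass += probs[i]
            if mass >= top_p:
                break
        scaled = [x if i in keep else -math.inf for i, x in enumerate(scaled)]

    # Step 4: the final softmax renormalizes over the surviving tokens.
    return softmax(scaled)

# Made-up logits for four tokens.
print(sampling_distribution([2.0, 1.0, 0.0, -1.0], temperature=0.8, top_k=3, top_p=0.9))
```

Masking filtered tokens with negative infinity rather than deleting them keeps the output aligned with the original token indices, which makes it easy to plot alongside the unfiltered distribution.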

Probability Distribution

Readouts: Entropy (nats), Entropy (bits), Active tokens
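The entropy readouts can be reproduced from any probability vector. The sketch below uses the standard Shannon entropy; the helper names are hypothetical, and counting "active tokens" as tokens with nonzero probability after filtering is an assumption about what the playground reports.

```python
import math

def entropy_nats(probs):
    """Shannon entropy H = -sum(p * ln p); zero-probability tokens contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_bits(probs):
    """Same entropy expressed in bits: divide nats by ln 2."""
    return entropy_nats(probs) / math.log(2)

def active_tokens(probs):
    """Assumed definition: number of tokens with nonzero probability."""
    return sum(1 for p in probs if p > 0)

# Made-up distribution in which two tokens were filtered out.
probs = [0.6, 0.3, 0.1, 0.0, 0.0]
print(entropy_nats(probs), entropy_bits(probs), active_tokens(probs))
```

Entropy falls toward 0 as the distribution concentrates on one token (low temperature or aggressive filtering) and rises toward the logarithm of the number of active tokens as it flattens.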

Order Matters

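Why temperature must come before the filters. The correct pipeline applies temperature first, then filtering. Dividing logits by a positive temperature never changes their ranking, so top-k alone keeps the same tokens in either order; the order matters once top-p is involved, because its cutoff depends on the temperature-scaled probabilities.

✓ Correct: temperature → top-k → top-p

Temperature reshapes the logits, then the filters act on the scaled distribution. Top-k keeps the same k highest-ranked tokens at any temperature, while top-p keeps fewer tokens at low temperature (probability mass concentrates in the top few) and more tokens at high temperature.

✗ Incorrect: top-k → top-p → temperature

The filters are applied to the raw distribution, locking in a fixed set of candidates. Temperature then only rescales probabilities within this fixed set and cannot influence which tokens survived.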
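To check which filter is actually sensitive to where temperature sits in the pipeline, the sketch below compares the surviving token sets at several temperatures. The logits are made up and the helper names are hypothetical. Because dividing by a positive temperature preserves the ranking of logits, the top-k set stays the same, while the top-p nucleus grows as temperature rises.

```python
import math

def softmax(logits):
    """Convert logits to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_indices(logits, k):
    """Indices of the k largest logits."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return set(order[:k])

def nucleus_indices(probs, p):
    """Smallest set of most-probable indices whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    return keep

logits = [3.0, 2.0, 1.0, 0.0, -1.0]  # made-up values for five tokens

for t in (0.5, 1.0, 2.0):
    scaled = [x / t for x in logits]
    kept_by_top_k = sorted(top_k_indices(scaled, 3))
    kept_by_top_p = sorted(nucleus_indices(softmax(scaled), 0.9))
    print(f"T={t}: top-k keeps {kept_by_top_k}, top-p keeps {kept_by_top_p}")
```

With these logits and p = 0.9, the nucleus holds 2 tokens at T = 0.5, 3 at T = 1.0, and 4 at T = 2.0, while top-k with k = 3 keeps tokens 0, 1, and 2 every time. Running top-p on the raw distribution and applying temperature afterwards would fix the nucleus at the T = 1.0 size regardless of the temperature setting.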

Empirical Sampling

Draw 100 tokens from the current distribution to see how the empirical sample frequencies compare to the theoretical probabilities.
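The comparison between sampled and theoretical frequencies can be sketched with the standard library alone. The function name `empirical_frequencies`, the fixed seed, and the example probabilities are assumptions for illustration; the playground's own sampler may differ.

```python
import random
from collections import Counter

def empirical_frequencies(probs, n_draws=100, seed=0):
    """Draw n_draws token indices from probs and return observed frequencies."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    draws = rng.choices(range(len(probs)), weights=probs, k=n_draws)
    counts = Counter(draws)
    return [counts[i] / n_draws for i in range(len(probs))]

# Made-up distribution; the last token was filtered out (probability 0).
probs = [0.5, 0.3, 0.2, 0.0]
freqs = empirical_frequencies(probs)

for i, (p, f) in enumerate(zip(probs, freqs)):
    print(f"token {i}: theoretical {p:.2f}, empirical {f:.2f}")
```

With only 100 draws the empirical frequencies typically deviate from the theoretical values by a few percentage points; the gap shrinks as `n_draws` grows, and a token with probability 0 is never drawn.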