BPE Tokenizer Visualizer

See Byte Pair Encoding happen step by step. Type any text and watch character pairs merge, building the vocabulary in real time.

Input

4
0 = character-level · max = fully merged

Tokenized Output

Enter text above to see tokenization
Tokens: 0
Characters: 0
Vocabulary Size: 0
O(n²) Attention Cost: 0
Sequence length impact: Transformer self-attention cost scales as O(n²), where n is the sequence length in tokens. For example, a sequence of 43 tokens requires 43² = 1,849 pairwise attention scores per head.
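
The arithmetic behind that figure can be checked with a short Python sketch. The function name attention_pairs is hypothetical and not part of the visualizer; it counts only the n × n score matrix for one head and ignores constant factors such as the head dimension.

```python
def attention_pairs(n_tokens: int) -> int:
    """Number of pairwise attention scores per head for a sequence of n tokens.

    Assumption: full (non-sparse, non-causal-masked) self-attention, where
    every token attends to every token, giving an n x n score matrix.
    """
    return n_tokens * n_tokens


# The example from the text: 43 tokens -> 43^2 scores per head.
print(attention_pairs(43))

# Halving the sequence length cuts the quadratic term to one quarter.
print(attention_pairs(50) / attention_pairs(100))
```

Because the cost is quadratic, any reduction in token count from merging pays off disproportionately: a sequence half as long needs a quarter of the pairwise scores.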

Character-Level vs BPE

Character-Level (no merges)

Sequence length: 0 · O(n²) = 0

BPE (current merges)

Sequence length: 0 · O(n²) = 0

Merge Table

At each step, adjacent token pairs are counted and the most frequent pair is merged into a single new token. Each merge adds one token to the vocabulary.

# | Pair | New Token ID | Count
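
The merge procedure behind this table can be sketched in Python as follows. This is a minimal illustration, not the visualizer's actual code: the function name bpe_merges, the starting ID of 256 for new tokens, and the tie-breaking rule (the first pair seen wins) are all assumptions made for the example.

```python
from collections import Counter


def bpe_merges(text, num_merges, first_new_id=256):
    """Greedy BPE on a single string.

    Returns (tokens, merge_table), where each merge_table row is
    (step, pair, new_token_id, count), mirroring the table columns.
    first_new_id=256 is an assumed convention, not taken from the page.
    """
    tokens = list(text)   # 0 merges = character-level tokens
    merge_table = []
    for step in range(1, num_merges + 1):
        # Count adjacent pairs (overlapping occurrences are counted too).
        pair_counts = Counter(zip(tokens, tokens[1:]))
        if not pair_counts:
            break         # fewer than two tokens left: fully merged
        # Most frequent pair; on ties, the pair seen first wins.
        pair, count = pair_counts.most_common(1)[0]
        new_token = pair[0] + pair[1]
        merged, i = [], 0
        while i < len(tokens):
            # Replace each left-to-right, non-overlapping occurrence.
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
                merged.append(new_token)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
        merge_table.append((step, pair, first_new_id + step - 1, count))
    return tokens, merge_table


tokens, table = bpe_merges("aaabdaaabac", num_merges=3)
print(tokens)
for row in table:
    print(row)
```

Each row of the returned table corresponds to one row of the merge table above: the step number, the merged pair, the ID assigned to the new token, and how often the pair occurred when it was chosen.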