I built a character-level transformer that uses harmonic phase encoding instead of learned embeddings. Each character gets a phase angle on the unit circle, embedded as [cos(theta), sin(theta), cos(2theta), sin(2theta), ...]. No tokenizer, no BPE.
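To make that concrete, here's a minimal NumPy sketch of the encoding as described above. The uniform phase assignment (theta_i = 2*pi*i / vocab_size) and the function name `harmonic_embedding` are my assumptions; the actual repo may space the phases differently:

```python
import numpy as np

def harmonic_embedding(vocab_size: int, d_model: int) -> np.ndarray:
    """Fixed embedding table: [cos(theta), sin(theta), cos(2theta), sin(2theta), ...]."""
    assert d_model % 2 == 0, "need cos/sin pairs"
    n_harmonics = d_model // 2
    # Assumed phase assignment: characters spaced uniformly on the unit circle.
    thetas = 2 * np.pi * np.arange(vocab_size) / vocab_size   # (V,)
    k = np.arange(1, n_harmonics + 1)                         # harmonic orders 1..K
    angles = np.outer(thetas, k)                              # (V, K)
    emb = np.empty((vocab_size, d_model))
    emb[:, 0::2] = np.cos(angles)   # even slots: cos(k * theta)
    emb[:, 1::2] = np.sin(angles)   # odd slots:  sin(k * theta)
    return emb

E = harmonic_embedding(vocab_size=65, d_model=128)  # tiny-Shakespeare char vocab is 65
```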
Three otherwise-identical models trained on Shakespeare (4 layers, 128 dim, RTX 4070 Ti, ~10 min each):
- Baseline (random Gaussian, trainable): val loss 1.5570
- Harmonic (phase-encoded, trainable): val loss 1.5223 (-2.2% vs baseline)
- Frozen (phase-encoded, NOT trainable): val loss 1.5567 (-0.02% vs baseline)
The frozen model has 40k fewer trainable parameters and zero gradient updates to its embeddings. It matches the fully trained baseline using pure geometry.
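A PyTorch sketch of what the frozen variant amounts to, assuming a standard `nn.Embedding` holds the harmonic table (the actual `harmonic_transformer.py` may wire this differently):

```python
import math
import torch
import torch.nn as nn

# Assumed sizes: char vocab 65, model dim 128 (matching the runs above).
V, D = 65, 128
theta = 2 * math.pi * torch.arange(V) / V               # one phase per character
angles = theta[:, None] * torch.arange(1, D // 2 + 1)   # (V, D/2): harmonic orders 1..D/2
table = torch.stack([angles.cos(), angles.sin()], dim=-1).reshape(V, D)

# Pure geometry, loaded once; freeze=True sets requires_grad=False.
emb = nn.Embedding.from_pretrained(table, freeze=True)

# The frozen table never reaches the optimizer; only trainable params do:
# opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=3e-4)
```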
This came out of a larger project (25 tests) validating harmonic coherence as a computational primitive. The chain: Test 21 showed cosine similarity is blind to harmonic structure → Test 24 confirmed real transformer embeddings contain that structure → Test 25 showed that providing it from the start beats learning it from a random initialization.
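One way to see the Test 21 claim (an illustrative example of mine, not the test itself): two signals with identical harmonic content can be exactly orthogonal under cosine similarity, because it only compares raw vector angles:

```python
import numpy as np

t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
x, y = np.sin(t), np.cos(t)   # same single harmonic, 90-degree phase offset

cos_sim = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(f"cosine similarity: {cos_sim:.4f}")  # ~0.0000: looks "unrelated"

# Yet the magnitude spectra are identical: same harmonic content.
print(np.allclose(np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(y))))  # True
```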
The repo has everything: the Rust and Python test suites, the transformer script, and two papers. Run `python harmonic_transformer.py` to reproduce.