Show HN: I modeled the Voynich Manuscript with SBERT to test for structure
Published on: 2025-07-04 13:09:01
Voynich Manuscript Structural Analysis
Overview
This started as a personal challenge to figure out what modern NLP could tell us about the Voynich Manuscript, without falling into translation speculation or pattern hallucination. I'm not a linguist or cryptographer. I just wanted to see if something as strange as Voynichese would hold up under real language modeling: clustering, POS inference, Markov transitions, and section-specific patterns.
Spoiler: it kinda did.
This repo walks through everything, from suffix stripping to SBERT embeddings to building a lexicon hypothesis. No magic, no GPT guessing. Just a skeptical test of whether the manuscript has structure that behaves like language, even if we don't know what it's saying.
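To make the first and last steps of that pipeline concrete, here is a minimal sketch of suffix stripping followed by first-order Markov transition estimates. It is not the repo's code: the suffix list, the function names `strip_suffix` and `transition_probs`, and the sample line of EVA-style tokens are all assumptions for illustration. It also computes transitions between stripped roots directly, whereas the project describes clustering SBERT embeddings first; that step is left out so the example runs on the standard library alone.

```python
from collections import Counter, defaultdict

# Hypothetical suffix list, longest first; the repo's actual list may differ.
SUFFIXES = ("aiin", "edy", "dy", "y")


def strip_suffix(word, suffixes=SUFFIXES):
    """Return (root, suffix); suffix is '' when nothing matches."""
    for suffix in suffixes:
        # Require a non-empty root so a word is never stripped to nothing.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], suffix
    return word, ""


def transition_probs(tokens):
    """Estimate first-order Markov transition probabilities between neighbors."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


# Illustrative EVA-style tokens, not a real line from the manuscript.
line = "qokeedy chedy daiin qokeedy shedy daiin".split()
roots = [strip_suffix(word)[0] for word in line]
probs = transition_probs(roots)
print(roots)
print(probs)
```

In a fuller version, each root would be replaced by a cluster label before counting transitions, so the matrix describes how word classes follow one another rather than individual roots.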
Why This Matters
The Voynich Manuscript remains undeciphered, with no agreed linguistic or cryptographic solution. Traditional analyses often fall into two camps: statistical entropy checks or wild guesswork. This project offers a middle path: u...