QMD second pass — BM25 gap probes
Local index pass using qmd BM25 search over collection graphelogos-quran. Complements fetch/Atlas/entity scripts and Quartz by surfacing where the vault talks about gaps, stubs, review backlog, and tooling.
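Each probe below follows the same layout: a label, a `Query: ... N hit(s)` line, then either a four-column table or `No BM25 matches.` The following is a minimal sketch of that rendering step, not the actual code in `quran_qmd_gap_pass.py`. The function names and the hit keys (`score`, `path`, `title`, `snippet`) are assumptions made for illustration. Snippets copied from notes can contain literal `|` characters (wikilink aliases, nested table fragments), so the sketch escapes them to keep each snippet inside one cell.

```python
from typing import Iterable, Mapping


def escape_cell(text: str) -> str:
    """Collapse whitespace and escape pipes so the text stays in one table cell."""
    return " ".join(text.split()).replace("|", r"\|")


def render_probe(label: str, query: str, hits: Iterable[Mapping[str, object]]) -> str:
    """Render one probe section in the layout used by this report.

    The hit keys (score, path, title, snippet) are hypothetical; adapt them
    to whatever structure the search step actually returns.
    """
    rows = list(hits)
    lines = [label, f"Query: {query} \u2014 {len(rows)} hit(s)"]
    if not rows:
        lines.append("No BM25 matches.")
        return "\n".join(lines)
    lines.append("| Score | Wikilink | Title | Snippet |")
    lines.append("|---|---|---|---|")
    for hit in rows:
        cells = [str(hit[key]) for key in ("score", "path", "title", "snippet")]
        lines.append("| " + " | ".join(escape_cell(c) for c in cells) + " |")
    return "\n".join(lines)
```

Escaping at render time keeps the table at four columns whatever the snippet contains, which matters here because the report indexes its own earlier output.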
Preconditions
- Vault path: `Graphe/Quran` (13,077 markdown files on disk).
- qmd collection `graphelogos-quran` must exist (this script runs `qmd collection add` if missing).
- After large edits, refresh the index: `qmd update` (re-indexes all collections; resolve any `EPERM` on optional collections like `~/Documents`).
- Optional: `qmd embed` for vectors, then `qmd vsearch` / `qmd query` locally (hybrid needs LLM; often disabled when `CI=true`).
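The refresh steps above can be sketched as a small preflight helper. This is an illustrative sketch, not part of the existing scripts: the function names are hypothetical, and skipping `qmd embed` when `CI=true` is an assumption made here, not documented qmd behaviour. Only the `qmd update` and `qmd embed` subcommands named above are used.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional, Sequence

Runner = Callable[[Sequence[str]], object]


def preflight_commands(want_vectors: bool, env: Mapping[str, str]) -> list:
    """Return the qmd commands to run before a probe pass, in order."""
    commands = [["qmd", "update"]]  # re-indexes all collections
    # Assumption: skip the optional embedding step on CI runners.
    if want_vectors and env.get("CI", "").lower() != "true":
        commands.append(["qmd", "embed"])
    return commands


def run_preflight(
    want_vectors: bool = False,
    env: Optional[Mapping[str, str]] = None,
    runner: Optional[Runner] = None,
) -> list:
    """Run each preflight command; the runner is injectable for dry runs."""
    env = os.environ if env is None else env
    if runner is None:
        # check=True makes a failing command (for example EPERM) raise.
        runner = lambda argv: subprocess.run(list(argv), check=True)
    commands = preflight_commands(want_vectors, env)
    for argv in commands:
        runner(argv)
    return commands
```

Passing `runner=print` gives a dry run that only lists the commands, which is useful when an optional collection is known to raise `EPERM`.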
Probe results
Explicit gap / leverage language
Query: gap — 2 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.91 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -2,4 @@ (1 before, 176 after) noindex: true title: QMD pipeline gap pass (BM25) generated: 2026-03-20 20:57 UTC tags: [quran, qmd, pipeline, research] |
| 0.89 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 249 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
Fetch coverage / missing surahs
Query: not fetched — 5 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.91 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 249 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
| 0.9 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -26,4 @@ (25 before, 152 after) \|------:\|----------\|-------\|---------\| \| 0.79 \| research.md \| Quran corpus — research & build plan \| @@ -29,4 @@ (28 before, 213 after) Gap (highest leverage): … |
| 0.9 | surahs.md | Surahs | @@ -8,4 @@ (7 before, 8 after) This folder holds one markdown file per surah (Surah NNN - Name.md), fetched/updated with .dev/scripts/fetch_quran.py. Each verse is a ### Ayah k section so other notes can tr… |
| 0.88 | surahs.md | Surahs in this vault | @@ -7,4 @@ (6 before, 61 after) Arabic–English surah files live under Graphe/Quran/Surahs/ (one .md per surah); see Surahs\|Surahs folder note for how that directory relates to Ayah and… |
| 0.88 | research.md | Quran research | @@ -2,4 @@ (1 before, 14 after) noindex: true title: Quran research notes description: Literary overviews, entity pilot, and links to the master RESEARCH plan. tags: [quran, research, index] |
Partial corpus / scoped fetch
Query: subset — 3 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.89 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -26,4 @@ (25 before, 152 after) \|------:\|----------\|-------\|---------\| \| 0.79 \| research.md \| Quran corpus — research & build plan \| @@ -29,4 @@ (28 before, 213 after) Gap (highest leverage): … |
| 0.86 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 249 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
| 0.82 | qmd-atlas-entity-graph.md | qmd entity + relationship hints | @@ -90,4 @@ (89 before, 0 after) uv run .dev/scripts/quran_qmd_entity_extract.py # subset: uv run .dev/scripts/quran_qmd_entity_extract.py --family people,places --max-entities 40 ``` |
Entity review backlog
Query: review queue — 5 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.95 | research.md | Quran research | @@ -14,4 @@ (13 before, 2 after) - entity-corpus-summary\|Entity corpus summary — full extraction counts by confidence/family - [[Graphe/Quran/Research/entity-review-queue\|Entity review queue]… |
| 0.95 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -57,4 @@ (56 before, 121 after) Query: review queue — 5 hit(s) \| Score \| Wikilink \| Title \| Snippet \| |
| 0.95 | research.md | Quran corpus — research & build plan | @@ -110,4 @@ (109 before, 168 after) 3. Confidence queue — emit summary + review queue: ```bash |
| 0.94 | entity-review-qmd-evidence.md | Entity review qmd evidence | @@ -9,4 @@ (8 before, 274 after) Generated from entity-review-queue\|entity-review-queue. Ranking rule: prioritize direct surah/ayah sources; research artifacts are filtered by default. |
| 0.9 | entity-review-queue.md | Quran entity review queue | @@ -2,4 @@ (1 before, 1016 after) noindex: true title: “Quran entity review queue” description: Medium/low-confidence Atlas matches requiring review. tags: [quran, atlas, extraction, review-queue] |
Embed integrity (Phase B)
Query: broken embed — 2 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.78 | research.md | Quran corpus — research & build plan | @@ -69,4 @@ (68 before, 209 after) - Regenerate: uv run .dev/scripts/generate_quran_juz_ayah.py after any rename (uses quran_api + /chapters + /juzs). - DoD: No broken `![[Graphe/Quran/Surahs/…#Ayah … |
| 0.77 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -69,4 @@ (68 before, 109 after) Query: broken embed — 1 hit(s) \| Score \| Wikilink \| Title \| Snippet \| |
DoD / checklist
Query: Definition of Done — 3 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.86 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -77,4 @@ (76 before, 101 after) Query: Definition of Done — 2 hit(s) \| Score \| Wikilink \| Title \| Snippet \| |
| 0.83 | research.md | Quran corpus — research & build plan | @@ -46,4 @@ (45 before, 232 after) Each stage below lists inputs, outputs, tools, and Definition of Done (observable). --- |
| 0.51 | surah-017-al-isra.md | Surah 17: Al-Isra | @@ -772,4 @@ (771 before, 369 after) [Mention, O Muḥammad], the Day We will call forth every people with their record [of deeds] Then whoever is given his record in his right hand - those will read their records, and… |
Entity pipeline & validation
Query: entity extraction — 12 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.94 | entity-validation-report.md | Entity extraction validation report | @@ -2,4 @@ (1 before, 14 after) noindex: true title: “Entity extraction validation report” description: Structural and regression checks for Quran Atlas extraction sidecars. tags: [quran, atlas, validation] |
| 0.94 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -26,4 @@ (25 before, 152 after) \|------:\|----------\|-------\|---------\| \| 0.79 \| research.md \| Quran corpus — research & build plan \| @@ -29,4 @@ (28 before, 213 after) Gap (highest leverage): … |
| 0.94 | entity-corpus-summary.md | Quran entity extraction summary | @@ -2,4 @@ (1 before, 53 after) noindex: true title: “Quran entity extraction summary” description: Corpus-wide stats for Atlas candidate extraction. tags: [quran, atlas, extraction, summary] |
| 0.94 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 249 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
| 0.94 | entity-pilot-surah-001.md | Entity extraction pilot — Surah 1 | @@ -7,4 @@ (6 before, 33 after) # Entity extraction pilot — Surah 1 Source: Surah 001 - Al-Fatihah.md\|Surah file · Generator: .dev/scripts/quran_entity_pilot.py |
| 0.93 | entity-scan-surah-077.md | Entity scan — Surah 77 | @@ -2,4 @@ (1 before, 16 after) noindex: true title: “Entity scan — Surah 77” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-102.md | Entity scan — Surah 102 | @@ -2,4 @@ (1 before, 16 after) noindex: true title: “Entity scan — Surah 102” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-103.md | Entity scan — Surah 103 | @@ -2,4 @@ (1 before, 16 after) noindex: true title: “Entity scan — Surah 103” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-109.md | Entity scan — Surah 109 | @@ -2,4 @@ (1 before, 16 after) noindex: true title: “Entity scan — Surah 109” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-101.md | Entity scan — Surah 101 | @@ -2,4 @@ (1 before, 16 after) noindex: true title: “Entity scan — Surah 101” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-107.md | Entity scan — Surah 107 | @@ -2,4 @@ (1 before, 16 after) noindex: true title: “Entity scan — Surah 107” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-086.md | Entity scan — Surah 86 | @@ -2,4 @@ (1 before, 19 after) noindex: true title: “Entity scan — Surah 86” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
Stubs / thin notes (often Atlas)
Query: stub — 12 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.88 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -9,4 @@ (8 before, 169 after) Local index pass using qmd BM25 search over collection graphelogos-quran. Complements fetch/Atlas/entity scripts and Quartz by surfacing where t… |
| 0.87 | tabuk.md | Tabūk | @@ -23,4 @@ (22 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | iraq.md | Iraq | @@ -23,4 @@ (22 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | nile.md | Nile | @@ -23,4 @@ (22 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | yemen.md | Yemen | @@ -23,4 @@ (22 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | aylah.md | Aylah | @@ -26,4 @@ (25 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | west.md | Maghrib | @@ -25,4 @@ (24 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | jordan.md | Jordan River | @@ -23,4 @@ (22 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | tih.md | al-Tīḥ | @@ -25,4 @@ (24 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | sham.md | al-Shām | @@ -26,4 @@ (25 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | ararat.md | Mount Judi | @@ -26,4 @@ (25 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | dead-sea.md | Dead Sea | @@ -23,4 @@ (22 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
Blockers
Query: blocker — 2 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.88 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -122,4 @@ (121 before, 56 after) Blockers Query: blocker — 1 hit(s) |
| 0.79 | research.md | Quran corpus — research & build plan | @@ -10,4 @@ (9 before, 268 after) Hypothesis (cycle): Finishing the 114 surah fetch plus keeping surah-hashes.json\|surah-hashes.json + Quartz paths stable removes the largest blockers… |
Hash manifest
Query: surah-hashes — 0 hit(s)
No BM25 matches.
Fetch script references
Query: fetch_quran — 0 hit(s)
No BM25 matches.
Auto-generated Atlas verse markers
Query: AUTO_ASMA — 0 hit(s)
No BM25 matches.
People/places seed data
Query: people_places — 0 hit(s)
No BM25 matches.
Publish / site
Query: Quartz — 4 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.94 | research.md | Quran corpus — research & build plan | @@ -3,4 @@ (2 before, 275 after) description: End-to-end plan to fetch, organize, Atlas entity work, categorize, tag, hash, and index the full Quranic corpus in this vault (wikilinked). tags: [quran, research, pipelin… |
| 0.92 | qmd-pipeline-gaps.md | QMD second pass — BM25 gap probes | @@ -9,4 @@ (8 before, 169 after) Local index pass using qmd BM25 search over collection graphelogos-quran. Complements fetch/Atlas/entity scripts and Quartz by surfacing where t… |
| 0.92 | research.md | Quran research | @@ -9,4 @@ (8 before, 7 after) - RESEARCH\|RESEARCH — master plan (fetch → Atlas → Quartz) - Literary structures overview\|Literary structures overview — surah-level rhetoric … |
| 0.87 | surahs.md | Surahs in this vault | @@ -64,4 @@ (63 before, 4 after) - Juz literary overview\|Juz — literary overview — ajzāʾ as reading grid vs surah-level rhetoric; ḥizb/maqraʾ; Juz ʿAmma - RESEARCH\|RESEARCH —… |
Entity sidecar schema
Query: schema_version — 0 hit(s)
No BM25 matches.
How to regenerate
uv run .dev/scripts/quran_qmd_gap_pass.py
# optional: re-index everything first
# uv run .dev/scripts/quran_qmd_gap_pass.py --reindex

Total BM25 rows listed above: 50 (probes may overlap the same note).
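
The row total can be cross-checked against the report text itself. The sketch below is a hypothetical helper, not part of `quran_qmd_gap_pass.py`. It assumes each probe starts with a line of the form `Query: <text> <separator> <N> hit(s)` and that data rows are table lines whose first non-empty cell is a numeric score. It returns, per probe, the declared hit count and the number of rows actually listed.

```python
import re

# "Query: <text> <one separator character> <N> hit(s)" on a line of its own.
QUERY_LINE = re.compile(r"^Query:\s*(.+?)\s+\S\s+(\d+) hit\(s\)\s*$")


def is_data_row(line: str) -> bool:
    """True for table rows whose first non-empty cell parses as a score."""
    if not line.lstrip().startswith("|"):
        return False
    cells = [c.strip() for c in line.split("|") if c.strip()]
    try:
        float(cells[0])
    except (IndexError, ValueError):
        return False  # header row, separator row, or empty row
    return True


def probe_counts(report: str) -> list:
    """Return (query, declared_hits, listed_rows) for each probe in the report."""
    results = []
    current = None
    for line in report.splitlines():
        match = QUERY_LINE.match(line.strip())
        if match:
            current = [match.group(1), int(match.group(2)), 0]
            results.append(current)
        elif current is not None and is_data_row(line):
            current[2] += 1
    return [tuple(entry) for entry in results]


def total_declared(report: str) -> int:
    """Sum the declared hit counts across all probes."""
    return sum(declared for _, declared, _ in probe_counts(report))
```

A probe whose declared and listed counts differ usually points at a malformed row (for example an unescaped `|` in a snippet) and not at a search problem.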