QMD second pass — BM25 gap probes
Local index pass using qmd BM25 search over the `graphelogos-quran` collection. It complements the fetch/Atlas/entity scripts and Quartz by surfacing where the vault talks about gaps, stubs, review backlog, and tooling.
Preconditions
- Vault path: `Graphe/Quran` (6,702 markdown files on disk).
- qmd collection `graphelogos-quran` must exist (this script runs `qmd collection add` if missing).
- After large edits, refresh the index: `qmd update` (re-indexes all collections; resolve any `EPERM` on optional collections like `~/Documents`).
- Optional: `qmd embed` for vectors, then `qmd vsearch` / `qmd query` locally (hybrid needs LLM; often disabled when `CI=true`).
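The preconditions above can be checked from a script before running the probes. This is a minimal sketch, assuming the vault is a plain directory tree and that `qmd` is on `PATH`; the helper names `count_markdown_files`, `refresh_commands`, and `run_refresh` are hypothetical and are not part of `quran_qmd_gap_pass.py`. Only the `qmd update` and `qmd embed` invocations named in the list are used, with no extra flags assumed.

```python
from pathlib import Path
import subprocess


def count_markdown_files(vault: Path) -> int:
    """Count .md files under the vault, recursively."""
    return sum(1 for p in vault.rglob("*.md") if p.is_file())


def refresh_commands(with_vectors: bool = False) -> list[list[str]]:
    """Return the qmd commands to run after large edits.

    `qmd update` re-indexes all collections; `qmd embed` is the optional
    vector step. No other flags are assumed here.
    """
    commands = [["qmd", "update"]]
    if with_vectors:
        commands.append(["qmd", "embed"])
    return commands


def run_refresh(with_vectors: bool = False) -> None:
    """Run each command in order and stop on the first failure."""
    for cmd in refresh_commands(with_vectors):
        subprocess.run(cmd, check=True)
```

Calling `count_markdown_files(Path("Graphe/Quran"))` from the repository root should reproduce the file count quoted above if the vault path is as assumed; `run_refresh()` is kept separate so the count can be checked without touching the index.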
Probe results
Explicit gap / leverage language
Query: gap — 1 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.79 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 213 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
Fetch coverage / missing surahs
Query: not fetched — 12 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.88 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 213 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
| 0.87 | surahs.md | Surahs | @@ -8,4 @@ (7 before, 13 after) This folder holds one markdown file per surah (Surah NNN - Name.md), fetched/updated with .dev/scripts/fetch_quran.py. Each verse is a ### Ayah k section so other notes can t… |
| 0.86 | index.md | Quran | @@ -12,4 @@ (11 before, 1 after) - RESEARCH — full pipeline plan (fetch → Atlas → Quartz) - Research notes — literary overviews, Juz rhetoric, entity pilot Surah text files: [Surah… |
| 0.85 | index.md | Ayah notes | @@ -2,4 @@ (1 before, 14 after) title: “Ayah index” description: “One note per verse (6,236 files); each embeds a block from the surah source file.” tags: - quran |
| 0.85 | index.md | Juz (ajzāʾ) | @@ -11,4 @@ (10 before, 37 after) Thirty roughly equal parts of the mushaf (boundaries from the Quran.com API); they structure reading and memorization, not thematic “chapters.… |
| 0.85 | index.md | Quran research | @@ -1,4 @@ (0 before, 13 after) --- title: Quran research notes description: Literary overviews, entity pilot, and links to the master RESEARCH plan. tags: [quran, research, index] |
| 0.83 | atlas.md | Quran Atlas | @@ -8,4 @@ (7 before, 45 after) Like Atlas|Torah Atlas, this collection links Divine Names, People, Places, and Books (scriptural book entities such as Tawrat and Injil) so reading… |
| 0.83 | surahs.md | Surahs in this vault | @@ -7,4 @@ (6 before, 62 after) Arabic–English surah files live under Graphe/Quran/Surahs/ (one .md per surah); see Surahs|Surahs folder note for how that directory relates to Ayah and… |
| 0.82 | surah-111-al-masad.md | Surah 111: Al-Masad | @@ -34,4 @@ (33 before, 34 after) His wealth will not avail him or that which he gained. |
| 0.82 | surah-105-al-fil.md | Surah 105: Al-Fil | @@ -25,4 @@ (24 before, 43 after) Have you not considered, [O Muḥammad], how your Lord dealt with the companions of the elephant?1 |
| 0.82 | surah-109-al-kafirun.md | Surah 109: Al-Kafirun | @@ -35,4 @@ (34 before, 43 after) I do not worship what you worship. |
| 0.82 | surah-107-al-ma-un.md | Surah 107: Al-Ma’un | @@ -45,4 @@ (44 before, 43 after) And does not encourage the feeding of the poor. |
Partial corpus / scoped fetch
Query: subset — 1 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.8 | research.md | Quran corpus — research & build plan | @@ -29,4 @@ (28 before, 213 after) Gap (highest leverage): the 114 surahs are not all present as files yet—only a subset is fetched. Downstream embeds, entity extraction, and published HTML all de… |
Entity review backlog
Query: review queue — 5 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.94 | index.md | Quran research | @@ -13,4 @@ (12 before, 1 after) - entity-corpus-summary|Entity corpus summary — full extraction counts by confidence/family - [[Graphe/Quran/Research/entity-review-queue|Entity review queue]… |
| 0.94 | atlas.md | Quran Atlas | @@ -39,4 @@ (38 before, 14 after) uv run .dev/scripts/quran_entity_pipeline.py --all-surahs --write-sidecars --write-reports uv run .dev/scripts/quran_entity_pipeline.py --all-surahs --write-summary --write-review-que… |
| 0.94 | research.md | Quran corpus — research & build plan | @@ -110,4 @@ (109 before, 132 after) 3. Confidence queue — emit summary + review queue: ```bash |
| 0.91 | schema.md | Entity sidecar schema (surah-NNN.yaml) | @@ -38,4 @@ (37 before, 62 after) | confidence | string | high, medium, or low (balanced gate default: only high auto-applied). | | review_reasons | array | Optional reasons used to queue review (`alia… |
| 0.82 | entity-review-queue.md | Quran entity review queue | @@ -1,4 @@ (0 before, 1016 after) --- title: “Quran entity review queue” description: Medium/low-confidence Atlas matches requiring review. tags: [quran, atlas, extraction, review-queue] |
Embed integrity (Phase B)
Query: broken embed — 1 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.73 | research.md | Quran corpus — research & build plan | @@ -69,4 @@ (68 before, 173 after) - Regenerate: uv run .dev/scripts/generate_quran_juz_ayah.py after any rename (uses quran_api + /chapters + /juzs). - DoD: No broken `![[Graphe/Quran/Surahs/…#Ayah … |
DoD / checklist
Query: Definition of Done — 2 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.79 | research.md | Quran corpus — research & build plan | @@ -46,4 @@ (45 before, 196 after) Each stage below lists inputs, outputs, tools, and Definition of Done (observable). --- |
| 0.38 | surah-017-al-isra.md | Surah 17: Al-Isra | @@ -761,4 @@ (760 before, 367 after) [Mention, O Muḥammad], the Day We will call forth every people with their record [of deeds] Then whoever is given his record in his right hand - those will read their records, and… |
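Some rows in the tables above are surah text files whose verse wording happens to match the query terms, such as the `surah-017-al-isra.md` row at 0.38 here and the `surah-105` to `surah-111` rows under the "not fetched" probe. Below is a minimal sketch of one way to separate those from planning notes, assuming hits are available as `(score, filename, title)` tuples. The `Hit` type, the `surah-NNN-` filename pattern, and the 0.5 threshold are illustrative assumptions, not behaviour of qmd or of the gap-pass script.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    score: float
    filename: str
    title: str


# Assumed naming pattern for surah text files, e.g. "surah-111-al-masad.md".
# Anchored at the start so "entity-scan-surah-077.md" is not excluded.
SURAH_FILE = re.compile(r"^surah-\d{3}-")


def planning_hits(hits: list[Hit], min_score: float = 0.5) -> list[Hit]:
    """Keep hits at or above min_score that are not surah text files."""
    return [
        h for h in hits
        if h.score >= min_score and not SURAH_FILE.match(h.filename)
    ]
```

Applied to the two "Definition of Done" rows, this would keep `research.md` and drop the Surah 17 row. The threshold alone would not remove the 0.82 surah rows under "not fetched", which is why the filename check is included.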
Entity pipeline & validation
Query: entity extraction — 12 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.93 | entity-validation-report.md | Entity extraction validation report | @@ -1,4 @@ (0 before, 14 after) --- title: “Entity extraction validation report” description: Structural and regression checks for Quran Atlas extraction sidecars. tags: [quran, atlas, validation] |
| 0.93 | entity-corpus-summary.md | Quran entity extraction summary | @@ -1,4 @@ (0 before, 53 after) --- title: “Quran entity extraction summary” description: Corpus-wide stats for Atlas candidate extraction. tags: [quran, atlas, extraction, summary] |
| 0.93 | entity-scan-surah-077.md | Entity scan — Surah 77 | @@ -1,4 @@ (0 before, 16 after) --- title: “Entity scan — Surah 77” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-102.md | Entity scan — Surah 102 | @@ -1,4 @@ (0 before, 16 after) --- title: “Entity scan — Surah 102” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-103.md | Entity scan — Surah 103 | @@ -1,4 @@ (0 before, 16 after) --- title: “Entity scan — Surah 103” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-109.md | Entity scan — Surah 109 | @@ -1,4 @@ (0 before, 16 after) --- title: “Entity scan — Surah 109” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-101.md | Entity scan — Surah 101 | @@ -1,4 @@ (0 before, 16 after) --- title: “Entity scan — Surah 101” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-107.md | Entity scan — Surah 107 | @@ -1,4 @@ (0 before, 16 after) --- title: “Entity scan — Surah 107” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-pilot-surah-001.md | Entity extraction pilot — Surah 1 | @@ -6,4 @@ (5 before, 33 after) # Entity extraction pilot — Surah 1 Source: Surah 001 - Al-Fatihah.md|Surah file · Generator: .dev/scripts/quran_entity_pilot.py |
| 0.93 | entity-scan-surah-086.md | Entity scan — Surah 86 | @@ -1,4 @@ (0 before, 19 after) --- title: “Entity scan — Surah 86” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-104.md | Entity scan — Surah 104 | @@ -1,4 @@ (0 before, 19 after) --- title: “Entity scan — Surah 104” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
| 0.93 | entity-scan-surah-111.md | Entity scan — Surah 111 | @@ -1,4 @@ (0 before, 19 after) --- title: “Entity scan — Surah 111” description: Candidate Atlas entity mentions with confidence tiers. tags: [quran, atlas, extraction, review] |
Stubs / thin notes (often Atlas)
Query: stub — 12 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.87 | tabuk.md | Tabūk | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | iraq.md | Iraq | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | nile.md | Nile | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | yemen.md | Yemen | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | aylah.md | Aylah | @@ -18,4 @@ (17 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | west.md | Maghrib | @@ -17,4 @@ (16 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | jordan.md | Jordan River | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | tih.md | al-Tīḥ | @@ -17,4 @@ (16 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | sham.md | al-Shām | @@ -18,4 @@ (17 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | ararat.md | Mount Judi | @@ -18,4 @@ (17 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | dead-sea.md | Dead Sea | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
| 0.87 | red-sea.md | Red Sea | @@ -15,4 @@ (14 before, 3 after) Stub from .dev/data/quran/people_places.json. Expand with surah references and links. See also |
Blockers
Query: blocker — 1 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.7 | research.md | Quran corpus — research & build plan | @@ -10,4 @@ (9 before, 232 after) Hypothesis (cycle): Finishing the 114 surah fetch plus keeping surah-hashes.json|surah-hashes.json + Quartz paths stable removes the largest blockers… |
Hash manifest
Query: surah-hashes — 0 hit(s)
No BM25 matches.
Fetch script references
Query: fetch_quran — 0 hit(s)
No BM25 matches.
Auto-generated Atlas verse markers
Query: AUTO_ASMA — 0 hit(s)
No BM25 matches.
People/places seed data
Query: people_places — 0 hit(s)
No BM25 matches.
Publish / site
Query: Quartz — 4 hit(s)
| Score | Wikilink | Title | Snippet |
|---|---|---|---|
| 0.93 | research.md | Quran corpus — research & build plan | @@ -3,4 @@ (2 before, 239 after) description: End-to-end plan to fetch, organize, Atlas entity work, categorize, tag, hash, and index the full Quranic corpus in this vault (wikilinked). tags: [quran, research, pipelin… |
| 0.93 | index.md | Quran | @@ -2,4 @@ (1 before, 11 after) title: Quran description: Entry point for the Quranic corpus in this vault (Quartz home page). tags: [quran] --- |
| 0.91 | index.md | Quran research | @@ -8,4 @@ (7 before, 6 after) - RESEARCH|RESEARCH — master plan (fetch → Atlas → Quartz) - Literary structures overview|Literary structures overview — surah-level rhetoric … |
| 0.82 | surahs.md | Surahs in this vault | @@ -65,4 @@ (64 before, 4 after) - Juz literary overview|Juz — literary overview — ajzāʾ as reading grid vs surah-level rhetoric; ḥizb/maqraʾ; Juz ʿAmma - RESEARCH|RESEARCH —… |
Entity sidecar schema
Query: schema_version — 0 hit(s)
No BM25 matches.
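Three of the zero-hit probes (`surah-hashes`, `fetch_quran`, `people_places`) use identifiers that do appear in snippets elsewhere in this report, so a literal substring scan can serve as a cross-check on the BM25 result. This is a minimal sketch, assuming the vault is a directory of UTF-8 markdown files; `literal_hits` is a hypothetical helper and does not call qmd.

```python
from pathlib import Path


def literal_hits(vault: Path, needle: str) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs where needle occurs verbatim."""
    found = []
    for path in sorted(vault.rglob("*.md")):
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                found.append((path, lineno))
    return found
```

If `literal_hits(Path("Graphe/Quran"), "fetch_quran")` returns matches while the BM25 probe reports none, the difference is more likely to come from how the index splits identifiers into tokens than from missing content, though that would need to be confirmed against the index configuration.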
How to regenerate
uv run .dev/scripts/quran_qmd_gap_pass.py
# optional: re-index everything first
# uv run .dev/scripts/quran_qmd_gap_pass.py --reindex

Total BM25 rows listed above: 51 (probes may overlap the same note).
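The row total can be recomputed from the report text itself, and the overlap between probes measured by de-duplicating on filename and title. This is a minimal sketch, assuming the four-column `| Score | Wikilink | Title | Snippet |` layout used in the tables above; `parse_rows` and `summarize` are hypothetical helpers. Snippets may contain `|` characters, so only the first three separators are treated as column breaks.

```python
def parse_rows(report: str) -> list[tuple[float, str, str, str]]:
    """Extract (score, wikilink, title, snippet) from pipe-table rows."""
    rows = []
    for line in report.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Split on the first three pipes only; the rest is the snippet.
        cells = [c.strip() for c in line[1:-1].split("|", 3)]
        if len(cells) != 4:
            continue
        try:
            score = float(cells[0])
        except ValueError:
            continue  # header or separator row
        rows.append((score, cells[1], cells[2], cells[3]))
    return rows


def summarize(report: str) -> tuple[int, int]:
    """Return (total rows, distinct notes keyed by filename and title)."""
    rows = parse_rows(report)
    unique_notes = {(wikilink, title) for _, wikilink, title, _ in rows}
    return len(rows), len(unique_notes)
```

Keying on filename plus title rather than filename alone matters here because several distinct notes share the name `index.md`.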