Quran Muaalem Overview
Quran Muaalem is the inference layer that compares a recitation to a phonetic reference and emits multi‑level outputs: phonemes and tajweed‑related attributes for each phoneme group.
Primary entry points:
- Muaalem class in src/quran_muaalem/inference.py
- Gradio UI in src/quran_muaalem/gradio_app.py
Why this matters for researchers
- The model outputs attribute‑level labels in addition to phonemes.
- This enables fine‑grained evaluation beyond standard ASR metrics.
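As a rough illustration of what attribute-level evaluation could look like, the sketch below computes a phoneme error rate alongside a per-attribute accuracy. It is not part of the package: the function names, the attribute names (voicing, emphasis), and the assumption that phoneme groups are already aligned one-to-one are all hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def attribute_accuracy(ref_attrs, hyp_attrs):
    """Per-attribute accuracy over phoneme groups assumed to be aligned one-to-one."""
    totals, hits = {}, {}
    for ref, hyp in zip(ref_attrs, hyp_attrs):
        for name, expected in ref.items():
            totals[name] = totals.get(name, 0) + 1
            hits[name] = hits.get(name, 0) + (hyp.get(name) == expected)
    return {name: hits[name] / totals[name] for name in totals}


# Hypothetical data: one substituted phoneme and one wrong attribute value.
per = phoneme_error_rate(["b", "i", "s", "m"], ["b", "i", "z", "m"])
acc = attribute_accuracy(
    [{"voicing": "voiced", "emphasis": "light"},
     {"voicing": "unvoiced", "emphasis": "light"}],
    [{"voicing": "voiced", "emphasis": "light"},
     {"voicing": "voiced", "emphasis": "light"}],
)
```

Reporting the two numbers separately shows whether an error lies in the phoneme identity or in how the phoneme was articulated, which a single ASR-style metric would hide.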
Core inference flow
Inside Muaalem.__call__:
- Tokenize the phonetic reference with MultiLevelTokenizer.
- Featurize audio with AutoFeatureExtractor.
- Run the Wav2Vec2BertForMultilevelCTC forward pass.
- Decode with phonemes_level_greedy_decode and multilevel_greedy_decode.
- Assemble Sifa objects and return MuaalemOutput.
Note: The phonetic reference is generated by quran_transcript.quran_phonetizer.
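The decode step in the flow above builds on greedy CTC decoding. The sketch below shows the generic technique only (take the best class per frame, collapse repeats, drop blanks). It is not the project's phonemes_level_greedy_decode: the function name, the blank id of 0, and the toy frame probabilities are assumptions made for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_probs, blank_id=0):
    """Generic greedy CTC decode over per-frame class probabilities.

    frame_probs: list of per-frame probability lists, one entry per class.
    blank_id: index of the CTC blank class (assumed to be 0 here).
    """
    # 1. Pick the most probable class in every frame.
    best_ids = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    # 2. Collapse consecutive repeats of the same class.
    collapsed = [class_id for class_id, _ in groupby(best_ids)]
    # 3. Remove blanks; a blank between two equal ids keeps both.
    return [class_id for class_id in collapsed if class_id != blank_id]


# Toy input with three classes: 0 = blank, 1 and 2 = phoneme ids.
frames = [
    [0.10, 0.80, 0.10],  # class 1
    [0.20, 0.70, 0.10],  # class 1 again (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.70, 0.20],  # class 1 after a blank (kept as a new token)
    [0.10, 0.20, 0.70],  # class 2
]
decoded = ctc_greedy_decode(frames)  # [1, 1, 2]
```

In a multi-level setting, the same collapse could be applied per output level, but how the levels are aligned to phoneme groups is specific to the project's decode module and is not shown here.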
Practical constraints
- Required sample rate: 16 kHz.
- Output quality depends on the reference phonetic script and recording quality.
- probs are raw softmax values and are not calibrated by default.
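Because the probabilities are uncalibrated, thresholding them directly can be misleading. One common post-hoc option is temperature scaling, sketched below on a plain list of probabilities. This is not a feature of the package: the function name is hypothetical, and the temperature of 2.0 is arbitrary; in practice it would be fitted on held-out labeled data.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a probability vector as softmax(log(p) / temperature).

    temperature > 1 softens the distribution, temperature < 1 sharpens it,
    and temperature == 1 leaves it unchanged (up to rounding).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space; clamp to avoid log(0).
    scaled = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


raw = [0.90, 0.05, 0.05]
softened = apply_temperature(raw, temperature=2.0)  # arbitrary example value
```

Temperature scaling keeps the ranking of classes, so the decoded labels stay the same and only the reported confidence changes.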
Where to go next
- Python API for input/output details and examples.
- Outputs for the full output schema.
- Architecture for model internals.
Key files
- src/quran_muaalem/inference.py: model class and inference path.
- src/quran_muaalem/decode.py: decoding and alignment.
- src/quran_muaalem/muaalem_typing.py: output dataclasses.
- src/quran_muaalem/gradio_app.py: UI wiring and Moshaf settings.