Feature Request: Latin/Spanish OCR model + MeloTTS Spanish for MaixCAM2

## Feature Request: Latin/Spanish OCR model + MeloTTS Spanish for MaixCAM2

**Platform:** MaixCAM2 / MaixPy v4

---

### Context

I am building an assistive vision system for blind and visually impaired users on MaixCAM2. The device uses Depth-Anything-V2 for obstacle detection, YOLO11n for object identification, SmolVLM for scene description, and PP-OCR for reading signs and text. All audio feedback is delivered through MeloTTS. The target users are Spanish speakers in Latin America.

---

### Request 1: Latin/Spanish PP-OCR recognition model

The current `pp_ocr_en.mud` model does not recognize Spanish text correctly. It misreads Ñ, accented vowels (á, é, í, ó, ú), and common Spanish character combinations.
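
As a rough illustration of the problem, the sketch below lists the characters in a few Spanish sign texts that fall outside a plain printable-ASCII character set. The ASCII set is only an assumption standing in for the English model's dictionary (the actual dictionary behind `pp_ocr_en.mud` may differ), and `ASSUMED_EN_CHARSET` and `uncovered_chars` are hypothetical names used just for this example.

```python
import string
import unicodedata

# ASSUMPTION: the English recognition dictionary is roughly printable ASCII.
# The real dictionary shipped with pp_ocr_en.mud may contain more or fewer symbols.
ASSUMED_EN_CHARSET = frozenset(
    string.ascii_letters + string.digits + string.punctuation + " "
)


def uncovered_chars(text, charset=ASSUMED_EN_CHARSET):
    """Return the distinct characters of `text` that are missing from `charset`.

    The text is normalized to NFC first so that an accented letter is compared
    as a single code point rather than as a base letter plus combining mark.
    """
    normalized = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in normalized if ch not in charset})


if __name__ == "__main__":
    signs = [
        "SALIDA DE EMERGENCIA",
        "BAÑOS PÚBLICOS",
        "¿Dónde está la estación?",
    ]
    for sign in signs:
        print(f"{sign!r}: missing {uncovered_chars(sign)}")
```

A recognizer cannot output a character that is absent from its dictionary, so any sign containing one of the listed characters will be misread under this assumption, however clear the image is.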

There is already a pre-converted ONNX model available at:
https://huggingface.co/docato/PaddleOCR_Mobile_Models — `latin_PP-OCRv3_mobile_rec_infer.onnx`

This model covers Latin-script languages including Spanish, French, Portuguese, etc. Since you already have the Pulsar2 pipeline set up for PP-OCR models on AX630, compiling this to `.axmodel` + `.mud` should follow the same process as the existing English model.

**Request:** Please provide `pp_ocr_latin.mud` (or similar) in the MaixCAM2 model zoo, using the Latin PP-OCRv3 recognition model compiled for AX630.

---

### Request 2: MeloTTS Spanish model

The current `melotts-zh.mud` supports only Chinese and basic English, so there is no Spanish TTS option on MaixCAM2.

The upstream Spanish model exists at:
https://huggingface.co/myshell-ai/MeloTTS-Spanish

You have already built the full conversion pipeline for the Chinese model (`melotts-zh.mud`). The Spanish model uses the same MeloTTS architecture, so porting it should follow the same ONNX → INT8 quantization → `.axmodel` + `.mud` process.

**Request:** Please provide `melotts-es.mud` in the MaixCAM2 model zoo, following the same structure as `melotts-zh.mud`.

---

### Why this matters

Spanish is spoken by 500+ million people, and Latin America represents a huge potential user base for accessibility and assistive technology applications. Both additions would make MaixCAM2 significantly more useful in non-English-speaking markets, with very little additional work given that the pipelines already exist.

Thank you for the excellent work on MaixPy and MaixCAM2!

Feature Request: Latin/Spanish OCR model + MeloTTS Spanish for MaixCAM2 #196