Feature Request: Latin/Spanish OCR model + MeloTTS Spanish for MaixCAM2
Platform: MaixCAM2 / MaixPy v4
Context
I am building an assistive vision system for blind and visually impaired users on MaixCAM2. The device uses Depth-Anything-V2 for obstacle detection, YOLO11n for object identification, SmolVLM for scene description, and PP-OCR for reading signs and text. All audio feedback is delivered through MeloTTS. The target users are Spanish speakers in Latin America.
Request 1: Latin/Spanish PP-OCR recognition model
The current pp_ocr_en.mud model does not recognize Spanish text correctly. It misreads Ñ, accented vowels (á, é, í, ó, ú), and common Spanish character combinations.
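To illustrate the problem, here is a minimal sketch that lists which characters in a Spanish sample a recognizer could never output. It assumes the English recognition dictionary is limited to printable ASCII; I have not checked the actual dictionary shipped with pp_ocr_en.mud, and the names ENGLISH_CHARSET and uncovered_chars are only for this example.

```python
import string

# Assumption: the English recognition dictionary covers only printable ASCII
# (digits, letters, punctuation, space). The real dictionary may differ.
ENGLISH_CHARSET = set(
    string.digits + string.ascii_letters + string.punctuation + " "
)


def uncovered_chars(text, charset=ENGLISH_CHARSET):
    """Return the sorted, de-duplicated characters of `text` not in `charset`."""
    return sorted({ch for ch in text if ch not in charset})


# Hypothetical sign text of the kind the device needs to read aloud.
sample = "¿Dónde está el baño? Señal de precaución: PELIGRO, NIÑOS"
missing = uncovered_chars(sample)
print(missing)
```

Any character in that list has no output class in an ASCII-only recognizer, so the model has to substitute a visually similar letter (for example N for Ñ), which then changes what MeloTTS reads out.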
There is already a pre-converted ONNX model available at:
https://huggingface.co/docato/PaddleOCR_Mobile_Models (file: latin_PP-OCRv3_mobile_rec_infer.onnx)
This model covers Latin-script languages including Spanish, French, Portuguese, etc. Since you already have the Pulsar2 pipeline set up for PP-OCR models on AX630, compiling this to .axmodel + .mud should follow the same process as the existing English model.
Request: Please provide pp_ocr_latin.mud (or similar) in the MaixCAM2 model zoo, using the Latin PP-OCRv3 recognition model compiled for AX630.
Request 2: MeloTTS Spanish model
The current melotts-zh.mud only supports Chinese and basic English. There is no Spanish TTS option available on MaixCAM2.
The upstream Spanish model exists at:
https://huggingface.co/myshell-ai/MeloTTS-Spanish
You already have the full conversion pipeline done for the Chinese model (melotts-zh.mud). The Spanish model uses the same MeloTTS architecture, so porting it should follow the same ONNX → INT8 quantization → .axmodel + .mud process.
Request: Please provide melotts-es.mud in the MaixCAM2 model zoo, following the same structure as melotts-zh.mud.
Why this matters
Spanish is spoken by more than 500 million people, and Latin America is a large potential user base for accessibility and assistive technology applications. Both additions would make MaixCAM2 significantly more useful in non-English-speaking markets, and since the conversion pipelines already exist, the additional work should be modest.
Thank you for the excellent work on MaixPy and MaixCAM2!