I build efficient, high-performance machine learning systems for resource-constrained environments. My focus is at the intersection of systems engineering and deep learning.
-
VinNet: A ternary-quantized, depthwise-separable CNN for edge-based plant disease diagnosis.
- Key Result: 6.4x memory reduction (0.225 MB footprint) while maintaining 97.4% accuracy.
- Techniques: Trained Ternary Quantization (TTQ), straight-through estimator (STE), depthwise-separable (DWS) convolutions, ONNX benchmarking.
-
GLADtoText: A dependency-free C++17 NLP engine for high-throughput text embedding and classification.
- Key Result: Sub-millisecond inference and 4k–10k examples/sec throughput.
- Techniques: Sparse matrix handling, pruning, self-attention without heavy frameworks.
- Languages: Python, C++, Golang, Kotlin (Android)
- ML & Research: PyTorch, ONNX, Model Quantization, Computer Vision
- Engineering: Linux/Unix, Git, Parallel Computing, CI/CD


