CV · NLP · VLMs · Multimodal AI

Manpreet Singh Minhas

Senior Deep Learning Research Engineer

Building AI Systems with Measurable Impact

I bring end-to-end expertise across applied research, model development, deployment, and engineering systems. I build and scale CV, NLP, VLM, and multimodal AI products that improve trust-and-safety quality while driving measurable business impact.

Waterloo, Canada

LinkedIn · GitHub · Google Scholar · Email

Computer Vision (CV) · Natural Language Processing (NLP) · Vision-Language Models (VLMs) · Multimodal LLMs · Large Language Models (LLMs) · Generative AI · Image Segmentation · Object Detection · OCR · Anomaly Detection · Python · C++ · Rust · PyTorch · TensorFlow · HuggingFace · FiftyOne · Detectron2 · OpenCV · Qdrant · AWS · GCP · Docker

Impact

Quantified business and model outcomes delivered across multilingual NLP, multimodal trust & safety AI, and internet-scale retrieval systems.

Key Results

80% reduction in manual review workload

Agentic pseudo-labeler for model distillation

$1.1M

Annual cost savings from advanced multilingual NLP models

0.93 F1

Automated image pseudo-labeler using VLMs (5 datasets)

200M+ samples

Cross-platform multimodal vector search index

210K samples

High-quality CV/NLP training and holdout data curation

94% avg F1

Trust & safety model quality across 11 GARM categories

Selected Experience Highlights

Senior Deep Learning Research Engineer (CV and NLP)

ZEFR · Jun 2022 - Present

Architected multilingual trust-and-safety NLP pipelines (fil, hin, khm, kor, tur, vie), delivering +13.98% average F1 uplift and +32.1% F1 on Korean.
Built a policy-enforcement platform using Google ADK multi-agent workflows and Qdrant semantic search, cutting manual review by 80% while maintaining ~0.90 F1.
Shipped production image pseudo-labeling with Gemini reasoning + policy attribution, achieving 0.93 average F1 across five high-difficulty datasets.
Developed and deployed a custom FiftyOne plugin for interactive image sourcing in the UI, accelerating holdout and training set construction.
Led deployment of next-generation multilingual NLP models, achieving a 63% performance gain and eliminating translation dependencies, which produced $1.1M in annual savings.
Led development of ZEFR's first-generation CV models, reaching 94% average F1 across all 11 GARM categories.
Developed a multimodal fusion architecture that integrated text and image signals, replacing brittle manual business logic and threshold tuning.
Designed high-throughput vector retrieval systems for 200M+ image/text assets across YouTube, Meta, and TikTok, with internal tooling for rapid data exploration.
Established CV/NLP training infrastructure (reporting, ONNX conversion, quantization, model registry) and led curation of a 210K-sample dataset for sustained model gains.

Deep Learning and Computer Vision Research Engineer

Fugro · Mar 2020 - May 2022

Developed and deployed a MobileNet-v2 debris detection system for resource-constrained environments, packaged as a Windows service, generating $100K in annual savings.
Delivered end-to-end pavement classification and road-crack segmentation systems with C++ DLL integration, outperforming competing solutions by 25%.
Implemented object detection and tracking pipelines that reduced manual processing costs by ~35% and improved operational throughput.
Built bird's-eye-view projection workflows plus SQL-backed Python tooling with multiprocessing, accelerating data processing by 50%.
Introduced CI/CD with GitHub Actions to automate test and deployment steps, reducing release friction across CV deliverables.

Research Assistant

Vision and Image Processing Lab, University of Waterloo · Aug 2018 - Mar 2020

Researched supervised, semi-supervised, and weakly supervised anomaly detection on textured surfaces.
Developed methods spanning anomaly localization and weak-annotation learning, contributing to publications on arXiv and in VISIGRAPP and JCVIS.