Yes – the training code and pruning recipes will be released on GitHub soon. Users can then train these models on their own datasets and at different input image resolutions.
Our Developer-Friendly Kit is Coming Soon
Our most advanced edge-AI development kit is now in its final stages of development and will be available soon. It is designed for Physical AI applications across autonomous systems, industrial automation, and smart city infrastructure, and is ideal for next-generation development.
Register Interest in a DevKit Now ➜
FAQs
Find quick answers to common questions
Choose models based on your task category.
- Image classification: ResNet, EfficientNetV2 — assign a single label to an image.
- Object detection: YOLOX, RTMDet — locate and classify multiple objects.
- Segmentation: DeepLabv3+, TopFormer — assign a class to each pixel.
- Vision-Language Models (VLMs): OWL-ViT, LLaVA OneVision — enable multimodal reasoning and can be applied across classification, detection, and segmentation tasks.
The model garden provides curated recommendations per task, including compute and memory requirements to help match your silicon budget. VLMs are more resource-intensive, but offer strong generalization and often work without task-specific training.
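As a concrete illustration of the detection path, the sketch below runs a YOLOX-style detector that has already been exported to ONNX. The file name yolox_s.onnx, the sample image, the 640×640 input size, and the raw output layout are assumptions made for illustration; the model card in the model garden defines the actual input contract for each model.

```python
# Minimal sketch, assuming a detection model already exported to ONNX.
# "yolox_s.onnx", "street.jpg" and the 640x640 input size are placeholders.
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("yolox_s.onnx")
input_name = session.get_inputs()[0].name

image = Image.open("street.jpg").convert("RGB").resize((640, 640))
tensor = np.asarray(image, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0

outputs = session.run(None, {input_name: tensor})   # raw, un-decoded predictions
print([o.shape for o in outputs])                   # inspect the output layout
```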
Yes – every model in the garden comes with a pre-validated runtime package, including compatible ONNX/quantized models, pre-processing pipelines (resize, normalization, tokenization), and post-processing steps (NMS, depth scaling, keypoint decoding, etc.).
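To make those two stages concrete, here is a minimal NumPy sketch of a typical detector pipeline: normalization on the way in and greedy non-maximum suppression (NMS) on the way out. The ImageNet mean/std values and the [x1, y1, x2, y2] box format are assumptions; the packaged pipelines ship the exact parameters for each model.

```python
import numpy as np

def preprocess(rgb_hwc_uint8,
               mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Normalize an already-resized HxWx3 uint8 RGB image into an NCHW float32 tensor."""
    x = rgb_hwc_uint8.astype(np.float32) / 255.0
    x = (x - np.array(mean)) / np.array(std)
    return x.transpose(2, 0, 1)[None].astype(np.float32)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes; returns kept indices."""
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        # IoU of the kept box against all remaining candidates.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        # Drop candidates that overlap the kept box above the threshold.
        order = order[1:][iou < iou_thresh]
    return keep
```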
Quantization (INT8, W4A8, mixed precision, etc.) and unstructured sparsity reduce memory use and latency, but can introduce accuracy losses, typically negligible, depending on the model and data distribution. The models in the model garden are curated by Ambarella, with data formats and pruning budgets balanced to recover accuracy. Model accuracy can be verified directly on the boards.
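For reference, the snippet below shows generic INT8 post-training quantization with onnxruntime's quantization utilities. It is only an illustration of the data-format tradeoff described above, not the Ambarella conversion flow, and the model file names are placeholders.

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Convert the FP32 weights of a (hypothetical) ONNX model to INT8.
quantize_dynamic(
    model_input="model_fp32.onnx",    # placeholder FP32 model
    model_output="model_int8.onnx",   # quantized output
    weight_type=QuantType.QInt8,
)
```

Whichever toolchain produces the quantized model, re-run your evaluation set and compare against the FP32 baseline before settling on a data format and pruning budget.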
For VLMs like OWL-ViT, LongCLIP, or LLaVA OneVision, prompts should be explicit, structured, and task-specific (e.g., “Describe the objects in this image,” “Locate all emergency vehicles,” “Answer the question based only on the image”). Accuracy can be evaluated using standardized benchmarks (VQA, COCO retrieval, phrase grounding, open-vocabulary detection) or domain-specific metrics such as answer correctness, grounding precision, or retrieval recall. The SDK includes evaluation utilities to run these tests locally on the silicon for consistent measurement.
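As an example of prompt-driven, open-vocabulary detection, the sketch below prototypes OWL-ViT on a host machine with the Hugging Face transformers library. The public google/owlvit-base-patch32 checkpoint, the image file, and the prompt list are assumptions for illustration; on-target accuracy should still be measured with the SDK's evaluation utilities.

```python
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("intersection.jpg").convert("RGB")       # placeholder image
texts = [["an ambulance", "a fire truck", "a police car"]]  # task-specific prompts

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Rescale boxes back to the original image size and apply a score threshold.
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, threshold=0.1, target_sizes=target_sizes
)[0]
for box, score, label in zip(results["boxes"], results["scores"], results["labels"]):
    print(texts[0][label.item()], round(score.item(), 3), box.tolist())
```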
