Yes – the training code and pruning recipes will be released on GitHub soon. Users can then train these models on their own datasets and at different input image resolutions.
Our Developer-Friendly Kit is Coming Soon
Our most advanced edge-AI development kit is now in its final stages of development and will be available soon. It is designed for Physical AI applications across autonomous systems, industrial automation, and smart city infrastructure, and is ideal for next-generation development.
Register Interest in a DevKit Now ➜
FAQs
Find quick answers to common questions
Choose models based on your task category.
- Image classification: ResNet, EfficientNetV2 — assign a single label to an image.
- Object detection: YOLOX, RTMDet — locate and classify multiple objects.
- Segmentation: DeepLabv3+, TopFormer — assign a class to each pixel.
- Vision-Language Models (VLMs): OWL-ViT, LLaVA OneVision — enable multimodal reasoning and can be applied across classification, detection, and segmentation tasks.
The model garden provides curated recommendations per task, including compute and memory requirements to help match your silicon budget. VLMs are more resource-intensive, but offer strong generalization and often work without task-specific training.
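As a concrete illustration of the detection path, the sketch below runs a YOLOX-style detector that has already been exported to ONNX. The file name yolox_s.onnx, the sample image, the 640×640 input size, and the raw output layout are assumptions made for illustration; the model card in the model garden defines the actual input contract for each model.

```python
# Minimal sketch, assuming a detection model already exported to ONNX.
# "yolox_s.onnx", "street.jpg" and the 640x640 input size are placeholders.
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("yolox_s.onnx")
input_name = session.get_inputs()[0].name

image = Image.open("street.jpg").convert("RGB").resize((640, 640))
tensor = np.asarray(image, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0

outputs = session.run(None, {input_name: tensor})   # raw, un-decoded predictions
print([o.shape for o in outputs])                   # inspect the output layout
```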
Yes – every model in the garden comes with a pre-validated runtime package, including compatible ONNX/quantized models, pre-processing pipelines (resize, normalization, tokenization), and post-processing steps (NMS, depth scaling, keypoint decoding, etc.).
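To make those two stages concrete, here is a minimal NumPy sketch of a typical detector pipeline: normalization on the way in and greedy non-maximum suppression (NMS) on the way out. The ImageNet mean/std values and the [x1, y1, x2, y2] box format are assumptions; the packaged pipelines ship the exact parameters for each model.

```python
import numpy as np

def preprocess(rgb_hwc_uint8,
               mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Normalize an already-resized HxWx3 uint8 RGB image into an NCHW float32 tensor."""
    x = rgb_hwc_uint8.astype(np.float32) / 255.0
    x = (x - np.array(mean)) / np.array(std)
    return x.transpose(2, 0, 1)[None].astype(np.float32)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes; returns kept indices."""
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        # IoU of the kept box against all remaining candidates.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        # Drop candidates that overlap the kept box above the threshold.
        order = order[1:][iou < iou_thresh]
    return keep
```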
Quantization (INT8, W4A8, mixed precision, etc.) and unstructured sparsity reduce memory use and latency, but can introduce accuracy losses, typically negligible, depending on the model and data distribution. The models in the model garden are curated by Ambarella, with data formats and pruning budgets balanced to recover accuracy. Model accuracy can be verified directly on the boards.
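For reference, the snippet below shows generic INT8 post-training quantization with onnxruntime's quantization utilities. It is only an illustration of the data-format tradeoff described above, not the Ambarella conversion flow, and the model file names are placeholders.

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Convert the FP32 weights of a (hypothetical) ONNX model to INT8.
quantize_dynamic(
    model_input="model_fp32.onnx",    # placeholder FP32 model
    model_output="model_int8.onnx",   # quantized output
    weight_type=QuantType.QInt8,
)
```

Whichever toolchain produces the quantized model, re-run your evaluation set and compare against the FP32 baseline before settling on a data format and pruning budget.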
For VLMs like OWL-ViT, LongCLIP, or LLaVA OneVision, prompts should be explicit, structured, and task-specific (e.g., “Describe the objects in this image,” “Locate all emergency vehicles,” “Answer the question based only on the image”). Accuracy can be evaluated using standardized benchmarks (VQA, COCO retrieval, phrase grounding, open-vocabulary detection) or domain-specific metrics such as answer correctness, grounding precision, or retrieval recall. The SDK includes evaluation utilities to run these tests locally on the silicon for consistent measurement.
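As an example of prompt-driven, open-vocabulary detection, the sketch below prototypes OWL-ViT on a host machine with the Hugging Face transformers library. The public google/owlvit-base-patch32 checkpoint, the image file, and the prompt list are assumptions for illustration; on-target accuracy should still be measured with the SDK's evaluation utilities.

```python
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("intersection.jpg").convert("RGB")       # placeholder image
texts = [["an ambulance", "a fire truck", "a police car"]]  # task-specific prompts

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Rescale boxes back to the original image size and apply a score threshold.
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, threshold=0.1, target_sizes=target_sizes
)[0]
for box, score, label in zip(results["boxes"], results["scores"], results["labels"]):
    print(texts[0][label.item()], round(score.item(), 3), box.tolist())
```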
