logo

Amazon Nova Pro

Question answering
Text generation
Zero-shot classification
Summarization
Conversational
Image classification
Text classification
Custom ontology

Amazon Nova Pro is a highly capable multimodal model that combines accuracy, speed, and cost for a wide range of tasks. 

The capabilities of Amazon Nova Pro, coupled with its focus on high speeds and cost efficiency, makes it a compelling model for almost any task, including video summarization, Q&A, mathematical reasoning, software development, and AI agents that can execute multistep workflows. 

In addition to state-of-the-art accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction following and agentic workflows as measured by Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web.


Intended Use

  • Multimodal Processing: It can process and understand text, images, documents, and video, making it well suited for applications like video captioning, visual question answering, and other multimedia tasks.

  • Complex Language Tasks: Nova Pro is designed to handle complex language tasks with high accuracy, such as deep reasoning, multi-step problem solving, and mathematical problem-solving.

  • Agentic Workflows: It powers AI agents capable of performing multi-step tasks, integrated with retrieval-augmented generation (RAG) for improved accuracy and data grounding.

  • Customizable Applications: Developers can fine-tune it with multimodal data for specific use cases, such as enhancing accuracy, reducing latency, or optimizing cost.

  • Fast Inference: It’s optimized for fast response times, making it suitable for real-time applications in industries like customer service, automation, and content creation.


Performance

Amazon Nova Pro provides high performance, particularly in complex reasoning, multimodal tasks, and real-time applications, with speed and flexibility for developers.


Limitations

  1. Domain Specialization: While it performs well across a variety of tasks, it may not always be as specialized in certain niche areas or highly specific domains compared to models fine-tuned for those purposes.

  2. Resource-Intensive: As a powerful multimodal model, Nova Pro can require significant computational resources for optimal performance, which might be a consideration for developers working with large datasets or complex tasks.

  3. Training Data: Nova Pro's performance is highly dependent on the quality and diversity of the multimodal data it's trained on. Its performance in tasks involving complex or obscure multimedia content might be less reliable.

  4. Fine-Tuning Requirements: While customizability is a key feature, fine-tuning the model for very specific tasks or datasets might still require considerable effort and expertise from developers.


Citation

https://www.amazon.science/publications/the-amazon-nova-family-of-models-technical-report-and-model-card