Exploring Neural Processor IP: Embedded Vision Summit 2022

Gordon Cooper

May 03, 2022 / 4 min read

From digital surveillance cameras to autonomous vehicle functions, an array of AI-driven applications is generating demand for higher-performance neural network processing at the edge. Many of today’s designs are expected to support up to 1,000 tera operations per second (TOPS), and there is every reason to believe those expectations will continue to rise.

Join the Synopsys NPU IP Deep Dive at the Embedded Vision Summit 2022

Anticipating and addressing these performance demands, Synopsys has unveiled the industry’s highest-performance neural processor IP. The Synopsys ARC® NPX6 and NPX6FS neural processing unit (NPU) IP deliver on AI application needs for real-time compute and ultra-low power consumption. Thanks to hardware and software connectivity features that enable implementation of multiple NPU instances, the IP can achieve up to 3,500 TOPS of performance on a single SoC. ARC NPX6FS NPU IP also supports ISO 26262 ASIL D compliance by meeting stringent random hardware fault detection and systematic functional safety development flow requirements, making it ideal for safety-critical applications such as those in the automotive space.
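As a back-of-the-envelope check on the multi-instance figure, the 3,500-TOPS total is consistent with replicating the 440-TOPS per-instance peak (quoted later for a single instance with sparsity enabled). This is a sketch only; the eight-instance count below is an assumption for illustration, not a configuration the source specifies:

```python
# Rough sanity check: how many NPX6 instances reach ~3,500 TOPS?
per_instance_tops = 440   # quoted per-instance peak with sparsity features
instances = 8             # hypothetical instance count, chosen to illustrate
total_tops = per_instance_tops * instances
print(total_tops)         # 3520, in line with the quoted "up to 3,500 TOPS"
```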

You can learn more about the ARC NPX family at the Embedded Vision Summit on Thursday, May 19, in Santa Clara, California, where we’ll host a Synopsys Deep Dive session from noon to 3 p.m. on how to “Optimize AI Performance and Power for Tomorrow’s Neural Network Applications.” In addition, Tom Michiels, a principal R&D engineer at Synopsys, will speak at 10:15 a.m. PST on Wednesday, May 18, on “What’s Next for Neural Networks: Will Transformers Replace RNNs and CNNs?” Also, be sure to stop by Booth #719 for product demos.

What’s Driving Performance Demands for AI Applications?

There are four key trends that are driving increased complexity and performance requirements for AI, particularly in edge devices:

  1. Evolving AI research and emerging neural networks, such as transformers for natural language processing, vision, and speech, which require more advanced hardware and software techniques.
  2. Higher definition sensors, more complex algorithms, and multiple camera arrays, all of which require more compute and memory.
  3. Automotive safety, which calls for functional safety solutions including certified, high-quality hardware and software from a trusted provider.
  4. Crowded field of AI competitors, which creates time-to-market pressures as well as a need for comprehensive software development tools and a smooth transition from GPU prototyping to NPU deployment.

The ARC NPX family is the latest in Synopsys’ line of embedded processors. Each family in the portfolio is designed for a unique purpose, from ultra-low-power IoT applications to vision and AI processing and high-performance vector digital signal processing. The ARC NPX6 processor IP is the company’s sixth-generation AI engine and was built to execute the latest, most complex neural networks, such as convolutional neural networks (CNNs) and deep-learning networks like transformers and recommenders. One instance of the IP performs at up to 250 TOPS at 1.3 GHz on 5nm processes in worst-case conditions, and new sparsity features that increase performance while decreasing energy use can push that figure up to 440 TOPS. Individual cores in the architecture scale from 4K MACs to 96K MACs, with a memory hierarchy that manages bandwidth efficiently enough to support the higher MAC counts. An optional 16-bit floating-point unit inside the neural processing hardware simplifies the transition from GPUs used in AI prototyping to high-volume power- and area-optimized SoCs.
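The headline per-instance number can be sanity-checked from the MAC count: counting a multiply-accumulate as two operations, 96K MACs at 1.3 GHz works out to roughly the quoted figure. This is a back-of-the-envelope sketch, not vendor data:

```python
# Back-of-envelope: peak throughput from MAC count and clock rate.
macs = 96 * 1024          # "96K" MAC units in the largest configuration
ops_per_mac = 2           # one multiply + one accumulate per cycle
freq_hz = 1.3e9           # 1.3 GHz on a 5nm process, worst-case conditions
tops = macs * ops_per_mac * freq_hz / 1e12
print(round(tops, 1))     # ~255.6, consistent with "up to 250 TOPS"
```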

ARC NPX6FS supports popular and emerging AI neural networks.

ARC NPX6FS NPU IP, designed for automotive functional safety, meets the random hardware fault detection and systematic functional safety development flow requirements that are critical for achieving up to ISO 26262 ASIL D compliance. Hardware safety features include:

  • Diagnostic error injection
  • Windowed watchdog timers
  • Error classification
  • Software diagnostic tests
  • Safety monitors
  • Lockstep capabilities for safety-critical modules
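To illustrate one of these mechanisms: a windowed watchdog differs from a plain timeout watchdog in that a service request arriving too early is also treated as a fault, which catches a runaway control loop that kicks the watchdog too often. A minimal, hypothetical model of that behavior (illustrative only, not Synopsys code):

```python
class WindowedWatchdog:
    """Toy model of a windowed watchdog timer (illustrative only)."""

    def __init__(self, window_open, window_close):
        self.window_open = window_open    # earliest allowed kick time
        self.window_close = window_close  # latest allowed kick time

    def kick(self, t):
        """Classify a watchdog service ('kick') arriving at time t."""
        if t < self.window_open:
            return "fault: early kick"    # software running too fast
        if t > self.window_close:
            return "fault: late kick"     # software hung or too slow
        return "ok"

wd = WindowedWatchdog(window_open=10, window_close=20)
print(wd.kick(15))  # ok
print(wd.kick(5))   # fault: early kick
```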

Both variations of the new IP are supported by the new ARC MetaWare MX Development Toolkit, which accelerates application software development through a comprehensive compilation environment with automatic neural network algorithm partitioning. The toolkit includes all of the components needed to program the ARC NPX NPU IP: tools, software development kits (SDKs), runtime software, and libraries. Its neural network SDK automatically converts neural networks that are trained using popular frameworks into optimized executable code for the NPX hardware.

Getting the Most from Complex Neural Network Models

Commercial surveillance cameras capture clear footage of wrongdoing. Digital TVs stream increasingly vivid and sharp programming. Advanced driver assistance systems (ADAS) sense the need to brake or swerve before the driver does. These applications work as well as they do largely because of complex neural network models, and those models are most effective when high-performing compute and memory resources are working behind the scenes to unlock their intelligence.

ARC NPX6 and NPX6FS NPU IP deliver the high performance and energy efficiency that today’s AI SoCs need to power an array of intelligent edge devices, including those targeted for safety-critical applications. With intelligence and, therefore, compute demands going up, it’s more important than ever to have an IP foundation that can scale up to meet the needs.

Continue Reading