Over the last decade, artificial intelligence (AI) has opened the door to a multitude of capabilities in our everyday lives. AI is making automated transportation safer and more secure, home assistants more personalized, and entertainment more immersive. Chip designers across the world have pushed the limits of Moore's law and silicon performance while accelerating system-on-chip (SoC) design, implementation, and verification cycles.
Applications rely more and more on deep-learning neural networks to deliver valuable insights. The demand for deep learning and machine learning requires compute-intensive methodologies and robust chip designs to power these intelligent functions. The underlying silicon must be capable of performing advanced mathematics and supporting applications such as object identification, voice recognition, image super-resolution, facial recognition, and more in real time.
Recent innovations in neural network processing and deep learning algorithms are driving new hardware requirements for processing, memory, and connectivity in AI SoCs and spawning a new generation of investments in the semiconductor market. As organizations undertake major digital transformations and explore the possibilities of a “metaverse,” the demand for specialized, silicon-proven AI IP in SoCs will continue to grow in importance to reduce integration risk and speed time-to-market.
Read on to learn more about how AI is classified, the markets driving its growth, key SoC design challenges, nurturing SoC designs beyond integration, and what to expect going forward.
The term “AI” has been around since the mid-1950s, when it was first introduced as an academic discipline. In 2015, with advancements in processor technology and AI algorithms, we saw an explosion of new investments as the technology was shown to match, and in some tasks surpass, human capabilities. These investments are moving AI from mainframes to embedded applications, and development continues to evolve at a rapid pace.
AI is typically classified as either “weak” (or narrow) AI, which solves specific tasks, or “strong” AI (artificial general intelligence), which can find a solution when faced with an unfamiliar task. Existing AI systems are predominantly based on weak AI, with strong AI systems expected to emerge in the years ahead.
Most AI applications follow a process that involves three fundamental building blocks: perception, decision making, and response. Think of the Google Home or Amazon Echo device that responds to your daily “What’s the weather like today?” question. The device senses your voice, processes the request between the cloud and a local processor, and responds to you with real-time data in audio. On a broader scale, AI can be defined as the ability to recognize the environment and maximize the chances of achieving a goal.
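To make these three building blocks concrete, here is a minimal sketch of the perceive-decide-respond loop. Every function name and return value is a hypothetical placeholder, not any vendor's actual assistant API.

```python
# Minimal sketch of the perceive -> decide -> respond loop described above.
# All function names and return values here are hypothetical placeholders.

def perceive(audio_frame):
    """Turn raw sensor input (e.g., microphone audio) into a structured request."""
    # A real device would run a wake-word detector and a speech-to-text model here.
    return {"intent": "get_weather", "location": "current"}

def decide(request):
    """Process the request, locally or in the cloud, and produce a result."""
    # A real assistant would route this to a cloud service or an on-device model.
    return {"forecast": "sunny", "high_c": 24}

def respond(result):
    """Turn the decision into a user-facing response (e.g., synthesized speech)."""
    return f"Today will be {result['forecast']} with a high of {result['high_c']} degrees."

if __name__ == "__main__":
    request = perceive(audio_frame=None)  # stand-in for a real audio buffer
    print(respond(decide(request)))
```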
Over the last few years, accelerated advancements have enabled machine learning algorithms to achieve levels of recognition accuracy that, for some tasks, exceed human performance. For instance, NASA uses neural network technology to analyze data from telescopes to discover new planets, allowing engineering teams to analyze the data much faster. Using this approach, NASA found an eighth planet orbiting the star Kepler-90, located 2,545 light-years from Earth.
While the AI market is still very nascent, several non-traditional semiconductor companies are already making significant investments to get a share of the pie. System companies like Google, Facebook, Amazon, and Alibaba are designing their own custom ASIC (application-specific integrated circuit) chips to support their AI software requirements and business models – a move not many would have predicted a decade ago.
Machine vision applications are a key function driving new SoC investments. Significant advancements in machine learning that employ neural network technology have dramatically increased accuracy. Convolutional neural networks (CNNs) and other deep learning algorithms in vision applications have made AI capabilities within SoCs pervasive. Beyond vision, deep learning is also used to solve complex problems, such as 5G implementation for cellular infrastructure and simplifying 5G operational tasks through self-organizing networks (SON).
This shift of AI from academia into embedded applications is driven by advances in process technology, microprocessors, and AI algorithms. Many new functions, such as vision systems for facial detection, natural language understanding for improved human-machine interfaces, and context awareness, are being added to SoCs in all markets, including automotive, mobile, digital home, data center, and the internet of things (IoT).
The figure above shows the highest-performance end on the left, where SoC designers of cloud AI accelerators focus on maximizing performance to solve large problems. This involves executing complex AI algorithms and training methodologies, ultimately reducing costs by limiting training time and the energy used for inference. These hardware innovations replace years of development and enable faster discovery of critical solutions, such as new vaccines and drugs.
However, not all problems can be solved in the cloud, and many of these AI accelerator architectures are being modified to enable edge computing and on-device AI where we expect to see the largest growth in hardware moving forward. Today, the on-device AI category includes a wide variety of applications such as advanced driver assistance systems (ADAS), digital TVs, voice and speech recognition, and AR/VR headsets. Within this market segment, mobile devices continue to push the innovation envelope, especially when it comes to the latest process nodes.
Over the past year, the AI processing capability of mobile processors has increased from single-digit tera operations per second (TOPS) to well over 20 TOPS, a more than 4x improvement, with no end in sight for further gains in both performance and performance per watt. As processing moves closer to the point of data collection, in edge servers and plug-in accelerator cards, the top design requirement for edge device accelerators continues to be performance per unit of energy. Since edge device accelerators have limited processing and memory, the algorithms are compressed to meet power and performance requirements while maintaining the desired level of accuracy. This is becoming increasingly difficult at the edge as image sensor resolutions move from dozens of megapixels to well over 100, and bit depth moves from 14-bit to 24-bit, well beyond the ability of the human eye. Due to the algorithms, the resolution, and the sheer amount of data involved, the compute and memory required continue to increase. Not only are the algorithms compressed, but the data itself is also narrowed using region-of-interest features in software and hardware.
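As a simplified illustration of the two data-reduction techniques mentioned above, the sketch below crops a hypothetical high-resolution frame to a region of interest and quantizes 32-bit floating-point values to 8-bit integers. The frame size, window, and scale factor are illustrative assumptions, not a production edge pipeline.

```python
import numpy as np

# Simplified sketch of two edge-side data-reduction techniques:
# (1) restricting processing to a region of interest, and
# (2) quantizing 32-bit floating-point values to 8-bit integers.
# Shapes and scale factors are illustrative only.

frame = np.random.rand(2160, 3840).astype(np.float32)  # stand-in for a high-resolution sensor frame

# (1) Region of interest: process only the pixels the application cares about.
roi = frame[1000:1512, 2000:2512]                       # 512 x 512 window

# (2) Post-training-style quantization: map float32 values to int8.
scale = roi.max() / 127.0
roi_int8 = np.clip(np.round(roi / scale), -128, 127).astype(np.int8)

print("Full frame bytes:", frame.nbytes)
print("ROI int8 bytes:  ", roi_int8.nbytes)             # far less data to move and compute on
```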
Each AI market segment has different goals and challenges. Over time, integrating AI capabilities into SoCs has exposed several weaknesses in the fundamental SoC architectures used for AI. Incorporating deep learning capabilities drives architectural modifications that affect both highly specialized solutions and more general-purpose AI SoC designs.
As a result, choosing and integrating IP determines the baseline effectiveness of an AI SoC, forming the core “DNA” of the design. For instance, introducing custom processors, or arrays of processors, can speed up the extensive matrix multiplications required by AI applications.
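To see why, consider that a single fully connected neural-network layer is essentially one large matrix multiplication. The sketch below, with illustrative dimensions, shows the kind of operation that custom processors and processor arrays are built to accelerate.

```python
import numpy as np

# Sketch of why matrix multiplication dominates AI workloads: one fully
# connected layer is a single large matrix-matrix product. Dimensions are illustrative.

batch, in_features, out_features = 32, 1024, 4096

activations = np.random.rand(batch, in_features).astype(np.float32)
weights = np.random.rand(in_features, out_features).astype(np.float32)

# This single line performs 32 * 1024 * 4096 (about 134 million) multiply-accumulates;
# custom processors and processor arrays exist to accelerate exactly this operation.
outputs = np.maximum(activations @ weights, 0.0)  # matrix multiply followed by a ReLU

print(outputs.shape)  # (32, 4096)
```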
When incorporating deep learning capabilities, SoC designers encounter a range of challenges in meeting these specialized computing requirements.
With the industry continuing to generate trillions of bytes of data and demanding chips that can keep up with ever-growing computational loads, SoC designers need access to top-quality IP to achieve silicon success quickly. The element of “nurturing” the design affects how the underlying pieces of the hardware work together and how IP can be optimized to build a more effective AI SoC architecture.
Optimizing, testing, and benchmarking overall SoC performance requires advanced simulation and prototyping solutions, along with decades of expertise, to expedite AI system design. Design teams need a wide array of processors equipped with the tools to benchmark different AI graphs effectively and accelerate their path to silicon success. Additionally, nurturing the design with the necessary customizations and optimizations during the design process plays an integral role in determining the SoC's success in the market. A targeted set of embedded memories can equip designers to address the challenges of high density and low leakage with customized solutions.
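As a rough sketch of what benchmarking different AI graphs can look like at the software level, the example below times two stand-in workloads and reports average latency. The workload functions, iteration count, and metric are assumptions for illustration; a real flow would use representative models, vendor tooling, and calibrated power measurement.

```python
import time
import numpy as np

# Hypothetical sketch of benchmarking different "AI graphs" (models) on one target:
# run each workload repeatedly, then compare average latency.

def conv_like_workload():
    a = np.random.rand(256, 1024).astype(np.float32)
    b = np.random.rand(1024, 1024).astype(np.float32)
    return a @ b

def mlp_like_workload():
    a = np.random.rand(64, 4096).astype(np.float32)
    b = np.random.rand(4096, 512).astype(np.float32)
    return a @ b

def benchmark(fn, iterations=20):
    fn()  # warm-up run so one-time setup costs are excluded
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) / iterations

for name, fn in [("conv-like", conv_like_workload), ("mlp-like", mlp_like_workload)]:
    print(f"{name}: {benchmark(fn) * 1000:.2f} ms per run")
```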
Synopsys is on the leading edge of this technological transformation. The broad Synopsys DesignWare® IP portfolio delivers the specialized processing capabilities, high-bandwidth memory throughput, and reliable high-performance connectivity demanded by AI chips for mobile, IoT, data center, automotive, digital home, and other markets.
Our decades of knowledge and experience working with leading providers of AI SoCs in every market segment give us a unique opportunity to develop proven, reliable next-generation IP solutions that foster critical differentiation for tomorrow's AI needs.
As machine learning and deep learning continue to evolve, we expect the AI market to be driven by the need for faster computation, more intelligence at the edge, and the automation of more functions. With leading semiconductor suppliers and startups driving AI capabilities into scores of new SoCs and chipsets, specialized IP solutions in the form of new processing and memory architectures will become integral to the next generation of designs. Ultimately, AI will increase productivity, change how we access information, and profoundly transform the way we live in the years to come.