GRAND CHALLENGE

LLM Software and Hardware System Co-optimization

LLM Hardware System Design

LLM-Based Analog Circuit Design Competition

LLM Software and Hardware System Co-optimization

1 Introduction

Large Language Models (LLMs) based on pre-training and Transformer technology have demonstrated outstanding performance in various downstream natural language processing tasks. However, due to concerns about data privacy and computational efficiency, achieving efficient LLM inference on the edge, especially on Arm-architecture CPUs, has emerged as a key development trend. This year’s AICAS conference organizes a general LLM performance optimization competition for Arm-based CPUs, with the goal of promoting and advancing the development of related technologies.

2 Competition Description

Participants will base their work on the Qwen2.5 large language model (LLM). Relevant methods may be proposed from multiple perspectives, combining Arm architecture hardware features with open-source software resources (e.g., hardware BF16 support, vector matrix multiplication, the Arm Compute Library) to systematically optimize and improve the inference performance of the LLM on hardware. The final score of each optimization method will be obtained through the test plan specified by the competition’s organizing committee.
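As a hedged starting point rather than an official baseline, the following Python sketch loads the public Qwen2.5-0.5B checkpoint with Hugging Face Transformers and runs BF16 inference on the CPU. The model ID, prompt, and generation settings are illustrative assumptions; competitive entries would replace this with an optimized runtime, for example one built on the Arm Compute Library or BF16-aware kernels.

    # Minimal sketch: BF16 CPU inference with Qwen2.5-0.5B via Hugging Face Transformers.
    # Checkpoint name, prompt, and generation settings are illustrative assumptions.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen2.5-0.5B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()

    prompt = "Efficient LLM inference on Arm CPUs"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))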

Qwen2.5 is the LLM officially released by Alibaba Cloud in September 2024, featuring parameter scales ranging from 0.5B to 72B. All models are pretrained on a large-scale dataset containing up to 18T tokens, providing Qwen2.5 with significantly more knowledge compared to its predecessor, Qwen2. Among these, the 0.5B-parameter general-purpose model, Qwen2.5-0.5B, is an open-source and free version. Alibaba Cloud offers global multi-channel access to the model, along with services for training, deployment, and inference.

The preliminary round does not specify a hardware platform; participating teams can independently choose the hardware platform for verification, overall performance evaluation, and the improvement of the “Ability” and “Efficiency” of their LLM optimization method. After the preliminary stage, we will select 16 teams that have submitted valid results and passed code review to enter the final round.
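For the “Efficiency” dimension, a simple self-check that teams might run while iterating is a decode-throughput measurement in tokens per second. The sketch below reuses the model and tokenizer from the previous sketch and is only an illustrative assumption, not the official scoring procedure.

    # Minimal sketch of a decode-throughput self-check (tokens per second).
    # Not the official test plan; assumes `model` and `tokenizer` from the sketch above.
    import time

    def decode_tokens_per_second(model, tokenizer, prompt, new_tokens=128):
        inputs = tokenizer(prompt, return_tensors="pt")
        start = time.perf_counter()
        output_ids = model.generate(**inputs, max_new_tokens=new_tokens, do_sample=False)
        elapsed = time.perf_counter() - start
        generated = output_ids.shape[1] - inputs["input_ids"].shape[1]
        return generated / elapsed

    print(decode_tokens_per_second(model, tokenizer, "Explain edge inference.", new_tokens=64))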

In the final round, the participating teams will use an Armv9-architecture CPU (T-Head Yitian 710) provided by the organizing committee to carry out hardware-software co-optimization of the LLM, propose targeted inference algorithms and deployment-strategy optimizations, and then evaluate and compare overall performance before and after deployment. Based on this evaluation, first, second, and third prizes as well as seven winning prizes will be awarded. Outstanding design solutions will be invited for publication in IEEE conferences or journals.

LLM Hardware System Design

1 Introduction

Large language models (LLMs) based on pre-training and Transformer technology have demonstrated outstanding performance in various downstream natural language processing tasks, such as text understanding, text generation, sentiment analysis, machine translation, and interactive question answering. However, due to concerns about data privacy and computational efficiency, achieving efficient LLM inference on the edge has emerged as a key development trend. This year’s AICAS will host a competition focused on optimizing LLM performance on FPGA+ARM platforms.

2 Competition Description

Participants will base their work on the Qwen2.5 large language model (LLM) and propose relevant methods from multiple perspectives, such as model compression, parameter sparsity, precision quantization, and structural pruning. They will leverage the on-chip ARM processor and FPGA resources to optimize the deployment of the LLM.
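As one hedged illustration of the precision-quantization direction (not a prescribed method), the Python sketch below applies PyTorch post-training dynamic INT8 quantization to the linear layers of a Qwen2.5 checkpoint; a real FPGA deployment would typically need a custom, hardware-aware quantization scheme rather than this off-the-shelf pass.

    # Minimal sketch: post-training dynamic INT8 quantization of nn.Linear layers.
    # Illustrative only; FPGA deployment would normally use a hardware-aware scheme.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")  # assumed checkpoint
    model.eval()

    quantized = torch.ao.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
    # Inspect one transformer block to confirm its Linear layers were replaced.
    print(quantized.model.layers[0].mlp)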

Qwen2.5 is the LLM officially released by Alibaba Cloud in September 2024, featuring parameter scales ranging from 0.5B to 72B. The models demonstrate strong performance across mainstream benchmark evaluation sets. All models are pretrained on a large-scale dataset containing up to 18T tokens, providing Qwen2.5 with significantly more knowledge than its predecessor, Qwen2. Among these, the 0.5B-parameter general-purpose model, Qwen2.5-0.5B-Instruct, is an open-source and free version.

Participants will work with the Qwen2.5-0.5B model, utilizing the KV260 computing platform to optimize LLM deployment. They will leverage the on-chip ARM processor and FPGA resources to achieve performance improvements. Participants are encouraged to propose methods from various perspectives, including but not limited to: FPGA-based accelerator design (required); model compression (quantization, pruning) or speculative execution; and optimization of computational scheduling (improving data reuse and pipelining).
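To illustrate the computational-scheduling point above, the following NumPy sketch shows blocked (tiled) matrix multiplication, the same loop restructuring an FPGA accelerator uses to improve on-chip data reuse. The tile size here is an arbitrary assumption; on hardware it would be matched to on-chip buffer capacity.

    # Minimal sketch: blocked (tiled) matrix multiplication to illustrate data reuse.
    # The tile size is an arbitrary assumption, not a recommended hardware parameter.
    import numpy as np

    def tiled_matmul(a, b, tile=64):
        m, k = a.shape
        k2, n = b.shape
        assert k == k2
        c = np.zeros((m, n), dtype=a.dtype)
        for i0 in range(0, m, tile):
            for j0 in range(0, n, tile):
                for k0 in range(0, k, tile):
                    # Each (i0, j0) output tile stays resident while k-tiles stream through,
                    # which is the reuse pattern an accelerator exploits with on-chip buffers.
                    c[i0:i0+tile, j0:j0+tile] += (
                        a[i0:i0+tile, k0:k0+tile] @ b[k0:k0+tile, j0:j0+tile]
                    )
        return c

    a = np.random.rand(256, 320)
    b = np.random.rand(320, 192)
    assert np.allclose(tiled_matmul(a, b), a @ b)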

In the preliminary round, participants will remotely access the Kria KV260 board through interfaces provided by the competition committee for the verification, testing, and performance evaluation of their accelerator systems.

Teams advancing to the final round will receive increased daily access to the cloud platform, where they will further refine and optimize their designs. They will then evaluate the overall performance and compare results before and after deployment. Ultimately, the competition will award first, second, and third prizes to one team each, along with seven winning prizes. Outstanding designs may be invited for publication in IEEE conferences or journals.

LLM-Based Analog Circuit Design Competition

1 Introduction

Large Language Model (LLM)-based solutions have demonstrated remarkable potential in digital circuit design, promoting the evolution of AI-driven solutions from circuit optimization to full design automation. However, analog circuit design presents unique challenges due to its inherently complex, non-linear characteristics and strict precision requirements. This year’s AICAS will host a competition focused on integrating LLMs into analog circuit design workflows, advancing intelligent and automated design paradigms.

2 Competition Description

Participants will design and build AI agents for analog circuit design using the Qwen2.5 series of large language models. The goal is to autonomously complete the design of transistor-level operational amplifier circuits using the open-source SkyWater SKY130 process design kit (PDK), ensuring that the performance specifications are satisfied. The submitted designs will be scored based on the performance of the generated circuits.
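As a hedged sketch of one possible agent step (every identifier below is a hypothetical placeholder, not a required interface), the Python code formats a specification into a prompt for a Qwen2.5 model, extracts the SPICE netlist from the reply, and writes it to a .sp file for later simulation against the SKY130 PDK.

    # Hedged sketch of a single agent step: prompt an LLM for a SPICE netlist and save it.
    # `query_llm` is a hypothetical placeholder for whatever Qwen2.5 interface a team uses
    # (local Transformers pipeline, served endpoint, etc.); it must return the model's text.
    import re

    def build_prompt(spec: dict) -> str:
        lines = [f"- {name}: {value}" for name, value in spec.items()]
        return (
            "Design a transistor-level operational amplifier in the SkyWater SKY130 process.\n"
            "Target specifications:\n" + "\n".join(lines) + "\n"
            "Return only a SPICE netlist enclosed in ```spice ... ``` fences."
        )

    def extract_netlist(reply: str) -> str:
        match = re.search(r"```spice\s*(.*?)```", reply, re.DOTALL)
        return match.group(1).strip() if match else reply.strip()

    def design_step(spec: dict, query_llm) -> str:
        netlist = extract_netlist(query_llm(build_prompt(spec)))
        with open("opamp_candidate.sp", "w") as f:
            f.write(netlist + "\n")
        return netlist

    # Example usage (specification values are illustrative assumptions):
    # design_step({"DC gain": ">= 60 dB", "GBW": ">= 10 MHz", "Phase margin": ">= 60 deg"},
    #             query_llm=my_qwen_call)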

The Qwen2.5 series of large language models is developed by Alibaba; through optimized architectures and improved inference efficiency, these models offer strong capabilities for applications such as text generation, question answering, and dialogue systems.

The competition track is divided into two stages: a preliminary round and a final round. In both stages, participants are required to use the Qwen2.5 series (models of 7B parameters or below) to build intelligent agents.

During the preliminary round, no specific hardware platform is provided, and each team can freely select a hardware platform to design, deploy, and optimize its intelligent agents. Participants must generate SPICE netlists that meet the given performance metrics and submit them for evaluation. After the preliminary round, 16 teams will be selected for the final stage based on their scores, following code review. Additionally, the top 10 teams from the preliminary round will be invited to participate in an offline workshop.

In the final round, all teams will use the cloud computing platform provided by the organizing committee, based on an Armv9-architecture CPU (T-Head Yitian 710), to deploy their intelligent agents for large-scale analog circuit models. Participants are required to generate circuit netlists that meet additional performance specifications. Automated testing scripts will be used to evaluate the submissions from all teams. Based on the final rankings, the top three teams will be awarded prizes and invited to the AICAS 2025 conference for the final defense.

The Grand Challenge Organizing Committee

Bo LI

Xidian University

Li DU

Nanjing University

Liang CHANG

University of Electronic Science and Technology of China

Wei MAO

Xidian University

Yongfu LI

Shanghai Jiao Tong University

Yuan DU

Nanjing University

Zhezhi HE

Shanghai Jiao Tong University

Xiaohan MA

Alibaba Group

Guosheng YU

Alibaba Group

Evens PAN

Arm China

David BIAN

Arm China