Scientists develop fully connected quantum-inspired annealing system to disrupt D-Wave
In a new study [1], researchers from Tokyo University of Science (TUS), Japan, have proposed a scalable approach for a large-scale fully-coupled annealing processing system. By using an array calculator consisting of multiple coupled chips and a control chip, the scientists have been able to easily outperform modern CPUs in terms of speed and energy consumption for solving various combinatorial optimization problems in FPGAs.
There are two types of annealing processors: quantum annealers and quantum-inspired CMOS-type annealers, the latter in this paper. Once this fully connected annealing system scales up, it may overturn the only partially connected D-Wave quantum annealers.
01Fully-coupled annealing model
The annealing processor is a hardware implementation of the Ising model that can efficiently solve combinatorial optimization problems.
Combinatorial optimization problems and Ising models/annealing processors
Two types of annealing processors have been reported: (a) sparse models with adjacent spin-spin couplings and (b) fully coupled models with all spin-spin couplings. The fully coupled model is more general than the sparse model because it can map problems more easily and solve more problems per number of spins. The quantum annealer of the listed company D-Wave is the first one, and the number of quantum bits that can be interconnected in the current 5000 quantum bits Advantage is only 15.

Fully coupled model v.s. sparsely coupled model
The results of this FPGA demonstration confirm that these small amounts of communication through multi-chip operation are possible when operating as a fully coupled system. In the demonstration, a 384-rotation fully-coupled annealing processing system board was implemented using 16 first FPGA chips and one second FPGA chip. The current consumption of the entire system board is 950 mA at idle and 1050 mA when running fully-coupled annealing processing at 10 MHz. results also show that using this system it is possible to solve a 92-node graph coloring problem and a 384-node maximum cut problem. For the max-cut problem, this implementation of the system board is 584 times faster and 46 times more energy efficient than a 4-GHz CPU PC when solving the same problem.
These findings suggest that scalable models of fully coupled annealing systems can be implemented.
02Simulating Combinatorial Optimization Problems: Difficult to Implement and Scale Up
Have you ever had the problem of having to find an optimal solution from among many possible options, such as finding the fastest route to a certain place, taking into account distance and traffic? If so, you are dealing with what is formally known as a "combinatorial optimization problem". Mathematically speaking, these problems are common in the real world and arise in a variety of fields: including logistics, network routing, machine learning, and materials science.
However, large-scale combinatorial optimization problems are very computationally intensive when solved using standard computers, leading researchers to turn to other methods. One such approach is based on the "Ising model": it mathematically represents the magnetic orientation, or "spin," of atoms in ferromagnetic materials. At high temperatures, the orientation of these atomic spins is random. But as the temperature decreases, these spins line up to reach a minimum energy state, where the orientation of each spin depends on its neighboring atoms. It turns out that this process, known as "annealing," can be used to simulate combinatorial optimization problems that produce optimal solutions for the final states of the spins.
Researchers have attempted to use quantum devices to create annealing processors that mimic the behavior of spins, and to develop semiconductor devices using large-scale integration (LSI) technology. However, these systems are notoriously difficult to implement and scale up due to the sheer number of connections between the spins that need to be considered. While the use of multiple fully connected parallel chips is a potential solution to the scalability problem, it leaves a surprisingly large number of interconnects (wires) required between the chips.
0346 times more energy efficient and 584 times faster
In a recent study published in Microprocessors and Microsystems, Professor Kawahara and his colleagues demonstrate an ingenious solution to this problem. They developed a new method in which the calculation of the system's energy state is first performed in multiple fully coupled chips, forming an "array calculator.
A second type of chip, called a "controller chip," then collects the results from the remaining chips and calculates the total energy, which is used to update the value of the simulated spin. "The advantage of our method is that the amount of data transferred between the chips is extremely small," explains Professor Kawahara. "Despite its simple principle, this approach allows us to implement a scalable, fully connected LSI system that solves combinatorial optimization problems through simulated annealing."

Left: Schematic of the two chips; Right: Custom chip mounted on a "board" to form a fully coupled annealing system.

Left: System block diagram of FPGA board - prototype experimental board, where each divided function is actually implemented on a single FPGA chip. Right: System calculation steps and flowchart
The researchers successfully implemented their approach using commercial FPGA chips, which are widely used programmable semiconductor devices. They built a fully coupled annealing system with 384 rotations and used it to solve several optimization problems, including a 92-node graph coloring problem and a 384-node maximum cut problem.
Most importantly, these proof-of-concept experiments show that the proposed approach leads to real performance benefits. Compared to a standard modern CPU simulation of the same annealing system, the FPGA implementation is 584 times faster and 46 times more energy efficient in solving the maximum cut problem.

(a) Photograph of the prototype board; (b) Connection schematic.

Left: Results for the graph coloring 92-node problem; Right: (top) Ising energy transition for the maximum cut problem; (bottom) spin state results for the maximum cut 384-node problem.
04Further collaboration with companies to solve practical social problems
Now, with the successful demonstration of how their method works in an FPGA, the researchers plan to take it to the next level. "We hope to produce a custom-designed LSI chip to increase the capacity and greatly improve the performance and power efficiency of our method," says Professor Kawahara. This will allow us to achieve the performance required in the areas of materials development and drug discovery, which involve very complex optimization problems."
Finally, Professor Kawahara noted that he hopes to facilitate the implementation of the results to solve real problems in society. His research group hopes to conduct joint research with companies to bring their methods to the heart of semiconductor design technology and open the door to a renaissance of semiconductors in Japan.
Reference links:
[1]https://www.sciencedirect.com/science/article/pii/S0141933122002046?via%3Dihub
[2]https://techxplore.com/news/2022-09-scalable-fully-coupled-quantum-inspired-processor.html
