Challenges and Opportunities of Quantum Machine Learning


 

At the intersection of machine learning and quantum computing, quantum machine learning has the potential to accelerate data analysis, especially quantum data, with applications in quantum materials, biochemistry, and high-energy physics. However, challenges remain regarding the trainability of quantum machine learning models.

 

To address these, a Los Alamos National Laboratory (LANL) team reviews current approaches and applications of quantum machine learning in the journal Nature Computational Science.

 

 

Shannon information theory, the basis of communication technology, has been extended to quantum Shannon theory (or quantum information theory), offering the possibility of using quantum effects to make information transmission more efficient.

 

The field of biology has been extended to quantum biology for a deeper understanding of biological processes such as photosynthesis, olfaction, and enzyme catalysis. Turing's theory of universal computing has been extended to universal quantum computing, potentially leading to an exponential increase in the speed of simulating physical systems.

 

One of the most successful technologies of this century is machine learning (ML), which aims to classify, cluster, and recognize patterns in large datasets. Learning theory has been developed in parallel to understand and improve these techniques. Concepts such as support vector machines (SVMs), neural networks, and generative adversarial networks have had a profound impact on science and technology. ML is now so ingrained in society that any radical improvement to it can have huge economic benefits.

 

Like other classical theories, ML and learning theory can be embedded in a quantum mechanical form. Formally, this embedding has given rise to the field of quantum machine learning (QML), which aims to understand the ultimate limits of data analysis as allowed by the laws of physics. Indeed, the emergence of quantum computers, with the hope of achieving a quantum advantage in data analysis, is what makes QML so exciting.

 

Tasks of QML. QML is often considered for four main tasks. Top left: tensor networks are quantum-inspired classical methods for analyzing classical data. Top right: unitary time-evolution data U from quantum systems can be classically compiled into quantum circuits. Bottom left: handwritten digits can be mapped to quantum states and classified on a quantum computer. Bottom right: molecular ground-state data can be classified directly on a quantum computer; the figure shows the dependence of the ground-state energy E on the interatomic distance d.

 

Quantum computing uses entanglement, superposition, and interference to perform certain tasks dramatically, and sometimes exponentially, faster than classical computing. Although such speedups have been demonstrated for well-crafted problems, whether they can be reached in data science remains uncertain, even at the theoretical level; establishing them is one of the main goals of QML.

 

Key applications of QML. QML has been envisioned to bring computational advantages in many applications. It can enhance quantum simulations of chemistry (e.g., molecular ground states, equilibrium states, and time evolution) and materials science (e.g., quantum phase identification and generative design targeting desired properties). It can be used to learn quantum error-correcting codes and syndrome decoders, perform quantum control, and learn error suppression, and it can enhance quantum computing itself by compiling and optimizing quantum circuits. It can improve sensing and metrology and extract hidden parameters from quantum systems. Finally, QML can accelerate classical data analysis, including clustering and classification.

 

We can speculate that all the fields shown in the figure above will be affected by QML. For example, QML will likely benefit chemistry, materials science, sensing and metrology, classical data analysis, quantum error correction, and quantum algorithm design. Some of these applications generate data that is itself quantum mechanical, so it is natural to apply QML (rather than classical ML) to them.

 

While there are similarities between classical and quantum ML, there are also differences. Because QML runs on quantum computers, noise in these computers can be a major problem. This includes hardware noise, such as decoherence, as well as statistical noise from measurements of quantum states (i.e., shot noise). Both noise sources can complicate the training of QML models. In addition, because quantum transformations are linear, the nonlinear operations that come naturally in classical ML (e.g., neural activation functions) require more careful design in QML models.

 

For the QML field, the near-term goal is to demonstrate a quantum advantage in data science applications, that is, to outperform classical methods. Achieving this goal requires keeping an open mind about which applications can benefit most from QML (perhaps an application that is itself quantum mechanical). There is also a need to understand how QML methods scale to large problem sizes, including analyses of trainability (gradient scaling) and prediction error. The availability of high-quality quantum hardware will also be crucial.

 

Finally, we note that QML provides a new way of thinking about established domains such as quantum information theory, quantum error correction, and quantum foundations. Looking at these applications from a data science perspective may lead to new breakthroughs.

 

a) Classical data x, e.g., images of cats and images of dogs, are encoded into Hilbert space by some mapping x → |ψ(x)〉. Ideally, different classes of data (represented here by dots and stars) are mapped to distinct regions of Hilbert space. b) Quantum data |ψ〉 can be analyzed directly on quantum devices; here the dataset consists of states representing metallic or superconducting systems. c) The dataset is used to train QML models. Two common QML paradigms are QNNs and quantum kernels, both of which allow classification of classical or quantum data. d) Once the model is trained, it can be used to make predictions.

 

As shown above, QML can be used to learn from either classical or quantum data, so we first compare these two types of data. Classical data is ultimately encoded as bits, each of which can be in the state 0 or 1: this includes images, text, graphs, medical records, stock prices, properties of molecules, results of biological experiments, and collision traces from high-energy physics experiments. Quantum data is encoded in quantum bits, or qubits (or their higher-dimensional analogues).

 

A qubit can be in the state |0⟩, the state |1⟩, or any normalized complex linear combination of the two. Here, the states carry information obtained from some physical process: e.g., quantum sensing, quantum metrology, quantum networks, quantum control, or even quantum analog-to-digital conversion.
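
To make the superposition statement concrete, here is a minimal NumPy sketch (our illustration, not from the review): a qubit state is a normalized complex 2-vector, and the Born rule turns amplitudes into measurement probabilities.

```python
import numpy as np

# A qubit state |psi> = alpha|0> + beta|1> is a normalized complex 2-vector.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)  # any pair with |alpha|^2 + |beta|^2 = 1
psi = alpha * ket0 + beta * ket1

assert np.isclose(np.linalg.norm(psi), 1.0)    # normalization constraint
prob0 = abs(np.vdot(ket0, psi)) ** 2           # Born rule: probability of measuring 0
print(prob0)                                   # 0.5
```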

 

In principle, all classical data can be efficiently encoded in a quantum system: a classical bit string of length n can easily be encoded onto n qubits. The converse, however, is hard: one cannot efficiently encode quantum data in a system of bits, because specifying the state of a general n-qubit system requires (2^n − 1) complex numbers.
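
A short sketch of this asymmetry, assuming simple basis-state encoding (one of several possible encodings): n classical bits occupy a single basis state of n qubits, while a general n-qubit state needs a number of amplitudes exponential in n.

```python
import numpy as np

def basis_encode(bits):
    """Encode a classical bit string onto len(bits) qubits as a computational basis state."""
    index = int("".join(map(str, bits)), 2)      # e.g. [1, 0, 1] -> 5
    state = np.zeros(2 ** len(bits), dtype=complex)
    state[index] = 1.0                           # a single nonzero amplitude suffices
    return state

print(basis_encode([1, 0, 1]).shape)  # (8,): 3 bits pick one of 2**3 basis states
# The converse is hard: a *general* n-qubit state has 2**n complex amplitudes,
# so writing it down classically takes memory exponential in n.
for n in (10, 20, 30):
    print(n, 2 ** n)                  # 1024, ~1e6, ~1e9 amplitudes
```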

 

Thus, quantum bit systems (and more generally quantum Hilbert spaces) constitute the ultimate data representation medium because they can encode not only classical information, but also quantum information obtained from physical processes.

 

We hope that QML models will be able to solve learning tasks by directly accessing the quantum data stored in quantum systems. In the near future, the availability of quantum data will increase significantly: the mere fact that people will use existing quantum computers means that more quantum problems will be solved and more quantum simulations performed.

 

These computations will produce quantum datasets, so it is reasonable to expect a rapid rise in quantum data. In the short term, however, this quantum data will be stored on classical devices in the form of efficient descriptions of the quantum circuits that prepare it.

 

Finally, as our level of control over quantum technology increases, it may become possible to coherently transfer quantum information from the physical world onto digital quantum computing platforms. This would be the quantum-mechanical analogue of the main mechanism by which classical data is acquired from the physical world: analog-to-digital conversion. Furthermore, we can expect that the eventual arrival of practical quantum error correction and quantum memory will allow us to store quantum data in quantum computers.

 

Analyzing and learning data requires a parametric model, and many different models have been proposed for QML.

 

As in classical ML, several QML paradigms exist: supervised learning (task-based), unsupervised learning (data-based), and reinforcement learning (reward-based). While each of these areas is exciting and thriving in its own right, supervised learning has recently received considerable attention for its potential to achieve a quantum advantage, its resilience to noise, and its good generalization properties, which make it a strong candidate for near-term applications.

 

1) Quantum neural networks

 

The most fundamental and critical components of QML models are parameterized quantum circuits (PQCs). These consist of a series of unitary gates acting on quantum data states |ψj〉, some of which carry free parameters θ that are trained to solve the problem at hand. PQCs are conceptually similar to neural networks, and in fact the analogy can be made precise: classical neural networks can be formally embedded in PQCs. This has led researchers to refer to certain types of PQCs as QNNs. In practice, the term QNN is used whenever a PQC is employed for a data science application.
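
As a hedged illustration of what a PQC computes, the following NumPy sketch (a toy two-qubit example of our own, not an architecture from the review) applies parameterized RY rotations and a CNOT to a data state and returns the expectation value of a Pauli-Z observable, the kind of quantity one would train θ against.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

def qnn_expectation(theta, psi_in):
    """Two-qubit PQC: RY(theta[0]) x RY(theta[1]), then CNOT; measure <Z> on qubit 0."""
    U = CNOT @ np.kron(ry(theta[0]), ry(theta[1]))
    psi_out = U @ psi_in
    observable = np.kron(Z, np.eye(2))          # Z on qubit 0, identity on qubit 1
    return np.real(np.vdot(psi_out, observable @ psi_out))

psi = np.zeros(4, dtype=complex); psi[0] = 1.0  # |00> as a stand-in data state
print(qnn_expectation(np.array([0.3, 1.2]), psi))
```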

 

QNNs have applications in all three QML paradigms mentioned above.

 

- In supervised classification tasks, the goal of the QNN is to map different classes of states to distinguishable regions of Hilbert space;

 

- In an unsupervised learning setting, a clustering task can be mapped to a MaxCut problem, and a QNN is trained to maximize the distance between classes;

 

- In a reinforcement learning task, a QNN can serve as the agent's policy and be trained to maximize the expected reward.

 

Examples of QNN architectures. a) Quantum circuit diagram of a dissipative QNN. b) A standard QNN, in which quantum data states are sent through a quantum circuit at the end of which some or all qubits are measured. c) A convolutional QNN.

 

2) Quantum kernels

 

As an alternative to QNNs, researchers have proposed quantum versions of kernel methods. A kernel method maps each input to a vector in a high-dimensional vector space, called the reproducing kernel Hilbert space (RKHS), and then learns a linear function in that space. The dimension of the RKHS can be infinite, which makes kernel methods very powerful in terms of expressiveness.

 

Quantum kernel methods use a quantum computer to evaluate the kernel function, and many implementations are possible.
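
The following toy sketch (our illustration; the product-state angle encoding used here is classically simulable and so carries no quantum advantage by itself) shows the quantity a quantum kernel method would estimate, the state fidelity k(x, x') = |⟨ψ(x)|ψ(x')⟩|², and assembles a Gram matrix that a classical SVM could then consume.

```python
import numpy as np

def feature_map(x):
    """Toy angle-encoding feature map: each feature x_i sets one qubit's RY angle."""
    state = np.array([1.0 + 0j])
    for xi in x:
        qubit = np.array([np.cos(xi / 2), np.sin(xi / 2)], dtype=complex)
        state = np.kron(state, qubit)           # product state |psi(x)>
    return state

def quantum_kernel(x1, x2):
    """Fidelity kernel k(x, x') = |<psi(x)|psi(x')>|^2, the quantity a quantum
    computer would estimate (e.g., via a swap/overlap test)."""
    return abs(np.vdot(feature_map(x1), feature_map(x2))) ** 2

X = np.array([[0.1, 0.5], [1.2, 0.3], [2.0, 2.5]])
gram = np.array([[quantum_kernel(a, b) for b in X] for a in X])
print(np.round(gram, 3))    # symmetric Gram matrix; feed into a classical SVM
```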

 

3) Inductive bias

 

An important design criterion for QNNs and quantum kernel methods is their inductive bias. One route to a quantum advantage with QML is to equip the QML model with an inductive bias that classical models cannot efficiently simulate.

 

In general, the inductive bias includes any assumptions in the model design or optimization approach that bias the search for potential models toward a subset of the set of all possible models.

 

Ultimately, the inductive bias built into the ML model design, combined with the choice of training process, is key to the success or failure of an ML model. The main advantage of QML models, then, will be the ability to sample from and learn (at least in part) natively quantum-mechanical models, giving them an inductive bias that classical models lack. This discussion assumes that the dataset to be represented is quantum mechanical in nature, which is one reason researchers usually consider QML more promising for quantum data than for classical data.

 

4) Training and generalization

 

The ultimate goal of ML (classical or quantum) is to train a model to solve a specific task. Therefore, understanding the training process of a QML model is fundamental to its success.

 

In the training process, the goal is to find the parameter set θ that yields the best performance. Shot noise, hardware noise, and application-specific features often make off-the-shelf classical optimization methods perform poorly in QML training: extracting information from quantum states requires computing expectation values of observables, which in practice must be estimated from measurements on a noisy quantum computer.
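
The sketch below illustrates this measurement-induced statistical noise on a one-parameter toy circuit of our own (where ⟨Z⟩ = cos θ exactly); the gradient is estimated with the standard parameter-shift rule for Pauli rotations, and the shot budget controls how noisy the estimate is.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_expval(theta, shots):
    """Shot-noise model of measuring <Z> after RY(theta)|0>; exact value is cos(theta)."""
    p_plus = np.cos(theta / 2) ** 2                        # probability of outcome +1
    outcomes = rng.choice([1.0, -1.0], size=shots, p=[p_plus, 1 - p_plus])
    return outcomes.mean()

def parameter_shift_grad(theta, shots):
    """Parameter-shift rule for Pauli-rotation gates: unbiased, but shot-noisy."""
    return 0.5 * (estimate_expval(theta + np.pi / 2, shots)
                  - estimate_expval(theta - np.pi / 2, shots))

theta = 0.7
print("exact gradient:", -np.sin(theta))
for shots in (100, 10_000):
    print(shots, "shots:", parameter_shift_grad(theta, shots))  # noisier with fewer shots
```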

 

Recent theoretical analyses of QNNs have shown that their prediction performance is closely tied to the number of independent parameters in the QNN, and that good generalization is obtained when the amount of training data is roughly equal to the number of parameters. This raises the exciting prospect that good generalization can be achieved with only a small amount of training data.
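
A hedged sketch of the scaling behind this claim, in the spirit of recent generalization bounds for QNNs (e.g., Caro et al., 2022), with T trainable gates and N training samples:

```latex
% For a QNN with T trainable (parameterized) gates trained on N samples,
% the generalization gap obeys, roughly,
\[
  \underbrace{\mathbb{E}\,[\mathrm{loss}_{\text{new data}}] - \mathrm{loss}_{\text{training}}}_{\text{generalization gap}}
  \;\in\; \mathcal{O}\!\left(\sqrt{\frac{T \log T}{N}}\right),
\]
% so letting N grow only about linearly with T already drives the gap toward
% zero: this is the "good generalization from few data" prospect noted above.
```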

 

Heuristic fields may face periods of stagnation due to unforeseen technical challenges. Indeed, in classical ML there was a long gap between the introduction of the single perceptron and that of the multilayer perceptron (i.e., neural networks), and another between early attempts to train multilayer networks and the introduction of backpropagation.

 

Naturally, we would like QML to avoid such technical roadblocks. The obvious strategy is to identify all the challenges as quickly as possible and focus research efforts on solving them; fortunately, QML researchers have adopted this strategy.

 

a) Building a QML model requires several components and choices: the dataset (and, for classical data, the encoding scheme), the parametric model, the loss function, and the classical optimizer. b-d) Phenomena that hinder the trainability of QML models.

 

1) Embedding schemes and quantum datasets

 

Access to high-quality, standardized datasets plays a key role in advancing classical ML. Therefore, one can conjecture that such datasets are also crucial for QML.

 

Currently, most QML architectures use classical datasets (e.g., MNIST, Dogs vs Cats, and Iris) as benchmarks. While it is natural to use classical datasets because of their accessibility, it is still unclear how to best encode classical information into quantum states.

 

A number of embedding schemes have been proposed, and they must possess several properties. One is that the inner product between the embedded output states should be classically hard to simulate (otherwise the quantum kernel scheme could be simulated classically). Moreover, the embedding should be practically useful: in a classification task, the states should land in distinguishable regions of Hilbert space. Unfortunately, an embedding that satisfies one of these properties does not necessarily satisfy the others. The development of encoding schemes is therefore an active area of research, especially schemes equipped with an inductive bias that carries information about the dataset.
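
A brief sketch contrasting two commonly discussed encodings (toy versions; the function names are our own): angle encoding uses one qubit per feature and constant depth, while amplitude encoding packs the data vector into exponentially fewer qubits at the cost of a generally expensive state-preparation circuit.

```python
import numpy as np

def angle_encoding(x):
    """One qubit per feature; each feature value sets a rotation angle (constant depth)."""
    state = np.array([1.0 + 0j])
    for xi in x:
        state = np.kron(state, np.array([np.cos(xi / 2), np.sin(xi / 2)], dtype=complex))
    return state                                  # 2**len(x) amplitudes, product state

def amplitude_encoding(x):
    """Whole vector stored in the amplitudes of ceil(log2(len(x))) qubits (compact,
    but preparing such a state is costly in general)."""
    x = np.asarray(x, dtype=complex)
    padded = np.zeros(2 ** int(np.ceil(np.log2(len(x)))), dtype=complex)
    padded[:len(x)] = x
    return padded / np.linalg.norm(padded)

x = [0.3, 1.1, 2.0]
print(angle_encoding(x).shape)       # (8,) -> 3 qubits for 3 features
print(amplitude_encoding(x).shape)   # (4,) -> 2 qubits for 3 features
```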

 

Moreover, some recent results suggest that achieving a quantum advantage with classical data may not be straightforward; QML models operating on quantum data, on the other hand, hold more promise. Despite this, true quantum datasets remain scarce. The field therefore needs standardized quantum datasets with easily prepared quantum states, which can be used to benchmark QML models on real quantum data.

 

2) Quantum landscapes

 

Training the parameters of a QML model in many cases corresponds to minimizing a (usually non-convex) loss function.

 

Technically, the loss function defines a landscape: a map from the model's parameter space to the real numbers. The value of the loss function quantifies, for example, the model's error on a given task, so the goal of QML training is to find the parameter set that minimizes this error. Quantum landscape theory aims to understand the properties of QML loss landscapes and how they can be engineered.
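
A minimal sketch of such a landscape, assuming a one-qubit toy model of our own: even a single-parameter loss built from ⟨Z⟩ = cos θ is non-convex, with several minima over the scanned range.

```python
import numpy as np

def loss(theta):
    """Toy landscape: squared error between <Z> = cos(theta) after RY(theta)|0>
    and a target value of 0.5; minima occur wherever cos(theta) = 0.5."""
    expval = np.cos(theta)
    return (expval - 0.5) ** 2

thetas = np.linspace(-2 * np.pi, 2 * np.pi, 9)
for t, l in zip(thetas, loss(thetas)):
    print(f"theta = {t:6.2f}   loss = {l:.3f}")   # multiple minima -> non-convex
```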

 

3) QNN architecture design

 

Since QNNs are a fundamental component of supervised learning (both deep learning and kernel methods), as well as of unsupervised and reinforcement learning, the development of good QNN architectures, such as quantum convolutional neural networks (QCNNs), is crucial to the field.

 

In classical ML, the study of the group theory behind graph neural networks, i.e., the notions of invariance and equivariance under various group actions on the input space, has led to a unified, group-theoretic account of deep learning architectures: geometric deep learning.

 

To create architectures and inductive biases suited to a given set of quantum-physics data, a quantum geometric deep learning theory may be the key to designing architectures with the correct priors on transformation spaces and with inductive biases that ensure trainability and generalization.

 

Since the study of physics is often about identifying inherent or emergent symmetries in a given system, a future unified theory of quantum geometric deep learning has great potential to provide principled methods for building QML architectures whose inductive biases encode the fundamental symmetries of quantum datasets and knowledge of the underlying quantum physical systems.

 

4) Quantum noise

 

The presence of hardware noise during quantum computation is one of the defining characteristics of noisy intermediate-scale quantum (NISQ) computing. Accounting for the impact of hardware noise must be a key aspect of any QML analysis if we wish to pursue a quantum advantage with currently available hardware.

 

Addressing the problems caused by noise may require:

 

- a reduction in hardware error rates;

- partial quantum error correction;

- the use of relatively shallow QNNs (i.e., circuits whose depth grows sublinearly in the problem size, such as QCNNs).

 

Error mitigation techniques can also improve the performance of QML models under noise, although they may not resolve the trainability issues that noise causes. An alternative approach is to design QML models with noise-resilient properties (e.g., models whose loss minima do not shift under noise).
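
A toy calculation of why circuit depth matters under noise, assuming a global depolarizing channel of strength p applied after each of L layers (an illustrative noise model of our own, with made-up parameter values, not the review's analysis): the signal shrinks as (1 − p)^L, so expectation values, and with them training gradients, decay exponentially with depth, which is one motivation for keeping QNNs shallow on NISQ hardware.

```python
import numpy as np

# Depolarizing noise of strength p per layer shrinks any traceless-observable
# expectation value toward zero by a factor (1 - p) per layer.
p = 0.01               # assumed per-layer depolarizing rate (illustrative)
ideal_expval = 0.8     # assumed noiseless signal (illustrative)

for L in (10, 100, 1000):
    noisy = (1 - p) ** L * ideal_expval
    print(f"depth {L:5d}: noisy expectation ~ {noisy:.4f}")  # exponential decay in L
```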

 

1) Potential for quantum advantage

 

The first quantum advantage of QML may arise from extracting hidden parameters from quantum data, whether for quantum sensing or for quantum state classification/regression. Fundamentally, we know from the theory of optimal measurements that non-local quantum measurements can extract hidden parameters from fewer samples; using QML, one can construct and search over parameterized ansätze for such measurements.

 

Since information about such classical parameters is embedded in the structure of quantum correlations between subsystems, well-trained QML models with good inductive bias can naturally exhibit advantages over local measurements and classical representations.

 

Another application area where classical parameter extraction may yield advantages is quantum machine perception, i.e., quantum sensing, metrology and other fields.

 

Beyond extracting classical parameters embedded in quantum data, there may be a further advantage in discovering quantum error-correcting codes: such codes (usually) encode data non-locally into subsystems or subspaces of Hilbert space. Since deep learning is fundamentally about discovering structure in data space, identifying and decoding the subspaces/subsystems corresponding to quantum error-correcting codes is a natural area where differentiable quantum computing may yield advantages.

 

2) What would the quantum advantage look like?

 

An exponential quantum advantage is more likely to be seen in ML when the data originates from quantum-mechanical processes, such as experiments in chemistry, materials science, biology, and physics. The advantage may appear in sample complexity or in time complexity: a substantial advantage in sample complexity was recently demonstrated on Google's Sycamore processor, raising hopes of achieving a quantum advantage with NISQ devices.

 

The situation for advantages in time complexity is more subtle. Classical simulation of quantum processes is in many cases hard, so exponential advantages in time complexity can reasonably be expected to persist.

 

No exponential advantage is known when the data is purely classical in origin, as in recommending products to customers, portfolio optimization, or applications dealing with human language and everyday images. However, a polynomial advantage is still reasonable to expect, and for some purely classical problems a quadratic advantage can be rigorously demonstrated. Thus, once fault-tolerant quantum computers exist, there may well be an impact in the long run, although in currently known fault-tolerant schemes the overhead of quantum error correction significantly erodes such speedups.

 

3) The era of fault-tolerant quantum computing

 

Although QML has been proposed as a candidate for achieving a quantum advantage on NISQ devices in the near term, one can still ask how useful it will be further in the future. Here, the researchers envision two distinct eras after NISQ:

 

- In the first era, which we may call "partial error correction", quantum computers will have enough physical qubits (a few hundred) and low enough error rates to support a small number of fully error-corrected logical qubits. Since a logical qubit is composed of multiple physical qubits, in this era we will be free to partition the qubits in a device into a subset that is error-corrected and a subset that is not.

 

- The next era, that of "fault tolerance", will arrive when quantum hardware supports a large number of error-corrected qubits.

 

It is easy to envision QML being useful in both post-NISQ eras. In the partial-error-correction era, QML models will be able to run higher-fidelity circuits and thus perform better. Most importantly, in the fault-tolerant era, QML may see its broadest and most critical use: quantum algorithms, such as those used for quantum simulation, will be able to prepare quantum data accurately and store it in quantum memory with high fidelity. QML will then be the natural paradigm for learning, inferring, and predicting from quantum data, because the quantum computer learns directly from the data itself.

 

On a longer timeline, it may become possible to capture quantum data directly from nature by converting it from its natural analog form to a quantum digital form (e.g., through quantum analog-to-digital interconversion). This data could then be shuttled through quantum networks for distributed and/or centralized processing with QML models, using fault-tolerant quantum computing and error-corrected quantum communication.

 

At that point, QML would reach a stage similar to today's ML, where data is captured by edge sensors, forwarded to a central cloud, and used to train ML models on the aggregated data. By analogy with today's ubiquitous classical ML, we can foresee that similarly ubiquitous access to quantum data in the fault-tolerant era could drive QML into equally wide use.

 

Reference links:

https://www.nature.com/articles/s43588-022-00311-3