Programming universal unitary transformations on a general-purpose silicon photonics platform

José Roberto Rausell-Campo joraucam@upv.es Photonics Research Lab, iTEAM. Universitat Politècnica de Valencia    Daniel Pérez-López IPronics Programmable Photonics S.L.    José Capmany Francoy Photonics Research Lab, iTEAM. Universitat Politècnica de Valencia
(July 3, 2024)
Abstract

General-purpose programmable photonic processors provide a versatile platform for integrating diverse functionalities on a single chip. Leveraging a two-dimensional hexagonal waveguide mesh of Mach-Zehnder interferometers, these systems have demonstrated significant potential in microwave photonics applications. Additionally, they are a promising platform for creating unitary linear transformations, which are key elements in quantum computing and photonic neural networks. However, a general procedure for implementing these transformations on such systems has not yet been established. This work demonstrates the programming of universal unitary transformations on a general-purpose programmable photonic circuit with a hexagonal topology. We detail the steps to split the light on-chip, demonstrate that a structure equivalent to the Mach-Zehnder interferometer with one internal and one external phase shifter can be built in the hexagonal mesh, and program both the triangular and rectangular architectures for matrix multiplication. We recalibrate the system to account for passive phase deviations. Experimental programming of 3x3 and 4x4 random unitary matrices yields fidelities > 98% and bit precisions over 5 bits. To the best of our knowledge, this is the first time that random unitary matrices have been demonstrated on a general-purpose photonic processor, paving the way for the implementation of programmable photonic circuits in optical computing and signal processing systems.


I Introduction

In recent years, general-purpose programmable photonic integrated processors have emerged as a promising platform for integrating a variety of functionalities on a single chip, similar to field-programmable gate arrays in electronics. Bogaerts et al. (2020); Pérez, Gasulla, and Capmany (2018); Pérez et al. (2017) These systems allow software-controlled manipulation of light paths across a 2D waveguide mesh using their 2x2 building blocks, known as programmable unit cells (PUCs). PUCs are typically built from symmetric Mach-Zehnder interferometers with a phase shifter on each arm. The literature reports various waveguide mesh topologies, with triangular, rectangular, and hexagonal being the most common. Pérez-López et al. (2019); Zhuang et al. (2015); Pérez, Gasulla, and Capmany (2018)

One of the main application areas of programmable photonic circuits is microwave photonics, Marpaung, Yao, and Capmany (2019) where RF signals are processed in the optical domain, thus benefiting from increased bandwidth and reduced latency compared with their electronic counterparts. Miller (2017) Recently, researchers presented a general-purpose programmable silicon photonic circuit with a hexagonal topology. Pérez-López et al. (2024) They demonstrated the complete system, including the photonic circuit, electronic drivers, and software layer. Their work experimentally showcased 12 microwave photonic functionalities, highlighting the versatility of these devices.

Another area of interest is the implementation of unitary linear transformations. Application-specific photonic circuits have demonstrated capabilities for performing unitary matrix multiplications using coherent structures that combine beam splitters and phase shifters within a planar architecture. Harris et al. (2018) The basic building blocks, Mach-Zehnder interferometers with internal and external phase shifters, can be connected in various arrangements. Rectangular (Clements) Clements et al. (2016) and triangular (Reck) Reck et al. (1994) topologies are the two most common approaches. An alternative using a PUC as a building block has also been proposed to reduce area by eliminating the external phase shifter. Bell and Walmsley (2021) These coherent circuits inherently apply unitary transformations, which are crucial for applications like quantum computing. Harris et al. (2017); Carolan et al. (2015); Wang et al. (2020); Taballione et al. (2023) Additionally, unitary systems can be used to unscramble the mixing of optical signals traveling through multimode fibers Annoni et al. (2017); Choutagunta et al. (2020); Zhou et al. (2020) or free-space optical communication channels. SeyedinNavadeh et al. (2024) Finally, general matrix multiplications can be achieved using the singular value decomposition (SVD), in which two unitary matrices and a diagonal matrix are concatenated. Miller (2013) Photonic general matrix multiplications have shown promise for deep learning, Shen et al. (2017); Pai et al. (2023); Zhang et al. (2021) unconventional computing approaches, Prabhu et al. (2020); Roques-Carmes et al. (2020) and RF signal separation. Zhang et al. (2023); Romero et al. (2023); Zhang et al. (2024)

A general procedure for implementing linear unitary transformations on general-purpose photonic platforms has not yet been reported. While Pérez et al. (2017) demonstrated programming a rectangular interferometer within a 7-cell hexagonal waveguide mesh, their experimental validation was limited to permutation matrices (unitary transformations whose elements have absolute values equal to 0 or 1). There were two reasons for this limitation. First, due to the mesh size, the input vector to be multiplied by the matrix had to be encoded externally, which compromised coherence and rendered complex matrix-vector multiplications infeasible. Second, it is known that, due to fabrication imperfections, the initial state of the phase shifters is not 0 rad and a calibration of that passive phase must be performed. Bandyopadhyay et al. (2022) The steps required to perform this calibration were provided neither theoretically nor experimentally.

In this work, we demonstrate how to program arbitrary unitary transformations on a general-purpose programmable photonic chip with a hexagonal topology. We utilize the commercially available Smartlight processor from iPronics Programmable Photonics. Pérez-López et al. (2024); iPr First, we explain the necessary steps for splitting light to maintain on-chip coherence. Second, we show how to replicate the transfer function of a Mach-Zehnder interferometer with internal and external phase shifters using two programmable unit cells (PUCs). We then demonstrate that both triangular and rectangular architectures can be programmed on this platform and present the recalibration process to measure the passive state of the phase shifters. We experimentally program 4x4 and 3x3 random unitary matrices. We calculate their fidelity and bit precision and perform random complex matrix-vector multiplications. Finally, we show how these general-purpose processors can be a promising platform for photonic neural networks and quantum circuits by solving two classification tasks with simulated feedforward neural networks and by programming a set of quantum logic gates.

II Hexagonal Programmable Photonic Circuits

The hexagonal topology has emerged as the most promising topology for general-purpose programmable processors, see Fig. 1a. In this section, we describe its fundamental working principle and calibration requirements.

Figure 1: (a) Smartlight processor from iPronics. It combines a hexagonal waveguide mesh with an electronic and software layer. (b) Programmable unit cell (PUC) of the hexagonal mesh. It consists of two internal phase shifters $\theta_1$ and $\theta_2$.

II.1 Programmable unit cell

The basic building block of the hexagonal mesh is the programmable unit cell, which comprises a Mach-Zehnder interferometer with a thermo-optic phase shifter in each of the top and bottom arms. A picture of the PUC is shown in Fig. 1b. The transfer function of the PUC can be expressed as:

$$ie^{i\frac{\theta_1+\theta_2}{2}}\begin{pmatrix}\sin\left(\frac{\theta_1-\theta_2}{2}\right) & \cos\left(\frac{\theta_1-\theta_2}{2}\right)\\ \cos\left(\frac{\theta_1-\theta_2}{2}\right) & -\sin\left(\frac{\theta_1-\theta_2}{2}\right)\end{pmatrix} \tag{1}$$

$$=ie^{i\phi}\begin{pmatrix}\sqrt{1-k} & \sqrt{k}\\ \sqrt{k} & \sqrt{1-k}\end{pmatrix} \tag{2}$$

The PUC enables control of the coupling ratio ($k$) between input 1 and output 2, as well as the phase shift ($\phi$) experienced by the optical signal. This control is achieved by adjusting the relative phase applied to the top and bottom arms. In the bar state ($k = 0$), light entering port 1 exits from port 1, and light entering port 2 exits from port 2. Conversely, in the cross state ($k = 1$), light entering port 1 exits from port 2, and light entering port 2 exits from port 1.
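The behavior described by Eq. (1) can be checked numerically. The following is a minimal sketch, not the processor's control software: the helper names `puc_matrix` and `coupling_ratio` are hypothetical, and the matrix is typed directly from Eq. (1).

```python
import cmath
import math


def puc_matrix(theta1, theta2):
    """2x2 transfer matrix of a PUC, typed from Eq. (1)."""
    common = 1j * cmath.exp(1j * (theta1 + theta2) / 2)  # factor i*exp(i*phi)
    half_diff = (theta1 - theta2) / 2
    s, c = math.sin(half_diff), math.cos(half_diff)
    return [[common * s, common * c],
            [common * c, -common * s]]


def coupling_ratio(theta1, theta2):
    """Power fraction k coupled from input 1 to output 2."""
    return abs(puc_matrix(theta1, theta2)[1][0]) ** 2


k_cross = coupling_ratio(0.0, 0.0)          # equal phases: cross state
k_bar = coupling_ratio(math.pi, 0.0)        # phase difference of pi: bar state
k_half = coupling_ratio(math.pi / 2, 0.0)   # phase difference of pi/2: 50:50
print(k_cross, k_bar, k_half)
```

Under these assumptions, only the phase difference sets $k$, while the mean phase sets $\phi$.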

II.2 Calibration

The calibration process aims to find the relationship between the current applied to the thermo-optic actuator and the resulting phase shift, which can be expressed by the following equation:

$$\theta=\theta_0+\alpha I^2 \tag{3}$$

where $\alpha$ is the proportionality constant between the squared current and the phase term, and $\theta_0$ is the passive phase that appears as a consequence of fabrication imperfections, which create a difference in the group index between the top and bottom waveguides of the MZI. In an ideal system, if we set $\theta_1 = \theta_2 = 0$ in (1), the PUC is in the cross state. However, variations in the passive phase across different PUCs cause the initial state to deviate from the ideal one and become a random state between bar and cross. The calibration of each individual PUC in a hexagonal waveguide mesh requires the use of an optimization and an auto-routing algorithm. López-Hernández, Gutiérrez-Zubillaga, and Pérez-López (2022); López et al. (2020) Once calibrated, we can define the coupling state of each PUC, enabling functionalities like beam splitters, optical interconnects, or filters. However, for applications requiring coherence, such as unitary matrix multiplications, calibration of individual phase shifters is not sufficient. We also need to compensate for the difference in phase accumulated by light traversing different paths with the same number of PUCs. This phase deviation, also due to fabrication imperfections but fixed for each defined path, can be treated similarly to the passive phase of individual PUCs. In the following section, we demonstrate how to apply this re-calibration for matrix multiplications with rectangular and triangular topologies. The same procedure can be extended to other coherent structures.
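Once $\alpha$ and $\theta_0$ are known, Eq. (3) can be inverted to obtain the drive current for a target phase. The sketch below assumes a positive $\alpha$ (the heater can only add phase), so the required shift is wrapped into $[0, 2\pi)$. The function name and the numerical values of `alpha` and `theta0` are illustrative assumptions, not measured parameters of the device.

```python
import math


def current_for_phase(theta_target, theta0, alpha):
    """Invert Eq. (3), theta = theta0 + alpha * I**2, for the current I."""
    # Wrap the required phase shift into [0, 2*pi) so the square root is real.
    delta = (theta_target - theta0) % (2 * math.pi)
    return math.sqrt(delta / alpha)


alpha = 0.05    # rad/mA^2, illustrative value only
theta0 = 1.2    # rad, illustrative passive phase
current = current_for_phase(0.5, theta0, alpha)             # in mA
achieved = (theta0 + alpha * current ** 2) % (2 * math.pi)  # phase actually set
print(current, achieved)
```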

II.3 Smartlight Processor

For the experimental demonstration, we use the commercially available Smartlight photonic processor. An image of the full system with a schematic of the integrated circuit is shown in Fig. 1a. The processor comprises 17 hexagonal cells for a total of 72 programmable unit cells (PUCs). Each PUC has an insertion loss of 0.48 dB and a power consumption of 1.3 mW/$\pi$. Light can be coupled in and out through any of the 28 available optical ports using a fiber array that introduces a 3 dB insertion loss per facet. Each of the 28 I/O ports features an opto-electronic monitoring unit with on-chip photodetectors, enabling optical power measurement without external units. The system also includes the electronic circuitry and the software layer used to control each PUC. Further details on the device's performance and elements are available in Ref. Pérez-López et al. (2024).

III Unitary transformations on a photonic processor

In this section, we detail the configuration steps required within the photonic processor to achieve vector-matrix multiplications.

III.1 Splitter tree

To ensure coherence throughout the operation, we employed a continuous-wave (CW) laser. The light was then split into a number of paths equal to the number of matrix inputs. For this work, we focused on matrices of sizes 3x3 and 4x4. Conventionally, splitter trees for dividing light equally are built using cascaded stages of multi-mode interferometers (MMIs) or directional couplers with a fixed 50:50 coupling ratio. However, in a general-purpose programmable photonic processor, different paths may experience different optical losses because light traverses a different number of PUCs on each path. To address this challenge, we implemented a correction to the coupling ratio programmed within the PUCs. This correction takes into account the measured insertion losses of each PUC. Denoting by $k$ the fraction of light coupled to the path with fewer PUCs, equal light intensity on both output paths requires:

$$(1-k)\,IL^{N_P}=k \tag{4}$$

where $IL$ is the insertion loss of a PUC expressed as a linear power transmission factor and $N_P$ is the difference in the number of PUCs between the two paths. Solving for $k$, we obtain:

$$k=\frac{1}{1+\frac{1}{IL^{N_P}}} \tag{5}$$

Figure 2: (a) 1x4 symmetric splitter tree and (b) 1x4 non-symmetric splitter tree.

Two possible implementations of a 1x4 splitter tree are depicted using green light in Fig. 2. Grey PUCs are in an arbitrary state, except those marked with a green wave, which are the PUCs used for input vector encoding. Data encoding is explained in the following section, but note that only the light leaving one of the two output ports of each encoding PUC is used for the matrix multiplication. The light exiting the other port is routed out of the processor before the multiplication stage. To do so, we set some PUCs in the bar state (red) and others in the cross state (yellow), depending on the programmed splitter tree. White arrows indicate the path of the encoded input vector to be multiplied by the matrix.

The symmetric splitter tree in Fig. 2a requires three columns of PUCs. First, the input light is divided using a 50:50 coupling. For the second division, one path traverses two more PUCs than the other (compare the path difference between the two upper outputs of the splitter tree). Considering the insertion losses of the Smartlight processor and equation (5), the coupling ratio in this second stage needs to be adjusted to 56:44.

The non-symmetric splitter tree in Fig. 2b is more compact, as it only requires two columns. On the other hand, light divided in the first stage traverses 4 more PUCs in the lower branch than in the upper one. Following the path length compensation equation (5), the coupling ratio must be 61:39.
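The two coupling ratios quoted above can be reproduced from Eq. (5) with the 0.48 dB insertion loss per PUC reported for the Smartlight processor. This is a minimal sketch; the function name `balanced_coupling` is a hypothetical helper, and the conversion from dB to a linear transmission factor is the standard $10^{-IL_{dB}/10}$.

```python
def balanced_coupling(il_db, n_extra):
    """Coupling ratio k toward the path with fewer PUCs, from Eq. (5)."""
    il_linear = 10 ** (-il_db / 10)   # per-PUC power transmission factor
    return 1 / (1 + 1 / il_linear ** n_extra)


IL_DB = 0.48   # insertion loss per PUC, in dB

k_symmetric = balanced_coupling(IL_DB, 2)      # second stage of Fig. 2a
k_nonsymmetric = balanced_coupling(IL_DB, 4)   # first stage of Fig. 2b
print(f"{1 - k_symmetric:.2f}:{k_symmetric:.2f}")
print(f"{1 - k_nonsymmetric:.2f}:{k_nonsymmetric:.2f}")
```

The larger share of the light is always sent to the longer path, which compensates for the additional loss it accumulates.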

III.2 Encoding

Input vector encoding is performed using an array of PUCs, see the PUCs with a green wave on them in Fig. 2. For the symmetric splitter tree we used the bar port to encode the data, while for the non-symmetric splitter we used the cross port. In the general case, the input vector is an array of complex numbers. We encode its modulus in the amplitude of the optical signal and its angle (phase) using the term $\phi$. Assuming the light intensity at each input port is normalized to 1, the encoded modulus becomes equal to $\sqrt{k}$ or $\sqrt{1-k}$, depending on whether we are using the cross or bar ports. Focusing on the cross encoding, from equations (1) - (2) we find that:

$$\cos\left(\frac{\theta_1-\theta_2}{2}\right)=\sqrt{k} \tag{6}$$

$$\Delta\Theta=\theta_1-\theta_2=2\arccos(\sqrt{k}) \tag{7}$$

Substituting this expression in the phase term of (1) we obtain

$$\phi=\frac{\theta_1+\theta_2}{2}=\theta_2+\arccos(\sqrt{k}) \tag{8}$$

and finally

$$\theta_1=\phi+\arccos(\sqrt{k}) \tag{9}$$

$$\theta_2=\phi-\arccos(\sqrt{k}) \tag{10}$$

For the bar encoding, the $\arccos(\sqrt{k})$ term in equations (9) - (10) is substituted by $\arcsin(\sqrt{1-k})$.
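The cross-port encoding of Eqs. (6) - (10) can be illustrated as follows. This is a sketch under the assumption that the optical input is normalized to unit amplitude and that the PUC follows Eq. (1) exactly; `puc_matrix` and `encode_cross` are hypothetical helper names.

```python
import cmath
import math


def puc_matrix(theta1, theta2):
    """2x2 transfer matrix of a PUC, typed from Eq. (1)."""
    common = 1j * cmath.exp(1j * (theta1 + theta2) / 2)
    half_diff = (theta1 - theta2) / 2
    s, c = math.sin(half_diff), math.cos(half_diff)
    return [[common * s, common * c],
            [common * c, -common * s]]


def encode_cross(x):
    """Phases (theta1, theta2) that place the complex value x on the cross port.

    Requires |x| <= 1, since the modulus is encoded as sqrt(k).
    """
    modulus, phi = abs(x), cmath.phase(x)
    half_diff = math.acos(modulus)            # Eq. (7) divided by 2
    return phi + half_diff, phi - half_diff   # Eqs. (9) and (10)


x = 0.6 * cmath.exp(1j * 0.8)                     # example vector element
theta1, theta2 = encode_cross(x)
cross_output = puc_matrix(theta1, theta2)[1][0]   # input 1 -> output 2
# The cross port carries x times the constant factor i that appears in Eq. (1).
print(abs(cross_output - 1j * x))
```

The constant factor $i$ is common to all encoding PUCs, so it does not affect the relative phases between vector elements.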

III.3 Building block equivalence

Figure 3: (a) The standard building block consists of an MZI with one external and one internal phase shifter. (b) Equivalent system that concatenates two PUCs with $\theta_2 = 0$ and $\phi_1 = \phi_2$.

To program arbitrary linear interferometers on our programmable photonic chip, we first establish the equivalence between the building blocks used in each system. As mentioned earlier, the programmable unit cell (PUC) serves as the fundamental building block of the general-purpose photonic processor. In contrast, standard decomposition algorithms for linear interferometers typically assume a Mach-Zehnder interferometer (MZI) with a single internal and a single external phase shifter as the building block, see Fig. 3a. This MZI has the following transfer function:

$$ie^{i\frac{\theta}{2}}\begin{pmatrix}e^{i\phi}\sin\frac{\theta}{2} & e^{i\phi}\cos\frac{\theta}{2}\\ \cos\frac{\theta}{2} & -\sin\frac{\theta}{2}\end{pmatrix} \tag{11}$$

We can show that if two PUCs are concatenated as in Fig. 3b, the same transfer function is obtained. First, the phases of the left PUC must be equal, $\phi_1 = \phi_2 = \phi$, so that according to (1) its transfer function is:

$$ie^{i\phi}\begin{pmatrix}0&1\\ 1&0\end{pmatrix} \tag{12}$$

The PUC is then in the cross state, keeping the amplitude constant and adding a phase of $\phi+\frac{\pi}{2}$, which is equivalent to an external phase shifter. The second PUC is then tuned while maintaining $\theta_2 = 0$. The final linear transformation of the system is consequently the same as in (11).
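The equivalence can be checked numerically. The sketch below makes one explicit assumption that the text leaves to Fig. 3b: the cross-state PUC acts only on the upper waveguide, so its cross-port amplitude multiplies the first row of the tunable PUC's matrix, which is where Eq. (11) places the external phase term. Under that assumption the composite equals Eq. (11) evaluated at an external phase of $\phi+\frac{\pi}{2}$. All helper names are hypothetical.

```python
import cmath
import math


def puc_matrix(theta1, theta2):
    """2x2 transfer matrix of a PUC, typed from Eq. (1)."""
    common = 1j * cmath.exp(1j * (theta1 + theta2) / 2)
    half_diff = (theta1 - theta2) / 2
    s, c = math.sin(half_diff), math.cos(half_diff)
    return [[common * s, common * c],
            [common * c, -common * s]]


def mzi_matrix(theta, phi):
    """Standard building block with external phase phi, typed from Eq. (11)."""
    g = 1j * cmath.exp(1j * theta / 2)
    e = cmath.exp(1j * phi)
    s, c = math.sin(theta / 2), math.cos(theta / 2)
    return [[g * e * s, g * e * c],
            [g * c, -g * s]]


phi, theta = 0.7, 1.9
phase_puc = puc_matrix(phi, phi)    # equal phases: cross state, Eq. (12)
p = phase_puc[1][0]                 # cross amplitude, i * exp(i * phi)
tunable = puc_matrix(theta, 0.0)    # second PUC with theta2 = 0

# Assumed wiring: the phase factor p multiplies the upper (first) row only.
composite = [[p * tunable[0][0], p * tunable[0][1]],
             [tunable[1][0], tunable[1][1]]]
target = mzi_matrix(theta, phi + math.pi / 2)
max_dev = max(abs(composite[r][c] - target[r][c])
              for r in range(2) for c in range(2))
print(max_dev)
```

The constant $\frac{\pi}{2}$ offset only shifts the value that must be programmed for the external phase.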

III.4 Matrix Architecture

We have established that two PUCs can be configured to function equivalently to a Mach-Zehnder interferometer (MZI) with one internal and one external phase shifter, the fundamental building block in many coherent integrated processors. This equivalence allows us to directly translate previously proposed architectures for unitary matrix multiplications on photonic circuits to our general-purpose processor. Here, we show how to implement the rectangular (Clements) and triangular (Reck) architectures. Fig. 4 shows the distribution of the building blocks (orange boxes) in the Clements (Fig. 4a) and Reck (Fig. 4b) architectures. The blue PUCs represent the equivalent building blocks within the hexagonal mesh. PUCs programmed in cross and bar states ensure that light propagates in a single direction and interacts with other paths only at tunable elements. The Reck architecture requires an additional column of horizontal PUCs compared to the Clements topology. As a consequence, with the current mesh size of the Smartlight processor, the symmetric splitter presented in Section III.1 is not compatible with the Reck architecture, and we use the non-symmetric version instead.

Figure 4: (a) Schematic of the Clements topology with its translation to the hexagonal mesh, and (b) schematic of the Reck topology with its translation to the hexagonal mesh.

III.5 Phase Calibration

As discussed in Section II.2, the standard calibration method for programmable hexagonal meshes is insufficient for implementing coherent transformations. While it can measure and compensate for the passive phase difference between the top and bottom waveguides of a PUC, it does not account for phase variations arising from different connections between PUCs in a programmed coherent architecture. These additional phase differences can also be treated as a passive phase inherent to the chosen architecture. To address this challenge, we propose a re-calibration procedure that focuses on measuring these inherent passive phase offsets within the architecture. The procedure assumes that the $\alpha$ term in equation (3) is still valid and that only the $\theta_0$ term needs to be measured. These offsets are incorporated as the passive phase of each PUC acting as a phase shifter within the mesh. The re-calibration procedure is adapted from methods used for single-phase actuator calibration in application-specific photonic integrated circuits (ASPICs). Prabhu et al. (2020); Lin et al. (2024); Alexiev et al. (2021); Pentangelo et al. (2024) This method relies on constructing a temporary interferometer using single PUCs, see Fig. 5. In the ASPIC approach, this structure, also known as a META-MZI, is built using two PUCs configured as 50:50 couplers with all internal PUCs set to the bar state. However, in our general-purpose processor, we have additional PUCs acting as waveguides that are set to the cross state. Once the system is defined, we perform a two-step characterization process for each PUC acting as a phase shifter. In the first step, we sweep the current applied to the top arm while keeping the bottom arm current at zero. We measure the output power at the cross port and identify the current corresponding to the maximum power output. This ensures the PUC is in the cross state.
In the second step, we sweep the current of both the top and bottom arms simultaneously. We add the previously measured optimal current as an offset to the top arm current during this sweep. This effectively modifies the phase of the optical signal while maintaining the coupling ratio of the PUC. This phase modification within the created META-MZI results in an interferometric pattern, as shown in Fig. 6(a-b). This measurement can be fitted using equations (1) and (3) to extract the passive phase offset of the phase shifter under test.
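The fitting step can be sketched on synthetic data. The example below is not the fitting routine used in the experiment: it assumes that the META-MZI output follows a raised-cosine fringe in the phase $\theta_0 + \alpha I^2$, that $\alpha$ is already known from the standard calibration, and it replaces a general curve fit with a simple grid search over $\theta_0$. The function names and all numerical values are illustrative.

```python
import math


def fringe(current, theta0, alpha, offset=0.5, amplitude=0.5):
    """Assumed raised-cosine META-MZI response versus drive current."""
    return offset + amplitude * math.cos(theta0 + alpha * current ** 2)


def fit_passive_phase(currents, powers, alpha, steps=720):
    """Grid-search the passive phase theta0 that best explains the fringe."""
    offset = (max(powers) + min(powers)) / 2      # fringe mid-level
    amplitude = (max(powers) - min(powers)) / 2   # fringe half-swing
    best_theta0, best_error = 0.0, float("inf")
    for n in range(steps):
        theta0 = 2 * math.pi * n / steps
        error = sum((p - fringe(i, theta0, alpha, offset, amplitude)) ** 2
                    for i, p in zip(currents, powers))
        if error < best_error:
            best_theta0, best_error = theta0, error
    return best_theta0


alpha = 0.05                                  # rad/mA^2, illustrative only
true_theta0 = 2.1                             # rad, value to be recovered
currents = [0.05 * n for n in range(300)]     # sweep from 0 to about 15 mA
powers = [fringe(i, true_theta0, alpha) for i in currents]
estimate = fit_passive_phase(currents, powers, alpha)
print(estimate)
```

With measured data, the same idea applies after normalizing the detected power; the resolution of the estimate is limited by the grid step and by measurement noise.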

Figure 5: META-MZI method for the characterization of the passive phase of the phase shifters and its translation to the hexagonal mesh.
Figure 6: Optical response of two META-MZIs and their fitting curves.

IV Experiments and results

Figure 7: Results of 1500 random unitary matrices using the Clements architecture with sizes of 3x3 (blue) and 4x4 (orange). (a) Measured fidelity. (b) Comparison between the ideal applied weights and the measured weights; the black line represents the scenario where there are no errors. (c) Difference between the applied and measured weights. (d-e) Difference between the measured and ideal results of the multiplication of the unitary matrix by a random vector.
Figure 8: Results of 1500 random unitary matrices using the Reck architecture with sizes of 3x3 (blue) and 4x4 (orange). (a) Measured fidelity. (b) Comparison between the ideal applied weights and the measured weights; the black line represents the scenario where there are no errors. (c) Difference between the applied and measured weights. (d-e) Difference between the measured and ideal results of the multiplication of the unitary matrix by a random vector.

Following re-calibration of the hexagonal mesh, we proceeded to encode unitary matrices on the Smartlight processor. All measurements used the same 1550 nm, 10 dBm output power laser source employed for calibration. We programmed the Clements and Reck architectures on the photonic processor using the aforementioned scheme. For each architecture, we tested two matrix sizes: 3x3 and 4x4. The unitary matrix weights were translated into the phases to be applied following the decomposition algorithms provided in Refs. Clements et al. (2016); Reck et al. (1994). The decomposition of the 3x3 matrices results in the same building block arrangement for both architectures. To differentiate the implementations, we positioned the matrix multipliers in different regions of the mesh. The Clements architecture employed the symmetric splitter tree, while the non-symmetric splitter tree was used with the Reck architecture. During the experiments, for each combination of architecture and matrix size, we generated 1500 random unitary matrices. These matrices were decomposed into the required phase shifts and programmed onto the photonic integrated chip. We measured three figures of merit.

First, we measured the fidelity of the matrices. Fidelity is a common metric in quantum research used to quantify the similarity between two unitary operations. It is calculated using the following equation:

$$\mathcal{F}=\frac{\mathrm{Tr}(|U^{\dagger}U|^{2})}{N} \tag{13}$$

where $U^{\dagger}$ is the adjoint matrix of $U$, $\mathrm{Tr}$ denotes the trace of the matrix and $|\cdot|^{2}$ is the element-wise squared modulus of the matrix-matrix product. The highest fidelity ($\mathcal{F} = 1$) indicates perfect similarity between the applied and desired unitary operations. Our chip includes the splitter tree, the input vector and the matrix, allowing for coherent vector-matrix multiplication. We therefore encoded the complex columns of the adjoint matrix in the input vector, one at a time, to perform the matrix-matrix multiplication in (13). The squared modulus is applied at the photodetection stage. Finally, the trace of the recorded result is calculated to determine the measured fidelity.
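Eq. (13) can be evaluated offline as in the following sketch. It assumes that one factor of the product is the matrix actually implemented on the chip and the other is the adjoint of the target matrix, which mirrors the measurement described above; the helper names and the 2x2 test matrices are illustrative and are not data from the experiment.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def adjoint(u):
    """Conjugate transpose of a square matrix."""
    n = len(u)
    return [[u[c][r].conjugate() for c in range(n)] for r in range(n)]


def fidelity(u_target, u_programmed):
    """Eq. (13): trace of the element-wise squared modulus, divided by N."""
    n = len(u_target)
    product = matmul(u_programmed, adjoint(u_target))
    return sum(abs(product[d][d]) ** 2 for d in range(n)) / n


h = 1 / math.sqrt(2)
target = [[h, h], [h, -h]]                    # example 2x2 unitary
eps = 0.1                                     # small programming error (rad)
rotation = [[math.cos(eps), -math.sin(eps)],
            [math.sin(eps), math.cos(eps)]]
programmed = matmul(target, rotation)         # target followed by the error

f_ideal = fidelity(target, target)
f_error = fidelity(target, programmed)
print(f_ideal, f_error)
```

Because the squared modulus is taken element by element, the metric is insensitive to the phases of the diagonal entries, which matches what a photodetector can record.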

Second, we assessed the accuracy of the programmed matrix weights. We encoded in the input vector the ithsubscript𝑖𝑡i_{th}italic_i start_POSTSUBSCRIPT italic_t italic_h end_POSTSUBSCRIPT column of the identity matrix which results in the squared modulus of the ithsubscript𝑖𝑡i_{th}italic_i start_POSTSUBSCRIPT italic_t italic_h end_POSTSUBSCRIPT row of the programmed matrix. We repeated this process sequentially for all columns, effectively reconstructing the entire programmed matrix. The measured data is then compared with the ideal matrix, and the r-squared coefficient of the linear regression between the two datasets is calculated to quantify the overall agreement. Additionally, we measure the difference between the measured and ideal weights, calculating the mean and standard deviation of the error. From this measurement, we can approximate the bit precision of our system using the following formulaZhang et al. (2022):

$$\text{bit-precision}=\log_{2}\left(\frac{\max_{\text{weight}}-\min_{\text{weight}}}{\mathrm{std}(err)}\right) \qquad (14)$$

where $\mathrm{std}(err)$ is the standard deviation of the errors.
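As an illustration of how these weight-accuracy metrics could be computed, the sketch below evaluates the $r^{2}$ coefficient, the error statistics, and the bit precision of Eq. (14). The function names are hypothetical. The default `weight_range=1.0` is an assumption (squared moduli spanning 0 to 1); it is consistent with the values reported in this section, since $\log_2(1/0.0215)\approx 5.54$.

```python
import numpy as np


def bits_from_std(std_err, weight_range=1.0):
    """Eq. (14): log2((max_weight - min_weight) / std(err))."""
    return float(np.log2(weight_range / std_err))


def weight_metrics(ideal, measured, weight_range=1.0):
    """Compare measured and ideal squared-modulus weights.

    r2 is the squared Pearson correlation, which equals the coefficient of
    determination of a simple linear regression between the two datasets.
    """
    ideal = np.asarray(ideal, dtype=float).ravel()
    measured = np.asarray(measured, dtype=float).ravel()
    err = measured - ideal
    r2 = float(np.corrcoef(ideal, measured)[0, 1] ** 2)
    return {
        "r2": r2,
        "mean_err": float(err.mean()),
        "std_err": float(err.std()),
        "bits": bits_from_std(err.std(), weight_range),
    }
```

Here `ideal` would hold the element-wise squared moduli of the target unitary and `measured` the normalized photodetector readings reconstructed row by row.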

Finally, we evaluated the accuracy of on-chip matrix-vector multiplications. We sampled random complex vectors and multiplied them by the programmed matrix within the photonic chip. The resulting outputs were measured and compared with the expected outcome of the ideal multiplication. For all measurements, the outputs were normalized by dividing each element by the total measured power.
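The comparison described above can be sketched as follows. This is a simplified numerical model under stated assumptions, not the experimental pipeline: `u_impl` stands in for the matrix actually realized on chip, the random vectors are drawn from a complex Gaussian distribution (the sampling distribution is an assumption), and both function names are hypothetical.

```python
import numpy as np


def normalized_power(matrix, vectors):
    """Output powers |M x|^2 for each column x, divided by the total power per column."""
    power = np.abs(matrix @ vectors) ** 2
    return power / power.sum(axis=0, keepdims=True)


def vmm_error_stats(u_ideal, u_impl, n_samples=1500, seed=0):
    """Mean and standard deviation of the error at each output port."""
    rng = np.random.default_rng(seed)
    n = u_ideal.shape[0]
    x = rng.normal(size=(n, n_samples)) + 1j * rng.normal(size=(n, n_samples))
    err = normalized_power(u_impl, x) - normalized_power(u_ideal, x)
    return err.mean(axis=1), err.std(axis=1)
```

Because every output vector is normalized to unit total power, the per-output mean errors sum to zero, so a positive bias at one output is necessarily offset at the others.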

The fidelity results for the Clements architecture are presented in Fig. 7a. We achieved an average fidelity of 99.2 ± 0.3 % for the 3x3 matrices and 98.4 ± 0.3 % for the 4x4 matrices. Regarding weight accuracy, Fig. 7(b-c) shows the comparison between the ideal and measured weights. The linear regression returns an $r^{2}$ coefficient of 0.992 and 0.984 for the 3x3 and 4x4 matrices, respectively. The standard deviation of the errors between the measured and ideal weights is 0.0215 and 0.0246, corresponding to a bit precision of 5.5 and 5.35 bits, respectively. The errors for random vector-matrix multiplications (VMMs) are presented in Fig. 7(d-e) and Table 1, which lists the error incurred at each output.

Table 1: Mean and standard deviation of the error at each output of 1500 random vector-matrix multiplications using the Clements architecture.
| Size | $O_{1}$ ($10^{-2}$) | $O_{2}$ ($10^{-2}$) | $O_{3}$ ($10^{-2}$) | $O_{4}$ ($10^{-2}$) |
|---|---|---|---|---|
| 3x3 | 0.4 ± 3.2 | 1.0 ± 3.3 | 0.6 ± 3.2 | - |
| 4x4 | 0.3 ± 2.6 | 0.3 ± 3.1 | 0.8 ± 3.2 | 0.9 ± 2.7 |

For the Reck architecture, the measured fidelities are shown in Fig. 8a. We obtained average fidelities of 99.0 ± 0.3 % and 97.8 ± 0.5 % for the 3x3 and 4x4 matrices, respectively. The comparison between the applied and measured squared moduli of the complex weights is presented in Fig. 8(b-c). The $r^{2}$ coefficient of the linear regression is 0.993 for both matrix sizes. The standard deviation of the error between the measured and ideal weights is 0.026 for the 3x3 matrix and 0.0216 for the 4x4 matrix, which translates into a bit precision of 5.27 and 5.53 bits, respectively. Errors of the random VMMs are presented in Fig. 8(d-e). The mean and standard deviation are presented in Table 2.

Table 2: Mean and standard deviation of the error at each output of 1500 random vector-matrix multiplications using the Reck architecture.
| Size | $O_{1}$ ($10^{-2}$) | $O_{2}$ ($10^{-2}$) | $O_{3}$ ($10^{-2}$) | $O_{4}$ ($10^{-2}$) |
|---|---|---|---|---|
| 3x3 | 0.2 ± 3.6 | 0.4 ± 3.6 | 0.3 ± 3.3 | - |
| 4x4 | 0.04 ± 3.5 | 0.2 ± 3.4 | 0.8 ± 3.4 | 0.6 ± 3.1 |

V Applications

V.1 Photonic neural networks

Photonic neural networks have emerged as one of the most promising applications of photonic processors, as they can significantly reduce the power consumption and latency of current electronic devices Liao et al. (2023). The use of low-bit-precision models has been proposed in the literature to reduce the computational requirements of deep learning systems Ma et al. (2024); Li et al. (2017); Jacob et al. (2018). To illustrate the capabilities of our programmable photonic processor, we train two benchmark models: the flower classification problem using the Iris dataset Fisher (1988), which aims to distinguish between three types of flowers based on four input features, and the handwritten digit recognition problem using the MNIST dataset Lecun et al. (1998). For both problems, we used feedforward neural networks with two fully connected layers and ReLU activation functions, except at the output, where no activation function was applied. All weights were clipped to values between -1 and 1, and after each layer, a Gaussian noise layer with a standard deviation corresponding to the Clements or Reck 4x4 matrix was introduced to simulate the behavior of our photonic processor. For the flower classification problem, we used 150 nodes in each layer and 300 epochs; for the handwritten digit recognition problem, we used 512 nodes in the first layer, 256 in the second, and 20 epochs. The networks were optimized using the stochastic gradient descent algorithm and a cross-entropy loss. All models were trained 25 times. In Fig. 9a-b, we present the results for the Iris dataset. We show the mean training losses and the mean test accuracies during training (solid lines), together with the confidence band, which shows the variability between models caused by the limited precision of the photonic processor. We achieve a mean final accuracy of 95.2 ± 1.6 % for both the Clements and Reck architectures. The results for the MNIST dataset are presented in Fig. 9c-d. The final accuracy is 97.358 ± 0.072 % for the Clements architecture and 97.358 ± 0.076 % for the Reck architecture.

Figure 9: a-b Loss and accuracy on the flower classification problem using the Iris dataset, and c-d loss and accuracy on the handwritten classification problem using the MNIST dataset.
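The noise-aware network used in this subsection can be sketched as a forward pass in NumPy. This is only an illustration under several assumptions that the text does not specify: the noise is additive and applied to each layer's output, biases are omitted, and a separate output layer follows the two hidden layers. The `SIGMA` values are the 4x4 error standard deviations reported for the Clements and Reck meshes; the function name is hypothetical, and the training loop (stochastic gradient descent with cross-entropy loss) is not shown.

```python
import numpy as np

# Standard deviations of the 4x4 weight errors reported for each architecture.
SIGMA = {"clements": 0.0246, "reck": 0.0216}


def noisy_forward(x, weights, sigma, rng):
    """Forward pass with clipped weights and Gaussian noise after every layer.

    x       : array of shape (batch, features)
    weights : list of weight matrices, one per layer
    sigma   : standard deviation of the additive Gaussian noise
    """
    h = x
    last = len(weights) - 1
    for i, w in enumerate(weights):
        w = np.clip(w, -1.0, 1.0)                          # weights limited to [-1, 1]
        h = h @ w
        h = h + rng.normal(scale=sigma, size=h.shape)      # processor imprecision
        if i < last:
            h = np.maximum(h, 0.0)                         # ReLU, skipped at the output
    return h
```

For the Iris problem, for instance, `weights` could have shapes (4, 150), (150, 150), and (150, 3), and `sigma=SIGMA["clements"]` would model the Clements mesh.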

V.2 Quantum Gates

Linear photonic integrated circuits have also shown promising results by providing computational advantages in quantum systems Madsen et al. (2022). We demonstrate the potential of general-purpose programmable photonic circuits in quantum computing applications by programming a set of quantum logic gates and comparing the experimental results with the ideal matrices. We implement the CNOT gate shown in Eq. (15), the Pauli Y gate extended to a 4x4 matrix shown in Eq. (16), and the Hadamard gate shown in Eq. (17). The experimental results are illustrated in Fig. 10a for the CNOT gate, Fig. 10b for the Pauli Y gate, and Fig. 10c for the Hadamard gate. We achieved root mean squared errors of 0.020, 0.021, and 0.035, respectively, and fidelities of 0.99, 0.99, and 0.97, respectively.

$$\begin{pmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0\end{pmatrix} \qquad (15)$$

$$\begin{pmatrix}0&-i&0&0\\ i&0&0&0\\ 0&0&0&-i\\ 0&0&i&0\end{pmatrix} \qquad (16)$$

$$\frac{1}{2}\begin{pmatrix}1&1&1&1\\ 1&-1&1&-1\\ 1&1&-1&-1\\ 1&-1&-1&1\end{pmatrix} \qquad (17)$$
Figure 10: a CNOT gate, b Pauli Y gate and c Hadamard gate.
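The three target matrices of Eqs. (15)-(17) can be built and checked for unitarity with a few lines of NumPy. The sketch below constructs the 4x4 Pauli Y and Hadamard gates as Kronecker products of their 2x2 versions, which reproduces the matrices given above. The `rmse` helper is a hypothetical illustration of the error metric; how the measured matrix is obtained and which quantity (amplitude or power) is compared are assumptions, as the text does not specify them.

```python
import numpy as np

I2 = np.eye(2)
Y = np.array([[0, -1j], [1j, 0]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

CNOT = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=complex,
)
PAULI_Y_4 = np.kron(I2, Y)   # block-diagonal diag(Y, Y), Eq. (16)
HADAMARD_4 = np.kron(H, H)   # two-qubit Hadamard, Eq. (17)


def is_unitary(m, tol=1e-12):
    """Check that M^dagger M equals the identity within a tolerance."""
    return bool(np.allclose(m.conj().T @ m, np.eye(m.shape[0]), atol=tol))


def rmse(ideal, measured):
    """Root mean squared error between two arrays of the same shape."""
    diff = np.asarray(measured) - np.asarray(ideal)
    return float(np.sqrt(np.mean(np.abs(diff) ** 2)))
```

Since all three matrices are unitary, they can be decomposed onto either the rectangular or the triangular mesh in the same way as the random matrices of the previous section.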

VI Discussion and Conclusions

In this study, we have experimentally demonstrated the implementation of 3x3 and 4x4 random unitary matrices on a general-purpose programmable photonic processor. To the best of our knowledge, this is the first time that random unitary matrices have been programmed on this type of platform using both the rectangular and the triangular arrangements. The success of this implementation is attributed to the adaptation of the META-MZI algorithm for the calibration of the architecture-specific passive phases and to the construction of a building block mathematically equivalent to the Mach-Zehnder interferometer with one internal and one external phase shifter. Our findings indicate that the rectangular and triangular architectures perform similarly. The small differences in fidelity and bit precision are within the margins of error. The obtained fidelities are compatible with quantum experiments. Pentangelo et al. (2024); Maring et al. (2024) Moreover, 5-bit precision has been shown to be sufficient for quantized deep neural networks, Ma et al. (2024); Li et al. (2017); Jacob et al. (2018) highlighting the capability of the processor to support advanced computational tasks. To support these statements, we trained two neural networks using the measured experimental data to solve the Iris classification problem and the MNIST handwritten digit recognition problem, achieving competitive performance. In addition, we demonstrated the capabilities of general-purpose processors on quantum tasks by programming the CNOT, Pauli Y, and Hadamard gates, achieving RMSEs of at most 0.035 and fidelities of at least 97%.

While our results are promising, it is important to acknowledge the scalability challenges associated with the system. Compared to application-specific systems, our general-purpose processor exhibits additional losses. In the splitter tree, we use PUCs instead of 3-dB MMIs, which present lower insertion losses in commercially available fabrication processes. Furthermore, we have shown that extra PUCs must be added to the splitter tree as the matrix size scales. The matrix stage also introduces extra losses, since we require twice the number of PUCs compared to an ASPIC. Our system presents an insertion loss of 0.48 dB per PUC. Reducing this number becomes crucial for large-scale implementations. PUCs based on thermo-optic phase shifters with losses below 0.1 dB have already been demonstrated Harris et al. (2022), opening the path for high-dimensional photonic linear transformers. It is also possible to increase the size of the matrix multiplication by using a compiler to divide it into smaller multiplications. Guo et al. (2022)

Regarding the power consumption of the system, each PUC in the processor consumes 1.3 mW/$\pi$. The splitter tree requires 14 PUCs, which adds an average consumption of 18.2 mW compared with an ASPIC. For the matrix stage, we require not only the tunable elements but also PUCs set to the bar and cross states for light routing. Moreover, each building block requires 4 phase shifters instead of the 2 used in current ASPICs. The total average consumption of the matrix stage for a 4x4 matrix is 54.6 mW and 61.1 mW using the Clements and Reck architectures, respectively. These values can be reduced by using non-volatile phase-change materials Wuttig, Bhaskaran, and Taubner (2017) for the static PUCs and by using alternative building blocks for more compact and less power-consuming architectures. Bell and Walmsley (2021)
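The power figures in this paragraph can be cross-checked with simple arithmetic. The sketch below assumes that every driven PUC consumes the quoted 1.3 mW on average, which reproduces the 18.2 mW of the 14-PUC splitter tree. Under the same assumption, the matrix-stage totals would correspond to roughly 42 and 47 driven PUCs; these counts are inferred here and are not stated in the text.

```python
P_PUC_MW = 1.3       # average consumption per driven PUC (mW), as quoted
SPLITTER_PUCS = 14   # PUCs used in the splitter tree

# Extra consumption of the splitter tree with respect to an ASPIC.
splitter_mw = SPLITTER_PUCS * P_PUC_MW


def implied_pucs(total_mw, per_puc_mw=P_PUC_MW):
    """Number of driven PUCs implied by a total average consumption (inferred)."""
    return total_mw / per_puc_mw


clements_pucs = round(implied_pucs(54.6))   # matrix stage, Clements 4x4
reck_pucs = round(implied_pucs(61.1))       # matrix stage, Reck 4x4
```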

An important metric is the number of multiply-and-accumulate (MAC) operations per second that can be achieved. In our case, we used thermo-optic phase shifters with a switching speed limited to several microseconds for both vector and matrix encoding. As our system performs complex-valued operations (one complex MAC corresponds to four real MACs), the throughput is $\mathrm{MAC/s}=4N^{2}\,DR$, where $DR$ is the data rate. Our 4x4 matrices can then provide up to 0.64 GMAC/s. The inclusion of high-speed modulators for the input vector encoding can increase this figure to the TMAC/s range. For example, a DR of 20 GS/s translates into 1.28 TMAC/s. No additional changes would be necessary, as high-speed photodetectors working up to 40 GHz are already integrated in the processor.
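The throughput expression above is easy to evaluate for other matrix sizes and data rates. The following sketch uses hypothetical function names and simply implements $\mathrm{MAC/s}=4N^{2}\,DR$ and its inverse.

```python
def macs_per_second(n, data_rate):
    """Throughput 4 * N^2 * DR of an N x N complex matrix at data rate DR (samples/s)."""
    return 4 * n**2 * data_rate


def data_rate_for(n, macs):
    """Data rate needed to reach a given MAC/s figure with an N x N matrix."""
    return macs / (4 * n**2)


tmac = macs_per_second(4, 20e9) / 1e12   # 4x4 matrix at 20 GS/s, in TMAC/s
```

For a 4x4 matrix the prefactor is 64, so 20 GS/s gives 1.28 TMAC/s as stated. Inverting the same formula, the quoted 0.64 GMAC/s corresponds to a data rate of 10 MS/s.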

The use of a general-purpose processor presents significant advantages, particularly in reducing the costs associated with design and fabrication. The flexibility of these processors also allows the integration of extra functionalities such as photonic filters or delay lines, which have already been combined with linear transformations in ASPICs for different coherent applications. Nakajima, Tanaka, and Hashimoto (2021); Romero et al. (2023) Future work could explore how the integration of these functionalities affects the performance and precision of the unitary transformation.

Acknowledgements.
This work was supported by the H2020-ICT2019-2 Neoteric 871330 project, the European Research Council (ERC) Advanced Grant programme under grant agreement No. 101097092 (ANBIT), the ERC Starting Grant programme under grant agreement No. 101076175 (LS-Photonics Project), and the EUR2022-134023 grant funded by CIN/AEI/10.13039/501100011033 and the European Union (NextGenerationEU/PRTR).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References


  • Bogaerts et al. (2020) W. Bogaerts, D. Pérez, J. Capmany, D. A. B. Miller, J. Poon, D. Englund, F. Morichetti,  and A. Melloni, “Programmable photonic circuits,” Nature 586, 207–216 (2020).
  • Pérez, Gasulla, and Capmany (2018) D. Pérez, I. Gasulla,  and J. Capmany, “Field-programmable photonic arrays,” Optics Express 26, 27265–27278 (2018).
  • Pérez et al. (2017) D. Pérez, I. Gasulla, L. Crudgington, D. J. Thomson, A. Z. Khokhar, K. Li, W. Cao, G. Z. Mashanovich,  and J. Capmany, “Multipurpose silicon photonics signal processor core,” Nature Communications 8, 636 (2017).
  • Pérez-López et al. (2019) D. Pérez-López, A. M. Gutierrez, E. Sánchez, P. DasMahapatra,  and J. Capmany, “Integrated photonic tunable basic units using dual-drive directional couplers,” Opt. Express 27, 38071–38086 (2019).
  • Zhuang et al. (2015) L. Zhuang, C. G. H. Roeloffzen, M. Hoekman, K.-J. Boller,  and A. J. Lowery, “Programmable photonic signal processor chip for radiofrequency applications,” Optica 2, 854–859 (2015).
  • Marpaung, Yao, and Capmany (2019) D. Marpaung, J. Yao,  and J. Capmany, “Integrated microwave photonics,” Nature Photonics 13, 80–90 (2019).
  • Miller (2017) D. A. B. Miller, “Attojoule Optoelectronics for Low-Energy Information Processing and Communications,” Journal of Lightwave Technology 35, 346–396 (2017).
  • Pérez-López et al. (2024) D. Pérez-López, A. Gutierrez, D. Sánchez, A. López-Hernández, M. Gutierrez, E. Sánchez-Gomáriz, J. Fernández, A. Cruz, A. Quirós, Z. Xie, J. Benitez, N. Bekesi, A. Santomé, D. Pérez-Galacho, P. DasMahapatra, A. Macho,  and J. Capmany, “General-purpose programmable photonic processor for advanced radiofrequency applications,” Nature Communications 15, 1563 (2024).
  • Harris et al. (2018) N. C. Harris, J. Carolan, D. Bunandar, M. Prabhu, M. Hochberg, T. Baehr-Jones, M. L. Fanto, A. M. Smith, C. C. Tison, P. M. Alsing,  and D. Englund, “Linear programmable nanophotonic processors,” Optica 5, 1623–1631 (2018).
  • Clements et al. (2016) W. R. Clements, P. C. Humphreys, B. J. Metcalf, W. S. Kolthammer,  and I. A. Walmsley, “Optimal design for universal multiport interferometers,” Optica 3, 1460–1465 (2016).
  • Reck et al. (1994) M. Reck, A. Zeilinger, H. J. Bernstein,  and P. Bertani, “Experimental realization of any discrete unitary operator,” Physical Review Letters 73, 58–61 (1994).
  • Bell and Walmsley (2021) B. A. Bell and I. A. Walmsley, “Further compactifying linear optical unitaries,” APL Photonics 6, 70804 (2021).
  • Harris et al. (2017) N. C. Harris, G. R. Steinbrecher, M. Prabhu, Y. Lahini, J. Mower, D. Bunandar, C. Chen, F. N. C. Wong, T. Baehr-Jones, M. Hochberg, S. Lloyd,  and D. Englund, “Quantum transport simulations in a programmable nanophotonic processor,” Nature Photonics 11, 447–452 (2017).
  • Carolan et al. (2015) J. Carolan, C. Harrold, C. Sparrow, E. Martín-López, N. J. Russell, J. W. Silverstone, P. J. Shadbolt, N. Matsuda, M. Oguma, M. Itoh, G. D. Marshall, M. G. Thompson, J. C. F. Matthews, T. Hashimoto, J. L. O’Brien,  and A. Laing, “Universal linear optics,” Science 349, 711–716 (2015).
  • Wang et al. (2020) J. Wang, F. Sciarrino, A. Laing,  and M. G. Thompson, “Integrated photonic quantum technologies,” Nature Photonics 14, 273–284 (2020).
  • Taballione et al. (2023) C. Taballione, M. C. Anguita, M. de Goede, P. Venderbosch, B. Kassenberg, H. Snijders, N. Kannan, W. L. Vleeshouwers, D. Smith, J. P. Epping, R. van der Meer, P. W. H. Pinkse, H. van den Vlekkert,  and J. J. Renema, “20-Mode Universal Quantum Photonic Processor,” Quantum 7, 1071 (2023).
  • Annoni et al. (2017) A. Annoni, E. Guglielmi, M. Carminati, G. Ferrari, M. Sampietro, D. A. B. Miller, A. Melloni,  and F. Morichetti, “Unscrambling light—automatically undoing strong mixing between modes,” Light: Science & Applications 6, e17110–e17110 (2017).
  • Choutagunta et al. (2020) K. Choutagunta, I. Roberts, D. A. B. Miller,  and J. M. Kahn, “Adapting Mach–Zehnder Mesh Equalizers in Direct-Detection Mode-Division-Multiplexed Links,” Journal of Lightwave Technology 38, 723–735 (2020).
  • Zhou et al. (2020) H. Zhou, Y. Zhao, X. Wang, D. Gao, J. Dong,  and X. Zhang, “Self-Configuring and Reconfigurable Silicon Photonic Signal Processor,” ACS Photonics 7, 792–799 (2020).
  • SeyedinNavadeh et al. (2024) S. SeyedinNavadeh, M. Milanizadeh, F. Zanetto, G. Ferrari, M. Sampietro, M. Sorel, D. A. B. Miller, A. Melloni,  and F. Morichetti, “Determining the optimal communication channels of arbitrary optical systems using integrated photonic processors,” Nature Photonics 18, 149–155 (2024).
  • Miller (2013) D. A. B. Miller, “Self-configuring universal linear optical component,” Photon. Res. 1, 1–15 (2013).
  • Shen et al. (2017) Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund,  and M. Soljačić, “Deep learning with coherent nanophotonic circuits,” Nature Photonics 11, 441–446 (2017).
  • Pai et al. (2023) S. Pai, Z. Sun, T. W. Hughes, T. Park, B. Bartlett, I. A. D. Williamson, M. Minkov, M. Milanizadeh, N. Abebe, F. Morichetti, A. Melloni, S. Fan, O. Solgaard,  and D. A. B. Miller, “Experimentally realized in situ backpropagation for deep learning in photonic neural networks,” Science 380, 398–404 (2023).
  • Zhang et al. (2021) H. Zhang, M. Gu, X. D. Jiang, J. Thompson, H. Cai, S. Paesani, R. Santagati, A. Laing, Y. Zhang, M. H. Yung, Y. Z. Shi, F. K. Muhammad, G. Q. Lo, X. S. Luo, B. Dong, D. L. Kwong, L. C. Kwek,  and A. Q. Liu, “An optical neural chip for implementing complex-valued neural network,” Nature Communications 12, 457 (2021).
  • Prabhu et al. (2020) M. Prabhu, C. Roques-Carmes, Y. Shen, N. Harris, L. Jing, J. Carolan, R. Hamerly, T. Baehr-Jones, M. Hochberg, V. Čeperić, J. D. Joannopoulos, D. R. Englund,  and M. Soljačić, “Accelerating recurrent Ising machines in photonic integrated circuits,” Optica 7, 551–558 (2020).
  • Roques-Carmes et al. (2020) C. Roques-Carmes, Y. Shen, C. Zanoci, M. Prabhu, F. Atieh, L. Jing, T. Dubček, C. Mao, M. R. Johnson, V. Čeperić, J. D. Joannopoulos, D. Englund,  and M. Soljačić, “Heuristic recurrent algorithms for photonic Ising machines,” Nature Communications 11, 249 (2020).
  • Zhang et al. (2023) W. Zhang, A. Tait, C. Huang, T. Ferreira de Lima, S. Bilodeau, E. C. Blow, A. Jha, B. J. Shastri,  and P. Prucnal, “Broadband physical layer cognitive radio with an integrated photonic processor for blind source separation,” Nature Communications 14, 1107 (2023).
  • Romero et al. (2023) P. M.-C. Romero, J. R. Rausell-Campo, D. Pérez-Galacho, X. Li, T. Qing, T. Wang,  and D. Pérez-López, “Integrated Microwave Photonics Coherent Processor for Massive-MIMO Systems in Wireless Communications,” IEEE Journal of Selected Topics in Quantum Electronics 29, 1–12 (2023).
  • Zhang et al. (2024) W. Zhang, J. C. Lederman, T. Ferreira de Lima, J. Zhang, S. Bilodeau, L. Hudson, A. Tait, B. J. Shastri,  and P. R. Prucnal, “A system-on-chip microwave photonic processor solves dynamic RF interference in real time with picosecond latency,” Light: Science & Applications 13, 14 (2024).
  • Bandyopadhyay et al. (2022) S. Bandyopadhyay, A. Sludds, S. Krastanov, R. Hamerly, N. Harris, D. Bunandar, M. Streshinsky, M. Hochberg, and D. Englund, “Single chip photonic deep neural network with accelerated training,” arXiv:2208.01623 (2022).
  • (31) “https://ipronics.com/”.
  • López-Hernández, Gutiérrez-Zubillaga, and Pérez-López (2022) A. López-Hernández, M. Gutiérrez-Zubillaga,  and D. Pérez-López, “Automatic Self-calibration of Programmable Photonic Processors,” in 2022 IEEE Photonics Conference (IPC) (2022) pp. 1–2.
  • López et al. (2020) A. López, D. Pérez, P. DasMahapatra,  and J. Capmany, “Auto-routing algorithm for field-programmable photonic gate arrays,” Opt. Express 28, 737–752 (2020).
  • Lin et al. (2024) S. Lin, Y. Zhang, Z. Wu, S. Zeng, Q. Gao, J. Li, X. Yu, and S. Yu, “Power-efficient programmable integrated multiport photonic interferometer in CMOS-compatible silicon nitride,” Photon. Res. 12, A11–A20 (2024).
  • Alexiev et al. (2021) C. Alexiev, J. C. C. Mak, W. D. Sacher,  and J. K. S. Poon, “Calibrating rectangular interferometer meshes with external photodetectors,” OSA Continuum 4, 2892–2904 (2021).
  • Pentangelo et al. (2024) C. Pentangelo, N. D. Giano, S. Piacentini, R. Arpe, F. Ceccarelli, A. Crespi,  and R. Osellame, “High-fidelity and polarization-insensitive universal photonic processors fabricated by femtosecond laser writing,” Nanophotonics 13, 2259–2270 (2024).
  • Zhang et al. (2022) W. Zhang, C. Huang, H.-T. Peng, S. Bilodeau, A. Jha, E. Blow, T. F. de Lima, B. J. Shastri,  and P. Prucnal, “Silicon microring synapses enable photonic deep learning beyond 9-bit precision,” Optica 9, 579–584 (2022).
  • Liao et al. (2023) K. Liao, T. Dai, Q. Yan, X. Hu,  and Q. Gong, “Integrated Photonic Neural Networks: Opportunities and Challenges,” ACS Photonics 10, 2001–2010 (2023).
  • Ma et al. (2024) S. Ma, H. Wang, L. Ma, L. Wang, W. Wang, S. Huang, L. Dong, R. Wang, J. Xue, and F. Wei, “The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits,” arXiv:2402.17764 [cs.CL] (2024).
  • Li et al. (2017) H. Li, S. De, Z. Xu, C. Studer, H. Samet,  and T. Goldstein, “Training quantized nets: a deeper understanding,” in Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17 (Curran Associates Inc., Red Hook, NY, USA, 2017) pp. 5813–5823.
  • Jacob et al. (2018) B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam,  and D. Kalenichenko, “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (Salt Lake City, UT, USA, 2018) pp. 2704–2713.
  • Fisher (1988) R. A. Fisher, “Iris,” UCI Machine Learning Repository (1988).
  • Lecun et al. (1998) Y. Lecun, L. Bottou, Y. Bengio,  and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE 86, 2278–2324 (1998).
  • Madsen et al. (2022) L. S. Madsen, F. Laudenbach, M. F. Askarani, F. Rortais, T. Vincent, J. F. F. Bulmer, F. M. Miatto, L. Neuhaus, L. G. Helt, M. J. Collins, A. E. Lita, T. Gerrits, S. W. Nam, V. D. Vaidya, M. Menotti, I. Dhand, Z. Vernon, N. Quesada,  and J. Lavoie, “Quantum computational advantage with a programmable photonic processor,” Nature 606, 75–81 (2022).
  • Maring et al. (2024) N. Maring, A. Fyrillas, M. Pont, E. Ivanov, P. Stepanov, N. Margaria, W. Hease, A. Pishchagin, A. Lemaître, I. Sagnes, T. H. Au, S. Boissier, E. Bertasi, A. Baert, M. Valdivia, M. Billard, O. Acar, A. Brieussel, R. Mezher, S. C. Wein, A. Salavrakos, P. Sinnott, D. A. Fioretto, P.-E. Emeriau, N. Belabas, S. Mansfield, P. Senellart, J. Senellart, and N. Somaschi, “A versatile single-photon-based quantum computing platform,” Nature Photonics (2024), 10.1038/s41566-024-01403-4.
  • Harris et al. (2022) N. C. Harris, D. Bunandar, A. Joshi, A. Basumallik,  and R. Turner, “Passage: A Wafer-Scale Programmable Photonic Communication Substrate,” in 2022 IEEE Hot Chips 34 Symposium (HCS) (2022) pp. 1–26.
  • Guo et al. (2022) Z. Guo, A. N. Tait, B. A. Marquez, M. Filipovich, H. Morison, P. R. Prucnal, L. Chrostowski, S. Shekhar,  and B. J. Shastri, “Multi-Level Encoding and Decoding in a Scalable Photonic Tensor Processor With a Photonic General Matrix Multiply (GeMM) Compiler,” IEEE Journal of Selected Topics in Quantum Electronics 28, 1–14 (2022).
  • Wuttig, Bhaskaran, and Taubner (2017) M. Wuttig, H. Bhaskaran,  and T. Taubner, “Phase-change materials for non-volatile photonic applications,” Nature Photonics 11, 465–476 (2017).
  • Nakajima, Tanaka, and Hashimoto (2021) M. Nakajima, K. Tanaka,  and T. Hashimoto, “Scalable reservoir computing on coherent linear photonic processor,” Communications Physics 4, 20 (2021).