# Storing information in light

## What is information?

This probably seems like a silly question at first, but it is important to get a solid concept of what information means in a more formal sense before we start discussing processing and transferring information. This is especially true because the way we think of information informally is generally very different from the way in which it is useful for mathematics and computing. Informally, we usually think of information as facts, such as the fact that the snowy albatross has the largest wingspan of any living bird. However, such a definition-by-example is not particularly useful when thinking systematically about how information is processed and stored.

One reason is that when designing a computer, communication system, or other system that deals with information, we don’t know or care which specific information will be processed. The concept that is useful in this context is the information capacity: how much information can be stored or transmitted in a system. It turns out that this is a well-defined question that can be treated mathematically. It can also be related to our more informal concept of information: we can convert the letters in the phrase “the snowy albatross has the largest wingspan of any living bird” into ones and zeros by assigning each letter a string of ones and zeros. There are different ways to do this, but when the sentence is stored in a plain text file, we find that the size of this file is $520$ bits ($65$ bytes of $8$ bits each). In other words, using the method that most computers use, if we have $520$ things that each have two possible configurations, then it is possible to store this sentence, but not with less. The general formula for how much information (measured in bits) can be stored in a system where configuration $i$ appears with probability $p_i$, known as the information entropy, is

$$H = -\sum_{i=1}^{d} p_i \log_2 p_i.$$

The most efficient way to store information is to use each of the $d$ configurations roughly an equal proportion of the time. In other words, set $p_i = 1/d$. In this case, the information capacity simplifies to

$$H = \log_2 d.$$

This formula makes intuitive sense: it takes four configurations to store two bits of information, one for each possible bitstring $00$, $01$, $10$, or $11$. Every time the number of bits of information that needs to be stored increases by one, $d$ has to double. Rearranging a bit, with $n$ the number of bits, we can write

$$d = 2^n.$$

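These relations between probabilities, configurations, and bits are easy to check numerically. Below is a minimal Python sketch (the function name `entropy_bits` is our own, not a standard API):

```python
import math

def entropy_bits(probs):
    """Shannon information entropy in bits: H = -sum(p_i * log2(p_i))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A uniform distribution over d = 4 configurations stores log2(4) = 2 bits:
# one configuration for each bitstring 00, 01, 10, 11.
print(entropy_bits([0.25, 0.25, 0.25, 0.25]))  # -> 2.0

# Any non-uniform distribution over the same d configurations stores less.
print(entropy_bits([0.7, 0.1, 0.1, 0.1]))  # -> about 1.36 bits
```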
## Bits and dits

If we apply our formula for $d$ naively, we get a result that may initially seem surprising. For example, if we were to directly store our new favorite piece of information about birds in a single system, it would need to have $2^{520}=3.4\times10^{156}$, or $3.4$ thousand (billion billion billion… $14$ more times), possible configurations. This is an inconceivably large number; there are only believed to be $10^{86}$ __atoms in the known universe__, which is less than one millionth (billionth billionth… $5$ more times) of this number. What this really tells us is that there is a huge number of possible combinations of these bits. In other words, combining systems that can each independently take a few configurations creates a rapidly growing number of overall configurations. This is exactly the source of difficulty in combinatorial optimization, as discussed here.
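The arithmetic here is easy to verify, since Python handles arbitrarily large integers natively:

```python
# Configurations needed to store 520 bits in one single system: d = 2**520.
d = 2 ** 520
print(f"{d:.1e}")  # -> 3.4e+156

# Rough estimate of the number of atoms in the known universe, for comparison.
atoms = 10 ** 86
print(f"{atoms / d:.0e}")  # the universe falls short by roughly 70 orders of magnitude
```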

The key idea here is that the efficient way to store information is to have separate small systems that can each take a number of states. In classical computers, subsystems that take two possible states (i.e. bits) are common, but a system that takes $d$ possible configurations could be called a *dit*. The previous paragraph shows why making a computer with a single dit is impractical: the $d$ would have to be enormous, even to store a simple sentence. Instead, we combine multiple bits to store an exponentially growing set of configurations.
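The exponential growth from combining independent subsystems can be seen directly by enumerating the configurations (a small Python illustration):

```python
from itertools import product

# Each added two-state subsystem (bit) doubles the number of overall
# configurations of the combined system.
for n in (1, 2, 3, 4):
    configs = list(product("01", repeat=n))
    print(n, len(configs))  # -> 1 2, then 2 4, 3 8, 4 16
```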

For quantum systems, the fundamental ideas of information are not much changed. To denote quantum capabilities, bits become qubits and dits become __qudits__. In fact, an important early result in quantum computing was to show this need mathematically: to be effective, quantum computers need “__a robust tensor product structure__”, which is essentially a very mathematical way of saying that they need to be composed of multiple qubits or qudits. We now have the two components we need to discuss encoding information in photons: a single qubit/qudit component, and ways to store multiple of these independently.

## Time domain encoding

Fortunately, there is a very easy way to keep light particles separate: each can be physically separated along a fiber. Essentially, a “train” of separate particles can be used; as long as the particles don’t overlap, each acts as a separate qubit/dit. We refer to this strategy as **time domain encoding**. In principle, there are other ways to separate light particles: any of the methods to store information in light particles discussed in the next section could in principle also be used to separate light particles encoding different qubits/dits. However, separating them in the time domain has a clear advantage: it can scale, and more qubits or qudits can be accommodated simply by adding a longer optical fiber.

## How much information can a light particle store?

The question then becomes: how much information can a single photon store? In other words, if we consider a photon as a qudit, what is $d$? Let’s first consider this question very theoretically and set aside for the moment what is practical to do experimentally. Firstly, light has polarization, which corresponds to the direction in which the electric and magnetic fields point. Since in empty space these fields are perpendicular to the direction the photon travels, and space is three-dimensional, this leaves two dimensions, so two independent polarizations are allowed.

Next, we can think about the color of the light, corresponding to its wavelength. At first, it seems like there should be an infinite number of possible wavelengths. However, even with theoretically perfect lab equipment, this comes with a tradeoff: how accurately the wavelength can be measured depends on how long the pulse of light is, with more precision for longer packets. Making the packets longer means they have to be more spaced out, reducing the number of qudits. We can still imagine making this tradeoff and having a particle with $d_\mathrm{wavelength}$ independently resolvable colors.

Another tradeoff we could make is to allow each light particle to take different positions corresponding to different arrival times, which we could resolve by measurement. Here the tradeoff is more obvious: if each light particle can take $d_\mathrm{time}$ different positions, we are reducing the total number of qudits we can fit into the same amount of space by this factor.

There is one more, somewhat obscure, property that can be used to encode information, known as orbital angular momentum. This is fundamentally a wave property of light which __cannot easily be fully expressed without a long digression on wave mechanics__. Roughly, it can be thought of as the light particle moving in a “corkscrew” pattern through space. In principle, there are an infinite number of orbital angular momentum modes a light particle can have, and no obvious tradeoff with the number of qudits. In a very theoretical sense, one could say there are an infinite number of possible orbital angular momentum configurations; still, we will assign this degree of freedom a finite number $d_\mathrm{OAM}$.

Since all four degrees of freedom can (in principle) be independently measured, we obtain a theoretical overall $d$ for a single photon qudit of

$$d = 2 \times d_\mathrm{wavelength} \times d_\mathrm{time} \times d_\mathrm{OAM},$$

where the factor of $2$ comes from polarization.

This corresponds to the number of bits a photon can in principle carry,

$$n = \log_2 d = 1 + \log_2 d_\mathrm{wavelength} + \log_2 d_\mathrm{time} + \log_2 d_\mathrm{OAM}.$$

As mentioned previously, this equation should probably be taken with a big grain of salt: it assumes that we are able to experimentally implement measurements which would in practice be very difficult, on top of designing a computer that can use these different ways of storing information effectively. There is, however, a very important point here: even a single particle of light can carry a lot of information.
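To make the bookkeeping concrete, here is a toy calculation. The numerical values assigned to $d_\mathrm{wavelength}$, $d_\mathrm{time}$, and $d_\mathrm{OAM}$ below are invented purely for illustration, not taken from any experiment:

```python
import math

# Hypothetical dimensions for each degree of freedom (illustrative only).
d_polarization = 2
d_wavelength = 8
d_time = 4
d_oam = 16

# Overall qudit dimension is the product of the independent dimensions,
# and the bit capacity is its base-2 logarithm.
d_total = d_polarization * d_wavelength * d_time * d_oam
bits = math.log2(d_total)
print(d_total, bits)  # -> 1024 10.0
```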

## Computing with light particles in practice

When constructing a real physical device, we need to think about what can practically be done in the lab, so many more considerations come into play. In particular, encoding into multiple degrees of freedom is likely to be difficult, since we would need to develop physical interactions that can carry out the actual computation across multiple facets of the light particles. However, the flexibility of light particles allows room to grow into more sophisticated encodings. For example, __in this paper__, our core technology is demonstrated to be able to separate modes by orbital angular momentum. In another work, __our core technology is used to sort by wavelength__. This gives us options for advanced encodings as our technology continues to mature. Some ways of encoding are also more stable than others; for example, time domain encoding tends to be more reliable than polarization encoding.

It is best, however, to start simple. Arguably the simplest encoding scheme for individual qubits is time-bin encoding. For example, a qubit could be encoded in two potential arrival times for a single light particle, arriving either earlier or later. A series of such light particles can then encode a string of bits, as shown below:

Since we already need to create trains of light pulses to obtain the encoding structure we want (separate qubits/dits, or a “robust tensor product structure” mathematically), this encoding is a natural starting point. A natural extension is to move from single light particles to a few light particles. In this case, the number of light particles in a pulse encodes information, creating a qudit. The only added requirement is a sensor that can count the particles. We can think of this as number encoding or particle encoding. As discussed in our content about computing with a few photons, whether the discreteness of the particles is apparent experimentally depends on the kinds of optical states used. In a way, this gives us more freedom in the kind of information we can encode: number states with well-defined numbers of photons will give effectively discrete variables, as shown below:

On the other hand, weak coherent states (or any other state with Poissonian statistics) will give a variable that acts more continuously. In principle, weak coherent states are only continuous up until detection, at which point a perfect detector could resolve individual photons, but the number would be drawn from a random distribution determined by the continuous variable, as shown below:
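This Poissonian behavior is easy to simulate. The sketch below draws photon counts for a weak coherent state of mean photon number $0.5$; the helper function is our own and implements Knuth's Poisson-sampling algorithm:

```python
import math
import random

def sample_photon_count(mean_photons, rng):
    """Draw one photon count from a Poisson distribution (Knuth's algorithm)."""
    limit = math.exp(-mean_photons)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

# A weak coherent state with |alpha|^2 = 0.5: the *mean* photon number is a
# continuous variable, but each detection yields a discrete photon count.
rng = random.Random(0)
counts = [sample_photon_count(0.5, rng) for _ in range(100_000)]
print(sum(counts) / len(counts))  # close to the mean of 0.5
print(min(counts))                # counts are non-negative integers
```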

This is a unique behavior of photonic systems: the kind of states used controls the kind of information they encode.

## Conclusion

When designing a computer, thinking about how information is stored and processed is of utmost importance, and every system comes with benefits and tradeoffs. Storing information in light has some very tangible benefits, but also comes with some engineering challenges. When executed correctly, computers that store and process information with light will be a very important and viable method of computation in the near future.