Physicist, Tinkerer
One of the most important applications of quantum computers is breaking RSA encryption efficiently, i.e. not in exponential time. The quantum algorithm that does this is Shor’s algorithm, which efficiently factors large numbers into their prime factors. In this post I will go through how Shor’s does this at a pretty technical level. Knowledge of linear algebra is required and a little group theory helps. Bra-ket notation is used extensively throughout the post but I do explain it and how it maps to normal linear algebra notation to let readers from other fields follow along.
Suppose we want to factor $N$ into its prime factors. The inefficient classical way to do this is a brute force approach that is exponential. The most efficient classical algorithm for factoring numbers is the general number field sieve that takes sub-exponential time. Shor’s Algorithm is a quantum algorithm that can do this $\mathcal{O}\left(\left(\log N\right)^3\right)$ where N is the size of the integer.
Shor’s algorithm is essentially a classical algorithm which requires a quantum approach for one step. In this section we outline the classical approach.
For this discussion, we will make extensive use of Euclid’s greatest common divisor (gcd) algorithm. This is efficient and has a logarithmic complexity.
Suppose we want to factor $N$. Let’s start with a guess $a$. First let’s compute $d = \gcd(a,N)$. If $d \neq 1$ we are done because we guessed luckily and found a divisor $d$ of $N$. You can find the other one by doing $\frac{N}{d}$.
Instead assume that $d = 1$, i.e. $a$ and $N$ are coprime integers. This means that $a$ is an element of the multiplicative group of integers modulo $N$, i.e. $a\in \left(\mathbb{Z}/N\mathbb{Z}\right)^{\times}$.
Now consider $\langle a \rangle$ the subgroup of $\left(\mathbb{Z}/N\mathbb{Z}\right)^{\times}$ generated by $a$
\[\langle a \rangle = \{a^k | a\in \left(\mathbb{Z}/N\mathbb{Z}\right)^{\times}, k \in \mathbb{Z}\}\]This is a subgroup of the multiplicative group and so $a$ has an order $r$ which is the smallest integer such that $a^r = 1 \mod N$. For now suppose by magic you know what this $r$ is. This is the part that will require a quantum algorithm to do efficiently, which we will discuss later.
The statement $a^r = 1 \mod N$ means that $a^r - 1$ is divisible by $N$. Let’s factorise this into $\left(a^{\frac{r}{2}} + 1\right)\left(a^{\frac{r}{2}} - 1\right)$. Note that if $r$ was an odd number, $\frac{r}{2}$ will not be an integer so we have to restart with a different guess of $a$.
Does $N$ divide either of these factors? Suppose $N$ divides $\left(a^{\frac{r}{2}} - 1\right)$. That means $a^{\frac{r}{2}} = 1 \mod N$ which cannot be true since $r$ was already the smallest integer such that $a^r = 1 \mod N$. Let’s consider instead whether $N$ divides $a^{\frac{r}{2}} + 1$. More precisely, let’s compute $x = \gcd\left(a^{\frac{r}{2}} + 1, N\right)$. If $x = 1$, there is no common divisor and this was a bad guess so we need a different guess for $a$. Suppose instead that $x \neq 0$, this means that $x$ divides $N$, and we have found a divisor (and the other divisor $\frac{N}{x}$) and we are done.
Based on the above argument/proof, the (incomplete) algorithm to factor $N$ is as follows:
The quantum speedup in Shor’s Algorithm comes from applying quantum methods to find the order of $a$ in step 3. Let the order (or period) be called $r$. The basic strategy to find $r$ is to first apply an algorithm called quantum phase estimation with the unitary operator $U$ which represents multiplication by $a$ modulo $N$. When applied to the eigenvectors of $U$, phase estimation will return the eigenvalues which encode the value of $r$. After sampling the output of the phase estimation, we use the continued fractions algorithm to get the value of $r$ from the values sampled from the output of the algorithm.
Before expanding on the quantum order finding algorithm, we set out some preliminaries on quantum information. This introduction is intended for someone who might not necessarily have an exposure to quantum mechanics, but has an adequate linear algebra background. We go over the bare minimum of the mathematical axioms required to explain the implementation of Shor’s algorithm. Those comfortable with braket notation can skip this section. Those looking for a more in depth introduction can check out Nielsen and Chuang.
The fundamental unit of quantum information is the qubit. A classical bit can be either 0 or 1. A qubit is a quantum system with which can occupy two quantum states, which we label $\ket{0}$ and $\ket{1}$. Qubits can be realised in a variety of physical systems from electrons in atomic energy orbitals to micrometer scale electromagnetic circuits made out of superconductors. The question of how to engineer high quality qubits is a central one in quantum computing right now but for this blog post I’ll assume we have perfect qubits with negligible error rates.
Unlike a bit, a qubit can be in an arbitrary superposition of $\ket{0}$ and $\ket{1}$, i.e. it can be in a state
\[\ket{\psi} = \alpha\ket{0} + \beta\ket{1}\]Where $\alpha$ and $\beta \in \mathbb{C}$ and $\ket{0}$, $\ket{1}$ are orthonormal vectors. This is just physics notation for vectors in a complex Hilbert space. $\ket{a}$ (called a ket) is equivalent to $\vec{a}$ and $\bra{a}$ (called a bra) is equivalent to $\vec{a}^{\dagger}$. The inner product $\vec{a}^{\dagger} \vec{b}$ is denoted by $\braket{a}{b}$, which is called a bra(c)ket. Physicists prefer braket notation because it’s kind of silly to write something like $\vec{0}^{\dagger}\vec{1}$.
While a qubit can be in a superposition of the 0 and 1 basis states, any measurement on the qubit will only return either 0 or 1. The probability of each result is given by the inner product:
\[\Pr(0) = \left|\braket{0}{\psi}\right|^2 = \left|\alpha^2\right|\]$\Pr(1)$ is defined analogously. Note that this definition means that $\abs{\alpha}^2 + \abs{\beta}^2 = 1$ and that for any quantum state $\ket{\psi}$ we must have that $\abs{\braket{\psi}{\psi}}^2 = 1$. This is why we require that the basis states $\ket{0}$ and $\ket{1}$ be orthonormal.
Operations on qubits are represented by unitary matrices $U$, i.e. $UU^{\dagger} = I$ or equivalently $U$ preserves the inner product of the Hilbert space. These are the quantum mechanical equivalent of classical logic gates. These operators must be unitary to preserve the norm of the vectors they act on, which is the mathematical way of saying that information isn’t lost by quantum operations. Some basic examples of single qubit operators are the Pauli operators $\sigma_x$, $\sigma_y$, and $\sigma_z$:
\[\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ \end{pmatrix}, \ \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \\ \end{pmatrix}, \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \\ \end{pmatrix}\]Classical computers require many billions and trillions of bits to perform their operations, so it’s unsurprising that quantum computers will also require multiple qubits to talk to each other. Suppose we have two qubits $A$ and $B$ with their corresponding Hilbert state spaces $\mathcal{H_A}$ and $\mathcal{H_B}$. Let the basis states be $\ket{0_A}, \ket{1_A}$ and $\ket{0_B}, \ket{1_B}$. When these two qubits are allowed to physically interact with each other, we say they have become entangled, and the state space of the combined A-B system is given by the tensor product of Hilbert spaces $\mathcal{H} = \mathcal{H_A} \otimes \mathcal{H_B}$. More concretely,the basis for $\mathcal{H}$ is given by $\ket{0_A 0_B}, \ket{0_A 1_B}, \ket{1_A 0_B}, \ket{1_A 1_B}$.
Often we will omit the subscripts and implicitly assume that the label at position $n$ in the ket or bra refers to the state of qubit $n$.
Note that it is not always possible to write a $\ket{\psi} \in \mathcal{H}$ as an explicit tensor product of a vector in $\mathcal{H_A}$ and $\mathcal{H_B}$. For example, in the case of two qubits, it’s not hard to show that the state $\ket{00} + \ket{11}$ cannot be “factorised” into the tensor product of something in $\mathcal{H_A}$ and $\mathcal{H_B}$. These new states that emerge when both qubits are entangled but can’t be split back into products of states in the individual qubit subspaces are an example of how quantum entanglement gives rise to new behaviour which can only be seen when two systems interact.
One final note is that to be able to factorise numbers, we will need a way to represent integers in the range ${0, 1, … N-1}$ using qubits. The way we do this is by representing those integers by their binary representations and mapping those onto tensor products of qubits. For example, if you want to represent the integers from 0 to 5, you can do that with three qubits:
\[\begin{align*} \ket{0} &= \ket{000} \ (0\cdot 2^2 + 0\cdot2^1 + 0\cdot2^0 = 0) \\ \ket{1} &= \ket{001} \ (0\cdot 2^2 + 0\cdot2^1 + 1\cdot2^0 = 1) \\ \ket{2} &= \ket{010} \ (0\cdot 2^2 + 1\cdot2^1 + 0\cdot2^0 = 2)\\ \ket{3} &= \ket{011} \ (0\cdot 2^2 + 1\cdot2^1 + 1\cdot2^0 = 3)\\ \ket{4} &= \ket{100} \ (1\cdot 2^2 + 0\cdot2^1 + 0\cdot2^0 = 4)\\ \ket{5} &= \ket{101} \ (1\cdot 2^2 + 0\cdot2^1 + 1\cdot2^0 = 5) \end{align*}\]In general to represent the integers ${0, 1, … N-1}$, you need $n = \ceil{\log_2(N-1)}$ qubits.
Note that in general with $n$ binary bits, the largest number you can make is $2^n - 1$. For our example with 3 qubits/bits, the largest number we can represent is $\ket{111}$ which is $\ket{1\times 2^0 + 1 \times 2^1 + 1 \times 2^2} = \ket{7}$. The largest binary number with $n$ digits is given by $2^0 + 2^1 + \dots + 2^{n-2} + 2^{n-1}$. This can be written as a geometric series $\sum_{j=0}^{n-1} 2^j$ which using the geometric series formula is equal to $2^n - 1$.
Operators can be defined in the combined Hilbert space with the tensor product also. If $U_A$ is a unitary gate acting on $A$ and $U_B$ acts on $B$, the combined action can be defined as $\left(U_A \otimes U_B\right) \ket{\psi}$. If the state $\ket{\psi}$ can be explicitly written as a product of a state in $\mathcal{H}_A$ and $\mathcal{H}_B$, then you can apply each unitary operator to the state from its individual space and then tensor product the results together. So for example if $\ket{\psi} = \ket{0_A 1_B}$ then
\[U_A \otimes U_B \ket{\psi} = \left(U_A \ket{0_A}\right) \otimes \left(U_B \ket{1_B}\right)\]This property will show up when we apply the quantum phase estimation algorithm.
The heart of the order finding procedure is in the quantum phase estimation algorithm. We will simply outline the main steps/results of the phase estimation algorithm. Those interested in seeing why this works, including a more in depth explanation of the Quantum Fourier Transform and its relationship to the Discrete Fourier Transform, can find a good explanation in Nielsen and Chuang.
Given a unitary operator $U$ and an eigenvector $\ket{u}$ with eigenvalue $e^{2\pi i \varphi}$ - i.e. $U\ket{u} = e^{2\pi i \varphi} \ket{u}$, the phase estimation algorithm takes as input a \textit{register} of $t = n + \ceil{\log \left(2 + \frac{1}{2\epsilon} \right)}$ qubits initialised to $\ket{0}$ tensored with another register of as many qubits as necessary to implement the eigenvector $\ket{u}$ and returns an $n$ bit approximation to the phase $\varphi$ with probability $1-\epsilon$. The steps are:
Armed with the phase estimation algorithm, we can now come up with the quantum algorithm to find the order $r$. As explained earlier, if we are dealing with numbers modulo $N$, we need at least $L = \ceil{\log_2 N}$ qubits. For example if we are trying to factor $N = 21$, we will need $L = \ceil{\log_2 21} = 5$ qubits to represent all the kets ${\ket{0}\dots \ket{20}}$. As we said before, in general with $L$ bits, the largest number we can represent is $2^L - 1$ so our full set of qubits will be ${\ket{0}\dots \ket{N-1},\ket{N} \dots \ket{2^L -1}}$. In our $N=21$ example, our computational basis will be ${\ket{0},\dots, \ket{21},\dots, \ket{31}}$. For reasons that will become clearer later, we will use $t = 2L+1 + \ceil{\log\left(2 + \frac{1}{2\epsilon}\right)}$ qubits which will return phase approximations accurate to $2L+1$ bits.
In our original setup of the factoring problem, the guess we made for the factor of $N$ was $a$. Suppose we have a quantum operator $U$ whose action on the states ${\ket{0}, \ket{1} \dots \ket{N-1}}$ is given by
\[U\ket{k} = \ket{ak \ \mathrm{mod} \ N}\]For example if $k = 20, a = 3, N = 21$ then $U\ket{k} = U\ket{20} = \ket{60 \ \mathrm{mod} \ 21} = \ket{18 \ \mathrm{mod} \ 21}$. For basis vectors with an index larger than $N-1$, i.e. in the range $N \leq k \leq 2^L - 1$, we let $U$ act as the identity. So over all of the computational qubits $U$ is defined as
\[U\ket{k} = \begin{cases} \ket{ak \ \mathrm{mod} \ N} & \text{if } 0 \leq k \leq N-1 \\ \ket{k} & \text{if } N \leq k \leq 2^L - 1 \end{cases}\]To show that this is unitary, we will show that it preserves the inner product between two vectors $\ket{\psi} = \frac{1}{\sqrt{n}}\sum_{j=0}^{n-1} c_j \ket{j}$ and $\ket{\phi} = \frac{1}{\sqrt{m}}\sum_{k=0}^{m-1} c_k \ket{k}$. First let’s compute the inner product of these two vectors without applying $U$.
\[\begin{align*} \braket{\phi}{\psi}&= \frac{1}{\sqrt{mn}}\sum_{k = 0}^{m-1} \sum_{j = 0}^{n-1} c_j c_m^*\braket{m}{j} \\ &= \frac{1}{\sqrt{mn}}\sum_{k = 0}^{m-1} \sum_{j = 0}^{n-1} c_j c_m^* \delta_{jm} \end{align*}\]Now applying $U$, we get
\[\begin{align*} \bra{\phi}U^{\dagger}U\ket{\psi}&= \frac{1}{\sqrt{mn}}\sum_{k = 0}^{m-1} \sum_{j = 0}^{n-1} c_j c_m^*\bra{m}U^{\dagger}U\ket{j} \\ \end{align*}\]Since $U$ has a piecewise definition, there are a few cases to consider to evaluate the inner product in the above equation. In the case where $m, j \geq N$, $U$ acts as the identity so the inner product is trivially preserved. In the case where $m, j < N$, $U$ acts by multiplying by $a \ \mathrm{mod} \ N$. The inner product in the summation then becomes $\braket{am|aj} = \delta_{aj, am}$. The fact that $\gcd(a, N) = 1$ is crucial because this means that $a$ has a well defined inverse in the multiplicative inverse modulo $N$ which we naturally call $a^{-1}$. $\delta_{aj, am} = 1$ iff $aj = am$ which because $a$ is invertible is equivalent to $j = m$. Thus $\delta_{aj, am} = \delta_{j, m}$ and the inner product is preserved. In the case where $m < N$ and $ j \geq N$, the unitary map will take $m$ to $am \ \mathrm{mod} \ N$ which is still less than $N$ so the inner product will always evaluate to $0$, just as it would if $U$ hadn’t been applied. Thus the operator preserves the inner product between two arbitrary vectors and it is a unitary operator.
Since we know that $a^r = 1$ modulo $N$, we know that $U^r = 1$. This means that the eigenvalues of the matrix have to be the $r-$th roots of unity, i.e. $\lambda_j = e^{\frac{2\pi ij}{r}}$.
Now that we know that $U$ is indeed a unitary operator and its eigenvalues, we need to find its eigenvectors to be able to use the phase estimation algorithm to find $r$. We are mainly interested in vectors in the subspace spanned by the interesting sub-basis of vectors ${\ket{0}\dots \ket{N-1}}$ since $U$ acts trivially on all other basis vectors. Since the action of $U$ is to map $\ket{k}$ to $\ket{ak}$, one might intuitively guess that the eigenvectors should be sums of powers of $a$, i.e. of the form
\[\ket{u} = \sum_{k = 0}^{r-1} c_k \ket{a^k}\]Where $r$ is the period (i.e. $a^r = 1 \ \mathrm{mod} \ N$) that we are trying to find to solve the original factoring problem.
As a toy example, consider the action of $U$ on $\ket{u} = \ket{1} + \ket{a} + \ket{a^2} + \dots + \ket{a^{r-1}}$:
\[\begin{align*} U\ket{u} &= U\left(\ket{1} + \ket{a} + \ket{a^2} + \dots + \ket{a^{r-1}}\right) \\ &= \ket{a} + \ket{a^2} + \ket{a^3} + \dots + \ket{1} \\ &= \ket{u} \end{align*}\]Since $U$ basically just permutes powers of $a$ into each other, we see that eigenvectors have to have this form. As the above example shows $\ket{u}$ is an eigenvector with eigenvalue $1$.
The general form of normalised eigenvectors is given by
\[\ket{u_j} = \frac{1}{\sqrt{r}}\sum_{k = 0}^{r-1}e^{-\frac{2\pi i j k}{r}} \ket{a^k}\]As an example, we can verify this for the case where $j = 1$.
\[\begin{align*} \ket{u_1} &= \frac{1}{\sqrt{r}}\left[\ket{1} + e^{-\frac{2\pi i}{r}} \ket{a} + e^{-\frac{2\pi(2)i}{r}} \ket{a^2} + \dots + e^{-2\pi i\frac{r-2}{r}} \ket{a^{r-2}} + e^{-2\pi i\frac{r-1}{r}} \ket{a^{r-1}}\right] \\ U\ket{u_1} &= \frac{1}{\sqrt{r}} \left[\ket{a} + e^{-\frac{2\pi i}{r}} \ket{a^2} + e^{-\frac{2\pi(2)i}{r}} \ket{a^3} + \dots + e^{-2\pi i\frac{r-2}{r}} \ket{a^{r-1}} + \ket{1} \right] \\ &= e^{\frac{2\pi i}{r}} \ket{u_1} \end{align*}\]Now we have the eigenvectors and eigenvalues of $U$. However note that both depend on knowing what $r$ is, which is what we originally wanted to find out. The trick to resolving this is a brilliant observation that while the individual eigenvectors of $U$ depend on $r$, their sum does not. Each of the $r$ eigenvalues corresponds to a unique eigenvector and summing all of these eigenvectors up and multiplying by $\frac{1}{\sqrt{r}}$ yields
\[\begin{align*} \frac{1}{\sqrt{r}}\sum_{j=0}^{r-1}\ket{u_j} &= \frac{1}{r}\sum_{j=0}^{r-1} \sum_{k=0}^{r-1} e^{-\frac{2\pi i kj}{r}}\ket{a^k} \\ &= \frac{1}{r}\sum_{k=0}^{r-1} \sum_{j=0}^{r-1} e^{-\frac{2\pi i kj}{r}}\ket{a^k} \\ &= \frac{1}{r}\left[\sum_{j=0}^{r-1}\ket{1} +\sum_{k=1}^{r-1} \sum_{j=0}^{r-1} e^{-\frac{2\pi i kj}{r}}\ket{a^k} \right] \\ &= \ket{1} + \frac{1}{r}\sum_{k=1}^{r-1}\left(\sum_{j=0}^{r-1} e^{-\frac{2\pi i kj}{r}}\right) \ket{a^k} \\ &= \ket{1} \end{align*}\]Where we have used the geometric series formula $\sum_{k=0}^n aR^k = a\left(\frac{1 - R^n}{1 - R}\right)$ with $a = 1$, $R = e^{-\frac{2\pi i k}{r}}$ and $n = r$ to see that the sum in parentheses in the second term goes to 0.
So we can instead apply phase estimation on the input vector $\ket{0}^{\otimes t} \ket{1} = \frac{1}{\sqrt{r}} \sum_{j=0}^{r-1}\ket{0}^{\otimes t}\ket{u_j}$. This means that instead of returning just one of the eigenvalues of the operator $U$, phase estimation will return an equal superposition of all of them, i.e. it will return $\frac{1}{\sqrt{r}}\sum_{j=0}^{r-1} \ket{2^t \tilde{\varphi}_j}\ket{u_j}$ where $\tilde{\varphi}_j$ is an $n$ bit approximation to $\frac{j}{r}$. Thus by measuring the first register of $t$ qubits, we are equally likely to get any of the results $\frac{1}{r}, \ \frac{2}{r}, \dots \frac{r-1}{r}$ upto an approximation of $2L+1$ bits.
Suppose after measuring the output of the phase estimation algorithm we obtain a result $\tilde{\phi} \approx \frac{j}{r}$. We know this value is approximate to $n = 2L+1$ bits only but we also know that it is a rational number. If we could find the closest fraction to $\tilde{\phi}$ we can obtain $r$.
There is an efficient algorithm to do exactly this, called the continued fractions algorithm.
The continued fractions algorithm is a way to describe real numbers in terms of integers alone. To be more precise, we represent the number in a sequence of integers of the form
\[\left[a_0; a_1, \dots a_n\right] = a_0 + \frac{1}{a_1 + \frac{1}{\dots + \frac{1}{a_n}}}\]For rational numbers this is pretty straightforward but for irrational numbers it gets a little trickier and actually gives an infinite continued fraction. To see this, let’s compute the continued fraction for both $\frac{99}{70} \approx \sqrt{2}$ with the former being rational and latter being irrational:
\[\begin{align*} \frac{99}{70} &= 1 + \frac{29}{70} \\ &= 1 + \frac{1}{2 + \frac{12}{29}} \\ &= 1 + \frac{1}{2 + \frac{1}{2 + \frac{5}{12}}} \\ &= 1 + \frac{1}{2 + \frac{1}{2 + \frac{1}{2 + \frac{2}{5}}}} \\ &= 1 + \frac{1}{2 + \frac{1}{2 + \frac{1}{2 + \frac{1}{2 + \frac{1}{2}}}}} \\ &= [1; 2, 2, 2, 2, 2] \end{align*}\]As expected, this fraction terminates. For $\sqrt{2}$ however:
\[\begin{align} \sqrt{2} &= 1 + \sqrt{2} - 1 \\ &= 1 + \frac{1}{1 + \sqrt{2}} \\ &= 1 + \frac{1}{2 + \frac{1}{1+\sqrt{2}}} \\ &= 1 + \frac{1}{2 + \frac{1}{2 + \frac{1}{1+\sqrt{2}}}} \\ &= 1 + \frac{1}{2 + \frac{1}{2 + \frac{1}{2 + \dots}}} \\ &= [1; 2, 2, 2, 2, \dots] \end{align}\]So we see that while $\frac{99}{70}$ terminates $\sqrt{2}$ never terminates since it’s an irrational number.
One other definition that is important for us is that of a \textit{convergent}. For a continued fraction $\left[a_0; a_1,\dots, a_N\right]$ define the $k^{\mathrm{th}}$ convergent to be $\left[a_0; a_1,\dots, a_k\right]$. For example the first convergent is given by $\left[a_0; a_1\right] = a_0 + \frac{1}{a_1} = \frac{a_0 a_1 + 1}{a_1}$
How do continued fractions help us in recovering $r$ from our phase estimate $\tilde{\phi}$? The key lies in the following theorem:
Let $\frac{s}{r}$ be a rational number such that
\[\left|\frac{s}{r} - \tilde{\varphi}\right| \leq \frac{1}{2r^2}\]Then $\frac{s}{r}$ is a convergent of the continued fraction expansion of $\tilde{\varphi}$.
A proof of this theorem can be found in Nielsen and Chuang. The reason why this theorem applies in our case is because we chose the number of qubits to be such that $n = 2L + 1$, which means that our phase estimates are accurate to $2L+1$ bits, i.e.
\[\begin{align*} \left|\frac{s}{r} - \tilde{\varphi} \right| \leq \frac{1}{2^{2L+1}} &= \frac{1}{2\cdot (2^L)^2}\\ &\leq \frac{1}{2r^2} \end{align*}\]Which is true since $r \leq N \leq 2^L$.
Given this, the technique to find $r$ is straightforward: we compute the convergents of $\tilde{\varphi}$ which will give us some fraction $\frac{s’}{r’}$. By testing the different values of $r’$, i.e. by calculating $x^{r’}\mod N$ and seeing if the result is 1. If yes, then we have found the period. We can now go back to step 4 of the main algorithm and proceed normally.
Let’s factorise 15 using the algorithm described above. Let’s make our guess be $a = 7$. We see that $\gcd(7,15) = 1$ so we proceed to the next step where we have to compute the order of $7 \mod 15$.
The first thing we need to do is initialise our registers. Recall that phase estimation requires two registers - one with a length of $t = n + \ceil{\log\left(2 + \frac{1}{2\epsilon}\right)}$ and another one to represent all the possible eigenvectors. Since our subspace of interest goes up to $\ket{15}$ we must have the second register be at least $L = 4$ qubits. For the size of the first register, suppose we want a probability of being incorrect being at most $\frac{1}{4}$, we should choose $t = 2(4)+1 + \ceil{\log\left(2 + \frac{1}{2\frac{3}{4}}\right)} = 11$ qubits. However for this simple example, we can get away with using $t = 3$ qubits.
First initialise the qubit state
\[\ket{0}^{\otimes 3} \ket{1} = \ket{000}\ket{0001}\]Now we apply the phase estimation algorithm as outlined above. First we apply quantum (Hademard) gates to change the state to
\[\frac{1}{\sqrt{2^3}}\sum_{k=0}^{2^3-1}\ket{k}\ket{1} = \frac{1}{\sqrt{8}}\left[\ket{000} + \ket{001} + \dots + \ket{111} \right]\ket{0001}\]Apply modular exponentiation to get $\frac{1}{\sqrt{8}}\sum_{k=0}^{2^3-1}\ket{k}\ket{7^k \mod 15}$. From now on we will stop writing the full qubit representation, i.e. instead of $\ket{011}\ket{1011}$ we will write $\ket{3}\ket{13}$. After applying modular exponentiation the state becomes
\[\begin{align*} &\frac{1}{\sqrt{8}}\left[\ket{0}\ket{1} + \ket{1}\ket{7} + \ket{2}\ket{4} + \ket{3}\ket{13} + \ket{4}\ket{1} + \ket{5}\ket{7} + \ket{6}\ket{4} + \ket{7}\ket{13}\right] \\ = &\frac{1}{\sqrt{8}}\left[\left(\ket{0} + \ket{4}\right)\ket{1} + \left(\ket{1} + \ket{5}\right)\ket{7} + \left(\ket{2} + \ket{6}\right)\ket{4} + \left(\ket{3} + \ket{7}\right)\ket{13} \right] \end{align*}\]Now we take the inverse Quantum Fourier Transform on the first register to obtain
\[\begin{align*} \frac{1}{4}\bigg[ & \left(\ket{0} + \ket{2} + \ket{4} + \ket{6}\right)\ket{1} + \\ & \left(\ket{0} - i \ket{2} - \ket{4} + i \ket{6}\right)\ket{7} + \\ & \left(\ket{0} - \ket{2} + \ket{4} - \ket{6}\right)\ket{4} + \\ & \left(\ket{0} + i \ket{2} - \ket{4} - i \ket{6}\right)\ket{13} \bigg] \end{align*}\]Rearranging to group terms by the elements of the first register gives
\[\begin{align*} \frac{1}{4}\bigg[ & \ket{0}\left(\ket{1} + \ket{7} + \ket{4} + \ket{13}\right) + \\ & \ket{2}\left(\ket{1} - i \ket{7} - \ket{4} + i \ket{13}\right) + \\ & \ket{4}\left(\ket{1} - \ket{7} + \ket{4} - \ket{13}\right) + \\ & \ket{6}\left(\ket{1} + i \ket{7} - \ket{4} - i \ket{13}\right) \bigg] \end{align*}\]Now if we measure the first register, we are equally likely to measure either $0$, $2$, $4$, or $6$. Suppose we measure $6$, which means that $6 = 2^t \frac{j}{r} = 8 \frac{j}{r}$. If we compute the continued fraction of $\frac{j}{r} = \frac{6}{8}$, we obtain $\frac{6}{8} = \frac{1}{1 + \frac{1}{3}} = \frac{3}{4}$. Our guess is therefore that $r = 4$, which indeed works as $7^4 = 1 \mod 15$.
Now we have the period of $7$ we can continue with the main algorithm. We now compute $\gcd(7^2 + 1, 15) = 5$. This is not one so we have a divisor of $15$, and the other divisor is $\frac{15}{5} = 3$. Done!