We present an oversimplified story about the Ito formula.

Consider the simple random walk:

\[X_1,X_2,\cdots \text{ iid } \mathbf{P}(X_i=\pm 1) = \tfrac{1}{2};\qquad S_0=0,\quad S_n = \sum_{k=1}^n X_k.\]

Brownian motion can be viewed as the “blow-down” or zoom-out of $S_n$. That is, rescale $S_n$ as $W_t^{(n)} := \tfrac{1}{\sqrt{n}} S_{[nt]}$, where $[nt]$ denotes the integer part of $nt$, so that $\mathbf{E} W_1^{(n)} = 0$ and $\mathbf{V} W_1^{(n)} = 1.$ The limit in distribution as $n\to\infty$ (Donsker’s theorem) is called a Brownian motion.
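
As a quick check of the scaling (a worked computation using only the independence of the $X_k$ and $\mathbf{V}X_k=1$), for any fixed $t\ge 0$,

\[\mathbf{E} W_t^{(n)} = 0,\qquad \mathbf{V} W_t^{(n)} = \tfrac{1}{n}\sum_{k=1}^{[nt]}\mathbf{V}X_k = \tfrac{[nt]}{n}\to t \quad (n\to\infty),\]

which matches the variance $t$ that a Brownian motion is required to have at time $t$.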

In view of properties of $W_t^{(n)},$ a continuous-time process $W_t$ is called a Brownian motion, if $W_0=0$; $t\mapsto W_t(\omega)$ is continuous ($C^{\frac{1}{2}-}$) for any $\omega;$ for any partition $0=t_0<t_1<\cdots <t_n,$

\[W_{t_1}, \quad W_{t_2}-W_{t_1},\quad \cdots, \quad W_{t_n}-W_{t_{n-1}}\]

are independent; and for any $0\le s<t$, $W_t-W_s\sim \mathcal{N}(0,t-s).$

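Given a BM $W_t$ and a function $f$ that is not too bad, the Ito integral, like the Riemann integral, is a limit of sums over partitions, with $f$ evaluated at the left endpoint of each subinterval: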

\[\int_0^T f(W_t)\,dW_t = \lim_{|\pi|\to 0} \sum_i f(W_{t_{i-1}})(W_{t_{i}} - W_{t_{i-1}}),\]

where \(\pi=\lbrace0=t_0<\cdots<t_N=T\rbrace,|\pi|=\max |t_{i}-t_{i-1}|.\)

The process $t\mapsto \int_0^t f(W_s)\,dW_s$ can be represented by a continuous martingale $X_t$, meaning $X_t\in L^1$ and $\mathbf{E}[X_t\mid \mathcal{F}_s]=X_s$ for any $s<t$, where $\mathcal{F}_t$ is the “filtration” that contains all the information about $W_s$ for $s\le t$.

We write

\[dX_t = \Xi_t\,dW_t + \Theta_t\,dt,\]

if

\[X_t = X_0 + \int_0^t \Xi_s\,dW_s + \int_0^t \Theta_s\,ds.\]

Technically, we require that $\Xi_t,\Theta_t$ be adapted, i.e. $\mathcal{F}_t$-measurable for each $t$. Such an $X_t$ is called an Ito process.

Ito formula (version 1): for any smooth function $f_t(x)=f(x,t)$ with $\mathbf{E}\int_0^T \left(\partial_x f_t(W_t)\right)^2 \,dt < \infty,$

\[df_t(W_t)=\left(\partial_t f_t+\tfrac{1}{2}\partial_x^2f_t\right) dt + \partial_x f_t \,dW_t,\]

where $f_t$ is evaluated at $W_t$.

It is handy to define some formal products

\[dW_t\cdot dW_t = dt,\quad dW_t\cdot dt = dt\cdot dt=0.\]

Then Ito really says \(df_t(W_t)=\partial_t f_t\,dt+\tfrac{1}{2}\partial_x^2f_t \,dW_t\cdot dW_t + \partial_x f_t \,dW_t.\)
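
As a worked instance of version 1 (two standard choices of $f$, picked here only for illustration), take $f_t(x)=x^2$, so that $\partial_t f_t=0$, $\partial_x f_t=2x$ and $\partial_x^2 f_t=2$:

\[d(W_t^2) = dt + 2W_t\,dW_t,\qquad\text{i.e.}\qquad \int_0^t W_s\,dW_s = \tfrac{1}{2}W_t^2-\tfrac{1}{2}t.\]

The extra $-\tfrac{1}{2}t$ is exactly what the ordinary chain rule would miss. Similarly, for $f_t(x)=e^{x-t/2}$ we have $\partial_t f_t=-\tfrac{1}{2}f_t$ and $\partial_x f_t=\partial_x^2 f_t=f_t$, so the $dt$ terms cancel:

\[d\,e^{W_t-t/2} = e^{W_t-t/2}\,dW_t.\]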

Ito formula (version 2): if $dX_t=\Xi_t\,dW_t+\Theta_t\,dt$,

\[df_t(X_t)=\partial_t f_t\,dt+\tfrac{1}{2}\partial_x^2f_t \,dX_t\cdot dX_t + \partial_x f_t \,dX_t.\]
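
To unpack version 2 (a direct expansion using the formal products above), note that $dX_t\cdot dX_t = \Xi_t^2\,dW_t\cdot dW_t + 2\Xi_t\Theta_t\,dW_t\cdot dt + \Theta_t^2\,dt\cdot dt = \Xi_t^2\,dt$, so

\[df_t(X_t)=\left(\partial_t f_t+\Theta_t\,\partial_x f_t+\tfrac{1}{2}\Xi_t^2\,\partial_x^2f_t\right)dt+\Xi_t\,\partial_x f_t\,dW_t.\]

For example, suppose (as an illustrative assumption, with constants $\mu,\sigma$ that do not appear elsewhere in this note) that $X_t>0$ satisfies $dX_t=\sigma X_t\,dW_t+\mu X_t\,dt$. With $f(x)=\log x$, $\partial_x f=1/x$ and $\partial_x^2 f=-1/x^2$, this gives

\[d\log X_t = \left(\mu-\tfrac{1}{2}\sigma^2\right)dt+\sigma\,dW_t.\]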

Ito formula (multi-variable): if $X_t=[X_t^i]_{i=1}^n$ is a vector of Ito processes, then

\[df_t(X_t)= \partial_t f \,dt + \nabla f\cdot dX_t + \tfrac{1}{2} \langle dX_t, \nabla^2 f \cdot\,dX_t\rangle,\]

where we adopt the formal product as above.
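
A useful special case is the product rule, sketched here under the assumption that both components are driven by the same Brownian motion, $dX_t^i=\Xi_t^i\,dW_t+\Theta_t^i\,dt$ for $i=1,2$ (superscripts are indices, not powers). Taking $f(x)=x^1x^2$, we have $\partial_t f=0$, $\nabla f=(x^2,x^1)$, and $\nabla^2 f$ has zeros on the diagonal and ones off it, hence

\[d(X_t^1X_t^2)=X_t^2\,dX_t^1+X_t^1\,dX_t^2+dX_t^1\cdot dX_t^2 = X_t^2\,dX_t^1+X_t^1\,dX_t^2+\Xi_t^1\Xi_t^2\,dt.\]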

We sketch the proof of the first version, which shows why there is a second-order term. It’s more than just “ignoring higher-order terms”: the identity $dW_t\cdot dW_t=dt$ has an actual meaning.

Lemma \(\lim_{|\pi|\to 0} \sum_i (W_{t_i}-W_{t_{i-1}})^2 = t\) in $L^2$, where $\pi$ ranges over partitions of $[0,t]$.

Recall that for $Z\sim\mathcal{N}(0,t)$, $\mathbf{E}[|Z|^k]\le C_k t^{\frac{k}{2}}, \mathbf{E}[Z^4]=3t^2.$

Given a partition \(\pi=\lbrace 0=t_0<\cdots<t_N=t\rbrace\), write \(\delta_i=t_i-t_{i-1}\) and \(\Delta_i=W_{t_i}-W_{t_{i-1}}\sim\mathcal{N}(0,\delta_i).\) Since the $\Delta_i^2-\delta_i$ are independent with mean zero, the cross terms vanish, and as $|\pi|\to 0$,

\[\begin{align*} &\mathbf{E}\left\lbrace \sum_i \left(\Delta_i^2 - \delta_i\right) \right\rbrace^2 = \sum_i \left(\mathbf{E} [\Delta_i^4] + \delta_i^2 - 2\delta_i\mathbf{E}[\Delta_i^2]\right)\\ &= \sum_i \left(3\delta_i^2+\delta_i^2-2\delta_i^2\right) = 2\sum_i \delta_i^2 \le 2|\pi|\sum_i\delta_i = 2|\pi|t\to 0. \end{align*}\]

This Lemma really means $dW_t\cdot dW_t=dt.$
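
For a concrete feel (a worked special case, assuming the uniform partition $t_i=it/N$), every $\delta_i$ equals $t/N$ and the computation above gives

\[\mathbf{E}\left\lbrace \sum_{i=1}^N \Delta_i^2 - t \right\rbrace^2 = 2\sum_{i=1}^N \left(\tfrac{t}{N}\right)^2 = \tfrac{2t^2}{N},\]

so the sum of squared increments is within about $t\sqrt{2/N}$ of $t$ in $L^2$, whereas the same sum along a smooth path would tend to $0$.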

Now we prove version 1 with the same notation. By Taylor expansion,

\[\begin{align*} &f(W_t,t)-f(0,0) = \sum_i \Big\lbrace \left[f(W_{t_i},t_i) -f(W_{t_{i-1}},t_{i})\right]+\left[f(W_{t_{i-1}},t_{i})- f(W_{t_{i-1}},t_{i-1})\right] \Big\rbrace\\ &=\sum_i \Big\lbrace \partial_x f(W_{t_{i-1}})\Delta_i + \partial_tf(W_{t_{i-1}})\delta_i + \tfrac{1}{2}\partial_x^2f(W_{t_{i-1}})\Delta_i^2 + R_i \Big\rbrace, \end{align*}\]

where $R_i$ involves $f'''$. As $|\pi|\to 0$, the first two sums converge to $\int_0^t \partial_x f_s(W_s)\,dW_s$ and $\int_0^t \partial_t f_s(W_s)\,ds$ respectively, so we only need to worry about the third term. By the Lemma above,

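\[\begin{align*} \mathbf{E}\left\lbrace\sum_i \partial_x^2f(W_{t_{i-1}}) \left(\Delta_i^2-\delta_i\right) \right\rbrace^2 \le |f|_{C^2}^2\,\mathbf{E}\left\lbrace\sum_i \left(\Delta_i^2-\delta_i\right) \right\rbrace^2 \to 0. \end{align*}\]

Here the cross terms vanish because $\Delta_i^2-\delta_i$ has mean zero and is independent of $W_{t_j}$ for $j<i$. Hence $\sum_i \partial_x^2 f(W_{t_{i-1}})\Delta_i^2$ has the same limit as $\sum_i \partial_x^2 f(W_{t_{i-1}})\delta_i$, namely $\int_0^t \partial_x^2 f_s(W_s)\,ds$.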

Finally,

\[\begin{align*} \mathbf{E} \left|\sum_i R_i\right| \le C|f|_{C^3}\sum_i \mathbf{E}|\Delta_i|^3 \le C \sum_i \delta_i^{3/2} \le C\sqrt{|\pi|}\cdot t\to 0. \end{align*}\]