
The Ising model offers a straightforward way to understand phase transitions. Physicists have long noticed that if you heat a magnet to a certain temperature (the Curie temperature, $T_c$), it suddenly loses its magnetic properties.
Summary
The model itself consists of a grid of spins, each either up or down, that interact with their neighbors. Neighboring spins tend to align because alignment is energetically favorable, but they may flip due to random thermal fluctuations. Varying the temperature is the only knob we turn.
At low temperatures, the fluctuations are small, so spins tend to order by aligning with their neighbors. Macroscale alignment of spin values produces a net magnetic dipole moment: the system as a whole is magnetized. As we increase the temperature past the Curie temperature, previously stable, ordered behavior suddenly becomes chaotic. Here spins flip randomly, resulting in the macroscale change: net magnetization is now zero. The system has undergone a phase transition.
However, it is possible to capture the system in the intermediate state at the Curie temperature (hereafter referred to as the critical point), where order and disorder are matched. At this point a number of very curious properties emerge.
Model
To construct an Ising model, consider a set $\Lambda$ of lattice sites. At each discrete coordinate site $(x,y) \in \Lambda$ there is a discrete variable $\sigma_{(x,y)}$ that takes one of two orientations, $\sigma \in \{\uparrow,\downarrow\}$ (numerically $+1$ and $-1$), referred to as *spin*. We call an assignment of a spin value to each site on the lattice a *spin configuration*, $S = \{\sigma_j\}_{j \in \Lambda}$, for example:
$$S = \begin{matrix} \downarrow & \downarrow & \uparrow & \uparrow & \downarrow \\ \downarrow & \uparrow & \uparrow & \downarrow & \downarrow \\ \uparrow & \downarrow & \uparrow & \downarrow & \uparrow \\ \downarrow & \uparrow & \uparrow & \uparrow & \uparrow \\ \uparrow & \downarrow & \downarrow & \uparrow & \downarrow \\ \end{matrix}$$
In our case each site $j = (x,y) \in \Lambda$ has four neighbors. For $i$ adjacent to $j$ we write $\langle i j \rangle$, denoting an "edge" between the two nodes.
For any two adjacent sites there is a *coupling constant* $J_{ij}$, which determines the strength of the interaction between the spins at adjacent sites $i$ and $j$.
Every site $j \in \Lambda$ is subject to an external magnetic field $h_j$, which couples to the spin through the magnetic moment $\mu$.
The energy of a given spin configuration $S$ is given by the Hamiltonian function:
$$\mathcal{H}(S) = -\sum_{\langle ij \rangle} J_{ij} \sigma_i \sigma_j - \mu \sum_{j} h_j\sigma_j$$
For simplicity, consider the case where the coupling is uniform, $J_{ij} = 1$ for all adjacent pairs $i,j$, and there is no external magnetic field, that is, $h_j = 0$ everywhere.
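As a rough illustration of this simplified case, the sketch below evaluates the energy of a small square lattice. It assumes spins are stored as $+1$/$-1$, that aligned neighbors lower the energy (the usual sign convention), and that the lattice wraps around at its edges (periodic boundaries); the function name `ising_energy` and the boundary choice are assumptions made for the example, not part of the text.

```python
def ising_energy(spins, J=1.0):
    """Energy of a square-lattice configuration with uniform coupling J and h = 0.

    `spins` is a list of rows holding +1 (up) or -1 (down).
    Periodic boundaries are an assumption of this sketch.
    """
    rows, cols = len(spins), len(spins[0])
    energy = 0.0
    for x in range(rows):
        for y in range(cols):
            # Count each edge once: pair every site with its right and lower neighbor.
            right = spins[x][(y + 1) % cols]
            down = spins[(x + 1) % rows][y]
            energy -= J * spins[x][y] * (right + down)
    return energy


all_up = [[1] * 4 for _ in range(4)]
checkerboard = [[1 if (x + y) % 2 == 0 else -1 for y in range(4)] for x in range(4)]
print(ising_energy(all_up))        # fully aligned lattice
print(ising_energy(checkerboard))  # every neighbor pair anti-aligned
```

The fully aligned lattice gives the lowest energy and the checkerboard the highest, which matches the statement that alignment is energetically favorable.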
Spins at any two adjacent sites $i,j$ tend to line up, since the pair energy $E_{ij} = -J_{ij} \sigma_i \sigma_j$ is lowest when the two spins agree (for $J_{ij} > 0$). However, spin flipping is a stochastic process subject to random thermal fluctuations: the higher the temperature $T$, the more disordered the system will be. We can measure how ordered the configuration $S$ of $\Lambda$ is by measuring its mean magnetization:
$$\langle M \rangle = \frac{1}{N}\sum^{N}_{i=1} \sigma_i$$
where $N$ is the number of sites on the lattice $\Lambda$.
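As a quick check of this formula, the following sketch computes the mean magnetization of the $5 \times 5$ example configuration shown earlier, assuming $\uparrow = +1$ and $\downarrow = -1$. The helper name `mean_magnetization` is chosen for illustration.

```python
def mean_magnetization(spins):
    """Average spin value over all lattice sites."""
    n_sites = sum(len(row) for row in spins)
    return sum(sum(row) for row in spins) / n_sites


U, D = +1, -1  # up and down spins
example = [
    [D, D, U, U, D],
    [D, U, U, D, D],
    [U, D, U, D, U],
    [D, U, U, U, U],
    [U, D, D, U, D],
]
print(mean_magnetization(example))
```

The example has 13 up spins and 12 down spins, so its magnetization is close to zero: the configuration is essentially disordered.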
The Ising model has one main free parameter, the temperature $T$. It leads to the emergence of an additional, macroscopically observed quantity referred to as an *order parameter*: the magnetization $\langle M \rangle$, the mean value of all the individual spins on the lattice.
Consider a model at a temperature $T$. In statistical mechanics it is more common to work with the inverse temperature $\beta \propto 1/T$. The probability of a configuration $S$ at inverse temperature $\beta$ is then given by the Boltzmann distribution:
$$P_{\beta}(S) = \frac{e^{-\beta \mathcal{H}(S)}}{Z_\beta}$$
where the normalization constant $Z_\beta$ sums the Boltzmann weights over all possible configurations $S$ of $\Lambda$ at temperature $T \propto 1/\beta$:
$$Z_{\beta} = \sum_{S} e^{-\beta \mathcal{H}(S)}$$
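For a lattice small enough to enumerate, the distribution can be computed by brute force. The sketch below does this for a $2 \times 2$ lattice with free (non-wrapping) boundaries, $J = 1$ and $h = 0$, using weights $e^{-\beta \mathcal{H}(S)}$. The lattice size, boundary choice and function names are assumptions made for illustration only.

```python
import itertools
import math


def energy_2x2(s, J=1.0):
    """Energy of a 2x2 lattice laid out as (a b / c d), free boundaries, h = 0."""
    a, b, c, d = s
    return -J * (a * b + c * d + a * c + b * d)


def boltzmann_distribution(beta):
    """Return Z and the probability of each of the 16 configurations."""
    configs = list(itertools.product((+1, -1), repeat=4))
    weights = [math.exp(-beta * energy_2x2(s)) for s in configs]
    Z = sum(weights)
    return Z, {s: w / Z for s, w in zip(configs, weights)}


Z, probs = boltzmann_distribution(beta=0.5)
print(Z)
print(probs[(1, 1, 1, 1)])    # aligned configuration
print(probs[(1, -1, -1, 1)])  # fully anti-aligned configuration
```

At $\beta = 0$ (infinite temperature) every configuration is equally likely; as $\beta$ grows, the two aligned configurations take up more and more of the probability mass.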
We observe that for lower temperatures (higher values of $\beta$) the value of $\langle M \rangle$ is nonzero and remains relatively stable. However, as we increase the temperature past a certain point, the magnetization (and thus the ordering) plummets to zero: the system has undergone a phase transition.
The temperature at which $\langle M \rangle$ undergoes this abrupt change is the *critical point* of the system.
Numerical Recipe for the Ising Model
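The total configuration space $\mathcal{S}$ of all possible $S$ is very large ($2^N$ configurations for $N$ sites), so evaluating the *partition function* $Z_\beta$ directly is impractical. However, because interactions are pairwise, we can compare configurations without computing $Z_\beta$ at all. Consider two configurations $S_0$ and $S_1$ that differ only by the flip of a single spin $\sigma_j$. At equilibrium, the following detailed-balance condition must hold: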
$$P_\beta(S_0) P_\beta(S_0 \to S_1) = P_\beta(S_1) P_\beta(S_1 \to S_0)$$
where $P_\beta(S_0 \to S_1)$ is the probability of a transition from configuration $S_0$ to $S_1$. The energy difference is the difference between the respective Hamiltonians:
$$\Delta E = E_1 - E_0 = \mathcal{H}(S_1) - \mathcal{H}(S_0)$$
Therefore, since $Z_\beta$ cancels in the ratio:
$$\frac{P_\beta(S_0 \to S_1)}{P_\beta(S_1 \to S_0)} = \frac{P_\beta(S_1)}{P_\beta(S_0)} = e^{-\beta(E_1 - E_0)}$$
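If $E_1 \le E_0$, the flip does not raise the energy and the spin $\sigma_j$ flips. If $E_1 > E_0$, the flip is accepted only with probability $e^{-\beta(E_1 - E_0)}$; this is the Metropolis rule, one common choice that satisfies the ratio above. By repeating single-spin flips in this manner, the model attains equilibrium. If, for the sake of simplicity, we assume the external magnetic field $h_j$ is zero everywhere, the only thing that needs to be evaluated is the four neighbors of site $j$:
$$\beta(E_1 - E_0) = 2\beta \sum_{i :\langle ij \rangle} J_{ij} \sigma_i \sigma_j$$
where $\sigma_j$ denotes the value of the spin before the flip.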
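A minimal sketch of single-spin-flip updates is given below, using the Metropolis acceptance rule as an assumed choice: a flip that does not raise the energy is always accepted, and one that raises it by $\Delta E$ is accepted with probability $e^{-\beta \Delta E}$. It assumes $+1$/$-1$ spins, uniform $J$, $h = 0$ and periodic boundaries; the names `flip_cost` and `metropolis_sweep` are hypothetical.

```python
import math
import random


def flip_cost(spins, x, y, J=1.0):
    """Energy change E_1 - E_0 from flipping the spin at (x, y); only 4 neighbors matter."""
    rows, cols = len(spins), len(spins[0])
    neighbors = (
        spins[(x - 1) % rows][y] + spins[(x + 1) % rows][y]
        + spins[x][(y - 1) % cols] + spins[x][(y + 1) % cols]
    )
    return 2.0 * J * spins[x][y] * neighbors


def metropolis_sweep(spins, beta, rng, J=1.0):
    """Attempt one flip per lattice site on average, updating `spins` in place."""
    rows, cols = len(spins), len(spins[0])
    for _ in range(rows * cols):
        x, y = rng.randrange(rows), rng.randrange(cols)
        delta_e = flip_cost(spins, x, y, J)
        # Accept downhill moves always, uphill moves with probability exp(-beta * dE).
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[x][y] = -spins[x][y]


def magnetization(spins):
    return sum(sum(row) for row in spins) / (len(spins) * len(spins[0]))


rng = random.Random(0)
for beta in (1.0, 0.1):  # low temperature, then high temperature
    lattice = [[1] * 8 for _ in range(8)]
    for _ in range(200):
        metropolis_sweep(lattice, beta, rng)
    print(beta, magnetization(lattice))
```

Starting from an ordered lattice, the low-temperature run is expected to stay strongly magnetized while the high-temperature run drifts toward zero magnetization, although any single run on such a small lattice is noisy.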
Dynamic Correlations
Note that while only nearest-neighbor interactions are explicitly defined, the information about a flip in one spin can be transmitted across the lattice. If you flip a single spin, it may flip its neighbors, which in turn may flip their own neighbors, and so on. We can measure these long-range interactions between nodes $i,j$ with a dynamic correlation function $\Gamma$.
For the 2-dimensional Ising model, $\Gamma_{i,j}$ is simply the covariance of $\sigma_i$ and $\sigma_j$ sampled over some time interval $\Delta t$ at a temperature $T \propto 1/\beta$:
$$\Gamma_{i,j}(\beta) = \langle (\sigma_i - \langle \sigma_i \rangle)(\sigma_j - \langle \sigma_j \rangle)\rangle = \mathrm{cov}(\sigma_i, \sigma_j)$$
As expected, the dynamic correlation decreases with distance, but at the critical temperature $T_c$ the curve decays much more slowly than in the ordered $(T < T_c)$ or disordered $(T > T_c)$ phase.
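In practice the covariance would be estimated from recorded spin values. The sketch below assumes two equally long time series of $+1$/$-1$ samples for sites $i$ and $j$, collected at a fixed temperature; the function name `spin_covariance` and the toy series are illustrative, not simulation output.

```python
def spin_covariance(samples_i, samples_j):
    """Sample covariance of two spin time series of equal length."""
    n = len(samples_i)
    mean_i = sum(samples_i) / n
    mean_j = sum(samples_j) / n
    return sum((a - mean_i) * (b - mean_j) for a, b in zip(samples_i, samples_j)) / n


# Toy time series (hand-written, not measured data).
site_a = [1, -1, 1, -1]
site_b = [1, -1, 1, -1]   # always agrees with site_a
site_c = [1, 1, -1, -1]   # agrees with site_a half of the time
print(spin_covariance(site_a, site_b))
print(spin_covariance(site_a, site_c))
```

Two spins that always flip together give the maximum covariance, while spins that agree only half of the time give zero.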
Note that higher-order correlation functions can be defined, but such higher-order correlation functions are generally considered difficult to interpret or measure:
$$\Gamma_{i_1,i_2,\cdots,i_k} = \langle (\sigma_{i_1} - \langle \sigma_{i_1} \rangle)(\sigma_{i_2} - \langle \sigma_{i_2} \rangle) \cdots (\sigma_{i_k} - \langle \sigma_{i_k} \rangle) \rangle$$
Connection to Machine Learning
The Ising model informs the Hopfield model's dynamics as follows:
Hopfield-Ising Model Equivalence Theorem. The Hopfield model is the Ising model with a Hamiltonian $\mathcal{H} = -\sum_{i,j} J_{ij} \sigma_i \sigma_j$, where the couplings are the Hopfield weights, $J_{ij} = W_{ij}$, and the spins are the unit states, $\sigma_i = s_i$.
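To make the correspondence concrete, the sketch below evaluates this Hamiltonian for a four-unit network, taking the energy with a leading minus sign so that aligned, positively coupled units lower it. Two ingredients are assumptions not stated in the text: the weights are built from a single stored pattern by an outer product with zero diagonal, and a unit is updated by aligning it with the sign of its local field (the zero-temperature limit of a spin flip). All names are hypothetical.

```python
def hopfield_energy(weights, states):
    """H = -sum_{i,j} W_ij s_i s_j, summed over all ordered pairs (i, j)."""
    n = len(states)
    return -sum(weights[i][j] * states[i] * states[j]
                for i in range(n) for j in range(n))


def update_unit(weights, states, i):
    """Align unit i with its local field, in place (assumed update rule)."""
    field = sum(weights[i][j] * states[j] for j in range(len(states)))
    states[i] = 1 if field >= 0 else -1


pattern = [1, -1, 1, -1]
# Assumed weight construction: outer product of the pattern, no self-coupling.
weights = [[0 if i == j else pattern[i] * pattern[j] for j in range(4)]
           for i in range(4)]

noisy = [-1, -1, 1, -1]  # the pattern with its first unit flipped
print(hopfield_energy(weights, pattern), hopfield_energy(weights, noisy))
update_unit(weights, noisy, 0)
print(noisy == pattern)
```

The stored pattern sits at a lower energy than its corrupted copy, and a single update of the flipped unit moves the state back downhill, just as a spin flip that lowers the energy is accepted in the Ising model.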
See also
