How it works:
There are 3 main parts to producing the noise: initialization, generation, and interpolation.

In the initialization phase, a table of pseudo-random numbers of a specified length is generated.
The table is filled with numbers from 0 to 255, each of which is used to pick a vector whose x and y components
are 0.5 or -0.5. The vectors form a 16 by 16 grid, and this grid is what is used to generate the final noise.
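
As a rough sketch of this step (not the code this page actually uses), the table could be built like this in Python. The names build_gradient_table and to_vector, the seed parameter, and the rule that turns a number into a vector (its two lowest bits pick the signs) are all assumptions made up for the example.

```python
import random

def build_gradient_table(size=16, seed=0):
    """Return a size x size grid of vectors whose components are +/-0.5."""
    rng = random.Random(seed)
    # The numbers 0..255 (for size=16) in pseudo-random order.
    values = list(range(size * size))
    rng.shuffle(values)

    def to_vector(n):
        # Assumed mapping: the two lowest bits of n choose the signs.
        x = 0.5 if n & 1 else -0.5
        y = 0.5 if n & 2 else -0.5
        return (x, y)

    return [[to_vector(values[row * size + col]) for col in range(size)]
            for row in range(size)]

table = build_gradient_table()
```

Using a fixed seed means the same table (and so the same noise) comes out every time, which is handy when debugging.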

In the generation phase, an array with one entry per pixel is looped over and a value is generated for each pixel.
First, the pixel's fractional position within its grid cell is calculated (between 0 and 1 for both x and y), and a dot
product is calculated for each of the cell's 4 corners at that position. Since there are 4 dot products, they need to be interpolated.
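
Here is a minimal Python sketch of that calculation, assuming the position is given in grid units (so 3.25 means a quarter of the way through cell 3) and that the grid wraps around at its edges. The function name corner_dots and the small 2 by 2 example grid are hypothetical.

```python
import math

def corner_dots(px, py, gradients):
    """Return the fractional position in the cell and the 4 corner dot products."""
    size = len(gradients)
    x0, y0 = math.floor(px), math.floor(py)
    fx, fy = px - x0, py - y0  # position inside the cell, each in [0, 1)
    dots = {}
    for dy in (0, 1):
        for dx in (0, 1):
            # Vector stored at this corner (wrapping at the edges is an assumption).
            gx, gy = gradients[(y0 + dy) % size][(x0 + dx) % size]
            # Offset from the corner to the point, dotted with the corner's vector.
            dots[(dx, dy)] = gx * (fx - dx) + gy * (fy - dy)
    return (fx, fy), dots

grid = [[(0.5, 0.5), (-0.5, 0.5)],
        [(0.5, -0.5), (-0.5, -0.5)]]
frac, dots = corner_dots(0.25, 0.75, grid)
```

The dictionary is keyed by (dx, dy), so dots[(0, 0)] is the top-left corner and dots[(1, 1)] is the bottom-right one.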

In the interpolation phase, the 4 dot products are combined into 2 values along x, and then those are combined into 1 final
value through the y interpolation. Since plain linear interpolation would introduce visible artifacts at the grid lines, the interpolation percentage
is first run through a fade function (6t^5 - 15t^4 + 10t^3) to reduce the impact of the far away vectors on the final value.
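
The interpolation described above can be sketched in Python as follows. The fade polynomial is the one given in the text; the function names and the argument order (d00 is the top-left dot product, d11 the bottom-right) are assumptions for the example.

```python
def fade(t):
    """6t^5 - 15t^4 + 10t^3, written in nested form."""
    return t * t * t * (t * (t * 6 - 15) + 10)

def lerp(a, b, t):
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + t * (b - a)

def interpolate(d00, d10, d01, d11, fx, fy):
    """Blend the 4 corner dot products into one value."""
    u, v = fade(fx), fade(fy)
    top = lerp(d00, d10, u)       # first x interpolation
    bottom = lerp(d01, d11, u)    # second x interpolation
    return lerp(top, bottom, v)   # final y interpolation
```

Because fade(0) is 0 and fade(1) is 1, the result at a corner is exactly that corner's dot product, and the curve flattens out near both ends so neighboring cells join smoothly.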

For the drawing part, the value received is used as the opacity of a white pixel drawn over a colored background.
The whiter the square, the higher the value.
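
To show what using the value as opacity means, here is a small Python sketch that blends white over a background color. It assumes the value has already been scaled to the range 0 to 1 (how this page does that scaling is not described here), and the name blend_white is made up for the example.

```python
def blend_white(background, opacity):
    """Blend white over an (r, g, b) background using the given opacity."""
    a = min(1.0, max(0.0, opacity))  # clamp to [0, 1] as a precaution
    # Each channel moves from the background toward 255 as opacity grows.
    return tuple(round(c + (255 - c) * a) for c in background)
```

An opacity of 0 leaves the background color untouched, and an opacity of 1 gives pure white.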

If you want a more detailed explanation, you can look at this page or this video,
both of which helped me understand how to make Perlin noise, or look at the source code for this page
(by right-clicking on the background and then left-clicking on "view page source").