Neural Networks
Intro
The image above might seem complicated, but it’s fairly straightforward.
- Think of a neural network as a chain of workers (neurons, each indexed with j) grouped into layers. So if a layer has 25 neurons, we will have 25 formulas, one for each neuron; or, thinking in terms of the parameters w & b, we will have 25 formulas with 25 different sets of w_j & b_j, one for each unit
- Each layer accepts input either from the user/data or from another layer
- Each layer produces an output (an activation value, denoted a^[l]) and passes it on to the next layer or to the final output
- Each layer is labeled with l, as you see above, and the parameters of layer l are written w^[l] and b^[l]; that layer takes the previous layer’s activations, a^[l-1], as its input
- Each unit within a layer is labeled with a subscript j, as you see above; so if a layer has 3 units, its outputs are a_1 through a_3, with parameters w_1 & b_1 through w_3 & b_3
- So whenever you see the subscript j, it refers to a specific unit (neuron) within a layer
- g, as you remember, is the sigmoid function; in the context of neural networks it is called the activation function, and its output is the activation a
- The image above shows the notation for layer 3
- So the formula works for every layer; by defining the input x as a^[0], it applies even to layer 1
- In the future we will see how to use other functions besides the sigmoid function
- So now we know how to calculate the activations of any layer given the output of the previous layer (the formula is written out after this list)
- The function implemented by a neuron with no activation is the linear regression model we covered earlier
- The function implemented by a neuron/unit with a sigmoid activation is the same as the logistic regression model from the earlier section
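For reference, here is the general formula these bullets describe, reconstructed from the notation above (the original images are not reproduced here):

$$
a_j^{[l]} = g\!\left(\vec{w}_j^{[l]} \cdot \vec{a}^{[l-1]} + b_j^{[l]}\right),
\qquad
g(z) = \frac{1}{1 + e^{-z}}
$$

With g(z) = z (no activation) each unit reduces to the linear regression model, and with the sigmoid g(z) it computes exactly the logistic regression model, matching the last two bullets above.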
Make Predictions
Let’s design something that will recognize handwritten digits
Forward Propagation
Because the computation propagates the neurons’ activations from left to right, it is called forward propagation. Backward propagation, which moves in the opposite direction, is what is used for learning.
- Let’s say we use an 8x8 pixel image and we want to know whether it shows a 0 or a 1
- We have 3 layers, with the first having 25 neurons (units), the second 15, and the third 1 unit
- The input x can also be labeled a^[0]
- So for the first layer, a^[1] can be written as a^[1]_j = g(w^[1]_j · a^[0] + b^[1]_j), one formula for each of the 25 units
- For layer 2 we have a^[2]_j = g(w^[2]_j · a^[1] + b^[2]_j) for each of the 15 units
- For layer 3, a^[3] = g(w^[3]_1 · a^[2] + b^[3]_1), a single value; note that this final probability can be labeled f(x) as well (see the sketch after this list)
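Here is a minimal NumPy sketch of this forward pass. The layer sizes (64 → 25 → 15 → 1) come from the bullets above; the random weights are placeholders, since a real network would use learned parameters:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def dense(a_in, W, b):
    # One dense layer: column j of W holds the weights w_j of unit j.
    return sigmoid(a_in @ W + b)

# Layer sizes from the example: 64 input pixels (8x8) -> 25 -> 15 -> 1.
# Placeholder parameters; a trained network would use learned values.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((64, 25)), np.zeros(25)
W2, b2 = rng.standard_normal((25, 15)), np.zeros(15)
W3, b3 = rng.standard_normal((15, 1)), np.zeros(1)

x = rng.random(64)       # a[0]: the flattened 8x8 image
a1 = dense(x, W1, b1)    # a[1]: 25 activations
a2 = dense(a1, W2, b2)   # a[2]: 15 activations
a3 = dense(a2, W3, b3)   # a[3] = f(x): probability the digit is a 1
print("prediction:", 1 if a3[0] >= 0.5 else 0)
```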
TensorFlow
Coffee Roasting
TensorFlow is a framework, like PyTorch, for building and deploying ML models. Let’s go through a simple example to see how the same algorithm can be used in countless situations.
Let’s say that we want to predict, or run inference on, which temperature and duration will cause our coffee beans to be perfectly roasted.
- As you see in the image above, the circles indicate beans that are not well roasted
- Some temperatures cause them to be undercooked or overcooked
- Some durations cause the same problems
- As you can see, the input consists of two values: temperature & duration
- It is fed as x into the first layer of neurons, whose activations feed a second layer, which produces a final a value
- That final a value is tested to see if it is greater than or equal to 0.5, which tells us whether the beans are perfectly roasted
Let’s break it down:
- The input x will be an array of two values
- Layer 1, which has 3 neurons/units, will use the sigmoid function to activate (produce an output). Dense is another name used in neural networks for this kind of layer
- The output of layer 1 is a1, produced by activating on the input x. It will have 3 values, one from each of the 3 units/neurons in layer 1
- Then we have layer 2, which has 1 unit/neuron and uses the same sigmoid function
- It takes a1 as input and outputs a2
- Finally, you test the value of a2 against the 0.5 threshold to see if the prediction is a 1 or a 0
- The entire model, put together, is sketched below
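Here is a minimal sketch of that model in TensorFlow, assuming the Keras Sequential API; the weights are untrained here, so the prediction is only illustrative:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

# Two dense layers as described above: 3 sigmoid units, then 1.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),              # x: [temperature, duration]
    Dense(3, activation="sigmoid", name="layer1"),
    Dense(1, activation="sigmoid", name="layer2"),
])

# Hypothetical temperature/duration pair (untrained weights, so the
# output is illustrative only).
x = np.array([[200.0, 17.0]])
a2 = model.predict(x)                        # final activation value
yhat = int(a2[0, 0] >= 0.5)                  # 1 = perfectly roasted
print(a2, yhat)
```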
Digit Classification
So the digit-recognition example explained in the introduction above will look like this:
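A minimal sketch of the digit model in TensorFlow, assuming the 8x8 input (64 pixels) and the 25/15/1 layer sizes from the forward propagation section above:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

# 64 input pixels (8x8 image) -> 25 units -> 15 units -> 1 output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    Dense(25, activation="sigmoid", name="layer1"),
    Dense(15, activation="sigmoid", name="layer2"),
    Dense(1, activation="sigmoid", name="layer3"),
])
model.summary()
```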