[![AnalyticsDojo](https://github.com/rpi-techfundamentals/spring2019-materials/blob/master/fig/final-logo.png?raw=1)](http://introml.analyticsdojo.com)

# Neural Networks 
- This was adapted from the PyTorch Tutorials. 
- http://pytorch.org/tutorials/beginner/pytorch_with_examples.html

## Neural Networks 
- Neural networks are the foundation of deep learning, which has revolutionized the 

> In the mathematical theory of artificial neural networks, the universal approximation theorem states[1] that a feed-forward network with a single hidden layer containing a finite number of neurons (i.e., a multilayer perceptron) can approximate continuous functions on compact subsets of $\mathbb{R}^n$, under mild assumptions on the activation function.
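
One common way to write the single-hidden-layer form of this statement is sketched below. The symbols $N$, $v_i$, $w_i$, $b_i$, and $\sigma$ are introduced here for illustration and do not appear in the quoted text. For a continuous function $f$ on a compact set $K \subset \mathbb{R}^n$ and any $\varepsilon > 0$, there exist a width $N$ and parameters $v_i, b_i \in \mathbb{R}$, $w_i \in \mathbb{R}^n$ such that

$$
\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} v_i \, \sigma\!\left(w_i^\top x + b_i\right) \right| < \varepsilon
$$

where $\sigma$ is the activation function. The theorem only guarantees that such parameters exist; it does not say how large $N$ must be or how to find the weights.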

### Generate Fake Data
- `D_in` is the number of dimensions of an input variable.
- `D_out` is the number of dimensions of an output variable.
- Here we are learning some special "fake" data that represents the XOR problem. 
- Here, the dependent variable is 1 if either the first or second variable is 1, and 0 otherwise.


In [53]:
# -*- coding: utf-8 -*-
import numpy as np

# These are our independent (x) and dependent (y) variables.
x = np.array([ [0,0,0],[1,0,0],[0,1,0],[0,0,0] ])
y = np.array([[0,1,1,0]]).T
print("Input data:\n",x,"\n Output data:\n",y)

Input data:
 [[0 0 0]
 [1 0 0]
 [0 1 0]
 [0 0 0]] 
 Output data:
 [[0]
 [1]
 [1]
 [0]]


### A Simple Neural Network 
- Here we are going to build a two-layer neural network: a single hidden layer with `H = 2` units, followed by an output layer. 

In [54]:
np.random.seed(seed=83832)
#D_in is the number of input variables. 
#H is the hidden dimension.
#D_out is the number of dimensions for the output. 
D_in, H, D_out = 3, 2, 1

# Randomly initialize the weights of our two-layer (one hidden layer) network.
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)
bias = np.random.randn(H, 1)  # Initialized here, but not used in the training loop below.
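
The training loop in this notebook never uses `bias`. As a minimal sketch of how a hidden-layer bias could enter the forward pass, the block below uses a hypothetical `b1` of shape `(H,)`, so that it broadcasts across the four rows of `x`. A bias of shape `(H, 1)` would not broadcast against the `(4, H)` hidden activations here. The seed and the name `b1` are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility

D_in, H, D_out = 3, 2, 1
x = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]])

w1 = rng.standard_normal((D_in, H))   # input -> hidden weights
b1 = rng.standard_normal(H)           # hypothetical hidden bias, shape (H,)
w2 = rng.standard_normal((H, D_out))  # hidden -> output weights

h = x.dot(w1) + b1         # (4, H): b1 is added to every row
h_relu = np.maximum(h, 0)  # ReLU activation
y_pred = h_relu.dot(w2)    # (4, D_out)

print(h.shape, h_relu.shape, y_pred.shape)  # (4, 2) (4, 2) (4, 1)
```

Printing the shapes is a quick way to confirm that each matrix product lines up before writing the backward pass.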

### Learn the Appropriate Weights via Backpropagation
- The learning rate controls how large a step each update takes, and so how quickly the model adjusts its parameters. 
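
To make the effect of the learning rate concrete, here is a small sketch on a toy problem that is not part of the original notebook: minimizing `f(w) = w**2` with plain gradient descent. The function name `gradient_descent` and the three rates tried are illustrative choices.

```python
def gradient_descent(learning_rate, steps=20, w0=1.0):
    """Minimize f(w) = w**2 starting from w0 and return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w              # derivative of w**2
        w -= learning_rate * grad   # same update rule as for w1 and w2
    return w

for lr in (0.01, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With a small rate the iterate creeps toward the minimum at 0, a moderate rate gets there much faster, and a rate that is too large overshoots on every step and moves away from the minimum.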

In [55]:
# -*- coding: utf-8 -*-

learning_rate = .01
for t in range(500):
    # Forward pass: compute predicted y
    h = x.dot(w1)

    # ReLU is the activation function: it replaces negative values with 0.
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

    # Backprop to compute gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

0 10.6579261591
1 9.10203339893
2 7.92822558061
3 7.01603070961
4 6.28979819918
5 5.69984738569
6 5.21233053023
7 4.80346624793
8 4.456102755
9 4.15758768903
10 3.89840273398
11 3.67126267684
12 3.47050562961
13 3.29167096682
14 3.13120131373
15 2.98622833978
16 2.8544162991
17 2.73384607859
18 2.62292812419
19 2.52033626007
20 2.42495682843
21 2.33584920317
22 2.25221484357
23 2.17337282724
24 2.09874034592
25 2.02781703626
26 1.96017229769
27 1.89543495408
28 1.83328476643
29 1.77344541638
30 1.71567866423
31 1.65977944954
32 1.60557175094
33 1.55290505986
34 1.50165135204
35 1.45170246386
36 1.40296779892
37 1.35537230457
38 1.3088546702
39 1.26336570846
40 1.21886688836
41 1.17532899574
42 1.13273090181
43 1.09105842491
44 1.05030327423
45 1.01046206727
46 0.971535415306
47 0.933527072858
48 0.896443149097
49 0.860291379845
50 0.825080459944
51 0.790819436113
52 0.757517160723
53 0.725181806896
54 0.693820445158
55 0.663438681518
56 0.634040356373
57 0.605627303094
58 0.57819916452
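
One way to gain confidence in hand-written backpropagation is to compare it against a finite-difference estimate of the gradient. The sketch below repeats the same backward steps as the loop above inside a helper and checks them numerically. The helper names (`loss_fn`, `analytic_grads`, `numeric_grad`), the seed, and the step size `eps` are assumptions made for this illustration.

```python
import numpy as np

x = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
y = np.array([[0, 1, 1, 0]], dtype=float).T

def loss_fn(w1, w2):
    """Sum-of-squares loss of the two-layer ReLU network."""
    y_pred = np.maximum(x.dot(w1), 0).dot(w2)
    return np.square(y_pred - y).sum()

def analytic_grads(w1, w2):
    """Gradients from the same backprop steps as the training loop."""
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    grad_y_pred = 2.0 * (h_relu.dot(w2) - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h = grad_y_pred.dot(w2.T)
    grad_h[h < 0] = 0
    return x.T.dot(grad_h), grad_w2

def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference estimate of df/dw, one entry at a time."""
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        f_plus = f()
        w[idx] = old - eps
        f_minus = f()
        w[idx] = old               # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)     # assumed seed for reproducibility
w1 = rng.standard_normal((3, 2))
w2 = rng.standard_normal((2, 1))

g1, g2 = analytic_grads(w1, w2)
n1 = numeric_grad(lambda: loss_fn(w1, w2), w1)
n2 = numeric_grad(lambda: loss_fn(w1, w2), w2)
print(np.abs(g1 - n1).max(), np.abs(g2 - n2).max())
```

Both printed differences should be tiny. A large gap would usually point to a mistake in one of the backward steps.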

### Fully Connected

In [56]:

pred = np.maximum(x.dot(w1),0).dot(w2)

print(pred, "\n", y)


[[ 0.        ]
 [ 0.99992661]
 [ 1.00007337]
 [ 0.        ]] 
 [[0]
 [1]
 [1]
 [0]]
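
The network outputs real numbers rather than 0/1 labels. A minimal sketch of turning the predictions printed above into labels follows; the 0.5 cutoff is an assumption made for this illustration, not something the notebook specifies.

```python
import numpy as np

# Predictions and targets copied from the output above.
pred = np.array([[0.0], [0.99992661], [1.00007337], [0.0]])
y = np.array([[0, 1, 1, 0]]).T

labels = (pred > 0.5).astype(int)  # assumed cutoff of 0.5
accuracy = (labels == y).mean()    # fraction of rows classified correctly
print(labels.ravel(), accuracy)    # [0 1 1 0] 1.0
```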


### Hidden Layers are Often Viewed as Unknown
- Each layer is just a weight matrix.

In [57]:
# However, the learned weights can be inspected directly.
w1

array([[-0.20401151,  1.01377406],
       [-0.10186284,  1.01392285],
       [ 1.07856887,  0.01873049]])

In [58]:
w2

array([[ 0.49346731],
       [ 0.98634069]])
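
To see that nothing is hidden in these matrices, the sketch below recomputes the prediction for the second training example, `[1, 0, 0]`, by hand from the weights printed above. The variable names `x_row` and `y_hat` are introduced here for illustration.

```python
import numpy as np

# Learned weights copied from the outputs above.
w1 = np.array([[-0.20401151, 1.01377406],
               [-0.10186284, 1.01392285],
               [ 1.07856887, 0.01873049]])
w2 = np.array([[0.49346731],
               [0.98634069]])

x_row = np.array([1, 0, 0])   # second training example
h = x_row.dot(w1)             # picks out the first row of w1
h_relu = np.maximum(h, 0)     # the negative first unit is set to 0
y_hat = h_relu.dot(w2)        # only the second hidden unit contributes
print(h, h_relu, y_hat)
```

The result should agree, up to rounding of the printed weights, with the prediction of roughly 0.9999 shown earlier for that row.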

In [49]:
# ReLU replaces negative values with zero and leaves the rest unchanged.
h_relu

array([[ 0.        ,  0.        ,  0.        ],
       [ 0.72108356,  0.        ,  0.        ],
       [ 0.72753913,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ]])
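
As a small standalone illustration of what ReLU does, here it is applied to a made-up array (the values are chosen for this example and are not from the notebook), together with the mask that backpropagation uses to zero the gradient wherever the input was negative.

```python
import numpy as np

h = np.array([[-1.5, 0.0, 2.0],
              [ 0.7, -0.2, 3.1]])

h_relu = np.maximum(h, 0)  # elementwise max against 0
print(h_relu)

# During backprop the gradient is zeroed wherever h was negative.
grad_h = np.ones_like(h)   # stand-in for an upstream gradient
grad_h[h < 0] = 0
print(grad_h)
```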