**Neural Networks:**

**What is a neural network?**

First, understand that our goal is artificial intelligence. Since artificial intelligence aims to imitate humans, and a defining feature of the human brain is its huge number of interconnected neurons, that structure is a natural starting point. An artificial neural network is a network built from a large number of nodes, but it is an abstract model: nodes typically store numbers, while edges store weights and indicate which neurons the values are passed to.

Neural networks are divided into feedforward neural networks and feedback neural networks.

**Feedforward neural network**: the simplest kind of neural network. It adopts a unidirectional, multi-layer structure: neurons are arranged in layers, each neuron is connected only to neurons in the previous layer, values are passed forward to the next layer, and there is no feedback between layers.

**Input node**: receives external information, performs no computation, and only passes the information on to the next layer of nodes.

**Hidden node**: receives the input of the previous layer, performs a computation, and passes the result to the nodes of the next layer.

**Output node**: receives the input of the previous layer, performs a computation, and outputs the final result.
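The node-and-edge picture above can be sketched in a few lines of Python. This is a minimal illustration, not any standard API; the function name and the numbers are invented for the example:

```python
# Nodes store numbers; edges store weights. One layer's computation is just
# a weighted sum of the previous layer's node values for each next node.

def layer_forward(inputs, weights):
    """Each next-layer node stores the weighted sum of its input nodes."""
    return [sum(w * x for w, x in zip(w_row, inputs)) for w_row in weights]

# Two input nodes feeding three next-layer nodes:
hidden = layer_forward([1.0, 2.0],
                       [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]])
```

Each inner list of `weights` holds the edge weights arriving at one next-layer node, matching the "edges store weights" description in the text.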

The input layer and output layer must always be present, but there may be no hidden layer (a single-layer perceptron) or multiple hidden layers (such feedforward networks are called multi-layer perceptrons).

**Feedback neural network**: also known as a recurrent neural network, it feeds the output of one step back into the input layer, so neurons in such a network can be connected to themselves or to earlier layers.

Take the feedback neural network as an example.

**The main differences between feedforward and feedback neural networks**

1. A feedforward neural network only accepts data from the previous layer, processes it, and passes it to the next layer; data flows strictly forward. In a feedback neural network, neurons are connected so that outputs can be fed back to earlier layers.

2. A feedforward neural network does not consider any delay between output and input; it can only represent a static mapping from input to output. A feedback neural network is different: it takes the delay between output and input into account, so the output at one step can influence the input at the next.

3. Compared with feedforward neural networks, feedback neural networks are better suited to functions such as memory.

**How to design a neural network?**

- When designing a neural network, the numbers of nodes in the input layer and output layer are usually fixed by the problem, while the hidden layers in the middle are specified by the designer.
- In a diagram of a neural network, circles represent neurons and lines represent connections between neurons; each line carries a weight, and these weights must be learned through training.

**Development history:**

Single-layer perceptron ——> Multi-layer perceptron ——> Neural network

**Perceptron:**

In the perceptron there is no hidden layer; there are only two layers, input and output. The input layer is only responsible for passing data along and performs no computation, while each output unit of the output layer computes on the values from the previous layer.

**Structure diagram:**

If the output is not a single value but a vector, then add another output unit z2, and the output becomes a vector.

Here a is the input vector, z is the output vector, and g is the sign (step) function, so each output is 0 or 1.

But it has a serious weakness: it cannot solve even the simple XOR problem.
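A minimal perceptron sketch, with hand-picked (not learned) weights, shows what the single computing layer can do. The function names and numbers are illustrative only:

```python
def g(x):
    """Step activation, as in the text: outputs 0 or 1."""
    return 1 if x > 0 else 0

def perceptron(a, w, b):
    """One computing node: g applied to the weighted sum plus a bias."""
    return g(sum(wi * ai for wi, ai in zip(w, a)) + b)

# With w = (1, 1) and b = -1.5 the perceptron computes logical AND:
and_outputs = [perceptron(a, (1, 1), -1.5)
               for a in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# No choice of w and b can make it compute XOR, because XOR is not
# linearly separable -- the limitation noted above.
```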

**Multi-layer perceptron:**

On the basis of the original single-layer perceptron, a computing layer is added. This two-layer computing network can not only solve the XOR problem but also handles nonlinear classification problems well. However, training such a two-layer computing network was a big problem at the time, with no good solution.

Then the BP (backpropagation) algorithm appeared and solved the difficulty of training two-layer computing networks.

**Structure:**

The two-layer network structure contains an input layer and an output layer, plus a middle (hidden) layer. Here the middle layer and the output layer are the computing layers.

The input is the first layer a(1). It is multiplied by the first layer of weights to produce the middle layer a(2), which is then multiplied by the second layer of weights to obtain the final output z. If the output should be a vector, simply add more output nodes.
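The two-layer forward pass just described can be sketched as follows. The weights are hand-picked (not learned) so the network solves XOR, and a step activation is used here for clarity; all names are invented for this example:

```python
def g(x):
    """Step activation used here for clarity."""
    return 1 if x > 0 else 0

def forward(a1):
    # Middle layer: first unit fires on OR, second on AND (hand-picked weights).
    h = (g(a1[0] + a1[1] - 0.5), g(a1[0] + a1[1] - 1.5))
    # Output layer: OR minus AND gives XOR.
    return g(h[0] - h[1] - 0.5)

xor_outputs = [forward(a) for a in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

This demonstrates the claim above: the added computing layer solves the XOR problem that defeats the single-layer perceptron.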

Until now I have not mentioned the bias b. In fact, the bias b has been there by default all along.

Except for the output layer, it exists in every layer of the neural network.

In the multi-layer neural network, the activation function g is no longer the sign function; it becomes the sigmoid activation function.

When designing a neural network, the number of input-layer nodes must match the dimension of the features, and the number of output-layer nodes must match the dimension of the target. The number of nodes in the middle layer is specified by the designer, yet it strongly affects the performance of the whole model. In practice, one usually tries a few values first and then finds the best number of hidden nodes by grid search.

**Training:**

At the beginning, nobody knew how to train neural networks, so they did not develop much. Then, as the amount of data gradually grew and more optimization algorithms were proposed, neural networks gradually showed their power.

The purpose of training a machine learning model is to make its parameters approximate the real model as closely as possible. Concretely, it works like this: first all parameters are given random values; we then use these randomly generated parameters to predict the samples in the training data. If the predicted target of a sample is h and the real target is y, we define a value called the loss, computed as follows.

**loss = (h − y)²**

**This value is called the loss. Our goal is to make the sum of the losses over all the training data as small as possible.**

If we substitute the neural network's prediction formula into this expression (since z = h), we can write the loss as a function of the parameters. This function is called the loss function. The next question is: how do we optimize the parameters so that the loss function reaches its minimum?
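The loss definition above can be written directly in code. A minimal sketch, with invented example numbers:

```python
def loss(h, y):
    """Squared error for one sample: (h - y)^2."""
    return (h - y) ** 2

def total_loss(predictions, targets):
    """Sum of losses over all training data -- the quantity to minimize."""
    return sum(loss(h, y) for h, y in zip(predictions, targets))

example = total_loss([0.9, 0.2, 0.4], [1.0, 0.0, 1.0])
```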

At this point the problem has been turned into an optimization problem. One common approach is to take derivatives and set them to zero, as in calculus, but because there is more than one parameter here, the cost of solving the resulting equations is very large, so the gradient descent algorithm is generally used to solve this optimization problem.

The gradient descent algorithm computes the gradient at the current parameters on each step, then moves the parameters a short distance in the direction opposite to the gradient, and repeats until the gradient is close to zero. At that point the parameters have generally reached a state where the loss function attains a minimum.
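The loop just described can be sketched on a one-parameter loss. The loss, learning rate, and iteration count are invented for illustration:

```python
# Gradient descent on loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# The parameter repeatedly moves against the gradient until the gradient
# is close to zero, which here happens at the minimizer w = 3.

w = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (w - 3)
    w -= learning_rate * gradient
# w is now very close to 3, the minimum of the loss.
```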

In a neural network model, due to the complex structure, the cost of each gradient computation is very large, so the **backpropagation** algorithm is needed. Backpropagation exploits the structure of the neural network: instead of computing each parameter's gradient independently, it works from back to front. First the gradient of the output layer is computed, then the gradient of the second parameter matrix, then the gradient of the middle layer, then the gradient of the first parameter matrix, and finally the gradient of the input layer. When the computation is finished, the gradients of both parameter matrices are available.

**The backpropagation algorithm computes the gradients from back to front, little by little, layer by layer.**
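A minimal backpropagation sketch on a tiny two-layer network with one scalar weight per layer, z = sigmoid(w2 · sigmoid(w1 · x)) and loss (z − y)². The gradients are computed back to front, reusing the activations saved during the forward pass; all names and numbers are invented for this example:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward_and_backward(x, y, w1, w2):
    """Forward pass, then gradients computed from the output layer backward."""
    # Forward pass, saving intermediate activations.
    a = sigmoid(w1 * x)              # middle-layer activation
    z = sigmoid(w2 * a)              # network output
    loss = (z - y) ** 2
    # Backward pass: output layer first, then the earlier layer.
    dz = 2 * (z - y) * z * (1 - z)   # gradient at the output unit's input
    dw2 = dz * a                     # gradient of the second weight
    da = dz * w2                     # gradient flowing back to the middle layer
    dw1 = da * a * (1 - a) * x       # gradient of the first weight
    return loss, dw1, dw2
```

A finite-difference check (perturb a weight, recompute the loss) confirms that these back-to-front gradients match the numerical derivatives.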

**Impact:**

It was certainly beneficial, for example in autonomous driving, speech, and images. But big problems remained: training was too time-consuming, it easily got stuck in local optima, and the hyperparameters were hard to tune. It was soon surpassed by the SVM.

Then Geoffrey Hinton proposed the concept of the "Deep Belief Network". Unlike traditional training methods, a deep belief network has a "**pre-training**" step, which easily brings the values in the neural network close to the optimal solution, followed by a "**fine-tuning**" step that optimizes the whole network. Using these two techniques greatly reduced the time needed to train multi-layer neural networks. He gave this multi-layer neural network learning method a new name: "**deep learning**".

**Structure:**

It is built on the two-layer network: another layer is added after the original output layer.

g(**W**(1) * **a**(1)) = **a**(2);

g(**W**(2) * **a**(2)) = **a**(3);

g(**W**(3) * **a**(3)) = **z**;

**In a multi-layer neural network, this layer-by-layer computation is called forward propagation.**
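The three-layer forward propagation written out above, g(W(1)a(1)) = a(2), g(W(2)a(2)) = a(3), g(W(3)a(3)) = z, can be sketched directly. The helper names are invented and the weights are placeholders:

```python
import math

def g(v):
    """Sigmoid activation applied elementwise to a vector."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def matvec(W, a):
    """Matrix-vector product: one weighted sum per output node."""
    return [sum(w * x for w, x in zip(row, a)) for row in W]

def forward(a1, W1, W2, W3):
    a2 = g(matvec(W1, a1))   # first computing layer
    a3 = g(matvec(W2, a2))   # second computing layer
    return g(matvec(W3, a3)) # output layer z

z = forward([1.0, 2.0],
            [[0.1, -0.2], [0.3, 0.0]],  # W(1)
            [[0.5, 0.5]],               # W(2)
            [[1.0]])                    # W(3)
```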

The three figures above make clear that with slightly more neurons, the same number of parameters can be arranged more deeply and thus express more.

Compared with the two-layer neural network, the deep neural network simply has more layers.

**The benefits of a deep network?**

A deeper network yields deeper representations and can model stronger functions. As the number of layers increases, the features extracted by each layer differ, so the network obtains progressively more detailed and more abstract information. For example, the first layer learns edge features, the second layer learns shapes, the third layer learns objects, and so on. This makes discrimination easier and gives stronger classification ability.

With single-layer neural networks, the activation function used was the sgn (sign) function. With two-layer neural networks, the sigmoid function was used most. Moving to multi-layer neural networks, a series of studies found that training with the ReLU function converges more easily and predicts better. Therefore, in deep learning, the most popular nonlinear function is the ReLU function.
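The ReLU is even simpler than the sigmoid, which is part of its appeal; a one-line sketch:

```python
def relu(x):
    """ReLU: passes positive inputs through unchanged and zeroes the rest.
    Its gradient does not saturate for positive inputs, which is one reason
    deep networks train more easily with it than with the sigmoid."""
    return x if x > 0 else 0.0

relu_values = [relu(v) for v in (-2.0, 0.0, 3.0)]
```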

In deep learning, regularization techniques have become more important than ever. This is mainly because the networks have more layers and far more parameters, which makes the **overfitting phenomenon** likely. Regularization is therefore very important; at present, dropout and data augmentation are the most widely used regularization techniques.
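A minimal sketch of (inverted) dropout, the technique named above: during training each activation is zeroed with probability p, and the survivors are scaled by 1/(1−p) so the expected value is unchanged. This is an illustration, not any framework's API:

```python
import random

def dropout(activations, p, rng):
    """Zero each activation with probability p; scale survivors by 1/(1-p)
    so the expected activation value stays the same as without dropout."""
    return [0.0 if rng.random() < p else a / (1.0 - p) for a in activations]

rng = random.Random(0)          # seeded for reproducibility
dropped = dropout([1.0] * 8, 0.5, rng)
```

At test time, dropout is simply switched off and activations are used as-is.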

At the same time, the success of deep learning is due not only to improved optimization algorithms and better activation functions.

**External causes:** the huge growth in available data and in computing power also played a decisive role.

**I hope this article can help you!**