Imagine wanting to find the fastest route to reach a destination by car. You could use a road map to estimate the distance and travel time of different roads. However, this method doesn’t account for traffic, which can vary significantly throughout the day.
Gradient Descent can be used to find the fastest route in real-time. In this case:
- The cost function represents the travel time of the journey.
- The parameter to optimize is the route to follow.
- The gradient indicates the direction in which travel time increases most rapidly.
The Gradient Descent algorithm can then be used to update the route iteratively, getting closer to the fastest route with each iteration.
Let’s now try to organize the definitions a bit.
Gradient Descent is an algorithm that tries to find the minimum of an objective function, i.e., the lowest possible value that the function can assume. To do this, the algorithm starts from a random point and moves in the opposite direction of the gradient, which is the direction in which the function grows most rapidly. The gradient is calculated as the derivative of the function, i.e., the slope of the curve at a point. The higher the gradient, the steeper the function.
Continue reading “The Gradient Descent Algorithm Explained Simply”