How to be an adaptive thinker

Reinforcement learning, one of the foundations of machine learning, involves learning through trial and error by interacting with an environment. This sounds familiar, right? That is what we humans do all our lives, often painfully: try things, evaluate the result, and then either continue or try something else.

In real life, you are the agent of your thought process. In a machine learning model, the agent is the function that carries out the calculations in this trial-and-error process. In machine learning, this thought process is modeled as a Markov decision process (MDP). The form of action-value learning applied to it is often called Q-learning.
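To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning on a toy problem. Everything in it is an assumption made for illustration and does not come from the text: the five-state corridor, the reward of 1.0 at the goal, the helper names step, train, and greedy_policy, and the values chosen for alpha, gamma, and epsilon.

```python
import random

# Hypothetical toy environment (an assumption for illustration):
# five states in a row, numbered 0..4; reaching state 4 ends the episode.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)  # walls at both ends
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn an action-value table q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Try something else: pick a random action.
                a = rng.randrange(len(ACTIONS))
            else:
                # Continue with what works: pick a best-known action,
                # breaking ties at random.
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Evaluate: move the estimate toward reward + discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the preferred move (-1 or +1) for each non-goal state."""
    return [ACTIONS[row.index(max(row))] for row in q[:GOAL]]
```

Calling greedy_policy(train()) returns the move the agent prefers in each non-goal state. In this corridor it settles on moving right everywhere, because only the rightmost state pays a reward and the discount factor makes every detour to the left less valuable.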

Mastering MDPs in theory and in practice calls for a three-dimensional method.

In general terms, the three-dimensional approach that will make you an AI expert means:

Starting by describing a problem to solve with real-life cases
Then, building a mathematical model
Then, writing source code and/or using a cloud platform solution

It is a way for you to enter any project with an adaptive attitude from the outset.