septa97's Blog: I will post random stuff here.

<h1>null-safe Java</h1>
<p>For those who have used Java, I’m sure that you have encountered a <strong>NullPointerException (NPE)</strong> at least once. In this short blog post, I’m going to discuss how to be <strong>null-safe</strong> whenever you are dealing with <strong>nullable</strong> expressions in Java.</p>
<h2 id="java-8s-optional">Java 8’s Optional</h2>
<p>Just recently, I’ve found out about Java 8’s <strong>Optional</strong> wrapper. It is located in the <strong>java.util</strong> package.</p>
<p>The idea is simple: wrap any value that you think might be <strong>null</strong> in an <strong>Optional</strong>.</p>
<p>For example, suppose you have a hash map with <strong>String</strong> keys and <strong>Integer</strong> values,</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">java.util.*</span><span class="o">;</span>
<span class="n">Map</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">></span> <span class="n">map</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashMap</span><span class="o"><>();</span>
<span class="n">map</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">"a"</span><span class="o">,</span> <span class="mi">1</span><span class="o">);</span>
<span class="n">map</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">"b"</span><span class="o">,</span> <span class="mi">2</span><span class="o">);</span>
<span class="n">map</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">"c"</span><span class="o">,</span> <span class="mi">3</span><span class="o">);</span>
</code></pre></div></div>
<p>And, unintentionally, you try to use the <strong>get</strong> method of a map on a <strong>non-existing</strong> key,</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Integer</span> <span class="n">dValue</span> <span class="o">=</span> <span class="n">map</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"d"</span><span class="o">);</span>
</code></pre></div></div>
<p>Since <strong>get</strong> returns <strong>null</strong> for a missing key, unboxing or dereferencing <strong>dValue</strong> later will throw an unexpected NPE. The solution is to wrap these <strong>nullable</strong> values in an <strong>Optional</strong> wrapper.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Optional</span><span class="o"><</span><span class="n">Integer</span><span class="o">></span> <span class="n">dValueOpt</span> <span class="o">=</span> <span class="n">Optional</span><span class="o">.</span><span class="na">ofNullable</span><span class="o">(</span><span class="n">map</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"d"</span><span class="o">));</span>
<span class="k">if</span> <span class="o">(</span><span class="n">dValueOpt</span><span class="o">.</span><span class="na">isPresent</span><span class="o">())</span> <span class="o">{</span>
<span class="n">Integer</span> <span class="n">dValue</span> <span class="o">=</span> <span class="n">dValueOpt</span><span class="o">.</span><span class="na">get</span><span class="o">();</span>
<span class="c1">// do something if it has a value</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
<span class="c1">// do something if it doesn't have a value</span>
<span class="o">}</span>
</code></pre></div></div>
<p>You might have to use <strong>Optional.ofNullable</strong> most of the time to indicate that the return value of the expression can have <strong>null</strong> values.</p>
<p>I’m sure you’re wondering:</p>
<p><em>“Why don’t you just check for null values manually? Wrapping data in Optional will just yield me more keystrokes.”</em></p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Integer</span> <span class="n">dValue</span> <span class="o">=</span> <span class="n">map</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"d"</span><span class="o">);</span>
<span class="k">if</span> <span class="o">(</span><span class="n">dvalue</span> <span class="o">==</span> <span class="kc">null</span><span class="o">)</span> <span class="o">{</span>
<span class="c1">// do something if null</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
<span class="c1">// do something if NOT null</span>
<span class="o">}</span>
</code></pre></div></div>
<p>Of course, <strong>Optional</strong> has other methods aside from <strong>isPresent()</strong>, and one of the most important is <strong>flatMap(mapper)</strong>, which chains operations on <strong>Optional</strong> values. Personally, I also think it’s better to avoid <strong>null</strong> in your code as much as possible. I don’t know if it’s just me, but whenever I see <strong>null</strong> in my Java code, it makes me want to gouge my eye out.</p>
<p>And for the functional programmers out there, I think you’ve noticed that Java is slowly adopting functional programming concepts. I want to discuss more of these concepts, but I don’t have much time, so that would be for another blog post, and it’s going to be a bit <em>FUN</em>ctional. ;)</p>
<h2 id="conclusion">Conclusion</h2>
<p>Use <strong>java.util.Optional</strong> if you think your data might have <strong>null</strong> values. (<strong>Note</strong>: Applicable only from Java 8 onwards.)</p>
<p>For more details on the <strong>Optional</strong> wrapper, you can check the <a href="https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html">Java 8 API documentation</a>.</p>
<p>Questions? Comments? Or maybe you’ve seen some typo? Put it in the comment section below.</p>
Sat, 18 May 2019 00:00:00 +0000
/2019/05/null-safe-Java
<h1>Reinforcement Learning using Q-Learning on a Flappy Bird agent</h1>
<p>With the recent <a href="https://blog.openai.com/openai-five/">news on OpenAI Five</a>, I’ve decided to learn and implement a Reinforcement Learning (RL) algorithm. While trying to come up with ideas, I remembered watching a video where a Flappy Bird agent learns by trial and error. So, I now have an environment and an agent (Flappy Bird). The last thing that I need is an algorithm for the agent to learn the optimal decisions. Finally, after a few hours of searching, I’ve decided to use Q-Learning as the algorithm.</p>
<h1 id="introduction-to-reinforcement-learning">Introduction to Reinforcement Learning</h1>
<p>As much as possible, I want to try explaining the terms using my own words to validate my understanding of the concepts. So what is Reinforcement Learning? Let’s define a few terms first:</p>
<ul>
<li><strong>Environment</strong> - think of this as the game world since a game will be used in this blog post.</li>
<li><strong>Agent</strong> - an entity which observes the environment.</li>
<li><strong>State</strong> - a condition of an agent and environment at a specific point in time, in this case, a state would be a frame in the game.</li>
<li><strong>Action</strong> - a move performed by the agent at a specific state.</li>
<li><strong>Reward</strong> - a return value after an agent performs an action at a specific state. In this environment (the Flappy Bird game), if the bird jumps and dies right after, you should expect a negative reward; if the bird is still alive after jumping, perhaps a positive one.</li>
<li><strong>Terminal State</strong> - a state (of course) which is either a <strong>game over</strong> or a <strong>goal</strong> state. But in this environment, there won’t be a goal state since the gameplay is continuous.</li>
</ul>
<p>Now that we’ve defined the terms, I’ll introduce the concept of Reinforcement Learning.</p>
<p>First, you will have an <strong>initial state</strong>, an agent, the agent’s <strong>set of actions</strong>, and the environment. Initially, the agent will choose an arbitrary action and perform it at the current state. The environment will then <strong>return a new state</strong> that results from that action, together with the associated <strong>reward</strong>. From the received reward, the agent will then choose an optimal action, perform it, and the <strong>loop will go on until convergence</strong>.</p>
<figure>
<br clear="all" />
<img src="https://i.stack.imgur.com/eoeSq.png" width="400" />
<figcaption>Figure 1. Reinforcement learning loop.</figcaption>
</figure>
<p><br clear="all" /></p>
<h1 id="introduction-to-q-learning">Introduction to Q-Learning</h1>
<p>Q-Learning is a type of reinforcement learning algorithm. As much as possible, I don’t want to describe the algorithm formally. I want you to gain an intuition of the algorithm. For the simplest example, I’ll use Siraj Raval’s <a href="https://www.youtube.com/watch?v=A5eihauRQvo">sample game in Q learning</a>.</p>
<h2 id="graph-game">Graph Game</h2>
<p><br clear="all" /></p>
<figure>
<img src="/assets/images/06-30-18/graph-game.png" width="400" />
<figcaption>Figure 2. Siraj Raval's graph game example.</figcaption>
</figure>
<p><br clear="all" /></p>
<p>So in this game, you have a weighted directed graph and the goal is to go to Node 5. First, let’s define the necessary terms:</p>
<ul>
<li><strong>Environment</strong> - the environment will be the weighted directed graph.</li>
<li><strong>Agent</strong> - an imaginary agent will be needed that will traverse this graph during the learning process.</li>
<li><strong>State</strong> - the state will be the current node of the agent.</li>
<li><strong>Action</strong> - the set of actions will be the nodes reachable through the outgoing edges of the agent’s current node.</li>
<li><strong>Reward</strong> - the reward will be the weight of the outgoing edge taken from the current node. <strong>-1</strong> marks the node pairs that are not connected by an edge.</li>
<li><strong>Terminal State</strong> - node 5 which is the goal state. Another terminal state would be a node which doesn’t have an outgoing edge (luckily, in this example, there’s no such node).</li>
</ul>
<p>The first step is to initialize the Q-value matrix. The Q-value matrix is a <strong>state-action mapping and its expected reward</strong>. In this example, the matrix would be 2-dimensional, with the states as the rows and the actions as the columns. A -1 value means that there’s no outgoing edge from node <strong>i</strong> to node <strong>j</strong>.</p>
<p><br clear="all" /></p>
<figure>
<img src="/assets/images/06-30-18/graph-game-matrix.png" width="400" />
<figcaption>Figure 3. Q-value matrix of the graph game.</figcaption>
</figure>
<p><br clear="all" /></p>
<p>Usually, the initial values of the matrix are arbitrary and the agent will learn by trial and error, but in this case, we can initially assign a reward for each state-action mapping.</p>
<p>After initializing the Q-value matrix, we would of course have an initial state <strong>s</strong> (currently at Node 0). From that initial state, we can obtain the optimal action from the Q-value matrix and that action would be to move from Node 0 to Node 4 since other state-action values are -1.</p>
<p>After performing the action, we will receive the associated <strong>reward</strong> of that action, which is 0. We will also receive the new state <strong>s’</strong>, which is Node 4 since we moved from Node 0 to Node 4.</p>
<p>The next step is updating the value of <strong>Q(state, action)</strong>, in this case <strong>Q[0][4]</strong>. The equation for this value update is quite mathematical, but don’t be overwhelmed; I will explain it as intuitively as possible.</p>
<p><br clear="all" /></p>
<figure>
<img src="/assets/images/06-30-18/q-value-update.png" />
<figcaption>Figure 4. Equation of Q-value update.</figcaption>
</figure>
<p><br clear="all" /></p>
<p>This is just a weighted update of the state-action value. In the term <script type="math/tex">(1-\alpha) * Q(s_t, a_t)</script>, we multiply the current value <script type="math/tex">Q(s_t, a_t)</script> by <strong>1 minus the learning rate (alpha)</strong>. The right side has almost the same idea, but instead of multiplying by <script type="math/tex">(1-\alpha)</script>, we multiply the learning rate <script type="math/tex">\alpha</script> itself by the learned value <script type="math/tex">(reward + \lambda * Q(s', action_{optimal}))</script>. The term <script type="math/tex">Q(s', action_{optimal})</script>, the value of the best action available from the new state, is called the <strong>estimate of the optimal future value</strong>. Why is it called weighted? Because <script type="math/tex">(1-\alpha) + \alpha = 1</script>, so the update is a convex combination of the old value and the learned value.</p>
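As a one-line helper, the update looks like this (a sketch of my own; the names <code>q_update</code>, <code>lam</code>, and the defaults are mine, not from any library):

```python
def q_update(q_sa, reward, q_next_max, alpha=0.1, lam=1.0):
    """Weighted Q-value update: a convex combination of the old value
    and the learned value (reward + discounted future estimate)."""
    return (1 - alpha) * q_sa + alpha * (reward + lam * q_next_max)

# e.g. old value 0.0, reward 1, best next-state value 2.0:
print(round(q_update(0.0, 1, 2.0), 3))  # → 0.3
```

With <code>alpha = 0</code> the old value is kept untouched; with <code>alpha = 1</code> it is completely replaced by the learned value.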
<p>After updating the Q-value matrix, the new state <strong>s’</strong> will now be the current state and the loop will start again until convergence. So that’s basically the Q-Learning algorithm.</p>
<p><br clear="all" /></p>
<figure>
<img src="/assets/images/06-30-18/graph-game-final-matrix.png" />
<figcaption>Figure 5. Final matrix after 1000 episodes. The agent learns that going from 0 to 4, then from 4 to 5 is the most optimal choice.</figcaption>
</figure>
<p><br clear="all" /></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Initialize the Q-value matrix
state = choose an arbitrary initial state (sometimes, it is already given)
while the Q-values have not converged:
action = choose_optimal_action_given_state(state)
reward, next_state = perform_action_on_the_environment(action)
update the Q-value matrix by the equation
Q(state, action) = (1-alpha) * Q(state, action) + alpha * (reward + discount factor * Q(next_state, optimal_action))
state = next_state
</code></pre></div></div>
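Here is a runnable Python sketch of that pseudocode on the graph game. The reward matrix follows the classic six-node "reach node 5" layout this example is based on; the exact entries are my assumption, not read off the figure, and random exploration is used while learning instead of the greedy choice:

```python
import numpy as np

# Reward matrix R[s][a]: edge weight for moving s -> a, -1 marks "no edge".
R = np.array([
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
])
GOAL = 5
alpha, lam = 0.1, 0.8
Q = np.zeros((6, 6))
rng = np.random.default_rng(0)

for episode in range(1000):
    state = int(rng.integers(6))
    while state != GOAL:
        actions = np.flatnonzero(R[state] >= 0)
        action = int(rng.choice(actions))   # explore randomly while learning
        reward = R[state, action]
        # the weighted update from the equation in the post
        Q[state, action] = ((1 - alpha) * Q[state, action]
                            + alpha * (reward + lam * Q[action].max()))
        state = action

# Greedy walk from node 0 using the learned Q-values
path = [0]
while path[-1] != GOAL:
    s = path[-1]
    valid = np.flatnonzero(R[s] >= 0)
    path.append(int(valid[np.argmax(Q[s, valid])]))
print(path)  # → [0, 4, 5]
```

Purely greedy action selection during learning can get stuck on the first rewarding path it finds, which is why this sketch explores randomly and only acts greedily after training.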
<h2 id="flappy-bird">Flappy Bird</h2>
<h3 id="defining-the-variables">Defining the variables</h3>
<ul>
<li><strong>Environment</strong> - the environment will be the game itself. The height of the game screen would be <strong>512 pixels</strong> and the width would be <strong>288 pixels</strong>.</li>
<li><strong>Agent</strong> - the bird will be the agent.</li>
<li><strong>State</strong> - for this game, I’ll be using 3 variables as the representation of the state. These would be the <strong>x-axis distance</strong> of the bird to the next pipes, and the <strong>y-axis distances</strong> of the bird to the top and bottom pipe.</li>
<li><strong>Action</strong> - there will only be 2 kinds of action for this scenario: <strong>to jump or not</strong>.</li>
<li><strong>Reward</strong> - for each state where the bird is still alive, the reward would be <strong>1</strong>, and if the bird dies in that state, the reward would be <strong>-100</strong>.</li>
<li><strong>Terminal State</strong> - the <strong>only</strong> terminal state would be the state where the bird dies.</li>
</ul>
<h3 id="initialize-the-q-value-matrix">Initialize the Q-value matrix</h3>
<p>First, we will initialize the Q-value matrix with zeroes since we don’t have prior information about the rewards unlike the graph game previously. This would be a <strong>4-dimensional matrix of size 350 x 1024 x 1024 x 2</strong> (<em>you don’t have to visualize this matrix, just think of it as a combination of the state space and the action space</em>).</p>
<p>The <strong>350</strong> comes from the maximum possible <strong>x-axis distance</strong> between the bird and the next pipe. I set it to this value just to be safe.</p>
<p>For the <strong>y-axis distance</strong> between the bird and the next top pipe, the possible values can range from -512 to 512. Since there’s no negative indexing, just add 512 to shift the range to 0 to 1024, which is why I set the size to 1024. The same goes for the y-axis distance between the bird and the next bottom pipe. If you’re curious why I didn’t set that to a higher value to be safe, I’ll ask you a question: <em>Is there a pipe that’s exactly on the top or bottom pixels?</em> None, right? That’s why it’s safe to use 1024 as the upper bound.</p>
<p><br clear="all" /></p>
<figure>
<img src="http://sarvagyavaish.github.io/FlappyBirdRL/images/StateSpace.png" />
<figcaption>Figure 6. Visualization of the x-axis distance and the y-axis distance of the bird.</figcaption>
</figure>
<p><br clear="all" /></p>
<h3 id="the-problem-with-the-current-state-representation">The problem with the current state representation</h3>
<p>The state space would be too large using these as a representation of a state. The convergence would take a while. I came up with a solution that would <strong>reduce the state space by a thousand times</strong>. So here’s the idea:</p>
<p>Let’s use the x-axis distance between the bird and the next pipe as an example. There are <strong>350 possible distances</strong> (based on the screen size) for this variable. <em>What if we merge the distances 0 to 10 into one bucket, 11 to 20 into another, 21 to 30 as well, and so on?</em></p>
<p>What I mean is, <em>in reality, when a human is playing Flappy Bird and jumping in between pipes, we don’t have to jump at an exact pixel just to get through the pipes, right?</em> It just has to be <strong>right enough</strong>, so each bucket of 10 pixels approximates those <strong>right enough</strong> distances. Makes sense, right?</p>
<p>This way, we can reduce the 3 variables to a tenth of their original size, resulting in a reduction of the state space by a factor of about a thousand (from <script type="math/tex">350 \times 1024 \times 1024 \approx 3.7 \times 10^8</script> to <script type="math/tex">35 \times 102 \times 102 \approx 3.6 \times 10^5</script>). I hope that makes sense to you.</p>
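The bucketing idea can be sketched as a small helper (the function name and the bucket size of 10 come from the discussion above; the +512 shift is the indexing trick described earlier):

```python
def discretize(dx, dy_top, dy_bottom, bucket=10):
    """Collapse raw pixel distances into coarse Q-table indices.
    The y-axis distances are shifted by 512 so indices are never negative."""
    return (dx // bucket,
            (dy_top + 512) // bucket,
            (dy_bottom + 512) // bucket)

print(discretize(127, -30, 44))  # → (12, 48, 55)
```

Every raw state inside the same bucket shares one row of the Q-table, which is exactly where the thousand-fold reduction comes from.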
<h3 id="initialization-of-other-variables">Initialization of other variables</h3>
<p>After initializing the Q-value matrix, we will have an <strong>initial state</strong> that we can get using the <strong>getGameState()</strong> function (<em>see the link below for the source code</em>). We can now retrieve the <strong>optimal action</strong> of the initial state from the Q-value matrix. The agent will then perform this optimal action and the <strong>reward</strong> together with the <strong>next state</strong> will be returned.</p>
<h3 id="update-the-q-value-matrix">Update the Q-value matrix,</h3>
<p>We can now update the <strong>Q(state, optimal_action)</strong> value using the equation that we’ve used before. Set next state <strong>s’</strong> as the current state and loop until convergence.</p>
<h1 id="learning-rate-and-discount-factor-alpha-and-lambda-respectively">Learning rate and discount factor (alpha and lambda, respectively)</h1>
<p>I’ve decided not to discuss these constants in depth, but to give you an idea: these constants are hyperparameters of the algorithm (<em>that is, you tune the algorithm by tweaking these parameters</em>). Their possible values are between 0 and 1, inclusive. I’ve set the discount factor to 1 because, from what I’ve read, a high discount factor is recommended when the environment is deterministic (i.e. no randomized results). I’ve set the learning rate to 0.1, a commonly recommended configuration, so that the algorithm won’t overshoot.</p>
<h1 id="conclusion">Conclusion</h1>
<p>If you’re interested in the origin of the update equation, you can look up the Bellman equation. My implementation is still a naive solution, resulting in a maximum score of only 164 after a few thousand episodes (iterations). You can improve the agent by adding more state variables, like the <strong>y-velocity</strong> of the bird. Later, I’ll upload a video of the agent learning from scratch. For now, you can clone the source code and run the agent on your local machine.</p>
<h1 id="source-code">Source Code</h1>
<p>For the source code, you can view it on this <a href="https://github.com/septa97/flappy-bird-q-learning">GitHub project link</a>.</p>
<h1 id="references">References</h1>
<h2 id="figures">Figures</h2>
<p>Figure 1: <a href="https://towardsdatascience.com/introduction-to-various-reinforcement-learning-algorithms-i-q-learning-sarsa-dqn-ddpg-72a5e0cb6287">https://towardsdatascience.com/introduction-to-various-reinforcement-learning-algorithms-i-q-learning-sarsa-dqn-ddpg-72a5e0cb6287</a><br />
Figure 2 and 3: <a href="https://www.youtube.com/watch?v=A5eihauRQvo">https://www.youtube.com/watch?v=A5eihauRQvo</a><br />
Figure 4: <a href="https://en.wikipedia.org/wiki/Q-learning">https://en.wikipedia.org/wiki/Q-learning</a><br />
Figure 6: <a href="http://sarvagyavaish.github.io/FlappyBirdRL/">http://sarvagyavaish.github.io/FlappyBirdRL/</a></p>
<h2 id="projects">Projects</h2>
<p>This project is highly inspired by these projects:</p>
<p><a href="https://github.com/chncyhn/flappybird-qlearning-bot">https://github.com/chncyhn/flappybird-qlearning-bot</a><br />
<a href="Equation of Q-value updat://github.com/SarvagyaVaish/FlappyBirdRL">https://github.com/SarvagyaVaish/FlappyBirdRL</a></p>
Sat, 30 Jun 2018 00:00:00 +0000
/2018/06/Q-Learning-Flappy-Bird
<h1>Step-by-step gradient computation of the Mean Squared Error (MSE)</h1>
<h2 id="equations">Equations</h2>
<script type="math/tex; mode=display">y_{prediction}(x_0, x_1, ..., x_{m-1}) = w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}</script>
<script type="math/tex; mode=display">MSE = \frac{1}{2n} \sum_{i=0}^{n-1}{(y_{actual} - (w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}))}^2</script>
<p>Gradient of the MSE:</p>
<script type="math/tex; mode=display">\nabla MSE(w_0, w_1, ..., w_{m-1}) = \frac{\partial_{MSE}}{\partial w_0}, \frac{\partial_{MSE}}{\partial w_1}, ..., \frac{\partial_{MSE}}{\partial w_{m-1}}</script>
<p>Let’s take the first partial derivative as an example.</p>
<script type="math/tex; mode=display">u = (y_{actual} - (w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}))</script>
<script type="math/tex; mode=display">MSE = \frac{1}{2n} \sum_{i=0}^{n-1}{u}^2</script>
<p>Using the chain rule:</p>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_0} = \frac{\partial_{MSE}}{\partial u} * \frac{\partial u}{\partial w_0}</script>
<p>The constants are cancelled out.</p>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial u} = \frac{1}{n} \sum_{i=0}^{n-1}{u}</script>
<p>Only <script type="math/tex">x_0</script> will remain from <script type="math/tex">u = (y_{actual} - (w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}))</script> since all other variables will be treated as a constant except for <script type="math/tex">w_0</script>.</p>
<script type="math/tex; mode=display">\frac{\partial u}{\partial w_0} = x_0</script>
<p>Back to the other equation:</p>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_0} = \frac{\partial_{MSE}}{\partial u} * \frac{\partial u}{\partial w_0}</script>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_0} = \frac{1}{n} \sum_{i=0}^{n-1}{(u)} * x_0</script>
<p>Substitute <script type="math/tex">u</script>.</p>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_0} = \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - (w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}))} * x_0</script>
<p>Substitute the <script type="math/tex">y_{prediction}</script> function.</p>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_0} = \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction})} * x_0</script>
<p>Then do this for all the weights:</p>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_1} = \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction})} * x_1</script>
<script type="math/tex; mode=display">...</script>
<script type="math/tex; mode=display">\frac{\partial_{MSE}}{\partial w_{m-1}} = \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction})} * x_{m-1}</script>
Wed, 06 Jun 2018 00:00:00 +0000
/2018/06/MSE-gradient
/2018/06/MSE-gradientFirst Blog: Implementing Multivariate Regression using NumPy<p>While I’m taking <a href="https://developers.google.com/machine-learning/crash-course/">Google’s Machine Learning Crash Course</a>, I’ve got the idea of implementing a machine learning algorithm from scratch (but with a little help from NumPy for <a href="https://en.wikipedia.org/wiki/Array_programming">vectorization</a> purposes). The first algorithm that I’ve implemented from scratch is Multivariate Regression. For this article’s example, I’ll be using a Linear Regression (a Multivariate Regression model with only one variable) example so that you can easily visualize the graphs.</p>
<p>I’ve used <a href="http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html">this</a> dataset for this example. For graphs, I’ve used <a href="https://matplotlib.org/">Matplotlib</a>. I created a Python class named MultivariateRegression with these parameters on initialization:</p>
<ol>
<li><strong>batch_size</strong> - The number of rows that will be used in computing the gradient. By default, all of the rows will be used in gradient computation.</li>
<li><strong>learning_rate</strong> - A scalar that will be multiplied to the gradient. By default, the value is 0.001.</li>
<li><strong>loss</strong> - Loss function that will be used, either <strong>l1</strong> loss (<strong>absolute difference</strong>) or <strong>l2</strong> loss (<strong>squared difference</strong>). By default, the squared difference will be used. You can read more about their differences <a href="http://rishy.github.io/ml/2015/07/28/l1-vs-l2-loss/">here</a>.</li>
<li><strong>num_epoch</strong> - The number of passes over the dataset. By default, 100 passes are performed.</li>
</ol>
<p>Next would be the <strong>train</strong> function. First, we must add a bias column to the dataset. Think of it as the y-intercept of the line equation <script type="math/tex">y = mx + b</script>, which can be generalized to <script type="math/tex">y = w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}</script>, where <script type="math/tex">m</script> is the number of weights/features, <script type="math/tex">w_0</script> is the bias weight, and <script type="math/tex">x_0</script> is fixed at 1. This results in a matrix with <strong>n</strong> rows and <strong>m</strong> columns (the number of features). In linear regression, we must find the best-fit line, the one that minimizes the error between the actual data and the predicted data (the points along the function line).</p>
<h2 id="mean-squared-error">Mean Squared Error</h2>
<p>The Mean Squared Error (MSE) will be the loss function, which uses the squared differences to compute the error (the constant 2 in the denominator is there so that it cancels out with the derivative of the squared term later).</p>
<script type="math/tex; mode=display">MSE = \frac{1}{2n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}(x_0, x_1, ..., x_{m-1}))}^2</script>
<p>To minimize this error, we must use the Gradient Descent algorithm.</p>
<h2 id="gradient-descent">Gradient Descent</h2>
<p>The main idea is, we have a bowl-like function graph which has its minimum value at the bottom, and the goal is to reach the bottom. We can reach the bottom by computing the gradient of the function. The gradient is a <strong>vector of partial derivatives of a function</strong> (I’ll leave the step-by-step partial derivatives to you). The gradient vector will then be multiplied by the learning rate (<script type="math/tex">\alpha</script>) and the resulting value will be subtracted from the current weight value. Subtraction is performed since we are minimizing the error. Initially, the weights will be 0 (but you can choose any initial weights if you want to experiment).</p>
<script type="math/tex; mode=display">\nabla f(w_0, w_1, ..., w_{m-1}) = \frac{\partial f}{\partial w_0}, \frac{\partial f}{\partial w_1}, ..., \frac{\partial f}{\partial w_{m-1}}</script>
<script type="math/tex; mode=display">\frac{\partial f}{\partial w_0} = -\frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}) * x_0^i}</script>
<script type="math/tex; mode=display">\frac{\partial f}{\partial w_1} = -\frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}) * x_1^i}</script>
<script type="math/tex; mode=display">...</script>
<script type="math/tex; mode=display">\frac{\partial f}{\partial w_{m-1}} = -\frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}) * x_{m-1}^i}</script>
<p><strong>You can view the step-by-step computation of the gradient of the MSE <a href="https://septa97.me/2018/06/MSE-gradient">here</a>.</strong></p>
<p>The new weights will then be computed using this formula (subtracting <script type="math/tex">\alpha</script> times the negative partial derivatives turns the update into an addition):</p>
<script type="math/tex; mode=display">w_0 = w_0 + \alpha * \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}) * x_0^i}</script>
<script type="math/tex; mode=display">w_1 = w_1 + \alpha * \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}) * x_1^i}</script>
<script type="math/tex; mode=display">...</script>
<script type="math/tex; mode=display">w_{m-1} = w_{m-1} + \alpha * \frac{1}{n} \sum_{i=0}^{n-1}{(y_{actual} - y_{prediction}) * x_{m-1}^i}</script>
<p>After the number of epochs specified is reached, the training will then stop and the current weights will be used in future predictions. We will have a learned function of the form:</p>
<script type="math/tex; mode=display">y_{prediction}(x_0, x_1, ..., x_{m-1}) = w_0x_0 + w_1x_1 + ... + w_{m-1}x_{m-1}</script>
<figure>
<img src="/assets/images/05-28-18/gradient-descent.png" width="400" />
<figcaption>Figure 1. Visualization of the Gradient Descent algorithm.</figcaption>
</figure>
<p><br clear="all" /></p>
<h2 id="results">Results</h2>
<figure>
<img src="/assets/images/05-28-18/training-dataset.png" width="400" />
<figcaption>Figure 2. Learned function line together with the training dataset.</figcaption>
</figure>
<p><br clear="all" /></p>
<figure>
<img src="/assets/images/05-28-18/testing-dataset.png" width="400" />
<figcaption>Figure 3. Learned function line together with the testing dataset.</figcaption>
</figure>
<p><br clear="all" /></p>
<figure>
<img src="/assets/images/05-28-18/gradient-descent-example.png" width="400" />
<figcaption>Figure 4. Graph of the error per epoch.</figcaption>
</figure>
<p><br clear="all" /></p>
<h2 id="source-code">Source Code</h2>
<p>You can view the source code here: <a href="https://github.com/septa97/pure-ML/blob/master/pureML/supervised_learning/regression/multivariate_regression.py">https://github.com/septa97/pure-ML/blob/master/pureML/supervised_learning/regression/multivariate_regression.py</a></p>
<h2 id="image-references">Image references</h2>
<p>Figure 1: <a href="https://developers.google.com/machine-learning/crash-course/reducing-loss/gradient-descent">https://developers.google.com/machine-learning/crash-course/reducing-loss/gradient-descent</a></p>
Mon, 28 May 2018 00:00:00 +0000
/2018/05/1st-blog
<h1>Pre-Blog</h1>
<p>Hi. This is my first site. I will be posting random stuff here, anything under the Sun (or maybe even beyond the Sun). I will update this site whenever I have the time.</p>
Mon, 07 May 2018 00:00:00 +0000
/2018/05/pre-blog