<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent questions tagged ml-exercise</title>
<link>https://ask.ghassem.com/tag/ml-exercise</link>
<description>Powered by Question2Answer</description>
<item>
<title>How to calculate feed-forward (forward-propagation) in neural network for classification?</title>
<link>https://ask.ghassem.com/1047/calculate-forward-forward-propagation-network-classification</link>
<description>&lt;p&gt;For the following neural network, calculate the classification accuracy given these settings:&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;&quot; height=&quot;1831&quot; src=&quot;https://i.imgur.com/nEyM4qU.jpeg&quot; width=&quot;2179&quot;&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1047/calculate-forward-forward-propagation-network-classification</guid>
<pubDate>Wed, 02 Oct 2024 14:47:26 +0000</pubDate>
</item>
<item>
<title>How to calculate the residual errors, MSE, MAE, and RMSE?</title>
<link>https://ask.ghassem.com/1031/how-to-calculate-the-residual-errors-mse-mae-and-rmse</link>
<description>&lt;p&gt;Given the following sample dataset with 5 samples and 2 features:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;width:500px&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;Sample&lt;/th&gt;
&lt;th&gt;Feature 1&lt;/th&gt;
&lt;th&gt;Feature 2&lt;/th&gt;
&lt;th&gt;Actual Value&lt;/th&gt;
&lt;th&gt;Predicted Value&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
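
&lt;p&gt;The metrics this exercise asks for can be sketched directly from the Actual Value and Predicted Value columns of the table above. This is a minimal illustration, assuming the residual is defined as actual minus predicted (some texts use the opposite sign); the variable names are chosen here and are not part of the original question.&lt;/p&gt;

```python
import math

# Actual and predicted values copied from the table above.
actual = [4, 5, 6, 7, 8]
predicted = [6, 6, 7, 8, 9]

# Assumed convention: residual = actual - predicted.
residuals = [a - p for a, p in zip(actual, predicted)]
n = len(residuals)

mse = sum(r ** 2 for r in residuals) / n    # mean squared error
mae = sum(abs(r) for r in residuals) / n    # mean absolute error
rmse = math.sqrt(mse)                       # root mean squared error

print(residuals, mse, mae, round(rmse, 4))
```

&lt;p&gt;Feature 1 and Feature 2 are not used here because the predictions are already given.&lt;/p&gt;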

&lt;p&gt;Calculate the residual errors, mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE), treating the predicted values as the output of a sample model.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1031/how-to-calculate-the-residual-errors-mse-mae-and-rmse</guid>
<pubDate>Fri, 27 Jan 2023 04:09:28 +0000</pubDate>
</item>
<item>
<title>How to create a Decision Tree using the ID3 algorithm?</title>
<link>https://ask.ghassem.com/1008/how-to-create-a-decision-tree-using-the-id3-algorithm</link>
<description>&lt;p&gt;NASA wants to be able to discriminate between Martians (M) and Humans (H) based on the following characteristics: Green ∈ {N, Y}, Legs ∈ {2, 3}, Height ∈ {S, T}, Smelly ∈ {N, Y}. Our available training data is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/3bC391L.png&quot;&gt;https://i.imgur.com/3bC391L.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;a)&amp;nbsp;&lt;/strong&gt;Greedily learn a decision tree using the ID3 algorithm and draw the tree.&lt;br&gt;
&lt;strong&gt;b)&amp;nbsp;&lt;/strong&gt;Write the learned concept for Martian as a set of conjunctive rules (e.g., if (green=Y and legs=2 and height=T and smelly=N), then Martian; else if ... then Martian; ...; else Human).&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1008/how-to-create-a-decision-tree-using-the-id3-algorithm</guid>
<pubDate>Wed, 01 Dec 2021 11:26:02 +0000</pubDate>
</item>
<item>
<title>How to update the weights in backpropagation algorithm when activation function is not linear?</title>
<link>https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function</link>
<description>&lt;p&gt;The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.&lt;/p&gt;

&lt;p&gt;Assume that for the following neural network the inputs are [$i_1,i_2$] = [0.05,&amp;nbsp;0.10], the target outputs are [$o_1$,$o_2$] = [0.01,&amp;nbsp;0.99], and the learning rate is $\alpha=0.5$.&lt;br&gt;
In addition, the activation function for the hidden layer (both $h_1$ and $h_2$)&amp;nbsp;is the sigmoid (logistic) function:&lt;/p&gt;

&lt;p&gt;$S(x)=\frac{1}{1+e^{-x}}$&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/cnY5feu.png&quot;&gt;https://i.imgur.com/cnY5feu.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hint:&lt;/strong&gt;&lt;br&gt;
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&lt;/p&gt;

&lt;p&gt;$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;a) &lt;/strong&gt;Show a step-by-step solution for calculating the weights $w_1$ to $w_8$ after one update, and fill in the table below.&lt;br&gt;
&lt;strong&gt;b) &lt;/strong&gt;Calculate the initial error and the error after one update (assume the&amp;nbsp;biases $[b_1,b_2]$ stay fixed during the updates).&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot;&gt;
&lt;caption&gt;Updating weights in backpropagation algorithm&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Weights&lt;/td&gt;
&lt;td&gt;Initialization&lt;/td&gt;
&lt;td&gt;New weights after one step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w1$&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w2$&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w3$&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w4$&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w5$&lt;/td&gt;
&lt;td&gt;0.40&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w6$&lt;/td&gt;
&lt;td&gt;0.45&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w7$&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w8$&lt;/td&gt;
&lt;td&gt;0.55&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function</guid>
<pubDate>Mon, 10 Aug 2020 21:55:19 +0000</pubDate>
</item>
<item>
<title>How to calculate the class probabilities and classify using Naive Bayes classifier?</title>
<link>https://ask.ghassem.com/899/calculate-class-probabilities-classify-using-classifier</link>
<description>&lt;p&gt;We have data on 1000 pieces of fruit. Each fruit is a Banana, an Orange, or some Other fruit, and we know 3 features of each fruit: whether it is long or not, sweet or not, and yellow or not, as displayed in the table below:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/gOFzVXL.png&quot;&gt;https://i.imgur.com/gOFzVXL.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An unknown piece of fruit with the following features is given:&amp;nbsp;Long, Sweet, and Yellow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Calculate the probability of each of the 3 classes using the Naive Bayes classification algorithm and report the predicted class.&lt;/strong&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/899/calculate-class-probabilities-classify-using-classifier</guid>
<pubDate>Mon, 10 Aug 2020 21:26:28 +0000</pubDate>
</item>
<item>
<title>How to calculate residual errors for linear regression and interpret regression metrics?</title>
<link>https://ask.ghassem.com/829/calculate-residual-regression-interpret-regression-metrics</link>
<description>Assuming we have a linear regression equation and some data points (a sample), how can we calculate the residual error for each data point and the total cost under metrics such as MAE, MSE, RMSE, MAPE, or MPE, given their formulas?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/829/calculate-residual-regression-interpret-regression-metrics</guid>
<pubDate>Tue, 18 Feb 2020 18:30:51 +0000</pubDate>
</item>
<item>
<title>How to calculate the probability and accuracy of a Logistic Regression classifier?</title>
<link>https://ask.ghassem.com/795/calculate-probability-accuracy-logistic-regression-classifier</link>
<description>&lt;p&gt;How to solve this problem?&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/8urywpf.jpg&quot;&gt;https://i.imgur.com/8urywpf.jpg&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Q1) Complete the sections marked with "?".&lt;/p&gt;

&lt;p&gt;Q2) What is the accuracy of the system if the threshold is 0.5?&lt;/p&gt;

&lt;p&gt;Q3)&amp;nbsp;What is the accuracy of the system if the threshold is 0.95?&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/795/calculate-probability-accuracy-logistic-regression-classifier</guid>
<pubDate>Mon, 03 Feb 2020 20:31:49 +0000</pubDate>
</item>
<item>
<title>How to calculate Accuracy, Precision, Recall or F1?</title>
<link>https://ask.ghassem.com/789/how-to-calculate-accuracy-precision-recall-or-f1</link>
<description>&lt;p&gt;In the following example, calculate&amp;nbsp;accuracy, precision, recall, and F1.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/OezFpqC.png&quot;&gt;https://i.imgur.com/OezFpqC.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/789/how-to-calculate-accuracy-precision-recall-or-f1</guid>
<pubDate>Mon, 27 Jan 2020 19:22:26 +0000</pubDate>
</item>
<item>
<title>How to perform a classification or regression using k-NN?</title>
<link>https://ask.ghassem.com/658/how-to-perform-a-classification-or-regression-using-k-nn</link>
<description>&lt;p&gt;Suppose you are given the following dataset, where $x$ and $y$ are the two features and the color (Red or Blue)&amp;nbsp;is the target variable.&lt;/p&gt;

&lt;p&gt;a) A new&amp;nbsp;data point with $x=1$ and $y=1$ is given. Using 3-NN with Euclidean distance, what color do you predict for this data point?&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:300px; width:200px&quot;&gt;
&lt;caption&gt;Dataset&lt;/caption&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot;&gt;x&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;y&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;Color&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Red&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Blue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Red&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;Red&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;Blue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Blue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Red&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Blue&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
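
&lt;p&gt;Part a can be sketched in a few lines: sort the rows of the table above by Euclidean distance to the query point, keep the 3 closest, and take a majority vote. The function name &lt;code&gt;knn_classify&lt;/code&gt; is hypothetical and chosen only for this illustration.&lt;/p&gt;

```python
import math
from collections import Counter

# (x, y, color) rows copied from the dataset table above.
points = [(-1, 1, "Red"), (0, 1, "Blue"), (0, 2, "Red"), (1, -1, "Red"),
          (1, 0, "Blue"), (1, 2, "Blue"), (2, 2, "Red"), (2, 3, "Blue")]

def knn_classify(query, data, k=3):
    # Sort by Euclidean distance to the query and keep the k closest rows.
    nearest = sorted(data, key=lambda p: math.dist(query, p[:2]))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0]

prediction = knn_classify((1, 1), points)
print(prediction)
```

&lt;p&gt;For a regression target, the same neighbor search would apply, with the vote replaced by the mean of the neighbors' target values.&lt;/p&gt;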

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;b) Now assume we have the following dataset and the target value is the price.&amp;nbsp;A new&amp;nbsp;data point with $x=1$ and $y=1$ is given. Using 3-NN with Euclidean distance, what is the estimated price?&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:300px; width:200px&quot;&gt;
&lt;caption&gt;Dataset&lt;/caption&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot;&gt;x&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;y&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;$100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;$20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;$40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;$30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;$40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;$70&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;$30&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/658/how-to-perform-a-classification-or-regression-using-k-nn</guid>
<pubDate>Thu, 27 Jun 2019 02:54:42 +0000</pubDate>
</item>
<item>
<title>How to calculate k-means clustering with a numerical example?</title>
<link>https://ask.ghassem.com/656/how-to-calculate-k-means-clustering-with-numerical-example</link>
<description>&lt;p&gt;Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters:&lt;/p&gt;

&lt;p&gt;$A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9)$.&lt;/p&gt;

&lt;p&gt;Suppose that the initial seeds (centers of each cluster) are $A1$, $A4$ and $A7$. Run the k-means algorithm for 1 epoch only. At the end of this epoch show:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; The new clusters (i.e. the examples belonging to each cluster)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b)&lt;/strong&gt; The centers of the new clusters&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;c)&lt;/strong&gt; Draw a 10 by 10 space with all the 8 points and show the clusters after the first epoch and the new centroids.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;d)&lt;/strong&gt; How many more iterations are needed to converge? Draw the result for each epoch.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/656/how-to-calculate-k-means-clustering-with-numerical-example</guid>
<pubDate>Thu, 27 Jun 2019 02:16:32 +0000</pubDate>
</item>
<item>
<title>How to calculate the class probabilities and classify using Naive Bayes classifier for NLP?</title>
<link>https://ask.ghassem.com/654/calculate-class-probabilities-classify-using-classifier</link>
<description>&lt;p&gt;We want to use Naive Bayes for tagging documents. This is a classification task in which we assign a class (tag) to each string. We currently have two tags: &lt;strong&gt;Sports&lt;/strong&gt; and &lt;strong&gt;Not sports&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Which tag does the sentence&amp;nbsp;&lt;strong&gt;&lt;em&gt;A very close game&lt;/em&gt;&lt;/strong&gt;&amp;nbsp;belong to? Using the Naive Bayes classifier, calculate the class probabilities for &lt;strong&gt;Sports&lt;/strong&gt; and &lt;strong&gt;Not sports&lt;/strong&gt; for this sentence based on&amp;nbsp;the dataset below and decide on the tag.&lt;/p&gt;

&lt;table border=&quot;1px&quot; cellpadding=&quot;1px&quot; style=&quot;width:500px&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Text&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Tag&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;“A great game”&lt;/td&gt;
&lt;td&gt;Sports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;“The election was over”&lt;/td&gt;
&lt;td&gt;Not sports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;“Very clean match”&lt;/td&gt;
&lt;td&gt;Sports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;“A clean but forgettable game”&lt;/td&gt;
&lt;td&gt;Sports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;“It was a close election”&lt;/td&gt;
&lt;td&gt;Not sports&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/654/calculate-class-probabilities-classify-using-classifier</guid>
<pubDate>Wed, 26 Jun 2019 19:43:41 +0000</pubDate>
</item>
<item>
<title>How to calculate Covariance Matrix and Principal Components for PCA?</title>
<link>https://ask.ghassem.com/652/how-calculate-covariance-matrix-and-principal-components</link>
<description>&lt;p&gt;The dataset with two features $(x,y)$ is shown as follows (note $y$ in this example is the second feature, not a target value):&lt;/p&gt;

&lt;table border=&quot;01&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:50px&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;x&lt;/th&gt;
&lt;th&gt;y&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.5&lt;/td&gt;
&lt;td&gt;2.4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0.5&lt;/td&gt;
&lt;td&gt;0.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.2&lt;/td&gt;
&lt;td&gt;2.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.9&lt;/td&gt;
&lt;td&gt;2.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3.1&lt;/td&gt;
&lt;td&gt;3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.3&lt;/td&gt;
&lt;td&gt;2.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.0&lt;/td&gt;
&lt;td&gt;1.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;1.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.5&lt;/td&gt;
&lt;td&gt;1.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.1&lt;/td&gt;
&lt;td&gt;0.9&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
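
&lt;p&gt;The steps requested below can be sketched without any libraries for this 2-feature case. This is only an illustration under two assumptions that the question does not state: the covariance is the sample covariance (dividing by $n-1$), and the eigenvalues of the symmetric $2\times2$ matrix are obtained from its characteristic polynomial. All variable names are chosen here.&lt;/p&gt;

```python
import math

# (x, y) rows copied from the table above.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
n = len(data)
xs = [x for x, _ in data]
ys = [y for _, y in data]
mean_x = sum(xs) / n
mean_y = sum(ys) / n

def cov(us, vs, mu, mv):
    # Sample covariance: divide by n - 1 (an assumption; some texts divide by n).
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / (n - 1)

cxx = cov(xs, xs, mean_x, mean_x)
cxy = cov(xs, ys, mean_x, mean_y)
cyy = cov(ys, ys, mean_y, mean_y)

# Eigenvalues of [[cxx, cxy], [cxy, cyy]] from the characteristic polynomial.
half_trace = (cxx + cyy) / 2
root = math.sqrt(half_trace ** 2 - (cxx * cyy - cxy ** 2))
eigenvalues = [half_trace + root, half_trace - root]

# Unit eigenvectors: (cxy, lam - cxx) solves (C - lam I) v = 0 when cxy is nonzero.
eigenvectors = []
for lam in eigenvalues:
    vx, vy = cxy, lam - cxx
    norm = math.hypot(vx, vy)
    eigenvectors.append((vx / norm, vy / norm))

# Share of the total variance explained by each principal component.
explained = [lam / sum(eigenvalues) for lam in eigenvalues]
print(cxx, cxy, cyy, eigenvalues, eigenvectors, explained)
```

&lt;p&gt;Projecting the mean-centered rows onto each eigenvector would then give the principal component scores.&lt;/p&gt;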

&lt;p&gt;a) Calculate the covariance matrix.&lt;br&gt;
b) Calculate the eigenvalues and eigenvectors.&lt;br&gt;
c) Calculate all the principal components (PCs).&lt;br&gt;
d) What percentage of the total variance in the dataset is explained by each PC?&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/652/how-calculate-covariance-matrix-and-principal-components</guid>
<pubDate>Wed, 26 Jun 2019 10:40:02 +0000</pubDate>
</item>
<item>
<title>How to calculate convolutions on a CONV layer for a Convolutional Neural Network?</title>
<link>https://ask.ghassem.com/650/calculate-convolutions-layer-convolutional-neural-network</link>
<description>&lt;p&gt;Assume we have a $5\times5$ px&amp;nbsp;RGB image with 3&amp;nbsp;channels (R, G, and B), given in the tables below:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;R&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;G&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;B&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;We have one&amp;nbsp;$3\times3$ px kernel (filter) with 3 channels as follows:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - R&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - G&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - B&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
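
&lt;p&gt;Parts a and b below ask for a strided, zero-padded convolution followed by ReLU. A minimal sketch follows, under assumptions the question does not spell out: "convolution" is taken to mean cross-correlation (no kernel flip, as is common in deep learning libraries), the three per-channel results are summed, and the bias is added once per output cell. The function name &lt;code&gt;conv2d&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
# Image channels and filter channels copied from the tables above.
R = [[2, 0, 0, 0, 0], [1, 2, 0, 0, 1], [2, 0, 1, 0, 2], [1, 2, 1, 0, 1], [0, 1, 0, 2, 0]]
G = [[0, 2, 1, 2, 2], [1, 1, 1, 0, 0], [0, 0, 2, 2, 0], [2, 0, 0, 2, 0], [0, 2, 1, 1, 1]]
B = [[0, 1, 0, 0, 1], [1, 1, 2, 0, 1], [1, 0, 2, 0, 2], [1, 0, 1, 1, 0], [1, 2, 1, 1, 2]]
FR = [[0, 0, 1], [1, 0, 1], [1, 0, 0]]
FG = [[0, 0, -1], [1, 0, 0], [1, -1, 0]]
FB = [[1, 0, 1], [0, 1, -1], [1, -1, 0]]

def conv2d(channels, filters, stride, pad, bias):
    size = len(channels[0])
    k = len(filters[0])
    out_size = (size + 2 * pad - k) // stride + 1
    # Zero-pad every channel on all four sides.
    padded = []
    for ch in channels:
        rows = [[0] * (size + 2 * pad) for _ in range(size + 2 * pad)]
        for r in range(size):
            for c in range(size):
                rows[r + pad][c + pad] = ch[r][c]
        padded.append(rows)
    # Slide the window; sum the elementwise products over all channels, then add the bias.
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = bias
            for ch, f in zip(padded, filters):
                for u in range(k):
                    for v in range(k):
                        total += ch[i * stride + u][j * stride + v] * f[u][v]
            row.append(total)
        out.append(row)
    return out

conv = conv2d([R, G, B], [FR, FG, FB], stride=2, pad=1, bias=1)
relu = [[max(z, 0) for z in row] for row in conv]
print(conv)
print(relu)
```

&lt;p&gt;With a $5\times5$ input, padding 1, a $3\times3$ kernel, and stride 2, the output is $3\times3$, since $(5 + 2 - 3)/2 + 1 = 3$.&lt;/p&gt;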

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; If&amp;nbsp;&lt;strong&gt;Stride = 2&lt;/strong&gt;, &lt;strong&gt;Zero-padding = 1&lt;/strong&gt;, and &lt;strong&gt;Bias&amp;nbsp;= 1&lt;/strong&gt;, what is the result of the convolution?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b)&lt;/strong&gt; What is the result after applying a &lt;strong&gt;ReLU&amp;nbsp;layer ($\max(z,0)$)&lt;/strong&gt; to the result of part a (the output has the same size as the result of part a)?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;c)&lt;/strong&gt; Calculate the output&amp;nbsp;of applying a &lt;strong&gt;max-pooling&lt;/strong&gt; layer of size $2\times2$ with &lt;strong&gt;Stride = 1&lt;/strong&gt; to the output of part b. (Hint: max-pooling layers, here and in general,&amp;nbsp;usually do not use zero-padding.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;d)&lt;/strong&gt; What is the result after applying &lt;strong&gt;flatten&lt;/strong&gt; on the output of part c and creating a vector?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;e)&lt;/strong&gt; Assume the vector you created contains $m$ elements. Use it as the input vector for a fully connected &lt;strong&gt;Softmax Regression classifier&lt;/strong&gt; with no hidden layers and no biases. Assume there are 2 classes, 0 and 1. The optimized weights leaving each element of the feature vector are 1 for odd-positioned elements and 2 for even-positioned elements. For example, if the feature vector is [10,11,12,13,14], all the weights &lt;strong&gt;from&lt;/strong&gt; 10 are 1 (because 10 is element 1 and 1 is odd), all the weights &lt;strong&gt;from&lt;/strong&gt; 11 are 2, all the weights &lt;strong&gt;from&lt;/strong&gt; 12 are&amp;nbsp;1, all the weights &lt;strong&gt;from&lt;/strong&gt; 13 are&amp;nbsp;2, all the weights &lt;strong&gt;from&lt;/strong&gt; 14 are 1, and so on. Draw the&amp;nbsp;Softmax&amp;nbsp;Regression network and determine whether the predicted class is 0 or 1.&lt;/p&gt;

&lt;p&gt;Hint:&amp;nbsp;&lt;br&gt;
&lt;strong&gt;Softmax Regression:&lt;/strong&gt;&amp;nbsp;$p_{i}=\frac{e^{z_{i}}}{\sum_{j=1}^{c} e^{z_{j}}}$&lt;br&gt;
Where $p_{i}$ is the probability of class $i$ and $c$ is the number of classes.&lt;/p&gt;</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/650/calculate-convolutions-layer-convolutional-neural-network</guid>
<pubDate>Wed, 26 Jun 2019 08:54:12 +0000</pubDate>
</item>
<item>
<title>How to optimize weights in Logistic Regression?</title>
<link>https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression</link>
<description>&lt;p&gt;The hypothesis (model) of Logistic Regression, which is a binary classifier&amp;nbsp;($y \in \{0,1\}$), is given in the equation below:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hypothesis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;$S(z)=P(y=1 | x)=h_{\theta}(x)=\frac{1}{1+\exp \left(-\theta^{\top} x\right)}$&lt;/p&gt;

&lt;p&gt;This gives the probability of class 1; by setting a threshold (such as $h_{\theta}(x) &amp;gt; 0.5$) we can classify a sample as 1 or 0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The cost function for Logistic Regression is defined below. It is called the&amp;nbsp;&lt;em&gt;binary cross-entropy loss function&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;$J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right)$&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Iterative updates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Model parameters are usually started from random numbers. In this case the only model parameters are the&amp;nbsp;$\theta_j$, and we assume all of them are initialized with 1: $\theta_j = 1$ for $j \in \{0,1,...,n\}$, where $n$ is the number of features.&lt;/p&gt;

&lt;p&gt;$\theta_{j_{new}} \leftarrow \theta_{j_{old}}+\alpha \times \frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}-\sigma\left(\theta_{old}^{\top} x^{(i)}\right)\right] x_{j}^{(i)}$&lt;/p&gt;

&lt;p&gt;Where:&lt;br&gt;
$m =$ number of rows in the training batch&lt;br&gt;
$x^{(i)} = $ the feature &lt;em&gt;vector&lt;/em&gt; for sample $i$&lt;br&gt;
$\theta = $ the coefficient &lt;em&gt;vector&lt;/em&gt; corresponding to the features, with elements $\theta_j$&lt;br&gt;
$y^{(i)} = $ actual class label for sample $i$ in the training batch&lt;br&gt;
$x_{j}^{(i)} = $ the element (column) $j$ in&amp;nbsp;the feature &lt;em&gt;vector&lt;/em&gt; for sample $i$&lt;br&gt;
$\sigma = $ the sigmoid function $S$ from the hypothesis above&lt;br&gt;
$\alpha =$ the learning rate&lt;/p&gt;
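
&lt;p&gt;One step of the update rule above can be sketched as follows. This is only an illustration: the function name &lt;code&gt;batch_update&lt;/code&gt; is hypothetical, and the two-row batch is made-up toy data, not the exam dataset from the figure below.&lt;/p&gt;

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def batch_update(theta, X, y, alpha):
    # One batch gradient-descent step following the update rule above.
    # Each row of X is assumed to start with the constant feature x_0 = 1.
    m = len(X)
    # Per-sample error y - sigmoid(theta . x), computed with the OLD theta.
    errors = [yi - sigmoid(sum(t * xj for t, xj in zip(theta, xi)))
              for xi, yi in zip(X, y)]
    # Update every coefficient j with the batch-averaged gradient term.
    return [t + alpha * sum(e * xi[j] for e, xi in zip(errors, X)) / m
            for j, t in enumerate(theta)]

# Hypothetical toy batch (NOT the dataset from the exercise).
X = [[1.0, 0.0], [1.0, 2.0]]
y = [1, 0]
theta = batch_update([0.0, 0.0], X, y, alpha=0.1)
print(theta)
```

&lt;p&gt;For mini-batch gradient descent, the same function would be called once per mini-batch, passing the updated coefficients from one call to the next.&lt;/p&gt;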

&lt;p&gt;&lt;strong&gt;Dataset&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The training dataset of pass/fail in an exam for 5 students is given in the table below:&lt;br&gt;
&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;If we initialize all the model parameters with 1 (all $\theta_j = 1$), and the learning rate is $\alpha = 0.1$, and if we use &lt;strong&gt;batch gradient descent&lt;/strong&gt;, what will be the:&lt;/p&gt;

&lt;p&gt;$a)$ Accuracy of the model on the training set at initialization ($\text{accuracy} = \frac{\text{number of correct classifications}}{\text{all classifications}}$)?&lt;br&gt;
$b)$&amp;nbsp;Cost at initialization?&lt;br&gt;
$c)$ Cost after 1 epoch?&lt;br&gt;
$d)$ Repeat steps $a$, $b$, and $c$ using &lt;strong&gt;mini-batch gradient descent&lt;/strong&gt; with $\text{batch size} = 2$.&lt;/p&gt;

&lt;p&gt;(Hint: for $j=0$ we have&amp;nbsp;$x_{0}^{(i)} = 1$ for all $i$.)&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression</guid>
<pubDate>Wed, 05 Jun 2019 17:38:50 +0000</pubDate>
</item>
<item>
<title>How to update weights in backpropagation algorithm (a numerical example)?</title>
<link>https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example</link>
<description>&lt;p&gt;Assume we have the following neural network and all activation functions are $f(z)=z$. If the weights are initialized with the values in the table below, what are the updated weights after one step with learning rate $\alpha = 0.05$?&lt;/p&gt;

&lt;p&gt;Assume the input values are [$i_1$,$i_2$] = [2,3] and target value&amp;nbsp;$out = 1$.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hint:&lt;/strong&gt;&lt;br&gt;
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&lt;/p&gt;

&lt;p&gt;$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;height:225px; width:394px&quot;&gt;
&lt;caption&gt;Updating weights in backpropagation algorithm&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Weights&lt;/td&gt;
&lt;td&gt;Initialization&lt;/td&gt;
&lt;td&gt;New weights after one step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w1$&lt;/td&gt;
&lt;td&gt;0.11&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w2$&lt;/td&gt;
&lt;td&gt;0.21&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w3$&lt;/td&gt;
&lt;td&gt;0.12&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w4$&lt;/td&gt;
&lt;td&gt;0.08&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w5$&lt;/td&gt;
&lt;td&gt;0.14&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w6$&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
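
&lt;p&gt;One update step can be sketched as below. The wiring is an assumption made only for this illustration, because it is defined by the linked figure and not by the text: $w_1,w_2$ feed hidden unit $h_1$ from $i_1,i_2$, $w_3,w_4$ feed $h_2$, $w_5,w_6$ connect $h_1,h_2$ to the single output, and there are no biases. If the figure differs, the gradients must be adapted accordingly.&lt;/p&gt;

```python
# Values from the exercise; the network wiring below is an ASSUMPTION (see lead-in).
i1, i2, target, alpha = 2.0, 3.0, 1.0, 0.05
w1, w2, w3, w4, w5, w6 = 0.11, 0.21, 0.12, 0.08, 0.14, 0.15

# Forward pass with identity activations f(z) = z.
h1 = w1 * i1 + w2 * i2
h2 = w3 * i1 + w4 * i2
out = w5 * h1 + w6 * h2

# dE/d(out) for E = 1/2 * (target - out)^2.
delta = out - target

# Chain rule; every gradient is evaluated with the OLD weights.
grads = [delta * w5 * i1, delta * w5 * i2,   # dE/dw1, dE/dw2
         delta * w6 * i1, delta * w6 * i2,   # dE/dw3, dE/dw4
         delta * h1, delta * h2]             # dE/dw5, dE/dw6

# Gradient-descent update from the hint: w_new = w_old - alpha * dE/dw.
new_weights = [w - alpha * g for w, g in zip([w1, w2, w3, w4, w5, w6], grads)]
print(out, new_weights)
```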

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/v0RMeOQ.png&quot;&gt;https://i.imgur.com/v0RMeOQ.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example</guid>
<pubDate>Thu, 11 Apr 2019 17:02:04 +0000</pubDate>
</item>
<item>
<title>How to calculate univariate linear regression?</title>
<link>https://ask.ghassem.com/610/how-to-calculate-univariate-linear-regression</link>
<description>&lt;p&gt;For the following dataset, calculate the regression equation $\hat{y} = ax+b$&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;height:246px; width:213px; border-spacing: 1px;&quot;&gt;
&lt;caption&gt;dataset&lt;/caption&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot;&gt;x&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;y&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;42&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;75&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
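
&lt;p&gt;A minimal sketch of the ordinary least-squares fit for the table above, using the closed-form formulas $a = S_{xy}/S_{xx}$ and $b = \bar{y} - a\bar{x}$. Ordinary least squares is assumed here as the fitting criterion; the variable names are chosen for this illustration.&lt;/p&gt;

```python
# (x, y) columns copied from the dataset table above.
xs = [1, 3, 10, 16, 26, 36]
ys = [42, 50, 75, 100, 150, 200]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of centered cross-products and centered squares.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)

a = sxy / sxx             # slope
b = mean_y - a * mean_x   # intercept
print(round(a, 4), round(b, 4))
```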


</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/610/how-to-calculate-univariate-linear-regression</guid>
<pubDate>Thu, 11 Apr 2019 16:46:47 +0000</pubDate>
</item>
<item>
<title>How to calculate Softmax Regression probabilities in this example?</title>
<link>https://ask.ghassem.com/605/calculate-softmax-regression-probabilities-this-example</link>
<description>&lt;p&gt;The scatter plot of the Iris dataset is shown in the figure below. Assume &lt;strong&gt;Softmax Regression&lt;/strong&gt; is used to classify an iris as Setosa, Versicolor, or Virginica using only petal length and petal width. If the weights required for Softmax Regression are initialized to 1 for class Setosa, 2 for class Versicolor, and 3 for class Virginica:&lt;/p&gt;

&lt;p&gt;1) What is the probability that an iris with petal length = 4.6 and petal width = 1.7 is classified as Virginica?&lt;/p&gt;

&lt;p&gt;2) What is the probability of Virginica if we use all four features (petal length = 4.6, petal width = 1.7, sepal length = 5.5, and sepal width = 3.0) with the same weight initialization?&lt;/p&gt;
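&lt;p&gt;A rough sketch of the softmax calculation for both questions. It assumes that every feature weight of a class takes that class's initial value (1, 2, or 3) and that there is no bias term, since the question does not mention one. The function names are hypothetical.&lt;/p&gt;

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(features, class_weights):
    # Assumption: all feature weights of a class share one value and there is
    # no bias, so each class score is weight * (sum of the features).
    scores = [w * sum(features) for w in class_weights]
    return softmax(scores)

weights = [1, 2, 3]  # Setosa, Versicolor, Virginica

# Question 1: petal length and petal width only.
p_two = class_probabilities([4.6, 1.7], weights)

# Question 2: all four features with the same initialization.
p_four = class_probabilities([4.6, 1.7, 5.5, 3.0], weights)

print("P(Virginica), two features: ", p_two[2])
print("P(Virginica), four features:", p_four[2])
```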

&lt;p&gt;&lt;img alt=&quot;&quot; src=&quot;https://i.imgur.com/CezSTPM.png&quot;&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/605/calculate-softmax-regression-probabilities-this-example</guid>
<pubDate>Thu, 04 Apr 2019 18:20:53 +0000</pubDate>
</item>
<item>
<title>How to calculate feed-forward (forward-propagation) in neural network?</title>
<link>https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network</link>
<description>&lt;p&gt;In the figure&amp;nbsp;below, a neural network is shown. Calculate the following:&lt;/p&gt;

&lt;p&gt;1) How many neurons do we have in the input layer and the output layer?&lt;/p&gt;

&lt;p&gt;2) How many hidden layers do we have?&lt;/p&gt;

&lt;p&gt;3) If all the weights are initialized to 1 ($w1=w2=w3=...=w19=1$), what is the output of this network after feed-forward for the sample shown in the figure (X = (x1,x2,x3) = (2,5,3) and y=10)? What is the error of the network ($\text{Error}=\frac{1}{2}(\hat{y}-y)^{2}$)? Assume the activation function of every neuron except the output neuron is $f(z)=z$.&lt;br&gt;
&lt;br&gt;
4) If we change the activation function of all&amp;nbsp;the neurons in the second hidden layer to Sigmoid ($S(x)=\frac{1}{1+e^{-x}}=\frac{e^{x}}{e^{x}+1}$), what would be the output of the network after this change? Calculate the error as well.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/rtqPiRa.jpg&quot;&gt;https://i.imgur.com/rtqPiRa.jpg&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network</guid>
<pubDate>Thu, 04 Apr 2019 15:54:17 +0000</pubDate>
</item>
<item>
<title>How to update weights using gradient descent algorithm?</title>
<link>https://ask.ghassem.com/596/how-to-update-weights-using-gradient-decent-algorithm</link>
<description>&lt;p&gt;For the neural network below, suppose we use the &lt;strong&gt;backpropagation algorithm&lt;/strong&gt; to update the weights. Assume the bias ($b$) is always 0 (ignore it when solving the problem), the dataset contains a single record with $x=2$ and target value $y=5$, as shown in the following table, and the activation function is $f(z) = z$.&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;width:200px&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot;&gt;feature (x)&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;Target (y)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;1) Define the cost function, $J(w)$, based on the error in backpropagation algorithm: $J(w) = E = \frac{1}{2}(predicted - target)^2$, and draw it&lt;/p&gt;

&lt;p&gt;2) Initialize the weight to $w=3$, and calculate the error&lt;/p&gt;

&lt;p&gt;3) Calculate the updated weight using the gradient descent algorithm &lt;strong&gt;after three updates&lt;/strong&gt; for each of the following learning rates ($\alpha$):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$\alpha$ = 1&lt;/li&gt;
&lt;li&gt;$\alpha$ = 0.1&lt;/li&gt;
&lt;li&gt;$\alpha$ = 0.5&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hint:&amp;nbsp; &amp;nbsp;$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&amp;nbsp;&lt;/p&gt;
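&lt;p&gt;A minimal sketch of the update rule in the hint. It assumes the network in the figure reduces to a single weight with $\hat{y} = w \cdot x$ (no bias, $f(z)=z$), so that $\frac{\partial E}{\partial w} = (wx - y)x$. The function name is hypothetical.&lt;/p&gt;

```python
def gradient_descent(w, x, y, alpha, steps):
    """Repeat w_new = w_old - alpha * dE/dw and return every weight visited.

    Assumed model: y_hat = w * x, with E = 0.5 * (y_hat - y) ** 2.
    """
    history = [w]
    for _ in range(steps):
        grad = (w * x - y) * x  # dE/dw under the single-weight assumption
        w = w - alpha * grad
        history.append(w)
    return history

x, y, w0 = 2, 5, 3

# Error at the initial weight (part 2 of the question).
initial_error = 0.5 * (w0 * x - y) ** 2

# Weights after each of three updates, for every learning rate in the list.
results = {alpha: gradient_descent(w0, x, y, alpha, steps=3)
           for alpha in (1, 0.1, 0.5)}

for alpha, history in results.items():
    print(f"alpha={alpha}: {history}")
```

&lt;p&gt;Under these assumptions the cost is minimized at $w = 2.5$. With $\alpha = 1$ the weight moves further from the minimum at every step, with $\alpha = 0.5$ it alternates between 2 and 3, and with $\alpha = 0.1$ it moves steadily toward the minimum.&lt;/p&gt;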

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/uohFS6l.png&quot;&gt;https://i.imgur.com/uohFS6l.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/596/how-to-update-weights-using-gradient-decent-algorithm</guid>
<pubDate>Thu, 28 Mar 2019 17:17:39 +0000</pubDate>
</item>
<item>
<title>How to calculate Softmax Regression probabilities?</title>
<link>https://ask.ghassem.com/591/how-to-calculate-softmax-regression-probabilities</link>
<description>&lt;p&gt;The scatter plot of the Iris dataset is shown in the figure below. Assume Softmax Regression is used to classify an iris as Setosa, Versicolor, or Virginica using only petal length and petal width. If all the weights required for Softmax Regression are initialized to 0.5 and the network includes bias nodes:&lt;br&gt;
&lt;br&gt;
1) Write the weight vectors and equations for calculating the class probabilities.&lt;br&gt;
&lt;br&gt;
2) We have a new iris and we have measured petal length = 4.5 &amp;nbsp;and petal width = 1.6. Using the above initial model, what would be the result of classification?&lt;br&gt;
&lt;br&gt;
3) If we change all the weights related to the class blue to 1 and keep all other weights 0.5, what will be the predicted class?&lt;/p&gt;
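&lt;p&gt;A rough sketch of the classification step with bias weights included. It assumes each class has three weights (bias, petal length, petal width) and that the bias input is 1. Which class is drawn in blue depends on the figure, so the index used for it here is a placeholder.&lt;/p&gt;

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, petal_length, petal_width):
    # Each row holds (bias weight, petal-length weight, petal-width weight).
    # Assumption: the bias node feeds a constant input of 1.
    inputs = [1.0, petal_length, petal_width]
    scores = [sum(w * v for w, v in zip(row, inputs)) for row in weights]
    return softmax(scores)

# Part 2: every weight is 0.5, so all three class scores are identical.
uniform = [[0.5, 0.5, 0.5] for _ in range(3)]
p_initial = predict(uniform, 4.5, 1.6)

# Part 3: set the weights of the "blue" class to 1.
# Placeholder: index 2 stands in for whichever class is blue in the figure.
blue = 2
changed = [row[:] for row in uniform]
changed[blue] = [1.0, 1.0, 1.0]
p_changed = predict(changed, 4.5, 1.6)

print("initial:", p_initial)
print("changed:", p_changed)
```

&lt;p&gt;With identical weights the three probabilities are equal, so the initial model cannot prefer any class. After the change, the class whose weights were raised receives the largest score for these positive feature values.&lt;/p&gt;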

&lt;p&gt;&lt;img alt=&quot;&quot; src=&quot;https://i.imgur.com/CezSTPM.png&quot;&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/591/how-to-calculate-softmax-regression-probabilities</guid>
<pubDate>Thu, 21 Mar 2019 16:11:09 +0000</pubDate>
</item>
<item>
<title>How to calculate LogLoss in logistic regression?</title>
<link>https://ask.ghassem.com/588/how-to-calculate-logloss-in-logistic-regression</link>
<description>&lt;p&gt;The pass/fail results of an exam for 5 students are given in the table below. Suppose we use &lt;strong&gt;Logistic Regression&lt;/strong&gt; as the classifier and the optimizer produces the following model for the log-odds of passing the course:&lt;/p&gt;

&lt;p&gt;$\log_e(Odds) = -64 + 2 \times hours$&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;1) How do you calculate &lt;strong&gt;the loss of the model&lt;/strong&gt; for the student who studied 33 hours?&lt;/p&gt;

&lt;p&gt;2) What is the &lt;strong&gt;total loss&lt;/strong&gt; of the model, using the equation below?&lt;/p&gt;

&lt;p&gt;$Logloss = -\frac{1}{N} \sum_{i=1}^N(y_i\log_e(p_i) + (1 - y_i)\log_e(1 - p_i))$&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/588/how-to-calculate-logloss-in-logistic-regression</guid>
<pubDate>Mon, 18 Mar 2019 20:34:40 +0000</pubDate>
</item>
<item>
<title>How to calculate probability in Logistic Regression?</title>
<link>https://ask.ghassem.com/587/how-to-calculate-probability-in-logistic-regression</link>
<description>&lt;p&gt;The pass/fail results of an exam for 5 students are given in the table below. Suppose we use &lt;strong&gt;Logistic Regression&lt;/strong&gt; as the classifier and the optimizer produces the following model for the log-odds of passing the course:&lt;/p&gt;

&lt;p&gt;$\log_e(Odds) = -64 + 2 \times hours$&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;1) How do you calculate the &lt;strong&gt;probability of Pass&lt;/strong&gt; for the student who studied 33 hours?&lt;/p&gt;

&lt;p&gt;2) &lt;strong&gt;At least how many hours&lt;/strong&gt; should the student study to pass the course with a probability of more than 95%?&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/587/how-to-calculate-probability-in-logistic-regression</guid>
<pubDate>Mon, 18 Mar 2019 20:22:35 +0000</pubDate>
</item>
</channel>
</rss>