<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent questions tagged deep-learning</title>
<link>https://ask.ghassem.com/tag/deep-learning</link>
<description>Powered by Question2Answer</description>
<item>
<title>Step-by-Step Hidden State Calculation in a Recurrent Neural Network</title>
<link>https://ask.ghassem.com/1049/step-step-hidden-state-calculation-recurrent-neural-network</link>
<description>&lt;p&gt;Consider a simplified Recurrent Neural Network (RNN) with a single input and a single output. The hidden state is updated using the recurrence:&lt;/p&gt;

&lt;p&gt;$$ h_t = \text{ReLU}(W_{ih} \cdot x_t + W_{hh} \cdot h_{t-1}) $$&lt;/p&gt;

&lt;p&gt;Assume the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;\( x_t = 3 \) for every time step&lt;/li&gt;
&lt;li&gt;\( h_0 = 0 \)&lt;/li&gt;
&lt;li&gt;\( W_{ih} = 0.4 \)&lt;/li&gt;
&lt;li&gt;\( W_{hh} = 0.6 \)&lt;/li&gt;
&lt;li&gt;Activation function: ReLU&lt;/li&gt;
&lt;/ul&gt;
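
&lt;p&gt;The recurrence can be unrolled one step at a time with the values listed above. The following is a minimal Python sketch for illustration only, not part of the original question; the helper names relu and hidden_states are hypothetical.&lt;/p&gt;

```python
def relu(z):
    """ReLU activation: max(z, 0)."""
    return max(z, 0.0)


def hidden_states(x, h0, w_ih, w_hh, steps):
    """Return [h_1, ..., h_steps] for h_t = ReLU(w_ih * x + w_hh * h_{t-1})."""
    states = []
    h = h0
    for _ in range(steps):
        h = relu(w_ih * x + w_hh * h)
        states.append(h)
    return states


# Values from the problem statement: x_t = 3, h_0 = 0, W_ih = 0.4, W_hh = 0.6.
states = hidden_states(x=3.0, h0=0.0, w_ih=0.4, w_hh=0.6, steps=4)
for t, h in enumerate(states, start=1):
    print(f"h_{t} = {h:.4f}")
```

&lt;p&gt;Every pre-activation in this setting is positive, so ReLU acts as the identity and each step reduces to \( h_t = 1.2 + 0.6 \, h_{t-1} \).&lt;/p&gt;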

&lt;p&gt;&lt;strong&gt;Compute the value of the hidden state \( h_4 \) at time \( t = 4 \).&lt;/strong&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1049/step-step-hidden-state-calculation-recurrent-neural-network</guid>
<pubDate>Mon, 01 Dec 2025 18:32:24 +0000</pubDate>
</item>
<item>
<title>How to calculate feed-forward (forward-propagation) in neural network for classification?</title>
<link>https://ask.ghassem.com/1047/calculate-forward-forward-propagation-network-classification</link>
<description>&lt;p&gt;For the following neural network, calculate the classification accuracy given the settings shown in the figure.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;&quot; height=&quot;1831&quot; src=&quot;https://i.imgur.com/nEyM4qU.jpeg&quot; width=&quot;2179&quot;&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1047/calculate-forward-forward-propagation-network-classification</guid>
<pubDate>Wed, 02 Oct 2024 14:47:26 +0000</pubDate>
</item>
<item>
<title>Bankruptcy prediction and credit card</title>
<link>https://ask.ghassem.com/1021/bankruptcy-prediction-and-credit-card</link>
<description>Hello everyone, newbie data scientist here.&lt;br /&gt;
I&amp;#039;m working on a project to predict the bankruptcy probability (probability of default) of companies and to assign them a credit rating/score based on it.&lt;br /&gt;
For example, a probability below 50% is good and above that is bad (just as an example).&lt;br /&gt;
I have a dataset that contains financial ratios and a class label indicating whether the company went bankrupt or not (0 or 1).&lt;br /&gt;
I&amp;#039;m planning to use these models:&lt;br /&gt;
Logistic regression, linear discriminant analysis, decision trees, random forest, ANN, AdaBoost, SVM.&lt;br /&gt;
&lt;br /&gt;
The question is, and I know it is a dumb question:&lt;br /&gt;
Do those models return a probability that I can transform into labels? I saw that in a thesis and I&amp;#039;m not sure about it.&lt;br /&gt;
&lt;br /&gt;
Otherwise, any guidance or tips will be appreciated.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1021/bankruptcy-prediction-and-credit-card</guid>
<pubDate>Sun, 10 Apr 2022 05:50:14 +0000</pubDate>
</item>
<item>
<title>How many samples do we need to test image segmentation using synthetic data?</title>
<link>https://ask.ghassem.com/993/many-samples-need-test-image-segmentation-using-synthetic</link>
<description>Hello,&lt;br /&gt;
&lt;br /&gt;
I trained a CNN on synthetic data to perform a segmentation task on human faces. To evaluate the predictions of this network at test time, I used 200 examples from the database to compute precision and recall.&lt;br /&gt;
&lt;br /&gt;
Is this number sufficient, given that I control the data generator myself and that I build the database by randomly drawing the elements from centered Gaussian distributions?&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Thank you,</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/993/many-samples-need-test-image-segmentation-using-synthetic</guid>
<pubDate>Mon, 21 Jun 2021 12:26:32 +0000</pubDate>
</item>
<item>
<title>Binary Classification and neutral tag</title>
<link>https://ask.ghassem.com/978/binary-classification-and-neutral-tag</link>
<description>&lt;p&gt;I am trying to create a sentiment analysis model using a binary classification loss. I have a batch of tweets, some tagged as positive (labeled as 1) and some as negative (labeled as 0). I managed to gather some tweets that are tagged as neutral, but there are fewer of them than positive and negative tweets. My idea is to label them with 0.5 to balance the classification probability. Is this legitimate?&lt;/p&gt;

</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/978/binary-classification-and-neutral-tag</guid>
<pubDate>Sat, 30 Jan 2021 10:08:01 +0000</pubDate>
</item>
<item>
<title>&quot;Rare words&quot; on vocabulary</title>
<link>https://ask.ghassem.com/977/rare-words-on-vocabulary</link>
<description>I am trying to create a sentiment analysis model and I have a question.&lt;br /&gt;
&lt;br /&gt;
After I preprocessed my tweets and created my vocabulary, I noticed that I have words that appear fewer than 5 times in my dataset (many of them appear only once). Many of them are real words and not gibberish. My thinking is that if I keep those words, they will get wrong &amp;quot;sentimental&amp;quot; weights and will make my model worse.&lt;br /&gt;
Is my thinking right or am I missing something?&lt;br /&gt;
&lt;br /&gt;
My vocabulary size is around 40,000 words, and around 10,000 of them are &amp;quot;rare&amp;quot;. Should I &amp;quot;sacrifice&amp;quot; them?&lt;br /&gt;
&lt;br /&gt;
Thanks in advance.</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/977/rare-words-on-vocabulary</guid>
<pubDate>Sat, 30 Jan 2021 09:57:31 +0000</pubDate>
</item>
<item>
<title>How to update the weights in backpropagation algorithm when activation function is not linear?</title>
<link>https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function</link>
<description>&lt;p&gt;The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.&lt;/p&gt;

&lt;p&gt;Assume for the following neural network, inputs = [$i_1,i_2$] = [0.05,&amp;nbsp;0.10], we want the neural network to output = [$o_1$,$o_2$] = [0.01,&amp;nbsp;0.99], and&amp;nbsp;for learning rate, $\alpha=0.5$.&lt;br&gt;
In addition, the activation function for the hidden layer (both $h_1$ and $h_2$)&amp;nbsp;is sigmoid (logistic):&lt;/p&gt;

&lt;p&gt;$S(x)=\frac{1}{1+e^{-x}}$&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/cnY5feu.png&quot;&gt;https://i.imgur.com/cnY5feu.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hint:&lt;/strong&gt;&lt;br&gt;
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&lt;/p&gt;

&lt;p&gt;$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;a) &lt;/strong&gt;Show step by step solution to&amp;nbsp;calculate weights $w_1$ to $w_8$ after one update in table below.&lt;br&gt;
&lt;strong&gt;b) &lt;/strong&gt;Calculate initial error and error after one update (assume&amp;nbsp;biases $[b_1,b_2]$ are not changing during the updates).&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot;&gt;
&lt;caption&gt;Updating weights in backpropagation algorithm&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Weights&lt;/td&gt;
&lt;td&gt;Initialization&lt;/td&gt;
&lt;td&gt;New weights after one step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w1$&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w2$&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w3$&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w4$&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w5$&lt;/td&gt;
&lt;td&gt;0.40&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w6$&lt;/td&gt;
&lt;td&gt;0.45&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w7$&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w8$&lt;/td&gt;
&lt;td&gt;0.55&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function</guid>
<pubDate>Mon, 10 Aug 2020 21:55:19 +0000</pubDate>
</item>
<item>
<title>Pre-trained word embeddings and preprocessing</title>
<link>https://ask.ghassem.com/849/pre-trainned-word-embeddings-and-preproceess</link>
<description>How should I preprocess my data if I am going to use a pre-trained word embedding like GloVe or word2vec? Should I use stemming or stopword removal techniques?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/849/pre-trainned-word-embeddings-and-preproceess</guid>
<pubDate>Fri, 10 Apr 2020 12:08:09 +0000</pubDate>
</item>
<item>
<title>How to calculate convolutions on a CONV layer for a Convolutional Neural Network?</title>
<link>https://ask.ghassem.com/650/calculate-convolutions-layer-convolutional-neural-network</link>
<description>&lt;p&gt;Assume we have a $5\times5$ px&amp;nbsp;RGB image with 3&amp;nbsp;channels (R, G, and B), given as follows:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;R&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;G&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;B&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;We have one&amp;nbsp;$3\times3$ px kernel (filter) with 3 channels as follows:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - R&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - G&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - B&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; If&amp;nbsp;&lt;strong&gt;Stride = 2&lt;/strong&gt;,&lt;strong&gt; &lt;/strong&gt;and&lt;strong&gt;&amp;nbsp;Zero-padding = 1&lt;/strong&gt;, and &lt;strong&gt;Bias&amp;nbsp;= 1&lt;/strong&gt;, what will be the result of convolution?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b)&lt;/strong&gt; What is the result after applying a &lt;strong&gt;ReLU&amp;nbsp;layer ($\max(z,0)$)&lt;/strong&gt; to the result of part a? (The output has the same size as the result of part a.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;c)&lt;/strong&gt; Calculate the output of applying a &lt;strong&gt;max-pooling&lt;/strong&gt; layer of size $2\times2$ with &lt;strong&gt;Stride = 1&lt;/strong&gt; to the output of part b. (Hint: the max-pooling layer here, as is usual, does not include any zero-padding.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;d)&lt;/strong&gt; What is the result after applying &lt;strong&gt;flatten&lt;/strong&gt; on the output of part c and creating a vector?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;e)&lt;/strong&gt; Assume the vector you created contains m elements. Use it as the input vector for a &lt;strong&gt;Softmax&lt;/strong&gt; &lt;strong&gt;Regression classifier&lt;/strong&gt; (fully connected, without any hidden layers or biases). Assume there are 2 classes, 0 and 1. For the weights leaving each element of the feature vector, the optimized values are 1 for odd-positioned elements and 2 for even-positioned elements. For example, if the feature vector is [10,11,12,13,14], all the weights &lt;strong&gt;from&lt;/strong&gt; 10 are 1 (because 10 is element 1 and 1 is odd), all the weights &lt;strong&gt;from&lt;/strong&gt; 11 are 2, all the weights &lt;strong&gt;from&lt;/strong&gt; 12 are 1, all the weights &lt;strong&gt;from&lt;/strong&gt; 13 are 2, all the weights &lt;strong&gt;from&lt;/strong&gt; 14 are 1, and so on. Draw the Softmax Regression network and determine whether the predicted class should be 0 or 1.&lt;/p&gt;
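
&lt;p&gt;Parts a to d form a small pipeline that can be checked numerically. The sketch below is an illustration rather than part of the original question: it uses plain Python lists, assumes the usual CNN convention of cross-correlation (the kernel is not flipped), and the helper names zero_pad, conv2d, relu and max_pool are hypothetical.&lt;/p&gt;

```python
# Input channels (5x5) and filter channels (3x3), copied from the tables above.
IMAGE = {
    "R": [[2, 0, 0, 0, 0], [1, 2, 0, 0, 1], [2, 0, 1, 0, 2], [1, 2, 1, 0, 1], [0, 1, 0, 2, 0]],
    "G": [[0, 2, 1, 2, 2], [1, 1, 1, 0, 0], [0, 0, 2, 2, 0], [2, 0, 0, 2, 0], [0, 2, 1, 1, 1]],
    "B": [[0, 1, 0, 0, 1], [1, 1, 2, 0, 1], [1, 0, 2, 0, 2], [1, 0, 1, 1, 0], [1, 2, 1, 1, 2]],
}
KERNEL = {
    "R": [[0, 0, 1], [1, 0, 1], [1, 0, 0]],
    "G": [[0, 0, -1], [1, 0, 0], [1, -1, 0]],
    "B": [[1, 0, 1], [0, 1, -1], [1, -1, 0]],
}


def zero_pad(channel, pad):
    """Surround a 2-D list with `pad` rows and columns of zeros."""
    width = len(channel[0]) + 2 * pad
    blank = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + row + [0] * pad for row in channel]
    return blank + middle + [row[:] for row in blank]


def conv2d(image, kernel, stride, pad, bias):
    """Multi-channel cross-correlation (no kernel flip), summed over channels."""
    padded = {name: zero_pad(ch, pad) for name, ch in image.items()}
    k = len(next(iter(kernel.values())))
    size = len(next(iter(padded.values())))
    positions = range(0, size - k + 1, stride)
    return [
        [
            bias + sum(
                padded[name][r + i][c + j] * kernel[name][i][j]
                for name in kernel for i in range(k) for j in range(k)
            )
            for c in positions
        ]
        for r in positions
    ]


def relu(fmap):
    """Element-wise max(z, 0)."""
    return [[max(z, 0) for z in row] for row in fmap]


def max_pool(fmap, size, stride):
    """Max-pooling without padding."""
    positions = range(0, len(fmap) - size + 1, stride)
    return [
        [max(fmap[r + i][c + j] for i in range(size) for j in range(size)) for c in positions]
        for r in positions
    ]


conv_out = conv2d(IMAGE, KERNEL, stride=2, pad=1, bias=1)  # part a
relu_out = relu(conv_out)                                  # part b
pooled = max_pool(relu_out, size=2, stride=1)              # part c
flat = [value for row in pooled for value in row]          # part d
print(conv_out, relu_out, pooled, flat, sep="\n")
```

&lt;p&gt;With a $5\times5$ input, zero-padding 1, a $3\times3$ kernel and stride 2, the convolution output is $3\times3$; the $2\times2$ max-pooling with stride 1 then gives a $2\times2$ map, so the flattened vector has 4 elements.&lt;/p&gt;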

&lt;p&gt;Hint:&amp;nbsp;&lt;br&gt;
&lt;strong&gt;Softmax Regression:&lt;/strong&gt;&amp;nbsp;$p_{i}=\frac{e^{z_{i}}}{\sum_{j=1}^{c} e^{z_{j}}}$&lt;br&gt;
where $p_{i}$ is the probability of class $i$ and $c$ is the number of classes.&lt;/p&gt;</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/650/calculate-convolutions-layer-convolutional-neural-network</guid>
<pubDate>Wed, 26 Jun 2019 08:54:12 +0000</pubDate>
</item>
<item>
<title>What loss function to use in CNN-SVM model</title>
<link>https://ask.ghassem.com/641/what-loss-function-to-use-in-cnn-svm-model</link>
<description>I am using Matlab R2018b and am trying to integrate an SVM classifier into a CNN. My plan is to use the CNN only as a feature extractor and the SVM as the classifier. I know people already implemented this a few years ago, in TensorFlow or on other platforms. While implementing it, I got stuck at backward propagation: I am not sure which loss function I need to implement to compute the gradients and update the parameters.&lt;br /&gt;
&lt;br /&gt;
A few points came up during this:&lt;br /&gt;
&lt;br /&gt;
1. My feeling is that I should implement the hinge loss here. But which form of hinge loss should I implement? Should I use the second form of the hinge loss for calculating the loss during backward propagation?&lt;br /&gt;
&lt;br /&gt;
2. Besides calculating the loss for the backward pass, should I also calculate the loss in the forward pass to find out the loss incurred by the model?&lt;br /&gt;
&lt;br /&gt;
Any advice on this CNN-SVM integration will be appreciated, as I am unable to find any such material implemented in Matlab.&lt;br /&gt;
&lt;br /&gt;
Thanks.</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/641/what-loss-function-to-use-in-cnn-svm-model</guid>
<pubDate>Sat, 08 Jun 2019 09:24:21 +0000</pubDate>
</item>
<item>
<title>Is it impossible to turn an hourly time series prediction into a minute-level time series prediction?</title>
<link>https://ask.ghassem.com/625/is-impossible-predict-hours-time-series-minutes-time-series</link>
<description>&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://stackoverflow.com/questions/55930051/is-impossible-predict-hours-time-series-to-minutes-time-series&quot;&gt;https://stackoverflow.com/questions/55930051/is-impossible-predict-hours-time-series-to-minutes-time-series&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I want to convert this hourly time series prediction model into a minute-level prediction model.&lt;/p&gt;</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/625/is-impossible-predict-hours-time-series-minutes-time-series</guid>
<pubDate>Wed, 01 May 2019 13:11:26 +0000</pubDate>
</item>
<item>
<title>How to update weights in backpropagation algorithm (a numerical example)?</title>
<link>https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example</link>
<description>&lt;p&gt;Assume we have the following neural network and all activation functions are $f(z)=z$. If the weights are initialized with the values shown in the table below, what will the updated weights be after one step if the learning rate is $\alpha = 0.05$?&lt;/p&gt;

&lt;p&gt;Assume the input values are [$i_1$,$i_2$] = [2,3] and target value&amp;nbsp;$out = 1$.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hint:&lt;/strong&gt;&lt;br&gt;
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&lt;/p&gt;

&lt;p&gt;$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;height:225px; width:394px&quot;&gt;
&lt;caption&gt;Updating weights in backpropagation algorithm&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Weights&lt;/td&gt;
&lt;td&gt;Initialization&lt;/td&gt;
&lt;td&gt;New weights after one step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w1$&lt;/td&gt;
&lt;td&gt;0.11&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w2$&lt;/td&gt;
&lt;td&gt;0.21&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w3$&lt;/td&gt;
&lt;td&gt;0.12&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w4$&lt;/td&gt;
&lt;td&gt;0.08&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w5$&lt;/td&gt;
&lt;td&gt;0.14&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w6$&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
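
&lt;p&gt;A single update can be sketched in Python as below. This is an illustration under a stated assumption, not part of the original question: the figure is not reproduced in the text, so the sketch assumes a network with 2 inputs, 2 hidden neurons and 1 output, no biases, where $w_1, w_2$ feed $h_1$, $w_3, w_4$ feed $h_2$, and $w_5, w_6$ feed the output. The function name one_step is hypothetical.&lt;/p&gt;

```python
# ASSUMED topology (the figure is not reproduced here): 2 inputs, 2 hidden
# neurons, 1 output, no biases, identity activations f(z) = z:
#   h1 = w1*i1 + w2*i2,  h2 = w3*i1 + w4*i2,  out = w5*h1 + w6*h2
def one_step(w, i1, i2, target, alpha):
    """One gradient-descent update for E = 0.5 * (target - out)**2."""
    h1 = w["w1"] * i1 + w["w2"] * i2
    h2 = w["w3"] * i1 + w["w4"] * i2
    out = w["w5"] * h1 + w["w6"] * h2
    delta = out - target  # dE/d(out)
    # Chain rule: dE/dw = dE/d(out) * d(out)/dw (all activations are linear).
    grads = {
        "w1": delta * w["w5"] * i1,
        "w2": delta * w["w5"] * i2,
        "w3": delta * w["w6"] * i1,
        "w4": delta * w["w6"] * i2,
        "w5": delta * h1,
        "w6": delta * h2,
    }
    return {name: w[name] - alpha * grads[name] for name in w}, out


initial = {"w1": 0.11, "w2": 0.21, "w3": 0.12, "w4": 0.08, "w5": 0.14, "w6": 0.15}
updated, out = one_step(initial, i1=2.0, i2=3.0, target=1.0, alpha=0.05)
print(f"output before update: {out:.3f}")
for name, value in updated.items():
    print(f"{name}: {initial[name]:.2f} to {value:.4f}")
```

&lt;p&gt;If the figure uses a different wiring, only the three forward-pass lines and the matching entries in grads need to change.&lt;/p&gt;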

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/v0RMeOQ.png&quot;&gt;https://i.imgur.com/v0RMeOQ.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example</guid>
<pubDate>Thu, 11 Apr 2019 17:02:04 +0000</pubDate>
</item>
<item>
<title>How to calculate feed-forward (forward-propagation) in neural network?</title>
<link>https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network</link>
<description>&lt;p&gt;In the figure&amp;nbsp;below, a neural network is shown. Calculate the following:&lt;/p&gt;

&lt;p&gt;1) How many neurons do we have in the input layer and the output layer?&lt;/p&gt;

&lt;p&gt;2) How many hidden layers do we have?&lt;/p&gt;

&lt;p&gt;3) If all the weights are initialized to 1 ($w1=w2=w3=...=w19=1$), what is the output of this network after feed-forward for the sample shown in the figure&amp;nbsp;(X = (x1,x2,x3) = (2,5,3) and y=10)? What is the error of the network ($\text { Error }=\frac{1}{2}(\hat{y}-y)^{2}$)? Assume the activation function for all neurons except the output neuron is $f(z)=z$.&amp;nbsp;&lt;br&gt;
&lt;br&gt;
4) If we change the activation function of all&amp;nbsp;the neurons in the second hidden layer to Sigmoid ($S(x)=\frac{1}{1+e^{-x}}=\frac{e^{x}}{e^{x}+1}$), what would be the output of the network after this change? Calculate the error as well.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/rtqPiRa.jpg&quot;&gt;https://i.imgur.com/rtqPiRa.jpg&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network</guid>
<pubDate>Thu, 04 Apr 2019 15:54:17 +0000</pubDate>
</item>
<item>
<title>How to update weights using gradient descent algorithm?</title>
<link>https://ask.ghassem.com/596/how-to-update-weights-using-gradient-decent-algorithm</link>
<description>&lt;p&gt;For the neural network below, imagine we are going to use the &lt;strong&gt;backpropagation algorithm&lt;/strong&gt; to update the weights. Assume the bias (b) in this problem is always 0 (ignore the bias when you solve the problem), the dataset has only one record with $x=2$ and target value $y=5$, as shown in the following table, and the activation function is defined as $f(z) = z$.&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;width:200px&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot;&gt;feature (x)&lt;/th&gt;
&lt;th scope=&quot;col&quot;&gt;Target (y)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;1) Define the cost function, $J(w)$, based on the error in the backpropagation algorithm: $J(w) = E = \frac{1}{2}(predicted - target)^2$, and plot it.&lt;/p&gt;

&lt;p&gt;2) Initialize the weight with $w=3$, and calculate the error.&lt;/p&gt;

&lt;p&gt;3) Calculate the updated weights using the gradient&amp;nbsp;descent algorithm &lt;strong&gt;after three updates &lt;/strong&gt;for each of the following values of the learning rate ($\alpha$):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$\alpha$ = 1&lt;/li&gt;
&lt;li&gt;$\alpha$ = 0.1&lt;/li&gt;
&lt;li&gt;$\alpha$ = 0.5&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hint:&amp;nbsp; &amp;nbsp;$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&amp;nbsp;&lt;/p&gt;
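
&lt;p&gt;The update rule in the hint can be traced numerically. The sketch below is an illustration, not part of the original question; it assumes the network has a single weight so that the prediction is $w \cdot x$ (the figure is not reproduced in the text), and the function name gradient_descent is hypothetical.&lt;/p&gt;

```python
def gradient_descent(w, x, y, alpha, steps):
    """Return [w_0, w_1, ..., w_steps] for J(w) = 0.5 * (w*x - y)**2."""
    history = [w]
    for _ in range(steps):
        grad = (w * x - y) * x  # dJ/dw under the single-weight assumption
        w = w - alpha * grad
        history.append(w)
    return history


# Data from the problem: x = 2, y = 5, initial weight w = 3.
for alpha in (1, 0.1, 0.5):
    trace = gradient_descent(w=3.0, x=2.0, y=5.0, alpha=alpha, steps=3)
    print(alpha, [round(v, 4) for v in trace])
```

&lt;p&gt;Under this assumption each update multiplies the distance to the minimum at $w = 2.5$ by $(1 - 4\alpha)$, which is why the three learning rates behave so differently.&lt;/p&gt;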

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/uohFS6l.png&quot;&gt;https://i.imgur.com/uohFS6l.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/596/how-to-update-weights-using-gradient-decent-algorithm</guid>
<pubDate>Thu, 28 Mar 2019 17:17:39 +0000</pubDate>
</item>
<item>
<title>Determine weights on the paths that connect to the different data points in a neural network?</title>
<link>https://ask.ghassem.com/589/determine-weights-connect-different-points-neural-network</link>
<description>How do you determine the weight values that connect to the other data points when solving for our output in neural networks?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/589/determine-weights-connect-different-points-neural-network</guid>
<pubDate>Mon, 18 Mar 2019 23:35:25 +0000</pubDate>
</item>
<item>
<title>Passing variable length sentences to Tensorflow LSTM</title>
<link>https://ask.ghassem.com/561/passing-variable-length-sentences-to-tensorflow-lstm</link>
<description>&lt;p&gt;I have a TensorFlow LSTM model for predicting sentiment. I built the model with a maximum sequence length of 150 (the maximum number of words). To make predictions, I have written the code below:&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
import numpy as np

batchSize = 32
maxSeqLength = 150

def getSentenceMatrix(sentence):
    # Only row 0 is filled; the other rows stay zero so the shape matches the batch size.
    sentenceMatrix = np.zeros([batchSize, maxSeqLength], dtype=&#039;int32&#039;)
    # cleanSentences and wordsList are defined elsewhere (not shown in this post).
    cleanedSentence = cleanSentences(sentence)
    # Keep only the first 150 words (maxSeqLength); the rest is discarded.
    cleanedSentence = &#039; &#039;.join(cleanedSentence.split()[:150])
    split = cleanedSentence.split()
    for indexCounter, word in enumerate(split):
        try:
            sentenceMatrix[0, indexCounter] = wordsList.index(word)
        except ValueError:
            sentenceMatrix[0, indexCounter] = 399999  # Vector for unknown words
    return sentenceMatrix

input_text = &quot;example data&quot;
inputMatrix = getSentenceMatrix(input_text)&lt;/pre&gt;

&lt;p&gt;&lt;br&gt;
&lt;br&gt;
In the code, I&#039;m truncating my input text to 150 words and ignoring the remaining data. Because of this, my predictions are wrong.&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
cleanedSentence = &#039; &#039;.join(cleanedSentence.split()[:150]) &lt;/pre&gt;

&lt;p&gt;&lt;br&gt;
I know that if the input is shorter than the sequence length, we can pad it with zeros. What should we do if the input is longer? Can you suggest the best way to handle this? Thanks in advance.&lt;/p&gt;</description>
<category>General</category>
<guid isPermaLink="true">https://ask.ghassem.com/561/passing-variable-length-sentences-to-tensorflow-lstm</guid>
<pubDate>Mon, 11 Feb 2019 05:06:27 +0000</pubDate>
</item>
<item>
<title>What is the difference between a batch and an epoch in a Neural Network?</title>
<link>https://ask.ghassem.com/497/what-the-difference-between-batch-and-epoch-neural-network</link>
<description>Both the batch size and the number of epochs are integer values, and they seem to do the same thing in stochastic gradient descent. What are these two hyper-parameters of this learning algorithm, and how do they differ?</description>
<category>Machine Learning Interview Questions</category>
<guid isPermaLink="true">https://ask.ghassem.com/497/what-the-difference-between-batch-and-epoch-neural-network</guid>
<pubDate>Tue, 30 Oct 2018 14:45:56 +0000</pubDate>
</item>
<item>
<title>What is the difference between machine learning and deep learning?</title>
<link>https://ask.ghassem.com/485/what-the-difference-between-machine-learning-deep-learning</link>
<description></description>
<category>Deep Learning Interview Questions</category>
<guid isPermaLink="true">https://ask.ghassem.com/485/what-the-difference-between-machine-learning-deep-learning</guid>
<pubDate>Tue, 30 Oct 2018 11:29:38 +0000</pubDate>
</item>
<item>
<title>Using Tensorflow.DNNClassifier, getting Error: assertion failed: [Labels must &gt;= 0]</title>
<link>https://ask.ghassem.com/440/tensorflow-dnnclassifier-getting-assertion-failed-labels</link>
<description>&lt;p&gt;Hi All,&lt;/p&gt;

&lt;p&gt;I am writing a simple program using TensorFlow and DNNClassifier. Each training sample consists of 9 pixels with four spectral bands, i.e. 4*9=36 features, and each data point is mapped to a class (from 1 to 7).&lt;/p&gt;

&lt;p&gt;The last value in each line is the class label.&lt;/p&gt;

&lt;p&gt;A line of data-point is like this:&lt;/p&gt;

&lt;pre&gt;
67,75,77,62,67,79,81,62,75,87,89,71,66,79,88,63,66,79,84,63,66,79,80,59,67,84,86,68,71,84,86,64,67,81,82,64,7&lt;/pre&gt;

&lt;p&gt;But I got the error below:&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
InvalidArgumentError (see above for traceback): assertion failed: [Labels must &amp;gt;= 0] [Condition x &amp;gt;= 0 did not hold element-wise:] [x (dnn/head/labels:0) = ] [[3][3][3]...]&lt;/pre&gt;

&lt;p&gt;I am sure there is no datapoint&amp;nbsp;which has a label&amp;nbsp;less than 0. Would you please advise?&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
import numpy as np

import pandas as pd

import tensorflow as tf

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedShuffleSplit

print(&#039;** DNN Classification *******************************************************&#039;)

landsatData = pd.read_csv(&quot;./resources/landsat/lantsat.1.csv&quot;)

landsatData.describe()

X_landSatAllFeatures = landsatData.iloc[:, np.arange(36)].copy()

y_midPixelAsTarget = landsatData.iloc[:, 36].copy()

# Testing and training sentences splitting (stratified + shuffled) based on the index (sentence ID)
allFeaturesIndexes = X_landSatAllFeatures.index
targetData = y_midPixelAsTarget
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.3, random_state=42)

for train_index, test_index in sss.split(allFeaturesIndexes, targetData):
    train_ind, test_ind = allFeaturesIndexes[train_index], allFeaturesIndexes[test_index]

Test_Matrix = X_landSatAllFeatures.loc[test_ind]
Test_Target_Matrix = y_midPixelAsTarget.loc[test_ind]
Train_Matrix = X_landSatAllFeatures.loc[train_ind]
Train_Target_Matrix = y_midPixelAsTarget.loc[train_ind]

scaler = StandardScaler().fit(Train_Matrix)
Train_Matrix, Test_Matrix = scaler.transform(Train_Matrix), scaler.transform(Test_Matrix)

def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

X_train = Train_Matrix
y_train = Train_Target_Matrix
X_test = Test_Matrix
y_test = Test_Target_Matrix

xx, yy = Train_Matrix.shape
#training phase
feature_cols = [tf.feature_column.numeric_column(&quot;X&quot;, shape=[36])]
dnn_clf = tf.estimator.DNNClassifier(hidden_units=[300,100], n_classes=8, feature_columns=feature_cols)
# dnn_clf = tf.estimator.DNNClassifier(hidden_units=[300,100], n_classes=10)


input_fn = tf.estimator.inputs.numpy_input_fn(
    x={&quot;X&quot;: X_train}, y=y_train, num_epochs=40, batch_size=64, shuffle=True)
dnn_clf.train(input_fn=input_fn)

#testing phase
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={&quot;X&quot;: X_test}, y=y_test, shuffle=False)
eval_results = dnn_clf.evaluate(input_fn=test_input_fn)
print(&quot;The prediction result is : {0:.2f}%&quot;.format(100*eval_results[&#039;accuracy&#039;]))
y_pred_iter = dnn_clf.predict(input_fn=test_input_fn)
y_pred = list(y_pred_iter)
y_pred[0]


print(&#039;**********************************************************************************&#039;)&lt;/pre&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/440/tensorflow-dnnclassifier-getting-assertion-failed-labels</guid>
<pubDate>Wed, 24 Oct 2018 03:12:33 +0000</pubDate>
</item>
<item>
<title>What are the main branches of Deep Learning algorithms?</title>
<link>https://ask.ghassem.com/391/what-are-the-main-branches-of-deep-learning-algorithms</link>
<description></description>
<category>Deep Learning Interview Questions</category>
<guid isPermaLink="true">https://ask.ghassem.com/391/what-are-the-main-branches-of-deep-learning-algorithms</guid>
<pubDate>Mon, 15 Oct 2018 02:52:59 +0000</pubDate>
</item>
<item>
<title>Why do we need big data to train Deep Neural Networks?</title>
<link>https://ask.ghassem.com/300/why-do-we-need-big-data-to-train-deep-neural-networks</link>
<description></description>
<category>Machine Learning Interview Questions</category>
<guid isPermaLink="true">https://ask.ghassem.com/300/why-do-we-need-big-data-to-train-deep-neural-networks</guid>
<pubDate>Mon, 08 Oct 2018 12:15:46 +0000</pubDate>
</item>
<item>
<title>Why should we use Machine learning instead of deep learning?</title>
<link>https://ask.ghassem.com/291/why-should-we-use-machine-learning-instead-of-deep-learning</link>
<description>I am wondering why we should use machine learning instead of deep learning. We know that deep learning is very powerful. Anything a machine learning algorithm can do, deep learning could achieve as well.&lt;br /&gt;
&lt;br /&gt;
Plus, with deep learning we don&amp;#039;t have to worry about feature extraction, data cleaning, etc.&lt;br /&gt;
&lt;br /&gt;
So why should we use machine learning algorithms instead of deep learning?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/291/why-should-we-use-machine-learning-instead-of-deep-learning</guid>
<pubDate>Mon, 08 Oct 2018 03:54:29 +0000</pubDate>
</item>
<item>
<title>What are the best resources for studying Deep Learning?</title>
<link>https://ask.ghassem.com/2/what-are-the-best-resources-for-studying-deep-learning</link>
<description>I am wondering if anyone can suggest the best resources for studying Deep Learning?</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/2/what-are-the-best-resources-for-studying-deep-learning</guid>
<pubDate>Sun, 26 Aug 2018 07:43:30 +0000</pubDate>
</item>
</channel>
</rss>