Ask Ghassem - Recent questions tagged ml-exercise

How to calculate feed-forward (forward-propagation) in neural network for classification?

Wed, 02 Oct 2024 14:47:26 +0000

For the following neural network, calculate accuracy of classification, given these settings

How to calculate the residual errors, (MSE),(MAE), and (RMSE)?

Fri, 27 Jan 2023 04:09:28 +0000

Given the following sample dataset with 5 samples and 2 features:

Sample	Feature 1	Feature 2	Actual Value	Predicted Value
1	2	3	4	6
2	3	4	5	6
3	4	5	6	7
4	5	6	7	8
5	6	7	8	9

Calculate the residual errors, mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE) using a sample model.
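
As a minimal sketch of the calculation, the snippet below computes the residuals and the three metrics for the table above. It assumes the residual is defined as actual minus predicted (the opposite sign convention is also common); the variable names are illustrative only.

```python
import math

actual = [4, 5, 6, 7, 8]
predicted = [6, 6, 7, 8, 9]

# Assumed convention: residual = actual - predicted
residuals = [a - p for a, p in zip(actual, predicted)]
n = len(residuals)

mse = sum(r ** 2 for r in residuals) / n    # mean squared error
mae = sum(abs(r) for r in residuals) / n    # mean absolute error
rmse = math.sqrt(mse)                       # root mean squared error

print(residuals, mse, mae, rmse)
```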

How to create a Decision Tree using the ID3 algorithm?

Wed, 01 Dec 2021 11:26:02 +0000

NASA wants to be able to discriminate between Martians (M) and Humans (H) based on the
following characteristics: Green ∈{N, Y }, Legs ∈{2, 3}, Height ∈{S, T}, Smelly ∈{N, Y }.
Our available training data is as follows:

https://i.imgur.com/3bC391L.png

a) Greedily learn a decision tree using the ID3 algorithm and draw the tree.
b) Write the learned concept for Martian as a set of conjunctive rules (e.g., if (green=Y
and legs=2 and height=T and smelly=N), then Martian; else if ... then Martian; ...; else
Human).

How to update the weights in the backpropagation algorithm when the activation function is not linear?

Mon, 10 Aug 2020 21:55:19 +0000

The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.

Assume for the following neural network, inputs = [$i_1,i_2$] = [0.05, 0.10], we want the neural network to output = [$o_1$,$o_2$] = [0.01, 0.99], and for learning rate, $\alpha=0.5$.
In addition, the activation function for the hidden layer (both $h_1$ and $h_2$) is sigmoid (logistic):

$S(x)=\frac{1}{1+e^{-x}}$

https://i.imgur.com/cnY5feu.png

Hint:
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$

$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$

a) Show step by step solution to calculate weights $w_1$ to $w_8$ after one update in table below.
b) Calculate initial error and error after one update (assume biases $[b_1,b_2]$ are not changing during the updates).

Updating weights in backpropagation algorithm
Weights	Initialization	New weights after one step
$w1$	0.15	?
$w2$	0.20	?
$w3$	0.25	?
$w4$	0.30	?
$w5$	0.40	?
$w6$	0.45	?
$w7$	0.50	?
$w8$	0.55	?

How to calculate the class probabilities and classify using Naive Bayes classifier?

Mon, 10 Aug 2020 21:26:28 +0000

We have data on 1000 pieces of fruit. The fruit being a Banana, Orange or some Other fruit and imagine we know 3 features of each fruit, whether it’s long or not, sweet or not and yellow or not, as displayed in the table below:

https://i.imgur.com/gOFzVXL.png

A piece of unknown fruit with the following features is provided: Long, Sweet and Yellow.

Calculate probability of each of these 3 classes based on Naive Bayes Classification algorithm and report the class.

How to calculate residual errors for linear regression and interpret regression metrics?

Tue, 18 Feb 2020 18:30:51 +0000

Assuming we have a linear regression equation and some data points (sample), how can we calculate residual error for each data point, and total cost based on the metrics such as MAE, MSE, RMSE, MAPE, or MPE if we have their formula?
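
One possible way to organize the calculation is sketched below. The regression line ($\hat{y} = 2x + 1$) and the three data points are hypothetical, chosen only for illustration, and the residual is assumed to be actual minus predicted. MAPE and MPE are expressed here as percentages of the actual value, which is one common convention.

```python
import math

def regression_metrics(xs, ys, slope, intercept):
    """Residuals and error metrics for the line y_hat = slope * x + intercept."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    n = len(residuals)
    mse = sum(r ** 2 for r in residuals) / n
    return {
        "residuals": residuals,
        "MAE": sum(abs(r) for r in residuals) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # Percentage errors are undefined when an actual value is 0
        "MAPE": 100 / n * sum(abs(r / y) for r, y in zip(residuals, ys)),
        "MPE": 100 / n * sum(r / y for r, y in zip(residuals, ys)),
    }

# Hypothetical sample and model, for illustration only
metrics = regression_metrics(xs=[1, 2, 3], ys=[4, 5, 6], slope=2, intercept=1)
print(metrics)
```

MPE keeps the sign of each residual, so positive and negative errors can cancel; MAPE does not.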

How to calculate the probability and accuracy of a Logistic Regression classifier?

Mon, 03 Feb 2020 20:31:49 +0000

How to solve this problem?

https://i.imgur.com/8urywpf.jpg

Q1) Complete the ? sections

Q2) Accuracy of system if threshold = 0.5?

Q3) Accuracy of system if threshold = 0.95?

How to calculate Accuracy, Precision, Recall or F1?

Mon, 27 Jan 2020 19:22:26 +0000

In the following example, calculate Accuracy, Precision, Recall, and F1.

https://i.imgur.com/OezFpqC.png
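
The confusion matrix for this question is only available in the image, so the sketch below uses hypothetical counts (TP, FP, FN, TN) to show how the four metrics follow from them. The function name and the numbers are assumptions for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    No guard against zero denominators: assumes tp + fp > 0 and tp + fn > 0.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts, not taken from the exercise
accuracy, precision, recall, f1 = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(accuracy, precision, recall, f1)
```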

How to perform a classification or regression using k-NN?

Thu, 27 Jun 2019 02:54:42 +0000

Suppose you are given the following dataset, where x and y are the 2 features and the color (Red or Blue) is the target variable.

a) A new data point $x=1$ and $y=1$ is given. Using Euclidean distance in 3-NN, what do you predict as the color for this data point?

Dataset
x	y	Color
-1	1	Red
0	1	Blue
0	2	Red
1	-1	Red
1	0	Blue
1	2	Blue
2	2	Red
2	3	Blue

b) Now assume we have the following dataset and the target value is the price. A new data point $x=1$ and $y=1$ is given. Using Euclidean distance in 3-NN, what would be the estimated price?

Dataset
x	y	Price
-1	1	$100
0	1	$50
0	2	$20
1	-1	$40
1	0	$30
1	2	$40
2	2	$70
2	3	$30
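
A minimal sketch of both parts, using the two datasets above. It assumes a majority vote for classification and an unweighted mean of the neighbors' prices for regression (a distance-weighted mean would be another reasonable choice). `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

points = [(-1, 1), (0, 1), (0, 2), (1, -1), (1, 0), (1, 2), (2, 2), (2, 3)]
colors = ["Red", "Blue", "Red", "Red", "Blue", "Blue", "Red", "Blue"]
prices = [100, 50, 20, 40, 30, 40, 70, 30]

def k_nearest(query, k=3):
    """Indices of the k points closest to query by Euclidean distance."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return order[:k]

neighbors = k_nearest((1, 1))

# a) Classification: majority vote among the 3 neighbors
predicted_color = Counter(colors[i] for i in neighbors).most_common(1)[0][0]

# b) Regression: unweighted mean of the 3 neighbors' prices
predicted_price = sum(prices[i] for i in neighbors) / len(neighbors)

print([points[i] for i in neighbors], predicted_color, predicted_price)
```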

How to calculate k-means clustering with a numerical example?

Thu, 27 Jun 2019 02:16:32 +0000

Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters:

$A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9)$.

Suppose that the initial seeds (centers of each cluster) are $A1$, $A4$ and $A7$. Run the k-means algorithm for 1 epoch only. At the end of this epoch show:

a) The new clusters (i.e. the examples belonging to each cluster)

b) The centers of the new clusters

c) Draw a 10 by 10 space with all the 8 points and show the clusters after the first epoch and the new centroids.

d) How many more iterations are needed to converge? Draw the result for each epoch
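
The first epoch (parts a and b) can be sketched as one assignment step followed by one update step. The helper names `assign` and `update` are illustrative, and the sketch assumes no cluster ends up empty.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centers = [points["A1"], points["A4"], points["A7"]]  # initial seeds

def assign(points, centers):
    """Assign each point to the index of its nearest center."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[nearest].append(name)
    return clusters

def update(points, clusters):
    """New center of each cluster = mean of its members (assumes none is empty)."""
    new_centers = []
    for members in clusters:
        xs = [points[m][0] for m in members]
        ys = [points[m][1] for m in members]
        new_centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return new_centers

clusters = assign(points, centers)
new_centers = update(points, clusters)
print(clusters)
print(new_centers)
```

Repeating `assign` and `update` until the clusters stop changing gives the number of further epochs asked for in part d.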

How to calculate the class probabilities and classify using Naive Bayes classifier for NLP?

Wed, 26 Jun 2019 19:43:41 +0000

We want to use Naive Bayes for tagging documents. It is a classification task in which we want to assign a class (tag) to each string. We currently have two tags: Sports and Not sports.

Which tag does the sentence "A very close game" belong to? Using a Naive Bayes classifier, calculate the class probability for Sports and Not sports for this sentence based on the dataset, and decide on the tag.

Text	Tag
“A great game”	Sports
“The election was over”	Not sports
“Very clean match”	Sports
“A clean but forgettable game”	Sports
“It was a close election”	Not sports
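
A minimal sketch of the calculation follows. The question does not say how unseen words such as "close" in the Sports class should be handled, so this sketch assumes Laplace (add-one) smoothing, lowercased tokens, and whitespace tokenization. The scores are the prior times the word likelihoods, so they are proportional to the class probabilities rather than normalized.

```python
from collections import Counter

training = [
    ("A great game", "Sports"),
    ("The election was over", "Not sports"),
    ("Very clean match", "Sports"),
    ("A clean but forgettable game", "Sports"),
    ("It was a close election", "Not sports"),
]

def tokenize(text):
    return text.lower().split()

word_counts = {}          # tag -> Counter of word frequencies
doc_counts = Counter()    # tag -> number of training sentences
for text, tag in training:
    doc_counts[tag] += 1
    word_counts.setdefault(tag, Counter()).update(tokenize(text))

vocabulary = {w for counts in word_counts.values() for w in counts}

def score(sentence, tag):
    """Prior * product of Laplace-smoothed word likelihoods (unnormalized)."""
    total_words = sum(word_counts[tag].values())
    prob = doc_counts[tag] / len(training)
    for w in tokenize(sentence):
        prob *= (word_counts[tag][w] + 1) / (total_words + len(vocabulary))
    return prob

scores = {tag: score("A very close game", tag) for tag in doc_counts}
predicted_tag = max(scores, key=scores.get)
print(scores, predicted_tag)
```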

How to calculate Covariance Matrix and Principal Components for PCA?

Wed, 26 Jun 2019 10:40:02 +0000

The dataset with two features $(x,y)$ is shown as follows (note $y$ in this example is the second feature, not a target value):

x	y
2.5	2.4
0.5	0.7
2.2	2.9
1.9	2.2
3.1	3.0
2.3	2.7
2.0	1.6
1.0	1.1
1.5	1.6
1.1	0.9

a) Calculate the Covariance Matrix.
b) Calculate eigenvalues and eigenvectors
c) Calculate all the PCs
d) What percentage of the total variance in the dataset is explained by each PC?
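
The four parts can be sketched in plain Python for this two-feature case. The sketch assumes the sample covariance (dividing by $n-1$) and uses the closed-form eigenvalues of a symmetric $2\times2$ matrix; the eigenvector formula relies on the off-diagonal covariance being non-zero, and the sign of each eigenvector is arbitrary.

```python
import math

xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

def cov(a, mean_a, b, mean_b):
    """Sample covariance (divides by n - 1)."""
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

# a) Covariance matrix
cxx = cov(xs, mean_x, xs, mean_x)
cxy = cov(xs, mean_x, ys, mean_y)
cyy = cov(ys, mean_y, ys, mean_y)
cov_matrix = [[cxx, cxy], [cxy, cyy]]

# b) Eigenvalues from the characteristic polynomial, largest first
trace, det = cxx + cyy, cxx * cyy - cxy ** 2
disc = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = [(trace + disc) / 2, (trace - disc) / 2]

def eigenvector(lam):
    """Unit eigenvector for eigenvalue lam (valid because cxy != 0)."""
    vx, vy = cxy, lam - cxx
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

eigenvectors = [eigenvector(lam) for lam in eigenvalues]

# c) Principal components: centered data projected onto each eigenvector
pcs = [[(x - mean_x) * v[0] + (y - mean_y) * v[1] for x, y in zip(xs, ys)]
       for v in eigenvectors]

# d) Fraction of total variance explained by each PC
explained = [lam / trace for lam in eigenvalues]
print(cov_matrix, eigenvalues, eigenvectors, explained)
```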

How to calculate convolutions on a CONV layer for a Convolutional Neural Network?

Wed, 26 Jun 2019 08:54:12 +0000

Assume we have a $5\times5$ px RGB image with 3 channels (R, G, and B) as follows:

R
2	0	0	0	0
1	2	0	0	1
2	0	1	0	2
1	2	1	0	1
0	1	0	2	0

G
0	2	1	2	2
1	1	1	0	0
0	0	2	2	0
2	0	0	2	0
0	2	1	1	1

B
0	1	0	0	1
1	1	2	0	1
1	0	2	0	2
1	0	1	1	0
1	2	1	1	2

We have one $3\times3$ px kernel (filter) with 3 channels as follows:

Filter - R
0	0	1
1	0	1
1	0	0

Filter - G
0	0	-1
1	0	0
1	-1	0

Filter - B
1	0	1
0	1	-1
1	-1	0

a) If Stride = 2, and Zero-padding = 1, and Bias = 1, what will be the result of convolution?

b) What is the result after applying a ReLU layer ($max(z,0)$) on the result of part a (the output has the same size as the result of part a)?

c) Calculate the output by applying a max-pooling layer with the size of $2\times2$ and Stride = 1 on the output of part b. (Hint: the max-pooling layer here, as is usual, does not include any zero-padding.)

d) What is the result after applying flatten on the output of part c and creating a vector?

e) Assume the vector you created contains m elements. Consider it as the input vector for a Softmax Regression classifier (fully connected, without any hidden layers or biases). Assume there are 2 classes, 0 and 1. For all the weights from each element in the feature vector, the optimized weights are 1 for odd elements and 2 for even elements. For example, if the feature vector is [10,11,12,13,14], all the weights from 10 are 1 (because 10 is element 1 and 1 is odd), all the weights from 11 are 2, all the weights from 12 are 1, all the weights from 13 are 2, all the weights from 14 are 1, and so on. Draw the Softmax Regression network and determine whether the predicted class is 0 or 1.

Hint:
Softmax Regression: $p_{i}=\frac{e^{z_{i}}}{\sum_{j=1}^{c} e^{z_{j}}}$
Where $p_{i}$ is the probability of class $i$ and $c$ is the number of classes.
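
Parts a to d can be sketched in plain Python as below. The sketch assumes the usual CNN convention in which "convolution" means sliding the kernel without flipping it (cross-correlation), summing over the three channels and adding the bias once per output cell. The function names are illustrative.

```python
image = {
    "R": [[2, 0, 0, 0, 0], [1, 2, 0, 0, 1], [2, 0, 1, 0, 2], [1, 2, 1, 0, 1], [0, 1, 0, 2, 0]],
    "G": [[0, 2, 1, 2, 2], [1, 1, 1, 0, 0], [0, 0, 2, 2, 0], [2, 0, 0, 2, 0], [0, 2, 1, 1, 1]],
    "B": [[0, 1, 0, 0, 1], [1, 1, 2, 0, 1], [1, 0, 2, 0, 2], [1, 0, 1, 1, 0], [1, 2, 1, 1, 2]],
}
kernel = {
    "R": [[0, 0, 1], [1, 0, 1], [1, 0, 0]],
    "G": [[0, 0, -1], [1, 0, 0], [1, -1, 0]],
    "B": [[1, 0, 1], [0, 1, -1], [1, -1, 0]],
}

def pad(channel, p):
    """Surround a square channel with p rows/columns of zeros."""
    width = len(channel[0]) + 2 * p
    rows = [[0] * width for _ in range(p)]
    rows += [[0] * p + row + [0] * p for row in channel]
    rows += [[0] * width for _ in range(p)]
    return rows

def conv2d(image, kernel, stride, padding, bias):
    """Multi-channel cross-correlation (no kernel flip), one output channel."""
    padded = {c: pad(image[c], padding) for c in image}
    k = len(kernel["R"])
    out_size = (len(padded["R"]) - k) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = bias
            for c in image:
                for a in range(k):
                    for b in range(k):
                        total += padded[c][i * stride + a][j * stride + b] * kernel[c][a][b]
            row.append(total)
        out.append(row)
    return out

def relu(m):
    return [[max(v, 0) for v in row] for row in m]

def max_pool(m, size, stride):
    n = (len(m) - size) // stride + 1
    return [[max(m[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(n)] for i in range(n)]

conv_out = conv2d(image, kernel, stride=2, padding=1, bias=1)  # part a
relu_out = relu(conv_out)                                      # part b
pool_out = max_pool(relu_out, size=2, stride=1)                # part c
flat = [v for row in pool_out for v in row]                    # part d
print(conv_out, relu_out, pool_out, flat)
```

With a $5\times5$ input, padding 1, a $3\times3$ kernel and stride 2, the convolution output is $3\times3$; the $2\times2$ pooling with stride 1 then gives $2\times2$, so the flattened vector has 4 elements.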

How to optimize weights in Logistic Regression?

Wed, 05 Jun 2019 17:38:50 +0000

The hypothesis (model) of Logistic Regression, which is a binary classifier ($y \in \{0,1\}$), is given in the equation below:

Hypothesis

$S(z)=P(y=1 | x)=h_{\theta}(x)=\frac{1}{1+\exp \left(-\theta^{\top} x\right)}$

This calculates the probability of Class 1. By setting a threshold (such as $h_{\theta}(x) > 0.5$), we can classify a sample as 1 or 0.

Cost function

The cost function for Logistic Regression is defined as below. It is called binary cross entropy loss function:

$J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right)$

Iterative updates

Assume we start all the model parameters with some initial value (in this case the only model parameters we have are $\theta_j$, and we assume all of them are initialized with 1: $\theta_j = 1$ for $j=\{0,1,...,n\}$, where $n$ is the number of features). Each parameter is then updated as follows:

$\theta_{j_{new}} \leftarrow \theta_{j_{old}}+\alpha \times \frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}-\sigma\left(\theta_{old}^{\top} x^{(i)}\right)\right] x_{j}^{(i)}$

Where:
$m =$ number of rows in the training batch
$x^{(i)} = $ the feature vector for sample $i$
$\theta_j = $ the coefficient corresponding to feature $j$ ($\theta$ is the vector of all coefficients)
$y^{(i)} = $ actual class label for sample $i$ in the training batch
$x_{j}^{(i)} = $ the element (column) $j$ in the feature vector for sample $i$
$\alpha =$ the learning rate

Dataset

The training dataset of pass/fail in an exam for 5 students is given in the table below:

If we initialize all the model parameters with 1 (all $\theta_j = 1$), and the learning rate is $\alpha = 0.1$, and if we use batch gradient descent, what will be the:

$a)$ Accuracy of the model on the training set at initialization ($\text{accuracy} = \frac{\text{number of correct classifications}}{\text{all classifications}}$)?
$b)$ Cost at initialization?
$c)$ Cost after 1 epoch?
$d)$ Repeat all $a,b,c$ steps if we use mini-batch gradient descent and $\text{batch size} = 2$

(Hint: For $x_{j}^{(i)}$ when $j=0$ we have $x_{0}^{(i)} = 1$ for all $i$ )
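
The training table is not reproduced here, so the sketch below uses a small hypothetical dataset (a bias column $x_0 = 1$ plus one feature) purely to illustrate the cost function and one batch update with the rule above. All names and data values are assumptions.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def predict(theta, x):
    """h_theta(x) = sigmoid(theta^T x)."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def cost(theta, X, y):
    """Binary cross-entropy averaged over the batch."""
    m = len(X)
    return -sum(
        yi * math.log(predict(theta, xi)) + (1 - yi) * math.log(1 - predict(theta, xi))
        for xi, yi in zip(X, y)
    ) / m

def gradient_step(theta, X, y, alpha):
    """One update of every theta_j using the whole batch (X, y)."""
    m = len(X)
    errors = [yi - predict(theta, xi) for xi, yi in zip(X, y)]
    return [
        t + alpha * sum(e * xi[j] for e, xi in zip(errors, X)) / m
        for j, t in enumerate(theta)
    ]

def accuracy(theta, X, y, threshold=0.5):
    hits = sum((predict(theta, xi) > threshold) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(X)

# Hypothetical data: first column is x_0 = 1, second column is the feature
X = [[1, 1], [1, 2], [1, 3], [1, 4]]
y = [0, 0, 1, 1]

theta = [1.0, 1.0]                      # all parameters initialized with 1
initial_accuracy = accuracy(theta, X, y)
initial_cost = cost(theta, X, y)
theta_after = gradient_step(theta, X, y, alpha=0.1)
cost_after = cost(theta_after, X, y)
print(initial_accuracy, initial_cost, theta_after, cost_after)
```

For mini-batch gradient descent, the same `gradient_step` can be called once per slice of the data (for example `X[0:2], y[0:2]`, then `X[2:4], y[2:4]`), carrying the updated `theta` from one slice to the next.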

How to update weights in backpropagation algorithm (a numerical example)?

Thu, 11 Apr 2019 17:02:04 +0000

Assume we have the following neural network and all activation functions are $f(z)=z$. If the weights are initialized with the values shown in the table below, what will be the new weights after one step if the learning rate is $\alpha = 0.05$?

Assume the input values are [$i_1$,$i_2$] = [2,3] and target value $out = 1$.

Hint:
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$

$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$

Updating weights in backpropagation algorithm
Weights	Initialization	New weights after one step
$w1$	0.11	?
$w2$	0.21	?
$w3$	0.12	?
$w4$	0.08	?
$w5$	0.14	?
$w6$	0.15	?

https://i.imgur.com/v0RMeOQ.png
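
The network wiring is only visible in the image, so the sketch below assumes one common layout: $w_1, w_2$ feed $h_1$ from $i_1, i_2$; $w_3, w_4$ feed $h_2$ from $i_1, i_2$; and $w_5, w_6$ connect $h_1, h_2$ to the output, with no biases. If the figure wires the weights differently, the gradients change accordingly.

```python
i1, i2 = 2.0, 3.0
target = 1.0
alpha = 0.05
w = {"w1": 0.11, "w2": 0.21, "w3": 0.12, "w4": 0.08, "w5": 0.14, "w6": 0.15}

# Forward pass with linear activations f(z) = z (assumed wiring, see above)
h1 = w["w1"] * i1 + w["w2"] * i2
h2 = w["w3"] * i1 + w["w4"] * i2
out = w["w5"] * h1 + w["w6"] * h2

# E = 1/2 (target - out)^2, so dE/d(out) = out - target
delta = out - target

# Chain rule for every weight
grads = {
    "w5": delta * h1,
    "w6": delta * h2,
    "w1": delta * w["w5"] * i1,
    "w2": delta * w["w5"] * i2,
    "w3": delta * w["w6"] * i1,
    "w4": delta * w["w6"] * i2,
}

# w_new = w_old - alpha * dE/dw, using the old weights in every gradient
new_w = {name: w[name] - alpha * grads[name] for name in w}
print(out, new_w)
```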

How to calculate univariate linear regression?

Thu, 11 Apr 2019 16:46:47 +0000

For the following dataset, calculate the regression equation $\hat{y} = ax+b$

dataset
x	y
1	42
3	50
10	75
16	100
26	150
36	200

How to calculate Softmax Regression probabilities in this example?

Thu, 04 Apr 2019 18:20:53 +0000

1) What will be the probability of an iris with petal length = 4.6 and petal width = 1.7 to be classified as Virginica?

2) What will be the probability of Virginica, if we use all features petal length = 4.6 and petal width = 1.7, sepal length = 5.5 and sepal width = 3.0 with the same weight initialization?

How to calculate feed-forward (forward-propagation) in neural network?

Thu, 04 Apr 2019 15:54:17 +0000

In the figure below, a neural network is shown. Calculate the following:

1) How many neurons do we have in the input layer and the output layer?

2) How many hidden layers do we have?

3) If all the weights are initialized with 1 ($w1=w2=w3=...=w19=1$), what is the output of this network after feed-forward for the sample shown in the figure (X = (x1,x2,x3) = (2,5,3) and y=10)? What is the error of the network ($\text { Error }=\frac{1}{2}(\hat{y}-y)^{2}$)? Assume the activation function for all neurons except the output neuron is $f(z)=z$.

4) If we change the activation function of all the neurons in the second hidden layer to Sigmoid ($S(x)=\frac{1}{1+e^{-x}}=\frac{e^{x}}{e^{x}+1}$), what would be the output of the network after this change? Calculate the error as well.

https://i.imgur.com/rtqPiRa.jpg

How to update weights using gradient descent algorithm?

Thu, 28 Mar 2019 17:17:39 +0000

For the neural network below, imagine we are going to use the backpropagation algorithm to update the weights. Assume the bias (b) in this problem is always 0 (ignore the bias when you solve the problem), the dataset has only one record with $x=2$ and target value $y=5$, as shown in the following table, and the activation function is defined as $f(z) = z$.

feature (x)	Target (y)
2	5

1) Define the cost function, $J(w)$, based on the error in backpropagation algorithm: $J(w) = E = \frac{1}{2}(predicted - target)^2$, and draw it

2) Initialize the weight by $w=3$, and calculate the error

3) Calculate the updated weights using the gradient descent algorithm after three updates for each of the following values of the learning rate ($\alpha$)

$\alpha$ = 1
$\alpha$ = 0.1
$\alpha$ = 0.5

Hint: $w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$

https://i.imgur.com/uohFS6l.png
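
The figure is not reproduced here, so this sketch assumes the simplest reading of the problem: a single weight with prediction $\hat{y} = w \cdot x$, which gives $J(w) = \frac{1}{2}(wx - y)^2$ and $\frac{\partial E}{\partial w} = (wx - y)\,x$. The function names are illustrative.

```python
x, y = 2.0, 5.0

def error(w):
    """J(w) = 1/2 (predicted - target)^2 with predicted = w * x."""
    return 0.5 * (w * x - y) ** 2

def gradient(w):
    """dJ/dw = (w * x - y) * x."""
    return (w * x - y) * x

def descend(w, alpha, steps=3):
    """Return the weight after each update, starting with the initial weight."""
    history = [w]
    for _ in range(steps):
        w = w - alpha * gradient(w)
        history.append(w)
    return history

print("initial error:", error(3.0))
for alpha in (1, 0.1, 0.5):
    print(alpha, descend(3.0, alpha))
```

Under this assumption each update multiplies the distance from the minimum at $w = 2.5$ by $(1 - 4\alpha)$, so comparing the three histories shows how the learning rate decides whether the updates approach the minimum, oscillate around it, or move away from it.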

How to calculate Softmax Regression probabilities?

Thu, 21 Mar 2019 16:11:09 +0000

The scatter plot of the Iris Dataset is shown in the figure below. Assume Softmax Regression is used to classify an iris as Setosa, Versicolor, or Virginica using just petal length and petal width. If all the weights required for Softmax Regression are initialized to 0.5 and the network includes bias nodes:

1) Write the weight vectors and equations for calculating the class probabilities.

2) We have a new iris and we have measured petal length = 4.5 and petal width = 1.6. Using the above initial model, what would be the result of classification?

3) If we change all the weights related to the class blue to 1 and keep all other weights 0.5, what will be the predicted class?
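
A minimal sketch of the probability calculation follows. Which species is plotted in blue is only visible in the figure, so class index 0 stands in for "the blue class" here; the sketch also assumes that "all the weights related to the class" includes that class's bias weight.

```python
import math

def softmax(zs):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(weights, features):
    """weights: one [bias, w_petal_length, w_petal_width] list per class."""
    x = [1.0] + list(features)          # prepend 1 for the bias node
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in weights]
    return softmax(scores)

iris = (4.5, 1.6)  # petal length, petal width

# Part 2: every weight (including biases) is 0.5
initial_weights = [[0.5, 0.5, 0.5] for _ in range(3)]
probs_initial = class_probabilities(initial_weights, iris)

# Part 3: weights of the "blue" class (assumed to be index 0) set to 1
changed_weights = [[1.0, 1.0, 1.0], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
probs_changed = class_probabilities(changed_weights, iris)

print(probs_initial, probs_changed)
```

With identical weights for every class the scores are equal, so the initial model cannot prefer any class over the others.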

How to calculate LogLoss in logistic regression?

Mon, 18 Mar 2019 20:34:40 +0000

The dataset of pass/fail in an exam for 5 students is given in the table below. If we use Logistic Regression as the classifier and assume the model suggested by the optimizer will become the following for Odds of passing a course:

$\log_e(Odds) = -64 + 2 \times hours$

1) How to calculate the loss of model for the student who studied 33 hours?

2) What is the total loss of the model given in equation below?

$Logloss = -\frac{1}{N} \sum_{i=1}^N(y_i\log_e(p_i) + (1 - y_i)\log_e(1 - p_i))$
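
The student table is not reproduced here, so the sketch below shows the per-sample loss for the student who studied 33 hours under both possible labels, plus a helper for the mean over any list of (hours, label) pairs. The clipping constant `EPS` is an assumption added to avoid taking $\log(0)$.

```python
import math

EPS = 1e-15  # assumed clipping value to keep log() finite

def pass_probability(hours):
    """p = sigmoid(log-odds) with log_e(Odds) = -64 + 2 * hours."""
    log_odds = -64 + 2 * hours
    return 1 / (1 + math.exp(-log_odds))

def sample_loss(y, p):
    """Log loss of one sample with label y (0 or 1) and predicted probability p."""
    p = min(max(p, EPS), 1 - EPS)
    return -math.log(p) if y == 1 else -math.log(1 - p)

def log_loss(samples):
    """Mean log loss over (hours, label) pairs."""
    return sum(sample_loss(y, pass_probability(h)) for h, y in samples) / len(samples)

p33 = pass_probability(33)
loss_if_passed = sample_loss(1, p33)   # if the actual label is Pass
loss_if_failed = sample_loss(0, p33)   # if the actual label is Fail
print(p33, loss_if_passed, loss_if_failed)
```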

How to calculate probability in Logistic Regression?

Mon, 18 Mar 2019 20:22:35 +0000

$\log (Odds) = -64 + 2 \times hours$

1) How to calculate the probability of Pass for the student who studied 33 hours?

2) What is the minimum number of hours the student should study to pass the course with a probability of more than 95%?
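
Both parts follow from the relation $p = \frac{1}{1+e^{-\log(Odds)}}$ and its inverse $\log(Odds) = \ln\frac{p}{1-p}$, assuming the logarithm in the model is the natural logarithm. A minimal sketch, with illustrative function names:

```python
import math

def pass_probability(hours):
    """Probability of passing from log(Odds) = -64 + 2 * hours."""
    log_odds = -64 + 2 * hours
    return 1 / (1 + math.exp(-log_odds))

def hours_for_probability(p):
    """Solve -64 + 2 * hours = ln(p / (1 - p)) for hours."""
    return (math.log(p / (1 - p)) + 64) / 2

p33 = pass_probability(33)
threshold_hours = hours_for_probability(0.95)
print(p33, threshold_hours)
```

Any number of hours above `threshold_hours` gives a pass probability above 95%.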
