Ask Ghassem - Recent questions tagged machine-learning

Step-by-Step Hidden State Calculation in a Recurrent Neural Network

Mon, 01 Dec 2025 18:32:24 +0000

Consider a simplified Recurrent Neural Network (RNN) with a single input and a single output. The hidden state is updated using the recurrence:

$$ h_t = \text{ReLU}(W_{ih} \cdot x_t + W_{hh} \cdot h_{t-1}) $$

Assume the following:

$ x_t = 3 $ for every time step
$ h_0 = 0 $
$ W_{ih} = 0.4 $
$ W_{hh} = 0.6 $
Activation function: ReLU

Compute the value of the hidden state $ h_4 $ at time $ t = 4 $.
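
The recurrence can be checked numerically. Below is a minimal Python sketch that uses only the values stated above; the helper names `relu` and `hidden_states` are illustrative choices, not part of the question.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)

def hidden_states(x, h0, w_ih, w_hh, steps):
    """Return [h_1, ..., h_steps] for a constant input x."""
    h = h0
    history = []
    for _ in range(steps):
        h = relu(w_ih * x + w_hh * h)
        history.append(h)
    return history

states = hidden_states(x=3.0, h0=0.0, w_ih=0.4, w_hh=0.6, steps=4)
for t, h in enumerate(states, start=1):
    print(f"h_{t} = {h:.4f}")
```

Every pre-activation is positive here, so ReLU never clips and the sequence is 1.2, 1.92, 2.352, 2.6112.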

How to calculate feed-forward (forward-propagation) in neural network for classification?

Wed, 02 Oct 2024 14:47:26 +0000

For the following neural network, calculate accuracy of classification, given these settings

When to one-hot encode a category and when to segment by category?

Wed, 22 Feb 2023 20:30:38 +0000

When preprocessing data for machine learning, is there any difference between one-hot encoding a categorical variable into numeric columns and segmenting the data by category, with a separate model for each segment? Say you run a multivariate regression model on data covering 5 cities. Would a single model with one indicator variable per city be better or worse than 5 city-specific models? Is there no difference? Or does it depend on certain factors and intuition?

How to calculate the residual errors, (MSE),(MAE), and (RMSE)?

Fri, 27 Jan 2023 04:09:28 +0000

Given the following sample dataset with 5 samples and 2 features:

Sample	Feature 1	Feature 2	Actual Value	Predicted Value
1	2	3	4	6
2	3	4	5	6
3	4	5	6	7
4	5	6	7	8
5	6	7	8	9

Calculate the residual errors, mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE) using a sample model.
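
As a rough check, the metrics can be computed directly from the Actual and Predicted columns. This sketch assumes the residual is defined as actual minus predicted (the opposite sign convention is also common); the two feature columns are not needed once predictions are given.

```python
import math

actual = [4, 5, 6, 7, 8]
predicted = [6, 6, 7, 8, 9]

# Residual convention assumed here: actual minus predicted.
residuals = [a - p for a, p in zip(actual, predicted)]
n = len(residuals)

mse = sum(r ** 2 for r in residuals) / n      # mean squared error
mae = sum(abs(r) for r in residuals) / n      # mean absolute error
rmse = math.sqrt(mse)                         # root mean squared error

print("residuals:", residuals)
print(f"MSE={mse:.4f} MAE={mae:.4f} RMSE={rmse:.4f}")
```

With this table the residuals are -2, -1, -1, -1, -1, giving MSE = 1.6, MAE = 1.2 and RMSE of about 1.2649.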

Creating tables from unstructured texts about stock market

Tue, 02 Aug 2022 00:47:49 +0000

I am trying to extract information such as profits and revenues, along with their corresponding dates and quarters, from unstructured text about the stock market and convert it into a report in table form. Because the input text has no fixed format, it is hard to know which entity belongs to which date and quarter, and which value belongs to which entity. Chunking works on a few documents, but not on enough of them. Is there an unsupervised way to link entities with their corresponding dates, values and quarters?

Kmeans clustering in python - Giving original labels to predicted clusters

Wed, 27 Apr 2022 05:32:54 +0000

I have a dataset with 7 labels in the target variable.

import numpy as np
from sklearn.cluster import KMeans

X = data.drop('target', axis=1)
Y = data['target']
Y.unique()

array(['Normal_Weight', 'Overweight_Level_I', 'Overweight_Level_II',
'Obesity_Type_I', 'Insufficient_Weight', 'Obesity_Type_II',
'Obesity_Type_III'], dtype=object)

km = KMeans(n_clusters=7, init="k-means++", random_state=300)
km.fit_predict(X)
np.unique(km.labels_)

array([0, 1, 2, 3, 4, 5, 6])

After running the KMeans clustering algorithm with 7 clusters, the resulting clusters are labeled 0, 1, 2, 3, 4, 5, 6. But how do I know which real label matches which predicted label?

In other words, I want to know how to give the original label names to the new predicted labels, so that they can be compared, for example to count how many values are clustered correctly (accuracy).
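
One common way to attach label names to clusters is a majority vote: each cluster takes the true label that occurs most often among its members. The sketch below assumes that approach and uses hypothetical toy lists in place of `Y` and `km.labels_`; the helper name `map_clusters_to_labels` is made up for illustration.

```python
from collections import Counter, defaultdict

def map_clusters_to_labels(true_labels, cluster_ids):
    """Map each cluster id to the most frequent true label inside it."""
    members = defaultdict(list)
    for label, cluster in zip(true_labels, cluster_ids):
        members[cluster].append(label)
    return {c: Counter(labels).most_common(1)[0][0]
            for c, labels in members.items()}

# Hypothetical toy data standing in for Y and km.labels_.
true_labels = ["Normal_Weight", "Normal_Weight", "Obesity_Type_I",
               "Obesity_Type_I", "Obesity_Type_I", "Normal_Weight"]
cluster_ids = [2, 2, 0, 0, 2, 2]

mapping = map_clusters_to_labels(true_labels, cluster_ids)
predicted = [mapping[c] for c in cluster_ids]
accuracy = sum(p == t for p, t in zip(predicted, true_labels)) / len(true_labels)
print(mapping, accuracy)
```

A majority vote may assign the same label to several clusters. If a strict one-to-one matching is wanted, an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the cluster-versus-label count matrix is a possible alternative.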

How to output F1-score instead of accuracy

Sat, 02 Apr 2022 13:04:21 +0000

I have the code below, outputting the accuracy. How can I output the F1-score instead? Thanks in advance,

clf.fit(data_train, target_train)
preds = clf.predict(data_test)
# accuracy for the current fold only (for a classifier, score() returns mean accuracy)
accuracy = clf.score(data_test, target_test)
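
In scikit-learn, `sklearn.metrics.f1_score(target_test, preds)` returns the F1-score for binary labels (multiclass targets additionally need an `average=` argument). To show what that number means, here is a dependency-free sketch of the binary F1 computation; the lists `y_true` and `y_pred` are hypothetical stand-ins for `target_test` and `preds`.

```python
def f1_binary(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels standing in for target_test and preds.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_binary(y_true, y_pred)
print(f"F1 = {score:.3f}")
```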

I cannot get this code to work. Please help.

Mon, 21 Mar 2022 05:59:53 +0000

from keras.models import Sequential
from keras.layers import Dense, LSTM
from sklearn.model_selection import train_test_split

# One LSTM layer with 10 units, then a single linear output for regression.
model = Sequential()
model.add(LSTM(10, input_shape=(1, 1)))
model.add(Dense(1, activation="linear"))
model.compile(loss="mse", optimizer="adam")

# get_data() is not shown in the question; X is presumably shaped (samples, 1, 1).
X, y = get_data()

# Fixed: the original passed lowercase `x`, which is undefined (NameError).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
X_train_2, X_val, y_train_2, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=1)

# Train on the split that excludes the validation rows.
model.fit(X_train_2, y_train_2, epochs=800, validation_data=(X_val, y_val), shuffle=False)

When dealing with categorical values, should the 'year' column be encoded using OHE or OrdinalEncoder?

Sat, 18 Dec 2021 18:46:07 +0000

It's a car prices dataset, so I'm assuming that the more recent the car, the more value it should have. The values in the 'year' column simply consist of years from 1995 to 2020.
I am trying to predict the selling price of the car.

I'm a bit new to ML, currently still doing my undergraduate so any help / tips are appreciated. Thank you.

How do I know which encoder to use to convert from categorical variables to numerical?

Mon, 29 Nov 2021 04:09:06 +0000

So say I have a column with categorical data like different styles of temperature: 'Lukewarm', 'Hot', 'Scalding', 'Cold', 'Frostbite',... etc.

I know that we can use pd.get_dummies to convert the column to numerical data within the dataframe, but I also know that there are other 'converters' (not sure if that's the correct terminology) that we can use, e.g. OneHotEncoder from scikit-learn (I could use the pipeline module to make a nice pipeline and feed my dataframe through it to also get my categorical data encoded to numerical).

How do I know which to use? Does it matter? If it does matter, when does it matter the most (i.e. what types of problems? When there are lots of categorical variables, or few?) If anyone can give me any pointers on this type of stuff I'd greatly appreciate it.
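
To make the two encoding families concrete, here is a small dependency-free sketch. It contrasts ordinal encoding (one integer per category, which only makes sense when the categories have a meaningful order) with one-hot encoding (one 0/1 column per category, which is what `pd.get_dummies` and `OneHotEncoder` produce). The temperature ordering and the helper names are assumptions made for illustration.

```python
def ordinal_encode(values, order):
    """Map each category to its rank in an explicit, meaningful order."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]

def one_hot_encode(values):
    """Return (sorted categories, one 0/1 indicator row per value)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

temps = ["Hot", "Cold", "Lukewarm", "Hot"]
# Assumed ordering, coldest to hottest.
order = ["Frostbite", "Cold", "Lukewarm", "Hot", "Scalding"]

print(ordinal_encode(temps, order))
print(one_hot_encode(temps))
```

The ordinal version keeps the "hotter is larger" relationship in a single column, while the one-hot version makes no assumption about order at the cost of one column per category.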

Can Data Science solve this problem?

Sun, 24 Oct 2021 15:43:11 +0000

I live in Brazil, and I have a college assignment for which I don't know what data science method to use, if any. My idea is the following: we Brazilians have the Real (BRL) as our currency, and of course we have the dollar exchange rate to see "how many Reais a dollar is worth". I want to research whether the national news has any influence on this rate. For example, if Bolsonaro, our president, says some dumb stuff, the dollar goes up in price, and vice versa. I want to collect all the dollar values and their variance over a set time interval, and use web scraping to collect the news from some economy sites. Here's my question: how can I correlate the news with the dollar variance over a set time? Can data science do that? How do I preprocess this, if at all? Do I need to use bag-of-words? At least, that is what I heard. Please help, and thank you for reading.

How many samples do we need to test image segmentation using synthetic data?

Mon, 21 Jun 2021 12:26:32 +0000

Hello,

I trained a CNN on synthetic data to perform a segmentation task on human faces. To evaluate the network's predictions at test time, I used 200 examples from the database to compute precision and recall.

Is this number sufficient, given that I control the data generator myself and build the database by randomly drawing the elements from centered Gaussian distributions?

Thank you,

Can we have multiple target values in a ML problem dataset for supervised learning?

Sun, 30 May 2021 16:31:48 +0000

Intermittent Mathematics (Logarim)

Wed, 05 May 2021 12:16:20 +0000

The keypad of an old telephone has 10 number keys. It allows the user to enter text by pressing a key several times in quick succession. You need to draw a graph that models entering text with this keypad, and then devise an algorithm that finds the length of the path needed to enter a given text.
Example:
aaa --> 6
aba --> 5

the link below shows the phone keypad

https://commons.wikimedia.org/wiki/File:Telephone-keypad.png

Very short text classification when category text should be replaced by another category text?

Thu, 11 Feb 2021 12:48:47 +0000

I need some tool to classify articles based on short category text which consists of two or three words separated by '-'. The RSS/XML tag content is for example:

Foreign - News

Football - Foreign

I created my own categories in DB and now I need to classify categories from parsed RSS of this news source, so it fits news categories defined by me.

For example, I would need all articles whose category contains "football" to be identified as category Sport, but sometimes the category XML tags are an exact match: Foreign - News should belong to the category I defined in the DB as Foreign.

So far I have only used trained decision tree frameworks, and only for another project, so I would like to hear advice about an AI-based approach, technique or particular framework I can use to solve this problem. I am not very experienced in AI and don't want to steer myself into a dead end with a poor decision.

While it can be solved by many ifs and 'contains' function, it seems to me like not a very good solution.

TLDR; I need basically something like "clever, flexible and universal if-elseif".

NOTE: I can also use article description text, if that would be necessary but it seems to me that this former category text is unambiguous enough for this kind of problem.

Do I need to save the standardization transformation?

Tue, 15 Dec 2020 13:06:48 +0000

I standardized my data when I created my model. Do I need to save the standardization transformation to predict on new data with my model?
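
In general, new data has to be scaled with the mean and standard deviation learned from the training data, so those statistics need to be kept together with the model. The sketch below illustrates the idea with the standard library only; the feature values, the JSON storage format and the helper names are assumptions for illustration (with scikit-learn, one would typically persist the fitted scaler or the whole pipeline instead).

```python
import json
import statistics

def fit_scaler(values):
    """Learn mean and standard deviation from the training data only."""
    return {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}

def transform(values, scaler):
    """Standardize values with previously learned statistics."""
    return [(v - scaler["mean"]) / scaler["std"] for v in values]

train = [2.0, 4.0, 6.0, 8.0]            # hypothetical training feature
scaler = fit_scaler(train)
saved = json.dumps(scaler)              # persist alongside the model

restored = json.loads(saved)            # later, at prediction time
new_scaled = transform([5.0], restored)
print(restored, new_scaled)
```

Refitting the scaler on the new data instead would shift the inputs to a scale the model never saw during training.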

Why should I use Dynamic Time Warping over GMM for time series clustering?

Fri, 04 Dec 2020 03:19:16 +0000

How to predict from unseen data?

Tue, 17 Nov 2020 16:18:28 +0000

Hi. I have a question about model-based predictions when data is only available after the fact. Let me give you an example. I try to predict the result (HOME, AWAY or a DRAW) of the match based on data like number of shots, ball possession, number of fouls, etc.

TARGET	TEAM 1	TEAM 2	possession team 1	possession team 2	shots team 1	shots team 2	fouls team 1	fouls team 2
HOME	Arsenal	Chelsea	60	40	12	8	5	7

Let's say I have already trained the model and I want to see if I can predict an upcoming match. However, this match is still a few days away and I want the model's prediction today. I understand that if the match had already taken place and I had the data, I could run it through the model and get the result. The goal is for the model to predict what will happen before the match.

Is it possible at all? What are my options? Should I only select pre-match variables? For example, last game form, match referee etc or should I aggregate the variables and include average possession, average shots and average number of fouls from recent matches?

How to model data that is not yet known

Tue, 27 Oct 2020 10:39:47 +0000

So far, I have modeled on known historical data. What if there are variables known only after the fact?
Let me give you an example. I want to predict the outcome of the match, win, lose or draw. I use variables from previous games such as ball possession, number of shots, corners, etc. Let's say the Chelsea-Arsenal game is approaching Saturday. How am I supposed to build a model and predict the result if this data is not yet available for my event? What to do in such cases, is it possible to forecast such data?

From microarray data, which tools of pattern recognition can you apply to identify the genes responsible for diseases?

Thu, 15 Oct 2020 20:11:31 +0000

“During the last decade, the advent of microarray datasets stimulated a new line of research called Bioinformatics. A microarray database is a repository containing microarray gene expression data. Microarray data pose a great challenge for computational techniques, due to their large dimensionality (up to several tens of thousands of genes) and their sample sizes. Furthermore, additional experimental complications like noise and variability render the analysis of microarray data an exciting domain [Saeys et al. 2007, Bioinformatics]”.

In light of the above excerpt, which pattern recognition tools can you apply to microarray data to identify the genes responsible for diseases like cancer? Explain how.

Can we use a trained model to supervise the other machine learning models?

Mon, 28 Sep 2020 14:17:37 +0000

Is it possible to train a machine using another trained machine?

Where can I find illustrative real life machine learning examples (In business, work. etc.)?

Tue, 22 Sep 2020 00:47:09 +0000

Is there a website for finding illustrative real-life examples of using machine learning? For instance: End-to-End Machine Learning, Classification, Clustering, and Unsupervised Learning.

Where can I find simple machine learning mathematics explained visually?

Mon, 21 Sep 2020 23:55:12 +0000

Could you please let me know where I can find simple machine learning mathematics explained visually?

How to update the weights in the backpropagation algorithm when the activation function is not linear?

Mon, 10 Aug 2020 21:55:19 +0000

The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.

Assume for the following neural network, inputs = [$i_1,i_2$] = [0.05, 0.10], we want the neural network to output = [$o_1$,$o_2$] = [0.01, 0.99], and for learning rate, $\alpha=0.5$.
In addition, the activation function for the hidden layer (both $h_1$ and $h_2$) is sigmoid (logistic):

$S(x)=\frac{1}{1+e^{-x}}$

https://i.imgur.com/cnY5feu.png

Hint:
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$

$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$

a) Show step by step solution to calculate weights $w_1$ to $w_8$ after one update in table below.
b) Calculate initial error and error after one update (assume biases $[b_1,b_2]$ are not changing during the updates).

Updating weights in backpropagation algorithm
Weights	Initialization	New weights after one step
$w_1$	0.15	?
$w_2$	0.20	?
$w_3$	0.25	?
$w_4$	0.30	?
$w_5$	0.40	?
$w_6$	0.45	?
$w_7$	0.50	?
$w_8$	0.55	?

How to calculate the class probabilities and classify using Naive Bayes classifier?

Mon, 10 Aug 2020 21:26:28 +0000

We have data on 1000 pieces of fruit. Each fruit is a Banana, an Orange or some Other fruit, and we know 3 features of each fruit: whether it is long or not, sweet or not, and yellow or not, as displayed in the table below:

https://i.imgur.com/gOFzVXL.png

An unknown piece of fruit with these features is provided: Long, Sweet and Yellow.

Calculate the probability of each of these 3 classes using the Naive Bayes classification algorithm and report the predicted class.

How to print confusion matrix if I am using stratifiedkfold method?

Thu, 06 Aug 2020 21:41:19 +0000

How to split into train and test using PKL file?

Thu, 30 Jul 2020 22:08:47 +0000

What is difference between Support vector machine and Support Vector Classification?

Wed, 13 May 2020 20:22:23 +0000

Guidance on sequencing the data science courses below

Fri, 20 Mar 2020 13:55:49 +0000

Hello,
my name is Lutaaya Mudathiru.

I am planning to start the online data science professional courses at Harvard University, but I don't know which course I should begin with. I would like help sequencing the courses below so that I can benefit more:

1. Principles, Statistical and Computational Tools for Reproducible Science
2. Data Science: Inference and Modeling
3. Data Science: Productivity Tools
4. Data Science: Wrangling
5. Data Science: Linear Regression
6. Data Science: Machine Learning
7. Data Science: Capstone
8. Data Science: R Basics
9. Data Science: Visualization
10. Data Science: Probability
11. High-Dimensional Data Analysis
12. Introduction to Linear Models and Matrix Algebra
13. Data Science: Statistics and R
14. Fat Chance: Probability from the Ground Up
15. Introduction to Probability (on edX)

What are the differences among Data Science, Artificial Intelligence and Machine Learning?

Thu, 05 Mar 2020 03:02:31 +0000

What are the differences among Data Science, Artificial Intelligence and Machine Learning?

Can PCA be used for supervised learning?

Tue, 18 Feb 2020 21:49:18 +0000

Can PCA be used for supervised learning???

I've seen some data scientists using PCA to transform their data for only numerical variables.

However, some other data scientists say that it is only used for unsupervised ML techniques.

How to calculate residual errors for linear regression and interpret regression metrics?

Tue, 18 Feb 2020 18:30:51 +0000

Assuming we have a linear regression equation and some data points (sample), how can we calculate residual error for each data point, and total cost based on the metrics such as MAE, MSE, RMSE, MAPE, or MPE if we have their formula?

Can I use a single Pipeline for multiple estimators in scikit-learn?

Tue, 18 Feb 2020 14:14:30 +0000

Is there any proper way to combine multiple classifiers and their parameter grids in one Pipeline?

How can I find the "State of the art" approaches in Machine Learning?

Sat, 08 Feb 2020 00:56:39 +0000

If I want to find the latest trends in Machine Learning and the best approaches, known as the "State of the art" approaches, what resources can I use?

How to calculate the probability and accuracy of a Logistic Regression classifier?

Mon, 03 Feb 2020 20:31:49 +0000

How to solve this problem?

https://i.imgur.com/8urywpf.jpg

Q1) Complete the ? sections

Q2) Accuracy of system if threshold = 0.5?

Q3) Accuracy of system if threshold = 0.95?

How to calculate Accuracy, Precision, Recall or F1?

Mon, 27 Jan 2020 19:22:26 +0000

In the following example, calculate Accuracy, Precision, Recall or F1?

https://i.imgur.com/OezFpqC.png

score() vs accuracy_score() in sklearn

Tue, 21 Jan 2020 21:28:11 +0000

Hi,

I am still confused about when to use score() and accuracy_score(), so I want to confirm my assumptions.
Q1: With score(), we use the held-out split to test the accuracy, as in knn.score(X_test, y_test), to avoid the bias of evaluating on the same training data, right? Here knn.score(X_test, y_test) just compares the pair of test values.

Q2: accuracy_score from sklearn.metrics tests the predicted target values "y_pred" against y_test, using accuracy_score(y_test, y_pred). Does it just compare the actual and predicted target values?

Q3: My result is the same with both methods. Are they doing the same thing?

Q4: With accuracy_score(), I can also compare the training targets y_train with y_train_pred (returned from knn.predict(X_train)). Is it then OK to report the accuracy as accuracy_score(y_train, y_train_pred)? Since the prediction is already done and we only compare against the original data, does the bias no longer exist?

Thanks.

Best algorithm for table reservation

Mon, 21 Oct 2019 18:03:19 +0000

What kind of algorithm would be best for the following problem?
I am trying to forecast reservations for different kinds of tables. Let's say I have 100 different tables, which can be reserved from 17.00 to 22.00 daily. Each table is either reserved (1) or available (0) in a given hour. I want to forecast each table's status based on historical data from the 2 previous weeks. For example, the result would be that tomorrow from 18.00 to 19.00 a certain table is either 0 (available) or 1 (reserved).

What are the types of classification and regression algorithms in machine learning?

Thu, 27 Jun 2019 21:00:05 +0000

For example, logistic regression is a classification algorithm. What are the other types? I am a bit confused.

How to perform a classification or regression using k-NN?

Thu, 27 Jun 2019 02:54:42 +0000

Suppose you are given the following dataset, where x and y are the 2 features and the color (Red or Blue) is the target variable.

a) A new data point $x=1$ and $y=1$ is given. Using Euclidean distance in 3-NN, what do you predict as the color for this data point?

Dataset
x	y	Color
-1	1	Red
0	1	Blue
0	2	Red
1	-1	Red
1	0	Blue
1	2	Blue
2	2	Red
2	3	Blue

b) Now assume we have the following dataset and the target value is the price. A new data point $x=1$ and $y=1$ is given. Using Euclidean distance in 3-NN, what would be the estimated price?

Dataset
x	y	Price
-1	1	$100
0	1	$50
0	2	$20
1	-1	$40
1	0	$30
1	2	$40
2	2	$70
2	3	$30
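
Both parts can be checked with a short script. The sketch below takes the 3 nearest neighbours of the query point by Euclidean distance, then uses a majority vote for the color and an unweighted mean for the price; the unweighted mean is an assumption, since the question does not say whether neighbours should be distance-weighted.

```python
import math
from collections import Counter

points = [(-1, 1), (0, 1), (0, 2), (1, -1), (1, 0), (1, 2), (2, 2), (2, 3)]
colors = ["Red", "Blue", "Red", "Red", "Blue", "Blue", "Red", "Blue"]
prices = [100, 50, 20, 40, 30, 40, 70, 30]

def k_nearest(query, k=3):
    """Indices of the k training points closest to query (Euclidean)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]

neighbors = k_nearest((1, 1))
# Part a: majority vote over the neighbours' colors.
predicted_color = Counter(colors[i] for i in neighbors).most_common(1)[0][0]
# Part b: unweighted mean of the neighbours' prices.
predicted_price = sum(prices[i] for i in neighbors) / len(neighbors)
print([points[i] for i in neighbors], predicted_color, predicted_price)
```

The three nearest points are (0, 1), (1, 0) and (1, 2), each at distance 1, so the vote gives Blue and the mean price is $40.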

How to calculate k-means clustering with a numerical example?

Thu, 27 Jun 2019 02:16:32 +0000

Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters:

$A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9)$.

Suppose that the initial seeds (centers of each cluster) are $A1$, $A4$ and $A7$. Run the k-means algorithm for 1 epoch only. At the end of this epoch show:

a) The new clusters (i.e. the examples belonging to each cluster)

b) The centers of the new clusters

c) Draw a 10 by 10 space with all the 8 points and show the clusters after the first epoch and the new centroids.

d) How many more iterations are needed to converge? Draw the result for each epoch
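
Parts a and b can be verified with a small script. This is a minimal sketch of one k-means epoch (assign every point to its nearest center, then recompute each center as the mean of its members); the function name `kmeans_epoch` is illustrative, and the code assumes no cluster ends up empty, which holds for these seeds.

```python
import math

points = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
          "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9)}
centers = [points["A1"], points["A4"], points["A7"]]

def kmeans_epoch(points, centers):
    """One assignment step followed by one centroid update."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[nearest].append(name)
    # New center = coordinate-wise mean of the cluster members.
    new_centers = [
        tuple(sum(points[n][d] for n in members) / len(members) for d in range(2))
        for members in clusters
    ]
    return clusters, new_centers

clusters, new_centers = kmeans_epoch(points, centers)
print(clusters)
print(new_centers)
```

After the first epoch the clusters are {A1}, {A3, A4, A5, A6, A8} and {A2, A7}, with centers (2, 10), (6, 6) and (1.5, 3.5). For part d, calling `kmeans_epoch` again with the new centers until the clusters stop changing shows how many further epochs are needed.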

How to calculate the class probabilities and classify using Naive Bayes classifier for NLP?

Wed, 26 Jun 2019 19:43:41 +0000

We want to use Naive Bayes for tagging documents. This is a classification task in which we assign a class (tag) to each string. We currently have two tags: Sports and Not sports.

Which tag does the sentence “A very close game” belong to? Using a Naive Bayes classifier, calculate the class probability of Sports and Not sports for this sentence based on the dataset, and decide on the tag.

Text	Tag
“A great game”	Sports
“The election was over”	Not sports
“Very clean match”	Sports
“A clean but forgettable game”	Sports
“It was a close election”	Not sports
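
The calculation can be sketched as a multinomial Naive Bayes with add-one (Laplace) smoothing. The smoothing is an assumption: the question does not mention it, but without it the word "close" never appears in a Sports sentence and would force that class probability to zero. Lowercasing and whitespace tokenization are also assumptions, as is the helper name `class_scores`.

```python
from collections import Counter

training = [
    ("a great game", "Sports"),
    ("the election was over", "Not sports"),
    ("very clean match", "Sports"),
    ("a clean but forgettable game", "Sports"),
    ("it was a close election", "Not sports"),
]

def class_scores(sentence, training, alpha=1):
    """Unnormalised P(tag) * prod P(word | tag) with add-alpha smoothing."""
    vocab = {w for text, _ in training for w in text.split()}
    scores = {}
    for tag in {t for _, t in training}:
        docs = [text for text, t in training if t == tag]
        counts = Counter(w for text in docs for w in text.split())
        total = sum(counts.values())
        score = len(docs) / len(training)          # class prior
        for w in sentence.lower().split():
            score *= (counts[w] + alpha) / (total + alpha * len(vocab))
        scores[tag] = score
    return scores

scores = class_scores("A very close game", training)
print(scores, max(scores, key=scores.get))
```

Under these assumptions the vocabulary has 14 words, Sports has 11 word tokens and Not sports has 9, and the Sports score (about 2.76e-5) is larger than the Not sports score (about 5.72e-6), so the sentence is tagged Sports.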

How to calculate Covariance Matrix and Principal Components for PCA?

Wed, 26 Jun 2019 10:40:02 +0000

The dataset with two features $(x,y)$ is shown as follows (note $y$ in this example is the second feature, not a target value):

x	y
2.5	2.4
0.5	0.7
2.2	2.9
1.9	2.2
3.1	3.0
2.3	2.7
2.0	1.6
1.0	1.1
1.5	1.6
1.1	0.9

a) Calculate the Covariance Matrix.
b) Calculate eigenvalues and eigenvectors
c) Calculate all the PCs
d) What percentage of the total variance in the dataset is explained by each PC?
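
A dependency-free sketch of parts a, b and d follows. It assumes the sample covariance (dividing by $n-1$) and uses the closed-form eigenvalues of a symmetric $2\times2$ matrix; variable names such as `sxx` and `lam1` are illustrative. Projecting the mean-centered data onto `pc1` and `pc2` would give the component scores for part c.

```python
import math

xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

def cov(a, ma, b, mb):
    """Sample covariance (divides by n - 1)."""
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)

sxx, sxy, syy = cov(xs, mx, xs, mx), cov(xs, mx, ys, my), cov(ys, my, ys, my)

# Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
mid = (sxx + syy) / 2
half_gap = math.hypot((sxx - syy) / 2, sxy)
lam1, lam2 = mid + half_gap, mid - half_gap

def unit_eigenvector(lam):
    """Unit vector v with (S - lam*I) v = 0, valid when sxy != 0."""
    vx, vy = sxy, lam - sxx
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

pc1, pc2 = unit_eigenvector(lam1), unit_eigenvector(lam2)
explained = [lam / (lam1 + lam2) for lam in (lam1, lam2)]
print(sxx, sxy, syy)
print(lam1, pc1, lam2, pc2, explained)
```

With these numbers the first component explains roughly 96% of the total variance.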

How to calculate convolutions on a CONV layer for a Convolutional Neural Network?

Wed, 26 Jun 2019 08:54:12 +0000

Assume we have a $5\times5$ px RGB image with 3 channels, R, G, and B, as follows:

R
2	0	0	0	0
1	2	0	0	1
2	0	1	0	2
1	2	1	0	1
0	1	0	2	0

G
0	2	1	2	2
1	1	1	0	0
0	0	2	2	0
2	0	0	2	0
0	2	1	1	1

B
0	1	0	0	1
1	1	2	0	1
1	0	2	0	2
1	0	1	1	0
1	2	1	1	2

We have one $3\times3$ px kernel (filter) with 3 channels as follows:

Filter - R
0	0	1
1	0	1
1	0	0

Filter - G
0	0	-1
1	0	0
1	-1	0

Filter - B
1	0	1
0	1	-1
1	-1	0

a) If Stride = 2, and Zero-padding = 1, and Bias = 1, what will be the result of convolution?

b) What is the result of applying a ReLU layer ($\max(z,0)$) to the result of part a? (The output has the same size as the result in part a.)

c) Calculate the output by applying max-pooling layer with the size of $2\times2$ on the output of part b, and Stride = 1. (hint: max-pooling layer here and usually do not include any zero-paddings)

d) What is the result after applying flatten on the output of part c and creating a vector?

e) Assume the vector you created contains m elements. Consider it as the input vector for a Softmax Regression classifier (without any hidden layers and biases and it is fully connected). Assume there are 2 classes of 0 and 1. For all the weights from each element in the feature vector, the optimized weights are 1 for odd elements and 2 for even elements. For example, if the feature vector is [10,11,12,13,14], all the weights from 10 are 1 (because 10 is element 1 and 1 is odd), all the weights from 11 are 2, all the weights from 12 are 1, all the weights from 13 are 2 and all the weights from 14 are 1 and so on. Draw the Softmax Regression network and determine whether the predicted class should be 0 or 1.

Hint:
Softmax Regression: $p_{i}=\frac{e^{z_{i}}}{\sum_{j=1}^{c} e^{z_{j}}}$
Where $p_{i}$ is the probability of class $i$ and $c$ is the number of classes.
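
Parts a to d can be traced with a short, dependency-free script. The sketch assumes the usual deep-learning convention that "convolution" means cross-correlation (the kernel is not flipped) and that the single bias is added once per output position; the helper names are illustrative.

```python
R = [[2,0,0,0,0],[1,2,0,0,1],[2,0,1,0,2],[1,2,1,0,1],[0,1,0,2,0]]
G = [[0,2,1,2,2],[1,1,1,0,0],[0,0,2,2,0],[2,0,0,2,0],[0,2,1,1,1]]
B = [[0,1,0,0,1],[1,1,2,0,1],[1,0,2,0,2],[1,0,1,1,0],[1,2,1,1,2]]
FR = [[0,0,1],[1,0,1],[1,0,0]]
FG = [[0,0,-1],[1,0,0],[1,-1,0]]
FB = [[1,0,1],[0,1,-1],[1,-1,0]]

def pad(m, p):
    """Surround matrix m with p rows/columns of zeros."""
    width = len(m[0]) + 2 * p
    top = [[0] * width for _ in range(p)]
    bottom = [[0] * width for _ in range(p)]
    return top + [[0] * p + row + [0] * p for row in m] + bottom

def conv(channels, filters, stride, padding, bias):
    """Cross-correlation (no kernel flip), summed over channels, plus bias."""
    padded = [pad(c, padding) for c in channels]
    k = len(filters[0])
    size = (len(padded[0]) - k) // stride + 1
    return [[bias + sum(padded[c][i * stride + a][j * stride + b] * filters[c][a][b]
                        for c in range(len(channels))
                        for a in range(k) for b in range(k))
             for j in range(size)] for i in range(size)]

def max_pool(m, k, stride):
    """Max-pooling without padding."""
    size = (len(m) - k) // stride + 1
    return [[max(m[i * stride + a][j * stride + b]
                 for a in range(k) for b in range(k))
             for j in range(size)] for i in range(size)]

conv_out = conv([R, G, B], [FR, FG, FB], stride=2, padding=1, bias=1)  # part a
relu_out = [[max(v, 0) for v in row] for row in conv_out]              # part b
pooled = max_pool(relu_out, k=2, stride=1)                             # part c
flat = [v for row in pooled for v in row]                              # part d
print(conv_out, relu_out, pooled, flat, sep="\n")
```

Under these assumptions the convolution output is $3\times3$, pooling reduces it to $2\times2$, and the flattened vector has 4 elements. For part e, note that each element sends the same weight to both classes, so both logits are equal and the softmax assigns probability 0.5 to each class.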

What is the difference between cross-validation and validation set?

Wed, 19 Jun 2019 18:39:39 +0000

I am confused about this figure. Isn't this a cross-validation test? Or do we have a few fixed examples on which the model is tested, while various folds are also being tested at the same time?

https://i.imgur.com/aVru1MX.png

In DBSCAN algorithm, how should we choose optimal eps and minimum points?

Thu, 13 Jun 2019 17:22:08 +0000

How to optimize weights in Logistic Regression?

Wed, 05 Jun 2019 17:38:50 +0000

The hypothesis (model) of Logistic Regression, which is a binary classifier ($y \in \{0,1\}$), is given in the equation below:

Hypothesis

$S(z)=P(y=1 | x)=h_{\theta}(x)=\frac{1}{1+\exp \left(-\theta^{\top} x\right)}$

This gives the probability of class 1; by setting a threshold (such as $h_{\theta}(x) > 0.5$) we can classify a sample as 1 or 0.

Cost function

The cost function for Logistic Regression is defined as below. It is called the binary cross-entropy loss function:

$J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right)$

Iterative updates

Assume we start all the model parameters from some initial value (in this case the only model parameters are the $\theta_j$, and we initialize all of them with 1: $\theta_j = 1$ for $j \in \{0,1,...,n\}$, where $n$ is the number of features):

$\theta_{j,\text{new}} \leftarrow \theta_{j,\text{old}}+\alpha \times \frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}-\sigma\left(\theta_{\text{old}}^{\top} x^{(i)}\right)\right] x_{j}^{(i)}$

Where:
$m =$ number of rows in the training batch
$x^{(i)} = $ the feature vector for sample $i$
$\sigma = $ the sigmoid function $S(z)$ from the hypothesis
$\theta_{\text{old}} = $ the full coefficient vector before the update
$\theta_j = $ the coefficient corresponding to feature $j$
$y^{(i)} = $ actual class label for sample $i$ in the training batch
$x_{j}^{(i)} = $ the element (column) $j$ in the feature vector for sample $i$
$\alpha =$ the learning rate

Dataset

The training dataset of pass/fail in an exam for 5 students is given in the table below:

If we initialize all the model parameters with 1 (all $\theta_j = 1$), and the learning rate is $\alpha = 0.1$, and if we use batch gradient descent, what will be the:

$a)$ Accuracy of the model on the training set at initialization ($\text{accuracy} = \frac{\text{number of correct classifications}}{\text{all classifications}}$)?
$b)$ Cost at initialization?
$c)$ Cost after 1 epoch?
$d)$ Repeat all $a,b,c$ steps if we use mini-batch gradient descent and $\text{batch size} = 2$

(Hint: For $x_{j}^{(i)}$ when $j=0$ we have $x_{0}^{(i)} = 1$ for all $i$ )
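
The steps above can be sketched in plain Python. Because the training table is not reproduced here, the data below is purely hypothetical (one feature plus the constant $x_0 = 1$), and the helper names are illustrative; the update follows the batch rule stated above with all $\theta_j$ initialized to 1 and $\alpha = 0.1$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    """h_theta(x) = sigmoid(theta^T x)."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))

def cost(theta, X, y):
    """Binary cross-entropy J(theta) averaged over the batch."""
    m = len(y)
    return -sum(yi * math.log(predict_proba(theta, xi))
                + (1 - yi) * math.log(1 - predict_proba(theta, xi))
                for xi, yi in zip(X, y)) / m

def batch_step(theta, X, y, alpha):
    """One batch gradient-descent update of every theta_j."""
    m = len(y)
    errors = [yi - predict_proba(theta, xi) for xi, yi in zip(X, y)]
    return [t + alpha * sum(e * xi[j] for e, xi in zip(errors, X)) / m
            for j, t in enumerate(theta)]

def accuracy(theta, X, y, threshold=0.5):
    hits = sum((predict_proba(theta, xi) > threshold) == bool(yi)
               for xi, yi in zip(X, y))
    return hits / len(y)

# Hypothetical data (NOT the original table): x_0 = 1, x_1 = a single feature.
X = [[1, 0.5], [1, 1.0], [1, 1.5], [1, 2.0], [1, 2.5]]
y = [0, 0, 1, 1, 1]
theta = [1.0, 1.0]

print(accuracy(theta, X, y), cost(theta, X, y))
theta_new = batch_step(theta, X, y, alpha=0.1)
print(theta_new, cost(theta_new, X, y))
```

For the mini-batch variant in part d, the same `batch_step` would be applied to consecutive slices of 2 rows, so one epoch performs several updates instead of one.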

Could you please explain math symbols behind Machine Learning equations?

Sat, 18 May 2019 19:56:35 +0000

How do I Plot the linear classifier calculated with LIBLINEAR using sklearn?

Thu, 16 May 2019 08:13:06 +0000

Make a scatter plot where the x-axis is the height of the citizens and the y-axis is the weight of the citizens. The color of the points needs to be different for males and females. In the same figure, plot the linear classifier calculated with LIBLINEAR using sklearn.

Is it impossible to turn an hourly time series prediction into a minute-level time series prediction?

Wed, 01 May 2019 13:11:26 +0000

https://stackoverflow.com/questions/55930051/is-impossible-predict-hours-time-series-to-minutes-time-series

I want to convert this hourly time series prediction model into a minute-level prediction model.
