Ask Ghassem - Recent questions tagged nlp

Creating tables from unstructured texts about stock market

Tue, 02 Aug 2022 00:47:49 +0000

I am trying to extract information such as profits and revenues, along with their corresponding dates and quarters, from unstructured text about the stock market and convert it into a report in table form. Because the input text has no fixed format, it is hard to know which entity belongs to which date and quarter, and which value belongs to which entity. Chunking works on a few documents, but not on enough of them. Is there an unsupervised way to link entities with their corresponding dates, values, and quarters?

Binary Classification and neutral tag

Sat, 30 Jan 2021 10:08:01 +0000

I am trying to create a sentiment analysis model using a binary classification loss. I have a batch of tweets, some tagged as positive (labeled 1) and some as negative (labeled 0). I managed to gather some tweets tagged as neutral, but there are fewer of them than positive and negative ones. My thinking is to label them 0.5 to balance the classification probability. Is this legit?
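One way to see what a 0.5 label would do is to look at the loss itself. The sketch below assumes the loss is plain binary cross-entropy (the post does not name it exactly) and uses a hypothetical helper, `binary_cross_entropy`, to scan candidate predictions for a tweet whose target is 0.5.

```python
import math

def binary_cross_entropy(p, y):
    """Cross-entropy between predicted probability p and a (possibly soft) target y."""
    eps = 1e-12  # keep log() away from 0
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Scan predictions 0.01 .. 0.99 and find the one with the lowest loss
# for a "neutral" tweet labeled 0.5.
candidates = [i / 100 for i in range(1, 100)]
best_p = min(candidates, key=lambda p: binary_cross_entropy(p, 0.5))

print(best_p)
print(binary_cross_entropy(best_p, 0.5))
```

With a target of 0.5 the loss is lowest when the model outputs 0.5, so such tweets pull predictions toward the middle of the range. Whether that is the behavior wanted for neutral tweets is a modeling decision this sketch does not settle.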

"Rare words" on vocabulary

Sat, 30 Jan 2021 09:57:31 +0000

I am trying to create a sentiment analysis model and I have a question.

After I preprocessed my tweets and created my vocabulary, I noticed that I have words that appear fewer than 5 times in my dataset (many of them appear only once). Many of them are real words, not gibberish. My thinking is that if I keep those words, they will get wrong "sentimental" weights and make my model worse.
Is my thinking right or am I missing something?

My vocab size is around 40000 words, and about 10k of those are "rare". Should I "sacrifice" them?

Thanks in advance.
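To make the trade-off concrete, here is a minimal sketch of splitting a vocabulary by frequency. The function name `split_vocab`, the toy tweets, and the `min_count=2` used in the demo are all hypothetical; the post's own threshold would be 5.

```python
from collections import Counter

def split_vocab(tokenized_tweets, min_count=5):
    """Return (kept, rare): words seen at least min_count times, and the rest."""
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    kept = {word for word, c in counts.items() if c >= min_count}
    rare = set(counts) - kept
    return kept, rare

# Toy data standing in for preprocessed tweets.
tweets = [
    ["good", "movie"],
    ["good", "plot"],
    ["bad", "movie"],
    ["good", "sublime"],
]

kept, rare = split_vocab(tweets, min_count=2)
print(sorted(kept))
print(sorted(rare))
```

Comparing validation results with and without the `rare` set is one way to check whether dropping those words actually helps on a given dataset.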

How to calculate the class probabilities and classify using Naive Bayes classifier for NLP?

Wed, 26 Jun 2019 19:43:41 +0000

We want to use Naive Bayes for tagging documents. It is a classification task in which we want to assign a class (tag) to each string. We currently have two tags: Sports and Not sports.

Which tag does the sentence “A very close game” belong to? Using a Naive Bayes classifier, calculate the class probabilities for Sports and Not sports for this sentence based on the dataset below, and decide on the tag.

Text	Tag
“A great game”	Sports
“The election was over”	Not sports
“Very clean match”	Sports
“A clean but forgettable game”	Sports
“It was a close election”	Not sports
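The calculation can be sketched in a few lines of Python. Several choices here are assumptions, not part of the question: text is lowercased and split on whitespace, the model is multinomial Naive Bayes, and add-one (Laplace) smoothing is applied. Without some smoothing both classes would score zero, because "close" never appears in a Sports example and "very" and "game" never appear in a Not sports example.

```python
from collections import Counter
from fractions import Fraction

training = [
    ("A great game", "Sports"),
    ("The election was over", "Not sports"),
    ("Very clean match", "Sports"),
    ("A clean but forgettable game", "Sports"),
    ("It was a close election", "Not sports"),
]

def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for this toy dataset.
    return text.lower().split()

word_counts = {}        # tag -> Counter of word occurrences in that class
doc_counts = Counter()  # tag -> number of training texts
for text, tag in training:
    doc_counts[tag] += 1
    word_counts.setdefault(tag, Counter()).update(tokenize(text))

vocab = {w for counts in word_counts.values() for w in counts}

def score(sentence, tag):
    """Unnormalized P(tag) * product of P(word | tag), with add-one smoothing."""
    result = Fraction(doc_counts[tag], sum(doc_counts.values()))  # prior
    total = sum(word_counts[tag].values())
    for w in tokenize(sentence):
        result *= Fraction(word_counts[tag][w] + 1, total + len(vocab))
    return result

scores = {tag: score("A very close game", tag) for tag in doc_counts}
predicted = max(scores, key=scores.get)

for tag, s in scores.items():
    print(tag, float(s))
print("Predicted tag:", predicted)
```

Under these assumptions the vocabulary has 14 distinct words, the Sports score is 3/5 × (3/25)(2/25)(1/25)(3/25) ≈ 2.76e-5, and the Not sports score is 2/5 × (2/23)(1/23)(2/23)(1/23) ≈ 5.72e-6, so the sentence would be tagged Sports (about 0.83 after normalizing the two scores).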

How to perform sentiment analysis in NLP?

Wed, 17 Oct 2018 00:45:12 +0000

If I am reading text and need to sort it into buckets such as good, bad, or ugly, where should I start? Which sentiment functions should I use?

What are Natural Language Processing (NLP) and its applications?

Mon, 08 Oct 2018 11:59:52 +0000

What is TF-IDF algorithm?

Mon, 08 Oct 2018 11:57:39 +0000