<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent activity in Exploratory Data Analysis</title>
<link>https://ask.ghassem.com/activity/data-science/eda</link>
<description>Powered by Question2Answer</description>
<item>
<title>Can you verify the validity of this chart comparing the review scores for Marvel Phase 4?</title>
<link>https://ask.ghassem.com/1030/verify-validity-chart-comparing-review-scores-marvel-phase</link>
<description>&lt;p&gt;I am skeptical about the validity of the charts below, which compare the critic and audience reviews for Phase 4 of the MCU with those of the previous 3 phases. Phase 4 has over 18 movies and TV shows, compared to the 6 movies in Phases 1 &amp;amp; 2 and the 11 movies in Phase 3. Also, there are far fewer critic reviews for the Phase 4 TV shows than for the Phase 4 movies. For example, on Rotten Tomatoes there are only 40 critic reviews for The Falcon and the Winter Soldier but 452 critic reviews for Black Widow. Could this uneven number of reviews between TV shows and movies in Phase 4 be making the overall averages higher than they should be? Or do you agree with the conclusions presented in the charts?&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://cdn.discordapp.com/attachments/997145183172964435/1059948060194652230/image.png&quot;&gt;https://cdn.discordapp.com/attachments/997145183172964435/1059948060194652230/image.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://cdn.discordapp.com/attachments/997145183172964435/1049356020469739520/image.png&quot;&gt;https://cdn.discordapp.com/attachments/997145183172964435/1049356020469739520/image.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Exploratory Data Analysis</category>
<guid isPermaLink="true">https://ask.ghassem.com/1030/verify-validity-chart-comparing-review-scores-marvel-phase</guid>
<pubDate>Mon, 09 Jan 2023 16:29:14 +0000</pubDate>
</item>
<item>
<title>Forecast log-transformed fitted values for 2 years using an ARMA model</title>
<link>https://ask.ghassem.com/1023/forecast-transformed-fitted-values-years-using-arma-model</link>
<description>The input is a stock price series under an exponential transformation. We are asked to forecast 2 years ahead using the ARMA results.</description>
<category>Exploratory Data Analysis</category>
<guid isPermaLink="true">https://ask.ghassem.com/1023/forecast-transformed-fitted-values-years-using-arma-model</guid>
<pubDate>Wed, 04 May 2022 20:31:44 +0000</pubDate>
</item>
<item>
<title>How do I know which encoder to use to convert from categorical variables to numerical?</title>
<link>https://ask.ghassem.com/1006/know-which-encoder-convert-categorical-variables-numerical</link>
<description>Say I have a column of categorical data, such as temperature levels: &amp;#039;Lukewarm&amp;#039;, &amp;#039;Hot&amp;#039;, &amp;#039;Scalding&amp;#039;, &amp;#039;Cold&amp;#039;, &amp;#039;Frostbite&amp;#039;, etc.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I know that we can use pd.get_dummies to convert the column to numerical data within the dataframe, but I also know that there are other &amp;#039;converters&amp;#039; (not sure if that&amp;#039;s the correct terminology) that we can use, e.g. OneHotEncoder from scikit-learn (for instance, I could use the pipeline module to build a pipeline and feed my dataframe through it so that my categorical data is also encoded as numerical).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
How do I know which to use? Does it matter? If it does, when does it matter the most (i.e. for what types of problems, and when there are many categorical variables or only a few)? If anyone can give me any pointers on this, I&amp;#039;d greatly appreciate it.</description>
<category>Exploratory Data Analysis</category>
<guid isPermaLink="true">https://ask.ghassem.com/1006/know-which-encoder-convert-categorical-variables-numerical</guid>
<pubDate>Mon, 29 Nov 2021 04:09:06 +0000</pubDate>
</item>
<item>
<title>ValueError: Length mismatch: Expected axis has 60 elements, new values have 2935849 elements</title>
<link>https://ask.ghassem.com/1005/valueerror-length-mismatch-expected-elements-2935849-elements</link>
<description>&lt;p&gt;I&#039;m creating a new data frame with the most-used items grouped together, but I get the following error when grouping by shop ID and item name: ValueError: Length mismatch: Expected axis has 60 elements, new values have 2935849 elements.&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
df = sales_df[sales_df[&#039;shop_id&#039;].duplicated(keep=False)]
df[&#039;Grouped&#039;] = sales_df.groupby(&#039;shop_id&#039;)[&#039;item_name&#039;].transform(lambda x: &#039;,&#039;.join(x))
df2 = df[[&#039;shop_id&#039;, &#039;Grouped&#039;]].drop_duplicates()&lt;/pre&gt;

&lt;p&gt;In the code above, I keep the rows whose shop_id appears more than once and then join the item names within each shop. My goal is to group together the items that share the same shop ID.&lt;/p&gt;</description>
<category>Exploratory Data Analysis</category>
<guid isPermaLink="true">https://ask.ghassem.com/1005/valueerror-length-mismatch-expected-elements-2935849-elements</guid>
<pubDate>Fri, 26 Nov 2021 06:09:16 +0000</pubDate>
</item>
<item>
<title>Answered: What are basic steps for treating missing values?</title>
<link>https://ask.ghassem.com/430/what-are-basic-steps-for-treating-missing-values?show=431#a431</link>
<description>&lt;p&gt;Please watch the video on treating missing values &lt;a rel=&quot;nofollow&quot; href=&quot;https://www.lynda.com/Python-tutorials/Treat-missing-values/520233/601940-4.html&quot;&gt;here&lt;/a&gt;. Also take a look at &lt;a rel=&quot;nofollow&quot; href=&quot;https://www.datasciencecentral.com/profiles/blogs/how-to-treat-missing-values-in-your-data-1&quot;&gt;this article&lt;/a&gt;. If you find a better article, please share it with us!&lt;/p&gt;</description>
<category>Exploratory Data Analysis</category>
<guid isPermaLink="true">https://ask.ghassem.com/430/what-are-basic-steps-for-treating-missing-values?show=431#a431</guid>
<pubDate>Fri, 19 Oct 2018 04:11:21 +0000</pubDate>
</item>
</channel>
</rss>