Data Continuum
How Analytics Can Answer a Lot of Questions, and Analyzing Retail Transactions

"What's so unique about Looker Studio?"

Sasi SB
June 25, 2024

Data Science Nugget 🧽

Data analytics can be divided into five types, based on the kind of question each one answers.

5 Types of Questions Data Analytics Can Answer

Descriptive Analytics – What happened?

Summarizes large datasets to show what occurred.
Example: Analyzing sales data to see how many products were sold last month.

Diagnostic Analytics – Why did it happen?

Digs deeper into data to find causes of events.
Example: Investigating why sales dropped by examining related data and trends.

Predictive Analytics – What will happen in the future?

Uses statistical and machine learning techniques to forecast future events.
Example: Predicting next quarter's sales based on historical data.

Prescriptive Analytics – What actions should be taken?

Recommends actions to achieve specific goals.
Example: Suggesting marketing strategies to boost future sales.

Cognitive Analytics – How can the problem be solved best?

Uses AI and machine learning to continuously improve decision-making.
Example: Implementing a self-learning system that becomes better at recommending products to customers over time.

These different types of data analytics help businesses understand their data, identify trends, predict future outcomes, and make informed decisions.
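To make the first and third question types concrete, here is a minimal Python sketch. The monthly sales figures are made up for illustration, and the function names (`describe`, `forecast_next`) are hypothetical. The forecast is a plain least-squares trend line, which is only one of many possible predictive techniques.

```python
from statistics import mean

# Hypothetical monthly unit sales (illustrative numbers, not real data).
monthly_units = [100, 110, 120, 130, 140]


def describe(units):
    """Descriptive analytics: summarize what already happened."""
    return {"total": sum(units), "average": mean(units), "latest": units[-1]}


def fit_trend(units):
    """Fit an ordinary least-squares line through (month index, units)."""
    xs = range(len(units))
    x_bar, y_bar = mean(xs), mean(units)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, units)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast_next(units):
    """Predictive analytics: extend the fitted trend by one month."""
    slope, intercept = fit_trend(units)
    return intercept + slope * len(units)


print(describe(monthly_units))
print(forecast_next(monthly_units))
```

Diagnostic, prescriptive, and cognitive analytics build on the same data but need more context (related variables, business goals, feedback loops) than this small sketch carries.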

Interesting Dataset for Practice 📊

Retail Transaction Dataset

This dataset contains detailed information about retail transactions, aimed at providing a comprehensive view of customer behavior and purchasing patterns.

Project Ideas:

1) Cluster analysis to group customers based on their purchase behavior

2) Regression analysis to identify the factors behind poor customer ratings

3) EDA and visualization to understand the spread and structure of the dataset
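As a starting point for the first project idea, here is a minimal k-means sketch in plain Python. The two features (total spend, number of items) and the six customers are hypothetical, since the dataset's actual columns are not listed here. In a real project you would likely scale the features first and use a library such as scikit-learn.

```python
from math import dist
from statistics import mean

# Hypothetical customers as (total spend, number of items); not from the dataset.
customers = [(20, 1), (25, 2), (30, 1), (200, 8), (220, 10), (210, 9)]


def nearest_index(point, centroids):
    """Return the index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(mean(axis) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Fixed starting centroids keep the run deterministic.
centroids, labels = k_means(customers, [customers[0], customers[3]])
print(centroids)
print(labels)
```

Each label is the cluster index of the matching customer, so customers with the same label show similar purchase behavior under these two features.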

Data Analysis Tool of the Week 🛠️

Looker is a powerful data analysis and business intelligence (BI) tool that enables companies to explore, analyze, and share real-time business analytics easily.

It provides a user-friendly interface for creating dynamic and interactive dashboards, making data visualization more accessible.

Looker's unique modeling language, LookML, allows data teams to define and manage data relationships, ensuring consistent metrics and definitions across the organization.

Additionally, it integrates seamlessly with various databases and supports collaborative data analysis, helping teams make data-driven decisions more effectively.

Overall, Looker simplifies complex data workflows and empowers users with actionable insights.

Looker Studio (formerly Google Data Studio, and a separate product from Looker) is a really fun tool when it comes to writing expressions, as its calculated-field formulas are very similar to SQL!

Q&A Section 🙋

“Why do we use a significance level of 0.05 in statistics?”

This is one of the most frequently asked questions in interviews.

I've already written a thread about it, but I believe it would be really helpful for you if I explained it in more detail in my newsletter.

Now, let's answer the question.

In hypothesis testing, the significance level (denoted as 𝛼) is the probability of rejecting the null hypothesis when it is actually true. This is also known as the Type I error rate. A common choice for 𝛼 is 0.05, meaning that when the null hypothesis is true, there is a 5% chance of wrongly rejecting it.

The choice of 0.05 is somewhat arbitrary but has been standardized over time due to a balance between being too lenient and too strict. A lower 𝛼 (e.g., 0.01) would reduce the likelihood of a Type I error but increase the chance of a Type II error (failing to reject a false null hypothesis).

Conversely, a higher 𝛼 (e.g., 0.10) would increase the risk of a Type I error, potentially leading to false positives. The 0.05 level is seen as a reasonable compromise, balancing the trade-offs between Type I and Type II errors.
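The trade-off can be seen numerically with a small sketch. It assumes a one-sided z-test with known variance, where the true mean sits a hypothetical 2 standard errors above the null value. The function name `type_ii_error` and the value of `shift` are illustrative choices, not part of any standard recipe.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def type_ii_error(alpha, shift):
    """Probability of failing to reject H0 in a one-sided z-test.

    `shift` is how many standard errors the true mean lies above the
    null value (an assumed effect size for illustration).
    """
    critical_z = STANDARD_NORMAL.inv_cdf(1 - alpha)  # rejection cutoff
    return STANDARD_NORMAL.cdf(critical_z - shift)


for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, shift=2.0)
    print(f"alpha={alpha:.2f}  Type II error={beta:.3f}")
```

Under these assumptions, the printed Type II error shrinks as 𝛼 grows, which is the balance described above.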

Let’s use this analogy to help you understand it better:

Imagine you are a quality control inspector at a factory producing light bulbs. Your job is to decide whether a batch of light bulbs meets the quality standards or should be rejected due to defects.

Null Hypothesis (𝐻0): The batch of light bulbs is of acceptable quality.

Alternative Hypothesis (𝐻1): The batch of light bulbs is defective.

You have a tool that checks the quality of the light bulbs, but it's not perfect – it can sometimes incorrectly tell you that a good batch is defective or a defective batch is good.

Using a significance level of 0.05 is like setting your rejection rule so that a batch that is actually good would trigger it only 5% of the time by random chance alone. This means you accept a 5% risk of rejecting a good batch (Type I error).
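Here is one way the inspector's rule could be sketched in Python. The sample size (50 bulbs) and the acceptable defect rate (2%) are assumptions made for illustration, and the defect count in the sample is modeled as binomial. The function names are hypothetical.

```python
from math import comb


def tail_probability(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rejection_threshold(n, p_good, alpha=0.05):
    """Smallest defect count that a good batch reaches with probability <= alpha."""
    for k in range(n + 2):  # k = n + 1 has tail probability 0, so the loop always returns
        if tail_probability(n, p_good, k) <= alpha:
            return k


# Assumed setup: inspect 50 bulbs; a good batch has a 2% defect rate.
threshold = rejection_threshold(n=50, p_good=0.02)
print(f"Reject the batch if at least {threshold} of 50 sampled bulbs are defective")
print(f"Chance of rejecting a good batch: {tail_probability(50, 0.02, threshold):.4f}")
```

Because defect counts are whole numbers, the actual Type I error rate of this rule is at most 0.05 rather than exactly 0.05.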

Oh, and by the way, if you’re looking to learn more about data science or preparing for your PL-300 exam, and want direct access to me, where I answer your questions and provide you with:

Portfolio project opportunities in Power BI and Python
Additional weekly assignments
And more planned stuff

Click Here
