Basics of Statistics for ML

graph TD;
    Inferential_Statistics["Inferential Statistics"]
    Descriptive_Statistics["Descriptive Statistics"]
    Measure_of_Central_Tendency["Measure of Central Tendency"]
    Weighted_Mean["Weighted Mean"]
    Trimmed_Mean["Trimmed Mean"]
    Measure_of_Dispersion["Measure of Dispersion"]
    Standard_Deviation["Standard Deviation"]
    CV["Coefficient of Variation"]
    Five_Number_Summary["5 Number Summary"]
    Box_Plot["Box Plot / Whisker Plot"]

    Statistics --> Descriptive_Statistics
    Statistics --> Inferential_Statistics
    Descriptive_Statistics --> Measure_of_Central_Tendency
    Descriptive_Statistics --> Measure_of_Dispersion
    Measure_of_Central_Tendency --> Mean
    Measure_of_Central_Tendency --> Median
    Measure_of_Central_Tendency --> Mode
    Mean --> Weighted_Mean
    Mean --> Trimmed_Mean
    Measure_of_Dispersion --> UniVariate
    Measure_of_Dispersion --> BiVariate
    UniVariate --> Range
    UniVariate --> Variance
    UniVariate --> Standard_Deviation
    UniVariate --> CV
    UniVariate --> Five_Number_Summary
    Five_Number_Summary --> Percentile
    Five_Number_Summary --> Box_Plot
    BiVariate --> Covariance
    BiVariate --> Correlation
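
Most of the leaves in the graph above can be computed with Python's standard library. The block below is a minimal sketch rather than a reference implementation: the sample data is made up, the helper names (`weighted_mean`, `trimmed_mean`, `five_number_summary`, `covariance`, `correlation`) are hypothetical, and it assumes sample (n - 1) formulas and the "inclusive" quartile convention, which is only one of several in use.

```python
import statistics as st

def weighted_mean(values, weights):
    """Mean where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def trimmed_mean(values, proportion=0.1):
    """Mean after dropping `proportion` of the values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    return st.mean(ordered[k:len(ordered) - k] if k else ordered)

def five_number_summary(values):
    """(min, Q1, median, Q3, max) using the 'inclusive' quartile method."""
    q1, q2, q3 = st.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    mx, my = st.mean(xs), st.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (st.stdev(xs) * st.stdev(ys))

# Made-up sample with one outlier (40) to show why trimming can matter.
data = [2, 4, 4, 5, 7, 9, 40]

# Measures of central tendency
print("mean:", st.mean(data))
print("median:", st.median(data))
print("mode:", st.mode(data))
print("trimmed mean:", trimmed_mean(data, 0.15))

# Univariate measures of dispersion
print("range:", max(data) - min(data))
print("variance:", st.variance(data))
print("standard deviation:", st.stdev(data))
print("coefficient of variation:", st.stdev(data) / st.mean(data))
print("5 number summary:", five_number_summary(data))

# Bivariate measures of dispersion
hours = [1, 2, 3, 4, 5]
marks = [52, 58, 61, 70, 74]
print("covariance:", covariance(hours, marks))
print("correlation:", correlation(hours, marks))
```

The five number summary is what a box plot draws: the box spans Q1 to Q3, with the median marked inside it.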

🧪 Introduction To Hypothesis Testing

🥊 Null vs. Alternative Hypothesis

| Parameter | Null Hypothesis | Alternative Hypothesis |
|---|---|---|
| Definition | A null hypothesis is a statement that there is no relationship between the two variables. | An alternative hypothesis is a statement that there is some statistical relationship between the two variables. |
| What is it? | Generally, researchers try to reject or disprove it. | Researchers try to find evidence in support of it. |
| Testing Process | Indirect and implicit | Direct and explicit |
| p-value | The null hypothesis is rejected if the p-value is less than the alpha value; otherwise, we fail to reject it. | The alternative hypothesis is supported if the p-value is less than the alpha value; otherwise, it is not supported. |
| Notation | \(H_0\) | \(H_1\) |
| Symbol Used | Equality symbol (=, ≥, ≤) | Inequality symbol (≠, <, >) |
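
The p-value rule in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a general testing recipe: it uses a two-sided one-sample z-test, which presumes the population standard deviation is known, and the function names `z_test_two_sided` and `decide` and all numbers are hypothetical. The wording "fail to reject" is the common convention for the case where the p-value is not below alpha.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(sample_mean, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 against H1: mu != mu0.

    Assumes the population standard deviation `sigma` is known.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-sided p-value: probability of a |z| at least this large under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Hypothetical numbers: H0 says the mean is 50; a sample of 100 averages 52.
z, p = z_test_two_sided(sample_mean=52, mu0=50, sigma=10, n=100)
print(f"z = {z:.2f}, p = {p:.4f}, decision: {decide(p)}")

# A sample mean closer to 50 gives a larger p-value.
z2, p2 = z_test_two_sided(sample_mean=51, mu0=50, sigma=10, n=100)
print(f"z = {z2:.2f}, p = {p2:.4f}, decision: {decide(p2)}")
```

Lowering `alpha` makes the test stricter: the first sample above leads to rejection at 0.05 but not at 0.01.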

🌴 Tree vs. Regression Models

Figure: tree vs. regression models

Source: www.freecodecamp.org

Tree-based models and regression models are widely used machine learning models, so the more you know about them, the better. Many concepts from these models are also borrowed by advanced machine learning models such as Gradient Boosting and XGBoost.

These models are also a favorite of interviewers, who draw many interview questions from them. This blog mainly focuses on tree-based models.

Clustering Algorithms in ML

Applications of Clustering

  • Customer Segmentation: Group customers so that each segment can be shown personalized ads.
  • Data Analysis: After clustering the whole dataset, analyze each cluster separately.
  • Semi-Supervised Learning: Google Photos uses this technique to identify a person's face and put their photos into a separate folder.
  • Image Segmentation: Create segments in a photo to represent the different objects in it.
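
To make the first two applications concrete, here is a minimal sketch of customer segmentation followed by per-cluster analysis, using a plain-Python version of Lloyd's k-means algorithm. Everything in it is an assumption for illustration: the customer data is made up, the helper names `sq_dist` and `kmeans` are hypothetical, and the features are left unscaled for brevity, whereas real data would normally be standardized first.

```python
import random
from statistics import mean

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids

# Made-up customers: (age, monthly spend)
customers = [
    (20, 200), (22, 250), (25, 220),   # younger, lower spend
    (60, 900), (65, 950), (58, 1000),  # older, higher spend
]

labels, centroids = kmeans(customers, k=2)

# Per-cluster analysis: summarize each segment separately.
for c in range(2):
    segment = [p for p, label in zip(customers, labels) if label == c]
    avg_spend = mean(spend for _, spend in segment)
    print(f"cluster {c}: {len(segment)} customers, average spend {avg_spend:.0f}")
```

Each resulting segment could then be targeted with its own ads or studied on its own, as described in the list above.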

KMeans