Module #3 Assignment


The images below show the compiled results of working code that I entered into RStudio.
I was tasked with finding the mean, median, and mode (Central Tendency) for each set.
I was also tasked with finding the range, interquartile range, variance, and standard deviation (Variation) for each set.

Set#1:  10, 2, 3, 2, 4, 2, 5
Set#2:  20, 12, 13, 12, 14, 12, 15


Set#1:  10, 2, 3, 2, 4, 2, 5
First, I found the computations for Set1. For Central Tendency, the mean was 4, the median was 3, and the mode was 2. The code I put into RStudio used both the basic form of each function and the form with explicit parameters. I used both forms for better understanding and to demonstrate that they lead to the same answer: mean(Set1) and mean(Set1, trim = 0, na.rm = FALSE) are effectively the same when there are no missing values and no trimming is needed. I also want to mention that R does not have a basic function for the statistical mode like it does for mean and median (the built-in mode() function reports how an object is stored, not its most frequent value), so I had to create a function for this. Based on my previous experience with RStudio, I was able to create a function that returned the mode of 2. I break down and explain everything in this function below.

Set1 <- c(10, 2, 3, 2, 4, 2, 5)

# Return the most frequent value in x
find_mode <- function(x) {
  uniq_values <- unique(x)
  uniq_values[which.max(tabulate(match(x, uniq_values)))]
}
mode_value <- find_mode(Set1)
print(mode_value)  # 2

unique(x): This extracts the unique values from the dataset x. It ensures you're working with distinct values.
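match(x, uniq_values): This finds the position of each element of x within the vector of unique values.
tabulate(): This counts how many times each position appears, which is the number of occurrences of each unique value.
which.max(): This identifies the position of the largest count, which is the most frequent value (the mode). If two values are tied, it picks the one that appears first.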
The function then returns the unique value that occurs most frequently.
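
To make the breakdown above concrete, here is a small sketch that runs each step separately on Set1. The intermediate names (positions, counts, winner) are my own and are only there for illustration; they are not part of the original function.

```r
Set1 <- c(10, 2, 3, 2, 4, 2, 5)

uniq_values <- unique(Set1)              # 10 2 3 4 5
positions   <- match(Set1, uniq_values)  # 1 2 3 2 4 2 5
counts      <- tabulate(positions)       # 1 3 1 1 1
winner      <- which.max(counts)         # 2, the position of the largest count
mode_value  <- uniq_values[winner]       # the second unique value

print(mode_value)  # 2
```

The value 2 sits in the second position of the unique values and is counted three times, so it comes out as the mode.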

I then found the computations for Set1 for Variation. The range was 8, the Interquartile Range was 2.5, the variance was 8.33, and the Standard Deviation was 2.89. For the range, the basic function range(Set1) did not give a single number; it returns the minimum and maximum together ("2 10"). So I wrote my own expression: print(max(Set1, na.rm=TRUE)-min(Set1, na.rm=TRUE)). This gave me the proper answer of 8. For the Interquartile Range I used the basic function IQR(Set1), and for the Variance I used var(Set1). For the Standard Deviation I also used the basic function, sd(Set1).
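
As a quick sketch of the Variation calculations described above, the calls can be run together like this. The variable names (set1_range and so on) are just labels I chose for illustration, and diff(range(Set1)) is shown only as an alternative way to get the same range; it is not what I used in the assignment.

```r
Set1 <- c(10, 2, 3, 2, 4, 2, 5)

range(Set1)  # returns the minimum and maximum together: 2 10

# Range as a single number: largest value minus smallest value
set1_range     <- max(Set1, na.rm = TRUE) - min(Set1, na.rm = TRUE)
set1_range_alt <- diff(range(Set1))  # alternative form, same result

set1_iqr <- IQR(Set1)  # uses R's default quantile method
set1_var <- var(Set1)  # sample variance (divides by n - 1)
set1_sd  <- sd(Set1)   # square root of the sample variance

print(c(range = set1_range, iqr = set1_iqr,
        var = round(set1_var, 2), sd = round(set1_sd, 2)))
```

The printed values should match the ones reported above: 8, 2.5, 8.33, and 2.89.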


Set#2:  20, 12, 13, 12, 14, 12, 15
Now I will discuss the computations for Set2. For Central Tendency, the mean was 14, the median was 13, and the mode was 12. For Variation, the range was 8, the Interquartile Range was 2.5, the variance was 8.33, and the Standard Deviation was 2.89. I used the same code that I used for Set1 to find the answers for Set2: the basic form of functions, functions with parameters, and my own functions where no basic function gave the right answer.
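
Here is one way the same calculations could be bundled and applied to Set2. This is only a sketch: describe_set is a hypothetical helper name I made up for this example, not something from the assignment, and find_mode is repeated so the block runs on its own.

```r
Set2 <- c(20, 12, 13, 12, 14, 12, 15)

# Same mode function used for Set1
find_mode <- function(x) {
  uniq_values <- unique(x)
  uniq_values[which.max(tabulate(match(x, uniq_values)))]
}

# describe_set is a hypothetical helper that collects every statistic at once
describe_set <- function(x) {
  c(mean     = mean(x),
    median   = median(x),
    mode     = find_mode(x),
    range    = max(x) - min(x),
    iqr      = IQR(x),
    variance = var(x),
    sd       = sd(x))
}

set2_stats <- round(describe_set(Set2), 2)
print(set2_stats)

# The form with explicit parameters gives the same mean as the basic form
same_mean <- mean(Set2) == mean(Set2, trim = 0, na.rm = FALSE)
print(same_mean)  # TRUE
```

Calling describe_set(Set1) with Set1 defined the same way would give the Set1 results in the same layout.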


Compare and Contrast


Statistic                    Set1      Set2
Mean                         4.0       14.0
Median                       3.0       13.0
Mode                         2         12
Range                        8         8
Interquartile Range (IQR)    2.5       2.5
Variance                     8.33      8.33
Standard Deviation           2.89      2.89

I created a table with the computed answers for Set1 and Set2. As you can see from the table, Set1 and Set2 have the same Standard Deviation, Variance, IQR, and Range. What differs is the Mean, Median, and Mode. Each Central Tendency measure for Set2 is exactly 10 higher than for Set1, indicating larger values than Set1. The Variation is the same for both sets. The reason is that every value in Set2 is the matching value in Set1 plus 10, so the data are spread out in exactly the same way and are only shifted upward.
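
The comparison can also be checked directly in R. The sketch below confirms that Set2 is Set1 with 10 added to every value, and that adding a constant moves the center without changing the spread. The variable names are my own labels for illustration.

```r
Set1 <- c(10, 2, 3, 2, 4, 2, 5)
Set2 <- c(20, 12, 13, 12, 14, 12, 15)

# Every value in Set2 is the matching value in Set1 plus 10
is_shifted <- all(Set2 == Set1 + 10)
print(is_shifted)  # TRUE

# Central Tendency moves by the same constant
mean_shift   <- mean(Set2) - mean(Set1)      # 10
median_shift <- median(Set2) - median(Set1)  # 10

# Variation does not change, because the distances between values are the same
same_var <- isTRUE(all.equal(var(Set1), var(Set2)))
same_sd  <- isTRUE(all.equal(sd(Set1), sd(Set2)))
same_iqr <- isTRUE(all.equal(IQR(Set1), IQR(Set2)))

print(c(mean_shift = mean_shift, median_shift = median_shift))
print(c(same_var = same_var, same_sd = same_sd, same_iqr = same_iqr))
```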



























