Instructions for the major part of the assignment: the Word file, worth 18% of your final grade, that you submit to Turnitin.

**Overview:** You need to submit a Word file with the answers to 10 questions. The first 8 are about the datasets and the last question is a paraphrasing task (refer to pages 3 to 6).

**BUS105 Computing Assignment - King’s Own Institute Australia**

You will use your dataset and the automatic dataset summarizer to get the descriptive statistics that are used in questions 1 to 5 and the inferential statistics that are used in questions 6 to 8. To check you have correctly obtained your dataset, check both p-values are correct when you

The word count can be less than 1500 words if you are giving answers that demonstrate you have understood the material.

Summary of the datasets (questions 1 to 8, given on pages 3 to 6, are about the datasets)

**Dataset 1**

University XYZ gives out a survey to students in a statistics course.

The survey questions were:

1) Do you think the course is useful and do you understand why?

2) How many videos have you watched?

The questions and the students’ answers are a dataset.

**Dataset 2**

University XYZ gives out a survey to students in a statistics course.

The survey questions were:

1) What style of YouTube video do you prefer, chatty or direct?

2) Are you scared of maths?

3) How many videos did you watch?

The questions and the students’ answers are a dataset.

**Dataset 3**

Business XYZ is using videos to replace meetings to maintain social distancing. The duration of the video (in seconds) and an engagement score are recorded for many videos. The engagement score is low if people only watch the first part of the video.

**Question 1**

Paste dataset 1 into the dataset summarizer.

a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “course useful?” and “number of videos watched?” using the sample.

b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):

Difference between sample means x̄₁ – x̄₂

Difference between sample proportions p̂₁ – p̂₂

Correlation coefficient r
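To illustrate the first statistic in the list, the sketch below computes a difference between sample means for two groups. The numbers and the names `useful_yes` and `useful_no` are made up for illustration; they are not taken from any assignment dataset.

```python
from statistics import mean

# Hypothetical numbers of videos watched, split by the answer to "course useful?"
useful_yes = [6, 8, 5, 9, 7]
useful_no = [2, 4, 3, 3]


def difference_in_means(group_a, group_b):
    """Return the mean of group_a minus the mean of group_b."""
    return mean(group_a) - mean(group_b)


diff_means = difference_in_means(useful_yes, useful_no)
print(f"Difference between sample means: {diff_means:.2f}")
```

A difference far from zero suggests that, in the sample, the numerical variable tends to differ between the two groups.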

**Question 2**

Paste dataset 2 into the dataset summarizer.

a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Preferred style?” and “Scared of maths?” using the sample.

b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):

Difference between sample means x̄₁ – x̄₂

Difference between sample proportions p̂₁ – p̂₂

Correlation coefficient r
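To illustrate the second statistic in the list, the sketch below computes a difference between sample proportions for two groups. The answers and the names `chatty` and `direct` are hypothetical and only show the arithmetic.

```python
# Hypothetical answers to "Scared of maths?", split by preferred video style.
chatty = ["yes", "yes", "no", "yes", "no"]
direct = ["no", "yes", "no", "no"]


def sample_proportion(answers, target="yes"):
    """Return the fraction of answers equal to target."""
    return sum(1 for answer in answers if answer == target) / len(answers)


p_chatty = sample_proportion(chatty)
p_direct = sample_proportion(direct)
diff_props = p_chatty - p_direct
print(f"Difference between sample proportions: {diff_props:.2f}")
```

A difference close to zero suggests that, in the sample, the proportion is similar in both groups.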

**Question 3**

Paste dataset 3 into the dataset summarizer.

a) Paste the descriptive sample statistics and the scatter plot into the Word file. The descriptive statistics let you investigate the relationship between the variables “Duration?” and “Engagement score?” using the sample.

b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):

Difference between sample means x̄₁ – x̄₂

Difference between sample proportions p̂₁ – p̂₂

Correlation coefficient r

c) Predict the engagement score of a video with a duration of 600 seconds.
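One common way to make a prediction like the one in part (c) is a least-squares line built from summary statistics. The sketch below assumes the output gives r together with the mean and standard deviation of each variable. If the summarizer reports a slope and intercept directly, use those instead. All numbers here are hypothetical.

```python
def regression_line(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope, intercept) of the least-squares line from summary statistics."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical summaries: x = duration in seconds, y = engagement score.
slope, intercept = regression_line(r=-0.5, mean_x=400, sd_x=100, mean_y=60, sd_y=20)

duration = 600
predicted_score = intercept + slope * duration
print(f"Predicted engagement score at {duration} s: {predicted_score:.1f}")
```

A prediction is more trustworthy when the duration lies within the range of durations seen in the sample.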

**Question 4**

Use the output for question 1a.

a) Considering only the people who do not find the course useful, find the z score of the sample mean if you assume the population mean is µ=5 and the population standard deviation is σ=3.

b) Considering only the people who do find the course useful, find the z score of the sample mean if you assume the population mean is µ=5 and the population standard deviation is σ=3.
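The z score of a sample mean is z = (x̄ − µ) / (σ / √n), where n is the size of the group. The sketch below applies this formula with µ=5 and σ=3 from the question. The sample mean of 6.2 and the group size of 36 are hypothetical placeholders for the values in your own output.

```python
from math import sqrt


def z_score_of_sample_mean(sample_mean, n, mu, sigma):
    """Return (sample_mean - mu) divided by the standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error


# Hypothetical group summary: mean of 6.2 videos from 36 students.
z = z_score_of_sample_mean(sample_mean=6.2, n=36, mu=5, sigma=3)
print(f"z score of the sample mean: {z:.2f}")
```

Note that the standard deviation is divided by √n because the question asks about a sample mean, not a single observation.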

**Question 5**

a) Considering only the people who prefer the chatty style of video, find a 90% confidence interval for the proportion of people who are scared of maths.

b) Considering only the people who prefer the direct style of video, find a 90% confidence interval for the proportion of people who are scared of maths.
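A minimal sketch of a confidence interval for a proportion, assuming the usual normal-approximation interval p̂ ± z*·√(p̂(1 − p̂)/n) is the method intended here. The counts (20 of 50) are hypothetical, and the function name is only illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.90):
    """Return (low, high) for the normal-approximation interval for a proportion."""
    p_hat = successes / n
    # Critical value z*: about 1.645 for 90% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical group: 20 of 50 students say they are scared of maths.
low, high = proportion_confidence_interval(successes=20, n=50)
print(f"90% confidence interval: ({low:.3f}, {high:.3f})")
```

The interval is centred on the sample proportion, and it gets narrower as the group size n increases.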

**Question 6**

Paste dataset 1 into the dataset summarizer.

a) Paste in inferential statistics that measure evidence for the claim that there is a relationship between the variables “course useful?” and “number of videos watched?” if you consider the whole population.

b) Make suitable comments about the output in part (a).

c) Go back to the dataset summarizer and scroll down. Paste in the output for question 6c given below the inferential statistics and fill in the blank. Your number has to be a summary of a sample that would have a lower p-value.

**Question 7**

Paste dataset 2 into the dataset summarizer.

a) Paste in computer output that measures evidence for the claim that there is a relationship between the variables “preferred style?” and “scared of maths?” if you consider the whole population. **Hint:** Inferential statistics measure evidence for a claim.

b) Make suitable comments about the output in part (a).

c) Go back to the dataset summarizer and scroll down. Paste in the output for question 7c given below the inferential statistics and fill in the blanks. Your number has to be a summary of a sample that would have a lower p-value.

**Question 8**

Paste dataset 3 into the dataset summarizer.

a) Paste in computer output that measures evidence for the claim that there is a relationship between the variables “Duration?” and “engagement score?” if you consider the whole population. **Hint:** Inferential statistics measure evidence for a claim.

b) Make suitable comments about the output in part (a).

c) If another sample had a higher correlation, would you expect the p-value to be lower or higher?
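One common test statistic for a correlation is t = r·√(n − 2) / √(1 − r²), where n is the number of pairs. Whether the dataset summarizer uses this exact test is an assumption. The sketch below only shows how the statistic is computed, using hypothetical values of r and n.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """Return t = r * sqrt(n - 2) / sqrt(1 - r**2) for a sample correlation r."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical sample: correlation 0.6 from 27 (duration, engagement) pairs.
t = correlation_t_statistic(r=0.6, n=27)
print(f"t statistic: {t:.2f}")
```

For a fixed sample size, a t statistic further from zero corresponds to a smaller two-sided p-value.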

**Question 9**

Briefly discuss the sample report given in the link below (you need to click download; logging in will not work). In particular, discuss the data, how the data was analysed, the main message of the report, and how it is communicated. Do not cut and paste text and use a computer to randomly change the words.

**Question 10**

Give a quick comment about the discussion of p-values given in the link. For each

**Upload the Word file to the Turnitin link on Moodle**

**Instructions for the Excel file**

This is worth 2% of your final grade.

You have to use the Excel commands discussed below and not the dataset summarizer. However, you should check that your summaries are the same as the output from the dataset summarizer you used in the Word file.

If you have different information, you will get at most 1 out of 2.

You need to cut and paste just your dataset into a new Excel file and follow the 4 instructions below. DO NOT use a cover page for the Excel file. You must check that you have the correct sample.

Note that you can still do this at home even if you do not have Excel; just use Google Sheets.

A) Select all of dataset 1 and use Excel Pivot Table commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “course useful?” and “number of videos watched?”.

B) Select all of dataset 2 and use Excel Pivot Table commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “preferred style?” and “scared of maths?”.

C) Select all of dataset 3 and use Excel commands to make a graph that lets you investigate the relationship between the fields (variables) “duration?” and “engagement score?”.
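For two categorical fields such as those in instruction B, a pivot table boils down to a two-way table of counts. The sketch below builds the same kind of table outside Excel, purely as a cross-check. The submission itself still has to use the pivot table commands. The responses are hypothetical.

```python
from collections import Counter

# Hypothetical (preferred style, scared of maths) answers, one pair per student.
responses = [
    ("chatty", "yes"), ("chatty", "no"), ("chatty", "yes"),
    ("direct", "no"), ("direct", "no"), ("direct", "yes"),
    ("chatty", "yes"), ("direct", "no"),
]

# Count each (style, answer) combination, and the total per style.
counts = Counter(responses)
row_totals = Counter(style for style, _ in responses)

for style in sorted(row_totals):
    yes = counts[(style, "yes")]
    total = row_totals[style]
    print(f"{style}: {yes} of {total} scared of maths ({yes / total:.0%})")
```

The per-row proportions are the sample proportions that a pivot table showing counts as a percentage of the row total would display.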