**Subject Code & Title :** BUS105 Computing **Assessment Type :** Assignment

Instructions for the computing assignment: a Word file worth 18% of your final grade and an Excel file worth 2% of your final grade.

**Overview :** The following materials must be used in the assignment. They are provided on Moodle.

i. An Excel file with the data sets for all the students. Each student must follow the instructions and get 4 data sets using their student number, so each student will have different data sets.

ii. Three types of data set summarizer:

an Excel file that automatically summarizes a data set with 2 quantitative variables,

an Excel file that automatically summarizes a data set with 2 categorical variables,

an Excel file that automatically summarizes a data set with 1 categorical and 1 quantitative variable.

**BUS105 Computing Assignment – King’s Own Institute Australia.**

The following material briefly discusses where to find the material for the assignment

**Submission :**

1. Students must submit a Word file (worth 18%) AND an Excel file (worth 2%) to Moodle.

2. The Word file needs to be submitted to the Turnitin link. Instructions are given on page 6.

3. The Word file needs a cover page.

4. The Word file needs the answers to the 9 questions given in full detail later in this document. A vital part of answering the questions is using the data sets and the data set summarizers.

5. The Excel file needs to be submitted to the assignment dropbox.

6. The Excel file should have the student’s 3 data sets and summaries NOT made by the automatic data set summarizer. Summarize the data sets using Pivot Tables and a scatter plot. (Instructions for submitting the Excel file are given on page 8.)

The computing assignment also includes 5 preparation quizzes worth 1% each. These preparation quizzes are on Moodle.

Instructions for the major part of the assignment: the Word file, worth 18% of your final grade, which you submit to Turnitin.

**Overview :**

You need to submit a Word file with the answers to 9 questions. The first 8 questions are about the data sets; the last question is a paraphrasing task.

You will use your data sets and the automatic data set summarizers to get the descriptive statistics used in questions 1 to 5 and the inferential statistics used in questions 6 to 8. To check that you have correctly obtained your data sets, check that both p-values are correct when you investigate both categorical variables (questions 6 to 8). There will be videos on Moodle explaining how to check that you have properly obtained your sample.

**Summary of the datasets (questions 1 to 8 are about the datasets)**

**Data set 1**

Results for version 1 of a pregnancy test

The variables are “Reality, pregnant or not pregnant?” and “Test result, positive or negative?”

**Data set 2**

Results for version 2 of a pregnancy test

The variables are “Reality, pregnant or not pregnant?” and “Test result, positive or negative?”

**Data set 3**

Daily flight cancelations at airline ABC

The variables are:

“Which country?” (Country A or Country B)

“Number of tests in one week”

“Number of people needing tech support”

**Question 1 :**

a) Paste data set 1 into an appropriate data set summarizer, then paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Reality, pregnant or not pregnant?” and “Test result, positive or negative?” using the sample. This lets you check the accuracy of version 1 of the pregnancy test.

b) Use part a) to describe the relationship between the two variables using one of the following numbers. Choose the correct option:

1. The difference between sample means –

2. The difference between sample proportions –

3. The correlation coefficient

Your description of the relationship between the variables should also describe the relationship using plain English

c) Paste data set 2 into an appropriate data set summarizer, then paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Reality, pregnant or not pregnant?” and “Test result, positive or negative?” using the sample. This lets you check the accuracy of version 2 of the pregnancy test.

d) Use the answer to part c) to describe the relationship between the two variables using one of the following numbers. Choose the correct option:

1. The difference between sample means –

2. The difference between sample proportions –

3. The correlation coefficient r

Your description of the relationship between the variables should also describe the relationship using plain English

e) Which version of the pregnancy test is better, version 1 or version 2? Give a reason for your answer. You can use the answers to parts b) and d) to decide which version is better. You do not have to decide which is worse, false positives or false negatives.
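As a point of reference for the second option in the lists above, the sketch below shows how a difference between sample proportions can be computed from a sample of two categorical variables. The data, the variable names and the `positive_rate` helper are all hypothetical and chosen only for illustration; they are not taken from any student data set or from the data set summarizer.

```python
from collections import Counter

# Hypothetical sample of (reality, test result) pairs, for illustration only.
sample = (
    [("pregnant", "positive")] * 45
    + [("pregnant", "negative")] * 5
    + [("not pregnant", "positive")] * 10
    + [("not pregnant", "negative")] * 40
)

def positive_rate(rows, reality):
    """Proportion of rows with the given reality whose test result is positive."""
    counts = Counter(result for actual, result in rows if actual == reality)
    return counts["positive"] / sum(counts.values())

p_pregnant = positive_rate(sample, "pregnant")          # 45 out of 50
p_not_pregnant = positive_rate(sample, "not pregnant")  # 10 out of 50
difference = p_pregnant - p_not_pregnant

print(f"Proportion positive among pregnant:     {p_pregnant:.2f}")
print(f"Proportion positive among not pregnant: {p_not_pregnant:.2f}")
print(f"Difference between sample proportions:  {difference:.2f}")
```

In plain English, a large positive difference here would mean that pregnant women in the sample test positive far more often than women who are not pregnant.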

**Question 2 :**

Paste the first two variables of data set 3 into a data set summarizer

a) Paste the descriptive sample statistics below. The descriptive statistics let you investigate the relationship between the variables “Which country?” and “Number of tests?” using the sample.

b) Use the answer to part a) to describe the relationship by using one of the following numbers. Select the correct option:

i. The difference between sample means –

ii. The difference between sample proportions –

iii. The correlation coefficient r

You should also describe the relationship in plain English

c) Paste in the graph that shows the predicted shape of the histograms if the variables are normally distributed and compare the centres and the spreads.

d) Suppose you know the quantitative variable is normally distributed for both groups. Make a comment about part c).
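The comparison of centres and spreads in part c) can be sketched numerically as follows. This is a minimal illustration with hypothetical weekly counts; the lists `country_a` and `country_b` are invented for the example, and the data set summarizer may present its output differently.

```python
from statistics import mean, stdev

# Hypothetical weekly test counts for each country (illustrative values only).
country_a = [120, 135, 128, 140, 132]
country_b = [90, 105, 98, 110, 97]

# Centres: the sample means of the two groups.
mean_a, mean_b = mean(country_a), mean(country_b)
# Spreads: the sample standard deviations of the two groups.
sd_a, sd_b = stdev(country_a), stdev(country_b)

difference_in_means = mean_a - mean_b

print(f"Country A: mean = {mean_a:.1f}, standard deviation = {sd_a:.2f}")
print(f"Country B: mean = {mean_b:.1f}, standard deviation = {sd_b:.2f}")
print(f"Difference between sample means = {difference_in_means:.1f}")
```

The means describe where each predicted histogram is centred, and the standard deviations describe how wide each one is.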

**Question 3**

Paste the last two variables of data set 3 into a data set summarizer

a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Number of tests?” and “Number of people needing tech support?” using the sample. Paste in the graph as well.

b) Looking at the graph, does there appear to be one linear relationship or two linear relationships?

c) Repeat part a), but this time only paste in the information from country A. Still select the columns “Number of tests?” and “Number of people needing tech support?” but do NOT select any of the rows that are from country B.

d) Using the output from part c), describe the relationship between the variables using one of the following numbers. Select the correct option:

* The difference between sample means –

* The difference between sample proportions –

* The correlation coefficient r

Your description of the relationship should also include some plain English.

e) Using the information in part c), write an equation that lets you predict the number of people needing tech support, Y, given the number of tests.

f) Use the equation from part e) to predict the number of people needing tech support if the number of tests is 1000.
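Parts d) to f) rely on a correlation coefficient and a prediction equation. The sketch below shows one standard way to obtain both, a least-squares line, using hypothetical Country A values. The `least_squares` helper and the numbers are assumptions made for the example; the summarizer's own output should be used for the actual answers.

```python
from math import sqrt
from statistics import mean

# Hypothetical Country A rows (illustrative values only).
tests = [200, 400, 600, 800, 1000]   # number of tests
support = [12, 22, 29, 41, 50]       # number of people needing tech support

def least_squares(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = least_squares(tests, support)
predicted = intercept + slope * 1000  # prediction when the number of tests is 1000

print(f"Correlation coefficient r = {r:.4f}")
print(f"Y = {intercept:.2f} + {slope:.4f} * (number of tests)")
print(f"Predicted Y for 1000 tests = {predicted:.1f}")
```

A value of r close to 1 would indicate a strong positive linear relationship: more tests go with more people needing tech support.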


**Question 4**

Note that you need the output from question 2 to answer this question

**a) Just considering the information from country A**

i) What is the estimate of the population mean number of tests?

ii) What is the standard error of this estimate?

**b) Just considering the information from country B**

i) What is the estimate of the population mean number of tests?

ii) What is the standard error of this estimate?
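For orientation, the usual estimate of a population mean is the sample mean, and its standard error is the sample standard deviation divided by the square root of the sample size. The sketch below applies this to hypothetical values; the `mean_and_standard_error` helper is an illustrative name, and the answers to this question should come from the question 2 output.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_standard_error(values):
    """Return the sample mean and its standard error, s / sqrt(n)."""
    n = len(values)
    return mean(values), stdev(values) / sqrt(n)

# Hypothetical weekly test counts for one country (illustrative values only).
country_a = [120, 135, 128, 140, 132]
estimate, standard_error = mean_and_standard_error(country_a)

print(f"Estimate of the population mean = {estimate:.1f}")
print(f"Standard error of the estimate  = {standard_error:.3f}")
```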

**Question 5**

Note that you need the output from question 1 to answer this question

a) For version 1 of the test find a 95% confidence interval for the proportion of pregnant women that test positive

b) For version 2 of the test find a 95% confidence interval for the proportion of pregnant women that test positive
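One common way to build such an interval is the normal-approximation formula: the sample proportion plus or minus a critical value times the square root of p(1 - p)/n. The sketch below assumes this formula and uses hypothetical counts. The summarizer may use a different method, so treat `proportion_confidence_interval` as an illustration only.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% level
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical counts: 45 of 50 pregnant women in the sample test positive.
low, high = proportion_confidence_interval(45, 50)
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```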

**Question 6**

Paste data set 1 into an appropriate data set summarizer

a) Paste in the computer output that measures evidence for the claim that there is a relationship between the variables “Reality, pregnant or not pregnant?” and “Test result, positive or negative?” if you consider the whole population.

b) Comment on the confidence interval

c) Comment on the p-value
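To make the confidence interval and p-value in this question more concrete, the sketch below runs a two-proportion z-test on hypothetical counts. The choice of test is an assumption, since the summarizer's method is not stated here, and the `two_proportion_test` helper and its inputs are invented for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_test(x1, n1, x2, n2, level=0.95):
    """Return (difference, confidence interval, two-sided p-value) for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # The interval uses the unpooled standard error.
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    interval = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    # The test assumes no relationship, so it pools the two groups.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, interval, p_value

# Hypothetical counts: 45 of 50 pregnant and 10 of 50 not pregnant test positive.
diff, interval, p_value = two_proportion_test(45, 50, 10, 50)
print(f"Difference = {diff:.2f}")
print(f"95% confidence interval = ({interval[0]:.3f}, {interval[1]:.3f})")
print(f"p-value = {p_value:.3g}")
```

An interval that excludes 0 and a small p-value both point to evidence of a relationship in the whole population.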


**Question 7**

Paste the first two variables of data set 3 into an appropriate data set summarizer

a) Paste in the inferential statistics that measure evidence for the claim that there is a relationship between the variables “Which country?” and “Number of tests?” if you consider the whole population.

b) Comment on the confidence interval

c) Comment on the p-value
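The sketch below illustrates the kind of output this question refers to, using a large-sample normal approximation for the difference between two means. This is an assumption: the summarizer may use a t-distribution instead, and the summary statistics passed to the hypothetical `difference_in_means_inference` helper are invented.

```python
from math import sqrt
from statistics import NormalDist

def difference_in_means_inference(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Large-sample interval and two-sided p-value for mean1 - mean2."""
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    interval = (diff - z_crit * se, diff + z_crit * se)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, se, interval, p_value

# Hypothetical summary statistics for country A and country B.
diff, se, interval, p_value = difference_in_means_inference(131, 8, 64, 128, 8, 64)
print(f"Difference between sample means = {diff}")
print(f"Standard error = {se:.3f}")
print(f"95% confidence interval = ({interval[0]:.2f}, {interval[1]:.2f})")
print(f"p-value = {p_value:.4f}")
```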

**Question 8**

Paste the last two variables of data set 3 into an appropriate data set summarizer

a) Paste in the computer output that measures evidence for the claim that there is a relationship between the variables “Number of tests?” and “Number of people needing tech support?” if you consider the whole population.

**Hint:** inferential statistics measure evidence for a claim.

b) Just using the information from country A, paste in the computer output that measures evidence for the claim that there is a relationship between the variables “Number of tests?” and “Number of people needing tech support?” if you consider the whole population.

c) Which case has a lower standard error: the output from part a) or the output from part b)?

d) In both part a) and part b) the computer is trying to find a single linear relationship between the variables. Based on your previous work, in which case is the output trustworthy?

e) Comment on the confidence interval in part b)

f) Comment on the p-value in part b)
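The standard error compared in part c) can be illustrated with the standard error of a least-squares slope. The sketch below is a minimal example on hypothetical data; `slope_inference` is an illustrative helper, and the summarizer may report a different standard error (for example, the residual standard error). It stops at the t statistic because the p-value would require a t-distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean

def slope_inference(x, y):
    """Return (slope, standard error of the slope, t statistic) for a least-squares line."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_sd = sqrt(sse / (n - 2))
    se_slope = residual_sd / sqrt(sxx)
    return slope, se_slope, slope / se_slope

# Hypothetical Country A rows (illustrative values only).
tests = [200, 400, 600, 800, 1000]
support = [12, 22, 29, 41, 50]

slope, se_slope, t_stat = slope_inference(tests, support)
print(f"Slope = {slope:.4f}, standard error = {se_slope:.5f}, t = {t_stat:.1f}")
```

A smaller standard error means the points sit closer to a single straight line, which is why the comparison between the two cases is informative.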

**Question 9**

Paraphrase one or more of the concepts in one or more of the videos from the list on the next page and explain how the concept (or concepts) is useful in business. A total of 400 words is enough. An easy way to keep the Turnitin match low is to give a brief overview of a few different videos, because it is easier to use your own words when you give a brief overview.

Include screenshots of the video and explain how the image helps explain the message in the video.

An example of a screenshot from a video from 2 semesters ago

An example of explaining the screenshot:

The video was talking about how inferential statistics involves taking a sample to make an estimate about the population. In the screenshot above you can see a lady taking a sample of the chips to test the fat and salt content, and this can be used to make an estimate for the whole population.

**Instructions for the excel file**

This is worth 2% of your final grade.

You have to use the Excel commands discussed below and not the data set summarizer. However, you should check that your summaries are the same as the output from the data set summarizer you used in the Word file.

If you have different information you will get at most 1 out of 2.


You need to cut and paste just your data sets into a new Excel file and follow the instructions below. DO NOT use a cover page for the Excel file. You must check that you have the correct sample.

Note that you do not have to use Excel to make the summaries; you can use Google Sheets.

A) Select all of data set 1 and use Excel Pivot Table commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Reality, pregnant or not pregnant?” and “Test result, positive or negative?”


B) Select all of the first two variables of data set 3 and use Excel Pivot Table commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Which country?” and “Number of tests in one week?”

C) Select the last 2 variables of data set 4 and use Excel commands to make a graph that lets you investigate the relationship between the fields (variables) “Number of tests in one week?” and “Number of people needing tech support?”

D) Upload the Excel file with the pivot tables and scatter plot to the assignment dropbox.
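To cross-check a pivot table of two categorical fields, the same two-way table of counts can be built by hand. The sketch below is only an illustration of what such a pivot table contains; the rows and the `two_way_counts` helper are hypothetical, and the Excel or Google Sheets pivot table is still what must be submitted.

```python
from collections import Counter

# Hypothetical rows of (reality, test result), for illustration only.
rows = [
    ("pregnant", "positive"),
    ("pregnant", "positive"),
    ("pregnant", "negative"),
    ("not pregnant", "negative"),
    ("not pregnant", "negative"),
    ("not pregnant", "positive"),
    ("pregnant", "positive"),
    ("not pregnant", "negative"),
]

def two_way_counts(pairs):
    """Count rows for every combination of the two fields, like a pivot table of counts."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    return {row: {col: counts[(row, col)] for col in col_labels} for row in row_labels}

table = two_way_counts(rows)
for reality, results in table.items():
    print(reality, results, "row total:", sum(results.values()))
```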