# R Studio Dataset Assignment

Unit 2 Assignment: Basic Statistics and Visualizations in R

Unit Outcomes:

Associate a business problem with data that could be used to address it.

Distinguish between data that is relevant to a business problem and data that is not.

Explain formulas, statistical methods, and simulations used for business data analytics.

Course Outcome:

IT527-2: Describe the quality and formatting of datasets used in investigating business problems.

Purpose

Analytics begins with basic statistical analysis and the ability to visualize data in ways that answer questions. This Assignment requires you to use several functions and techniques in R Studio to analyze and visualize data and to interpret the results.

Assignment Instructions

The Unit 2 Assignment will give you an opportunity to practice some of the analytics skills you learned in your Reading this week, and also to reflect on that learning. To fulfill the Unit 2 Assignment, complete the following steps:

Download the Loan Applicants comma-separated values (CSV) file from Course Documents. Import this dataset into a data frame called Loans in R Studio. Place a screenshot of the data successfully loaded in R Studio into a Word document and label it appropriately.
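
The import step can be sketched as follows. This is only a minimal illustration: the temporary file and the column names (`Applicant.ID`, `Credit.Score`, `Monthly.Income`, `Make.Loan`) are placeholders invented so the snippet runs on its own, and the real file from Course Documents may be named and structured differently.

```r
# Hypothetical stand-in for the downloaded file: write a tiny CSV to a
# temporary location so this sketch is self-contained. The column names
# and values are placeholders, not the real Loan Applicants data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Applicant.ID,Credit.Score,Monthly.Income,Make.Loan",
  "1,710,4200,Yes",
  "2,640,3100,No",
  "3,585,2500,No"
), csv_path)

# Import into a data frame named Loans. For the real assignment, replace
# csv_path with the path to the downloaded CSV file.
Loans <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# Quick checks that the import worked: structure and first rows.
str(Loans)
head(Loans)
```

In R Studio, running `View(Loans)` after the import opens the data frame in the data viewer, which is one possible source for the screenshot.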

Create histograms of the Number of Missed/Late Payments and Monthly Income attributes. Place screenshots of these two histograms into your Word document, labeled appropriately. Explain how a data analyst would interpret these histograms. What do they mean when you look at them? Refer to section 6.3 of the textbook if you need help.
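
One way the histogram step might look in base R is sketched below. The data here are randomly generated placeholders, and the column names `Missed.Late.Payments` and `Monthly.Income` are assumptions; check `names(Loans)` for the actual names after importing the real file.

```r
# Placeholder data standing in for the Loans data frame. The column names
# and distributions are assumptions made only so the example runs.
set.seed(1)
Loans <- data.frame(
  Missed.Late.Payments = rpois(200, lambda = 2),
  Monthly.Income = round(rlnorm(200, meanlog = 8, sdlog = 0.4))
)

# hist() draws the plot and invisibly returns the bin breaks and counts,
# which can help when describing the shape of the distribution in words.
h_missed <- hist(Loans$Missed.Late.Payments,
                 main = "Number of Missed/Late Payments",
                 xlab = "Missed/late payments")
h_income <- hist(Loans$Monthly.Income,
                 main = "Monthly Income",
                 xlab = "Monthly income")

# Bin counts for the first histogram.
h_missed$counts
```

When interpreting a histogram, analysts typically look at where most observations fall, whether the distribution is symmetric or skewed, and whether any bars sit far from the rest.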

Create a boxplot of the Reliability attribute. Place a screenshot of this boxplot in your Word document. Explain how a data analyst would interpret the boxplot. What does it mean when you look at it? Refer to section 6.5 of the textbook if you need help.
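
A boxplot of a single numeric attribute could be produced along these lines. The `Reliability` column name and the generated values, including the two deliberately extreme points, are assumptions used purely for illustration.

```r
# Placeholder data: 98 typical values plus two deliberately extreme ones,
# so the plot shows what outliers look like. Not the real Loans data.
set.seed(2)
Loans <- data.frame(
  Reliability = c(rnorm(98, mean = 50, sd = 10), 5, 110)
)

# boxplot() draws the plot and returns the statistics it used.
b <- boxplot(Loans$Reliability,
             main = "Reliability",
             ylab = "Reliability")

# b$stats: lower whisker, lower hinge, median, upper hinge, upper whisker.
b$stats
# b$out: points drawn individually beyond the whiskers.
b$out

# summary() gives the quartiles and mean for cross-checking the plot.
summary(Loans$Reliability)
```

The box spans the middle half of the data, the line inside it marks the median, and individually drawn points beyond the whiskers are candidates for closer inspection.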

Compute a standard Pearson correlation matrix for all attributes except Applicant ID and Make Loan. Place a screenshot of your correlation matrix in your Word document and explain which two pairs of variables are the most strongly correlated and how you know. Be sure to use the correlation matrix results in your explanation.
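
The correlation step might be sketched as below, assuming the excluded columns are named `Applicant.ID` and `Make.Loan` and that the remaining attributes are numeric. All data and column names here are placeholders.

```r
# Placeholder data with a built-in relationship between credit score and
# income. Column names are assumptions, not the real file's headers.
set.seed(3)
n <- 100
credit <- rnorm(n, mean = 650, sd = 50)
Loans <- data.frame(
  Applicant.ID = 1:n,
  Credit.Score = credit,
  Monthly.Income = 3000 + 5 * (credit - 650) + rnorm(n, sd = 400),
  Missed.Late.Payments = rpois(n, lambda = 2),
  Make.Loan = ifelse(credit > 650, "Yes", "No")
)

# Keep every column except the identifier and the loan decision.
num_cols <- setdiff(names(Loans), c("Applicant.ID", "Make.Loan"))

# Pearson correlation matrix for the remaining attributes.
cor_matrix <- cor(Loans[, num_cols], method = "pearson")
round(cor_matrix, 2)

# Locate the strongest off-diagonal correlation by absolute value.
off_diag <- cor_matrix
diag(off_diag) <- NA
which(abs(off_diag) == max(abs(off_diag), na.rm = TRUE), arr.ind = TRUE)
```

Coefficients close to 1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one. The diagonal is always 1 because each attribute correlates perfectly with itself.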

Run an independent-samples t-test using the Make Loan attribute as the two-level grouping attribute and Credit Score as the measured (dependent) attribute. Place a screenshot of your t-test results in your Word document and write an explanation of the results. What does the t-test result mean in terms of Credit Score’s influence on Make Loan?
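
A two-group comparison of this kind can be sketched with base R's `t.test()` and its formula interface. The data and the column names `Make.Loan` and `Credit.Score` are placeholders. Note that `t.test()` runs Welch's unequal-variance test by default; passing `var.equal = TRUE` gives the classic Student's t-test instead.

```r
# Placeholder data: two groups of 50 applicants with different average
# credit scores. Not the real Loans data.
set.seed(4)
Loans <- data.frame(
  Make.Loan = rep(c("Yes", "No"), each = 50),
  Credit.Score = c(rnorm(50, mean = 700, sd = 40),
                   rnorm(50, mean = 620, sd = 40))
)

# Compare mean Credit.Score between the two Make.Loan groups.
# The formula reads: measured variable ~ grouping variable.
tt <- t.test(Credit.Score ~ Make.Loan, data = Loans)
tt

# Pieces that are useful when writing up the result.
tt$estimate   # mean credit score in each group
tt$p.value    # probability of a difference this large if the means were equal
tt$conf.int   # confidence interval for the difference in means
```

A small p-value (commonly below 0.05) suggests that the difference in mean credit score between the two groups is unlikely to be due to chance alone.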

Assignment Requirements

Prepare your Assignment submission in Microsoft Word following standard APA formatting guidelines: double-spaced, Times New Roman 12-point font, and one-inch margins on all sides. Include a title page, table of contents, and references page. You do not need to write an abstract. Label all tables and figures. Cite sources appropriately both in the text of your writing (parenthetical citations) and on your references page (full APA citation format).

For more information on APA style formatting, go to APA Style Central under Academic Tools of this course.

Name your Assignment document according to this convention: Unit2-<yourname>.docx. Submit your completed Assignment to the Unit 2 Dropbox by the deadline.

