The goal of the project is to develop a statistical model for a specific business problem and provide a thorough analysis of it.

Possible topics: prediction of prices for a specific stock/bitcoin/commodity/futures/options; prediction of a country’s GDP level using market indicators; prediction of auto sales; analysis of bond liquidity, etc.

You should choose a topic that interests you, construct a solid multivariable regression model by brainstorming which factors may be relevant to the prediction, obtain data, and use R to carry out the statistical analysis.

You are expected to start with a sufficiently large data set (I expect 20-25 factors) so that you can determine which factors are significant for the model. Including categorical variables along with continuous variables is strongly encouraged.
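As a starting point, a model that mixes continuous and categorical factors might be set up roughly as follows. This is only a sketch under stated assumptions: the built-in mtcars data set stands in for your own data, and the chosen response and factors are placeholders, not a recommended specification.

```r
# Illustrative only: mtcars ships with base R and stands in for your own data.
data(mtcars)
cars <- mtcars

# Encode categorical predictors as factors so that lm() builds dummy variables.
cars$cyl <- factor(cars$cyl)                          # 4, 6 or 8 cylinders
cars$am  <- factor(cars$am, levels = c(0, 1),
                   labels = c("automatic", "manual")) # transmission type

# Multivariable regression: continuous (wt, hp, disp) plus categorical (cyl, am).
full_model <- lm(mpg ~ wt + hp + disp + cyl + am, data = cars)

# Coefficient estimates, t-tests, R-squared and the overall F-test.
summary(full_model)
```

In your project the formula would list your own 20-25 factors; the point here is only that categorical variables should be converted with factor() before fitting.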

Check the assumptions of the model by carrying out residual analysis (see the R video on that in particular) and provide recommendations if there are significant violations. Test relevant hypotheses. Remove insignificant variables and analyze the new model using R tools.

I can provide guidance and advice during the course of the project, if needed. Please ask your questions during class/office hours or by email.

You will need to submit a report covering the background information (the topic of research), the objective of your research, a description of the model followed by a rigorous statistical analysis, and conclusions.
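For the residual analysis, hypothesis testing, and removal of insignificant variables, one possible workflow in base R is sketched below. It is an illustration under assumptions, not the required procedure: mtcars again stands in for your data, and the Shapiro-Wilk test, the partial F-test, and AIC-based backward elimination are just examples of tools you might choose.

```r
# Illustrative only: mtcars stands in for your own data set.
data(mtcars)
cars <- mtcars
cars$cyl <- factor(cars$cyl)
full_model <- lm(mpg ~ wt + hp + disp + cyl, data = cars)

# Residual diagnostics: residuals vs fitted, normal Q-Q, scale-location,
# and residuals vs leverage, drawn in a 2 x 2 grid.
op <- par(mfrow = c(2, 2))
plot(full_model)
par(op)

# One formal check of the normality assumption for the residuals.
sw <- shapiro.test(residuals(full_model))
print(sw)

# Partial F-test: does dropping disp significantly worsen the fit?
reduced_model <- update(full_model, . ~ . - disp)
f_test <- anova(reduced_model, full_model)
print(f_test)

# Backward elimination by AIC as one way to remove weak variables.
step_model <- step(full_model, direction = "backward", trace = 0)
summary(step_model)
```

Whichever tools you use, rerun the diagnostics on the reduced model and report both the plots and your interpretation of them.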

There is no requirement on the size of the final report, but typically it should be 10-15 pages plus an Appendix with R code. Make sure to indicate all additional resources you use, including the data sources, in the reference list. Include your R code in the Appendix of the report, that is, in the PDF file.

The cover page should contain your name, an abstract, and the statement: “I have neither given nor received help (apart from the instructor) to complete this assignment” with your signature.

An abstract is a concise summary of a research paper or entire thesis. It is an original work, not an excerpted passage. An abstract must be fully self-contained and make sense by itself, without further reference to outside sources or to the actual paper. I require the abstract to be no more than 400 words. The abstract is essentially a short presentation of your work that you can use, for example, during the job interview process.

Be very critical about all work you submit. Presentation counts!

Upload the files on Canvas (a PDF file with the final report and all necessary data and .R files).

The instructor reserves the right to ask any student about the solution process, and if it is clear that the student does not understand the material, they will receive a grade of 0 on the assignment. Copying the solutions or parts of them from others, or posting them online, will cause serious issues and will be reported to the Office of Academic Integrity.


The rubric:
Cover page: 1 pt
Abstract: 5 pts
Background information: 10 pts
Description of factors (20-25 requested): 10 pts
Discussion of the assumptions of the model backed up by graphs and analysis: 30 pts
Recommendations in case of violation of the assumptions: 10 pts
Statistical analysis (hypothesis testing): 25 pts
Proper citation and reference list: 5 pts
Presentable source files and code uploaded on Canvas: 4 pts

The project grade is the sum of the points above.