Section content descriptions:
Intro/Background: Here it is important to describe the context of your problem and any previous studies, and then
state your aim/motivation. Your aim could be deductive, whereby you have a hypothesis that you would like to test
using one of the data mining methods covered in the course. You may also have an inductive aim, whereby you use
one of the covered methods to induce new hypotheses based on novel patterns identified by the algorithm. (Note: If you would like
to use a method that is not covered in this course, please consult your instructor first.)
The data set should come from data.gov.
Methods: In this section you will describe the preprocessing steps required to convert your data to the desired format, and then
describe the algorithm you used: what are its algorithmic steps? Discuss the efficiency of the algorithm: how computationally
expensive is it in terms of both space and time complexity?
(Use only simple algorithms such as linear regression or random forest regression.)
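As an illustration of what "algorithmic steps" and "efficiency" could look like in a write-up, here is a minimal sketch of simple (one-feature) linear regression fitted by ordinary least squares. The function name and the sample points are assumptions made for this example, not part of the assignment. For n data points this closed-form fit takes O(n) time and O(1) extra space beyond the input; with p features, solving the normal equations instead costs roughly O(n p^2 + p^3) time.

```python
from typing import Sequence, Tuple


def fit_simple_linear_regression(
    xs: Sequence[float], ys: Sequence[float]
) -> Tuple[float, float]:
    """Fit y ~ slope * x + intercept by ordinary least squares (hypothetical helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    # Step 1: compute the means of x and y (one pass each, O(n) time).
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 2: accumulate the co-deviation and squared-deviation sums (O(n) time).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    # Step 3: apply the closed-form least-squares solution (O(1) time).
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data only: four points lying exactly on y = 2x + 1.
slope, intercept = fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

In a report, each commented step could be described in prose alongside its cost, which answers both the "algorithmic steps" and the "space and time complexity" questions.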
Results: State your results. The full results will be listed in tables in the appendix; state only the most important ones in this section.
Discussion: Discuss your findings by interpreting the results, comparing your results to those of other studies (if applicable), and
identifying any limitations/weaknesses associated with your project.
Conclusion: Restate what your mining accomplished and what knowledge it added, if any. Then discuss what future work could be
done to build on your work.
We want to use a very simple data set: import the data, show some simple graphs, split it into training and test sets, apply a single algorithm such as linear regression or random forest regression, and draw some conclusions.
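The workflow described above could be sketched roughly as follows, using only the Python standard library. Everything about the data is an assumption: the synthetic CSV stands in for a file downloaded from data.gov, and the column names "feature" and "target" are hypothetical placeholders. Plotting is left out to keep the sketch dependency-free; a real project would likely use libraries such as pandas, matplotlib, and scikit-learn for these steps.

```python
import csv
import io
import math
import random

# ASSUMPTION: this synthetic CSV stands in for a file downloaded from data.gov.
# The column names "feature" and "target" are hypothetical placeholders.
rng = random.Random(0)  # fixed seed so the run is repeatable
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["feature", "target"])
for x in range(50):
    writer.writerow([x, 3 * x + 5 + rng.gauss(0, 2)])
buffer.seek(0)

# 1. Import the data and convert the text fields to numbers.
rows = [(float(r["feature"]), float(r["target"])) for r in csv.DictReader(buffer)]

# 2. Shuffle, then hold out 20% of the rows as a test set.
rng.shuffle(rows)
split = int(0.8 * len(rows))
train, test = rows[:split], rows[split:]

# 3. Fit simple linear regression on the training rows only.
mean_x = sum(x for x, _ in train) / len(train)
mean_y = sum(y for _, y in train) / len(train)
slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sum(
    (x - mean_x) ** 2 for x, _ in train
)
intercept = mean_y - slope * mean_x

# 4. Evaluate on the held-out test rows with root mean squared error.
rmse = math.sqrt(sum((y - (slope * x + intercept)) ** 2 for x, y in test) / len(test))
print(f"slope={slope:.2f} intercept={intercept:.2f} test RMSE={rmse:.2f}")
```

Fitting on the training rows and scoring on the held-out rows keeps the error estimate honest: the model is judged on data it has not seen, which is the basis for any conclusion drawn in the Results and Discussion sections.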