What is Required/Marking Criteria
- Description of Problem
This can be broken down into two parts:
1) What is the dataset? Where did it come from (give a reference), and what else do you know about it (background reading)?
2) What do you propose to do with it? Is it a matter of classification, or something more exploratory, e.g. clustering?
Please note that your problem does not have to be highly original. For example, you could choose to analyse hurricane data as a classification problem. The classification system is well known, but what matters is how you go about it, not what you find out.
- Analysis of the Data
1) What do you propose to do with the data?
2) What technique do you propose to use and why?
3) How do you propose to carry out your analysis?
- Interpretation of Results
OK, your data mining software has spat out the results, but what do they mean?
Option 2
Download the dataset dataset2.sav. The dataset contains (fictionalised) records of first-year students along with their first-year performance. Your brief is to analyse entry qualifications in order to advise the admissions tutor on how to achieve an intake of 200 students who are likely to achieve a 90% pass rate in year 1. You will need to produce a report, with accompanying analysis, to support your findings.
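One possible starting point for Option 2 is to see how the year 1 pass rate changes as the entry requirement is raised. The sketch below is only an illustration under stated assumptions: the fields in dataset2.sav are not described in this brief, so the record layout (an entry score and a pass/fail flag), the function names and the toy values are all hypothetical. It also assumes that a single numeric entry score is a sensible summary of entry qualifications, which your own analysis would need to justify.

```python
# Hypothetical records: (entry_points, passed_year_1).
# The real fields in dataset2.sav are not described in the brief,
# so this layout and these names are assumptions for illustration.

def pass_rate_by_threshold(records):
    """For each distinct entry score, report how many students scored at
    or above it and what fraction of that group passed year 1."""
    rows = []
    for threshold in sorted({points for points, _ in records}):
        admitted = [passed for points, passed in records if points >= threshold]
        rows.append((threshold, len(admitted), sum(admitted) / len(admitted)))
    return rows


def lowest_threshold_meeting(records, target_rate):
    """Return (threshold, group size, pass rate) for the lowest threshold
    whose group meets target_rate, or None if no threshold does."""
    for threshold, n_admitted, rate in pass_rate_by_threshold(records):
        if rate >= target_rate:
            return threshold, n_admitted, rate
    return None


# Toy data for illustration only; not drawn from dataset2.sav.
toy = [(200, False), (220, False), (240, True), (260, True), (280, True),
       (300, True), (320, True), (340, True), (360, True), (380, True),
       (240, False), (300, True)]

print(lowest_threshold_meeting(toy, 0.90))
```

With the real data, the group size at the chosen threshold would then need to be compared with the target intake of 200. Note that the pass rate need not rise steadily with the threshold, so taking the lowest qualifying threshold is a simplification, and a single cut-off is only one of several techniques you might propose.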