Questions, Questions, and More Questions
A successful data mining and data modelling analysis starts with several questions. What data will be available, and how will it flow? How will it be collected, in both technological and physical terms? Exactly what data will be collected? How and where will it be stored?
Once these questions have been answered, we turn our attention to mining, modelling, and reporting methods. How will we retrieve data from storage? How will we build models, and of what? How will we access the models and reports? What exactly will we report on?
Conversion of Data into Useful Information:
Turning the data into useful information requires:
- Identifying the questions to be answered.
- Assembling the data sets.
- Building models.
- Validating the models.
- Interpreting the results.
- Automating the delivery.
In some cases, aggregated data may be kept rather than source data. All of these elements affect the data modelling exercise and the ultimate modelling software requirements.
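To illustrate why keeping only aggregated data affects the modelling exercise, here is a minimal Python sketch. The transaction rows, customer identifiers, and the function name `aggregate_by_customer` are all invented for the example and do not come from any particular system.

```python
from collections import defaultdict

# Hypothetical source data: one row per transaction
# (customer_id, month, amount). Values are made up for illustration.
transactions = [
    ("c1", "2024-01", 40.0),
    ("c1", "2024-02", 60.0),
    ("c2", "2024-01", 100.0),
]


def aggregate_by_customer(rows):
    """Collapse transaction rows into one spending total per customer."""
    totals = defaultdict(float)
    for customer_id, _month, amount in rows:
        totals[customer_id] += amount
    return dict(totals)


totals = aggregate_by_customer(transactions)

# Both customers end up with the same total, so a model built only on the
# aggregate cannot distinguish the customer who bought twice from the one
# who bought once. That detail exists only in the source rows.
print(totals)
```

If only `totals` were stored, purchase frequency and timing could no longer be modelled, which is the kind of constraint the paragraph above refers to.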
Businesses and Their Concerns:
Most businesses want to know key information about their customers at any point in time, for instance:
- Lifetime value
- Cross-sell and upgrade potential
- Acquisition cost
- Channel preferences
- Purchase behaviour patterns
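Two of the measures in this list, lifetime value and acquisition cost, can be related with a few lines of arithmetic. The Python sketch below uses one common simplified formula (average order value times orders per year times expected years as a customer times margin). The formula choice, the `Customer` fields, the default margin, and the example figures are assumptions for illustration, not a method prescribed here.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    avg_order_value: float   # average spend per order
    orders_per_year: float   # purchase frequency
    expected_years: float    # assumed length of the customer relationship
    acquisition_cost: float  # what it cost to win this customer


def lifetime_value(customer: Customer, margin: float = 0.25) -> float:
    """Simplified lifetime value: margin earned over the whole relationship."""
    revenue = (
        customer.avg_order_value
        * customer.orders_per_year
        * customer.expected_years
    )
    return revenue * margin


def net_value(customer: Customer, margin: float = 0.25) -> float:
    """Lifetime value after subtracting the acquisition cost."""
    return lifetime_value(customer, margin) - customer.acquisition_cost


# Hypothetical customer: 50 per order, 4 orders a year, 3 years, cost 60.
example = Customer(50.0, 4.0, 3.0, 60.0)
print(lifetime_value(example), net_value(example))
```

A real lifetime-value model would usually also account for churn and discounting; this sketch only shows how the listed measures fit together.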
Answering these questions about customers requires modelling tools and techniques. These can be divided into two groups:
- Theory-driven
- Data-driven
The Two Groups of Modelling Tools:
Theory-driven modelling tools:
Theory-driven modelling (hypothesis testing) attempts to confirm or disprove preconceived ideas. Theory-driven modelling tools require the user to specify most of the model based on prior expectations, and then check whether the model is valid.
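As an illustration of the theory-driven approach, the Python sketch below starts from a preconceived idea (customers from one channel spend differently from customers of another) and checks it against the data. It uses a permutation test, which is only one of many hypothesis-testing techniques and is chosen here for brevity. The function name, the channel labels, and the spending figures are invented for the example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    The labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical monthly spend for customers acquired through two channels.
channel_a = [52, 55, 60, 58, 61, 57]
channel_b = [40, 42, 45, 41, 44, 43]

p_value = permutation_test(channel_a, channel_b)
print(p_value)
```

A small p-value means the data are hard to explain if the channels really behave the same, so the preconceived idea survives the check; a large one means the data give it no support.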
Data-driven modelling tools:
Data-driven modelling tools automatically create the model based on patterns they find in the data. The resulting model also has to be tested before it is accepted as valid.
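The data-driven approach can be sketched in the same way. In the Python example below, the tool, not the user, chooses the rule: it searches for the single threshold that best separates two outcomes (a one-level decision tree, often called a decision stump) and the rule is then tested on data that was held back. The function names and the visit/upgrade figures are invented for illustration.

```python
def fit_stump(rows):
    """Find the threshold on one feature that best separates labels 0 and 1.

    rows is a list of (feature_value, label) pairs. Returns
    (threshold, label_above), where label_above is predicted for values
    greater than the threshold and the other label otherwise.
    """
    values = sorted({x for x, _ in rows})
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        for label_above in (0, 1):
            correct = sum(
                (label_above if x > threshold else 1 - label_above) == y
                for x, y in rows
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, label_above)
    return best[1], best[2]


def predict(model, x):
    threshold, label_above = model
    return label_above if x > threshold else 1 - label_above


def accuracy(model, rows):
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


# Hypothetical data: (site visits per month, upgraded? 1 = yes, 0 = no).
training = [(1, 0), (2, 0), (3, 0), (8, 1), (9, 1), (10, 1)]
holdout = [(2, 0), (4, 0), (7, 1), (11, 1)]

model = fit_stump(training)
holdout_accuracy = accuracy(model, holdout)
print(model, holdout_accuracy)
```

The held-out rows play the role of the test mentioned above: a pattern found in the training data is only accepted if it also holds on data the tool has not seen.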
Data Modelling and Conclusion:
Modelling is an iterative process, with the final model usually being a combination of prior knowledge and newly discovered information. The tools and techniques include statistical methods, data-driven tools, cluster analysis, t-tests, factor analysis, analysis of variance, CHAID (Chi-square Automatic Interaction Detector), decision trees, linear regression, and visualisation tools.