BarnRaisers


6 essential steps to the data mining process

Posted on October 01, 2018 by Rob Petersen

Data Mining Process Illustration

Data mining is the discovery, within large data sets, of patterns, relationships and insights that help enterprises measure and manage where they are and predict where they will be in the future.

Large amounts of data can come from various data sources and may be stored in different data warehouses. Data mining techniques such as machine learning, artificial intelligence (AI) and predictive modeling can be involved.

The data mining process requires commitment. But experts agree that, across all industries, the data mining process is the same and should follow a prescribed path.

Here are the 6 essential steps of the data mining process.

1. Business understanding

In the business understanding phase:

  • First, understand the business objectives clearly and identify what the business needs.
  • Next, assess the current situation by identifying the resources, assumptions, constraints and other important factors that should be considered.
  • Then, from the business objectives and the current situation, define data mining goals that achieve the business objectives within that situation.
  • Finally, establish a good data mining plan to achieve both the business goals and the data mining goals. The plan should be as detailed as possible.

2. Data understanding

  • The data understanding phase starts with initial data collection from the available data sources, which helps the team get familiar with the data. Activities such as data loading and data integration must be performed for the collection to succeed.
  • Next, the “gross” or “surface” properties of the acquired data need to be examined carefully and reported.
  • Then, the data needs to be explored by tackling the data mining questions, which can be addressed using querying, reporting and visualization.
  • Finally, data quality must be examined by answering questions such as “Is the acquired data complete?” and “Are there any missing values in the acquired data?”
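
The data quality questions in this phase can be checked with a few lines of code. The sketch below is only an illustration in plain Python: the sample records, the field names and the rule that None or an empty string counts as missing are all assumptions, not part of CRISP-DM.

```python
# Hypothetical sample records; None or "" marks a missing value (an assumption).
records = [
    {"customer_id": 1, "age": 34, "region": "East", "spend": 120.0},
    {"customer_id": 2, "age": None, "region": "West", "spend": 80.5},
    {"customer_id": 3, "age": 51, "region": "", "spend": None},
    {"customer_id": 4, "age": 29, "region": "East", "spend": 42.0},
]


def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def profile_missing(rows):
    """Return {column: number of rows where that column is missing}."""
    columns = sorted({key for row in rows for key in row})
    return {
        col: sum(1 for row in rows if is_missing(row.get(col)))
        for col in columns
    }


missing_counts = profile_missing(records)
complete_rows = sum(
    1 for row in records if not any(is_missing(v) for v in row.values())
)

print("rows:", len(records))
print("missing per column:", missing_counts)
print("fully complete rows:", complete_rows)
```

A report like this answers “Is the acquired data complete?” per column and per row, which helps decide how much cleaning the next phase will need.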

3. Data preparation

Data preparation typically consumes about 90% of the project's time. The outcome of the data preparation phase is the final data set. Once the available data sources are identified, the data needs to be selected, cleaned, constructed and formatted into the desired form. Deeper data exploration may also be carried out during this phase to spot patterns in light of the business understanding.
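
As a rough illustration of the four tasks named above (select, clean, construct, format), here is a minimal plain-Python sketch. The raw rows, the field names, the policy of dropping rows with missing values and the derived age_band attribute are hypothetical choices made for the example; a real project would pick them from its own data mining goals.

```python
# Hypothetical raw rows as they might arrive from a text export (all strings).
raw_rows = [
    {"customer_id": "1", "age": "34", "region": " east ", "spend": "120.0", "notes": "vip"},
    {"customer_id": "2", "age": "", "region": "West", "spend": "80.5", "notes": ""},
    {"customer_id": "3", "age": "51", "region": "EAST", "spend": "42", "notes": ""},
]

SELECTED_FIELDS = ("customer_id", "age", "region", "spend")


def prepare(rows):
    """Select, clean, format and construct fields; return the final data set."""
    final = []
    for row in rows:
        # Select: keep only the fields the data mining goals call for.
        picked = {name: row[name].strip() for name in SELECTED_FIELDS}
        # Clean: drop rows with any missing selected value (one possible policy).
        if any(value == "" for value in picked.values()):
            continue
        # Format: convert text into typed, consistently cased values.
        record = {
            "customer_id": int(picked["customer_id"]),
            "age": int(picked["age"]),
            "region": picked["region"].lower(),
            "spend": float(picked["spend"]),
        }
        # Construct: derive a new attribute from an existing one.
        record["age_band"] = "under_40" if record["age"] < 40 else "40_plus"
        final.append(record)
    return final


final_data = prepare(raw_rows)
for record in final_data:
    print(record)
```

Dropping incomplete rows is the simplest cleaning policy; filling in missing values is an alternative when too much data would otherwise be lost.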

4. Modeling

  • First, select the modeling techniques to be used on the prepared data set.
  • Next, generate a test scenario to validate the quality and validity of the model.
  • Then, create one or more models on the prepared data set.
  • Finally, assess the models carefully, involving stakeholders, to make sure the models meet the business initiatives.
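
The modeling steps can be sketched in a few lines. The example below is a toy under stated assumptions: the data set is synthetic, the technique chosen is a simple least-squares line, the test scenario is a 75/25 holdout split with a fixed seed, and mean absolute error stands in for whatever quality measure the project agrees on.

```python
import random

# Hypothetical prepared data set: y is roughly 2*x + 1 with small alternating noise.
data = [(float(x), 2.0 * x + 1.0 + (0.1 if x % 2 == 0 else -0.1)) for x in range(20)]

# Test scenario: hold out 25% of the rows, using a fixed seed for repeatability.
rng = random.Random(0)
shuffled = data[:]
rng.shuffle(shuffled)
split = int(len(shuffled) * 0.75)
train, test = shuffled[:split], shuffled[split:]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_absolute_error(points, intercept, slope):
    """Average absolute gap between observed y and the line's prediction."""
    return sum(abs(y - (intercept + slope * x)) for x, y in points) / len(points)


# Build the model on the training rows only, then assess it on the holdout rows.
intercept, slope = fit_line(train)
mae = mean_absolute_error(test, intercept, slope)
print(f"slope={slope:.3f} intercept={intercept:.3f} holdout MAE={mae:.3f}")
```

Measuring error on rows the model never saw is what makes the test scenario meaningful; an error computed on the training rows alone would tend to look better than the model deserves.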

5. Evaluation

In the evaluation phase, the model results must be evaluated against the business objectives set in the first phase. New business requirements may be raised in this phase because of new patterns discovered in the model results or because of other factors. Gaining business understanding is an iterative process in data mining. The go or no-go decision on moving to the deployment phase must be made in this step.
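
One way to make the go or no-go decision explicit is to write the business success criteria down as thresholds and check the model results against them. The sketch below is only an illustration: the metric names, the threshold values and the go_no_go helper are hypothetical, and in practice the decision also involves stakeholder judgment.

```python
def go_no_go(metrics, criteria):
    """Compare measured metrics with business criteria.

    criteria maps a metric name to (kind, threshold), where kind is
    "min" (value must be at least the threshold) or "max" (at most).
    Returns ("go" or "no-go", list of reasons for failure).
    """
    failures = []
    for name, (kind, threshold) in criteria.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: not measured")
        elif kind == "min" and value < threshold:
            failures.append(f"{name}: {value} is below the minimum {threshold}")
        elif kind == "max" and value > threshold:
            failures.append(f"{name}: {value} is above the maximum {threshold}")
    return ("go" if not failures else "no-go"), failures


# Hypothetical criteria agreed during business understanding.
criteria = {
    "holdout_mae": ("max", 0.5),     # prediction error must stay small
    "campaign_lift": ("min", 0.05),  # expected uplift must justify rollout
}

# Hypothetical results from the modeling phase.
metrics = {"holdout_mae": 0.4, "campaign_lift": 0.03}

decision, failures = go_no_go(metrics, criteria)
print(decision)
for reason in failures:
    print(" -", reason)
```

A no-go result here would send the project back to an earlier phase, which fits the iterative nature of the process.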

6. Deployment

The knowledge or information gained through the data mining process needs to be presented in such a way that stakeholders can use it when they want it. Depending on the business requirements, the deployment phase can be as simple as creating a report or as complex as implementing a repeatable data mining process across the organization. In the deployment phase, plans for deployment, maintenance and monitoring have to be created for implementation and future support. From the project point of view, the final report needs to summarize the project experience and review the project to see what needs to be improved and to capture the lessons learned.

These 6 steps describe the Cross-Industry Standard Process for Data Mining, known as CRISP-DM. It is an open standard process model that describes common approaches used by data mining experts. It is the most widely used analytics model.

Do these 6 steps help you understand the data mining process? What is your organization’s readiness for data mining?
