Definition of Evaluation

Evaluation is “the process of determining the adequacy of instruction and learning” (Seels & Richey, 1994, p. 54). The purpose of evaluation is to provide accurate information and feedback on the results of training so that decision makers can determine the success of a program. Three major justifications for evaluating training are that evaluation can help improve future programs, can determine whether a program should be continued or dropped, and can justify the existence of a program (Kirkpatrick, 1994).

The instructional designer is involved with several types of evaluation throughout the course of a project, including problem analysis, criterion-referenced measurement, formative evaluation, and summative evaluation. In order to evaluate the success of a program, the instructional designer must first have goals and objectives against which to compare the results.

Problem Analysis

Problem analysis and needs assessment are terms often used interchangeably within the field of instructional technology. Problem analysis is a systematic process used by instructional designers to gather information from many sources to determine whether there are performance problems (Rossett, 1987).

One model often used by instructional designers when conducting problem analysis is Allison Rossett’s Needs Assessment Model (1987). During the problem analysis the instructional designer seeks information to identify optimal performance, actual performance, feelings, causes, and solutions (Rossett, 1987). With this information the instructional designer can identify gaps between optimal and actual performance and, from those gaps, identify needs.
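
To make the gap idea concrete, a minimal sketch is shown below. It is not part of Rossett's model itself; the tasks and scores are hypothetical, and it simply treats a need as any task where actual performance falls short of the optimal level.

```python
# Illustrative sketch (hypothetical data): a needs assessment expressed as a
# simple gap analysis between optimal and actual performance per job task.

optimal = {"process order": 95, "resolve complaint": 90, "upsell product": 80}
actual = {"process order": 92, "resolve complaint": 70, "upsell product": 55}

# A "need" is any task where actual performance falls short of the optimal level.
gaps = {task: optimal[task] - actual[task] for task in optimal}
needs = {task: gap for task, gap in gaps.items() if gap > 0}

# The largest gaps become candidates for further analysis of causes and solutions.
for task, gap in sorted(needs.items(), key=lambda item: item[1], reverse=True):
    print(f"{task}: gap of {gap} points between optimal and actual performance")
```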

Criterion-Referenced Measurement

Criterion-referenced measurement involves techniques used to determine the learner's level of mastery of a predetermined goal (Dick & Carey, 2005). Criterion-referenced materials are often presented to the learner in the form of a test. Criterion-referenced tests (CRTs) are used by the instructional designer to measure the learners' mastery of knowledge, skills, or attitudes, as opposed to norm-referenced tests, which compare the level of learning among learners. CRTs evaluate both the learners' progress and the quality of instruction (Dick & Carey, 2005). The objectives of CRTs are determined by the instructional designer based on the results of the performance analysis.
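
The sketch below illustrates the distinction with hypothetical scores and an assumed mastery cutoff: the same scores are interpreted against a fixed criterion (criterion-referenced) and then ranked against one another (norm-referenced).

```python
# Illustrative sketch (hypothetical scores and cutoff): the same results read two
# ways, to contrast criterion-referenced with norm-referenced interpretation.

scores = {"Ana": 88, "Ben": 72, "Chris": 95, "Dana": 64}
mastery_cutoff = 80  # criterion derived from the objectives, not from other learners

# Criterion-referenced: each learner is judged against the fixed mastery criterion.
crt_results = {name: ("mastered" if score >= mastery_cutoff else "not yet mastered")
               for name, score in scores.items()}

# Norm-referenced: learners are ranked against one another, regardless of mastery.
nrt_ranking = sorted(scores, key=scores.get, reverse=True)

print(crt_results)   # e.g. {'Ana': 'mastered', 'Ben': 'not yet mastered', ...}
print(nrt_ranking)   # e.g. ['Chris', 'Ana', 'Ben', 'Dana']
```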

There are four basic types of criterion-referenced tests created by instructional designers: entry behaviors tests, pretests, practice tests, and posttests (Dick & Carey, 2005). Entry behaviors tests are given to learners prior to instruction. Results of entry behaviors tests are used by the instructional designer to determine whether the learner has the skills needed to begin the instruction.

Pretests are also given prior to instruction. From the results of the pretest the instructional designer can determine if some or all of the skills have already been mastered by the learner. If the instructional designer finds that all skills have been mastered, then instruction is not needed.

Practice tests are created to allow the learner to practice and receive feedback on his or her level of understanding. Practice tests also allow learners to rehearse new knowledge and skills so they can judge their own level of understanding and skill (Dick & Carey, 2005). The results of practice tests are also used by the instructional designer to determine whether learners are acquiring the intended knowledge and skills, whether the instruction is clustered appropriately, and whether the pace of instruction is appropriate.

Finally, posttests are criterion-referenced tests given to learners after instruction. The results of posttests provide the instructional designer with the information needed to determine whether each learner has achieved the terminal goal. Posttest results also provide the instructional designer with information on the effectiveness of the instruction and on areas of instruction that may need to be revised.
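
As a rough illustration of how posttest results can point to areas needing revision, the sketch below summarizes hypothetical pass/fail results by objective; the objectives, data, and revision threshold are assumptions, not prescribed by Dick & Carey.

```python
# Illustrative sketch (hypothetical data): summarizing posttest results by
# objective to flag segments of instruction that may need revision.

posttest = {
    "objective_1": [True, True, True, False, True],
    "objective_2": [True, False, False, False, True],
    "objective_3": [True, True, True, True, True],
}

revision_threshold = 0.8  # assumed mastery rate below which the segment is reviewed

for objective, results in posttest.items():
    mastery_rate = sum(results) / len(results)
    flag = "review instruction" if mastery_rate < revision_threshold else "ok"
    print(f"{objective}: {mastery_rate:.0%} of learners mastered -> {flag}")
```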

Formative Evaluation

Formative evaluation is first conducted during the development stage of instruction (Scriven, 1967). Research conducted by Cronbach (1975) and Scriven (1967) suggested that levels of achievement with new learning materials being developed were relatively low. As a result of these studies, a more in-depth form of evaluation, known as formative evaluation, was developed. The purpose of formative evaluation, then, is to collect data to determine what changes need to be made to improve instruction before it is implemented within a system.

When conducting formative evaluation, the instructional designer uses a systematic approach as the framework to guide him or her through three basic phases of evaluation: one-to-one, small-group, and field trial evaluation (Dick & Carey, 2005).

During the one-to-one phase of evaluation the instructional designer works individually with learners to acquire data for needed revisions. Once revisions have been made to the instruction, small-group evaluation is conducted. The data collected during this phase are used by the instructional designer to determine the effectiveness of the changes made after the one-to-one evaluation and to identify any additional problems that may have been missed. The final phase of evaluation, the field trial, is conducted to evaluate the effectiveness of changes made after small-group evaluation and to determine whether the instruction can be implemented in the environment and context for which it was intended (Dick & Carey, 2005).

While conducting formative evaluation adds time to the project and the cost of additional resources, it can be the determining factor in whether instruction is successful. If obvious errors are not caught until the instruction is rolled out, the cost is much greater and management's confidence in the capabilities of the instructional designer will be compromised.

Summative Evaluation

Summative evaluation is conducted to evaluate the effectiveness of instruction after a program has been implemented. It answers the question: have the goals or needs been met? While the summative evaluation plan is often developed by the instructional designer, the actual evaluation is often conducted by a neutral outside party so that the results remain objective and unbiased.

The instructional designer gathers information on the effectiveness of the instruction and uses this information to make decisions about utilization (Seels & Richey, 1994). There are several models that instructional designers can use to guide their summative evaluation plan, including Donald Kirkpatrick's Four Levels of Evaluation (Kirkpatrick, 1998) and Daniel Stufflebeam's CIPP Evaluation Model (Stufflebeam, 1998).

Stufflebeam’s CIPP Evaluation Model (1998) is a comprehensive model often used by instructional designers for summative evaluation. It provides a systematic framework as the instructional designer works through each of the four phases of evaluation identified by Stufflebeam (1998):

  • Context Evaluation – determines if the goals and objectives have been met and if they are accurately aligned with the problems identified in the needs assessment.
  • Input Evaluation – assesses how well aligned the strategies, activities, and materials are in support of the goals and objectives.
  • Process Evaluation – is the ongoing systematic monitoring of the program.
  • Product Evaluation – focuses on the outcomes of the program and measures the achievement of long and short-term goals.

The CIPP evaluation model is focused on the client. The purpose of the CIPP model for summative evaluation is to help potential clients determine if there is a “need” for the product.
