Classroom Assessment Research Paper

Around the end of the 1980s, the traditional conception of teachers’ classroom assessment roles began to change. Previously, teachers’ classroom assessment responsibilities were narrowly focused on summative decisions related to grouping, grading, and selecting students. In the late 1980s, close examination of what teachers do and what decisions they are called on to make in their classrooms substantially broadened views of teachers’ classroom lives, responsibilities, and assessments (Jackson 1990, Wittrock 1986). A synthesis of this research identified three generalizations that provide a useful perspective on the realities of teachers’ classrooms.

First, classrooms are both academic and social environments that teachers must master and understand to successfully instruct and interact with their students. Many teacher decisions are dependent on the social and academic knowledge they acquire about their students. Second, classrooms are busy, interactive, and ad hoc settings that call on the teacher to make many and varied decisions. Third, although many nonteachers view classrooms as a unified whole, teachers know that such uniformity is illusionary. In classrooms, teachers continually deal with a range of individual student concerns and issues. Understanding the implications of these three classroom realities produces a broad domain for assessments.

Given the richness and complexity of classrooms, a more realistic description of classroom assessment is the process of collecting, synthesizing, and interpreting information to aid teachers in making classroom decisions. While the overriding purpose is to make decisions, classroom assessment involves many different decisions and contexts. In particular, classroom assessments focus on learning about students at the start of school, planning and delivering instruction to students, and formally assessing student learning. Note that each of these three assessment focuses depends on the collection, synthesis, and interpretation of assessment information. These three focuses are used to structure the discussion of classroom assessments below.

1. Classroom Assessment At The Start Of School

The beginning days of school are important for both teachers and students. In the first few weeks of school the teacher and students must get to know and understand each other so that they can be organized into a classroom learning community. The activities in the first few days of school set the stage for how well students will behave, attend, and learn during the school year (Airasian 2001, Stiggins 1997). In these early days, teachers have their antennae up, observing, listening, mentally recording, and assessing their perceptions of the students. In order to know how to group, teach, motivate, manage, accommodate, and reward students, the teacher must learn their particular characteristics.

Many sources of information, planned and unplanned, formal and informal, contribute to teachers’ perceptions of their students, e.g., ‘on the fly’ observations, hearsay from the school grapevine, and prior teachers’ comments, as well as information such as school records, formal assessment results, and performance in the classroom. Two increasingly important areas teachers want to know about are student disabilities and medication. Note that although cognitive information is important to all teachers, the classroom society requires that affective and psychomotor student characteristics also be identified.

Integrating the pieces of formal and informal information gathered in the first two or three weeks, the teacher forms a description or perception of each student and the class as a whole. This provides the teacher with the kind of nitty-gritty information needed to make the classroom function effectively (Good and Brophy 1997). Because teachers cannot spend a great deal of time assessing their students at the start of school, the validity and reliability of the initial student assessments are important. Two major validity concerns during start-of-school assessments are labeling students based on stereotypes and treating students’ cultural or language differences as if they were student deficits (Oakes and Lipton 1999). The main reliability concern is that teachers obtain sufficient and recurring information before labeling students. These initial teacher assessments are often overlooked as an important and influential form of classroom assessment.

2. Classroom Assessment In Planning And Delivering Instruction

Teacher classroom decisions about planning and delivering instruction encompass a variety of issues. There are, for example, many considerations that teachers must recognize and assess in order to successfully plan lessons for their students. Students vary in readiness, attention span, prior subject knowledge, attitude toward school, disabilities, and other characteristics that must be considered when planning instruction. Similarly, school and classroom resources range from textbooks to copying machines to sophisticated laboratory apparatus, and can hamper or enhance lesson planning. Time is a critical factor in planning, as every teacher knows. Also, teacher characteristics such as subject matter knowledge, physical limits, and preferred teaching style influence planning. A significant amount of assessment is involved in decision making for valid and viable lesson plans (Wragg 1997).

Once relevant information about the student, the teacher, and the instructional resources is identified, the teacher’s task is to synthesize it and decide how to construct a set of instructional plans containing educational objectives, instructional materials, teaching strategies, and assessment procedures (Airasian 2001). Different objectives call for different forms of instruction and assessment, and teachers must be able to teach students in more than one way. Objectives indicate the outcomes of student learning. Higher level objectives include cognitive processes such as application, analysis, synthesis, and evaluation. Lower level objectives emphasize rote memorization. Because educational objectives are developed before instruction begins, teachers often must decide to adapt objectives, materials, and instructional strategies to suit student readiness and needs. Suitable strategies to accommodate students with disabilities also must be planned. The decisions teachers make to match instruction to objectives and to provide appropriate instruction for students with disabilities improve the validity of their instruction and assessment.

During teaching, the teacher is concerned with decisions and assessments to determine how well the instruction is progressing. Planning and instructional assessments are integrally related; the process constantly cycles from planning to delivering to revising to planning and so on. There is a logical, continuous, and natural link between the two processes.

Oral questioning is the most common form of instructional assessment because it best fits the flow of instruction (Airasian 2001). During instruction, teachers ask questions for many reasons: to reinforce important points, to maintain students’ attention, to assess student learning, and to promote deeper processing of important information. Teachers use both convergent questions that have a single correct answer and divergent questions that have more than one appropriate answer. Lower level questions tap recall and memorization, while higher level questions tap processes more complex than recall. Classroom questioning strategies can be improved by asking questions related to important objectives, avoiding overly general questions, distributing questions among many students, allowing sufficient ‘wait time’ before calling on students, stating questions clearly to avoid confusion, probing student responses with follow-up questions such as ‘why’ or ‘explain your answer,’ and remembering that oral questioning is a social process in which student answers should be treated with respect, regardless of the quality of the answer.

3. Assessments Of Formal Learning

3.1 General Aspects Of Formal Assessments

Formal assessment is the culmination of planning and delivering instruction. It focuses on the extent to which students have learned from instruction. There is an important difference between good teaching and effective teaching. Good teaching refers to what teachers do during planning and delivering instruction. Effective teaching refers to whether students have learned from their instruction. Formal assessments are concerned with the effectiveness of learning from instruction. Formal assessments are also called summative assessments and commonly include tests, projects, term papers, lab reports, portfolios, performances, products, and final examinations. These are assessments that can have important consequences for students and therefore are taken seriously by students, parents, and teachers (Black 1998).

A fair and valid formal assessment includes information and skills similar to those presented in instruction. While the type of assessment strategy chosen to assess students depends on the nature of instruction, all types should represent the objectives and instruction presented. Factors such as the age of the students, the subject matter assessed, and the length of time for testing all affect the length of formal assessments.

Obtaining fair and valid formal assessments involves aligning objectives, instruction, and assessments, providing students with good instruction, and selecting appropriate strategies to assess learning. Formal assessments gather valid and reliable samples of student performance and use them to make generalizations about overall student learning. The most important preparation for formal assessment is a good teacher. Students should also be familiar with the assessment item formats and be given a review session prior to the assessment.

If these conditions are not met, invalid assessment results can occur. Other practices that diminish assessment validity are: failing to develop assessments based on objectives and instruction, failing to assess all the important objectives taught, selecting item types that prevent students from showing their full performance, including topics or objectives not taught, including too few items to obtain adequate assessment reliability, and using tests to punish students. Further, the success of formal assessment can be undone if the test questions are faulty or confusing. Poorly constructed or unclear assessment questions do not give students a fair chance to show what they have learned from instruction and consequently diminish assessment validity.

3.2 Types Of Formal Classroom Assessments

Many types of test items are used in classroom assessments. Selection items include multiple-choice, true–false, and matching items to which students respond by selecting an answer from a set of presented options. Supply items include short-answer, completion, and essay items to which students are required to create or supply their own answers. Selection items can cover many topics in a short time and can be scored quickly. Supply items can be constructed quickly and permit students to provide their own constructed answers. Selection items are difficult to construct and encourage guessing, while supply items are difficult to score and cover smaller samples of instruction (Gronlund 1998).

Common guidelines for writing and critiquing test items include: (a) assess important objectives; (b) state items clearly, describing the students’ task; (c) avoid ambiguous and confusing wording and sentence structure, so that students have a clear understanding of what is expected of them; (d) use vocabulary appropriate to the students being assessed; (e) write selection items that have one correct answer; (f) provide information about the nature and form of the desired response, particularly for essay questions; (g) avoid clues to correct answers; and (h) review items before assessing students.

In assembling and preparing items for a formal assessment the following suggestions should be applied: (a) group items of the same type; (b) place selection items first and supply items last; (c) provide directions for each type of test item; and (d) diminish assessment anxiety by giving advance notice of the assessment, providing a review session before assessment, and most of all, providing students with good instruction. Many students experience anxiety before and during testing. While it is difficult to eliminate test anxiety, these strategies can lower it. Accommodations for students with disabilities should also be planned. Two types of accommodations should be addressed, one for test administration (e.g., having directions read to students, giving extra time) and one for the test itself (e.g., dividing the test into small sections, providing a sample of each test item, arranging items from concrete to abstract).

Unfortunately, cheating on classroom assessments is a fairly common occurrence. Forms of cheating range from looking at another’s paper to bringing crib sheets into class to other illicit strategies (Cizek 1999). No matter how or why it is done, cheating is dishonest and unacceptable. When cheaters state or imply that the work they have turned in is their own, they are lying, and should be penalized. Useful strategies to discourage cheating on classroom assessments include spreading out seating arrangements, careful proctoring, and movement around the classroom during testing.

Ultimately, all formal classroom assessments will be scored, usually by the classroom teacher. Scoring selection items is straightforward, efficient, and objective. Each student’s responses are compared to a scoring key and an overall score is obtained. Selection items are typically scored objectively, that is, two or more independent scorers would agree on a student’s score. Supply and especially essay items tend to be more difficult and time-consuming to score. Because student responses to supply items are lengthier and more varied than responses to selection items, the former are more likely to be subjectively scored. That is, two or more independent scorers may not arrive at the same or a similar score for a student. Many factors influence essay subjectivity, including handwriting, spelling, neatness, and teacher fatigue. These factors are not central to the essay, but their presence influences the teacher’s perception of students’ essays and can reduce the objectivity of essay scoring.
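The contrast between key-based scoring and scorer agreement can be illustrated with a minimal Python sketch. The function names, the sample answer key, and the essay scores are all hypothetical and are not taken from the sources cited here; simple exact agreement is used only as one rough indicator of scoring objectivity.

```python
def score_selection_items(responses, key):
    """Count the responses that match the scoring key, item by item."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(1 for given, correct in zip(responses, key) if given == correct)


def exact_agreement(scores_a, scores_b):
    """Proportion of essays on which two scorers gave the identical score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and equal in length")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return matches / len(scores_a)


# Hypothetical five-item selection test: any scorer using the key gets the same total.
key = ["B", "D", "A", "C", "True"]
student = ["B", "D", "C", "C", "True"]
print(score_selection_items(student, key))  # 4

# Hypothetical essay scores from two independent scorers on four essays.
scorer_1 = [4, 3, 5, 2]
scorer_2 = [4, 2, 5, 3]
print(exact_agreement(scorer_1, scorer_2))  # 0.5
```

A low agreement value on essays, compared with the fixed result that a key produces for selection items, is one way to see why essay scoring is described as more subjective.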

Two common essay scoring approaches are holistic and analytic. Holistic scoring provides a single score to describe the essay’s quality. Analytic scoring breaks the essay down into component parts such as organization, spelling, accuracy, and grammar and gives each component an individual score. Holistic scoring is most used in grading students, while analytic scoring is most used to correct and improve initial drafts of written responses. To improve objectivity in essay scoring, the following steps should be followed. Define what constitutes a good essay answer before the essay is administered. Tell students whether handwriting, spelling, grammar, and punctuation will count in scoring the essay. If possible, score students’ essays anonymously. If multiple essays are in the assessment, score all students’ answers to the first essay question before moving on to score the second essay item, and so on. Reread some of the essays a second time to determine the reliability of scoring.

In addition to selection and short-answer items, other types of items are important in classroom assessments. The most prominent of these is performance assessment, also referred to as authentic or alternative assessment (Mehrens et al. 1998). Performance assessments allow students to demonstrate what they know or can do in a real situation. Examples of performance assessments are writing essays, pronouncing a foreign word, setting up laboratory equipment, catching a ball, reciting a poem, identifying unknown chemicals, generalizing experimental data, working in cooperative groups, obeying school rules, and painting a picture. All of these performances require more than memorization and a one or two word response.

All performance assessments are developed in four steps: (a) identifying the purpose of the performance assessment; (b) stating the observable aspects of the performance, also called performance criteria; (c) selecting a suitable setting to carry out the performance assessment; and (d) scoring the quality of the performance. The key aspect of assessing performance assessments is the identification of the criteria that define a good performance. Performances are normally broken down into specific, observable criteria that can be individually assessed. Criteria should be specific and unambiguous. For example, stating ‘information is presented in a logical sequence’ is better than stating the more ambiguous ‘has organization,’ and ‘can be heard in all parts of the room’ is better than ‘speaks correctly.’ Statements of clear performance criteria are important for both holistic and analytic scoring approaches.

Multiple approaches to scoring students’ performance assessments are available, and all are based on performance criteria. Checklists, rating scales, and scoring rubrics are most commonly used to assess performance assessments. A checklist is a written list of performance criteria that the teacher uses to judge student performance on each of the criteria. Checklists allow only ‘yes’ or ‘no’ judgments of each criterion. A rating scale is a written list of performance criteria that permits the teacher more than two choices (e.g., good, fair, poor or excellent, good, fair, poor) to judge student performance of each criterion. A scoring rubric summarizes the overall performance on the criteria into holistic descriptions representing different levels of a student’s overall performance. Rubrics describe performance in a summative way, while checklists and rating scales provide specific diagnostic information about each criterion in a formative way (Airasian 2001, Goodrich 1997).
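The difference between a checklist and a rating scale can be sketched in a few lines of Python. The two criteria are the examples given earlier in this section; the point values attached to ‘poor,’ ‘fair,’ and ‘good’ and the function names are assumptions made only for illustration.

```python
# Performance criteria for an oral presentation (examples from the text above).
CRITERIA = [
    "information is presented in a logical sequence",
    "can be heard in all parts of the room",
]

# Hypothetical point values for a three-level rating scale.
RATING_POINTS = {"poor": 1, "fair": 2, "good": 3}


def checklist_score(judgments):
    """Checklist: each criterion is judged only yes (True) or no (False)."""
    return sum(1 for met in judgments.values() if met)


def rating_scale_score(ratings):
    """Rating scale: each criterion is judged on more than two levels."""
    return sum(RATING_POINTS[level] for level in ratings.values())


checklist = {CRITERIA[0]: True, CRITERIA[1]: False}
ratings = {CRITERIA[0]: "good", CRITERIA[1]: "fair"}

print(checklist_score(checklist))   # 1 of 2 criteria met
print(rating_scale_score(ratings))  # 5 of a possible 6 points
```

Both structures keep a separate judgment for each criterion, which is what makes them useful for diagnostic feedback; a rubric would instead collapse these judgments into a single overall level.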

Another important addition to performance assessment is the portfolio. A portfolio is a carefully selected collection of a student’s performances that shows accomplishments and improvements over time. Portfolios allow students and teachers to revisit and reflect on prior work. As in any performance assessment, performance criteria are defined to identify and judge each of the individual pieces and the overall portfolio. As in all classroom assessments, the criteria should be aligned to the teacher’s objectives (Arter and Spandel 1992). The purpose of performance assessment is the same as that of all formal classroom assessments: to determine how well students have learned from the instruction they were provided. To improve the validity and reliability of performance assessments, teachers should select performance criteria that are appropriate for their students, observe and record student performance while it is being performed rather than at some later date, judge student performance in terms of the performance criteria, not the personal characteristics of the students, and, if possible, observe a student’s performance more than once.

Managing and scoring portfolios is a time-consuming activity, and teachers who attempt portfolio assessment are advised to start with a portfolio on a single topic and a limited number of entries.

4. Grading

Grading is the formal process of judging the quality of a student’s performance. Grades are always based on teacher judgment. However, the helping relationship that teachers have with their students can make it difficult to judge them in a completely objective manner. Further, since there is no uniformly accepted teacher grading strategy, teachers must find a grading approach that they feel is fair to themselves and to the students (Brookhart 1998, Frisbie and Waltman 1992).

All grading approaches are based on comparisons. The most common grading comparisons are norm-referenced and criterion-referenced grading. Norm-referenced grades are determined by comparing a given student’s performance with the performance of other test takers. Norm-referenced grading is also called grading on the bell curve. Criterion-referenced grades are determined by comparing a student’s performance with pre-established standards. In norm-referenced grading not all students can attain high scores, while in criterion-referenced grading they can if they all reach the standard. Grading students based on a comparison of student performance to the teacher’s estimate of the student’s ability is not recommended because estimating ability is difficult to do accurately. Similarly, grading based on student improvement over time is also not recommended. In general, regardless of the grading approach selected, it is strongly advised that grades be based mainly on students’ academic performance.
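The two comparisons can be made concrete with a short Python sketch. The cutoff scores, the 25 percent share of high grades, the student names, and the function names are all hypothetical; actual grading schemes vary from teacher to teacher, and the sketch ignores tied scores.

```python
# Hypothetical pre-established standards, listed from highest to lowest cutoff.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def criterion_referenced_grade(score, cutoffs=CUTOFFS):
    """Grade a score against fixed standards, independent of other students."""
    for minimum, grade in cutoffs:
        if score >= minimum:
            return grade
    return "F"


def norm_referenced_high(scores, share_high=0.25):
    """Mark only the top share of students (by rank) as 'high'; ties are ignored."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_high = max(1, int(len(ranked) * share_high))
    return {name: ("high" if rank < n_high else "not high")
            for rank, name in enumerate(ranked)}


scores = {"Ana": 95, "Ben": 92, "Cho": 91, "Dee": 90}

# Every student reaches the 90-point standard, so every student earns an A.
print({name: criterion_referenced_grade(s) for name, s in scores.items()})

# Only the top quarter of the class (one student of four) is marked 'high'.
print(norm_referenced_high(scores))
```

With the same four scores, the criterion-referenced approach gives every student the top grade, while the norm-referenced approach limits the top category to one student regardless of how well the others performed.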

5. Assessment Of Ethical Responsibilities

Thus far the discussion has focused on the technical aspects of classroom assessment. However, it is important to recognize that teachers’ assessments have short-term and long-term consequences for students, so teachers have an ethical responsibility to make decisions that are as valid and reliable as possible. A number of groups in the USA have set standards for teachers’ ethical performance (American Federation of Teachers et al. 1990, National Education Association 1992–3). Among teachers’ ethical responsibilities are: to provide students access to varying points of view; not to expose students to embarrassment or ridicule; not to exclude, deny, or grant advantages on the basis of students’ race, color, creed, gender, national origin, religion, culture, sexual orientation, or disability; and not to label students with stereotypes.

This research paper has indicated that classrooms are complex environments that call upon teachers to make many and varied decisions. The bases for these decisions derive from a wide range of formal and informal assessment information. Although it is not expected that every teacher assessment decision will be correct, it is expected that teachers can provide defensible assessment evidence to support their classroom decisions. This is a reasonable expectation in a context in which teachers’ actions have important consequences for students.


