Commercial Norm-Referenced, Standardized Exams
(Group administered, mostly or entirely multiple-choice, “objective”
tests in one or more curricular areas. Scores are based on comparison
with a reference or norm group.)
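To make the idea of scoring "by comparison with a norm group" concrete, here is a minimal sketch of a percentile-rank calculation. The function name, the tie-handling convention (ties counted as half), and the sample scores are illustrative assumptions; commercial publishers use their own norming procedures.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)           # scores strictly below
    ties = bisect_right(ordered, score) - below   # scores equal to `score`
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical norm-group raw scores (illustrative only, not real test data).
norm_group = [42, 48, 51, 55, 55, 58, 60, 63, 67, 71]
student_rank = percentile_rank(58, norm_group)
print(f"A raw score of 58 falls at percentile rank {student_rank:.1f}")
```

The same raw score would earn a different rank against a different norm group, which is why the composition of the comparison group matters so much.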
Advantages:
- Can be adopted and implemented quickly
- Reduce/eliminate faculty time demands in instrument development
and grading (i.e., relatively low “frontloading” and
“backloading” effort)
- Objective scoring
- Provide for externality of measurement (i.e., external validity)
- Provide reference group(s) comparison often required by mandates.
- May be beneficial or required in instances where state or national
standards exist for the discipline or profession.
Disadvantages:
- Limit what can be measured to relatively superficial knowledge/learning.
- Eliminate the important process of learning and clarification
of goals and objectives typically associated with local development
of measurement instruments.
- Unlikely to measure the specific goals and objectives of a program,
department, or institution.
- “Relative standing” results tend to be less meaningful
than criterion-referenced results for program/student evaluation
purposes.
- Norm-referenced data is dependent on the institutions in comparison
group(s) and methods of selecting students to be tested in those
institutions. (Caution: unlike many norm-referenced tests such
as those measuring intelligence, present norm-referenced tests
in higher education do not utilize, for the most part, randomly
selected or well stratified national samples.)
- Group administered multiple-choice tests always include a potentially
high degree of error, largely uncorrectable by “guessing
correction” formulae (which lowers validity).
- Summative data only (no formative evaluation)
- Results unlikely to have direct implications for program improvement
or individual student progress
- Results highly susceptible to misinterpretation/misuse both
within and outside the institution
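The “guessing correction” formulae mentioned in the list above are commonly of the classic formula-scoring type: right answers minus wrong answers divided by one less than the number of options per item. The sketch below assumes that rule; the function name and the example counts are hypothetical.

```python
def corrected_score(num_right, num_wrong, options_per_item):
    """Classic correction for guessing: R - W / (k - 1).

    Omitted items are neither rewarded nor penalized.
    """
    if options_per_item < 2:
        raise ValueError("each item needs at least two options")
    return num_right - num_wrong / (options_per_item - 1)


# Hypothetical student: 30 right, 8 wrong, 2 omitted on 5-option items.
print(corrected_score(30, 8, 5))

# Expected outcome of blind guessing on 40 four-option items: 10 right, 30 wrong.
print(corrected_score(10, 30, 4))
```

The rule assumes every wrong answer comes from blind guessing among equally attractive options. Partial knowledge and differences in students' willingness to guess violate that assumption, which is one reason the resulting error is described as largely uncorrectable.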
Ways to Reduce Disadvantages
- Choose test carefully, and only after faculty have reviewed
available instruments and determined a satisfactory degree of
match between the test and the curriculum.
- Request and review technical data, especially reliability and
validity data and information on normative sample from test publishers.
- Utilize on-campus measurement experts to review reports of test
results and create more customized summary reports for the institution,
faculty, etc.
- Whenever possible, choose tests that also provide criterion-referenced
results.
- Assure that such tests are only one aspect of a multi-method
approach in which no firm conclusions based on norm-referenced
data are reached without cross-validation from other sources.
Bottom Lines
Relatively quick, easy, and inexpensive, but useful mostly where
group-level performance and external comparisons of results are
required. Not as useful for individual student or program evaluation.
Locally Developed Exams
(Objective and/or subjective tests designed by faculty of the program
being evaluated.)
Advantages:
- Content and style can be geared to specific goals, objectives,
and student characteristics of the institution, program, curriculum,
etc.
- Specific criteria for performance can be established in relationship
to curriculum
- Process of development can lead to clarification/crystallization
of what is important in the process/content of student learning.
- Local grading by faculty can provide immediate feedback related
to material considered meaningful.
- Greater faculty/institutional control over interpretation and
use of results.
- More direct implication of results for program improvements.
Disadvantages:
- Require considerable leadership/coordination, especially during
the various phases of development
- Costly in terms of time and effort (more “frontload”
effort for objective; more “backload” effort for subjective)
- Demands expertise in measurement to assure validity/reliability/utility
- May not provide for externality (degree of objectivity associated
with review, comparisons, etc. external to the program or institution).
Ways to Reduce Disadvantages:
- Enter into consortium with other programs, departments, or institutions
with similar goals and objectives as a means of reducing costs
associated with developing instruments. An element of externality
is also added through this approach, especially if used for test
grading as well as development.
- Utilize on-campus measurement experts whenever possible for
test construction and validation
- Contract with faculty “consultants” to provide development
and grading.
- Incorporate outside experts, community leaders, etc. into development
and grading process.
- Embed in program requirements for maximum relevance with minimum
disruption (e.g., a “capstone” course).
- Validate results through consensus with other data.
Bottom Lines
Most useful for individual student or program evaluation, with
careful adherence to measurement principles. Must be supplemented
for external validity.
Oral Examination
(An evaluation of student knowledge levels through a face-to-face
interrogative dialogue with program faculty.)
(Oral exams generally have the same basic strengths and weaknesses
as locally developed tests, plus the following advantages and disadvantages:)
Advantages:
- Allows measurement of student achievement in considerably greater
depth and breadth through follow-up questions, probes, encouragement
of detailed clarifications, etc. (= increased internal validity
and formative evaluation of student abilities)
- Non-verbal (paralinguistic and visual) cues aid interpretation
of student responses.
- Dialogue format decreases miscommunications and misunderstandings,
in both questions and answers.
- Rapport-gaining techniques can reduce “test anxiety” and
help focus and maintain maximum student attention and effort.
- Dramatically increases “formative evaluation” of
student learning; i.e., clues as to how and why they reached their
answers.
- Identifies and decreases error variance due to guessing.
- Provides process evaluation of student thinking and speaking
skills, along with knowledge content.
Disadvantages:
- Requires considerably more faculty time, since oral exams must
be conducted one-to-one, or with very small groups of students
at most.
- Can inhibit student responsiveness due to intimidation,
face-to-face pressures, oral (versus written) mode, etc. (May
have similar effects on some faculty!)
- Inconsistencies of administration and probing across students
reduce standardization and generalizability of results (= potentially
lower external validity).
Ways to Reduce Disadvantages
- Prearrange “standard” questions, most common follow-up
probes, and how to deal with typical students’ problem responses;
“pilot” training simulations.
- Take time to establish open, non-threatening atmosphere for
testing.
- Electronically record oral exams for more detailed evaluation
later.
Bottom Lines
Oral exams can provide excellent results, but usually only with
significant – perhaps prohibitive – additional cost.
Definitely worth utilizing in “Low N” programs, and
for the highest priority objectives in any program.
COMPETENCY-BASED METHODS
(Measuring pre-operationalized abilities in most direct, real-world
approach.)
Performance Appraisals
(Systematic measurement of overt demonstration of acquired skills.)
Advantages:
- Provide a more direct measure of what has been learned (presumably
in the program)
- Go beyond paper-and-pencil tests and most other assessment methods
in measuring skills
- Preferable to most other methods in measuring the application
and generalization of learning to specific settings, situations,
etc.
- Particularly relevant to the goals and objectives of professional
training programs and disciplines with well defined skill development.
Disadvantages:
- Ratings/grading typically more subjective than standardized
tests
- Requires considerable time and effort (especially front-loading),
thus being costly
- Sample of behavior observed or performance appraised may not
be typical, especially because of the presence of observers
Ways to Reduce Disadvantages
- Develop specific, operational (measurable) criteria for observing
and appraising performance
- Provide training for observers/appraisers
- Conduct pilot-testing in which rate of agreement (inter-rater
reliability) between observers/appraisers is determined.
- Continue training and/or alter criteria until acceptable consistency
of measurement is obtained.
- Conduct observations/appraisals in the least obtrusive manner
possible (e.g., use of one-way observational mirrors, videotaping,
etc.)
- Observe/appraise behavior in multiple situations and settings.
- Consider training and utilizing graduate students, upper level
students, community volunteers, etc. as a means of reducing the
cost and time demands on faculty.
- Cross-validate results with other measures.
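The rate of agreement determined during pilot-testing can be summarized in several ways. The sketch below shows two common ones for a pair of raters: simple percent agreement and Cohen's kappa, which discounts agreement expected by chance. The function names and the pass/fail ratings are hypothetical, and the choice of kappa is an assumption; the source does not prescribe a particular statistic.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of performances on which the two raters gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pilot ratings of ten student performances.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
rater_b = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass", "fail"]

print(f"Percent agreement: {percent_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(rater_a, rater_b):.2f}")
```

What counts as “acceptable consistency” is a local decision; training and criteria revision continue until the chosen statistic reaches the threshold the program has set.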
Bottom Lines
Generally the most highly valued but costly form of student outcomes
assessment – usually the most valid way to measure skill development.
Simulation
(Primarily utilized to approximate the results of performance appraisal
when, due to the target competency involved, logistical problems, or
cost, direct demonstration of the student skill is impractical.)
Advantages
- Better means of evaluating depth and breadth of student skill
development than tests or other performance-based measures (=
internal validity).
- More flexible; some degree of simulation can be arranged for
virtually any student target skill.
- For many skills, can be group administered, thus providing an
excellent combination of quality and economy.
Disadvantages
- For difficult skills, the higher the quality of simulation the
greater the likelihood of the problems of performance appraisal;
e.g., cost, subjectivity, etc. (see “Performance Appraisals”).
- Usually requires considerable “frontloading” effort;
i.e., planning and preparation.
- More expensive than traditional testing options in the short
run.
Ways of Reducing Disadvantages
- Reducing problems is relatively easy, since degree of simulation
can be matched for maximum validity practicable for each situation.
- Can often be “standardized” through use of computer
programs (=enhanced external validity).
Bottom Lines
An excellent means of increasing the external and internal validity
of skills assessment at minimal long-term cost.
SELF AND THIRD PARTY REPORTS
(Asking individuals to share their perceptions of their own attitudes
and/or behaviors or those of others.)
Written Surveys and Questionnaires
(Including direct or mailed, signed or anonymous)
Advantages:
- Typically yield the perspective that students, alumni, the public,
etc., have of the institution which may lead to changes especially
beneficial to relationships with these groups.
- Convey a sense of importance regarding the opinions of constituent
groups
- Can cover a broad range of content areas within a brief period
of time
- Results tend to be more easily understood by lay persons
- Can cover areas of learning and development which might be difficult
or costly to assess more directly.
- Can provide accessibility to individuals who otherwise would
be difficult to include in assessment efforts (e.g., alumni, parents,
employers).
Disadvantages
- Results tend to be highly dependent on wording of items, salience
of survey or questionnaire, and organization of instrument. Thus,
good surveys and questionnaires are more difficult to construct
than they appear.
- Frequently rely on volunteer samples which tend to be biased.
- Mail surveys tend to yield low response rates.
- Require careful organization in order to facilitate data analysis
via computer for large samples.
- Commercially prepared surveys tend not to be entirely relevant
to an individual institution and its students.
- Forced response choices may not allow respondents to express
their true opinions.
- Results reflect perceptions which individuals are willing to
report and thus tend to consist of indirect data.
- Locally developed instrument may not provide external references
for results.
Ways to Reduce Disadvantages:
- Use only carefully constructed instruments that have been reviewed
by survey experts
- Include open-ended, respondent worded items along with forced-choice.
- If random sampling or surveying of the entire target population
is not possible, obtain the maximum sample size possible and follow-up
with nonrespondents (preferably in person or by phone).
- If commercially prepared surveys are used, add locally developed
items of relevance to the institution.
- If locally developed surveys are used, attempt to include at
least some externally referenced items (e.g., from surveys for
which national data are available).
- Word reports cautiously to reflect the fact that results represent
perceptions and opinions respondents are willing to share publicly.
- Use pilot or “try out” samples in local development
of instruments and request formative feedback from respondents
on content clarity, sensitivity, and format.
- Cross-validate results through other sources of data.
Bottom Lines
A relatively inexpensive way to collect data on important evaluative
topics from a large number of respondents. Must always be treated
cautiously, however, since results only reflect what subjects are
willing to report about their perception of their attitudes and/or
behaviors.
Exit Interview and Other Interviews
(Evaluating student reports of their attitudes and/or behaviors
in a face-to-face interrogative dialogue.)
Advantages
- Student interviews tend to have most of the attributes of surveys
and questionnaires with the exception of requiring direct contact,
which may limit accessibility to certain populations. Exit interviews
also provide the following additional advantages:
- Allow for more individualized questions and follow-up probes
based on the responses of interviewees.
- Provide immediate feedback
- Include same observational and formative advantages as oral
examinations.
- Frequently yield benefits beyond data collection that come from
opportunities to interact with students and other groups.
- Can include a greater variety of items than is possible on surveys
and questionnaires, including those that provide more direct measures
of learning and development.
Disadvantages
- Require direct contact, which may be difficult to arrange.
- May be intimidating to interviewees, thus biasing results in
the positive direction.
- Results tend to be highly dependent on wording of items and
the manner in which interviews are conducted.
- Time consuming, especially if large numbers of persons are
to be interviewed.
Ways to Reduce Disadvantages
- Plan the interviews carefully with assistance from experts
- Provide training sessions for interviewers that include guidance
in putting interviewees at ease and related interview skills.
- Interview random samples of students when it is not feasible
to interview all.
- Conduct telephone interviews when face-to-face contact is not
feasible.
- Develop an interview format and questions with a set time limit
in mind.
- Conduct pilot-testing of interview and request interviewee formative
feedback.
- Interview small groups of individuals when individual interviewing
is not possible or is too costly.
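Where interviewing every student is not feasible, the random sample suggested above can be drawn reproducibly. This is a minimal sketch assuming a simple random sample without replacement; the function name, the ID range, and the sample size are hypothetical.

```python
import random


def draw_interview_sample(student_ids, sample_size, seed=None):
    """Simple random sample of students, drawn without replacement."""
    ids = list(student_ids)
    if sample_size >= len(ids):
        return ids  # small cohort: interview everyone
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(ids, sample_size)


# Hypothetical graduating cohort of 200 students, 25 interview slots.
cohort = range(1, 201)
selected = draw_interview_sample(cohort, 25, seed=2024)
print(sorted(selected))
```

Recording the seed alongside the results lets others verify that interviewees were chosen by chance rather than by convenience.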
Bottom Lines
Interviews provide opportunities to cover a broad range of content
and to interact with respondents. Opportunities to follow-up responses
can be very valuable. Direct contact may be difficult to arrange,
costly, and potentially threatening to respondents unless carefully
planned.
Third Party Reports
(Inferences regarding student/alumni attitudes or observations on
student/alumni behaviors, made by someone other than the student
or assessor; e.g., parents, faculty, employers, etc.)
Advantages
Third-party reports tend to have attributes similar to student
self-reports, plus the following additional advantages:
- Can provide unique consumer input, valuable in its own right
(especially employers and parents). How is our college serving
their purposes?
- Offer different perspectives, presumably less biased than either
student or assessor.
- Enable recognition and contact with important, often under-valued
constituents. Relations may improve by just asking for their input.
- Can increase both internal validity (through “convergent
validity”/“triangulation” with other data) and
external validity (by adding more “natural” perspective).
Disadvantages
Similar to the disadvantages of self-reports, plus:
- As with any indirect data, inference and reports risk high degree
of error.
- Third-parties can be biased too, in directions more difficult
to anticipate than self-reports.
- Less investment by third-parties in assessment processes often
means lower response rates, even lower than student/alumni rates.
- Usually more logistical, time-and-motion problems (e.g., identifying
sample, making contact, getting useful responses, etc.), therefore
more costly than it looks.
- If information about individuals is requested, confidentiality
becomes an important and sometimes problematic issue that must
be addressed carefully.
Ways to Reduce Disadvantages
- Conduct face-to-face or phone interviews wherever possible,
increasing validity through probing and formative evaluation during
dialogue.
- Very careful, explicit directions for types and perspectives
of responses requested can reduce variability.
- Attain informed consent in cases where information about individuals
is being requested.
- Coordinate contacts with other campus organs contacting the
same groups, to reduce “harassment” syndrome and increase
response rates.
- Other self-report and interview “ways to reduce…”
apply here as well.
Bottom Lines
Third-party reports are valuable in that they access important
data sources usually missed by other methods, but they can be problematic
in cost of implementation and in gaining access to respondents.
If personally identifiable information about individual students
or alumni is requested, informed consent is needed.
Portfolios
(Collections of multiple student work samples usually compiled over
time.)
Advantages:
- Can be used to view learning and development longitudinally
(e.g., samples of student writing over time can be collected),
which is the most valid and useful perspective.
- Multiple components of a curriculum can be measured (e.g., writing,
critical thinking, research skills) at the same time.
- Samples in a portfolio are more likely than test results to
reflect student ability when pre-planning, input from others,
and similar opportunities common to most work settings are available
(which increases generalizability/external validity of results).
- The process of reviewing and grading portfolios provides an
excellent opportunity for faculty exchange and development, discussion
of curriculum goals and objectives, review of grading criteria,
and program feedback.
- Economical in terms of student time and effort, since no separate
“assessment administration” time is required.
- Greater faculty control over interpretation and use of results.
- Results are more likely to be meaningful at all levels (i.e.,
the individual student, program, or institution) and can be used
for diagnostic/prescriptive purposes as well.
- Avoids or minimizes “test anxiety” and other “one
shot” measurement problems.
- Increases “power” of maximum performance measures
over more artificial or restrictive “speed” measures
on test or in-class sample.
- Increases student participation (e.g., selection, revision,
evaluation) in the assessment process.
Disadvantages
- Costly in terms of evaluator time and effort.
- Management of the collection and grading process, including
the establishment of reliable and valid grading criteria, is likely
to be challenging.
- May not provide for externality.
- If samples to be included have been previously submitted for
course grades, faculty may be concerned that a hidden agenda of
the process is to validate their grading.
- Security concerns may arise as to whether submitted samples
are the students’ own work, or adhere to other measurement
criteria.
Ways to Reduce Disadvantages
- Consider having portfolios submitted as part of a course requirement,
especially a “capstone course” at the end of a program.
- Utilize portfolios from representative samples of students rather
than having all students participate (this approach may save considerable
time, effort, and expense but be problematic in other ways).
- Have more than one rater for each portfolio; establish inter-rater
reliability through piloting designed to fine-tune rating criteria.
- Provide training for raters.
- Recognize that portfolios in which samples are selected by the
students are likely to represent their best work.
- Cross-validate portfolio products with more controlled student
work samples (e.g., in-class tests and reports) for increased
validity and security.
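One way to obtain the “representative samples of students” mentioned above is to sample within subgroups so that each is included in proportion to its size. The sketch below assumes a proportionate stratified sample; the function name, the strata labels, and the 20% sampling fraction are hypothetical choices, not part of the source.

```python
import random
from collections import defaultdict


def stratified_sample(students, fraction, seed=None):
    """Sample the same fraction of students from every stratum.

    `students` is a list of (student_id, stratum) pairs; each stratum
    contributes at least one student so that no group is left out.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for student_id, stratum in students:
        by_stratum[stratum].append(student_id)

    chosen = {}
    for stratum, ids in by_stratum.items():
        k = max(1, round(len(ids) * fraction))
        chosen[stratum] = rng.sample(ids, k)
    return chosen


# Hypothetical roster: every fourth student is a transfer student.
roster = [(i, "transfer" if i % 4 == 0 else "native") for i in range(1, 101)]
portfolio_sample = stratified_sample(roster, fraction=0.2, seed=7)
for stratum, ids in portfolio_sample.items():
    print(stratum, len(ids))
```

Sampling reduces rating workload, but results then describe the program rather than each student, which is one of the ways this approach may be “problematic.”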
Bottom Lines
Portfolios are a potentially valuable option adding important longitudinal
and “qualitative” data, in a more natural way. Particular
care must be taken to maintain validity. Especially good for multiple-objective
assessment.
“Stone” Courses
(Courses, usually required for degree/program completion, which in
addition to a full complement of instructional objectives, also serve
as primary vehicles of student assessment for program evaluation purposes;
e.g., Capstone, Cornerstone, and Keystone courses.)
Advantages:
- Provides for a synergistic combination of instructional and
assessment objectives.
- A perfect mechanism for course-embedded assessment of student
learning and development (i.e., outcomes, pre-program competencies
and/or characteristics, “critical indicators,” etc.)
- Can add impetus for design of courses to improve program orientation/integration/updating
information for students.
Bottom Line
“Stone” courses are perfect blends of assessment and
instruction to serve program quality improvement and accountability
goals (capstones for outcomes measures; cornerstones for pre-program
measures) and should be considered by all academic programs.
Archival Data
(Biographical, academic, or other file data available from college
or other agencies and institutions.)
Advantages:
- Tend to be readily available, thus requiring little additional
effort.
- Further utilize efforts that have already occurred.
- Cost efficient
- Constitute unobtrusive measurement, not requiring additional
time or effort from students or other groups.
- Very useful for longitudinal studies
Disadvantages:
- Especially in large institutions, may require considerable effort
and coordination to determine exactly what data are available
campus-wide.
- If individual records are included, may raise concerns regarding
protection of rights and confidentiality.
- Easy availability may discourage the development of other measures
of learning and development.
- May encourage attempts to “find ways to use data”
rather than measurement related to specific goals and objectives.
Ways to Reduce Disadvantages:
- Early-on in the development of an assessment program, conduct
a comprehensive review of existing assessment and evaluation efforts
and data typically being collected throughout the institution
and its units (i.e., a “campus data map”).
- Be familiar with the Family Educational Rights and Privacy Act
(Buckley Amendment) and avoid personally identifiable data collection
without permission. Assure security/protection of records.
- Only use archival records that are relevant to specific goals
and objectives of learning and development.
Bottom Lines
Relatively quick, easy, and cost-effective method. Usually limited
data quality but integral to valuable longitudinal comparisons.
Should be a standard component of all assessment programs.
Behavioral Observations
(Measuring the frequency, duration, topography, etc. of student actions,
usually in a natural setting with non-interactive methods.)
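As an illustration of the frequency and duration measures named above, the following sketch tallies an observation log. The log format (behavior, start minute, end minute), the behavior labels, and the function name are all hypothetical.

```python
from collections import defaultdict


def summarize_observations(events):
    """Count occurrences and total minutes for each observed behavior.

    `events` is an iterable of (behavior, start_minute, end_minute) tuples.
    """
    summary = defaultdict(lambda: {"frequency": 0, "total_minutes": 0})
    for behavior, start, end in events:
        if end < start:
            raise ValueError(f"event for {behavior!r} ends before it starts")
        summary[behavior]["frequency"] += 1
        summary[behavior]["total_minutes"] += end - start
    return dict(summary)


# Hypothetical log from one observation session (minutes from session start).
observation_log = [
    ("library_study", 0, 45),
    ("group_discussion", 50, 70),
    ("library_study", 90, 120),
]
results = summarize_observations(observation_log)
for behavior, stats in results.items():
    print(behavior, stats)
```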
Advantages
- Best way to evaluate degree to which attitudes, values, etc.
are really put into action (= most internal validity).
- Catching students being themselves is the most “natural”
form of assessment (= best external validity).
- Least intrusive assessment option, since purpose is to avoid
any interference with typical student activities.
Disadvantages
- Always some risk of confounded results due to “observer
effect;” i.e., subjects may behave atypically if they know
they’re being observed.
- Depending on the target behavior, there may be socially or professionally
sensitive issues to be dealt with (e.g., invasion of privacy on
student political activities or living arrangements) or even legal
considerations (e.g., substance abuse or campus crime).
- May encourage “Big Brother” perception of assessment
and/or institution.
- Inexperienced or inefficient observers can produce unreliable,
invalid results.
Ways to Reduce Disadvantages
- Avoid socially or ethically sensitive target behaviors, especially
initially.
- Include representative student input in process of determining
“sensitivity” of potential target behaviors.
- Utilize electronic “observers” (i.e., audio and video recorders)
wherever possible, for highly accurate, reliable, permanent observation
record (although this may increase assessment cost in the short
run if equipment is not already available).
- Strictly adhere to ethical guidelines for the protection of
human research subjects.
Bottom Lines
This is the best way to know what students actually do, how they
manifest their motives, attitudes and values. Special care and planning
are required for sensitive target behaviors, but it’s usually
worth it for highly valid, useful results.