1. Introduction
Many studies have been conducted on the
evaluation of information systems, and many conferences have been devoted
to this subject (examples of studies are Kauffman and Weill, 1989; McKeen and
Smith, 1993b; Willcocks, 1992; Farbey et
al., 1993; Hitt and Brynjolfsson, 1994; examples of conferences are the
15th International Conference on Information Systems research theme,
“Improving productivity and adding value through information systems”, and
the European Conference on IT Investment Evaluation, Henley-on-Thames, 13/14th
September 1994 and 11/12th July 1995). Besides hundreds of articles, numerous
books have been published (for instance, Parker and Benson, 1988; Banker
et al., 1993; Remenyi et al.,
1993; Hogbin and Thomas, 1994; Hares and Royle, 1994; Willcocks, 1994; Farbey et al., 1993; Gotlieb, 1985).
Nowadays decision makers in the area of
information systems can choose among a wide variety of methods (overviews have
been given by Farbey et al., 1993;
Willcocks, 1994; Powell, 1992; Willcocks, 1992; Avgerou, 1995). Renkema and
Berghout refer to over sixty methods (Renkema and Berghout, 1996). However,
little effort has been put into the validation of these methods. Some methods
have been applied in cases, but most have not been tested at all. From a
scientific point of view this remains unsatisfactory.
The issues of investment analysis methods
are, therefore, elaborated upon in this article, using an experimental approach.
In the experiments decision makers had to establish a priority list for
information systems for a real-life case. Subsequently, a number of elements of
the evaluation method were altered in the succeeding sessions. The extent to
which the particular alteration influenced the success of the evaluation was then measured. The article starts with
a description of the outline of the experiments, the definition of success
used in the experiments, and the research conditions; then, the experiments and
their results are described. The article ends with a summary and conclusions.
2. Outline of the experiments
The experiments were performed in two sets.
In this section the outline of both sets is described.
The first set of experiments was designed
to analyse the differences caused by varying the evaluation process steps, i.e. only the way that the evaluation
aspects were presented and weighted was altered. This procedure was designed to
answer the question as to whether even small changes in a particular evaluation
method influence the success of the evaluation and whether the proposed
experiments were suitable for the intended analysis. The two evaluation methods
that were applied for this analysis are described in appendix B.
The second set of experiments was designed
to analyse the differences caused by varying the evaluation aspects, i.e. in these experiments the contents of the
evaluation aspects were altered. The second set of experiments was also used to
determine whether the assumed strengths of the two methods of the first set of
experiments resulted in a more successful third evaluation method. The method
that was applied for this analysis is described in appendix C.
In summary, the independent variables were the evaluation process and evaluation
aspects applied by the decision makers, and the dependent variable was the success of the evaluation. The constants
consisted of the case description used and the type of decision makers involved
in the experiments. An outline of the experiments is given in figure 1.
Figure 1: Variables of the experiments
The case description was identical for all the experiments,
and consisted of a Newspaper Publisher which had to identify a priority list for
its eight information system proposals. The case description was based on a
real-life situation described by Van Irsel and Van Reeken (Van Irsel and Van
Reeken, 1994; Delahaye and Van Reeken, 1992).
The proposals were submitted by particular
departments, and these differed in, for example, size, technology, risk, costs
and benefits. The evaluation committee was identical to the management team of
the Newspaper Publisher, and consisted of the Managing Director, Chief
Information Officer, Head of Production, and Head of Marketing and Sales. To
ensure that information exchange was required to establish the priority list,
all members of the evaluation committee had one paragraph of proprietary
information (personal objectives and opinions).
Students in an advanced stage of a Master
of Science programme were used as decision makers. The experiments formed part
of an ongoing Information Management course available at universities in the
Netherlands.
In total 47 students participated in the
first set of experiments and 66 in the second set. Together with the two test
experiments done by 8 students, 129 students took part in the experiments.
Each experiment consisted of five
subsequent activities. First, the issue of evaluation of information systems and
the case of the Newspaper Publisher were introduced. Second, the particular
evaluation method to be used was explained. Third, the students studied the case
of the Newspaper Publisher and identified a priority list. Fourth, the students
completed a questionnaire regarding their opinions about the evaluation that had
been performed. Fifth, the results produced by specific groups were discussed in
a plenary session.
The students were placed randomly in
groups of preferably four. Depending on the total number of available students,
sometimes groups of three or five had to be formed. In the situation where a
group of three was formed, the role of managing director was left out. In the
situation where a group of five was formed, a second general manager was added.
In both situations the information exchange associated with the other roles
remained unchanged.
Master of Science students cannot be
assumed to represent the diversity of objectives that management teams
sometimes have. They are, however, regarded as adequate decision makers in the setting of
these experiments, because:
· They are near completion of their studies and
many of them will be involved in these types of decisions shortly.
· Managers would also be new to the context of
the Newspaper Publisher. There are simply not enough newspaper managers to
perform a statistically representative number of experiments. Repeating the
experiment with a new method but the same managers appeared not to work,
because the managers started to work towards their previous results.
3. Success of the evaluation method
The experiments were designed to analyse
the effects of modifications of an evaluation method on the success of solving a case, and therefore required a measure of
success. In this research, success comprised three measures.
The first measure of success was the priority
list that was established by the group. The text of the case was modified to
support a particular standard priority list of six of the eight information
systems (although this standard list of six was still considered to be far from
obvious). The standard list was cross-checked with two experts, and turned out
to be identical to:
· The average priority list of both sets of
experiments and all experiments together.
· The modal position of all proposals over all
experiments (most frequent position in the priority list).
The average solution would have been an
obvious competing candidate for a standard solution. The fact that both the average
and the modal solution were identical to the standard solution is regarded as a
confirmation of its correctness.
Only 2 out of 30 groups captured the
standard solution exactly, which confirmed that the priority list was far from
obvious. The distribution of all priority lists resulting from all of the
experiments is given in figure 2.
| Position on priority list | Proposal E | Proposal H | Proposal A | Proposal B | Proposal D | Proposal F |
|---|---|---|---|---|---|---|
| 1 | 16 | 6 | 7 | 1 | 0 | 0 |
| 2 | 4 | 11 | 6 | 6 | 3 | 0 |
| 3 | 6 | 7 | 10 | 3 | 2 | 2 |
| 4 | 3 | 4 | 5 | 13 | 5 | 1 |
| 5 | 1 | 1 | 1 | 6 | 12 | 8 |
| 6 | 0 | 1 | 1 | 1 | 8 | 19 |
| Average position | 2.0 | 2.6 | 2.7 | 3.7 | 4.7 | 5.5 |

Figure 2: Distribution of the information systems on the priority lists and average position
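The cross-check described earlier in this section, that both the average and the modal position reproduce the standard list, can be sketched from the counts in figure 2. This is a minimal illustration only: the names `COUNTS`, `mean_position` and `modal_position` are hypothetical, the counts are transcribed from the figure, and recomputed means need not agree with the printed averages to the last digit, so only the resulting order is used.

```python
# Counts transcribed from figure 2: COUNTS[proposal][k] is the number of
# groups that placed the proposal at position k + 1 of their priority list.
COUNTS = {
    "E": [16, 4, 6, 3, 1, 0],
    "H": [6, 11, 7, 4, 1, 1],
    "A": [7, 6, 10, 5, 1, 1],
    "B": [1, 6, 3, 13, 6, 1],
    "D": [0, 3, 2, 5, 12, 8],
    "F": [0, 0, 2, 1, 8, 19],
}


def mean_position(counts):
    """Average position of a proposal, weighted by the number of groups."""
    total = sum(counts)
    return sum(pos * n for pos, n in enumerate(counts, start=1)) / total


def modal_position(counts):
    """Most frequent position of a proposal (first one if there is a tie)."""
    return counts.index(max(counts)) + 1


# Proposals ordered by increasing average position.
order_by_mean = sorted(COUNTS, key=lambda p: mean_position(COUNTS[p]))
# Most frequent position per proposal.
modal_positions = {p: modal_position(COUNTS[p]) for p in COUNTS}

print(order_by_mean)
print(modal_positions)
```

If both the order by mean position and the modal positions give E, H, A, B, D, F, the two candidate solutions coincide with the standard list, which is the confirmation the text refers to.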
The second measure of success used was the
degree of confidence that the decision
makers had in their priority list. In practice, where a standard priority list
is missing, confidence in the priority list is considered to be of major
importance.
The third measure of success was the
extent to which consensus was reached
in each group. In practice, information exchange is considered to be crucial in
evaluation (Avgerou, 1995). To encourage information exchange in the experiments,
the students were given the roles of department heads. It cannot be
assumed, however, that Master of Science students represent the diversity of objectives
that management teams sometimes have. This also had a positive side:
the experiment contained conflicting business objectives, and because
the students were placed in groups at random, they were not expected to
present conflicting personal objectives.
4. Investment portfolio method and multicriteria method (first set of experiments)
The first set of experiments is described
in this section. In the first set of experiments two methods were tested, the
multicriteria method (MCM), and the investment portfolio method (IPF). First,
the hypotheses that were tested are explained. Then, the outcome of the
experiments is described. An explanation of the two methods is given in appendix
B.
4.1 Description of hypotheses
In general, it was assumed in the first set of
experiments that the IPF method is superior to the MCM. This was tested
using the following hypotheses.
Hypothesis 1.1: the priority lists of groups applying the IPF method would prove to be a
better match to the standard list.
This is based on the fact that the IPF
method was assumed to encourage discussions between decision makers and to offer
a better opportunity to identify the relative differences between the
information system proposals.
Hypothesis 1.2: decision makers applying the IPF method would be more confident of their
priority list.
The number of aspects of the information systems that need to be
considered exceeds the human perceptual capacity of
approximately seven chunks (Taylor, 1975, p. 411; De Vries, 1993, p. 75). It was
assumed that the IPF method helps decision makers to reduce the number of
aspects involved without losing the essence of the proposals.
Hypothesis 1.3: groups applying the IPF method would be faster at establishing their
priority lists.
The IPF method requires a less detailed
analysis and should therefore be faster to complete. Given the fact that the
evaluation effort should be in proportion to the perceived benefits of the
evaluation, this is an important characteristic for an evaluation method.
Hypothesis 1.4: decision makers applying the IPF method could justify their priority
list better.
As the IPF method is focused on
differences between information systems, instead of on an overall assessment,
the decision makers should gain better insight and this should be noticeable
during the discussion in the plenary session.
4.2 Results of the first set of experiments
The results of the experiments are
described in association with the hypotheses.
Hypothesis 1.1: the priority lists of groups applying the IPF method would prove to be a
better match to the standard list.
This hypothesis was falsified. Figure 3
gives an overview of the established priority lists and the standard list.
The priority lists established using the MCM match the standard list better than
those derived using the IPF method. This difference is statistically significant
(95% certainty).
Spearman’s Rank Correlation was applied
to calculate the correlation coefficients between a particular priority list and
the standard priority list. A Spearman’s Rank Correlation of +1 implies a full
match, -1 implies a complete opposite.
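As a minimal sketch of this calculation, assuming rankings without ties, the coefficient can be computed with the usual formula 1 - 6Σd²/(n(n²-1)), where d is the difference between a proposal's position in a group's list and in the standard list. The function name `spearman_rank_correlation` and the constant `STANDARD` are illustrative, not taken from the article.

```python
def spearman_rank_correlation(priority_list, standard_list):
    """Spearman's rank correlation between two rankings of the same items.

    Assumes both lists rank the same proposals and contain no ties.
    """
    if sorted(priority_list) != sorted(standard_list):
        raise ValueError("both lists must rank the same proposals")
    n = len(standard_list)
    # Position of each proposal in the standard list (1 = highest priority).
    standard_rank = {p: i for i, p in enumerate(standard_list, start=1)}
    # Sum of squared differences between the two positions of each proposal.
    d_squared = sum(
        (i - standard_rank[p]) ** 2
        for i, p in enumerate(priority_list, start=1)
    )
    return 1 - 6 * d_squared / (n * (n**2 - 1))


STANDARD = list("EHABDF")

# Two proposals swapped (positions 4 and 5).
print(round(spearman_rank_correlation(list("EHADBF"), STANDARD), 4))  # 0.9429
# A list that deviates strongly from the standard list.
print(round(spearman_rank_correlation(list("ABEDFH"), STANDARD), 4))  # 0.1429
```

With six proposals, swapping two adjacent positions gives Σd² = 2 and a coefficient of 1 - 12/210, roughly 0.9429, which is why that value recurs for lists that differ from the standard list by a single adjacent swap.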
| Group | Position 1 | Position 2 | Position 3 | Position 4 | Position 5 | Position 6 | Spearman’s Rank Correlation |
|---|---|---|---|---|---|---|---|
| MCM Group 1 | E | H | A | B | D | F | 1.0 |
| MCM Group 2 | E | H | A | D | B | F | 0.9429 |
| MCM Group 3 | E | A | H | B | D | F | 0.9429 |
| MCM Group 4 | E | H | A | B | F | D | 0.9429 |
| MCM Group 5 | E | H | A | B | F | D | 0.9429 |
| MCM Group 6 | E | D | H | A | B | F | 0.6571 |
| IPF Group 1 | A | E | H | B | D | F | 0.8286 |
| IPF Group 2 | E | A | B | H | F | D | 0.7714 |
| IPF Group 3 | A | H | E | D | B | F | 0.7143 |
| IPF Group 4 | H | A | B | E | D | F | 0.6571 |
| IPF Group 5 | A | H | F | E | B | D | 0.3143 |
| IPF Group 6 | A | B | E | D | F | H | 0.1429 |
| Standard priority list | E | H | A | B | D | F | (1) |

MCM stands for multicriteria method; IPF for investment portfolio method.

Figure 3: Overview of priority lists established with MCM and IPF
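The article does not state which statistical test underlies the reported significance of the difference between MCM and IPF. As one possible check, and explicitly a swapped-in technique rather than the authors' procedure, the sketch below runs an exact one-sided permutation test on the difference in mean Spearman coefficient, using the values printed in figure 3. The function name `permutation_test` and its tolerance parameter are assumptions made for this illustration.

```python
from itertools import combinations

# Spearman coefficients as printed in figure 3.
MCM = [1.0, 0.9429, 0.9429, 0.9429, 0.9429, 0.6571]
IPF = [0.8286, 0.7714, 0.7143, 0.6571, 0.3143, 0.1429]


def permutation_test(a, b, tol=1e-9):
    """Exact one-sided permutation test for mean(a) > mean(b).

    Enumerates every way of splitting the pooled values into groups of
    the original sizes and counts the splits whose difference in means
    is at least as large as the observed one.
    """
    pooled = a + b
    total = sum(pooled)
    observed = sum(a) / len(a) - sum(b) / len(b)
    at_least_as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / len(a) - (total - sum_a) / len(b)
        n_splits += 1
        # The tolerance guards against floating-point noise in tied sums.
        if diff >= observed - tol:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_splits


observed, p_value = permutation_test(MCM, IPF)
print(f"difference in mean correlation: {observed:.4f}")
print(f"one-sided p-value: {p_value:.4f}")
```

Under these assumptions the one-sided p-value falls below 0.05, which is at least consistent with the significance reported in the text. A different test, or a two-sided variant, would give a different figure.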
Hypothesis 1.2: decision makers applying the IPF method would be more confident of their
priority list.
This hypothesis was not falsified. The
questionnaire contained a number of questions regarding the decision maker’s
confidence in the priority list. First, there was the straightforward question of how sure
the decision maker was of the established priority list. Second, there were many
questions regarding the success of the discussion in general. Finally, the
evaluation was observed to verify whether the task was completed seriously. The
results of the straightforward question about the confidence of the decision
maker are given in figure 4.
Figure 4: Bar chart of the answers to the question "How confident are you about the correctness of the priority list?" (in number of students per answer category)
The means of the answers are statistically different with a
99% certainty.
A slight difference could also be
observed in the plenary discussion. The discussions in the plenary
sessions were, however, rather bleak: the evaluation required a lot of time and effort,
and however interested the students were, they were not interested in a new
discussion in the plenary session. Only a few of them took part in the
discussion, and therefore the outcome of the plenary discussion is not
included in the results.
Hypothesis 1.3: groups applying the IPF method would be faster at establishing their
priority lists.
This hypothesis was not falsified. The
time every group required to complete the evaluation was registered and is given
in figure 5.
| Group | Time to complete evaluation (minutes) | Sufficient time? |
|---|---|---|
| MCM Group 1 | 70 | No |
| MCM Group 2 | 65 | Yes |
| MCM Group 3 | 60 | No |