ISSN 1566-6379

First published in 2003

Paper 2 - Issue 1


A dilemma between decision quality and confidence in the decision: experimental validation of investment analysis methods.
Egon Berghout, University of Groningen, Faculty of Economic Sciences, Groningen, The Netherlands.
e.w.berghout@eco.rug.nl

1. Introduction

Many studies have been conducted on the evaluation of information systems, and many conferences have been devoted to the subject (examples of studies are Kauffman and Weill, 1989; McKeen and Smith, 1993b; Willcocks, 1992; Farbey et al., 1993; Hitt and Brynjolfsson, 1994; examples of conferences are the 15th International Conference on Information Systems, with the research theme “Improving productivity and adding value through information systems”, and the European Conference on IT Investment Evaluation, Henley-on-Thames, 13/14th September 1994 and 11/12th July 1995). Besides hundreds of articles, numerous books have been published (for instance, Parker and Benson, 1988; Banker et al., 1993; Remenyi et al., 1993; Hogbin and Thomas, 1994; Hares and Royle, 1994; Willcocks, 1994; Farbey et al., 1993; Gotlieb, 1985).

Nowadays decision makers in the area of information systems can choose among a wide variety of methods (overviews have been given by Farbey et al., 1993; Willcocks, 1994; Powell, 1992; Willcocks, 1992; Avgerou, 1995). Renkema and Berghout refer to over sixty methods (Renkema and Berghout, 1996). However, little effort has been put into the validation of these methods. Some methods have been applied in cases; most methods, however, have not been tested at all. From a scientific point of view this remains unsatisfactory.

The issues of investment analysis methods are, therefore, elaborated upon in this article, using an experimental approach. In the experiments decision makers had to establish a priority list for information systems for a real-life case. Subsequently, a number of elements of the evaluation method were altered in the succeeding sessions. The extent to which the particular alteration influenced the success of the evaluation was then measured. The article starts with a description of the outline of the experiments, the definition of success used in the experiments, and the research conditions; then, the experiments and their results are described. The article ends with a summary and conclusions.

2. Outline of the experiments

The experiments were performed in two sets. In this section the outline of both sets is described.

The first set of experiments was designed to analyse the differences caused by varying the evaluation process steps, i.e. only the way in which the evaluation aspects were presented and weighted was altered. This procedure was designed to answer two questions: whether even small changes in a particular evaluation method influence the success of the evaluation, and whether the proposed experiments were suitable for the intended analysis. The two evaluation methods that were applied for this analysis are described in appendix B.

The second set of experiments was designed to analyse the differences caused by varying the evaluation aspects, i.e. in these experiments the content of the evaluation aspects was altered. The second set was also used to determine whether the assumed strengths of the two methods of the first set resulted in a more successful third evaluation method. The method that was applied for this analysis is described in appendix C.

Summarised, the independent variables were the evaluation process and evaluation aspects applied by the decision makers, and the dependent variable was the success of the evaluation. The constants consisted of the case description used and the type of decision makers involved in the experiments. An outline of the experiments is given in figure 1.

 

Figure 1: Variables of the experiments


The case description was identical for all the experiments, and consisted of a Newspaper Publisher which had to identify a priority list for its eight information system proposals. The case description was based on a real-life situation described by Van Irsel and Van Reeken (Van Irsel and Van Reeken, 1994; Delahaye and Van Reeken, 1992).

The proposals were submitted by particular departments, and these differed in, for example, size, technology, risk, costs and benefits. The evaluation committee was identical to the management team of the Newspaper Publisher, and consisted of the Managing Director, the Chief Information Officer, the Head of Production, and the Head of Marketing and Sales. To ensure that information exchange was required to establish the priority list, all members of the evaluation committee had one paragraph of proprietary information (personal objectives and opinions).

Students in an advanced stage of a Master of Science programme were used as decision makers. The experiments formed part of an ongoing Information Management course available at universities in the Netherlands.

In total 47 students participated in the first set of experiments and 66 in the second set. Together with the two test experiments done by 8 students, 129 students took part in the experiments.

Each experiment consisted of five subsequent activities. First, the issue of evaluation of information systems and the case of the Newspaper Publisher were introduced. Second, the particular evaluation method to be used was explained. Third, the students studied the case of the Newspaper Publisher and identified a priority list. Fourth, the students completed a questionnaire regarding their opinions about the evaluation that had been performed. Fifth, the results produced by specific groups were discussed in a plenary session.

The students were placed randomly in groups of preferably four. Depending on the total number of available students, sometimes groups of three or five had to be formed. In the situation where a group of three was formed, the role of managing director was left out. In the situation where a group of five was formed, a second general manager was added. In both situations the information exchange associated with the other roles remained unchanged.

Master of Science students cannot be assumed to represent the diversity of objectives that management teams sometimes do. They are, however, regarded as adequate decision makers in the setting of these experiments, because:

· They are near completion of their studies and many of them will be involved in these types of decisions shortly.

· Managers would also be new to the context of the Newspaper Publisher. There are simply not enough newspaper managers to perform a statistically representative number of experiments. Repeating the experiment with a new method but the same managers proved not to work, because the managers started to work towards their previous results.

3. Success of the evaluation method

The experiments were designed to analyse the effects of modifications of an evaluation method on the success of solving a case, and therefore required a measure of success. In this research, success comprised three measures.

The first measure of success was the priority list that was established by the group. The text of the case was modified to support a particular standard priority list of six of the eight information systems (although this standard list of six was still considered to be far from obvious). The standard list was cross-checked with two experts, and turned out to be identical to:

· The average priority list of both sets of experiments and all experiments together.

· The modal position of all proposals over all experiments (most frequent position in the priority list).

The average solution would have been an obvious competing candidate for a standard solution. The fact that both the average and the modal solution were identical to the standard solution is regarded as a confirmation of its correctness.

Only 2 out of 30 groups captured the standard solution exactly, which confirmed that the priority list was far from obvious. The distribution of all priority lists resulting from all of the experiments is given in figure 2.

Position on         Information system proposal
priority list          E     H     A     B     D     F
1                     16     6     7     1     0     0
2                      4    11     6     6     3     0
3                      6     7    10     3     2     2
4                      3     4     5    13     5     1
5                      1     1     1     6    12     8
6                      0     1     1     1     8    19
Average position     2.0   2.6   2.7   3.7   4.7   5.5

 Figure 2: Distribution of the information systems on the priority lists and average position
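The cross-check of the standard list against the modal and average positions can be retraced from the counts in figure 2. The following Python sketch is illustrative only: the names `PROPOSALS`, `COUNTS`, `modal_position` and `average_position` are assumptions introduced here and are not part of the original study.

```python
# Counts taken from figure 2: one row per position (1..6),
# one column per proposal in the order E, H, A, B, D, F.
PROPOSALS = ["E", "H", "A", "B", "D", "F"]
COUNTS = [
    [16, 6, 7, 1, 0, 0],
    [4, 11, 6, 6, 3, 0],
    [6, 7, 10, 3, 2, 2],
    [3, 4, 5, 13, 5, 1],
    [1, 1, 1, 6, 12, 8],
    [0, 1, 1, 1, 8, 19],
]


def column(col):
    """Number of groups that placed proposal `col` at positions 1..6."""
    return [row[col] for row in COUNTS]


def modal_position(col):
    """Most frequent position (1-based) of a proposal."""
    counts = column(col)
    return counts.index(max(counts)) + 1


def average_position(col):
    """Mean position of a proposal, weighted by the number of groups."""
    counts = column(col)
    weighted = sum(pos * n for pos, n in enumerate(counts, start=1))
    return weighted / sum(counts)


modal = {p: modal_position(i) for i, p in enumerate(PROPOSALS)}
average = {p: average_position(i) for i, p in enumerate(PROPOSALS)}

# Order the proposals by modal position and by average position.
by_mode = sorted(PROPOSALS, key=modal.get)
by_average = sorted(PROPOSALS, key=average.get)
print(by_mode)
print(by_average)
```

Both orderings can then be compared with the standard list E, H, A, B, D, F.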

The second measure of success used was the degree of confidence that the decision makers had in their priority list. In practice, where a standard priority list is missing, confidence in the priority list is considered to be of major importance.

The third measure of success was the extent to which consensus was reached in each group. In practice, information exchange is considered to be crucial in evaluation (Avgerou, 1995). To encourage information exchange in the experiments, the students were given the roles of department heads. It cannot, however, be assumed that Master of Science students represent the diversity of objectives that management teams sometimes do. This also had a positive side: the experiment contained conflicting business objectives and, because the students were randomly placed in groups, they were not expected to present conflicting personal objectives.

4. Investment portfolio method and multicriteria method (first set of experiments)

The first set of experiments is described in this section. In the first set of experiments two methods were tested, the multicriteria method (MCM), and the investment portfolio method (IPF). First, the hypotheses that were tested are explained. Then, the outcome of the experiments is described. An explanation of the two methods is given in appendix B.

4.1 Description of hypotheses

In the first set of experiments it was generally assumed that the IPF method is superior to the MCM. This was tested using the following hypotheses.

Hypothesis 1.1: the priority lists of groups applying the IPF method would prove to be a better match to the standard list.

This is based on the fact that the IPF method was assumed to encourage discussions between decision makers and to offer a better opportunity to identify the relative differences between the information system proposals.

Hypothesis 1.2: decision makers applying the IPF method would be more confident of their priority list.

The number of aspects of the information systems that need to be considered exceeds the human perceptual capacity of approximately seven chunks (Taylor, 1975, p. 411; De Vries, 1993, p. 75). It was assumed that the IPF method helps decision makers to reduce the number of aspects involved without losing the essence of the proposals.

Hypothesis 1.3: groups applying the IPF method would be faster at establishing their priority lists.

The IPF method requires a less detailed analysis and should therefore be faster to complete. Given the fact that the evaluation effort should be in proportion to the perceived benefits of the evaluation, this is an important characteristic for an evaluation method.

Hypothesis 1.4: decision makers applying the IPF method could justify their priority list better.

As the IPF method is focused on differences between information systems, instead of on an overall assessment, the decision makers should gain better insight and this should be noticeable during the discussion in the plenary session.

4.2 Results of the first set of experiments

The results of the experiments are described in association with the hypotheses.

Hypothesis 1.1: the priority lists of groups applying the IPF method would prove to be a better match to the standard list.

This hypothesis was falsified. Figure 3 gives an overview of the established priority lists and the standard list. The priority lists established using the MCM match the standard list better than those derived using the IPF method. This difference is statistically significant (95% certainty).

Spearman’s Rank Correlation was applied to calculate the correlation coefficients between a particular priority list and the standard priority list. A Spearman’s Rank Correlation of +1 implies a full match, -1 implies a complete opposite.
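For two rankings of the same n items without ties, Spearman’s Rank Correlation can be computed as 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the positions of an item in the two rankings. The Python sketch below illustrates this under those assumptions; the function name `spearman_rho` and the example lists are chosen here for illustration.

```python
def spearman_rho(ranking, standard):
    """Spearman's rank correlation of two orderings of the same items (no ties)."""
    n = len(standard)
    standard_pos = {item: pos for pos, item in enumerate(standard, start=1)}
    # Sum of squared differences between the positions in both orderings.
    d_squared = sum(
        (pos - standard_pos[item]) ** 2
        for pos, item in enumerate(ranking, start=1)
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


STANDARD = "EHABDF"  # standard priority list, positions 1..6

print(round(spearman_rho("EHABDF", STANDARD), 4))  # identical list -> 1.0
print(round(spearman_rho("EHADBF", STANDARD), 4))  # positions 4 and 5 swapped -> 0.9429
print(round(spearman_rho("FDBAHE", STANDARD), 4))  # fully reversed list -> -1.0
```

A single swap of two adjacent proposals therefore already lowers the coefficient from 1.0 to 0.9429 for a list of six items.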

  

Group                     Position                  Spearman’s Rank
                          1   2   3   4   5   6     Correlation
MCM Group 1               E   H   A   B   D   F     1.0
MCM Group 2               E   H   A   D   B   F     0.9429
MCM Group 3               E   A   H   B   D   F     0.9429
MCM Group 4               E   H   A   B   F   D     0.9429
MCM Group 5               E   H   A   B   F   D     0.9429
MCM Group 6               E   D   H   A   B   F     0.6571
IPF Group 1               A   E   H   B   D   F     0.8286
IPF Group 2               E   A   B   H   F   D     0.7714
IPF Group 3               A   H   E   D   B   F     0.7143
IPF Group 4               H   A   B   E   D   F     0.6571
IPF Group 5               A   H   F   E   B   D     0.3143
IPF Group 6               A   B   E   D   F   H     0.1429
Standard priority list    E   H   A   B   D   F     (1)

MCM stands for multicriteria method; IPF for investment portfolio method.

 

Figure 3: Overview of priority lists established with MCM and IPF
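The article does not state which statistical test was used to compare the two methods. As one possible illustration, the Python sketch below runs an exact one-sided permutation test on the twelve lists of figure 3. The choice of test, the test statistic (the sum of squared position differences to the standard list, which decreases as Spearman’s Rank Correlation increases) and all names in the code are assumptions made here, not the author’s procedure.

```python
from itertools import combinations

STANDARD = "EHABDF"
# Priority lists from figure 3, written as strings of proposal letters.
MCM = ["EHABDF", "EHADBF", "EAHBDF", "EHABFD", "EHABFD", "EDHABF"]
IPF = ["AEHBDF", "EABHFD", "AHEDBF", "HABEDF", "AHFEBD", "ABEDFH"]


def squared_rank_distance(ranking, standard=STANDARD):
    """Sum of squared position differences; 0 means a full match."""
    return sum((ranking.index(p) - standard.index(p)) ** 2 for p in standard)


scores = [squared_rank_distance(r) for r in MCM + IPF]
observed = sum(scores[:len(MCM)])  # total distance of the MCM groups

# Enumerate every way of labelling 6 of the 12 groups as "MCM" and count
# how often the relabelled groups are at least as close to the standard list.
splits = list(combinations(range(len(scores)), len(MCM)))
as_extreme = sum(
    1 for idx in splits if sum(scores[i] for i in idx) <= observed
)
p_value = as_extreme / len(splits)
print(as_extreme, len(splits), p_value)
```

Integer distances are used instead of the rounded correlation coefficients so that ties between splits are counted exactly. Under these assumptions the one-sided p-value is below 0.05.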

 

Hypothesis 1.2: decision makers applying the IPF method would be more confident of their priority list.

This hypothesis was not falsified. The questionnaire contained a number of questions regarding the decision maker’s confidence in the priority list. First, there was the straightforward question of how sure the decision maker was of the established priority list. Second, there were many questions regarding the success of the discussion in general. Finally, the evaluation was observed to verify whether the task was completed seriously. The results of the straightforward question about the confidence of the decision maker are given in figure 4.

 

Figure 4: Bar chart of the answer to the question: "How confident are you about the correctness of the priority list" (in number of students per answer category)


The means of the answers are statistically different with a 99% certainty.

A slight difference could also be observed in the plenary discussion. The discussions in the plenary sessions were, however, rather bleak: the evaluation required a lot of time and effort and, however interested the students were, they were not interested in a new discussion in the plenary session. Only a few of them took part in the discussion, and the outcome of the plenary discussion is therefore not included in the results.

Hypothesis 1.3: groups applying the IPF method would be faster at establishing their priority lists.

This hypothesis was not falsified. The time every group required to complete the evaluation was registered and is given in figure 5.

 

Group          Time to complete evaluation (minutes)    Sufficient time?
MCM Group 1    70                                       No
MCM Group 2    65                                       Yes
MCM Group 3    60                                       No