This article was originally published in The Rational Edge in February 2004. Since it is no longer available there I have republished it here. Apologies if the picture quality is not the best. They have not survived various storages very well…
Using RequisitePro to evaluate competing bids
I was recently involved in a project at a government agency in Sweden. The project was about the development of a new weapons system for a warship. A contractor would develop this system, and the job of the agency was to specify the requirements on the system, select a contractor for the development and later when the system was developed, verify and validate it against the original requirements.
We spent quite some time on specifying the requirements so that an RFP (Request For Proposal) could be sent out, and while doing this, we started to think about how we would go about selecting the winning bid and thereby award the contract to some contractor. In this project we started looking at using IBM Rational RequisitePro to help us in the selection process. These ideas will be explained in this article.
If you send out an RFP in a competitive acquisition, you would expect to get more than one bid back from separate contractors or groups of contractors. If this is the case, you end up with the problem of selecting one winning bid. If the project is somewhat large and the acquisition is competitive, you could probably also expect to be held responsible for the selection decision, possibly by one of the ‘losing’ contractors taking you to court. If this happens, it is a good idea to have made a well-documented decision, based on objective principles that the contractors knew about before they entered their bids as responses to the RFP.
This means that you should use an objective evaluation process, and that this evaluation process should be described as a part of the original RFP.
So, we would have to set up an evaluation process, but before we do that, let’s discuss for a moment what factors and forces will shape this process to its final form.
Tradeoffs between requirements
The bids from different contractors will not be equal. They will meet your set of requirements in different fashions and probably also have a different coverage on what specific requirements they actually meet and how well they do it.
You have to be able to make decisions on how much it is worth for contractor A that they meet requirement X very well but not requirement Y, while in turn contractor B doesn’t meet requirement X and meets requirement Y very well. In short, you have to be able to make the tradeoffs between your requirements, so that you can make an objective evaluation of the contractors’ bids.
If you have come up with some of these tradeoffs beforehand, you also have the possibility to include this information in your RFP. This would give the contractors better opportunities to create a bid that suits your needs better, but it also limits your freedom once the bids are to be evaluated. At least, by coming up with your tradeoffs beforehand you have the opportunity to send them out if you wish to do so.
Lots of data
The set of requirements for the future system may be large, meaning that the evaluation process has to look at a lot of information in the bids that come in from contractors, and possibly also has to evaluate the bids against a large set of criteria. This means that the evaluation will produce a lot of data. This large set of data should be kept in a well-structured form, so that specific questions about the evaluation can be answered easily at a later stage. Remember, you might have to defend your decision in court some day.
Team of people
Modern acquisitions usually involve not only technical aspects of the future system, but also a number of other things such as legal matters, payments and contracts, support agreements, provisioning, maintenance, training etc. This means that the work of evaluating the bids will be a collective effort of a team of specialists covering all of these disciplines.
So how would you go about solving this problem? A suggested solution will be described in this article. One part of the solution deals with the process to use for evaluation and another part deals with the tool support needed.
Just using some tool to do the evaluation is not enough. We also have to think about how the work should be performed, or in other words, what process to use.
There are two major roles involved in the evaluation process, the Evaluator and the Evaluation Manager.
The Evaluator is a person that is responsible for evaluating the bids from a specific perspective. These perspectives may be, for example, contracting and legal matters, technical fit, fit into maintenance organization etc. Since one person probably does not have satisfactory competence in all of these fields, there is probably going to be more than one evaluator involved, so that their combined competence covers all the areas. There could also be more than one evaluator for a specific competence area, meaning that we could have a situation where a specific criterion is assigned to more than one evaluator.
The Evaluation Manager is the person that is responsible for executing the evaluation project. He or she will coordinate the efforts of all evaluators and also combine all of the evaluations into a combined total score for each bid.
You probably have some requirements that every contractor must meet in order to be considered at all. Examples could be requirements imposed by laws and regulations or by safety standards. If a bid does not meet these requirements, it is disqualified from further consideration. Since we don’t have to go on considering such a bid, an early ‘quality-check’ of all bids should be done before they are evaluated.
Please note that it would be wise of you to have as few disqualifying requirements as possible. You would not want to disqualify any bids if you do not really need to do so.
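The early check described above can be sketched as a simple filter. This is only an illustration: the `Bid` class, the requirement IDs and the function name are made up for the example and are not taken from the project.

```python
from dataclasses import dataclass, field

@dataclass
class Bid:
    name: str
    # IDs of the mandatory ("shall") requirements this bid has been judged to meet
    met_mandatory: set = field(default_factory=set)

def screen_bids(bids, mandatory_ids):
    """Split bids into (qualified, disqualified) using the mandatory requirements."""
    qualified, disqualified = [], []
    for bid in bids:
        missing = set(mandatory_ids) - bid.met_mandatory
        if missing:
            # Keep the unmet requirements so the disqualification can be documented
            disqualified.append((bid, sorted(missing)))
        else:
            qualified.append(bid)
    return qualified, disqualified

# Hypothetical data for illustration only
mandatory = {"SAFETY-1", "LEGAL-1"}
bids = [Bid("Bid A", {"SAFETY-1", "LEGAL-1"}), Bid("Bid B", {"SAFETY-1"})]
qualified, disqualified = screen_bids(bids, mandatory)
```

Recording which mandatory requirements were missed, rather than just dropping the bid, keeps the disqualification as well documented as the rest of the evaluation.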
Setting up criteria
After having checked the bids against the disqualifying shall-requirements, you hopefully have more than one bid left. If this is the case, you have to find some way of prioritizing between them.
Therefore I propose that you set up a hierarchical tree of criteria. An example of the top levels of such a hierarchical tree is shown in Figure 1. In any real-life project, the tree of criteria will of course be much deeper and wider than this simple example, but I feel that this is enough to get the idea across.
Figure 1 Example of criteria tree (top levels)
When the issues that need to be evaluated for each bid are split up into a tree like this, it is quite easy to designate specific people to evaluate specific branches of the tree, so that the whole tree is covered by suitable experts.
The next step is to think about how important each branch of the tree is. Every criterion in the tree should be fitted with a number indicating its relative weight compared to its siblings.
Figure 2 shows us the same tree of criteria as in Figure 1, but now with the relative weights added. Since the weight is a relative number, its scale is arbitrary, just as long as we use the same scale within every ‘family’ of criteria. In the example of Figure 2, I have used a percentage scale (the weights within each family add up to 100), but any scale could have been used.
Figure 2 Example of criteria tree with weights (top levels)
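To illustrate why the scale of the weights is arbitrary, the following minimal sketch converts the weights of one family into fractions of the family total. The weight values are invented for the example and are not the ones in Figure 2.

```python
def normalized_weights(family):
    """Convert the relative weights of one family of siblings into fractions of 1."""
    total = sum(family.values())
    if total <= 0:
        raise ValueError("a family needs a positive total weight")
    return {name: weight / total for name, weight in family.items()}

# The same (hypothetical) priorities expressed on two different scales
percent = normalized_weights(
    {"Commercial issues": 30, "Technical solution": 50, "Business commitment": 20}
)
points = normalized_weights(
    {"Commercial issues": 3, "Technical solution": 5, "Business commitment": 2}
)
```

Both calls describe the same relative importance, so it does not matter which scale a family uses as long as all siblings in that family share it.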
As noted above, it may be a good idea to provide at least parts of your prioritization to potential contractors when the RFP is sent out. If you wish to do so, it is the information contained in the criteria tree, including the weights, that should be sent out (at least the higher levels of it).
Criteria vs. Requirements
The tree should be developed deeply enough so that each leaf in the tree represents evaluation of a single thing.
Does this mean that you should have one criterion for every requirement in your requirement set? I would say that the answer in most cases is no. There could certainly be cases where you have a criterion represent a single requirement, but if your requirement set is somewhat complicated, you will probably find that it can save you some work if the evaluators are asked to evaluate a set of related requirements together. This means that a single criterion is related to many requirements.
Furthermore, you could also have situations where some of your leaf criteria do not relate to any requirements at all. You would probably like to have room for subjective judgments by experts, so that you don’t lose the opportunity to consider other factors than hard requirements.
This means that the set of criteria is not the same as the set of requirements, but the relationships between the two sets are worth keeping track of.
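The relationship between the two sets can be pictured as a mapping from each leaf criterion to the requirements it covers. The sketch below uses invented criterion names and requirement IDs to show the three cases discussed above: one criterion per requirement, one criterion for several requirements, and a criterion with no requirement behind it.

```python
# Hypothetical traceability: leaf criterion -> requirements it covers
criterion_to_reqs = {
    "Req 1": ["SR-12"],                                   # one criterion, one requirement
    "Sensor performance": ["SR-20", "SR-21", "SR-22"],    # a set of related requirements
    "Overall impression": [],                             # expert judgment, no requirement
}
all_requirements = {"SR-12", "SR-20", "SR-21", "SR-22", "SR-30"}

# Requirements that no criterion evaluates
covered = {req for reqs in criterion_to_reqs.values() for req in reqs}
uncovered_requirements = sorted(all_requirements - covered)

# Criteria that rest on judgment alone
judgment_only_criteria = sorted(
    name for name, reqs in criterion_to_reqs.items() if not reqs
)
```

A check like this makes it easy to spot requirements that would otherwise slip through the evaluation unnoticed.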
Now it is time to do the actual evaluations for all of the leaf criteria in the tree that was developed previously. This means that a skilled evaluator will go through the bids and assign scores for each bid for every leaf criterion assigned to him or her.
As we noted above, it is important to have good documentation on why you chose a specific bid before another. That means that the evaluators must not only provide their evaluations, but also document their rationale for giving such an evaluation.
For the evaluation manager, there are two things to keep in mind as the evaluation work is done:
First, it is important that the evaluators do not know the relative weight assigned to each criterion. Instead, they should only be provided with the list of criteria to evaluate, and any information on how to judge it. The reason for not providing them with the weighting information is that if they know it, they could let the weight influence their decision. We would like their evaluation to simply be an evaluation on how well each bid meets each criterion, no matter how important that criterion is to us.
Second, the evaluation manager must think about the scale of the scores that the evaluations should result in. It is important that the evaluations result in scores that are of the same scale in the whole tree. What specific scale we use is not important, just as long as it is the same everywhere. For example we could use a scale of scores from 1 to 10 or percentages, just as long as we are consistent.
We could also envision a system of ‘transfer functions’, meaning that we provide the evaluators with a set of responses and beforehand determine how many score points that each answer would give. This would give us more flexibility in how the evaluation is done, but also give us more work.
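A transfer function of this kind can be as simple as a lookup table from predefined answers to score points. The criterion, the answers and the point values below are assumptions made for the sake of the example, using a 1 to 10 scale.

```python
# Hypothetical transfer function for one criterion: answer -> score on a 1-10 scale
DELIVERY_TRANSFER = {
    "within 12 months": 10,
    "within 18 months": 7,
    "within 24 months": 4,
    "later": 1,
}

def score_from_answer(transfer, answer):
    """Look up the predetermined score for an evaluator's answer."""
    try:
        return transfer[answer]
    except KeyError:
        # Reject answers that were not defined beforehand
        raise ValueError(f"no score defined for answer {answer!r}") from None

score = score_from_answer(DELIVERY_TRANSFER, "within 18 months")
```

The extra work lies in agreeing on the answers and their point values for every criterion before the bids arrive.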
There could be situations where more than one evaluator is asked to evaluate a specific criterion. If that were the case, we would have to come up with a system to combine their individual evaluations into a single combined score. In the project I was in we decided on using the average of the evaluations, but also let the system warn us if their evaluations were too far apart. If they were too far apart, it could be fixed by the overrides that are described below.
Figure 3 shows us an example of how our previous criteria tree can be used to evaluate two competing bids. Evaluation scores for two bids are assigned to each leaf node. We can also see that the criterion “Req 1” is evaluated by two persons, so we have a small hierarchy of evaluations for that criterion. To get to a single evaluation for Req 1 we have used the averaging mentioned above. The system might warn us that the evaluations of the second bid for Req 1 are very far apart, which tells us that we have to investigate why the two evaluators have such different views on how the second bid meets Req 1.
Figure 3 Criteria tree with evaluations
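The averaging and the warning can be sketched as below. The threshold `max_spread` is an assumption; the article does not say how far apart two evaluations had to be before the system warned. The individual scores in the example are also invented, chosen only so that they average to 5.5.

```python
from statistics import mean

def combine_evaluations(scores, max_spread=3.0):
    """Return (average, is_suspect) for several evaluators' scores of one criterion."""
    if not scores:
        raise ValueError("at least one evaluation is required")
    spread = max(scores) - min(scores)
    # Flag the combined score when the evaluators disagree by more than max_spread
    return mean(scores), spread > max_spread

# Two hypothetical evaluations of the same criterion for the same bid
average, is_suspect = combine_evaluations([2, 9])
```

A flagged score is not an error in itself; it is a prompt for the evaluation manager to talk to the evaluators before trusting the average.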
So, now we have made the evaluations of each leaf criterion, but we have to have a way of combining these evaluations into a total score for each bid that can be compared to the total score of the other bids.
Since we have a numbered score for each leaf criterion, and also a number telling us how important each criterion is, we simply postulate the following rule:
For a specific bid, the score for a specific criterion is the weighted average of the scores of all its children. The average is weighted by the relative weight of those children.
By using this rule recursively from the bottom of the tree up to the top, we eventually come up with a total score for the bid by looking at the calculated score for the root node.
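A minimal recursive sketch of this rule follows. The `Criterion` class and the weights 60 and 40 are assumptions for illustration; the scores 6 and 8 are the ones used for the first bid in the example that follows.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Criterion:
    name: str
    weight: float = 1.0
    score: Optional[float] = None          # set on leaf criteria only
    children: list = field(default_factory=list)

def aggregate(criterion):
    """Score of a criterion: its leaf score, or the weighted average of its children."""
    if not criterion.children:
        if criterion.score is None:
            raise ValueError(f"leaf {criterion.name!r} has no evaluation")
        return criterion.score
    total_weight = sum(child.weight for child in criterion.children)
    weighted_sum = sum(child.weight * aggregate(child) for child in criterion.children)
    return weighted_sum / total_weight

# Hypothetical weights; the leaf scores are those of the first bid
commercial = Criterion("Commercial issues", children=[
    Criterion("Price", weight=60, score=6),
    Criterion("Delivery date", weight=40, score=8),
])
commercial_score = aggregate(commercial)
```

With these assumed weights the criterion would score 6.8; the weights actually used in the project may give a different value. Dividing by the sum of the children's weights is what makes the scale of the weights irrelevant.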
If we follow this rule for the evaluations of the first bid in Figure 3, we would find that the score of the “Price” criterion is 6 and the score of the “Delivery date” criterion is 8. This means that the score for the criterion “Commercial issues” is the weighted average of these two scores, using the weights given to “Price” and “Delivery date” in Figure 2.
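In symbols, the rule and this particular calculation can be written as follows. The notation is introduced here for illustration only: S(c) is the score of criterion c for the bid, c_i are its children and w_i their relative weights. The two weights in the second expression stand for the Figure 2 weights of “Price” and “Delivery date”.

```latex
S(c) = \frac{\sum_{i} w_i \, S(c_i)}{\sum_{i} w_i}
\qquad\Longrightarrow\qquad
S(\text{Commercial issues}) =
  \frac{w_{\text{Price}} \cdot 6 + w_{\text{Delivery date}} \cdot 8}
       {w_{\text{Price}} + w_{\text{Delivery date}}}
```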
Following the same pattern, we could calculate the score of “Adherence to specific reqs” to be 5.4, “Technical solution” to be 6.04, “Business commitment” to be 5.8, and finally the score of the root criterion to be 6.172.
Now, we could do the same calculation for the other bid and compare that total calculated score with this value and easily see which one has the highest total score.
Now we have come up with a total score for each bid. Does this mean that we automatically should choose the bid with the highest score? I would say no, not necessarily.
We have to go back and look at the calculations and evaluations to see if anything seems wrong. There could very well be situations where we don’t agree with a score (calculated or evaluated) somewhere in the tree and we might feel inclined to change it. Therefore, in the project I was in, we made it possible to override the calculated values. Of course, any such override would require documentation of its rationale, just as evaluations require documented rationales.
Technically, the overrides work in the way that the evaluation manager would assign his/her own score to a specific criterion in the tree. When the aggregation is done, the system would start to look for scores from the top, and when one is found, it would not go further down the tree.
In Figure 4, we can see how the evaluation manager has changed the score of the criterion “Business commitment” for the first bid to 4, instead of the calculated 5.8 as described above. This means that the aggregation will disregard any scores for the first bid below this point, and use the value of 4 instead.
Figure 4 Criteria tree with scores and overrides
Another example of an override is shown in Figure 4. As you might remember from above, the combined evaluations of the second bid for “Req 1” are very far apart. After consulting with the evaluators, the evaluation manager finds that he or she does not agree with the averaged number, so he or she simply changes the calculated average of 5.5 to 7. This means that the calculation does not go below this point and uses the value of 7 instead.
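The top-down lookup with overrides can be sketched as a small variation of the weighted-average rule. The dictionary layout and the two sub-criteria are assumptions made for the example; only the calculated value 5.8 and the override value 4 come from the text.

```python
def aggregate_with_overrides(node):
    """Weighted-average score of a node; an override stops the descent at that node."""
    if node.get("override") is not None:
        return node["override"]          # scores below this point are disregarded
    children = node.get("children", [])
    if not children:
        return node["score"]
    total = sum(child["weight"] for child in children)
    return sum(
        child["weight"] * aggregate_with_overrides(child) for child in children
    ) / total

# Hypothetical sub-criteria whose scores average to 5.8
business = {
    "name": "Business commitment", "weight": 20,
    "children": [
        {"name": "Sub-criterion A", "weight": 1, "score": 5.0},
        {"name": "Sub-criterion B", "weight": 1, "score": 6.6},
    ],
}
calculated = aggregate_with_overrides(business)   # weighted average of the children
business["override"] = 4                          # the evaluation manager's own score
overridden = aggregate_with_overrides(business)
```

Because the evaluations below an overridden node are left untouched, removing the override restores the calculated value, and the original evaluations and their rationales remain available for review.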
When the evaluation manager has changed the scores, and documented the rationales behind these changes, the aggregation can simply be done again, now giving us a slightly modified total score. This process (change evaluations, recalculate) can be repeated as many times as wanted, and finally you have a total score that you would agree with. At this point, but not before, you select the bid with the highest total score, and thereby your evaluation is finished.
The overrides should not be used too much. They are introduced as a way of changing results that are obviously wrong, but they should not be used to steer the results of the evaluation in the direction of a specific bid. If you use them that way, the whole purpose of using an objective way of evaluating bids falls apart.
On the market, there are some specialized bid evaluation tools that would enable a process like the one described above. For the project that I was in, however, we chose another solution than using those tools. The reason for this decision was that we already had a requirements database in IBM Rational RequisitePro and did not want to introduce another database, with all the synchronization problems we would have between them. We also felt that the price of these tools was too high. Instead we decided on extending our RequisitePro database to also fit the needs of the evaluation.
The RequisitePro database can be used to keep track of all of the information about the evaluation. The structure of RequisitePro requirement types, attributes and traceabilities that are used is shown in Figure 5.
Figure 5 RequisitePro project structure
The requirement types shown in Figure 5 are explained in Table 1.
|Requirement type|Tag prefix|Description|
|---|---|---|
|Criteria|CR|These requirements form a hierarchy of criteria as explained in this article. The hierarchy is formed by using the hierarchical feature of RequisitePro.|
|Bid|B|Each requirement of this type represents a bid.|
|Evaluation Score|EV|These requirements represent the evaluations given by evaluators.|
|Calculated Score|CLC|These requirements represent the scores calculated from the evaluation scores in the higher levels of the criteria hierarchy. Each calculation will produce a hierarchy of calculated scores that mirrors the hierarchy of criteria.|
|General Requirement| |This represents the real system requirements that the criteria are there to evaluate. These requirements can be of any type.|
Table 1 RequisitePro requirement types
The attributes shown in Figure 5 are explained in Table 2.
|Attribute|Requirement type|Description|
|---|---|---|
|Name|Criteria|A descriptive name of the criterion.|
|Weight|Criteria|Holds the relative weight of the criterion as described above.|
|Evaluator|Criteria|Indicates which evaluator is responsible for evaluating this criterion and any criteria below it.|
|Score|Evaluation Score|Holds the actual score given by the evaluator.|
|Rationale|Evaluation Score|A textual description that the evaluator provides, indicating why he or she has given this specific score.|
|Evaluator|Evaluation Score|Indicates who made the evaluation.|
|IsSuspect|Evaluation Score|As explained above, if two evaluators evaluate the same criterion for the same bid and their scores are very far apart, the total evaluation is considered to be suspect.|
|IsSuspect|Calculated Score|When you make an aggregation and create calculated scores, any calculated score that has at least one suspect evaluation score as input (i.e. any calculated score higher up in the tree on the same branch) is also considered to be suspect.|
|Score|Calculated Score|The calculated value for the calculated score.|
|Name|Bid|A descriptive name of the bid.|
Table 2 RequisitePro attributes
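The way the IsSuspect flag travels up the tree can be sketched as follows. The dictionary layout and the tree contents are assumptions for illustration; in the project the flag was stored as a RequisitePro attribute.

```python
def mark_suspect(node):
    """Set node['is_suspect'] throughout a tree and return the flag of the given node."""
    children = node.get("children", [])
    if children:
        # Build a list so that every child is visited, even after a suspect one is found
        flags = [mark_suspect(child) for child in children]
        node["is_suspect"] = any(flags)
    else:
        node["is_suspect"] = bool(node.get("is_suspect", False))
    return node["is_suspect"]

# Hypothetical tree in which one leaf evaluation has been flagged as suspect
tree = {
    "name": "Total", "children": [
        {"name": "Commercial issues", "children": [
            {"name": "Price"},
        ]},
        {"name": "Technical solution", "children": [
            {"name": "Req 1", "is_suspect": True},
            {"name": "Req 2"},
        ]},
    ],
}
root_is_suspect = mark_suspect(tree)
```

Only the branch that contains the suspect evaluation is flagged, which points the evaluation manager directly to the part of the tree that needs attention.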
The functionality to do the calculations for the aggregation is not available in RequisitePro. Therefore an extension had to be created. This extension uses the COM API of RequisitePro to do the aggregation calculation, and store the results by means of creating a hierarchy of calculated scores that mirrors the hierarchy of criteria. After the calculation is done, the Evaluation Manager can browse through the result by examining a RequisitePro view.
With the aggregation extension, we would have all the functionality that is absolutely necessary to do the evaluation, but we also saw the need to make further extensions to ease the communication between the Evaluation Manager and the Evaluators. The reason for this is that the Evaluators did not have access to RequisitePro, and even if they did, they could not be asked to use the RequisitePro user interface to enter their evaluations in the database. Therefore, we also created spreadsheets in Excel to import and export evaluations. These spreadsheets had VBA macros that interfaced to the RequisitePro API to create the RequisitePro requirements and attributes from the data in the spreadsheet.
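The exchange of evaluation sheets can be illustrated without any RequisitePro calls. The sketch below is not the VBA solution used in the project; it uses CSV text and invented field names to show the two points that matter: the sheet handed to an evaluator contains no weights, and a score is only accepted together with a rationale.

```python
import csv
import io

def export_evaluation_sheet(criteria, bids, evaluator):
    """Return CSV text with one empty row per (criterion, bid) assigned to an evaluator."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["criterion", "bid", "score", "rationale"])
    for criterion in criteria:
        if criterion["evaluator"] == evaluator:
            for bid in bids:
                # The weight is deliberately left out of the sheet
                writer.writerow([criterion["name"], bid, "", ""])
    return out.getvalue()

def import_evaluation_sheet(text):
    """Read a filled-in sheet, refusing rows without a documented rationale."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if not row["rationale"].strip():
            raise ValueError(f"missing rationale for {row['criterion']} / {row['bid']}")
        rows.append({**row, "score": float(row["score"])})
    return rows

# Hypothetical criteria and evaluator names
criteria = [
    {"name": "Price", "weight": 60, "evaluator": "anna"},
    {"name": "Req 1", "weight": 30, "evaluator": "bo"},
]
sheet = export_evaluation_sheet(criteria, ["Bid A", "Bid B"], "anna")
```

A filled-in sheet returned by the evaluator would then be passed to `import_evaluation_sheet` before the scores are stored.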
Further explanations of the usage of RequisitePro for these needs can be found in the section Appendix – Simple add-in to IBM Rational RequisitePro below, where a simple example of the aggregation functionality is also provided.
So, what are the benefits of doing the evaluation this way?
First, I feel that there are quite a lot of benefits of not introducing another database of information into the project. The evaluation information is tightly connected to the requirements and relationships should be kept between these two sets of information. If we were to use another tool than RequisitePro for the evaluation effort, we would run into synchronization issues between these two tools.
Second, since there is a high probability that you would have to publicly defend your decision (possibly in court), it is a very good idea to keep track of all the information related to the evaluation. The process described above forces the evaluators to document rationales for their evaluations and goes on to combine these evaluations in a well-documented way. All of this information is stored in one place in the database, so you don’t run any risk of losing it.
Furthermore, the somewhat objective way of combining subjective evaluations into a total score gives the evaluation a ‘scientific’ flavor, which is easy to defend. It can also, as noted above, be provided beforehand to the contractors in the RFP.
Progressive Acquisition or Waterfall
There has been some work done on transforming the traditional waterfall approach to acquisition into an iterative approach called Progressive Acquisition. How do the ideas presented in this article relate to Progressive Acquisition?
I would say that the ideas in this article are valid no matter if you do a traditional waterfall approach or if you use the more modern Progressive approach. For either of these approaches, you would still have to evaluate bids from contractors. In the waterfall approach you would probably do one evaluation early on in the project, and for the Progressive Acquisition you would probably do more than one evaluation, possibly one for every increment. These evaluations can be performed in the fashion described in this article.
For the help in developing the ideas in this article, I must give a lot of credit to Dean Fowler and Erik Djerf at the Defence Material Administration in Sweden. Furthermore, I’d like to thank Nils Kronqvist for all of the help with VB issues in the development of the RequisitePro add-in and also Mats Rahm for proofreading this article.
 By family I mean the set of child-criteria to a specific parent criterion.
 Of course, they should also be provided with the RFP and the bids that they should evaluate.
 In this example, a scale of 1-10 is implied.
 There could be many reasons for doing this. Perhaps the first evaluator has misunderstood something, or the evaluation manager trusts the second evaluator a lot more than the first one.
 In Figure 5, I have used UML Classes to represent RequisitePro requirement types, UML Attributes to represent RequisitePro attributes and UML dependencies to represent possible RequisitePro traceabilities. Furthermore, attributes that are stereotyped as <<name>> are not stored as RequisitePro attributes, but instead in the name-field of the requirement(s) of the specific type.
 You can find more information about progressive acquisition in the following references:
- WEAG TA-13 Acquisition Programme. Guidance on the Use of Progressive Acquisition, Version 2. November 2000.
- Giles Pilette. Progressive Acquisition and the RUP: Comparing and Combining Iterative Processes for Acquisition and Software Development. The Rational Edge, November 2001.
- R. Max Wideman. Progressive Acquisition and the RUP. The Rational Edge, December 2002.