Dataset underlying the publication titled Improving Decision Quality in High-Tech System Design: An In-Depth Study Leveraging Industry Expertise

Contact information: 
k.nizamis@utwente.nl
g.m.bonnema@utwente.nl
Department of Design, Production and Management, Faculty of Engineering Technology
University of Twente
7522 NB Enschede
The Netherlands

License: CC-BY 4.0

**General information**
This dataset contains all the data underlying the publication mentioned in the title, as well as supplementary figures analyzing the answers on decision quality by survey respondents' company position and experience level. The data consists of survey results, case study results, and the supplementary figures. All data was collected at an industrial partner company connected to the research.

The survey asked industry experts for their insights on their decision-making processes and on decision-making in general. Respondents all replied to a general request to fill in the survey that was distributed throughout the company's systems engineering department. No selection was made (beyond the restriction to the systems engineering department) and participation was on a voluntary basis. The case studies contain twelve cases of decisions (with both good and bad outcomes) within a development project that is representative of the company. Each case is a description of the decision, its outcomes, and the factors influencing the link between decision and outcomes. Cases were studied by the researcher in consultation with several systems engineering experts from the company.


**Data description and details**
This dataset contains 7 files:

1. Case_studies.csv 
Contains the 12 decisions studied. Each entry has a judgment on the outcomes, a short summary, a description of the decision, the recognized factors that influenced the link between decision and outcomes, and the recognized outcomes. All data is generalized, meaning that all company confidential information was removed manually.

2. Case_studies_factors_summary.csv
Contains all the factors found in the case studies mentioned in Case_studies.csv, the frequency of these factors (how often they were found in the 12 cases in Case_studies.csv), and a judgment on whether the factor had a positive or negative influence on the link between decision and outcomes. Only the judgment per factor is new information compared to Case_studies.csv.

3. Survey_including_coding_categorization.csv
All survey data from all 93 participants. Also includes categorization information (for every participant a focus area and experience level) and the codes that capture the raw open answers. The first row contains the original questions or the descriptions of the contents of the columns. The data in this file was edited to remove company confidential information, meaning that the company name and certain references to their processes have been replaced with general terms. The answers to question 4 are completely replaced by generalized descriptions. The answers to question 21 (contact information of participants) were removed.

4. SurveyQ12_per_category.png
All categories of factors capturing 2 or more answers to question 12 of the survey, mapped to the different focus areas and experience levels of participants. This shows how often each category of participants gave the answers captured by each code. All data used to create this image can be found in Survey_including_coding_categorization.csv. This is a processed result and does not contain new information compared to Survey_including_coding_categorization.csv.

5. SurveyQ14-Q18_per_category.png
All answers to questions 14 to 18 from the survey displayed per category of focus area and experience level. This is supplementary to figure 6 in the publication. All data used to create this image can be found in Survey_including_coding_categorization.csv. This is a processed result and does not contain new information compared to Survey_including_coding_categorization.csv.

6. SurveyQ19_per_category.png
All codes capturing 2 or more answers to question 19 from the survey, mapped to the different focus areas and experience levels of participants. This shows how often each category of participants gave the answers captured by each code. All data used to create this image can be found in Survey_including_coding_categorization.csv. This is a processed result and does not contain new information compared to Survey_including_coding_categorization.csv.

7. SurveyQ20_per_category.png
All codes capturing 2 or more answers to question 20 from the survey, mapped to the different focus areas and experience levels of participants. This shows how often each category of participants gave the answers captured by each code. All data used to create this image can be found in Survey_including_coding_categorization.csv. This is a processed result and does not contain new information compared to Survey_including_coding_categorization.csv.
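Since Case_studies_factors_summary.csv is derived from Case_studies.csv, its frequency column is in principle reproducible by tallying how often each factor appears across the 12 cases. A minimal sketch of that relationship (the factor names below are invented placeholders, not factors from the actual files):

```python
from collections import Counter

# Placeholder stand-in for the factors recognized in each case;
# the real factors are listed per case in Case_studies.csv.
factors_per_case = [
    ["stakeholder involvement", "time pressure"],
    ["time pressure"],
    ["stakeholder involvement"],
]

# Frequency of a factor = how often it was found across the cases,
# as reported in Case_studies_factors_summary.csv.
frequency = Counter(f for factors in factors_per_case for f in factors)
```

With the placeholder data above, both factors appear in two of the three example cases, so each gets a frequency of 2.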


**Data collection and processing**
The dataset was generated using two main methods: a survey and a set of case studies. 

= Survey =
A survey on decision quality was developed and tested with a core team from the partner company connected to the research. The survey was then sent to everyone in the systems engineering department. No selection was made; participation was on a voluntary basis. The survey was completed by 93 respondents.

Coding of open answers was done in the following ways:
For Q12:
Coding was done by two researchers in parallel, both knowledgeable on the subject. The process consisted of the following steps:
	1. Both researchers code the answers separately
	2. Evaluate the codes in a discussion between the two researchers and another researcher from the same chair
	3. Redo the coding
	4. Compare the codes of the two researchers and agree on a definitive coding scheme

For Q19 and Q20:
Inductive open coding was done in 5 steps:
	1. Split the answers and create initial codes
	2. Re-evaluate and merge similar codes
	3. Review by a second researcher knowledgeable on the subject
	4. Adjust based on the review
	5. Remove confidential information

Rule-based determination of the focus area and experience level of each participant was done as follows:
Design focus area 	= Product/system focus OR Subsystem or (sub)function focus OR System architect
Teams focus area	= Group leader OR SE program manager OR Supersystem focus OR Customers' sites focus OR Integration focus
Unknown focus area	= All others
Senior experience level = >10 years at the company AND >10 years SE AND >5 product development cycles
Junior experience level = (<1 year at company OR <1 year SE OR 0 product development cycles) AND (NOT Product/system focus AND 0 structured decision meetings chaired)
Mid-level experience	= All others
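The assignment rules above can be expressed directly in code. A minimal Python sketch, assuming the survey answers have already been mapped to the variables shown (the function and variable names here are illustrative, not column names from the dataset):

```python
# Roles mapped to the Design and Teams focus areas, per the rules above.
DESIGN_ROLES = {"Product/system focus", "Subsystem or (sub)function focus",
                "System architect"}
TEAMS_ROLES = {"Group leader", "SE program manager", "Supersystem focus",
               "Customers' sites focus", "Integration focus"}

def focus_area(role):
    """Assign a focus area from the respondent's role answer."""
    if role in DESIGN_ROLES:
        return "Design"
    if role in TEAMS_ROLES:
        return "Teams"
    return "Unknown"

def experience_level(years_company, years_se, cycles, role, meetings_chaired):
    """Assign an experience level from tenure, SE experience,
    product development cycles, role, and decision meetings chaired."""
    if years_company > 10 and years_se > 10 and cycles > 5:
        return "Senior"
    if ((years_company < 1 or years_se < 1 or cycles == 0)
            and role != "Product/system focus" and meetings_chaired == 0):
        return "Junior"
    return "Mid-level"
```

For example, a respondent with 12 years at the company, 11 years of SE experience, and 6 product development cycles is labeled Senior, regardless of focus area.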

After this automatic assignment of focus areas and experience levels, we went through the data and manually adjusted 19 experience levels based on the respondents' own descriptions of their role and experience in the company. This was to better account for contextual factors. In most cases (16 out of 19) this resulted in a respondent getting the Senior label instead of Mid-level.

= Case studies =
Cases had to fit four constraints: 
	1. The outcomes of the decisions had to be clear at the time of the study, so that they could be analyzed 
	2. The cases had to be recent enough so that inquiry was still possible and people could reliably recall details
	3. The set of cases had to include a range from clearly positive outcomes to clearly negative outcomes
	4. Supporting data such as the proposal and meeting notes had to be accessible

We analyzed each decision and looked at three things: 
	1. What was decided and what was the context? 
	2. What was the eventual outcome? 
	3. What factors influenced the link between decision and outcome? 

The decisions were all taken between 1.5 and 3.5 years before the time of the study. Judgment of the outcomes was done by a senior systems engineer in consultation with other systems engineers who had been involved.

To summarize, the analysis of the cases consisted of the following steps:
	1. Selection of cases together with a senior systems engineer knowledgeable on the subject
	2. Analysis of the decision proposal and the decision meeting notes
	3. For each case, writing down the context, the decision, and the outcome (where it could be established)
	4. For each case, writing down the recognized factors linking decision and outcome (by the main author) and judging whether each factor had a positive or negative influence on the outcome
	5. Checking the correctness of all descriptions with the senior systems engineer in a discussion
	6. Updating the data based on the discussion and follow-up inquiries between the senior systems engineer and other stakeholders of the decision
	7. Finalizing the data set of each case by removing confidential information
	8. Grouping similar factors


