The VM* Wiki

Official documentation for the VM* family of model manipulation languages.

vmtl:evaluation

Differences

This shows you the differences between two versions of the page.

vmtl:evaluation [2015/08/23 15:50]
rvac
vmtl:evaluation [2016/02/12 13:40]
rvac
Line 3: Line 3:
 ====== Empirical Evaluation ======
  
-The learnability of VMTL has been experimentally evaluated and compared to that of the [[https://www.eclipse.org/henshin/|Henshin]] and [[https://www.eclipse.org/epsilon/|Epsilon]] model transformation languages. As of August 2015, a detailed discussion of this evaluation's results is under review for publication. This page will be updated to include it as soon as possible. In the interest of reviewers and of those looking to replicate our experiments or verify our analysis methods, we provide a complete replication package.
+===== VMTL =====
 
-===== Replication Package =====
+The learnability of VMTL has been experimentally evaluated and compared to that of the [[https://www.eclipse.org/henshin/|Henshin]] and [[https://www.eclipse.org/epsilon/|Epsilon]] model transformation languages. As of February 2016, a detailed discussion of this evaluation's results is under review for publication. This page will be updated to include it as soon as possible. In the interest of reviewers and of those looking to replicate our experiments or verify our analysis methods, we provide a complete replication package.
 
-The replication package can be downloaded as a ZIP archive [[http://www2.compute.dtu.dk/~rvac/files/experiments/Replication%20Package%20-%20An%20Empirical%20Assessment%20of%20Model%20Transformation%20Language%20Learnability.zip|here]].
+==== Replication Package ====
 
-It consists of the experimental material used for each experiment, the collected and encoded data, and statistical analysis scripts for analyzing the data. The experimental material and data are published under the [[http://creativecommons.org/licenses/by/4.0/|Creative Commons Attribution 4.0 International License]], and the statistical analysis scripts are published under the [[http://opensource.org/licenses/MIT|MIT License]].
+The replication package can be downloaded as a ZIP archive [[http://people.compute.dtu.dk/~rvac/files/experiments/Replication%20Package%20-%20An%20Empirical%20Assessment%20of%20Model%20Transformation%20Language%20Learnability.zip|here]]. It consists of the experimental material used for each experiment, the collected and encoded data, and statistical analysis scripts for analyzing the data. The experimental material and data are published under the [[http://creativecommons.org/licenses/by/4.0/|Creative Commons Attribution 4.0 International License]], and the statistical analysis scripts are published under the [[http://opensource.org/licenses/MIT|MIT License]].
 
-==== Experimental Material ====
+=== Experimental Material ===
  
 We have conducted our evaluation in the form of two questionnaire-based experiments we refer to as Experiment 1 and Experiment 2. The replication package includes three PDF documents for each experiment, one for every applied treatment. The treatments present questions regarding the evaluated transformation languages in different orders, as a mitigation measure against learning effects. This measure is particularly relevant in the case of Experiment 1, where the same questions are asked for each transformation language. Participants were randomly assigned to one of the treatments.
Line 23: Line 23:
 The replication package includes **answer keys** for the multiple-choice comprehension questions.
  
-==== Data ====
+=== Data ===
  
 The experimental data is presented as a comma-separated values (CSV) file consisting of the following columns:
Line 30: Line 30:
 | Experiment | experiment identifier | 1, 2 |
 |Id | participant identifier, unique within an experiment | positive integers |
-| UML | participant's self-assessed UML knowledge | 1, 2, 3, 4, 5 |
+| UML | participant's self-assessed UML knowledge | 1 -- 5 |
-| OCL | participant's self-assessed OCL knowledge | 1, 2, 3, 4, 5 |
+| OCL | participant's self-assessed OCL knowledge | 1 -- 5 |
-| MT | participant's self-assessed model transformation knowledge | 1, 2, 3, 4, 5 |
+| MT | participant's self-assessed model transformation knowledge | 1 -- 5 |
-| Programming | participant's self-assessed programming knowledge | 1, 2, 3, 4, 5 |
+| Programming | participant's self-assessed programming knowledge | 1 -- 5 |
 | Order | the order in which a transformation language was presented to the participant | 1, 2, 3 |
-| Language | transformation language | Epsilon, Henshin, VMTL |
+| Language | the evaluated transformation language | Epsilon, Henshin, VMTL |
-| Score | number of correct comprehension question answers | 0, 1, 2, 3 |
+| Score | number of correct comprehension question answers | 0 -- 3 |
 | Time | time in seconds required to complete the comprehension task | positive integers |
 | Difficulty | participant's subjective evaluation of the difficulty of completing the comprehension task | 1 -- 5 |
 | Effort | participant's subjective evaluation of the effort required to complete the comprehension task | 1 -- 5 |
  
-==== Statistical Analysis ====
+=== Statistical Analysis ===
  
 The replication package contains three statistical analysis scripts written in the [[https://www.r-project.org/|R programming language for statistical computing]]. The individual scripts contain code for analyzing comprehension results, cognitive load ratings, and task completion times, respectively.
  
-The scripts have a similar structure. They include descriptive statistics computations (mean, standard deviation), data visualizations (box plots, bar plots), and hypothesis tests (Fisher's exact test, ANOVA, the Wilcoxon signed rank test).
+The scripts have a similar structure. They include descriptive statistics computations (mean, standard deviation), data visualizations (box plots, bar plots), and hypothesis tests (Fisher's exact test, ANOVA, the Wilcoxon signed rank test).
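For readers without an R installation, the descriptive statistics step (mean and standard deviation per language) can be approximated with the Python standard library. This is only a sketch under stated assumptions: it is not a port of the R scripts in the package, `describe_by_language` is a hypothetical helper, the rows are made up, and the sample standard deviation (n - 1 denominator) is used.

```python
from collections import defaultdict
from statistics import mean, stdev


def describe_by_language(rows, column):
    """Group a numeric column by Language and return (mean, sample stdev).

    The standard deviation is None for groups with fewer than two values,
    because the sample standard deviation is undefined in that case.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["Language"]].append(float(row[column]))
    summary = {}
    for language, values in groups.items():
        spread = stdev(values) if len(values) > 1 else None
        summary[language] = (mean(values), spread)
    return summary


# Made-up rows for illustration only; not taken from the replication package.
ROWS = [
    {"Language": "VMTL", "Time": "300"},
    {"Language": "VMTL", "Time": "400"},
    {"Language": "Henshin", "Time": "500"},
    {"Language": "Henshin", "Time": "700"},
    {"Language": "Epsilon", "Time": "650"},
]

if __name__ == "__main__":
    for language, (average, spread) in describe_by_language(ROWS, "Time").items():
        print(language, average, spread)
```

The same function can be applied to the `Score`, `Difficulty`, or `Effort` columns by changing the `column` argument.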
+
+===== VMQL =====
+
+The learnability of VMQL has also been experimentally evaluated and compared to that of the [[http://www.omg.org/spec/OCL/|Object Constraint Language]] for the task of querying business process models expressed using the [[http://www.omg.org/spec/BPMN|Business Process Model and Notation (BPMN)]]. The results of this user experiment indicate that VMQL surpasses OCL in terms of the evaluated task metrics: query comprehension and query production. VMQL also imposes a lower cognitive load on its users. The full results of this experiment are described in the paper [[http://dx.doi.org/10.1145/2492437.2492441|Querying business process models with VMQL]] by Harald Störrle and Vlad Acretoaie.
+
+==== Replication Package ====
+
+The replication package can be downloaded as a ZIP archive [[http://people.compute.dtu.dk/~rvac/files/experiments/Replication%20Package%20-%20Querying%20Business%20Process%20Models%20with%20VMQL.zip|here]]. It consists of the experimental material and the collected and encoded data. The replication package is published under the [[http://creativecommons.org/licenses/by/4.0/|Creative Commons Attribution 4.0 International License]].
+
+=== Experimental Material ===
+
+We have conducted our evaluation in the form of a questionnaire-based experiment. The replication package includes two PDF documents, one for every questionnaire version. The two questionnaire versions present questions regarding the evaluated query languages in different orders, as a mitigation measure against learning effects. Participants were randomly assigned to one of the questionnaire versions.
+
+The questionnaires consist of:
+
+  * a **comprehension** task requiring participants to select the correct plain English descriptions of eight query specifications from a list of options;
+  * a **production** task requiring participants to express four queries described in plain English using VMQL and OCL;
+  * a **cognitive load assessment** task requiring participants to rate the difficulty and effort associated with using VMQL and OCL;
+  * a **demographics** section collecting data about participants' backgrounds and self-assessed technical skills.
+
+The replication package includes **answer keys** for the multiple-choice comprehension questions.
+
+The experimental data is presented as a comma-separated values (CSV) file consisting of the following columns:
+
+^Column^Definition^Values^
+| Experiment | experiment identifier | 1, 2 |
+| ID | unique participant identifier | positive integers |
+| Language | the evaluated query language | VMQL, OCL |
+| Task A Q1-Q8 | correctness of an answer to one of the eight comprehension questions | 0 (incorrect), 1 (correct) |
+| Task A Total Score | number of correct comprehension question answers | 0 -- 8 |
+| Task B Q1-Q4 | score assigned by the experimenter to a production task answer (higher scores indicate better answers) | 0 -- 10 |
+| Task B Total Score | total score obtained by a participant for the production task | 0 -- 40 |
+| Difficulty Task A | participant's subjective evaluation of the difficulty of completing the comprehension task | 1 -- 5 |
+| Effort Task A | participant's subjective evaluation of the effort required to complete the comprehension task | 1 -- 5 |
+| Difficulty Task B | participant's subjective evaluation of the difficulty of completing the production task | 1 -- 5 |
+| Effort Task B | participant's subjective evaluation of the effort required to complete the production task | 1 -- 5 |
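The VMQL column definitions imply a consistency check: each total score should equal the sum of the corresponding per-question scores. The sketch below illustrates that check. It assumes the per-question columns are named `Task A Q1` to `Task A Q8` and `Task B Q1` to `Task B Q4` (the table only gives the ranges `Q1-Q8` and `Q1-Q4`, so the exact header names are a guess), and that the Task B total is a plain sum. `check_totals` is a hypothetical helper and the row is made up.

```python
def check_totals(row):
    """Return the tasks whose recorded total differs from the per-question sum."""
    # Assumed header names; adjust to match the actual CSV file.
    task_a_sum = sum(int(row[f"Task A Q{i}"]) for i in range(1, 9))
    task_b_sum = sum(int(row[f"Task B Q{i}"]) for i in range(1, 5))

    mismatches = []
    if task_a_sum != int(row["Task A Total Score"]):
        mismatches.append("Task A")
    if task_b_sum != int(row["Task B Total Score"]):
        mismatches.append("Task B")
    return mismatches


# Made-up row for illustration only; not taken from the replication package.
ROW = {
    "Task A Q1": "1", "Task A Q2": "1", "Task A Q3": "0", "Task A Q4": "1",
    "Task A Q5": "1", "Task A Q6": "1", "Task A Q7": "0", "Task A Q8": "1",
    "Task A Total Score": "6",
    "Task B Q1": "7", "Task B Q2": "8", "Task B Q3": "5", "Task B Q4": "10",
    "Task B Total Score": "31",  # deliberately wrong: the sum is 30
}

if __name__ == "__main__":
    print(check_totals(ROW))
```

A check of this kind is useful before re-running any analysis, since a mismatch points to an encoding error rather than a genuine result.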
vmtl/evaluation.txt · Last modified: 2016/02/12 13:40 by rvac