Mining Software Modeling Practices In Open Source Software Projects

Abstract: Context: In modern software development, software modeling is considered an essential part of the software architecture and design activity. The Unified Modeling Language (UML) has become the de facto standard for software modeling in industry. Surprisingly, there is little empirical evidence on the use of UML and a lack of evidence-based guidelines for applying UML in software development. Objective: As a first step toward synthesizing practical guidelines for the use of UML, this thesis focuses on collecting a large set of open source software (OSS) projects that use UML. It then offers observations on the use and impact of UML in OSS projects. Method: We combine techniques from repository mining and image classification to identify more than 24 000 open source projects on GitHub that together contain more than 93 000 UML models. A quantitative analysis and a large-scale survey have been carried out across this set of projects. Result: The results show that UML is used in OSS projects and that, in the projects that use it, UML helps new contributors and is generally perceived as supportive. The most important motivation for using UML seems to be to facilitate collaboration, as teams use UML during communication and planning of joint implementation efforts. We hope researchers in the field will find the data and findings from this thesis a valuable resource for their empirical studies.
