From Problem Space to Solution Space
In particular, current research in the area of model driven engineering (MDE) is primarily concerned with reducing the gap between problem and software implementation domains through the use of technologies that support systematic transformation of problem-level abstractions to software implementations. — France and Rumpe (France and Rumpe 2007)
Software engineering typically distinguishes between problem space (or problem domain) and solution space.1 The present document's objective is to clarify these two very important concepts, and to relate them to the practice of model-driven approaches.
Spaces and Levels of Abstraction
The problem space concerns itself with a business area or any other field of expertise a software system needs to be developed for, whereas the solution space is made up of a set of technological choices with which a software system can be designed and implemented, and atop of which it will execute.2 As we move from problem space towards solution space, the abstraction level is progressively lowered until the machine is ultimately reached. Figure 1 illustrates this idea.
Figure 1: Phases of the software development lifecycle, abstraction levels and language types. Source: Author's drawing based on Berg et al.'s (Berg, Bishop, and Muthig 2005) image.
One of the core objectives of MDE is to enable a smoother transition between abstraction levels, easing the gap between them. France and Rumpe lay out the motivation (emphasis ours):
A problem-implementation gap exists when a developer implements software solutions to problems using abstractions that are at a lower level than those used to express the problem. In the case of complex problems, bridging the gap using methods that rely almost exclusively on human effort will introduce significant accidental complexities.3 (France and Rumpe 2007)
Hence, MDE promotes the use of modeling languages (cf. Modeling Languages and Their Purposes) at the appropriate level of abstraction for the task at hand; its ultimate goal is to allow software engineers to create a cascading set of abstractions of an arbitrary depth that closely matches software engineering activities over the various phases of the software development lifecycle, with each abstraction described by an adequate modeling language — all the way to the general purpose programming language. This process is referred to as model cascading, and it is implemented by means of model refinement (cf. Model Transform Applications).
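To make the idea of a cascade more concrete, the following is a minimal sketch in Python and not a description of any particular MDE tool. It assumes a two-step cascade in which a problem-level entity is refined into a design-level class, which is in turn refined into source code. All names (DomainEntity, DesignClass, refine_to_design, refine_to_code, cascade) are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainEntity:
    """Problem-level abstraction: a business concept and its attributes."""
    name: str
    attributes: tuple[str, ...]


@dataclass(frozen=True)
class DesignClass:
    """Design-level abstraction: a class with typed fields."""
    name: str
    fields: tuple[tuple[str, str], ...]  # (field name, type name)


def refine_to_design(entity: DomainEntity) -> DesignClass:
    """First refinement: add design detail (assumption: every field is a str)."""
    return DesignClass(entity.name, tuple((a, "str") for a in entity.attributes))


def refine_to_code(design: DesignClass) -> str:
    """Last refinement: emit text in a general purpose programming language."""
    lines = [f"class {design.name}:"]
    lines += [f"    {name}: {type_name}" for name, type_name in design.fields]
    return "\n".join(lines) + "\n"


def cascade(entity: DomainEntity) -> str:
    """Chain the refinements, lowering the abstraction level at each step."""
    return refine_to_code(refine_to_design(entity))
```

Each function lowers the abstraction level by one step; a real cascade would have as many steps as there are modeling languages between problem and implementation.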
Whilst conceptually straightforward, model cascading poses awkward practical challenges because the existence of multiple models, possibly conforming to multiple metamodels, and representing disparate viewpoints leads to a need to keep all views integrated, synchronised and consistent4 — a task of increasing difficulty, as MDE moves away from the simpler unidirectional model of transformations towards more complex topologies.5 As we've already seen (cf. Modeling Languages and Their Purposes), similar synchronisation and integration challenges are also present in the relationship between models and source code — the traditional destination of the model refining process.6
Difficulties in synchronising source code with models can be avoided if full code generation is targeted, a task considered feasible by some — such as Jörges et al. (Jörges 2013) (p. 33) — and unfeasible by others, such as Greifenberg et al. (Greifenberg et al. 2015b), who state: "The prevailing conjecture, however, is that deriving a non-trivial, complete implementation from models alone is not feasible." From experience, we lean more towards Greifenberg et al. in this regard. The alternative is to use partial code generation, but then there is a requirement for one or more integration strategies to allow handcrafted and generated code to coexist. Here, Greifenberg et al.'s survey of integration mechanisms is extremely helpful (Greifenberg et al. 2015b) (Greifenberg et al. 2015a).
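As an illustration of what such an integration strategy may look like, the sketch below uses the generation gap pattern, chosen here only as a well-known example and not as a recommendation drawn from the sources above. Generated code is placed in a base class that is overwritten on every regeneration, and handcrafted code is placed in a subclass that the generator never touches. The class names are hypothetical, and in practice the two classes would live in separate files.

```python
class CustomerBase:
    """GENERATED (assumed): overwritten on each regeneration, never hand-edited."""

    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email

    def validate(self) -> bool:
        # Default hook that handcrafted code is expected to override.
        return True


class Customer(CustomerBase):
    """HANDCRAFTED (assumed): kept in its own file, so it survives regeneration."""

    def validate(self) -> bool:
        # Business rule the model does not capture: a name and a plausible email.
        return bool(self.name) and "@" in self.email
```

Client code depends only on Customer, so the generator remains free to change CustomerBase as long as the overridden hooks keep their signatures.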
In our personal opinion, largely borne out of practical experience, model synchronisation remains a complex subject with thorny problems — both engineering and theory-wise — and one which is particularly difficult to address at a large, industrial scale. For these reasons, the present work recommends relying mainly on the simpler forward-only topology, with minimal use of cascading, and resorting to well defined integration strategies; and to adopt more complex approaches solely when well-defined use cases emerge.
Complexity notwithstanding, we have thus far only scratched the surface of the solution space. The next section identifies its key components and their properties.
The Structure of the Solution Space
There are a few nuances to add to the simplified picture described in the previous section because the underlying process is of a fractal nature.7 That is, by looking in more detail at each step on our abstraction descent, we will likely find inside it yet another abstraction ladder. Consider the solution space. Within it, the literature typically defines two key concepts: Technical Spaces (TS) and Platforms. Figure 2 illustrates how they relate to each other and to the problem and solution spaces. These two concepts are of vital importance to us, so read on for a detailed analysis, including a discussion of the challenges they present.
Figure 2: Problem space, solution space, TS and platforms. Source: Author's drawing based on Brambilla et al.'s image (Brambilla, Cabot, and Wimmer 2012) (p. 13)
Technical Spaces
Kurtev et al. proposed Technical Spaces (TS) in their seminal paper (Kurtev, Bézivin, and Aksit 2002), defining them as follows: "A technological space is a working context with a set of associated concepts, body of knowledge, tools, required skills, and possibilities." Mens and Van Gorp subsequently updated the language and tightened the notion by connecting it to metametamodels: "A technical space is determined by the metametamodel that is used (M3-level)." (Mens and Van Gorp 2006) Examples of TS include MDE itself, XML, Java and other such programming languages.
In (Bézivin et al. 2003), Bézivin et al. outline their motivation: "The notion of TS allows us to deal more efficiently with the ever-increasing complexity of evolving technologies. There is no uniformly superior technology and each one has its strong and weak points." The idea is then to engineer bridges between technical spaces, allowing the importing and exporting of artefacts across them. These bridges take the form of adaptors called "projectors", as Bézivin explains (emphasis ours):
The responsibility to build projectors lies in one space. The rationale to define them is quite simple: when one facility is available in another space and that building it in a given space is economically too costly, then the decision may be taken to build a projector in that given space. There are two kinds of projectors according to the direction: injectors and extractors. Very often we need a couple of injector/extractor to solve a given problem. (Bézivin 2005a)
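The injector and extractor pair can be sketched as follows. This is a minimal illustration under stated assumptions, not Bézivin's formulation: it treats XML as the foreign technical space, uses Python's standard xml.etree.ElementTree module, and assumes a hypothetical document layout (an entity element with attribute children) as well as a hypothetical Entity model class.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Entity:
    """A model element in the modeling space (hypothetical)."""
    name: str
    attributes: list[str]


def inject(xml_text: str) -> Entity:
    """Injector: bring an artefact from the XML space into the modeling space."""
    root = ET.fromstring(xml_text)
    names = [a.get("name", "") for a in root.findall("attribute")]
    return Entity(root.get("name", ""), names)


def extract(entity: Entity) -> str:
    """Extractor: send a model back out as an artefact of the XML space."""
    root = ET.Element("entity", {"name": entity.name})
    for attribute in entity.attributes:
        ET.SubElement(root, "attribute", {"name": attribute})
    return ET.tostring(root, encoding="unicode")
```

Injecting a document and then extracting it yields an equivalent document for this simple layout, which is the property one would normally want from such a pair.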
TS are a useful (if somewhat imprecise8) conceptual device, and bridging across them has been demonstrated to work in practice (Bézivin et al. 2003). However, our position is that fulfilling their promise requires an extraordinary engineering effort to model all significant features of existing TS, to expose them to modeling languages and to keep those models updated. As we shall see in the next section, many of the same challenges apply to platforms.
Platforms
The term platform is employed within the software engineering profession in a broad variety of contexts, from hardware to operating systems, compilers, IDEs like the Eclipse Platform9, virtual machines providing programming environments such as the JVM and the CLR, and in numerous other cases. It is also a core term within MDE, and a foundation upon which many other concepts build, so it is important to arrive at a clear understanding of its meaning.
The literature often uses the MDA definition as a starting point, stated as follows:
A platform is the set of resources on which a system is realized. This set of resources is used to implement or support the system. In the context of a technology implementation, the platform supports the execution of the application. Together the application and the platform constitute the system. (Group 2014) (p. 9)
From a software engineering standpoint, a platform is often seen as a mechanism for reuse and abstraction, but MDE goes further and considers as particularly useful those that are "semantically rich" and "domain-specific", made up of "prefabricated, reusable components and frameworks [because they] offer a much more powerful basis than a 'naked' programming language or a technical platform like J2EE." (Völter et al. 2013) (p. 15)
Figures 2 and 3 explore these ideas by depicting the relationship between platforms and TS. From this perspective, TS provide the raw building materials and platform developers leverage their technical expertise to, in the words of Brambilla et al., "combine them into a coherent platform" (Brambilla, Cabot, and Wimmer 2012) (p. 13). By sitting atop a platform, software engineers can abstract themselves from lower-level implementation details and focus on the problem at hand.
Figure 3: Platforms and associated concepts. Source: Author's drawing based on Stahl et al.'s image (Völter et al. 2013) (p. 59)
In the presence of code generation, a tempting alternative may appear to be to bind the building blocks directly against a modeling language. Experience has however demonstrated the pitfalls of this approach, and here we are once more faced with the familiar theme of a need to raise the abstraction level. In practice, the building blocks found in TS are at too low a level to make them suitable for direct integration with a modeling approach because, as already discussed (cf. Section The Structure of the Solution Space), bridging the abstraction gap becomes increasingly difficult as the gap widens. Stahl et al. agree, but focus instead on the converse, stating that "[the] platform has the task of supporting the realization of the domain, that is, the transformation of formal models should be as simple as possible. […] Clearly, the easier the transformations are to build, the more powerful is the platform." (Völter et al. 2013) (p. 61) France and Rumpe follow the same line of reasoning, positing that abstractions such as platforms are key, because "[the] introduction of technologies that effectively raise the implementation abstraction level can significantly improve productivity and quality with respect to the types of software targeted by the technologies." (France and Rumpe 2007)
Unfortunately, not all is positive. In the same paper, France and Rumpe leave a decidedly stark warning about the challenges created by the very same process: "[the] growing complexity of newer generations of software systems can eventually overwhelm the available implementation abstractions, resulting in a widening of the problem-implementation gap." In other words, modeling languages close to a platform can only remain relevant if they are continually kept up to date with the constant changes to the platforms they depend on, or else risk becoming obsolete. This is a very difficult problem to tackle.
An obvious way to mitigate issues that arise from the constant platform churn is to decouple platform-dependent concepts from those that are independent of a target platform. This partitioning, originally popularised within MDA but now rightfully considered a part of mainstream MDE, does not directly address the underlying causes but does have the advantage of reducing the overall impact surface. As a result, by classifying models with regard to their dependence on a platform, we arrive at the notion of Platform Independent Model (PIM) and Platform Specific Model (PSM). In (Völter et al. 2013) (p. 20), Stahl et al. explain that "[…] concepts are more stable than technologies […]. The PIM abstracts from technological details, whereas the PSM uses the concepts of a platform to describe a system." A secondary advantage of this approach is that a single PIM can be mapped to multiple PSMs, as demonstrated by Figure 4.
Figure 4: Mapping between a PIM and three PSMs. Source: Author's drawing based on Stahl et al.'s image (Völter et al. 2013) (p. 20)
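The one-to-many mapping between a PIM and its PSMs can be sketched in a few lines. This is a deliberately simplified illustration under our own assumptions: the PIM is reduced to an entity with typed attributes, each PSM is reduced to a table of platform types, and the three platforms and their type mappings (TYPE_MAPPINGS) are hypothetical choices made for the example, not those shown in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PimAttribute:
    name: str
    type: str  # platform independent type name: "string" or "integer"


@dataclass(frozen=True)
class PimEntity:
    name: str
    attributes: tuple[PimAttribute, ...]


# Assumed platform descriptions: how each platform realises the PIM types.
TYPE_MAPPINGS = {
    "java": {"string": "String", "integer": "int"},
    "sql": {"string": "VARCHAR(255)", "integer": "INTEGER"},
    "typescript": {"string": "string", "integer": "number"},
}


def to_psm(pim: PimEntity, platform: str) -> dict[str, str]:
    """Map the PIM onto one platform, yielding attribute name -> platform type."""
    mapping = TYPE_MAPPINGS[platform]
    return {a.name: mapping[a.type] for a in pim.attributes}


customer = PimEntity(
    "Customer",
    (PimAttribute("name", "string"), PimAttribute("age", "integer")),
)
# One PIM, three PSMs.
psms = {platform: to_psm(customer, platform) for platform in TYPE_MAPPINGS}
```

The PIM is written once; supporting a further platform means adding an entry to the mapping table, and the PIM itself is left untouched.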
However, when one looks at these elegant solutions in more detail, the literature once more enters difficult terrain. First and most significantly, there are still looming challenges in establishing just what exactly a platform is. Bézivin explains the matter rather eloquently (emphasis ours):
There is a considerable work to be done to characterize a platform. How is this related to a virtual machine (e.g. JVM) or to a specific language (e.g. Java)? How is this related to a general implementation framework (e.g. DotNet or EJB) or even to a class library? How to capture the notion of abstraction between two platforms, one built on top of the other one? The notion of a platform is relative because, for example, to the platform builder, the platform will look like a business model. One may also consider that there are different degrees of platform independence, but here again no precise characterization of this may be seriously established before we have an initial definition of the concept of platform. (Bézivin 2005b)
Secondly, there is the question of how the mappings are to be achieved. In the same paper, Bézivin suggested employing Platform Definition Models (PDMs) as a way to bridge this gap; that is, using models to describe the capabilities of platforms. This and several other ideas informed research, which became very active and produced a number of localised solutions, for example in the context of MDA (Wagelaar and Jonckers 2005) and XML (Neubauer 2016). Nonetheless, a general approach to the problem remains elusive, as Anjorin et al. explain (emphasis theirs):
Although there exist numerous strategies and mature tools for certain isolated subtasks or specific applications, a general framework for designing and structuring model-to-platform transformations, which consolidates different technologies in a flexible manner, is still missing, especially when bidirectionality is a requirement. (Anjorin et al. 2012)
Their work provides an informed summary of the state of the art in this regard, as well as proposing a promising direction for such a generalised framework; nevertheless, substantial research and engineering work remains in order to address all of the issues highlighted above.
Thirdly, there are those who question the need to make PSMs explicitly visible, asking whether they are not best seen as a conceptual device and a (hidden) implementation detail. Stahl et al. report that "[p]ractical project experience has hitherto proved that this simplification [of foregoing explicitly visible PSMs] is usually more useful than the additional degrees of freedom gained with PSMs." (Völter et al. 2013) (p. 24) According to them, such a simplification permits more efficient development and reduces the thorny issues around model synchronisation, particularly from lower to higher levels of abstraction, i.e., the propagation of changes from PSM to PIM (cf. Section The Structure of the Solution Space).10
In light of all of these difficulties, and even taking into account the Pragmatism Principle, one is nevertheless forced to conclude that Bézivin's words of warning still loom large over the field: "Answering the question of what is a platform may be difficult, but until a precise answer is given to this question, the notion of platform dependence and independence (PSMs and PIMs) may stand more in the marketing than in the technical and scientific vocabulary." (Bézivin 2005b)
These stimulating words complete our sketch of the solution space and its challenges. Our attention shall now turn "upwards" once more, towards the bigger picture, as we investigate the interaction between MDE and the various methodologies and processes used for the development of software systems.
Bibliography
Footnotes:
For a treatment of the subject in a systems engineering context, see Chapter 14 of Wasson (Wasson 2015) (p. 135).
In the words of Groher and Völter: "The problem space is concerned with end-user understandable concepts representing the business domain of the product line. The solution space deals with the elements necessary for implementing the solution, typically IT relevant artifacts (sic)." (Groher and Voelter 2009).
Whittle et al. define accidental complexity as (emphasis ours): "[…] where the tools introduce complexity unnecessarily (Whittle et al. 2017)." What is meant by unnecessarily, of course, is left as an exercise to the reader.
Much has been written in the MDE literature about model synchronisation and integration, but it lies beyond the scope of the present study. The interested reader is directed to Giese et al. (Giese, Hildebrandt, and Neumann 2010) for an introductory overview of model integration and model synchronisation (Section 2, State of the Art), and to Hettel et al. (Hettel, Lawley, and Raymond 2008) for an analysis of RTE in the context of MDA, but largely applicable to MDE in general. Czarnecki and Helsen's MT Feature Model is also relevant (Section "Source-Target Relationship") (Czarnecki and Helsen 2006).
Diskin et al. see beyond simple cascading and speak instead of networks of models (emphasis ours): "A pipeline of unidirectional model transformations is a well-understood architecture for model driven engineering tasks such as model compilation or view extraction. However, modern applications require a shift towards networks of models related in various ways, whose synchronization often needs to be incremental and bidirectional." (Diskin et al. 2014)
The alternative to code generation is model-based execution, either via an interpreter or compilation. It is, however, outside the remit of the present work. For a treatment of the subject in the context of UML, see Mellor and Balcer (Mellor and Balcer 2002).
This is to be expected, given that abstractions can be composed of other abstractions by means of Stachowiak's mapping feature (cf. Why Model). In particular (emphasis his): "[models] are models of something, namely, [they are] reflections, representations of natural and artificial originals, that can themselves be models again." (Stachowiak 1973)
In the words of Bézivin et al. (Bézivin et al. 2003) (emphasis ours): "Although it is difficult to give a precise definition of a Technological Space, some of them can be easily identified, for example: programming languages concrete and abstract syntax (Syntax TS), Ontology engineering (Ontology TS), XML-based languages and tools (XML TS), Data Base Management Systems (DBMS TS), Model-Driven Architecture (MDA TS) as defined by the OMG as a replacement of the previous Object Management Architecture (OMA) framework."
These are, in effect, merely a variation of the RTE problem described in Modeling Languages and Programming Languages.