Raise your level of abstraction
We made a UML profile and a custom code generator that work inside EA to create full C++ programs. We were basically trying to re-create the very fine Rose RT in EA after IBM decided to kill Rose RT. (Rhapsody did not seem attractive.) As with Rose RT, our basic paradigm is state machines stimulated by asynchronous messages. We have classes with ports and state machines. We like defining components that correspond to executables or libraries and modeling the options for them (threading, priority, etc.). EA is the editor of diagrams and elements, and we use an EA C# plug-in to generate complete C++ code. It generates a lot of the code that people would otherwise have to type. This failed, however, for a few reasons:
1. Iterating through the model with an EA C# plug-in is too slow for most developers' patience. Even the fastest machine with a local model database takes too long.
2. The version control shows differences when there are none.
3. Getting the latest version of an entire model is often broken. It takes more than one try to make sure all elements get loaded. This is with a TFS server.
4. Differencing is not easy to look at.
5. EA (and presumably other general UML tools) is so general that it can't know to maintain certain connections between elements. For example, if I change the ports of a class, or the type of a port used by a class, the changes are not propagated to instances of that class. This is a complete mess. Something more domain-specific is needed.
Where we are now is this:
1. We have moved a lot of the information back into a regular C++ IDE. Only ports, protocols and state machines are in EA. Since there is less model in EA now, it is faster and less likely to fail. No one ever tries to do real version control in there. All passive classes are back in the regular IDE.
2. We need to remove the rest of the model from EA. Maybe simple, custom text files are the way to go. They can be easily differenced, unlike XMI. This is kind of like making a distributed, real-time, message-based text model.
3. Maybe later add some visualization on top of them. That's all we really wanted out of EA anyway - a graphical editor. This is hard, as state machines can be quite tricky.
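To make point 2 concrete, a diffable text model for one capsule might look something like this. The syntax is entirely invented for illustration, not any real tool's format:

```
capsule Motor
  port cmd    : MotorControlProtocol   # conjugated
  port status : StatusProtocol

  statemachine
    state Off
      on cmd.start -> Spinning / enableDrive()
    state Spinning
      on cmd.stop  -> Off      / disableDrive()
```

With one element per file and one fact per line, an ordinary VCS diff shows exactly which port or transition changed - the property XMI lacks.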
This might do 80% of what we would want out of a commercial tool, and be better to control. So, the question is, has anyone else done full code generation in a commercial tool and then found that to be a pain and moved back out? What did you do? Any other ideas?
Jim, a few options that might help.
Hope that helps, apologies if I misunderstood.
Thank you for the suggestions. There is a wide range of choice and I'm really backing up to thinking about just what exactly we are trying to do. Looking at these different approaches helps. Thank you.
I meant to add also that the most important thing is good version control and differencing of the model, since it is now the source of information. We need a good way to see what has changed between any two versions of an element or package.
Another important thing is how the tooling scales with model size. I wonder how fast commercial packages like Artisan Studio, MetaEdit and MagicDraw are when the model size is non-trivial. EA was too slow, which was a big surprise.
MetaEdit+ is fast and scales well. For some actual objective numbers, see
where MetaEdit+ was 20 times faster than Eclipse or Ruby. Sadly nobody else has taken us up on that challenge yet: it would be great to have comparable figures for other tools.
There were a couple of good talks at Code Generation 2010 about what to do when your UML tool hits scalability and performance problems. Both talks actually were cases where that tool was MagicDraw, but I got the impression that many of the issues were more to do with the language and organization than the tool. Karsten Thom's blog has a nice writeup including those talks:
On the versioning side, MetaEdit+ offers a new approach that avoids many of the problems that arise when trying to apply code-oriented VCS practices to modeling. There's an article here that explains the ideas: http://www.metacase.com/papers/Mature_Model_Management.html.
Very interesting. I almost completely understand your suggestions about version control. When using Rose RT instead of hand-written code everywhere, we too found that we had a reduced need for version control, other than as a shared repository.
It's hard to dismiss the need for visual differencing of models. People need to know what changed between archived versions. However, the idea of generating a readable change summary along with the code is a great one. It's going to be fun to think about what that could look like versus the actual code generated. (As a compromise, we are using source control on the generated code just so people can theoretically see what changed, even though no one really looks at it. It's more so we can go get a particular version without going through EA.)
I also agree that if we had a proper design, then we would reduce the amount of simultaneous editing of elements by different teams. Proper separation and interfaces would avoid a lot of that. People still mired in a badly coupled source-code design still need file-level branching and they go crazy with it. It's those people that ask for all this branching, differencing and merging.
I haven't thought this all the way through, but thank you for the very interesting ideas!
> ...actual objective numbers, ... Sadly nobody else has taken us up on that challenge yet: it would be great to have comparable figures for other tools.
I think it's important to take into account the wide variety of usage scenarios here. E.g., our code generator runs on an in-memory model too, and is therefore fast, but the time it takes is not the dominant factor in a modification cycle spanning model, transformation, generation, compilation, packaging, deployment and restart within some server. Certainly a slow tool in the chain can make the problem worse, so it's good to have fast ones. And I do agree that organisation of the model is extremely important, as are augmenting features like hot partial redeployment, caching, partial builds, etc.
Concerning the blog post
> Code generation performance comparison
I stumbled over the "simpler" in the statement:
> ...faster and simpler than the combination of the ATL and MOFScript generators, and IMHO this would continue to be true even for much more complicated generators.
We have a ratio of around 1:40 from domain classes to technical classes in our M2M transformation, which eliminates a huge amount of redundancy; in fact, we introduced the M2M step because we weren't able to maintain our direct templates anymore. So I think the value of M2M highly depends on what you are trying to do altogether.
I'm by no means against M2M: I think pretty much everything depends on what you are trying to do. In the blog post, we were turning one model class into one code class, and I think the intermediate M2M steps seem to unnecessarily add complexity there. If you want 40 code classes from one model class, that's a different situation :).
Leaving the topic of modeling and M2M behind, and just focusing on your resulting code, what is it that demands so much complexity and duplication? It sounds like the platform / framework you're generating into makes life unnecessarily complex, if it takes 40 classes to express something you can say in one model class. But obviously it's a great case for MDD!
First, sorry, I confused numbers. It's not 1:40 from domain model to technical model, but from domain model to implementation classes. That was the old ratio for M2T before we introduced the M2M. Now it's something like 1:8 on average in M2M and something like 1:5 in M2T, depending on domain class stereotype & properties.
It roughly goes like this: each domain model class is translated into corresponding implementation classes in the UI, interaction, core and state (DB) layers. The core layer contains an interface and (typically one, sometimes a few) implementations. The interaction layer contains one or a few factories (creating new instances) and editors (updating existing ones), several retrievers (queries with filters), plus some tiny technical helpers. The UI layer contains adapters from core + interaction classes to enriched UI-capable classes; the state layer contains DB-specific bindings and a DB-less in-memory implementation. Additionally there are OCP "classes", i.e. aggregates of instances (e.g. a retriever plus a result set plus an instance, which together represent a browser with a detail view).
Of course a single implementation class would do, but then all the complexity is in this class - a little nightmare again.
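As a rough C++ illustration of that kind of per-layer fan-out (all names invented; the real generated classes are surely richer), one domain class `Customer` might expand into something like:

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical fan-out of one domain class "Customer" into per-layer
// implementation classes, loosely following the layering described above.

// Core layer: interface plus implementation.
struct ICustomer {
    virtual ~ICustomer() = default;
    virtual std::string name() const = 0;
};
class CustomerImpl : public ICustomer {
public:
    explicit CustomerImpl(std::string n) : name_(std::move(n)) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

// Interaction layer: factory (creating new) and retriever (query with filter).
struct CustomerFactory {
    std::unique_ptr<ICustomer> create(const std::string& n) {
        return std::make_unique<CustomerImpl>(n);
    }
};
struct CustomerRetriever {
    std::vector<const ICustomer*> byPrefix(
            const std::vector<std::unique_ptr<ICustomer>>& all,
            const std::string& p) {
        std::vector<const ICustomer*> out;
        for (const auto& c : all)
            if (c->name().rfind(p, 0) == 0)   // name starts with prefix
                out.push_back(c.get());
        return out;
    }
};

// State layer: a DB-less in-memory store standing in for the DB bindings.
class CustomerStore {
public:
    void add(std::unique_ptr<ICustomer> c) { items_.push_back(std::move(c)); }
    const std::vector<std::unique_ptr<ICustomer>>& all() const { return items_; }
private:
    std::vector<std::unique_ptr<ICustomer>> items_;
};
```

Hand-writing this set of classes for every domain class is exactly the redundancy the M2M step removes; in the model there is only the one `Customer` class.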
We'd be interested to hear why Rhapsody did not seem attractive. In our experience it does not seem to have had the problems you identify with EA (which is on our list of possible alternatives to Rhapsody), with the exception of the last one. We prefer to keep as much as possible in the model rather than pushing code out to a C++ IDE.
We are a smaller group within a very large company that is trying to get software modeling adopted for areas other than control systems, for which we already make good use of Matlab-Simulink-Stateflow. Our area is embedded, real-time, distributed applications (teams of machines). So, here were some of our thoughts at the time:
So, those were our thoughts at the time. These may not have been mature thoughts. I would also now wonder about version control of models and scalability. I would be heavily trying that out before buying. Of course, Steven's post above sort of changes some of that.
(Interestingly, we have a hard time explaining to the Simulink people why we don't just use Simulink. It's pretty hard to convince them that general, application software development is not going to work that way.)
I thought that Rhapsody does indeed keep classes and instances consistent. If you delete a port on a class, does it really not remove that port from instances?
By the way, why don't you share that list and the rationale? :)
> anyone else done full code generation in a commerical tool and then found that to be a pain and moved back out?
Yes, we did full codegen in a variety of projects and found much pain, but since it's our own tool we use, we of course tried to improve it over the years by various means.
1.) Frontends/Version Control
Concerning the modelling frontend, customers' and teams' preferences simply differ. In the beginning we wanted to focus on the transformation and use 3rd-party modellers, but we still haven't found a solution that makes everyone happy. Therefore, to some degree involuntarily, we now support several frontends, each with pros and cons. These are:
- XML-based; fairly readable and maintainable by hand, this is the preferred choice for purely in-house developed packages. Version control and refactoring are actually not a problem for us: each developer uses his favorite editor and is more than satisfyingly fast. This works because the models are quite "DRY", i.e. a VC diff actually finds the difference, and a refactoring often affects only one or very few places. Sure, the XML does not contain graphical information.
- A "JSON"-Style variant, it's textual, too, but not as clumsy as XML. It's not used very much, but it has some "lovers".
- An Eclipse EMF/UML frontend; this was once a strong requirement in a project, but wasn't even used by the customer who wanted it, because the Eclipse UML editor was too slow and buggy.
- A Dia binding (an OSS diagram editor), which stores its output also as (rather humanly *un*readable) XML; it's graphical, but it is also a little nightmare with VC, since even a "save" without any real changes modifies all coordinates in the 10th position after the decimal point - that's some thousand modified lines per check-in :-(
- A Web frontend developed with our own tool, which allows concurrent work on the same model in a DB, and allows all the nice things you can do with a model (recent changes, parallel attribute versions with comment and validity date, etc.); it's our preferred recommendation for customers, but not the primary choice for "hardcore developers" (too much clicking, too much validation, etc.)
All these frontends or frontend adapters are based on a technology called "Object Construction Plans" (OCP, see www.xocp.org), which is an open source component developed by us.
A side effect of this technology is that we can combine all these frontend approaches in a project, so we can define, e.g., some classes in XML, some packages in Dia and some others in Eclipse. E.g., an abstraction layer is modelled in the Web UI and used for communication with the customer, whereas derived detailed classes are defined in XML.
2.) Compilation time
Please see also my other reply to Steven.
To allow comparison, can you share some numbers?
- How many classes do you have (on each level)?
- Which levels are there (domain specific, universal, technical, source code, runtime)?
- How long is long in seconds?
- How many further properties/configuration are put into the transformation?
- What is the generated code able to do? (and what not, like UI?)
3.) Something more domain specific...
> ...if I change ports or information about the type of a port used by a class then changes are not propagated to instances of that class.
Do you mean instances of that class on model level? I don't quite get the problem here.