Raise your level of abstraction
We made a UML profile and custom code generator to work inside EA to create full C++ programs. We were basically trying to re-create the very fine Rose RT in EA after IBM decided to kill Rose RT. (Rhapsody did not seem attractive.) As with Rose RT, our basic paradigm is state machines stimulated by asynchronous messages. We have classes with ports and state machines. We like defining components that correspond to executables or libraries and model the options for them (threading, priority, etc.). EA is the editor of diagrams and elements and we use an EA C# plug-in to generate complete C++ code. It generates a lot of the code that people would otherwise have to type. This failed, however, for a few reasons:
1. Iterating through the model with an EA C# plug-in is too slow for most developers' patience. Even the fastest machine with a local model database takes too long.
2. The version control shows differences when there are none.
3. Getting the latest version of an entire model is often broken. It takes more than one try to make sure all elements get loaded. This is from a TFS server.
4. Differencing is not easy to look at.
5. EA (and other, general UML tools presumably) are so general that they can't know to maintain certain connections between elements. For example, if I change ports or information about the type of a port used by a class then changes are not propagated to instances of that class. This is a complete mess. Something more domain specific is needed.
Where we are now is this:
1. We have moved a lot of the information back into a regular C++ IDE. Only ports, protocols and state machines are in EA. Since there is less model in EA now, it is faster and less likely to fail. No one ever tries to do real version control in there. All passive classes are back in the regular IDE.
2. We need to remove the rest of the model from EA. Maybe simple, custom text files are the way to go. They can be easily differenced, unlike XMI. This is kind of like making a distributed, real-time, message-based text model.
3. Maybe later add some visualization on top of them. That's all we really wanted out of EA anyway - a graphical editor. This is hard, as state machines can be quite tricky.
This might do 80% of what we would want out of a commercial tool, and be easier to control. So, the question is, has anyone else done full code generation in a commercial tool and then found that to be a pain and moved back out? What did you do? Any other ideas?
Hi. Thank you very much for sharing.
We have a model consisting of passive classes (those that can be translated directly into C++ files), active classes (classes with state machines and ports/protocols) and components (representing either libraries or executables). The user selects a component and "generate code" to start the process. At this point we look at class instances that are contained in the component. We then look at these to see what other classes they contain or depend on. This includes any classes used in the protocols of any ports and any other classes they may depend on. In other words, we recursively search for all necessary units to generate to build that component. Properties of the component also affect the generation of some elements, for example to set threading and priorities.
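The recursive search described here is essentially a transitive closure over a dependency graph. A minimal sketch follows, assuming a hypothetical in-memory map from each unit name to the units it directly contains or depends on. The names `DependencyMap` and `unitsToGenerate` are made up for illustration and are not part of the plug-in.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory model: each unit (class, protocol, ...) is identified
// by name and lists the units it directly contains or depends on.
using DependencyMap = std::map<std::string, std::vector<std::string>>;

// Collect every unit reachable from the component's direct class instances.
// An explicit worklist avoids deep recursion, and the visited set makes
// cyclic dependencies safe.
std::set<std::string> unitsToGenerate(const DependencyMap& deps,
                                      const std::vector<std::string>& roots) {
    std::set<std::string> visited;
    std::vector<std::string> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        std::string unit = worklist.back();
        worklist.pop_back();
        if (!visited.insert(unit).second) continue;  // already handled
        auto it = deps.find(unit);
        if (it == deps.end()) continue;              // no recorded dependencies
        for (const auto& next : it->second) worklist.push_back(next);
    }
    return visited;
}
```

The returned set is sorted by name, so the generation order is the same from run to run.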
In the case of executables, the entire thing is generated and represents the final real-time, multi-threaded, middleware-using (ports) application. We do not support building UI applications completely, but people can incorporate "external" classes that have been made in a UI editor into the build.
In EA, iterating the model takes a long time. I could look into how to make it faster, and I have run a profiler, but the thing that takes a long time, and which I have no control over, is iterating the connectors of a model element (dependencies, generalizations, etc.). This takes around 8 times as long as iterating the attributes, for example. (I believe the problem is the internal design of the model database. The connectors must not be directly referenced in the elements themselves. EA must have to go look in a separate connector table and do a full search of that table to retrieve all connectors for a given model element. This is just a guess from looking at some of the SQL query examples.) As an example, iterating just 20 or 30 classes can take a minute or so. I can't really give a fair number to compare with. The end result is that our users find the wait too long when making simple changes. We cannot build up an optimization and need to do the full iteration each time because we have no control over the tool.
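For comparison, in a model store you do control, the per-element scan guessed at above can be avoided by reading the connector table once and grouping it by element. The sketch below only illustrates that idea. `Connector` and `ConnectorIndex` are hypothetical types and do not mirror EA's schema or automation API.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical connector record; the field names are illustrative only.
struct Connector {
    int sourceId;
    int targetId;
    std::string kind;  // e.g. "dependency", "generalization"
};

// One pass over the connector table builds an index keyed by source element,
// so later lookups do not rescan the whole table.
class ConnectorIndex {
public:
    explicit ConnectorIndex(const std::vector<Connector>& table) {
        for (const auto& c : table) bySource_[c.sourceId].push_back(c);
    }

    // Connectors leaving the given element; empty if there are none.
    const std::vector<Connector>& outgoing(int elementId) const {
        static const std::vector<Connector> none;
        auto it = bySource_.find(elementId);
        return it == bySource_.end() ? none : it->second;
    }

private:
    std::unordered_map<int, std::vector<Connector>> bySource_;
};
```

Building the index costs one pass over all connectors, after which each element's connectors are found with an average constant-time hash lookup.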
Note that the real problem is the clumsy editing and the version control. Not being able to reliably retrieve (yes, we have to get latest version like 3 times before it works and then we are unsure) and detect changes in our model makes it unusable.
The other problem is that if we define a class with ports, and that class is an instance in other classes (like as parts of another class). If we then remove or add ports to the class, the instances are not consistent with the class. The added or removed ports of the class are not reflected in all the instances. People have leftover ports in their instances. In source code this happens all the time and is normal. If someone changes a base class interface then they need to change all uses of that base class as well. However, in a modeling environment we expect the tool to do that, and Rose RT used to do it just fine. It was just one more problem: I was constantly fixing people's models for them when this happened.
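To make the inconsistency concrete, here is a minimal sketch of detecting and repairing the drift between a class and one of its instances. It assumes a hypothetical model in which each instance keeps its own copy of the port names, and the names `ClassDef`, `Instance`, `comparePorts` and `syncPorts` are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical model types: an instance keeps its own copy of the port names.
struct ClassDef {
    std::string name;
    std::set<std::string> ports;
};

struct Instance {
    std::string name;
    const ClassDef* type;
    std::set<std::string> ports;
};

struct PortDrift {
    std::vector<std::string> missing;   // on the class, not on the instance
    std::vector<std::string> leftover;  // on the instance, not on the class
};

// Report how an instance's ports differ from those of its class.
PortDrift comparePorts(const Instance& inst) {
    assert(inst.type != nullptr);
    PortDrift drift;
    const auto& cls = inst.type->ports;
    std::set_difference(cls.begin(), cls.end(), inst.ports.begin(), inst.ports.end(),
                        std::back_inserter(drift.missing));
    std::set_difference(inst.ports.begin(), inst.ports.end(), cls.begin(), cls.end(),
                        std::back_inserter(drift.leftover));
    return drift;
}

// Bring the instance back in line with its class.
void syncPorts(Instance& inst) {
    assert(inst.type != nullptr);
    inst.ports = inst.type->ports;
}
```

A tool could run `comparePorts` over all instances after every class edit and either warn or call `syncPorts` automatically, which is roughly the behavior we are missing.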
FYI, as an experiment, we are now building up our own C++ classes for the object model and are heading toward a human-readable XML format for the serialization and storage of the model. We are going to build our own graphical front-end for the object model using Qt for those that do not want to edit XML. (We are also thinking that the code generator may parse and mix in user versus generated code, but not sure.) We are going to try these on a small scale to learn how it goes.
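Whatever syntax we end up with, one property we want from the stored form is that the same model always produces the same text, so version control does not show differences when there are none. A minimal sketch of that idea follows. It uses a line-oriented format for brevity (the same applies to XML), and `ActiveClass` and `serialize` are hypothetical names, not our actual object model.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical, much simplified active class: just port and state names.
struct ActiveClass {
    std::set<std::string> ports;
    std::set<std::string> states;
};

// std::map and std::set iterate in sorted key order, so the output does not
// depend on the order in which elements were added to the model.
std::string serialize(const std::map<std::string, ActiveClass>& model) {
    std::ostringstream out;
    for (const auto& entry : model) {
        out << "class " << entry.first << "\n";
        for (const auto& port : entry.second.ports) out << "  port " << port << "\n";
        for (const auto& state : entry.second.states) out << "  state " << state << "\n";
    }
    return out.str();
}
```

With one item per line, adding or removing a port shows up as a single-line change in an ordinary text diff.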
I'm sorry if this is information overload and not too clear. I had hoped to attend the conference this year and make a decent presentation of this to share, but I just could not make it.
Thanks for giving those details, Jim. 2 seconds per class does indeed seem slow - the MetaEdit+ tests above were about 4 milliseconds per class. Of course your generation is more complex, but like you say the real problems are probably in the internal design of EA's model database, at least for this case.
You mention defining a class with ports, and the instances not reflecting changes made later to the ports of the class. In that case, MetaEdit+ would update the instances to show the new ports (or remove the ones that were deleted). If the ports were visible in the instance symbol and relationships could be connected to them, in the case of the old ports the tool maker has a difficult decision: what to do with the relationships connected to the now-deleted ports. If those relationships were summarily deleted, that removes useful information from the models, and makes it harder for the modeler to intelligently sort the situation out. In MetaEdit+ such relationships remain, but are now shown directly connected to the instance rather than to the port (which has been deleted). The relationships still remember the port, so they can show a warning symbol and give the modeler the information he needs to correct the situation. An added bonus is that if the port is added back, the relationships are fine again anyway. It would be interesting to hear what your thoughts are on this approach, and whether that's what Rose RT used to do.
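As a rough illustration of the "relationships remember the port" idea, here is a minimal sketch. It is not how MetaEdit+ is implemented. It simply assumes a relationship end stores the port by name, and `Instance` and `RelationshipEnd` are hypothetical types.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical instance with a set of currently existing port names.
struct Instance {
    std::string name;
    std::set<std::string> ports;
};

// A relationship end stores the port by name rather than by pointer, so it
// survives deletion of the port and recovers if the port is added back.
struct RelationshipEnd {
    const Instance* instance;
    std::string portName;

    // True when the remembered port no longer exists on the instance.
    bool isDangling() const {
        assert(instance != nullptr);
        return instance->ports.count(portName) == 0;
    }

    // Text a diagram could show: attached to the port, or to the instance
    // itself with a warning when the port is gone.
    std::string describe() const {
        if (isDangling()) {
            return instance->name + " (warning: port '" + portName + "' no longer exists)";
        }
        return instance->name + "." + portName;
    }
};
```

Because nothing is deleted along with the port, the modeler still sees which port the relationship used to connect to and can decide how to repair it.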
Building your own human-editable XML format and Qt graphical front end sounds like a lot of work. 20 years ago you wouldn't have had much choice, but now there are language workbenches around to give you the same results in a fraction of the time. You'll also save significantly in costs and maintenance. I'm not saying you should use MetaEdit+, but you should certainly look at these tools. Even if you want your own tool in the end, you can use them to try out different ways of representing the information graphically, to make the best possible language for your modelers.
It's a shame you couldn't make the Code Generation conference this year: you could have attended my 'Have your language built while you wait' session, where there are 11 tools and their experienced users waiting for you to describe your domain / object model, and to build you a language for that in their tool - in 20 minutes! To learn more about the tools there would also be the hands-on sessions for MetaEdit+, MPS and OOMEGA. Actually, you really ought to come! And no, I don't get commission from Mark :)
"The user selects a component and "generate code" to start the process. At this point we look at class instances that are contained in the component. [...] In other words, we recursively search for all necessary units to generate to build that component."
You are essentially talking about "software composition": build larger "units" by combining lower-level units. This looks like a good job for AtomWeaver, as the modeling approach it uses, ABSE, is based on composition and reuse. With ABSE, you build code generation yourself. The good side is that you have full control. The not-so-good side is that you'll have more work setting up.
Have a look at our entry on the upcoming Language Workbench Challenge for an introduction to ABSE and a real-world example of software composition: http://www.atomweaver.com/lwc2012/LWC2012_AtomWeaver_Submission.pdf
"Note that the real problem is the clumsy editing and the version control."
AtomWeaver model editing is surely cleaner, but we do not have model versioning yet. Again, you can refer to Steven's answer on why model versioning is not so critical.
"The other problem is that if we define a class with ports, and that class is an instance in other classes (like as parts of another class). If we then remove or add ports to the class, the instances are not consistent with the class."
That's not a problem with AtomWeaver, as it maintains a live model, that is, any changes are immediately propagated.
Thanks for sharing these details. As for speed, in a fairly complex scenario I think it is crucial to have an efficiently navigable model. If following a link comes at the cost of executing some SQL, that's a bottleneck, as is writing and rereading lots of intermediate files.
Unfortunately you are in the C++ world, otherwise the OCPs would provide a human readable XML at no cost, if there's a Java class model existing for representing the model.