Today I want to share some thoughts with you about data modeling in ARIS, because this topic is frequently discussed here in the community and is also the subject of many internal discussions.
The ARIS method provides a broad range of diagram and object types that can be used for data modeling. As a result, different projects use different parts of our methodology, and the outcomes of projects with the same objective can look quite different. This flexibility is not a big problem for customers; on the contrary, they like it because they don’t have to follow restrictions and conventions. It is a problem for IDS Scheer as a tool vendor, however. If we offer such a flexible data modeling method, there is the expectation that we also offer supporting functionality, accelerators and best practices. This is nearly impossible for a huge number of projects with different modeling conventions.
With respect to this, ARIS development and IDS consultants had long discussions to develop an integrated modeling approach for representing an information architecture in ARIS, mainly covering the requirements from the Enterprise Architecture and SOA fields.
Very important is the basic concept of three different abstraction layers in ARIS, relevant for data modeling (similar to OMG’s CIM, PIM and PSM):
- conceptual – used for classical business and enterprise modeling, independent of any automation or execution aspects
- logical – describes a more technical view with the target to automate processes and implement IT architectures, but normally platform and technology independent
- physical – used to model the implementation aspects, platform and technology dependent
The following figure shows a part of the ARIS meta model (taken from our SOA methodology) and how the diagram and object types are assigned to these three levels (to reduce complexity not all relationships between the object types are displayed).
Now let’s have a more detailed look at the three levels:
Conceptual level
On conceptual level, business terms and business objects are modeled. Here it is important to distinguish between terms and objects.
Terms define only the grammar that can be used to name objects, diagrams etc. including concepts like homonyms and synonyms. For the modeling of business terms in ARIS the object type Technical Term should be used (in a Technical Terms Model). Terms must not be used as input or output objects of process steps, because a term is not an information object! This grammar of course can be used for all elements in ARIS, not only data elements.
Business objects are used to model the data flowing in or out of a process step. They can be related to each other and structured hierarchically, for example to detail them down to an attribute level. Example:
- Purchase Order Purchase
- Order Header
- Contract Number (ID)
- Purchase Order Position
- Contract Number (ID)
- Order Header
But keep in mind, everything is a business object.
Business objects may have a state, but they might be implemented later without this state. Business objects without a state generate new business objects, e.g. Customer, New Customer and Old Customer are different objects.
To model business objects in ARIS, the object type Cluster should be used exclusively, for example to attach them to process steps in EPC. The business object hierarchies should be modeled with an IE Data Model.
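A hierarchy like this can be sketched as a simple tree. The following Python snippet is only an illustration, not ARIS functionality: the class name BusinessObject, its methods, and the nesting of the three example objects are assumptions made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BusinessObject:
    """A conceptual-level business object (a Cluster in the article's terms)."""

    name: str
    children: list[BusinessObject] = field(default_factory=list)

    def add(self, child: BusinessObject) -> BusinessObject:
        """Attach a child object and return it, so hierarchies can be chained."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0):
        """Yield (depth, name) pairs, parents before their children."""
        yield depth, self.name
        for child in self.children:
            yield from child.walk(depth + 1)


# Assumed nesting, for illustration only.
order = BusinessObject("Purchase Order")
position = order.add(BusinessObject("Purchase Order Position"))
position.add(BusinessObject("Contract Number (ID)"))

for depth, name in order.walk():
    print("  " * depth + name)
```

Every node, including the attribute-level "Contract Number (ID)", is the same kind of object here, which mirrors the statement that everything is a business object.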
Logical level
On this level, the logical data models are described. Currently, different notations are used here, e.g. ERM and UML. The preferred way to describe logical data models is to use Entity types and Attributes in IE Data Models. eERM diagrams should no longer be used; they should be transformed into the IE Data Model format (this is possible without a loss of information).
One way to create the logical data model is top down by transforming the elements from the conceptual level to the logical level. For this purpose our new transformation framework can be used. This further keeps the relation between source element (Business Object) and target element (Entity Type or Attribute).
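The idea of a top-down transformation that keeps the link between source and target can be sketched as follows. This is not the ARIS transformation framework; the class names, the transform function and the rule "leaf objects become Attributes, all others become Entity Types" are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessObject:
    name: str
    is_leaf: bool  # assumption: leaf objects carry attribute-level detail


@dataclass(frozen=True)
class LogicalElement:
    name: str
    kind: str  # "Entity Type" or "Attribute"


def transform(objects):
    """Derive one logical element per business object and keep the trace.

    The returned dict maps each source business object to the logical
    element created from it, so the relation survives the transformation.
    """
    trace = {}
    for obj in objects:
        kind = "Attribute" if obj.is_leaf else "Entity Type"
        trace[obj] = LogicalElement(obj.name, kind)
    return trace


purchase_order = BusinessObject("Purchase Order", is_leaf=False)
contract_number = BusinessObject("Contract Number (ID)", is_leaf=True)
trace = transform([purchase_order, contract_number])

for source, target in trace.items():
    print(f"{source.name} -> {target.kind} '{target.name}'")
```

Because the trace is kept as data, a later change on either side can be followed back to its counterpart, which is the prerequisite for any kind of round-trip support.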
It is also possible to import a logical data model or to model it manually as part of an as-is analysis and to map these elements to the corresponding business objects at the conceptual level. The transformation-based approach and the mapping approach should not be mixed. Of course, you can also focus on modeling the logical level without any connection to conceptual elements.
Clusters may also be used at the logical level, but only to define partial (or scoped) views of complex logical items.
In the future, with our new UML Designer (including good profile support), UML will play an important role in describing logical data structures.
Physical level
The physical data level can be described either using the relational (DDL) or object-oriented (e.g. XSD) paradigm. Elements of the relational description are tables, fields, relations, and constraints. The object-oriented description contains simple and complex types as well as the relations between them.
Although many design tools do not support a graphical description of the physical layer and work directly with the corresponding DDL/XSD files, we think it makes sense to have diagrams in ARIS for that purpose (Table Diagrams for the relational description and a UML profile for the XSD representation). Only if these elements are part of the ARIS repository is it possible to run analyses on them or to transform and map them to other elements in our repository. But if there is an external source (outside of ARIS) for the physical data description, we consider this source the master for the data.
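The assignment of object and diagram types to the three levels described above can be summarized in a small lookup table. This is a sketch for orientation only: the names are the informal ones used in this article, not ARIS API identifiers, and the helper layers_for is hypothetical.

```python
from enum import Enum


class Layer(Enum):
    CONCEPTUAL = "conceptual"
    LOGICAL = "logical"
    PHYSICAL = "physical"


# (layer, object type, diagram type) as described in the text above.
# None means the text does not name a diagram type for that case.
ASSIGNMENTS = [
    (Layer.CONCEPTUAL, "Technical Term", "Technical Terms Model"),
    (Layer.CONCEPTUAL, "Cluster", "IE Data Model"),
    (Layer.LOGICAL, "Entity Type", "IE Data Model"),
    (Layer.LOGICAL, "Attribute", "IE Data Model"),
    (Layer.LOGICAL, "Cluster", None),  # only for partial (scoped) views
    (Layer.PHYSICAL, "Table", "Table Diagram"),
    (Layer.PHYSICAL, "Field", "Table Diagram"),
]


def layers_for(object_type):
    """Return the set of layers on which an object type is used."""
    return {layer for layer, obj, _ in ASSIGNMENTS if obj == object_type}


print(sorted(layer.value for layer in layers_for("Cluster")))
```

The Cluster is the only type in this table that appears on two levels, with a different role on each.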
Conclusion
With the introduced abstraction layers we hope to make data modeling in ARIS more transparent than in the past. The other approaches offered by our methodology will remain available in the future, but new functionality in our standard products (transformations, imports, exports, search…) will be based on the object and diagram types described above.
I have not discussed in this article how data exchange between IT systems should be modeled in ARIS. This is currently the subject of discussions within the IDS Enterprise Architecture and SOA team. When we have a result, we will share it with you directly.
I’d appreciate your responses here in the community, and I assure you that we will take your requirements into account in our future activities.
Hello!
It will be a great day when there is a tool in which all three models are linked, so that changes in one model cause a change in another. I understand this is not an easy task (and that is putting it mildly), but when it is solved, it will be a crucial point.
Uwe,
Not sure I agree with your proposals here.
Conceptual layer is all about semantics so Technical Terms are good. However, Clusters are not necessarily identical to individual business objects. They represent whatever information is needed to get the job done. This could be a non-normalised set of any data you like. The structure of the data is defined at the Logical layer.
Logical layer is where decisions are made about information structures and relationships. There is no guarantee that there is a 1-1 relationship between Clusters and Entities. For example: New Customer could become "Prospect" and Customer and Old Customer could both become "Customer" (this is pretty typical in CRM databases). IE Data Models are ideal for this. I would not recommend UML as there is no way I can see to clearly differentiate logical and physical attributes.
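The customer example can be written down as a plain mapping to show why it is not 1-1. This is just an illustrative sketch in Python; the function name clusters_by_entity is made up for the example.

```python
from collections import defaultdict

# Conceptual Cluster -> logical Entity, using the names from the example above.
cluster_to_entity = {
    "New Customer": "Prospect",
    "Customer": "Customer",
    "Old Customer": "Customer",
}


def clusters_by_entity(mapping):
    """Invert the mapping: list the Clusters that end up in each Entity."""
    grouped = defaultdict(list)
    for cluster, entity in mapping.items():
        grouped[entity].append(cluster)
    return dict(grouped)


for entity, clusters in clusters_by_entity(cluster_to_entity).items():
    print(f"{entity}: {clusters}")
```

Three Clusters collapse into two Entities, so any tool support has to store the mapping explicitly rather than assume matching names.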
Physical layer describes the implementations of the data. There can be many physical versions of a single logical entity. For example: there could be 3 different CRM systems with different physical data representations of the Customer entity. Some of these could be predefined in packages like SAP or Siebel. UML Class Diagrams are good for this. The classes can be used to generate database schemas and XSD's. Or schemas can be reverse engineered and mapped to the logical models.
A strategic modelling tool like ARIS must be able to support and manage all these mappings.
I have used ARIS to do this and had to use the Attribute Group object as the logical attribute to enable mappings to multiple physical attributes in different databases. This worked well since the same attributes could be used on Class diagrams.
Hope this makes sense,
Richard
Hi Richard,
I agree with your statements in principle, but on some points of detail I have a different perspective.
Of course Technical Terms can be used for modeling the conceptual layer, but as the object name says, they are terms, and I would distinguish between a term that gives an element a name and the element itself.
We had long discussions about whether we should use the Cluster at the conceptual level or whether it would be better to introduce a new object type "Business Object". The decision for Cluster was based on pragmatic reasons. Currently two object types are widely used by our customers: Technical Term and Cluster. We did not want to use Technical Term anymore because this object type should only be used to define the modeling grammar, so we chose the Cluster, so that as many customers as possible do not have to change their methodology.
We also want to avoid that different object types are used for the same objective. As long as only the manual modeling is concerned the customer should be free to choose any methodology he prefers. But if functional support in our standard products is required we have to concentrate on one convention. If we want to provide a transformation between conceptual and logical level (including round-trip support) it would make things much too complicated, if we had to care for different methodologies expressing the same aspects.
Regarding the logical level, many of our customers are using UML here, and I think with special profiles the differentiation between logical and physical attributes is possible. But even if we might later use UML as a persistence layer for the logical level, we would provide an IE Data Model-like UI for the modeling.
Regarding the physical level I completely agree with you. With the next major release (including UML 2.x support) we plan to extend the usage of our new transformation technology to support the mappings you mention.
Regards
Uwe
As someone who has made extensive use of ARIS for modeling data I am very glad to see some much needed attention being paid to this area of the product.
In general, I like the model you have put together. I agree with Richard that Technical Terms (which I usually rename to Business Terms) are a much better choice for representing data in process models and modelling at the conceptual level. One of the things that has always been a problem, over more than 30 years of data modeling, is that the formal Data Model tends to change over time. For instance, when data is identified early in a process modeling effort it may not be clear whether the data being used is an entity, attribute, or object. When the definition changes as the data model is refined, the objects need to be changed in the process models. The use of Technical Terms addresses this issue very well. It also addresses the use of composite data from multiple sources, as Richard mentioned, and the use of things that are not easily represented in a more formal model, like derived attributes and composite attributes. Technical Terms can also be mapped to any more formal data objects (objects, entities, tables, attributes) at the logical and physical level.
ARIS is one of the few tools that allow the definition of attributes (Technical Terms) outside of a more formal container (entity, object, table) and allow them to be used in the process models. I hope this will not go away as I see it as a competitive advantage for ARIS.
A couple of things to remember from someone who has been through this exercise before:
1) Mapping of attributes between models needs to be many-to-many. Lots of vendors get this wrong by using a one-to-many or even a one-to-one mapping which just doesn't work in the real world.
2) There is no automated way to go from Conceptual to Logical to Physical models. Design of models is a process which requires the intelligence of a designer. It can be supported by automated assistance and supported by patterns but it can not be fully automated.
3) Data flow is an important part of the picture, I hope this gets addressed soon as I see a lot more need for this than data modeling (although both need to be improved).
4) There needs to be a more compact representation of the data model where attributes are shown inside the Entity box. I have been able to address this by creating a custom filter, but it is a lot of work and not entirely satisfactory.
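To illustrate point 1, here is a minimal sketch of a many-to-many attribute mapping that can be queried in both directions. It is not how ARIS stores mappings; the class AttributeMapping and all attribute names (crm_a, crm_b and so on) are hypothetical.

```python
from collections import defaultdict


class AttributeMapping:
    """Many-to-many links between logical and physical attributes."""

    def __init__(self):
        self._to_physical = defaultdict(set)
        self._to_logical = defaultdict(set)

    def link(self, logical, physical):
        """Record one link; either side may already take part in other links."""
        self._to_physical[logical].add(physical)
        self._to_logical[physical].add(logical)

    def physical_for(self, logical):
        return set(self._to_physical.get(logical, ()))

    def logical_for(self, physical):
        return set(self._to_logical.get(physical, ()))


mapping = AttributeMapping()
# One logical attribute implemented in two systems ...
mapping.link("Customer.Name", "crm_a.CUST.FULL_NAME")
mapping.link("Customer.Name", "crm_b.PARTNER.NAME1")
# ... and one physical field that carries two logical attributes.
mapping.link("Customer.First Name", "crm_a.CUST.FULL_NAME")

print(sorted(mapping.physical_for("Customer.Name")))
print(sorted(mapping.logical_for("crm_a.CUST.FULL_NAME")))
```

A one-to-many structure could represent the first two links but not the third, which is exactly the case that shows up in real systems.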
Rick
This is something I have been expecting for quite a long time.
Thanks Uwe, for sharing your thoughts. You have documented the view of data modeling in ARIS very well.
As most of my concerns have already been covered in the discussion, it would be great if we could drill one step further down into the various scenarios where we can use ARIS as a data modeler.
First of all, I share the understanding of the other members of the group that the Technical Terms model suits the conceptual level best. It is the best fit for defining your business terms and data dictionary conceptually. We should use Clusters, Entities etc. with the IE or eERM model types for the logical level. Attribute Allocation models, I feel, should end up at the physical level together with some other model types like UML, DTD etc.
In one of our discussions about automated scripts generated as a result of data modeling in ARIS, we found that such basic features are missing in ARIS.
@Uwe: It would be great if you can also share your ideas about the suitability of various data objects. I really appreciate your words.
Thanks
~Parveen
Hi all,
only a short answer today because I will be out of the office for the next two weeks and am a little busy at the moment.
I understand that many of you have concerns about the usage of Clusters at conceptual level and like to use Technical Terms instead.
Important for that discussion is the fact that IDS wants to establish only one object type (or methodology) for each level. The reason is simply to define a best practice that is followed in the majority of projects and that we'd like to support with extended functionality. We will not support many different object types with enhanced features, partly because of the effort involved but also to reduce the complexity of the tool.
So we have to avoid different object types being used. E.g. on the conceptual level both Technical Terms and Clusters are applicable, and most of our customers are using one of these two types. If we introduced a new object type like Business Object, we would have to force all of our customers to switch to the new methodology. So we decided to prefer Cluster at the conceptual level. The reasons for this decision:
1. Both elements are applicable, neither of them is better for the conceptual level
2. A term defines only how an element should be named, but a term is not the element itself
3. On logical level we want to support ERM with Entity Types and we want to have also only one method on this level, so the cluster was free to be used elsewhere
Of course it is also an alternative to use a new object type at conceptual level (with the same behavior Technical Terms or Clusters are offering) and to use Clusters only to define views on logical level. The usage of ERM would remain.
Do you think this is a meaningful alternative?
Regards
Uwe
Uwe,
I disagree with your contention that both are applicable but neither is better. In the current functionality there are many more things that can be done with Technical Terms than can be done with Clusters. Technical Terms can do almost everything that can be done with Clusters. The glaring exception is the ability to create an assignment from a Technical Term to a data model (IE or ERM).
Also, if you are planning on using Clusters as the primary modeling object at the conceptual level what model type are you planning on using to define them? Currently Clusters do not have their own model type.
I'm not saying that either alternative would not work but I think that one does provide much better functionality, and it is functionality that does not exist in other tools.
Rick
Hi Uwe,
I am sorry, but I totally agree with Rick. I feel the same way.
Technical Terms are a better option than Clusters for the conceptual layer. I think the people in this discussion have understood where ARIS has its edge, and I am not sure why IDS wants to limit the power of ARIS as a data modeler.
On the other hand, I feel it would be great if we worked on establishing ARIS as a data modeling tool, the core area of any business process. This helps to view your business process from the top down.
Regards
~Parveen
Hi Rick and Parveen,
now I'm back from vacation and from being ill for some time.
First of all I want to thank you for the intensive discussion; this is what I intended to get with my article. It's a pity that more members are not participating :-(
But I'd like to comment some points from your last answers.
1. In general, I'm missing your statement regarding my argument that IDS tends to support only one kind of data modeling for each layer to reduce complexity. Do you agree with that, at least? If yes, we can go deeper into the discussion of Technical Term vs. Cluster.
2. I agree with Rick that the usage depends on the functional support. I have to admit that there is no functional support for Clusters at the moment and that we have to deliver it to have a chance of convincing you. Or, expressed in a different way: without very good functional support I would not expect you to use Clusters in the future.
But on the other hand we have to state that the functional support for Technical Terms is not really broad either.
3. I disagree with the opinion that we would limit the power of ARIS as a data modeling tool once the mentioned functional support is available.
4. One point I was also missing in your posts is your response to my argument that a Technical Term should be used (and this is already implied by its name) to define the grammar that should be used to name elements in ARIS: not only data elements, but also others like Functions or Events. What do you think about that?
In any case we will keep your valuable input in mind during the further development of ARIS towards data modeling.
Regards
Uwe
Hi Rick,
First of all I really appreciate your help with this discussion. It is turning out to be very fruitful.
To follow up on your comments, what I want to say is:
1). I personally very much appreciate the IDS efforts to recommend only one kind of data modeling approach, which is quite in line with the rest of the world. I also want to follow and stick to one approach, the reason being a consistent environment and being on the same page with the external world. I totally agree with this approach.
2). For the second point I think you got our message about the functional support of "Clusters".
3). Regarding the power of ARIS as a data modeling tool, we found a lot of scope for its application. The only thing we are concerned about is how we communicate those out-of-scope areas to IDS. Since you are not supporting those features, that indirectly limits our use of ARIS as a data modeler.
4). As per my understanding, Technical Terms should be used to conceptualize the terms for which you feel detailed information is needed to avoid any ambiguity, e.g. Encrypted Code: to define what exactly Encrypted Code means to you, I feel you should use Technical Terms.
I understand your concern as well, as you also need to draw a line somewhere and to focus on core business area.
Quite Curious about the reply :)
Regards
Parveen
Uwe,
Welcome back from vacation, and I hope you are feeling better. It's great to get the opportunity to participate in the continuing evolution of ARIS.
On your first point I also agree with Parveen as to the need to have a single modeling approach at each level. Anything else would be impractical from IDS Scheer's standpoint and confusing for the customers.
On the second point: obviously, if there is new functionality coming to support additional capabilities around the Cluster object, that would be very welcome. This has always been a problem in the data area in terms of getting the object capabilities to match up with the model types that can be assigned.
On point three, my areas of concern are primarily around the ability to represent things in the Data domain without using a traditional data modeling method. There are several areas where this can be important. First, the process models often get developed before any data models are created. When modelers start placing input and output data on the process model they don't know whether what they are referencing is an entity, attribute or relationship and frankly don't really care. I've spent way too much time over the course of my career making changes to process models as the data model evolves, so I want the definitions to be independent of each other. I can currently do that with Tech Terms. Also, much of the data that the business really cares about, and wants to show on the Process Model, is not normally captured on a data model. This includes things like composite and derived attributes. This also can include things like synonyms and homonyms. As long as the new functionality allows me to do these kinds of things like I can currently do with Tech Terms, that will be great.
It sounds like perhaps Tech Terms will still be available to do some of this (like the homonyms and synonyms and providing a standard definition of a term) but that Clusters are going to be enhanced to provide additional capabilities that make them better suited for use as inputs and outputs on EPCs. That works for me.
Looking forward to checking out the new functionality and seeing some examples.
Rick
Hi Rick,
understood your message and fully agree. Changing the methodology without offering functional support would lead to disappointment and low acceptance. So let's see when we are ready to release the new features around information architecture (I'm only one person influencing the development of this topic, but the thread here in the community gives me good arguments for our internal discussion) and how they will be accepted. And keep in mind that the 'old' methodology will remain available in ARIS. You may continue to use Technical Terms; the only point is that we don't want to develop new functionality around them in the future.
Regards
Uwe
Hi Uwe, Rick and Parveen
I have just found this thread and I am very interested in the concepts that you have been discussing, especially in terms of catering for derived and composite attributes.
I would tend to disagree with the use of Technical Terms (TTs) for use within the conceptual level, but this view has more to do with how TTs have been used within our environment, which includes the declaration and modelling of WRICEF requirements for SAP using TTs (i.e. each Conversion, Extension is a technical term) to the modelling of high-level information/data requirements (i.e. Customer Data, Product Data etc) where detailed information or data models do not exist.
Our interest at this stage is to find a custom object that would allow us to model more detailed information items at a level above the logical data model level, but that has far more detail than would be included in the conceptual data model space. Essentially, we want to try and model the groupings of information (i.e. data from various underlying entities come together to provide context and meaning to end users) and to model those derived/calculated complex fields (i.e. data processed to produce new information) which are not necessarily stored within a data store or documented and retained within the organisation's body of knowledge.
In our discussions at our client, we have discussed the continued use of the Technical Terms to define information/data requirements at the higher process level, but we need to use something that is more defined, that is more reference-able, reusable and can be standardised across the organisation. Some mention has been made around the use of a COT (Complex Object Type) object which could handle the complexity that we are trying to represent, but I am struggling to find any relevant documentation which might explain or provide more detail on this object.
Some of our requirements, and therefore challenges, include the fact that we will need to store complex meta data around the object that we use to define Information Elements (i.e. derived/calculation information/data items) and Information Entities (logical groupings of information and data elements). This meta data may vary depending on how the information is being consumed or produced across the various processes, and therefore, the ability to store meta data in a bridge or association mechanism between the information item and the multiple processes will be a significant challenge to us. In order to cater for this meta data requirement, the use of the TT becomes difficult due to the shared and pervasive nature with which this object has been used within the current methodology.
I am therefore hoping that you guys would be able to provide some insight into this topic and hopefully I can spark some renewed interest in the related subject matter.
And just to add fuel to the fire, I would agree that ARIS really needs to develop improved data modelling capabilities. We have had huge issues importing our initial data model (designed in PowerDesigner) into ARIS. The features that ARIS currently uses to represent data are insufficient, and the usage of the resultant data model components is extremely cumbersome and not user friendly. I believe the capabilities of ARIS within this space are dwarfed by the likes of Microsoft Visio, which presents huge challenges. The only saving grace for the product within our client has been ARIS's dominant position as a process modelling tool which our client wants to use to manage its Enterprise Architecture going forward. Therefore ARIS really needs to develop this capability in order to reaffirm its dominance in this space and gear the company towards increased client embedding and engagement.
Looking forward to your comments and feedback.
Robert,
Sorry it took so long to respond, I hadn't looked at this post in a while.
Given the current capabilities of ARIS I think the Technical Term is the only object that will allow you to do the things you describe. You can certainly create new symbols based on the TT to distinguish the various uses. For your representation of groupings/calculations using data elements from other sources you could take a look at the Data Cluster object. It is similar in nature to an SQL View, which allows you to bring data together from existing tables. I have always been disappointed with the Data Cluster though, as it has a very limited implementation in terms of allowed assignments.
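For anyone not familiar with the SQL View analogy, here is a small self-contained sketch using Python's built-in sqlite3 module. It has nothing to do with ARIS itself; the table, column and view names are invented for the example. The view groups data from two tables and adds a derived value that is not stored anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_item (customer_id INTEGER, quantity INTEGER, unit_price REAL);

    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO order_item VALUES (1, 2, 10.0), (1, 1, 5.0);

    -- The view brings data together from both tables and derives a total.
    CREATE VIEW customer_order_value AS
        SELECT c.name AS customer_name,
               SUM(o.quantity * o.unit_price) AS order_value
        FROM customer c
        JOIN order_item o ON o.customer_id = c.id
        GROUP BY c.id, c.name;
    """
)

rows = conn.execute(
    "SELECT customer_name, order_value FROM customer_order_value"
).fetchall()
print(rows)
```

The order_value column exists only in the view, which is the kind of derived, grouped information Robert describes.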
For importing of data models did you look at the 3rd party product Toolbus by Reischmann? I had some success with it bringing ERwin models into ARIS.
Also, rumor has it that in the near future you will be able to do much more customization of the ARIS meta model. This should be coming to a service release in the near future, no date has been promised.