maandag 15 oktober 2012

Improved Readability - The use of submodels in Semantic data modeling

Within Semantic modeling we have so called Scenarios. They are there simply because of readability. It is proven in practice that people are better able to read and understand a semantic model firstly due to its hierarchical notation and secondly because of the use of Scenarios. Scenarios typically zoom into logical submodels and show the links to the relevant objects in other submodels.

Let's have a look at the Microsoft example database AdventureWorks.

AdventureWorks is a bicycle manufacturing company . It's employees produce these bicycles themselves but since 2000 they also purchase sub components. They sell to customers all over the world.

The overall Semantic model looks like this:

Logically there are 4 main submodels: HR, Production, Purchasing and Sales. Let's zoom into each of these concepts.



A central concept Person is used to support the concept of Employees (used in Purchasing, Production and Sales) and Customer Persons. Per Person phone nrs and creditcard information is stored. Employees are modelled in a hierarchical way keeping track of each employees manager. Furthermore the Departments of the company and their employees are stored. History is kept of Payments and Job Candidates.



The central data object is Product along with its Bill of Materials using different Units of Measure. Products are (sub)categorized and there are different models with illustrations and descriptions per culture. Products are produced according to WorkOrders which are routed accros Locations. The system keeps track of the Inventory. Photos and Documents are stored for each product. A history is logged of Product Costprices, Reviews, Transactions and Listprices. Product prices are listed as so called Special Offers with a certain validity period.




Vendors supply product components which link into the bill of materials of the end product. The system contains all the purchase order details. The employees and products involved in the Purchasing process are also shown in this picture.



The central data object here is Customer and all of its Sales order details (quantities, prices and reasons). Customers are found in different territories grouped in Countries/Regions. The system is able to handle multiple currencies, exchange rates and tax rates. The employees and products involved in the Sales proces are also shown in this picture.

Entity Relationship modeling

Just to compare both methods a picture of the ER model of AdventureWorks.

donderdag 4 oktober 2012

Useful Design Patterns - lists & multiple self referring trees

Useful Design Patterns

The cost and time to design new data models can be brought down when combining a number of typical solutions into a new one. Reusing certain design patterns is advisable because it is very efficient to have solution features which quickly lead to a reliable result.

In this message we discuss :

-(Multiple) List(s)
-(Multiple) Self Referring Trees
-A bit about Time driven modelling.


Let's assume we need a concept country for our solution. The most simple format we can think of is a data object called Countries. It enables us to list all countries of the world and select from them.

Multiple Lists

Now suppose we like to limit the country choice list per continent for a bit more user convenience. A typical solution would be to add a new data object Continent.
After selecting a continent we can subselect the countries and only show these to the user. It also enable users to filter data by contintent for reporting purposes.

Self referring tree + List

We all know reality is more complex and therefore let's assume we need a concept called Region to store data on. This can be organised in a simple self referencing tree whereby we can model the regiontypes like continent, country, state, department,etc in a simple list.

Here we see that nested regions like [Earth,America,USA,Texas,Houston] can be simply stored in one data object which refers to itself.

Multiple self referring trees and link lists

When we only require geographical system this would be sufficient. However when we want regions to be member of multiple Regional groupings we need a bit more complex construction.

The data object NestedRegionTypes contains different regional grouping systems like:

Dutch system

  • Country
  • Province
  • Community

French System

  • State
  • Region
  • Departement
  • Arrondisement
  • Kanton
  • Commune

Intercontinental Country Organisations

  • APEC
  • G7
  • etc

European Country Organisations

  • EU
  • Benelux
  • etc

American Country Organisations

  • OAS
  • etc

Having these systems in place each Region has a mandatory geographical type so we can always find it by using the geographical ordering discussed above :
  • -USA (country) of North America (continent)
  • -France (country) of Europa (continent)
  • -Netherlands (country) of Europa (continent)
  • -Texas (state) of USA (country)
  • -Haute Provence (Region) of France (country)
  • -Utrecht (province) of Neth)erlands (country
  • -etc

Now on top of that each region (for example a country like USA) can be member of multiple other regional grouping systems (e.g. APEC, G7, NAFTA) by entering them in the data object RegionLink.
  • APEC,Japan
  • G7,USA
  • G7,France
  • NAFTA,Mexico
  • EU,France
  • EU,Netherlands


One can now add grouping systems of any type to the database and directly use them without any programming effort. For example the collection of ''Kantons speaking a particular language in Switzerland'' can be added in two steps.

First create a grouping system of lingual division in the dataobject NestedRegionTypes.

Language spoken
  • German
  • French
  • Italian
  • Romansh

Secondly link the relevant Swiss Regions to their lingual group in the dataobject RegionLink

  • Geneva, French
  • Graubunden, Romansh
  • Ticino, Italian
  • Schaffhausen, German

More time driven flexibility

Finally, to add a bit more flexibility, when membership to grouping systems vary in time you can give them a start- and enddate attribute to filter out the non-current regions at any give moment.

Let's take as an example the Commonwealth of Nations established in 1931, today a group of 54 countries. The entries in the RegionLink data object look like:
  • Commonwealth, Australia, 1931, today
  • Commonwealth, SriLanka, 1948, today
  • Commonwealth, Tanganyika, 1961, 1964
  • Commonwealth, Ireland, 1931, 1949
As you can see in 1940 SriLanka was not a member whereas today Ireland is no longer a member, so the list varies in time.

That's it for now, hope you found any of the solutions useful for your application.