
Modeling

There can be many reasons for making an Ampersand model. Just a few examples are:

You want to study the conceptual structure of something, to gain a better understanding or to acquire the jargon of a topic. Business analysts do it when starting on a new assignment. In a new environment with new people, unknown habits and a lot of catching up to do, they pick some documentation, courseware, legislation or other jargon-laden material and make an Ampersand model. Making sense of piles of new material goes much faster if you try to make an Ampersand model on the side.
You want to design an application. You model the data and the rules, and define one service for every service your store provides.
You want to detect flaws in an architecture. Make an Ampersand model of that architecture, and model every rule you want checked. Populate your model with data from the real architecture, and you can measure every flaw as it occurs.
Your build-team wants clear working instructions, but there is no time to write a functional specification by hand. You make your Ampersand model and derive the work instructions from the functional specification and the prototype that you generate from it.
You need correct software. Develop your software in Ampersand and use the rules to establish correctness.
Your assignment is to cleanse a polluted database. Try to find the rules that define what clean data is, and run your data through Ampersand.

This chapter about modeling discusses different ways of using Ampersand. Each section serves a different purpose (e.g. how to make a data model, how to make a legal analysis, etc.), using the same tool (Ampersand). These sections can be used as study material.

The ways of using Ampersand are not exhaustive. There are most certainly other reasons for making an Ampersand model, because a good conceptual analysis precedes practically every problem that is worth solving.

Model

An information system is modeled by designing every subsystem separately.

Steps

To specify an information system, take the following steps:
1. Agree on the domain language;
2. Agree on the rules;
3. Validate the rules;
4. Define services.

What is a model about?

An Ampersand model describes the rules, relations and concepts that define a business system. Using this specification, a software system can be built that can hold structured information as a set of facts. Based on the rules, the set of facts can be checked automatically to detect violations of rules.
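The idea of checking a set of facts against a rule can be illustrated outside Ampersand. The following Python sketch is not Ampersand code: the relation `placed_by`, the set `registered` and the function `violations` are hypothetical names invented for this illustration. It stores a relation as a set of pairs and reports the pairs that break a rule.

```python
# Facts: a relation is a set of (source, target) pairs.
# All names and data here are hypothetical, for illustration only.
placed_by = {("order1", "alice"), ("order2", "bob"), ("order3", "carol")}
registered = {"alice", "bob"}  # the customers that are known to the system


def violations(relation, allowed_targets):
    """Return the pairs whose target element is not in allowed_targets."""
    return sorted(pair for pair in relation if pair[1] not in allowed_targets)


# Rule: every order must be placed by a registered customer.
print(violations(placed_by, registered))  # [('order3', 'carol')]
```

An empty result means that the facts satisfy the rule; every pair in the result is one violation that someone has to resolve.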

In an Ampersand model, services can be defined too, enabling the definition of changes to the set of facts.

Domain Driven Design

Why bother?

Domain modeling is used for various purposes:

To explore unknown domains. The act of making domain models is an efficient way to familiarize yourself with a new domain and to unravel anfractuous jargon.
To define the boundaries of individual services. A domain model should make these boundaries clear, to develop components as independently as possible.
To avoid misunderstandings over terminology in a project with multiple stakeholders. A terminology list does not always suffice in practice. A domain model should consolidate consensus and make misunderstandings explicit and debatable before they do harm.
Learning. A learner must familiarize herself with new words and specific phrases. Making a domain model improves both the speed and depth of learning.

The Ampersand way of domain modeling is classic in the sense that a conceptual model is made of a domain of your choice. It differs from other approaches in specific ways:

Your model can be interpreted by a computer to create specific useful artifacts, such as a data model, a prototype implementation, or documentation;
Your model has an interpretation in natural language, which you can use to calibrate and unify the language of the business.

Steps

Here is a summary of the things you do.

Start with an informal analysis of the business, defining
1. stakeholders, such as: sales rep, student, DBA, inspector, etc.
2. relevant areas of expertise, such as insurance, security, legal, management, etc.
3. business functions, such as invoicing, applying for a job, computing rates, etc.
In essence, this step yields three lists. You obtain them by studying documentation and talking to people as you scope your work.
Formalize concepts and relations to reconstruct the language of the business. Validate that language with your stakeholders, creating consensus over terms and phrases. Your goal is to define the smallest agreed language in which agreements can be expressed.
Use this language and adapt it when needed:
1. to generate visual representations of your domain model (letting your computer do the drawing work);
2. to resolve language-based misunderstandings in your team;
3. to generate a database, either for prototyping or production purposes;
4. to define, verify and validate services that constitute your application;
5. to audit designs;
6. to generate documentation (letting your computer do a lot of writing);
7. other purposes, that arise incidentally.
In essence, this step yields a conceptual model of your domain. In some cases, however, the act of modeling is more important than having the model.
Draw boundaries to define bounded contexts. Identify reusable patterns. Identify entities, attributes, aggregates and services for each bounded context. Assign developers, stakeholders and product owners to develop each bounded context further.

Informal analysis

Formalize concepts and relations

Use and maintain the model

Drawing boundaries

Data modeling

When a data model serves to build an information system, it must ensure that all data that is needed in practice can be represented in the database. So you need a practical modeling technique based on actual data. By using real-life samples of data, you can decide which data elements to include or leave out in the new model and be reasonably confident that you don't leave any gaps.

In this section we will systematically extract concepts and relations based on data from a spreadsheet. The result of this analysis is an Ampersand model, which you can use to generate a data model for you.

Example

Let us start by looking at an example:

Since Ampersand works with relations, it must represent this table as relations. Three relations can do the job in the following manner:

POPULATION firstname[President*Name] CONTAINS
  [ ("1", "Abraham")
  , ("2", "Barack")
  , ("3", "Calvin")
  , ("4", "Dwight")
  ]

POPULATION lastname[President*Surname] CONTAINS
  [ ("1", "Lincoln")
  , ("2", "Obama")
  , ("3", "Coolidge")
  , ("4", "Eisenhower")
  ]

POPULATION birth[President*Date] CONTAINS
  [ ("1", "February 12, 1809")
  , ("2", "August 4, 1961")
  , ("3", "July 4, 1872")
  , ("4", "October 14, 1890")
  ]

Extract relations from tables

In our example, each row in the spreadsheet represents a president. So, the source concept of each relation is President. Each column represents a different relation. So we can use the name of each column as relation name. Then, we invent names to describe the content of each column: Name, Surname, Date.
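The extraction recipe can be sketched in a few lines of Python. This is an illustration only, not something Ampersand requires: the column key `nr` for the row number and the function name `extract_relations` are assumptions made for this sketch. The cell values are the ones from the populations above.

```python
# Rows of the example table. The key "nr" (row number) is an assumed name.
rows = [
    {"nr": "1", "firstname": "Abraham", "lastname": "Lincoln", "birth": "February 12, 1809"},
    {"nr": "2", "firstname": "Barack", "lastname": "Obama", "birth": "August 4, 1961"},
    {"nr": "3", "firstname": "Calvin", "lastname": "Coolidge", "birth": "July 4, 1872"},
    {"nr": "4", "firstname": "Dwight", "lastname": "Eisenhower", "birth": "October 14, 1890"},
]


def extract_relations(rows, key):
    """Turn every non-key column into a set of (row key, cell value) pairs."""
    relations = {}
    for row in rows:
        for column, value in row.items():
            if column != key:
                relations.setdefault(column, set()).add((row[key], value))
    return relations


relations = extract_relations(rows, key="nr")
print(sorted(relations))               # ['birth', 'firstname', 'lastname']
print(sorted(relations["firstname"]))
```

Each row contributes one pair to every relation, with the row number as the source element and the cell value as the target element.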

When things get bigger, it is useful to draw the relations on a whiteboard or in your notebook. This helps you keep an overview. Here is how it is done:

This drawing shows every relation as a line, drawn from source to target. The arrowhead in the middle only serves to remind the reader which concept is the source and which is the target. If you point the arrowhead from source to target, you will always know how the relation is defined.

Adapt as needed

Suppose we have a second table, which also has information about presidents:

This table is similar with respect to the interpretation of a row: here too, each row represents a president. However, the presidents aren't numbered in this table, so we have to add these numbers.

Numbering rows has the advantage that it is easier to recognise a president.

POPULATION state[President*State] CONTAINS
  [ ("1", "Kentucky")
  , ("2", "Hawaii")
  , ("3", "Vermont")
  , ("5", "New York")
  , ("6", "Georgia")
  ]

POPULATION lastname[President*Surname] CONTAINS
  [ ("1", "Lincoln")
  , ("2", "Obama")
  , ("3", "Coolidge")
  , ("4", "Eisenhower")
  , ("5", "Roosevelt")
  , ("6", "Carter")
  ]

POPULATION capital[President*City] CONTAINS
  [ ("1", "Frankfort")
  , ("2", "Honolulu")
  , ("3", "Plymouth")
  , ("5", "New York")
  , ("6", "Atlanta")
  ]

There seems to be something funny about the relation capital[President*City]. In the model, this relation pairs each president with the capital city of the state in which he was born. This meaning can be made more obvious by redefining the relation:

POPULATION capital[State*City] CONTAINS
  [ ("Kentucky", "Frankfort")
  , ("Hawaii", "Honolulu")
  , ("Vermont", "Plymouth")
  , ("New York", "New York")
  , ("Georgia", "Atlanta")
  ]

The relation capital[State*City] feels more natural. The reason is that a capital city belongs to its state rather than to a president who happens to have been born in that state.
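Redefining the relation loses no information. The pairs of the original capital[President*City] can be recovered by composing state[President*State] with capital[State*City]. The following Python sketch illustrates this with the populations given above; it is not Ampersand code, and the function name `compose` is chosen for this illustration.

```python
# Populations copied from the text above.
state = {("1", "Kentucky"), ("2", "Hawaii"), ("3", "Vermont"),
         ("5", "New York"), ("6", "Georgia")}
capital = {("Kentucky", "Frankfort"), ("Hawaii", "Honolulu"),
           ("Vermont", "Plymouth"), ("New York", "New York"),
           ("Georgia", "Atlanta")}


def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# Pairs each president with the capital of the state in which he was born.
for pair in sorted(compose(state, capital)):
    print(pair)
```

The result contains exactly the five pairs that the spreadsheet column provided, so the simpler relation can replace the complicated one.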

This example illustrates that you may find "strange things" in your data samples. You can fix them as you go, as we did with the relation capital in this example.

So far, we have looked at relations that can be extracted from existing spreadsheet data. With some practice, you will soon learn to tackle larger and more realistic problems. The essence of this technique is to break down knowledge into relations, to get to the bottom of the conceptual structure.

Add Meaning

More often than not, the meaning of data in a spreadsheet is obvious. For instance, in the relation firstname it hardly needs to be said that it contains the first name of each president. But it is not always that obvious. In the second example we saw that the meaning of the relation capital[President*City] was far from obvious. It relates a president to the capital city of the state in which he was born. There are two things we need to do about it:

replace a relation with a complicated meaning by a simpler one;
document the meaning of each relation.

The second thing, documenting each relation, is necessary to ensure consensus and support. Making the meaning of every relation explicit invites stakeholders to stand up and voice any different insights.

RELATION firstname[President*Name]
MEANING "The first name of a president is registered in this relation."

RELATION lastname[President*Surname]
MEANING "The last name of a president is registered in this relation."

RELATION birth[President*Date]
MEANING "The date of birth of a president is registered in this relation."

RELATION state[President*State]
MEANING "The state in which a president was born is registered in this relation."

RELATION capital[State*City]
MEANING "The capital of a state is registered in this relation."

Add multiplicity constraints

For making a data model, you need to do one more thing: decide which relations must be constrained to unique elements. Consider for example the fact that everyone is born in at most one state. A duplicate state of birth is therefore considered a mistake. President Van Buren cannot have been born both in New York and in Maine. We can impose that on a data model by stating that RELATION state[President*State] must be univalent:

RELATION state[President*State] [UNI]

Two properties are relevant for the data model: univalent (UNI) and injective (INJ):

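Relation r[A*B] [UNI] means that there is at most one pair $(a,b)$ in the relation for every $a$ in A.
Relation r[A*B] [INJ] means that there is at most one pair $(a,b)$ in the relation for every $b$ in B.
Relation r[A*B] [UNI,INJ] means the relation is both univalent and injective.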
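Because a population is a set of pairs, both properties can be checked mechanically. The following Python sketch is an illustration only and not part of Ampersand; the function names `is_univalent` and `is_injective` and the population `conflicting` are made up for this example. It uses the Van Buren case mentioned above and the population of capital[State*City].

```python
from collections import Counter


def is_univalent(relation):
    """True if every source element occurs in at most one pair."""
    sources = Counter(a for a, _ in relation)
    return all(count <= 1 for count in sources.values())


def is_injective(relation):
    """True if every target element occurs in at most one pair."""
    targets = Counter(b for _, b in relation)
    return all(count <= 1 for count in targets.values())


# Hypothetical population that records two states of birth for one president.
conflicting = {("Van Buren", "New York"), ("Van Buren", "Maine")}

# Population of capital[State*City] as given in the text.
capital = {("Kentucky", "Frankfort"), ("Hawaii", "Honolulu"),
           ("Vermont", "Plymouth"), ("New York", "New York"),
           ("Georgia", "Atlanta")}

print(is_univalent(conflicting))                        # False
print(is_univalent(capital) and is_injective(capital))  # True
```

Note that the conflicting population is still injective: each state occurs only once as a target. Univalence and injectivity are independent properties, which is why they are declared separately.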

We have to do this for every relation:

CONTEXT Presidents

RELATION firstname[President*Name] [UNI]
MEANING "The first name of a president is registered in this relation."

RELATION lastname[President*Surname]
MEANING "The last name of a president is registered in this relation."

RELATION birth[President*Date] [UNI]
MEANING "The date of birth of a president is registered in this relation."

RELATION state[President*State] [UNI]
MEANING "The state in which a president was born is registered in this relation."

RELATION capital[State*City] [UNI,INJ]
MEANING "The capital of a state is registered in this relation."

ENDCONTEXT

Five decisions have been made here:

We will register only one first name for every president, as a result of constraining the relation firstname to be univalent.
The system may register multiple last names, as a result of not imposing any constraints on the relation lastname.
Only one date of birth will be registered, as a result of constraining the relation birth to be univalent.
Only one state of birth will be registered, as a result of constraining the relation state to be univalent.
Every state has only one capital city and every city is capital to only one state, as a result of constraining the relation capital to be univalent and injective.
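To see why these decisions matter for a data model, consider one common way of laying out tables: every univalent relation becomes a column in the table of its source concept, and every other relation gets a table of its own. This layout rule is an assumption of the sketch below, not a description of how Ampersand's generator actually works, and the function name `sketch_tables` is made up for this illustration.

```python
# Relation declarations from the Presidents context:
# (name, source concept, target concept, declared properties).
declarations = [
    ("firstname", "President", "Name", {"UNI"}),
    ("lastname", "President", "Surname", set()),
    ("birth", "President", "Date", {"UNI"}),
    ("state", "President", "State", {"UNI"}),
    ("capital", "State", "City", {"UNI", "INJ"}),
]


def sketch_tables(declarations):
    """Map table names to column lists, following the assumed layout rule."""
    tables = {}
    for name, source, target, properties in declarations:
        if "UNI" in properties:
            # At most one value per source element, so one column suffices.
            tables.setdefault(source, [source]).append(name)
        else:
            # Several values per source element need a separate link table.
            tables[name] = [source, target]
    return tables


for table, columns in sketch_tables(declarations).items():
    print(table, columns)
```

Under this assumption, firstname, birth and state end up as columns of a President table, capital becomes a column of a State table, and lastname needs its own two-column table because a president may have several last names.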

Bonus

In this video you sit in on a private lecture in Dutch on data modeling. Most of the above is discussed in 32 minutes. It may help you to get a different perspective on the theory above.

Assignment

Generate a functional specification from this script, open the generated document, and look up what your data model looks like.
Make a data analysis of a small problem of your own choosing; then generate a functional specification from that script.

What have you learned?

After finishing your assignment, you have learned:

why it makes sense to analyse actual data samples for creating a data model;
how to analyze spreadsheet data and produce relations from it;
why it is necessary to document meaning for each relation;
how to constrain relations with univalence and injectivity;
how easily mistakes are made (by using the Ampersand compiler in the assignment);
how Ampersand's messages help you fix mistakes;
how to make Ampersand create a data model based on your data analysis.

Legal modeling

Purpose

The purpose of legal modeling is usually software related. The most popular use is to obtain decision rules that can be executed by a computer. An often-mentioned problem is the legal language of the legislative domain that is being modeled. A modeler who is unfamiliar with the domain language will need trial and error to formulate decision rules that are both executable and legally compliant. Another problem is that the modeler is sometimes legally educated, sometimes technically educated, but rarely both. It takes a modeler who is both technically and legally competent to make this trial-and-error process converge. This explains why legal modeling can be very time-consuming.

to be continued

Architecture modeling

Metamodeling

Limitations of Ampersand

Data modeling

Example

Let us start by looking at an example:

Since Ampersand works with relations, it must represent this table as relations. Three relations can do the job in the following manner:

POPULATION firstname[President*Name] CONTAINS
  [ ("1", "Abraham")
  , ("2", "Barack")
  , ("3", "Calvin")
  , ("4", "Dwight")
  ]

POPULATION lastname[President*Surname] CONTAINS
  [ ("1", "Lincoln")
  , ("2", "Obama")
  , ("3", "Coolidge")
  , ("4", "Eisenhower")
  ]

POPULATION birth[President*Date] CONTAINS
  [ ("1", "February 12, 1809")
  , ("2", "August 4, 1961")
  , ("3", "July 4, 1872")
  , ("4", "October 14, 1890")
  ]

Extract relations from tables

When things get bigger, it is useful to draw the relations on a whiteboard or in your notebook. This helps you keep an overview. Here is how it is done:

Adapt as needed

Suppose we have a second table, which also has information about presidents:

This table is similar with respect to the interpretation of a row: here too, each row represents a president. However, the presidents aren't numbered in this table, so we have to add these numbers.

Numbering rows has the advantage that it is easier to recognise a president.

POPULATION state[President*State] CONTAINS
  [ ("1", "Kentucky")
  , ("2", "Hawaii")
  , ("3", "Vermont")
  , ("5", "New York")
  , ("6", "Georgia")
  ]

POPULATION lastname[President*Surname] CONTAINS
  [ ("1", "Lincoln")
  , ("2", "Obama")
  , ("3", "Coolidge")
  , ("4", "Eisenhower")
  , ("5", "Roosevelt")
  , ("6", "Carter")
  ]

POPULATION capital[President*City] CONTAINS
  [ ("1", "Frankfort")
  , ("2", "Honolulu")
  , ("3", "Plymouth")
  , ("5", "New York")
  , ("6", "Atlanta")
  ]

Notice that this deviates slightly from the previous recipe. Instead of making a new relation president[President*President], we have reused the relation lastname. By doing so, we have interpreted the third column of the spreadsheet as the last name of the president. More importantly, we have reused an earlier relation. The drawing can also be extended:

POPULATION capital[State*City] CONTAINS
  [ ("Kentucky", "Frankfort")
  , ("Hawaii", "Honolulu")
  , ("Vermont", "Plymouth")
  , ("New York", "New York")
  , ("Georgia", "Atlanta")
  ]

This example illustrates that you may find "strange things" in your data samples. You can fix them as you go, as we did with the relation capital in this example.

Add Meaning

replace a relation with a complicated meaning by a simpler one;
document the meaning of each relation.

RELATION firstname[President*Name]
MEANING "The first name of a president is registered in this relation."

RELATION lastname[President*Surname]
MEANING "The last name of a president is registered in this relation."

RELATION birth[President*Date]
MEANING "The date of birth of a president is registered in this relation."

RELATION state[President*State]
MEANING "The state in which a president was born is registered in this relation."

RELATION capital[State*City]
MEANING "The capital of a state is registered in this relation."

The meaning of a relation gives guidance to the reader in the way we should speak about the contents of the relation. For instance, if the pair $(p,s)$ is a pair from RELATION state[President*State], the reader should interpret that as "President $p$ was born in $s$."
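Reading pairs as sentences can be done mechanically once the phrasing is fixed. The following Python sketch is an illustration outside Ampersand; the function name `phrase` and the template placeholders `src` and `tgt` are made up for this example. It uses part of the population of state[President*State] given earlier.

```python
# Part of the population of state[President*State] from the text.
state = {("1", "Kentucky"), ("2", "Hawaii"), ("3", "Vermont")}


def phrase(relation, template):
    """Render every pair of a relation as a sentence, in sorted order."""
    return [template.format(src=a, tgt=b) for a, b in sorted(relation)]


for sentence in phrase(state, "President {src} was born in {tgt}."):
    print(sentence)
# President 1 was born in Kentucky.
# President 2 was born in Hawaii.
# President 3 was born in Vermont.
```

Sentences like these are easy to show to stakeholders, who can confirm or reject them without reading the formal notation.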

Add multiplicity constraints

RELATION state[President*State] [UNI]

Two properties are relevant for the data model: univalent (UNI) and injective (INJ):

Relation r[A*B] [UNI] means that there is at most one pair $(a,b)$ in the relation for every $a$ in A.
Relation r[A*B] [INJ] means that there is at most one pair $(a,b)$ in the relation for every $b$ in B.
Relation r[A*B] [UNI,INJ] means the relation is both univalent and injective.

We have to do this for every relation:

CONTEXT Presidents

RELATION firstname[President*Name] [UNI]
MEANING "The first name of a president is registered in this relation."

RELATION lastname[President*Surname]
MEANING "The last name of a president is registered in this relation."

RELATION birth[President*Date] [UNI]
MEANING "The date of birth of a president is registered in this relation."

RELATION state[President*State] [UNI]
MEANING "The state in which a president was born is registered in this relation."

RELATION capital[State*City] [UNI,INJ]
MEANING "The capital of a state is registered in this relation."

ENDCONTEXT

Five decisions have been made here:

We will register only one first name for every president, as a result of constraining the relation firstname to be univalent.
The system may register multiple last names, as a result of not imposing any constraints on the relation lastname.
Only one date of birth will be registered, as a result of constraining the relation birth to be univalent.
Only one state of birth will be registered, as a result of constraining the relation state to be univalent.
Every state has only one capital city and every city is capital to only one state, as a result of constraining the relation capital to be univalent and injective.

Note that we can wrap the relation definitions in a CONTEXT and run it on RAP3. Ampersand will produce the following data model:
