To represent a real-world thing in an information system context, you use atoms.
An atom refers to an individual object in the real world, such as the student called "Caroline". But what if there are three different Carolines? What does it mean to say "Caroline has passed the exam for Spanish Medieval Literature"? This sentence might be true for one Caroline but false for the others. Clearly, to avoid ambiguous sentences, an atom must identify exactly one real-world object, no more, no less. Or rather, it suffices that the atom identifies one object within the context in which we are working: if the context is a group with only one Caroline, there is no ambiguity. Similarly, ABBA is unique among all pop groups in the world, there ought to be only one building permit with number 5678, and so on.
Examples of atoms: `"Caroline"`, `5`, `1917-11-07`
Examples of numeric atoms: `48`, `10.34`, `2.`, `.001`, `-125`, `+5.33333`, `2.5E2`, `5E-3`
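To get a feel for the numeric notations listed above, the following minimal sketch reads them with Python's built-in `int` and `float`. This is only an illustration: Python happens to accept the same notations, but it is not Ampersand's own parser, and the helper name `parse_number` is hypothetical.

```python
# The numeric example atoms from the text, kept as strings.
literals = ["48", "10.34", "2.", ".001", "-125", "+5.33333", "2.5E2", "5E-3"]


def parse_number(text):
    """Return an int for a plain whole number, otherwise a float."""
    try:
        return int(text)
    except ValueError:
        # Covers decimal points, leading signs and exponent notation.
        return float(text)


values = [parse_number(text) for text in literals]

for text, value in zip(literals, values):
    print(f"{text:>10} -> {value!r}")
```

Note that a trailing or leading decimal point (`2.`, `.001`) and an explicit plus sign are all accepted, and that exponent notation such as `2.5E2` denotes 250.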
The syntax of atoms is largely taken from ISO 8601 and corresponds to the syntax of SQL and Excel. (Acknowledgement: the following text was adapted from Wikipedia.)
Date and time values are ordered from the largest to the smallest unit of time: year, month (or week), day, hour, minute, second, and fraction of second. The lexicographical order of the representation thus corresponds to chronological order, except for date representations involving negative years. This allows dates to be sorted naturally by, for example, file systems.
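The ordering property can be checked with a small sketch. It uses only Python's standard `datetime` module. The sample dates are arbitrary, and the day-first layout is included purely as a counter-example.

```python
from datetime import date

stamps = ["2004-05-17", "1917-11-07", "2004-01-31", "1999-12-31"]

# Sort the ISO 8601 strings as plain text, without parsing them as dates.
sorted_as_text = sorted(stamps)

# Sort the same values as real dates, then render them back to ISO 8601.
chronological = sorted(date.fromisoformat(s) for s in stamps)
sorted_as_dates = [d.isoformat() for d in chronological]

# Counter-example: a day-first layout does not sort chronologically as text.
day_first_dates = [d.strftime("%d-%m-%Y") for d in chronological]
day_first_text = sorted(day_first_dates)

print(sorted_as_text == sorted_as_dates)  # largest-unit-first layout
print(day_first_text == day_first_dates)  # day-first layout
```

The first comparison holds because the most significant unit comes first and every field has a fixed width. The second fails because the text sort looks at the day before the year.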
Each date and time value has a fixed number of digits that must be padded with leading zeros.
Representations can be written in one of two formats: a basic format with a minimal number of separators, or an extended format with separators added to enhance human readability. The separator used between date values (year, month, week, and day) is the hyphen, while the colon is used as the separator between time values (hours, minutes, and seconds).
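As an illustration of the two formats, the sketch below reads calendar dates and times in either the basic or the extended form. The helper names `parse_date` and `parse_time` are hypothetical, and the sketch covers only full `YYYYMMDD` / `YYYY-MM-DD` dates and `hhmmss` / `hh:mm:ss` times, not week dates or other ISO 8601 variants.

```python
from datetime import datetime


def parse_date(text):
    """Parse a calendar date in extended (YYYY-MM-DD) or basic (YYYYMMDD) form."""
    fmt = "%Y-%m-%d" if "-" in text else "%Y%m%d"
    return datetime.strptime(text, fmt).date()


def parse_time(text):
    """Parse a time in extended (hh:mm:ss) or basic (hhmmss) form."""
    fmt = "%H:%M:%S" if ":" in text else "%H%M%S"
    return datetime.strptime(text, fmt).time()


# Both notations denote the same value.
print(parse_date("20040517") == parse_date("2004-05-17"))
print(parse_time("134530") == parse_time("13:45:30"))
```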
For reduced accuracy, any number of values may be dropped from any of the date and time representations, but in the order from the least to the most significant. For example, "2004-05" is a valid ISO 8601 date, which indicates May (the fifth month) 2004. This format will never represent the 5th day of an unspecified month in 2004, nor will it represent a time-span extending from 2004 into 2005.
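Reduced accuracy can be sketched as follows, assuming only the extended calendar forms `YYYY`, `YYYY-MM` and `YYYY-MM-DD`. The names `FORMATS`, `NAMES` and `parse_reduced` are hypothetical. The point is that values are dropped from the least significant end only, so `2004-05` always means a year and a month.

```python
from datetime import datetime

# Text length -> (strptime format, number of components present).
FORMATS = {
    4: ("%Y", 1),
    7: ("%Y-%m", 2),
    10: ("%Y-%m-%d", 3),
}
NAMES = ("year", "month", "day")


def parse_reduced(text):
    """Return only the components that the representation actually states."""
    if len(text) not in FORMATS:
        raise ValueError(f"unsupported date representation: {text!r}")
    fmt, count = FORMATS[len(text)]
    parsed = datetime.strptime(text, fmt)
    parts = (parsed.year, parsed.month, parsed.day)[:count]
    return dict(zip(NAMES, parts))


print(parse_reduced("2004"))
print(parse_reduced("2004-05"))
print(parse_reduced("2004-05-17"))
```

A string such as `05-17`, which would drop the most significant value instead, is rejected by this sketch.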
If necessary for a particular application, the standard supports the addition of a decimal fraction to the smallest time value in the representation.
Atoms are represented in an SQL database. For this purpose, every atom has a type (sometimes called the technical type). The representation in SQL is given in the following table.
The last column, eq, tells whether Ampersand implements equality on these types. If equality is not defined, the operators `\/`, `/\`, `-`, `\`, `/`, `;`, and `<>` cannot be used.
The distinction between closed and open types is relevant in the following situations:
- The complement of a relation, `-r[A*B]`, is defined only if both `A` and `B` are closed.
- The full relation, `V[A*B]`, is defined only if both `A` and `B` are closed.
- A service `INTERFACE X : e` requires that the target of `e` is closed.
Violations are currently signaled at runtime, but future versions of Ampersand will signal these violations at compile time.
Every atom whose atomic type is marked "yes" in the column "eq" can be compared for equality. For all other atoms, equality is not defined.
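The rules about the closed and eq columns amount to simple lookups in the type table of this section. The sketch below encodes that table in Python and expresses each rule as a predicate. It is only an illustration of the rules, not how Ampersand implements them, and all names (`TechType`, `TYPES`, and the four functions) are hypothetical.

```python
from typing import NamedTuple


class TechType(NamedTuple):
    sql: str      # SQL representation
    closed: bool  # "closed" column of the type table
    eq: bool      # "eq" column of the type table


# Values copied from the type table in this section.
TYPES = {
    "ALPHANUMERIC": TechType("VARCHAR(255)", True, True),
    "BIGALPHANUMERIC": TechType("TEXT", False, True),
    "HUGEALPHANUMERIC": TechType("MEDIUMTEXT", False, False),
    "PASSWORD": TechType("VARCHAR(255)", False, True),
    "BINARY": TechType("BLOB", False, False),
    "BIGBINARY": TechType("MEDIUMBLOB", False, False),
    "HUGEBINARY": TechType("LONGBLOB", False, False),
    "DATE": TechType("DATE", True, True),
    "DATETIME": TechType("DATETIME", True, True),
    "BOOLEAN": TechType("BOOLEAN", True, True),
    "INTEGER": TechType("BIGINT", True, True),
    "FLOAT": TechType("FLOAT", False, False),
    "Object": TechType("VARCHAR(255)", True, True),
}


def complement_defined(source, target):
    """-r[A*B] is defined only if both A and B are closed."""
    return TYPES[source].closed and TYPES[target].closed


def full_relation_defined(source, target):
    """V[A*B] is defined only if both A and B are closed."""
    return TYPES[source].closed and TYPES[target].closed


def interface_target_allowed(target):
    """INTERFACE X : e requires that the target of e is closed."""
    return TYPES[target].closed


def supports_equality(type_name):
    """Atoms can be compared for equality only if eq is 'yes'."""
    return TYPES[type_name].eq


print(complement_defined("DATE", "INTEGER"))
print(complement_defined("DATE", "FLOAT"))
print(supports_equality("PASSWORD"), supports_equality("FLOAT"))
```

For example, a relation between `DATE` and `FLOAT` has no defined complement, because `FLOAT` is not closed.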
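An Ampersand statement declares the atomic type of a concept. The declared type carries over to more specific concepts: e.g. if `LegalEntity` is declared as `ALPHANUMERIC`, and `Person` and `Company` are both `LegalEntity`, then both of them will be implicitly declared as `ALPHANUMERIC` too.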
| type | purpose | SQL | closed | eq |
| --- | --- | --- | --- | --- |
| ALPHANUMERIC | to represent strings of short length, i.e. less than 255 characters | VARCHAR(255) | yes | yes |
| BIGALPHANUMERIC | to represent large strings of limited length, i.e. less than 64 kb | TEXT | no | yes |
| HUGEALPHANUMERIC | to represent strings of arbitrary length | MEDIUMTEXT | no | no |
| PASSWORD | to represent passwords in a secure way | VARCHAR(255) | no | yes |
| BINARY | to represent uninterpreted binary data of short length | BLOB | no | no |
| BIGBINARY | to represent large binary data of limited length | MEDIUMBLOB | no | no |
| HUGEBINARY | to represent large binary data of arbitrary length | LONGBLOB | no | no |
| DATE | to represent dates compatible with ISO 8601 | DATE | yes | yes |
| DATETIME | to represent timestamps compatible with ISO 8601 | DATETIME | yes | yes |
| BOOLEAN | to represent True and False values | BOOLEAN | yes | yes |
| INTEGER | to represent positive and negative whole numbers in the range [-2^63 .. 2^63 - 1] | BIGINT | yes | yes |
| FLOAT | to represent floating-point numbers compatible with ISO 8601 | FLOAT | no | no |
| Object | to represent a key value for objects; it is not meant to be visible to end-users | VARCHAR(255) | yes | yes |
|  | all other atoms | VARCHAR(255) | yes | yes |