The Excel Importer
The Excel Importer allows you to import data from an Excel file in a (working) prototype, effectively adding it to the population of a running prototype. This text is meant for the Ampersand user who has more population than can be dealt with by POPULATION
statements. It is also useful if you have existing spreadsheets filled with tables, which you want to use as population in your Ampersand-script. Please note that some editing of your Excel-file may be required, to make Ampersand understand how columns match Ampersand-relations.
Installation
Nothing special is required for installation; this extension is enabled by default
Importing Excel files in a running prototype
If the user has the required access rights to import a population (by default no login/roles are required), the excel import is available via the Applications icon in the right half of the menu bar.
How to create importable Excel files
When you have a prototype running for an Ampersand context, you can import data for that prototype from an Excel file, which effectively adds the population specified in the Excel file to the population that is currently already in the database. This section describes how to construct an Excel file that can be used to do this.
There are 2 ways to import an Excel file. One is by using an appropriate INTERFACE definition (automatically detected based on sheet name equals INTERFACE name). The benefit of using this method is that you do not have to change your Excel files when you modify the names of relations in your amperand model. It is also more readable for third parties. The other is the (traditional) [import block] syntax.
Using an INTERFACE defintion
To be further specified. Functionality is already implemented.
This method is used to import data from a page in a spreadsheet document whenever the title of the page is the name of an INTERFACE that you have defined in your script. So if your page is called Accounts
, and you have defined
your page could look like this:
Account
Username
Password
Role
_NEW
Student1
Stud1PW
Student
_NEW
S2
Stud2PW
Student
Impoprting this page will create two accounts, one for Student1 and another for S2.
As you can see, the first cell must have the concept of the INTERFACE (Account
), and subsequent header fields have the names of the labels in the INTERFACE. You can change the order of the columns, as long as the first column is left as is.
Using the [import block] syntax
Let us consider a small (useless) Ampersand model, defined as follows:
If you want to specify data elements in an Excel file in order to populate such a model, you must define so called 'blocks' that contain this data. The importer looks for such blocks throughout the Excel file (meaning that you can have blocks on different sheets - all sheets will be inspected).
Here is an example of such a 'block' (note: all blocks must start in the leftmost column; if not, they are disregarded):
[A's]
rAA
rAB
rAC
rAC
rAC
sAB
tAD
uBA~
A
A
B
C
C
C
B
Delta
B
alfa1
alfa1
beta1
char1
char2
char3
beta2
d1
beta1
CMT
beta2
alfa2
beta2
char2
d2
beta3
alfa3
char4
char3
char2
beta1
d1
Here is the specification of a block: 1. A 'block' consists of 2 header rows followed by lines of data. A 'block' terminates whenever a next block starts or the end of a sheet is reached. Empty lines are disregarded. 2. Every cell in the leftmost column whose contents starts with the character '[' is the first cell in the first header row of a block. The contents of this cell is further disregarded. 3. Subsequent cells in the first header row must either be empty, or contain the name of a relations that is known in your Ampersand model, optionally followed by a ~
(flip) character. In our example, such cells may hence only contain '', rAA
, rAB
, rAC
, sAB
, tAD
or uBA~
. 4. The second header row only contains cells that are either empty or contain a concept name, or contain a concept name and delimiter. In our example, such cells may only contain '', A
, B
, C
, or Delta
. 5. The first cell (in the second header row) must contain the source (left) concept of all relations specified in the first header row. It may not be empty. 6. Every subsequent cell (in the second header row) must either be empty, or contain the name of the target (right) concept of the relation that is specified in the same column in the first header row. 7. There is an exception to the previous two specifications: if a cell in the first header row specifies a flipped relation, the leftmost cell in the second header row specifies the target concept for the (unflipped) relation, and the cell below the flipped relation specifies the source concept for the (unflipped) relation. In the example, all relations in the first header row have source concept A
, except for relation uBA
, which has concept A
as its target concept, and B
as its source concept. 8. Every subsequent row in this block is called a data row. Cells in a data row are either empty or non-empty. If a non-empty cell contains a formula, this formula is evaluated to obtain the cell contents. If a non-empty does not contain a formula, its contents is obtained as is (see notes for errors in formula-evaluation). From here on, when we talk about 'the contents of a cell', the obtained value from (evaluating the term in) that cell is meant.
Data rows are interpreted as follows:
When the first cell in a data row is empty, the content of all other cells in that row is disregarded (you may use such cells to include comments, computations, or whatever else you like)
When the first cell in a data row is not empty, the content of all other non-empty cells and the content of the first cell may define a set of pairs (srcAtom,tgtAtom), each of which is to be inserted into the population of the Ampersand model, where
'srcAtom' is the contents of the first cell
'tgtAtom' is the (untrimmed) contents of a non-empty cell
for the relation to which the pair (srcAtom,tgtAtom) is to be added,
its name is specified in the first row of the block in the same column as 'tgtAtom'
its SRC Concept is specified on the second header row in the first column;
its TGT Concept is specified on the second header row in the same column as 'tgtAtom'.
This means that the example is equivalent with the following population specification (note that the cell containing 'CMT' is disregarded as it is comment):
NOTES
You need NOT know about the internals of the database to use this plugin (at least, that's the idea).
You may specify formulae instead of texts. The result of the formula will be read (and converted to text) before being inserted into the database. This allows for dynamic construction of identifiers, precomputation of tables, date adaptations to the date of today, etc. Note, however, that this does not always work flawlessly. In particular, the functions
VLOOKUP
andHLOOKUP
are known to produce errors (that we are not capable of fixing), so such functions should be avoided.If you use '_NEW' in the first column, the (dirty) identifier for the atom will be automatically generated. If you use '_NEW' in a subsequent column on the same row, this will be replaced with the (dirty) identifier for the source atom (which you can use e.g. to populate property-relations). Note that as we also support formulae, you may use those to achieve the same result (and excercise control over the actual dirty identifiers used)
It is possible to store all sorts of data in the spreadsheet that will not interfere with the database population. The contents of the following cells is disregarded and can therefore be used for other purposes:
cells in a row whose first cell is empty.
cells in a column where the cell that specifies the relation name or the TGT concept is empty.
When you use something like 'CLASSIFY X ISA Y' in your model, and want to populate an atom 'xy', then you should populate it in the block where 'X's are populated. In this block, you can not only populate relations that have source concept X, but also relations that have source concept Y.
That's all, Folks!
Last updated