Main Memory Execution

In the following we describe the #import and #export built-ins of the DLVDB system. We will then also give a short introduction of the ODBC library and show how to configure a database system (PostgreSQL) to interface with DLV.

Syntax

We extend DLV with an ODBC interface by adding two new built-in commands to import and export relations (predicates) from/to a DBMS. These commands take a number of arguments providing the information needed for DBMS authentication, the relational predicate to import/export from/to the DBMS, and the name of the table in the DBMS.

The #import command

The #import command reads tuples from a specified table of a relational database and stores them as facts (EDB program) with a predicate name p provided by the user. The name of the imported atoms is set to p, and defines a part of the EDB program. Further EDB predicates can be added by providing input via text files. Since DLV supports only supports unsigned integer and constant data types the #import command takes a parameter which specifies a type conversion for every column of the table.

The #import command is of the form:

#import(databasename,"username","password","query", predname, typeConv).

where:

  1. databasename is the name of the database server;
  2. username defines the name of the user who connects to the database (the string must be enclosed by ” “);
  3. password defines the password of that user (the string must be enclosed by ” “);
  4. query as an SQL statement that constructs the table that will be imported (and must be quoted by ” “);
  5. predname defines the name of the predicate that will be used;
  6. typeConv specifies the conversion for mapping DBMS data types to DLV data type; it provides a conversion for each column imported by the database.

The typeConv parameter is a string with the following syntax: type: Conv [, Conv], where type: is a string constant and Conv is one of several conversion types:

  • U_INT: the column is converted to an unsigned integer;
  • UT_INT: the column is truncated to an unsigned integer;
  • UR_INT: the column is rounded to an unsigned integer;
  • CONST: the column is converted to a string without quotes;
  • Q_COST: the column is converted to a string with quotes.

The number of the entries in the conversion list has to match the number of columns in the selected table. Strings converted as CONST should be valid DLV constants (e.g. they must not contain spaces).

The #export command

The #export built-in allows exporting the extension of a predicate in an answer set to a database. Every atom (for that predicate) which is true in the answer set will lead to the insertion of a corresponding tuple into the database.

The #export command comes in two variants. The first is of the form:

#export(databasename, "username", "password", predname, tablename).

The second variant adds another parameter, “REPLACE where SQL-Condition” , which replaces the tuples in the table tablename for which SQL-Condition holds. It allows adding tuples to the table without creating conflicts whenever such tuples would violate some integrity constraint of the database (e.g. duplicate values for a key attribute):

#export(databasename, "username", "password", predname, tablename, "REPLACE where SQL-Condition").

where

  1. databasename is the name of the database server;
  2. username is the name of the user who connects to the database (the string must be enclosed by ” “);
  3. password provides the password of that user (the string must be enclosed by ” “);
  4. predname defines the name of the predicate that will be exported;
  5. tablename defines the name of the target table; it must be already present in the database, with the correct number of attributes;
  6. “REPLACE where SQL-Condition” contains the keywords REPLACE and where followed by an SQL-Condition which indicates the tuples which shall be deleted from the relational table before the export takes place.

Remark: Export must be used with the option -n=1.

The command line to evaluate the logic program (whit #import and #export command) in main-memory is:

dl program_ie.dlv

where, dl is an executable version of DLVDB and program_ie.dlv is a text file containing the logic program and the necessary #import/#export commands.

Note that, if the program query has some constants, you should prefer to activate the automatic magic-sets optimization, as:

dl -ODMS program_ie.dlv

For more details on magic-sets, pleas visit the official DLV Website.

Direct DB Execution

In the following we describe the direct database evaluation functionality of the DLVDB system. We will also give a necessary introduction of the database mapping directives and show how to prepare and configure dlv to be interfaced with a database system (e.g PostgreSQL or Oracle) in order to evaluate logic programs directly on the db. The present version of this functionality supports normal stratified logic programs plus aggregate functions, with input data (the facts) stored in possibly distributed databases.

Auxiliary Directives

The Auxiliary Directives are a set of dlv system directives. They must be written in a separate file that must have a “typ” extension. The Auxiliary Directives are composed by three sections. In the first section (Init-Section), we define the database connection to the working database. The working database is the DBMS on which the logic program must be evaluated. In the second section (Table-Definition), we define the “database table”/”logic atom” mappings. This mapping is needed in order to define the correspondences between atoms of the logic programs and tables of the distributed databases. In the last section (Final-Section), we define the output directive for storing some (or all) of the results of the program execution.

BNF Syntax of Auxiliary Directives

Init-Section ::=

1.USEDB databaseName:username:password [System-Like]?.

System-Like ::=

LIKE [POSTGRES | ORACLE | DB2 | SQLSERVER | MYSQL]

Table-Definition (EDB definition)::=

2. USE tableName [( attribute [, attribute]*)]?

3. [AS ("SQL-Statement")]?

4. [FROM DatabaseName:UserName:Password]?

5. MAPTO predName [( type [, type] )]? [ALLOW_APPEND(1)]?.

Table-Definition (IDB definition)::=

6. CREATE tableName [(attribute [, attribute]*)]?

7. MAPTO predName [( type [, type] )]?

8. [KEEP_AFTER_EXECUTION]?.

Table-Definition (Query definition)::=

9. QUERY tableName.

Final-Section ::=

10. DBOUTPUT DatabaseName:UserName:Password.

|

11. OUTPUT [Write-Option]? predName

12. [AS AliasTableName]?

13. IN DatabaseName:UserName:Password.

Write-Option :: =

14. APPEND

|

15. OVERWRITE

(1) ALLOW_APPEND clause is allowed only if the table is on the working database.
In the following, we describe how to compose Auxiliary Directives. The USEDB command (line 1.) is needed to define the connection properties with the working database. A possible use of this directive, when the database name is MyDatabase (e.g. on Postgres DBMS) and the access is granted to the user scott with password tiger, is:

USEDB MyDatabase:scott:tiger LIKE POSTGRES.

The USEDB is the only non optional directive.

The LIKE directive is optional, but it is highly recommended if the working database is among the listed one. Indeed, this directive allows our system both to use SQL dialect of the specified system (guaranteeing full compatibility with DLVDB) and to optimize the interaction of the system with the working database (to possibly enhance performances). If the LIKE directive is not used, standard SQL and ODBC functionalities are exploited (in particular, full compatibility with POSTGRES is guaranteed).

Note that, currently, the list of DBMS fully compatible as working database for DLVDB is:

  • POSTGRESQL 8.0 and higher
  • MySQL 5.0
  • ORACLE 10g
  • SQL Server 2005
  • DB2 UDB Version 8.2

For instance, suppose we have a table such as my_table(lname varchar(20), fname char(20)) on remote_database and our logic program defines the atom my_atom as my_atom(X, Y). If we need to specify a mapping between my_table and my_atom, we have to define the following Auxiliary Directive:

USE my_table (lname, fname) FROM remote_database:dlv:db MAPTO my_atom (varchar(20), char(20)).
On the other hand, when a [AS ("SQL-Statement")] (3.) clause is specified, the tablename is expected to be updated by means of the tuples resulting from execution of the "SQL-Statement". That is, the arity of the table tablename and the predicate predName must be compliant with both arity and types of the result set obtained.
The IDB definition is similar to the EDB definition. With the CREATE directive, the system creates an IDB table tableName and defines the mapping with the logic atom predName. This table will be created into the working database. The directive at line 8., if present, defines that the table is not deleted at the end of the evaluation. For example, if we need to map a logic IDB my_idb_predicate with a database table, we must define the following Auxiliary Directive:

CREATE my_idb_table (preferred_attribute_name_1, preferred_attribute_name_1) MAPTO my_idb_predicate (varchar(20), char(20)).

Noted that the just created table will be deleted after the execution end (this could be useful for intermediate or support tables). However, if we need this table to be persistent (e.g., because it store useful information we want to further analyze), than the correct definition is:

CREATE my_idb_table (preferred_attribute_name_1, preferred_attribute_name_1) MAPTO my_idb_predicate (varchar(20), char(20)) KEEP_AFTER_EXECUTION.
Intuitively, an Use directive is intended to map a predicate whose extension cannot change during processing. That is, the associated table is read-only. On the other hand, a Create directive is conceived to create a new table on the working database in order to store new data inferred for the related predicate. In case we need both accessing tuples within a table and adding new tuples derived during the computation, we have to explicitly allow for “append” operations. In particular, we have to specify an USE ... ALLOW_APPEND directive. For instance :

USE my_table (lname, fname) FROM working_database:dlv:db MAPTO my_atom (varchar(20), char(20)) ALLOW_APPEND.
The QUERY option, allows to define the name of the query table (i.e the table storing the results of the query possibly present in the logic program). In the Final-Section we introduce the OUTPUT directive. This directive is necessary in order to redirect the output of the program computation. With line 10., the working database in which the program stores the computation, is specified into the database DatabaseName. If we need to store only a specified predName (noted that predName is mapped to a table) then, we must use the directive 11. Specifically, if we need to store the content of the predicate my_predicate after the execution to the database remote_database (grant access to user pippo with password pluto), and I need to rename the table to my_new_table and I need to APPEND the new data to the data possibly already present in that table, I must use the following Mapping Directive:

OUTPUT APPEND my_predicate AS my_new_table IN remote_database:pippo:pluto.

The command line needed to evaluate the logic program in mass-memory (on a DBMS) is:

dl -DBSupport -ORdr- program.dlv directives.typ

where, dl is an executable version of DLVDB, program.dlv is a text file containing the logic program and directives.typ is a text file containing the auxialry directives.

Note that, if the program query has some constants, you should prefer to activate the automatic magic-sets optimization, as:

dl -DBSupport -ORdr- -ODMS program.dlv directives.typ

For more details on magic-sets, pleas visit the official DLV Website.

ODBC Set Up

DLVDB provides an interface to database systems via ODBC (Open Database Connectivity). For Unix-like systems, this is achieved by using unixODBC, while for Windows systems Microsoft ODBC is used.

Set up unixODBC

Now let’s set up unixODBC (we again assume that it has already been installed correctly). The relevant configuration files are /etc/odbcinst.ini and /etc/odbc.ini. For further information or finding out what to do for configuring unixODBC refer to http://www.unixodbc.org.

We will give a brief example in which we set up unixODBC with a PostgreSQL database system. Note that this is not at all related to DLV. It is not important, which database is “behind” the ODBC interface, PostgreSQL is just an example. You should in any case consult the respective manuals for your database, unixODBC or the Windows ODBC subsystem when creating your ODBC setup.

/etc/odbcinst.ini defines the drivers. This might have been configured already by your system upon installation; on our test system it looks as follows:

[PostgreSQLa]

Description = PostgreSQL ODBC ANSI driver

Driver = /usr/lib/odbc/psqlodbca.so

Setup = /usr/lib/odbc/libodbcpsqlS.so

UsageCount = 1

CPTimeout =

CPReuse =

/etc/odbc.ini defines the database access. For the database we have just created, this looks as follows:

[test-ansi]

Description = ODBC DSN test-ansi

Driver = PostgreSQLav

Trace = No

TraceFile =

Database = test

Servername = localhost

Username = dlvodbc

Password = dlvodbc

Port = 5432

Protocol = 6.4

ReadOnly = No

RowVersioning = No

ShowSystemTables = No

ShowOidColumn = No

FakeOidIndex = No

ConnSettings =

The name in square brackets is the ODBC DSN (Data Source Name). The driver must match the ID defined in /etc/odbcinst.ini; User, Password, and Database are those of our PostgreSQL setup. The Server is our own machine (localhost), using the standard PostgreSQL port 5432.

Set up ODBC on Windows system

The ODBC Data Source Administrator within Windows enables you to create DSNs, check driver installation and configure ODBC systems such as tracing (used for debugging) and connection pooling.

Different editions and versions of Windows store the ODBC Data Source Administrator in different locations depending on the version of Windows that you are using.

To open the ODBC Data Source Administrator on Windows XP:

  1. On the Start menu, click Control Panel.
  2. In the Control Panel when in Category View click Performance and Maintenance and then click Administrative Tools.. If you are viewing the Control Panel in Classic View, click Administrative Tools.
  3. In Administrative Tools, click Data Sources (ODBC).

To add and configure a new data source (i.e., for PostgreSQL database) on Windows, use the ODBC Data Source Administrator:

  1. Open the ODBC Data Source Administrator;
  2. To create a System DSN (which will be available to all users) , select the System DSN tab. To create a User DSN, which will be unique only to the current user, select the User DSN tab; click the Add.. button;
  3. You will need to select the ODBC driver for this DSN (i.e.,select PostgreSQL ANSI driver), then click Finish. Note that the ODBC driver must already installed on your system.
  4. You now need to configure the specific fields for the DSN you are creating through the Add Data Source Namedialog:
    • In the Data Source Name box, enter the name of the data source you want to access. It can be any valid name that you choose;
    • In the Description box, enter some text to help identify the connection;
    • In the Server field, enter the name of the server host that you want to access. By default, it is localhost;
    • In the User field, enter the user name to use for this connection;
    • In the Password field, enter the corresponding password for this connection;
    • In the Database field, enter the database name;
  5. Click OK to save the DSN.