CONNECT Table Types - Data Files

Most of the tables processed by CONNECT are just plain DOS or UNIX data files, logically regarded as tables thanks to the description given when creating the table. This description comes from the CREATE TABLE statement. Depending on the application, these tables can already exist as data files, used as is by CONNECT, or can be physically created by CONNECT as the result of a CREATE TABLE ... SELECT ... and/or INSERT statement(s).

The file path/name is given by the FILE_NAME option. If it is a relative path/name, it will be relative to the database directory, the one containing the table .FRM file.
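As an illustration of the FILE_NAME option, here is a minimal sketch of a table definition over an existing comma separated file. The table name, column names and file name are hypothetical, and the HEADER and SEP_CHAR settings are assumptions about how that file is laid out.

```sql
-- Hypothetical table over a CSV file; all names are illustrative only.
CREATE TABLE cashlog (
  logdate  CHAR(10)    NOT NULL,
  register INT         NOT NULL,
  amount   DOUBLE(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=CSV
  FILE_NAME='cashlog.csv'   -- relative name: looked up in the database directory
  HEADER=1                  -- assumes the first line holds column names
  SEP_CHAR=',';             -- assumes a comma as field separator
```

An absolute path/name in FILE_NAME would be used as given instead of being resolved against the database directory.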

Multiple File Tables

A multiple file table is one that is physically contained in several files of the same type instead of just one. These files are processed sequentially during a query, and the result is the same as if all the table files were merged into one. This is useful for processing, as a single table, files coming from different sources (such as cash register log files) or made at different time periods (such as bank monthly reports). Note that operations on such files are restricted to sequential Select and Update, and that VEC multiple tables are not supported by CONNECT. The file list depends on the setting of the MULTIPLE option of the CREATE TABLE statement for that table.

Multiple tables are specified by the option MULTIPLE=n, which can take three values:

0: Not a multiple table (the default). This can be used in an ALTER TABLE statement.
1: The table is made from files located in the same directory. The FILE_NAME option is a pattern such as 'cash*.log' that all the table file path/names match.
2: The FILE_NAME gives the name of a file that contains the path/names of all the table files. This file can be made using a DIR table.

The FILEID special column allows filtering the file list or doing some grouping on the files that make up a multiple table.
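The following sketch shows how a MULTIPLE=1 table and a FILEID special column might be combined. The table name, column names and the layout of the log files are hypothetical; only the 'cash*.log' pattern comes from the text above.

```sql
-- Hypothetical multiple table over every file matching cash*.log
-- in the database directory; column layout is an assumption.
CREATE TABLE cash_all (
  fid      CHAR(128)    NOT NULL SPECIAL=FILEID,  -- identifies the source file
  logdate  CHAR(10)     NOT NULL,
  register INT          NOT NULL,
  amount   DOUBLE(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=CSV
  FILE_NAME='cash*.log' MULTIPLE=1;

-- Grouping on the files that make up the table.
SELECT fid, COUNT(*) AS nrows, SUM(amount) AS total
FROM cash_all
GROUP BY fid;

-- Filtering the file list; the pattern is illustrative only.
SELECT * FROM cash_all WHERE fid LIKE '%cash01%';
```

The special column is not stored in the data files; it only reports which file each row was read from.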

Note: MULTIPLE was not initially implemented for XML tables. This restriction was removed in version 1.02.

File Mapping

For file based tables of reasonable size, processing time can be greatly enhanced under Windows(TM) and some flavors of UNIX or Linux by using the technique of “file mapping”, in which a file is processed as if it were entirely in memory. Mapping is specified when creating the table by the use of the MAPPED=YES option. This does not apply to tables not handled by system I/O functions (XML and INI).
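A minimal sketch of a table created with file mapping enabled is given below. The table, columns and file name are hypothetical; the only point illustrated is where the MAPPED=YES option goes.

```sql
-- Hypothetical CSV table processed through file mapping.
CREATE TABLE cashlog_mapped (
  logdate  CHAR(10)     NOT NULL,
  register INT          NOT NULL,
  amount   DOUBLE(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=CSV
  FILE_NAME='cashlog.csv' HEADER=1
  MAPPED=YES;   -- process the file as if it were entirely in memory
```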

Big File tables

Because all files are handled by the standard input/output functions of the operating system, their size is limited to 2GB, the maximum size handled by standard functions. For some table types, CONNECT can deal with files that are larger than 2GB, or likely to grow beyond this limit. These are the FIX, BIN and VEC types. To tell CONNECT to use input/output functions dealing with big files, specify the option HUGE=1 or HUGE=YES for that table. Note however that CONNECT cannot randomly access tables having more than 2G records.
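As a sketch, a fixed-length table expected to exceed 2GB could be declared as follows. The table name, column widths and file name are hypothetical, and the record layout is assumed to be two consecutive character fields.

```sql
-- Hypothetical FIX table on a file that may grow beyond 2GB.
CREATE TABLE big_fix (
  id   CHAR(10) NOT NULL,
  name CHAR(30) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=FIX
  FILE_NAME='big.txt'
  HUGE=1;   -- use the input/output functions dealing with big files
```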

Compressed File Tables

CONNECT can make and process some tables whose data file is compressed. The only supported compression format is the gzlib format. Zip and zlib formats are not supported. The table types that can be compressed are DOS, FIX, BIN, CSV and FMT. This can save some disk space at the cost of a somewhat longer processing time.

Some restrictions apply to compressed tables:

  • Compressed tables are not indexable.
  • Update and partial delete are not supported.

Use the Boolean COMPRESS option to specify a compressed table.
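A compressed table might be declared as in the sketch below. The table, columns and file name are hypothetical; the file is assumed to be a CSV file compressed in the supported format.

```sql
-- Hypothetical compressed CSV table.
CREATE TABLE cashlog_gz (
  logdate  CHAR(10)     NOT NULL,
  register INT          NOT NULL,
  amount   DOUBLE(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=CSV
  FILE_NAME='cashlog.csv.gz' HEADER=1
  COMPRESS=1;   -- the data file is compressed

-- Sequential reading works as for an uncompressed table; because of
-- the restrictions listed above, no index is defined and no UPDATE
-- or partial DELETE is attempted.
SELECT register, SUM(amount) AS total FROM cashlog_gz GROUP BY register;
```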

Relational formatted tables

These are based on files whose records represent one table row. Only the column representation within each record can differ. The following relational formatted tables are supported:

NoSQL Table Types

These are based on files that do not match the relational format but often represent hierarchical data. CONNECT can handle JSON, INI-CFG, XML and some HTML files.

The way it is done is different from what PostgreSQL does. In addition to including in a table some column values of a specific data format (JSON, XML) to be handled by specific functions, CONNECT can directly use JSON, XML or INI files produced by other applications; it is the table definition that describes where and how the contained information must be retrieved.

This is also different from what MariaDB does with dynamic columns, which is close to what MySQL and PostgreSQL do with the JSON column type.

The following NoSQL types are supported:
