HBase storage engine
Data mapping from HBase to SQL
HBase data model and operations
1.1 HBase data model
- An HBase table consists of rows, each identified by a row key.
- Each row has an arbitrary (potentially very large) number of columns.
- Columns are grouped into column families. A column family determines how its columns are stored, so not reading column families that a query does not need is an optimization.
- Each (row, column) combination can have multiple versions of the data, identified by timestamp.
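The data model above can be pictured as a nested map. The following is a minimal sketch in Python, assuming a toy in-memory layout of row key -> column family -> column -> {timestamp: value}. The names `make_table`, `put`, and `latest` are hypothetical and are not part of the HBase API.

```python
from collections import defaultdict

# Toy in-memory model: row key -> column family -> column -> {timestamp: value}.
# All names are hypothetical; this is not the HBase client API.
def make_table():
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

def put(table, row_key, family, column, timestamp, value):
    """Store one version of a cell, keyed by its timestamp."""
    table[row_key][family][column][timestamp] = value

def latest(table, row_key, family, column):
    """Return the most recent version of a cell, or None if the cell is absent."""
    versions = table.get(row_key, {}).get(family, {}).get(column, {})
    if not versions:
        return None
    return versions[max(versions)]

table = make_table()
put(table, "row1", "info", "name", 100, "alice")
put(table, "row1", "info", "name", 200, "alicia")   # a newer version of the same cell
put(table, "row1", "stats", "visits", 150, 7)       # a column in another family
print(latest(table, "row1", "info", "name"))
```

Each row may hold a different set of columns, and each cell keeps every version written to it, which is what the nested dictionaries are meant to illustrate.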
1.2 HBase read operations
The HBase API defines two ways to read data:
- Point lookup: get record for a given row_key.
- Range scan: read all records in [startRow, stopRow) range.
Both kinds of reads allow specifying:
- A column family we're interested in
- A particular column we're interested in
The default behavior for versioned columns is to return only the most recent version. The HBase API also allows requesting:
- versions of columns that were valid at some specific timestamp value;
- all versions that were valid within a specified [minStamp, maxStamp) interval;
- N most recent versions.
We'll refer to the above as [VersionedDataConds].
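The read operations and version conditions described above can be sketched as follows. This is a toy in-memory illustration in Python, not the HBase client API: the function names and the table layout (row key -> {(family, column): {timestamp: value}}) are assumptions. "Valid at a timestamp" is interpreted here as the newest version whose timestamp is not later than the given value, which is also an assumption.

```python
def select_versions(versions, at=None, time_range=None, max_versions=1):
    """Pick (timestamp, value) pairs from a {timestamp: value} dict, newest first.

    at           -- keep only versions with timestamp <= at (assumed semantics)
    time_range   -- (min_stamp, max_stamp), half-open: min_stamp <= ts < max_stamp
    max_versions -- number of newest remaining versions to return; None means all
    """
    stamps = sorted(versions, reverse=True)
    if at is not None:
        stamps = [ts for ts in stamps if ts <= at]
    if time_range is not None:
        lo, hi = time_range
        stamps = [ts for ts in stamps if lo <= ts < hi]
    return [(ts, versions[ts]) for ts in stamps[:max_versions]]

def get(table, row_key, family=None, column=None, **version_conds):
    """Point lookup: cells of one row, optionally limited to a family or a column."""
    result = {}
    for (fam, col), versions in table.get(row_key, {}).items():
        if family is not None and fam != family:
            continue
        if column is not None and col != column:
            continue
        picked = select_versions(versions, **version_conds)
        if picked:
            result[(fam, col)] = picked
    return result

def scan(table, start_row, stop_row, **kwargs):
    """Range scan: yield (row_key, cells) for start_row <= row_key < stop_row."""
    for row_key in sorted(table):
        if start_row <= row_key < stop_row:
            cells = get(table, row_key, **kwargs)
            if cells:
                yield row_key, cells

# Hypothetical sample data.
table = {
    "row1": {("info", "name"): {100: "alice", 200: "alicia"},
             ("stats", "visits"): {150: 7}},
    "row2": {("info", "name"): {120: "bob"}},
    "row3": {("info", "name"): {130: "carol"}},
}

print(get(table, "row1", family="info"))            # default: most recent version only
print(get(table, "row1", column="name", at=150))    # version valid at timestamp 150
print([key for key, _ in scan(table, "row1", "row3")])  # stop row is excluded
```

With no version condition, `get` and `scan` return only the newest version of each cell, mirroring the default behavior described above. Passing `max_versions=None` together with `time_range` returns every version inside the interval.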
One can see two ways to map HBase tables to SQL tables: