HBase storage engine

You are viewing an old version of this article. View the current version here.

Data mapping from HBase to SQL

Hbase data model and operations

1.1 HBase data model

  • An HBase table consists of rows, which are identified by row key.
  • Each row has an arbitrary (potentially, very large) number of columns.
  • Columns are split into column groups, column groups define how the columns are stored (not reading some column groups is an optimization).
  • Each (row, column) combination can have multiple versions of the data, identified by timestamp.

1.2 Hbase read operations

HBase API defines two ways to read data:

  • Point lookup: get record for a given row_key.
  • Point scan: read all records in [startRow, stopRow) range.

Both kinds of scans allow to specify:

  • A column family we're interested in
  • A particular column we're interested in

The default behavior for versioned columns is to return only the most recent version. HBase API also allows to ask for

  • versions of columns that were valid at some specific timestamp value;
  • all versions that were valid within a specifed [minStamp, maxStamp) interval.
  • N most recent versions We'll refer to the above as [VersionedDataConds].

One can see two ways to map HBase tables to SQL tables:

2. Per-row mapping

Let each row in HBase table be mapped into a row from SQL point of view:

SELECT * FROM hbase_table;

row-id column1 column2  column3  column4  ...
------ ------- -------  -------  -------  
row1    data1   data2
row2                     data3    
row3    data4                      data5

The problem is that the set of columns in a HBase table is not fixed and is potentially is very large. The solution is to put all columns into one blob column and use Dynamic Columns (http://kb.askmonty.org/en/dynamic-columns) functions to pack/extract values of individual columns:

row-id dyn_columns
------ ------------------------------
row1   {column1=data1,column2=data2}
row2   {column3=data3}
row3   {column1=data4,column4=data5}

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.