Dealing with ROWIDs When Migrating to MariaDB
ROWID is an Oracle proprietary feature and all Oracle DBAs use them at times, and you also often see them used in applications.
ROWID is often misunderstood and misused. The first half of this blog aims to explain what
ROWIDs are and how they work, and the second half discusses – importantly – some ideas on how to migrate applications that rely on
ROWID from Oracle to MariaDB.
What is a ROWID?
ROWID is three things in Oracle:
- It is a value that references a particular row.
- It is a data type.
- And, it is a hidden column of every table. Yes, every table has a
ROWIDcolumn, but it is hidden and will only show in a
SELECTstatement if requested.
ROWID value is actually a reference to the physical location of a row in the Oracle database, which means that if a row has to be moved around for some reason, the
ROWID will change. Current implementations of Oracle maintain a
ROWID that is 10 bytes long, but you usually see the ASCII representation of it, which is 18 characters.
Let me explain a few important attributes of
- Oracle has supported
ROWIDs basically always.(that’s not an attribute?)
- The format of a
ROWIDhas changed a bit over time, but not much.
ROWIDis physical and identifies the OBJECT of the row (which in the cases we look at here is always a table), the TABLESPACE of the row, the BLOCK where the row is stored and the ROW NUMBER inside the block.(Looking at the format, we might assume that
ROWIDs are unique not only for a table but across tables. Dangerously, however, this assumption is wrong!)
Dissecting a ROWID
This section really isn’t strictly necessary, and if you don’t think that knowing the internals of a
ROWID is going to make you popular at parties, you might want to skip it. If you’re bold enough to dive into the weeds, then here we go! A
ROWID consists of 10 bytes, but it is usually displayed as an 18 characters long Base-64 encoded string. However, please note that I am only discussing old-style proper
ROWIDs here, not
UROWIDs (discussed later in this blog), and I am also discussing
ROWIDs the way they look now…which has been the same since Oracle 8i.
Let us have a look at an example
SQL> SELECT rowid FROM customer WHERE cust_id = 1; ROWID ------------------ AAAm0HAAEAAAlMEAAA
The first 6 characters,
AAAm0H, tells us which table this is, but as I am not going to bore you with how Base64-encoding works (which, trust me, is really not a good conversational ice breaker at parties), I will instead use a set of functions that Oracle provides as part of the
ROWID package. Let’s extract the object from this
ROWID and see if it matches the table we are working on:
SQL> SELECT DBMS_ROWID.ROWID_OBJECT('AAAm0HAAEAAAlMEAAA') object FROM dual; OBJECT ---------- 158983 SQL> SELECT object_name FROM user_objects WHERE data_object_id = 158983; OBJECT_NAME -------------------------------------------------------------------------------- CUSTOMER
That looks right. Let us now figure out in which file this row is stored:
SQL> SELECT DBMS_ROWID.ROWID_RELATIVE_FNO('AAAm0HAAEAAAlMEAAA') object FROM dual; OBJECT ---------- 4 SQL> SELECT file_name FROM dba_data_files ddf JOIN user_tables ut ON ddf.tablespace_name = ut.tablespace_name WHERE ut.table_name = 'CUSTOMER' AND ddf.relative_fno = 4; FILE_NAME -------------------------------------------------------------------------------- /u01/app/oracle/oradata/XE/users.dbf
The last two fields of the
ROWID are more difficult to illustrate, but I trust you get the point by now.
ROWID for Clustered Tables
In Oracle, a table cluster is a set of tables that share a common key, called a cluster key. The syntax is fairly self-explanatory:
SQL> CREATE CLUSTER cust_clu(cust_id NUMBER); SQL> CREATE INDEX cust_ix ON CLUSTER cust_clu; SQL> CREATE TABLE customer(cust_id NUMBER NOT NULL PRIMARY KEY, cust_name VARCHAR(50)) CLUSTER cust_clu(cust_id); SQL> CREATE TABLE orders(order_id INT NOT NULL PRIMARY KEY, cust_id INT NOT NULL) CLUSTER cust_clu(cust_id);
Here we created a cluster and then two tables in that cluster. In a table cluster like this, the physical position of the cluster key is the
ROWID, so both tables will share the same
ROWIDs. Like this:
SQL> INSERT INTO customer VALUES(1, 'Joe Bloggs'); SQL> INSERT INTO orders VALUES(1, 1); SQL> SELECT rowid FROM customer; ROWID ------------------ AAAmz1AAEAAAlK0AAB SQL> SELECT rowid FROM orders; ROWID ------------------ AAAmz1AAEAAAlK0AAB
As you can see, the two rows in the two tables have the same
ROWID because they share the same cluster key.
ROWID for Index Organized Tables
Mention IOT to an Oracle DBA and he/she will not think “Internet of Things”, he/she will think Index Organized Tables. IOT in Oracle are tables that are pre-sorted according to the primary key. In a normal table, indexes and data are stored separately, with indexes being sorted and referencing data that is stored elsewhere. With IOT, a table is like a big index in-and-of-itself, where all columns are included and all rows are sorted according to the said primary key.
What does this mean? Well, it basically means that the physical location of a row may change as said rows need to be sorted somewhere else, typically due to an
UPDATE being processed that affects one of the primary key columns. Let’s look at an example. First, let’s create an IOT (a table, not an “Internet of Things”) and put some data in it:
SQL> CREATE TABLE order_item(order_id NUMBER NOT NULL, item_id NUMBER NOT NULL, product_id NUMBER NOT NULL, amount NUMBER NOT NULL, PRIMARY KEY(order_id, item_id)) ORGANIZATION INDEX; SQL> INSERT INTO order_item VALUES(1, 1, 1, 5); SQL> INSERT INTO order_item VALUES(1, 2, 2, 1); SQL> INSERT INTO order_item VALUES(2, 1, 1, 15); SQL> INSERT INTO order_item VALUES(3, 1, 3, 1); SQL> SELECT order_id, item_id, rowid FROM order_item; ORDER_ID ITEM_ID ROWID ---------- ---------- ------------------ 1 1 *BAECUzMCwQICwQL+ 1 2 *BAECUzMCwQICwQP+ 2 1 *BAECUzMCwQMCwQL+ 3 1 *BAECUzMCwQQCwQL+
Then, we assume that for
order_id 2 we want to change
item_id from 1 to 2, maybe we are to add another line as line 1 or so, and look at what happens to our
SQL> UPDATE order_item SET item_id = 2 WHERE order_id = 2; SQL> SELECT order_id, item_id, rowid FROM order_item; ORDER_ID ITEM_ID ROWID ---------- ---------- ------------------ 1 1 *BAECUzMCwQICwQL+ 1 2 *BAECUzMCwQICwQP+ 2 2 *BAECUzMCwQMCwQP+ 3 1 *BAECUzMCwQQCwQL+
Have a look at the
ROWID column for the row where
order_id is 2, you see that it has changed as the row has moved. This may not be much of an issue, except that it is a good reason for why you should not store the
ROWID in a referencing table. Also, if you
ROWID to use it in a later
UPDATE, then you need to make sure that you lock that row, else this may also be an issue.
ROWID for Partitioned Tables
Just like with ROWID for Index Organized Tables, rows in partitioned tables can move around when columns that are part of the partitioning key are updated.
UROWID datatype or Universal
ROWID, is a data type that arrived along with Oracle version 9. Whereas
ROWID only worked for native Oracle tables and some other cases,
UROWID is much more flexible and can reference rows in any kind of tables. The disadvantage is that
UROWID is cumbersome to use.
Advantages and disadvantages of using ROWID and UROWID
The main advantage of using
UROWID for that matter, is that it is the fastest way to access a particular row as this, at least for normal tables, is the physical address of a row. This led to applications using
ROWIDs, like this in PL/SQL:
DECLARE r ROWID; BEGIN SELECT rowid INTO r FROM orders WHERE orderid = 1 FOR UPDATE; UPDATE orders SET cust_id = 2 WHERE rowid = r; END;
FOR UPDATE clause is to ensure that the row isn’t touched between the
SELECT and the
UPDATE. This is a perfectly legit way of using
ROWID. But once it was accepted that using
ROWID is the “best” (in this case “best” means fastest) way of accessing a row,
ROWID began being used for things it shouldn’t be used for. Like this:
Is this good? If we have gotten all data from an order and want to look up the customer, it is going to be as fast as possible, right? And the same goes for a JOIN between these two tables, real fast.
No, actually this is a seriously bad idea. Rows may move, for example, by virtue of operational tasks. If the customer table is TEMPORAL, the ROW might change and this will not work. You really can no longer use PARTITIONING for the customer table if you expect to use a column that is likely to change in the partitioning expression. Oracle developers, please NEVER do this, i.e. do not store a
ROWID in a table for any reason, unless you know really well what you are doing and have considered all the drawbacks of a design like this.
An annoying aspect of this is that if you do go ahead and use
ROWID like this, it is likely to work just fine for a while. Issues will not be caught in testing or anything.
ROWID in Summary
Now you know more than you will probably ever need to know about what a
ROWID is. But there are also some things you can learn from this. Like that the
ROWID is deeply rooted in how Oracle works. Changing the format of a
ROWID would be a major undertaking in terms of software development for Oracle. And if a
ROWID contains the data it contains, it also imposes a bunch of restrictions on the workings of Oracle. Like if a table uses some format that isn’t necessarily using blocks like we are used to, that is maybe possible but many things will be very different and there will be some severe restrictions. Or tables in a remote database to the current instance? And Indexes, as you might have guessed now, use a
ROWID to reference the indexed rows, this is also affected by all this. But
ROWID is what it is, and it is restrictive, so why not migrate to something else? Like MariaDB?
Migrating ROWID to MariaDB
So now we know about what a
ROWID is and how it is used. Now you may be wondering how do I migrate applications that use
Starting in MariaDB 10.3, we added an Oracle mode that enables compatibility with a subset of PL/SQL.
While MariaDB supports most common functions in it’s Oracle mode, MariaDB doesn’t support
ROWID specifically, but there are some different ways of dealing with
ROWID for applications that have been migrated from Oracle to MariaDB. We will now have a look at a few different means of working with this and the pros and cons of these means.
Common to all these examples is that we are emulating the attribute of the
ROWID pseudo-column of being hidden unless asked for by using the
INVISIBLE column attribute introduced in MariaDB 10.3.
Example Schema and Code to Migrate
In the following examples, I use the very basic
customer table with some data to illustrate my point. In Oracle syntax, it looks like this:
CREATE SEQUENCE cust_seq; CREATE TABLE customer(cust_id NUMBER NOT NULL PRIMARY KEY, cust_name VARCHAR2(50) NOT NULL); INSERT INTO customer VALUES(cust_seq.NEXTVAL, 'Joe Bloggs'); INSERT INTO customer VALUES(cust_seq.NEXTVAL, 'John Doe');
Note that the
cust_id column is using the
NUMBER datatype, which is not a good practice as this means that any kind of numeric values can be used in this column. This is not an uncommon shortcut in Oracle though, and we assume that the values actually used are
INTEGERS, in particular as we have a
SEQUENCE generating them.
The PL/SQL code that we are looking to migrate looks like this:
CREATE OR REPLACE PROCEDURE set_cust_name(id NUMBER, name VARCHAR2) AS cust_row VARCHAR2(18); BEGIN SELECT rowid INTO cust_row FROM customer WHERE cust_id = id FOR UPDATE; IF cust_row IS NOT NULL THEN UPDATE customer SET cust_name = name WHERE rowid = cust_row; END IF; END;
The goal here is to be able to retain the PL/SQL code. If the procedure would use the Oracle proprietary
ROWID data types, then this needs to be changed to a
BIGINT in MariaDB. One issue is that if we use
BIGINT to substitute
ROWID it will not fit into a VARCHAR(18) if that is used for storing an Oracle
ROWID. If this is an issue, i.e. if the substitute
ROWID will get very high, then I suggest migrating to use a
BIGINT for the
ROWID instead of extending the
VARCHAR to the required
VARCHAR(20). In all honesty though, I would not think that this is an issue except in some extreme edge cases.
Using AUTO_INCREMENT to generate ROWIDs
You can have only one
auto_increment column per table but as we assume here that we use Oracle style sequences to generate primary keys,
auto_increment is free for us to use for something else, in this case as a substitute key, possibly unique. To be able to use
auto_increment in InnoDB in MariaDB, the column also has to be indexed and this makes sense, because in MariaDB, this
ROWID will be synthetic and actually a physical location but just an integer value, so to look up a row we need an index for this to be reasonably fast.
The corresponding table definition in MariaDB would then be like this:
CREATE TABLE customer(cust_id INTEGER NOT NULL PRIMARY KEY, cust_name VARCHAR(50) NOT NULL, rowid BIGINT NOT NULL AUTO_INCREMENT INVISIBLE UNIQUE);
This will allow the procedure above to run unmodified. If the
ROWID datatype is used, then only this needs to be changed to a
BIGINT. In general, you are likely better off using BIGINT here anyway, but that is a different story.
- This is easy to implement.
- Low overhead due to the
auto_incrementfeature being very effective.
- The PRIMARY KEY column cannot be an
ROWIDmust be materialized.
Reusing the PRIMARY KEY as a ROWID
Another way of dealing with a generated
ROWID is to reuse the primary key as it stands as a
ROWID by using MariaDB
CREATE TABLE customer(cust_id INTEGER NOT NULL PRIMARY KEY, cust_name VARCHAR(50) NOT NULL, rowid INTEGER AS (cust_id) INVISIBLE UNIQUE);
Note that we have to index the
ROWID column here, despite the fact that we know this is just a copy of the PRIMARY KEY column. We know this, but the optimizer cannot know it.
- Easy to implement and use
ROWIDcannot be updated.
- The UNIQUE index has to be materialized.
- Requires that the PRIMARY KEY is a single column and that it has an appropriate data type.
Using a SEQUENCE to generate ROWID
If we assume a table has been migrated to MariaDB to use and
AUTO_INCREMENT column as the
PRIMARY KEY already, then we can use a SEQUENCE for the
ROWID, this is the opposite of the first case above, more or less
CREATE SEQUENCE cust_rowid_seq; CREATE TABLE customer(cust_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT, cust_name VARCHAR(50) NOT NULL, rowid BIGINT NOT NULL DEFAULT (NEXT VALUE FOR cust_rowid_seq) INVISIBLE UNIQUE);
- Completely independent from the
- Very flexible
- Not as good performance as when using
- Both value and the index need to be materialized
Using UUID or UUID_SHORT as a ROWID
UUID_SHORT generate a semi-unique value that can potentially be used to replace a
ROWID. The issue is that it is guaranteed to be unique, but if the application can deal with it, this is an option:
CREATE TABLE customer(cust_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT, cust_name VARCHAR(50) NOT NULL, rowid BIGINT UNSIGNED NOT NULL DEFAULT (UUID_SHORT()) INVISIBLE UNIQUE);
You can use
SYS_GUID() also, but you have to change the datatype accordingly in that case. (
SYS_GUID() was added with MariaDB Server 10.6.) There are some restrictions with regards to uniqueness for all of these functions, so I recommend you read up on that before using this method to generate a
- The value is opaque, displaying no attribute that might tempt someone to use them for any other reason beyond as a
- There are no concurrency issues with generating a UUID
- The generated value is a long string in the case of UUID() and SYS_GUID().
- There are constraints in terms of how unique the generated values are.
Updating the ROWID
Some examples above allow the
ROWID to be updated. This is typically not a big deal as the
ROWID column has a unique index on it, so an update can only be done to some alternative UNIQUE value. If you still think this is an issue though, you can always create a trigger to prohibit the
ROWID to be updated.
CREATE TRIGGER cust_rowid_trg BEFORE UPDATE ON customer FOR EACH ROW BEGIN IF new.rowid != old.rowid THEN SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'ROWID should not be updated'; END IF; END;
ROWID Questions and Answers
Q: Can you create an index on the ROWID column in a table?
A: No you cannot, and it would be meaningless all the same. An index just points to a row using said
ROWID. In other words, even if you could index a
ROWID, all it would do is to slow access down.
Q: Is ROWID in the SQL-standard?
A: No, it is not. The SQL-Standard has different means of achieving the most common use for
ROWID, which is
DELETE of a previously
SELECTed row. This is by using the
WHERE CURRENT OF cursor construct.
Q: What is the difference between ROWNUM and ROWID?
A: These are two completely different things. ROWNUM is the order the row is fetched in a
SELECT statement and is transient and cannot be used to identify a row.
ROWID is the physical (in some cases logical) location of a row in the database.
Q: Must you use the ROWID datatype for ROWIDs?
A: No, that is not necessary, the most common representation beyond the
ROWID type is to use a
ROWID is a 10-byte number (which when represented as a Byte64 encoded string ends up being 18 characters) for current versions of Oracle. Do not assume this though, unless you know what you are doing as this isn’t covering
Q: Can you update the ROWID of a row in a table?
A: No, this makes no sense at all.