Comments - mariadb is crashing

12 years, 3 months ago Scott Feldstein

Yes, i'll file the bug today.

thanks.

 
12 years, 3 months ago Elena Stepanova

Hi,

Thank you for filing the bug report. Below is my comment from it, in case you are not getting updates:

-----------

Hi,

Just creating the tables doesn't seem to cause the crash, at least it isn't crashing for me so far. I used the same settings as you.

Your log suggests that your server is not idle: you have some scheduled activity, in a cronjob or such -- something that connects to the server and apparently executes something. Could you find out what is executed? The connection looks local; that is, whatever is running is probably on the same box.

If you could turn on the general log temporarily, its output could help us understand what flow causes the problem. To turn on logging, set general_log=1 and general_log_file=<file location>. If you do it on a running server, please make sure you are changing global variables (set global); but it would be more useful if you could set them in the cnf file, so we see the whole activity from the server restart until the crash.
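As a rough sketch of the runtime variant described above, the statements would look something like this. The file path is only a placeholder assumption; any location writable by the server process should do.

```sql
-- Point the general log at a file first, then enable it.
-- '/tmp/mariadb-general.log' is a hypothetical path, adjust as needed.
SET GLOBAL general_log_file = '/tmp/mariadb-general.log';
SET GLOBAL general_log = 1;

-- Confirm the settings took effect.
SHOW GLOBAL VARIABLES LIKE 'general_log%';

-- The general log grows quickly under load; switch it off when done.
SET GLOBAL general_log = 0;

-- cnf equivalent, so logging starts with the server (under [mysqld]):
--   general_log=1
--   general_log_file=/tmp/mariadb-general.log
```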

One more thing, in the log file, when the problem started, the server complains about mysql.user and mysql.db tables being crashed. Could you please check if they are OK now (run check table on them), just so we know we are not dealing with an underlying condition?
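For reference, the check mentioned above can be run in a single statement; this is a minimal sketch and the result still needs to be read by hand.

```sql
-- Check both privilege tables that the error log reported as crashed.
-- Each table should come back with Msg_type 'status' and Msg_text 'OK'.
CHECK TABLE mysql.user, mysql.db;

-- Only if a table is reported as corrupted would a repair be the
-- usual next step for MyISAM/Aria tables, for example:
--   REPAIR TABLE mysql.user;
```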

Thank you

--------------

 
12 years, 3 months ago Scott Feldstein

Hi, For me it definitely crashed after a few mins of me adding those tables. This actually occurred on two separate running hosts with the same rhel version. At one point I thought it had stabilized and that is when I started running it against my application.

Out of curiosity did you try this on the same rhel version that I posted in the bug? I am wondering if the segfault is related to the platform. I did try on an earlier version, rhel 5.2, and I don't see the crash, although I didn't run it with any load against it.

To answer your questions: All that is executed on the db is a lot of insert stmts and a few selects. Right now my env is down for maintenance, but when it comes back up I'll get you the log. From what I see I don't think that will cause the crash.

I'll run check table and get that info back to you.

Not sure how long my env will be down for maintenance, but if you have some time please try the same rhel version (or even centos) to see if you have the same issue because on both these running instances they crash a few mins after I add the tables.

thanks.

 
12 years, 3 months ago Elena Stepanova

Hi,

No, so far I was using a different Linux flavor.

The point is, your server is not just crashing every few minutes, it is crashing according to the perfect schedule, like this:

16:00                      
16:05                      18:05
16:10                      18:10
16:15                      18:15
16:20                      18:20
16:25                      18:25
16:30                      18:30
16:35                      18:35
16:40                      18:40
16:45                      18:45
16:50                      18:50
16:55                      18:55
17:00                      19:00
17:10  <-- ATTENTION! -->  19:10
18:00  <-- ATTENTION! -->  20:00

and so on, without fault, on the same 2-hour cycle.

There are also connectivity errors in the log, which happen with 1-second precision, which also suggests that something is scheduled.

I cannot completely rule out the idea that it is OS-specific, but it seems to me more likely that something is happening on the machine(s) or on the DB server(s) where you observe these regular crashes. It might be related to the database, like the scheduled connections, or something completely independent -- disk maintenance, or whatever. The former is easier to rule out: when we see the general log, we will hopefully know more.

One more thing to check is whether you have any events configured in the database itself -- there are none in the SQL file you sent, but maybe they were created separately.
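One possible way to check for events, sketched here under the assumption that the account used can read information_schema.EVENTS for all schemas:

```sql
-- Is the event scheduler running at all?
SHOW GLOBAL VARIABLES LIKE 'event_scheduler';

-- List every event on the server with its schedule and last run time.
SELECT EVENT_SCHEMA, EVENT_NAME, STATUS,
       INTERVAL_VALUE, INTERVAL_FIELD, LAST_EXECUTED
FROM information_schema.EVENTS;
```

An empty result from the second query would mean no events are defined, whatever the scheduler setting is.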

Regards, ---

 
12 years, 3 months ago Scott Feldstein

Yes, this is what I pointed out in my initial comments, "This occurs just about every 5 minutes on the dot, but sometimes it crashes out of band. mysqld_safe continuously restarts it." and that the segfault seems to be from the checkpoint thread.

Here are some things to consider as well:

1) I have run percona mysql on these same servers, using the same disks and it is completely stable for several months now.

2) this is occurring on two different servers with the same OS, and not on the server with rhel 5.2

3) the stack trace says it is a segfault

From these I can say with lots of confidence that nothing is killing the daemon.

I'll get you the required data when my box is accessible again.

 
12 years, 3 months ago Elena Stepanova

There is no doubt it's a crash inside Aria checkpoint execution, we just need to find a way to reproduce it locally. The more we know about what is going on on your servers, the faster we will get there.

The fact that it is not observed on the RHEL 5.2 server does not yet mean the bad memory access is not happening there, only that it does not cause the crash. We might see more on debug binaries when we have the right DML flow to make Aria checkpoints work the same way as they do on your servers. I tried simple INSERTs on your schema, but did not get any valgrind complaints so far, so it's more efficient to use the real pattern than to keep trying random ones.

 