11 years, 3 months ago Scott Feldstein

Hi, for me it definitely crashed a few minutes after I added those tables. This occurred on two separate running hosts with the same RHEL version. At one point I thought it had stabilized, and that is when I started running it against my application.

Out of curiosity, did you try this on the same RHEL version that I posted in the bug? I am wondering if the segfault is related to the platform. I did try on an earlier version, RHEL 5.2, and I don't see the crash there, although I didn't run any load against it.

To answer your questions: all that is executed on the db is a lot of insert statements and a few selects. Right now my env is down for maintenance, but when it comes back up I'll get you the log. From what I see, I don't think that will cause the crash.

I'll run check table and get that info back to you.

Not sure how long my env will be down for maintenance, but if you have some time, please try the same RHEL version (or even CentOS) to see if you get the same issue, because both of these running instances crash a few minutes after I add the tables.


11 years, 3 months ago Elena Stepanova


No, so far I was using a different Linux flavor.

The point is, your server is not just crashing every few minutes, it is crashing on a perfect schedule, like this:

16:05                      18:05
16:10                      18:10
16:15                      18:15
16:20                      18:20
16:25                      18:25
16:30                      18:30
16:35                      18:35
16:40                      18:40
16:45                      18:45
16:50                      18:50
16:55                      18:55
17:00                      19:00
17:10  <-- ATTENTION! -->  19:10
18:00  <-- ATTENTION! -->  20:00

and so on, without fault, on the same 2-hour cycle.
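
The regularity can be checked mechanically rather than by eye. Below is a minimal sketch, assuming the crash times are copied by hand from the two columns above and all fall on the same day; the variable and function names are illustrative only and not part of any MariaDB tooling.

```python
from collections import Counter
from datetime import datetime

# Crash times transcribed by hand from the two columns above (same day assumed).
first_cycle = ["16:05", "16:10", "16:15", "16:20", "16:25", "16:30", "16:35",
               "16:40", "16:45", "16:50", "16:55", "17:00", "17:10", "18:00"]
second_cycle = ["18:05", "18:10", "18:15", "18:20", "18:25", "18:30", "18:35",
                "18:40", "18:45", "18:50", "18:55", "19:00", "19:10", "20:00"]


def minutes(hhmm):
    """Convert an HH:MM string to minutes since midnight."""
    t = datetime.strptime(hhmm, "%H:%M")
    return t.hour * 60 + t.minute


def gaps(times):
    """Return the gap in minutes between each pair of consecutive times."""
    m = [minutes(t) for t in times]
    return [b - a for a, b in zip(m, m[1:])]


first_gaps = gaps(first_cycle)

# The most common gap is taken as the nominal crash period.
period, hits = Counter(first_gaps).most_common(1)[0]

# Any gap that differs from the nominal period marks a break in the schedule.
irregular = [(first_cycle[i], first_cycle[i + 1], g)
             for i, g in enumerate(first_gaps) if g != period]

# Offset between matching rows of the two columns, i.e. the length of the cycle.
cycle_offsets = {minutes(b) - minutes(a)
                 for a, b in zip(first_cycle, second_cycle)}

print("nominal period (min):", period, "seen", hits, "times")
print("irregular gaps:", irregular)
print("offsets between cycles (min):", cycle_offsets)
```

If the offsets set contains a single value, both columns follow the same schedule, and the irregular gaps point at the rows marked ATTENTION above.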

There are also connectivity errors in the log, which happen with 1-second precision, which also suggests that something is scheduled.

I cannot completely rule out the idea that it is OS-specific, but it seems to me more likely that something is happening on the machine(s) or on the DB server(s) where you observe these regular crashes. It might be related to the database, like scheduled connections, or something completely independent -- disk maintenance, or whatever. The former is easier to rule out: when we see the general log, we will hopefully know more.

One more thing to check is whether you have any events configured in the database itself -- there are none in the SQL file you sent, but maybe they were created separately.

Regards, ---

11 years, 3 months ago Scott Feldstein

Yes, this is what I pointed out in my initial comments, "This occurs just about every 5 minutes on the dot, but sometimes it crashes out of band. mysqld_safe continuously restarts it." and that the segfault seems to be from the checkpoint thread.

Here are some things to consider as well:

1) I have run percona mysql on these same servers, using the same disks and it is completely stable for several months now.

2) this is occurring on two different servers running the same OS, and not on the server running rhel 5.2

3) the stack trace says it is a segfault

From these I can say with lots of confidence that nothing is killing the daemon.

I'll get you the required data when my box is accessible again.

11 years, 3 months ago Elena Stepanova

There is no doubt it's a crash inside Aria checkpoint execution; we just need to find a way to reproduce it locally. The more we know about what is going on on your servers, the faster we will get there.

The fact that it is not observed on the RHEL 5.2 server does not yet mean the bad memory access is not happening there, only that it does not cause the crash. We might see more on debug binaries when we have the right DML flow to make Aria checkpoints work the same way as they do on your servers. I tried simple INSERTs on your schema, but did not get any valgrind complaints so far, so it's more efficient to use the real pattern than to keep trying random ones.

