Comments - Identifying and removing duplicates
Content reproduced on this site is the property of its respective owners,
and this content is not reviewed in advance by MariaDB. The views, information and opinions
expressed by this content do not necessarily represent those of MariaDB or any other party.
There is another alternative that should work on versions before MariaDB 10.3. This works by setting a unique index on all the identical fields, and by making use of the old ALTER TABLE copying behaviour, ignoring errors when it copies the data back, basically silently dropping duplicates. This requires the old_alter_table system variable to be temporarily set. Here's a sample:
With your data, you would create the unique index on all the duplicated fields, so something like:
It appears the values in all of your ORDER BY fields will be the same for more than one row, so your example does not specify a determinate order.
But since you have a unique ID, there's a much easier way to do this, without adding a new fields, or updating anything. See Removing Duplicates.
Thanks Ian for the reply.
If I apply the coding shown in 'Removing Duplicates' to my problem as
DELETE FROM jobtransactionactuallabour WHERE I_TransactionId IN ( SELECT t2.I_TransactionId FROM jobtransactionactuallabour t1, jobtransactionactuallabour t2 WHERE t1.SI_JobYearNumber=t2.SI_JobYearNumber AND t1.I_JobNumber=t2.I_JobNumber AND t1.DT_ActualDate=t2.DT_ActualDate AND t1.I_Hours1=t2.I_Hours1 AND t1.C_OperationCode=t2.C_OperationCode AND t1.C_EmployeeCode=t2.C_EmployeeCode AND t1.I_TransactionId <>t2.I_TransactionId AND t1.I_TransactionId =( SELECT MAX(I_TransactionId ) FROM jobtransactionactuallabour tab WHERE tab SI_JobYearNumber=t1.SI_JobYearNumber AND tab.I_JobNumber=t1.I_JobNumber AND tab.DT_ActualDate=t1.DT_ActualDate AND tab.I_Hours1=t1.I_Hours1 AND tab.C_OperationCode=t1.C_OperationCode AND tab.C_EmployeeCode=t1.C_EmployeeCode ) );I get the error message-
SQL Erroe (1093): Table 'jobtransactionactuallabour' is specified twice, both as a target for 'DELETE' and as a separate source for the data.
I had thought that as the table was identified with aliases this should not be a problem but this doesn't appear to be the case
Please paste your CREATE TABLE statement so that it's easier to replicate your structure.
Hi Ian. This is the file structure dumped using HeidiSQL. The actual file size is 181.2Mb.
Apologies, I was leading you to syntax that only works in MariaDB 10.3 - MDEV-12137.
Thanks for the info. I'll dowwnload 10.3 and give it a go.
Note that MariaDB 10.3 is still alpha. So you can use it just to convert your data and then re-import it back to your current version, or you can try the alternative version I've added above. I wouldn't suggest using 10.3 in production yet.
Many thanks for the solution Ian. I have tried it using 10.3 and it has worked.
I now have six other major files on the system that have similarly been corrupted with duplicates to work my wat through.
The corruption occured not, I hasten to add, in MariaDB but in a D-ISAM based system which I have converted to MariaDB to solve the problem.
I really appreciate the time and trouble you have spent in helping me solve the issue.