Hi Guys,
The quickest one is to write a query that does that or else write a script that will remove your duplicates.
Lets check on how we can achieve this using a query and introducing a UNIQUE index on our solumn
Say for example I have a table as shown below
And I have data such as the one shown below
(lat, lng columns removed)
id | street_name | city_id | state_id | display_name | last_modified_on |
1434419 | Moorefields Rd | 224 | 2 | Moorefields Road, Sydney NSW 2209, Australia | 1359946138 |
1434420 | Kingsgrove Rd | 701 | 2 | Kingsgrove Road, Clemton Park NSW 2206, Australia | 1359946139 |
1434421 | Dewar St | 13260 | 8 | Dewar Street, Armadale WA 6112, Australia | 1359946140 |
1434423 | Dewar St | 13151 | 8 | Dewar Street, Morley WA 6062, Australia | 1359946142 |
1434424 | Dewar St | 9793 | 4 | Dewar Street, Point Arkwright QLD 4573, Australia | 1359946143 |
1434425 | Dewar St | 13929 | 8 | Dewar Street, Lake Grace WA 6353, Australia | 1359946144 |
1434426 | Dewar St | 681 | 2 | Dewar Street, Campsie NSW 2194, Australia | 1359946145 |
1434428 | Dewar St | 10990 | 4 | Dewar Street, Mission Beach QLD 4852, Australia | 1359946147 |
1434429 | Anglo Rd | 342 | 2 | Anglo Road, Greenwich NSW 2065, Australia | 1359946148 |
1434430 | Anglo Rd | 681 | 2 | Anglo Road, Campsie NSW 2194, Australia | 1359946150 |
1434445 | Anglo Rd | 681 | 2 | Anglo Road, Campsie NSW 2194, Australia | 1359946150 |
1434450 | Anglo Rd | 681 | 2 | Anglo Road, Campsie NSW 2194, Australia | 1359946150 |
As you can see that entry Anglo Road, Campsie NSW 2194, Australia
is repeated three times and data other than my row ID is pretty much the same so how do I get rid of duplicates
Here is a quick query to do that
mySQL query
1 |
ALTER IGNORE TABLE streets_with_city_state_ids ADD UNIQUE KEY myindex(display_name); |
This will remove my duplicates rows. Most important part of above query is using ALTER with IGNORE
More details can be found here
But to wrap things up this here is a quick explanation from above page
IGNORE
is a MySQL extension to standard SQL. It controls howALTER TABLE
works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. IfIGNORE
is not specified, the copy is aborted and rolled back if duplicate-key errors occur. IfIGNORE
is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
I hope this helps
Cheers!
Leave a Reply