-
Notifications
You must be signed in to change notification settings - Fork 9
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
De-duplicate passwords imports (#2920)
Task/Issue URL: https://app.asana.com/0/1202926619870900/1207598052765981/f Tech Design URL: https://app.asana.com/0/1202926619870900/1207676132777813/f **Description**: Duplicated passwords are a frequently-reported challenge in adopting our password manager. Users who import twice or from two different sources may find themselves with hundreds of duplicated entries, making their UI a mess. This change means that, upon import, passwords that are already stored are not duplicated. **Steps to test this PR**: 1. Download latest build from [Builds: Deduplicate passwords on import](https://app.asana.com/0/0/1207719004459281) 2. Reset your autofill data (Debug → Reset Data → Reset Autofill Data) 3. Import login_deduplication_starting_data.csv from [✓ CSV files with test cases: 0.5 days](https://app.asana.com/0/72649045549333/1207597760316576/f) 4. Check your logins to ensure that all 17 entries have been imported 5. Import login_deduplication_test_data.csv from [✓ CSV files with test cases: 0.5 days](https://app.asana.com/0/0/1207597760316576) 6. Check against [✓ Define a duplicate: 0.5 days](https://app.asana.com/0/0/1207598052765977) rules to make sure duplicates have been removed (the title/notes values should help you identify each case against the examples defined in the rules). 7. Import login_import_data_large.csv from [✓ CSV files with test cases: 0.5 days](https://app.asana.com/0/0/1207597760316576) to test large import data sets 8. Import the data from the previous step again to test large import data + large stored data sets 9. This could be better and I'm timeboxing some low-hanging fruit optimisations to see if can be improved while trying to avoid DB schema changes that may require a migration 10. The autofill UI is currently not handling large volumes of data very well even before this project. This is being discussed and tracked in [Logins/Autofill: improve list view performance](https://app.asana.com/0/0/1207725472020618) **Definition of Done**: * [x] Does this PR satisfy our [Definition of Done](https://app.asana.com/0/1202500774821704/1207634633537039/f)? --- ###### Internal references: [Pull Request Review Checklist](https://app.asana.com/0/1202500774821704/1203764234894239/f) [Software Engineering Expectations](https://app.asana.com/0/59792373528535/199064865822552) [Technical Design Template](https://app.asana.com/0/59792373528535/184709971311943) [Pull Request Documentation](https://app.asana.com/0/1202500774821704/1204012835277482/f) Download latest build from [Builds: Deduplicate passwords on import](https://app.asana.com/0/0/1207719004459281) Reset your autofill data (Debug → Reset Data → Reset Autofill Data) Import login_deduplication_starting_data.csv from [✓ CSV files with test cases: 0.5 days](https://app.asana.com/0/72649045549333/1207597760316576/f) Check your logins to ensure that all 17 entries have been imported Import login_deduplication_test_data.csv from [✓ CSV files with test cases: 0.5 days](https://app.asana.com/0/0/1207597760316576) Check against [✓ Define a duplicate: 0.5 days](https://app.asana.com/0/0/1207598052765977) rules to make sure duplicates have been removed (the title/notes values should help you identify each case against the examples defined in the rules). Import login_import_data_large.csv from [✓ CSV files with test cases: 0.5 days](https://app.asana.com/0/0/1207597760316576) to test large import data sets Import the data from the previous step again to test large import data + large stored data sets This could be better and I'm timeboxing some low-hanging fruit optimisations to see if can be improved while trying to avoid DB schema changes that may require a migration The autofill UI is currently not handling large volumes of data very well even before this project. This is being discussed and tracked in [Logins/Autofill: improve list view performance](https://app.asana.com/0/0/1207725472020618)
- Loading branch information
Showing
16 changed files
with
10,418 additions
and
28 deletions.
There are no files selected for viewing
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
93 changes: 93 additions & 0 deletions
93
IntegrationTests/DataImport/CSVImporterIntegrationTests.swift
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
// | ||
// CSVImporterIntegrationTests.swift | ||
// | ||
// Copyright © 2024 DuckDuckGo. All rights reserved. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); | ||
// you may not use this file except in compliance with the License. | ||
// You may obtain a copy of the License at | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
// | ||
|
||
import Foundation | ||
import XCTest | ||
@testable import DuckDuckGo_Privacy_Browser | ||
import BrowserServicesKit | ||
|
||
final class CSVImporterIntegrationTests: XCTestCase { | ||
|
||
override func setUpWithError() throws { | ||
super.setUp() | ||
try clearDB() | ||
executionTimeAllowance = 10 | ||
} | ||
|
||
override func tearDownWithError() throws { | ||
try clearDB() | ||
super.tearDown() | ||
} | ||
|
||
func clearDB() throws { | ||
let vault = try AutofillSecureVaultFactory.makeVault(reporter: SecureVaultReporter.shared) | ||
|
||
let accounts = try vault.accounts() | ||
for accountID in accounts.compactMap(\.id) { | ||
if let accountID = Int64(accountID) { | ||
try vault.deleteWebsiteCredentialsFor(accountId: accountID) | ||
} | ||
} | ||
} | ||
|
||
func _testImportPasswordsPerformance() async throws { | ||
let csvURL = Bundle(for: Self.self).url(forResource: "mock_login_data_large", withExtension: "csv")! | ||
let csvImporter = CSVImporter( | ||
fileURL: csvURL, | ||
loginImporter: SecureVaultLoginImporter(), | ||
defaultColumnPositions: nil | ||
) | ||
let importTask = csvImporter.importData(types: [.passwords]) | ||
|
||
// No baseline set, but should be no more than 0.3 seconds on an M1 Max with 32GB memory | ||
measureMetrics([.wallClockTime], automaticallyStartMeasuring: false) { | ||
let expectation = expectation(description: "Measure finished") | ||
Task { | ||
startMeasuring() | ||
let result = await importTask.result | ||
_ = try result.get() | ||
stopMeasuring() | ||
expectation.fulfill() | ||
} | ||
wait(for: [expectation]) | ||
} | ||
} | ||
|
||
// Deduplication rules: https://app.asana.com/0/0/1207598052765977/f | ||
func _testImportingPasswords_deduplicatesAccordingToDefinedRules() async throws { | ||
let startingDataURL = Bundle(for: Self.self).url(forResource: "login_deduplication_starting_data", withExtension: "csv")! | ||
let startingDataImporter = CSVImporter( | ||
fileURL: startingDataURL, | ||
loginImporter: SecureVaultLoginImporter(), | ||
defaultColumnPositions: nil | ||
) | ||
_ = await startingDataImporter.importData(types: [.passwords]).result | ||
|
||
let testDataURL = Bundle(for: Self.self).url(forResource: "login_deduplication_test_data", withExtension: "csv")! | ||
let testDataImporter = CSVImporter( | ||
fileURL: testDataURL, | ||
loginImporter: SecureVaultLoginImporter(), | ||
defaultColumnPositions: nil | ||
) | ||
let importTask = testDataImporter.importData(types: [.passwords]) | ||
let result = await importTask.result | ||
let summary = try result.get()[.passwords]?.get() | ||
|
||
XCTAssertEqual(summary?.duplicate, 4) | ||
} | ||
} |
18 changes: 18 additions & 0 deletions
18
IntegrationTests/DataImport/login_deduplication_starting_data.csv
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
title,username,password,url,notes | ||
Strict duplicate,isillitoe0,iL8>{M(t,http://lycos.com,"Quisque porta volutpat erat. Quisque erat eros, viverra eget, congue eget, semper rutrum, nulla. Nunc purus. Phasellus in felis. Donec semper sapien a libero. Nam dui. Proin leo odio, porttitor id, consequat in, consequat ut, nulla. Sed accumsan felis." | ||
Strict antithesis,aschermick1,"hS0#,JU>",https://icio.us,"Donec ut dolor. Morbi vel lectus in quam fringilla rhoncus. Mauris enim leo, rhoncus sed, vestibulum sit amet, cursus id, turpis." | ||
Differing usernames,karran3,rP9*0X%Xj<iBOVX~,http://angelfire.com,"Aliquam augue quam, sollicitudin vitae, consectetuer eget, rutrum at, lorem. Integer tincidunt ante vel ipsum. Praesent blandit lacinia erat. Vestibulum sed magna at nunc commodo placerat." | ||
Nil stored username,,rW4|)T2GagA!|NHc,http://blinklist.com,Etiam faucibus cursus urna. Ut tellus. Nulla ut erat id mauris vulputate elementum. Nullam varius. Nulla facilisi. Cras non velit nec nisi vulputate nonummy. Maecenas tincidunt lacus at velit. Vivamus vel nulla eget eros elementum pellentesque. Quisque porta volutpat erat. | ||
Nil imported username,ratlee5,xB0)X*hFWr,http://un.org,Nullam sit amet turpis elementum ligula vehicula consequat. Morbi a ipsum. Integer a nibh. In quis justo. Maecenas rhoncus aliquam lacus. Morbi quis tortor id nulla ultrices aliquet. | ||
Differing passwords,rinkin6,oK8*r5Q%O,https://psu.edu,"Donec posuere metus vitae ipsum. Aliquam non mauris. Morbi non lectus. Aliquam sit amet diam in magna bibendum imperdiet. Nullam orci pede, venenatis non, sodales sed, tincidunt eu, felis. Fusce posuere felis sed lacus. Morbi sem mauris, laoreet ut, rhoncus aliquet, pulvinar sed, nisl." | ||
Nil stored password,ccowan7,,http://canalblog.com,"Nulla mollis molestie lorem. Quisque ut erat. Curabitur gravida nisi at nibh. In hac habitasse platea dictumst. Aliquam augue quam, sollicitudin vitae, consectetuer eget, rutrum at, lorem. Integer tincidunt ante vel ipsum. Praesent blandit lacinia erat. Vestibulum sed magna at nunc commodo placerat. Praesent blandit." | ||
Nil imported password,mmonan8,oT3(a2cW$,http://symantec.com,"Phasellus in felis. Donec semper sapien a libero. Nam dui. Proin leo odio, porttitor id, consequat in, consequat ut, nulla. Sed accumsan felis. Ut at dolor quis odio consequat varius. Integer ac leo. Pellentesque ultrices mattis odio. Donec vitae nisi." | ||
Differing URL,hfranceschi9,iS3_XQ=}k)cvuy/,http://un.org,"Proin eu mi. Nulla ac enim. In tempor, turpis nec euismod scelerisque, quam turpis adipiscing lorem, vitae mattis nibh ligula nec sem. Duis aliquam convallis nunc. Proin at turpis a pede posuere nonummy." | ||
Nil stored URL,hcanepea,"kP3&*y7~26q,Veb",,In hac habitasse platea dictumst. | ||
Nil imported URL,ayanshonokb,rY7}IB6@,https://delicious.com,"Pellentesque at nulla. Suspendisse potenti. Cras in purus eu magna vulputate luctus. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Vivamus vestibulum sagittis sapien. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Etiam vel augue. Vestibulum rutrum rutrum neque. Aenean auctor gravida sem." | ||
Differing notes,nlepardc,hX4'<+Q.&<z,http://indiegogo.com,Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Mauris viverra diam vitae quam. Suspendisse potenti. Nullam porttitor lacus at turpis. Donec posuere metus vitae ipsum. | ||
Nil stored notes,tschalld,iK6%k&n?}OG_d,http://mysql.com, | ||
Nil imported notes,lhowe,gA8)oS53,http://latimes.com,"Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Proin interdum mauris non ligula pellentesque ultrices. Phasellus id sapien in sapien iaculis congue." | ||
Differing titles,sparncuttf,"sU1""v$c<J7<",http://statcounter.com,Curabitur convallis. Duis consequat dui nec nisi volutpat eleifend. Donec ut dolor. | ||
,lsambrookg,nM6)CyX&/hy,https://mapquest.com,Nil stored title | ||
Nil imported title,abrimbleh,iU3*~0DzuU9>0sD,https://gov.uk,Nil imported title |
18 changes: 18 additions & 0 deletions
18
IntegrationTests/DataImport/login_deduplication_test_data.csv
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
title,username,password,url,notes,,,, | ||
Strict duplicate,isillitoe0,iL8>{M(t,http://lycos.com,"Quisque porta volutpat erat. Quisque erat eros, viverra eget, congue eget, semper rutrum, nulla. Nunc purus. Phasellus in felis. Donec semper sapien a libero. Nam dui. Proin leo odio, porttitor id, consequat in, consequat ut, nulla. Sed accumsan felis.",,,, | ||
Complete antithesis,abrimbleh,iU3*~0DzuU9>0sD,https://icio.us,"Quisque erat eros, viverra eget, congue eget, semper rutrum, nulla. Nunc purus. Phasellus in felis. Donec semper sapien a libero. Nam dui. Proin leo odio, porttitor id, consequat in, consequat ut, nulla. Sed accumsan felis. Ut at dolor quis odio consequat varius. Integer ac leo. Pellentesque ultrices mattis odio.Hatity",caizic2,yN1{Yv%nkJ=,http://aboutads.info/diam/nam/tristique/tortor.png?ac=sapien&nulla=a,"Duis consequat dui nec nisi volutpat eleifend. Donec ut dolor. Morbi vel lectus in quam fringilla rhoncus. Mauris enim leo, rhoncus sed, vestibulum sit amet, cursus id, turpis. Integer aliquet, massa id lobortis convallis, tortor risus dapibus augue, vel accumsan tellus nisi eu orci. Mauris lacinia sapien quis libero. Nullam sit amet turpis elementum ligula vehicula consequat." | ||
Differing usernames,wblindmann1,rP9*0X%Xj<iBOVX~,http://angelfire.com,"Aliquam augue quam, sollicitudin vitae, consectetuer eget, rutrum at, lorem. Integer tincidunt ante vel ipsum. Praesent blandit lacinia erat. Vestibulum sed magna at nunc commodo placerat.",,,, | ||
Nil stored username,dprescott4,rW4|)T2GagA!|NHc,http://blinklist.com,Etiam faucibus cursus urna. Ut tellus. Nulla ut erat id mauris vulputate elementum. Nullam varius. Nulla facilisi. Cras non velit nec nisi vulputate nonummy. Maecenas tincidunt lacus at velit. Vivamus vel nulla eget eros elementum pellentesque. Quisque porta volutpat erat.,,,, | ||
Nil imported username,,xB0)X*hFWr,http://un.org,Nullam sit amet turpis elementum ligula vehicula consequat. Morbi a ipsum. Integer a nibh. In quis justo. Maecenas rhoncus aliquam lacus. Morbi quis tortor id nulla ultrices aliquet.,,,, | ||
Differing passwords,rinkin6,yT7/=or7WhIX<,https://psu.edu,"Donec posuere metus vitae ipsum. Aliquam non mauris. Morbi non lectus. Aliquam sit amet diam in magna bibendum imperdiet. Nullam orci pede, venenatis non, sodales sed, tincidunt eu, felis. Fusce posuere felis sed lacus. Morbi sem mauris, laoreet ut, rhoncus aliquet, pulvinar sed, nisl.",,,, | ||
Nil stored password,ccowan7,"qZ0""2*S?uO/k{",http://canalblog.com,"Nulla mollis molestie lorem. Quisque ut erat. Curabitur gravida nisi at nibh. In hac habitasse platea dictumst. Aliquam augue quam, sollicitudin vitae, consectetuer eget, rutrum at, lorem. Integer tincidunt ante vel ipsum. Praesent blandit lacinia erat. Vestibulum sed magna at nunc commodo placerat. Praesent blandit.",,,, | ||
Nil imported password,mmonan8,,http://symantec.com,"Phasellus in felis. Donec semper sapien a libero. Nam dui. Proin leo odio, porttitor id, consequat in, consequat ut, nulla. Sed accumsan felis. Ut at dolor quis odio consequat varius. Integer ac leo. Pellentesque ultrices mattis odio. Donec vitae nisi.",,,, | ||
Differing URL,hfranceschi9,iS3_XQ=}k)cvuy/,https://sogou.com,"Proin eu mi. Nulla ac enim. In tempor, turpis nec euismod scelerisque, quam turpis adipiscing lorem, vitae mattis nibh ligula nec sem. Duis aliquam convallis nunc. Proin at turpis a pede posuere nonummy.",,,, | ||
Nil stored URL,hcanepea,"kP3&*y7~26q,Veb",https://icq.com,In hac habitasse platea dictumst.,,,, | ||
Nil imported URL,ayanshonokb,rY7}IB6@,,"Pellentesque at nulla. Suspendisse potenti. Cras in purus eu magna vulputate luctus. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Vivamus vestibulum sagittis sapien. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Etiam vel augue. Vestibulum rutrum rutrum neque. Aenean auctor gravida sem.",,,, | ||
Differing notes,nlepardc,hX4'<+Q.&<z,http://indiegogo.com,"Integer aliquet, massa id lobortis convallis, tortor risus dapibus augue, vel accumsan tellus nisi eu orci. Mauris lacinia sapien quis libero. Nullam sit amet turpis elementum ligula vehicula consequat. Morbi a ipsum.",,,, | ||
Nil stored notes,tschalld,iK6%k&n?}OG_d,http://mysql.com,"Nullam varius. Nulla facilisi. Cras non velit nec nisi vulputate nonummy. Maecenas tincidunt lacus at velit. Vivamus vel nulla eget eros elementum pellentesque. Quisque porta volutpat erat. Quisque erat eros, viverra eget, congue eget, semper rutrum, nulla. Nunc purus.",,,, | ||
Nil imported notes,lhowe,gA8)oS53,http://latimes.com,,,,, | ||
Varying titles,sparncuttf,"sU1""v$c<J7<",http://statcounter.com,Curabitur convallis. Duis consequat dui nec nisi volutpat eleifend. Donec ut dolor.,,,, | ||
Nil stored title,lsambrookg,nM6)CyX&/hy,https://mapquest.com,Nil stored title,,,, | ||
,abrimbleh,iU3*~0DzuU9>0sD,https://gov.uk,Nil imported title,,,, |
Oops, something went wrong.