Commit
* Reorganize source around XML and JSON folders

* GDT-54 Create Aardvark transform

Why these changes are being introduced:
* This is the initial structure for the Aardvark transform class. The class will be expanded with new methods in subsequent commits.

How this addresses that need:
* Add jsonlines to Pipfile
* Add fixtures for aardvark and generic JSONLines files
* Update argument type hinting for Transformer and JsonTransformer classes to clarify expected content types
* Update JsonTransformer.parse_source_file method to use jsonlines library
* Add Aardvark class with get_main_titles, get_source_record_id, record_is_deleted (in progress), get_optional_fields (in progress), and get_subjects methods and corresponding unit tests

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/GDT-54

* Refactor unit test to resolve CI error

* Updates based on discussion in PR #108
* Update json_records fixture to aardvark_records for more accurate unit tests
* Rename Aardvark > MITAardvark to unify terminology across repos
* Update get_main_titles method to reflect it is a required field
* Update Aardvark method docstrings to provide greater context
* Add Transformer._transform method to minimize code duplication between JsonTransformer and XmlTransformer methods
* Update _transform method docstring
Showing 30 changed files with 763 additions and 209 deletions.
@@ -0,0 +1 @@
{"id": "123", "dcat_keyword_sm": ["Country"], "dcat_theme_sm": ["Political boundaries"], "dct_spatial_sm": ["Some city, Some country"], "dct_subject_sm": ["Geography", "Earth"], "gbl_resourceClass_sm": ["Dataset"], "gbl_resourceType_sm": ["Vector data"], "dct_title_s": "Test title 1"}
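The unit test for get_subjects in this commit expects one subject per value, with a kind that depends on the Aardvark field the value came from. A minimal sketch of that mapping follows. It is not the project's implementation: `Subject` is a hypothetical stand-in for `timdex.Subject`, `SUBJECT_FIELDS` is an invented name, and the field-to-kind table is inferred only from the expected values in the test (note that `dct_spatial_sm` does not appear among them).

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Subject:
    """Hypothetical stand-in for timdex.Subject."""

    value: list[str]
    kind: str


# Assumption: field-to-kind mapping inferred from the test's expected output.
SUBJECT_FIELDS = {
    "dcat_keyword_sm": "DCAT Keyword",
    "dcat_theme_sm": "DCAT Theme",
    "dct_subject_sm": "Dublin Core Subject",
    "gbl_resourceClass_sm": "Subject scheme not provided",
    "gbl_resourceType_sm": "Subject scheme not provided",
}


def get_subjects(record: dict[str, Any]) -> list[Subject]:
    """Emit one Subject per value, in the order the fields are listed above."""
    subjects = []
    for field, kind in SUBJECT_FIELDS.items():
        # A missing field contributes no subjects.
        for value in record.get(field, []):
            subjects.append(Subject(value=[value], kind=kind))
    return subjects


record = {
    "id": "123",
    "dcat_keyword_sm": ["Country"],
    "dct_spatial_sm": ["Some city, Some country"],
    "dct_subject_sm": ["Geography", "Earth"],
    "dct_title_s": "Test title 1",
}
subjects = get_subjects(record)
```

Keeping the mapping in a single dictionary means a new subject field would need only one new entry, though the real class may organize this differently.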
@@ -0,0 +1,2 @@
{"id": "123", "dct_title_s": "Test title 1"}
{"id": "456", "dct_title_s": "Test title 2"}
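This fixture is a JSON Lines file: one complete JSON object per line. The commit message says JsonTransformer.parse_source_file uses the jsonlines library; the sketch below instead uses only the standard library `json` module so that it stays self-contained. `parse_jsonl` is a hypothetical helper, not the project's method.

```python
import json
from collections.abc import Iterator
from typing import Any


def parse_jsonl(text: str) -> Iterator[dict[str, Any]]:
    """Lazily yield one dict per non-empty line of JSON Lines text."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


fixture = (
    '{"id": "123", "dct_title_s": "Test title 1"}\n'
    '{"id": "456", "dct_title_s": "Test title 2"}\n'
)
records = parse_jsonl(fixture)
first = next(records)   # records are consumed one at a time
second = next(records)
```

Because the parser is a generator, a caller can pull records with `next()`, which is how the unit tests in this commit consume the `aardvark_records` fixture.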
@@ -0,0 +1,43 @@
import transmogrifier.models as timdex
from transmogrifier.sources.json.aardvark import MITAardvark


def test_aardvark_get_required_fields_returns_expected_values(aardvark_records):
    transformer = MITAardvark("cool-repo", aardvark_records)
    assert transformer.get_required_fields(next(aardvark_records)) == {
        "source": "A Cool Repository",
        "source_link": "https://example.com/123",
        "timdex_record_id": "cool-repo:123",
        "title": "Test title 1",
    }


def test_jsontransformer_transform_returns_timdex_record(aardvark_records):
    transformer = MITAardvark("cool-repo", aardvark_records)
    assert next(transformer) == timdex.TimdexRecord(
        source="A Cool Repository",
        source_link="https://example.com/123",
        timdex_record_id="cool-repo:123",
        title="Test title 1",
        citation="Test title 1. Geospatial data. https://example.com/123",
        content_type=["Geospatial data"],
    )


def test_aardvark_get_main_titles_success(aardvark_record_all_fields):
    assert MITAardvark.get_main_titles(aardvark_record_all_fields) == ["Test title 1"]


def test_aardvark_get_source_record_id_success(aardvark_record_all_fields):
    assert MITAardvark.get_source_record_id(aardvark_record_all_fields) == "123"


def test_aardvark_get_subjects_success(aardvark_record_all_fields):
    assert MITAardvark.get_subjects(aardvark_record_all_fields) == [
        timdex.Subject(value=["Country"], kind="DCAT Keyword"),
        timdex.Subject(value=["Political boundaries"], kind="DCAT Theme"),
        timdex.Subject(value=["Geography"], kind="Dublin Core Subject"),
        timdex.Subject(value=["Earth"], kind="Dublin Core Subject"),
        timdex.Subject(value=["Dataset"], kind="Subject scheme not provided"),
        timdex.Subject(value=["Vector data"], kind="Subject scheme not provided"),
    ]
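The required-fields test above implies how the four required values relate to a source record: the TIMDEX record ID joins the source key and the record's `id` with a colon, and the source link appends that `id` to a base URL. The sketch below illustrates only that relationship. The function signature is an assumption: in the test, the display name "A Cool Repository" and the base URL "https://example.com/" are evidently looked up from the "cool-repo" key, and how that lookup works is not shown in this commit view, so they are passed in explicitly here.

```python
from typing import Any


def get_required_fields(
    record: dict[str, Any],
    source: str,           # source key, e.g. "cool-repo"
    source_name: str,      # assumption: display name supplied by the caller
    source_base_url: str,  # assumption: base URL supplied by the caller
) -> dict[str, str]:
    """Build the four required values from one Aardvark-style record."""
    source_record_id = record["id"]
    return {
        "source": source_name,
        "source_link": source_base_url + source_record_id,
        "timdex_record_id": f"{source}:{source_record_id}",
        "title": record["dct_title_s"],
    }


required = get_required_fields(
    {"id": "123", "dct_title_s": "Test title 1"},
    source="cool-repo",
    source_name="A Cool Repository",
    source_base_url="https://example.com/",
)
```

Indexing with `record["id"]` and `record["dct_title_s"]` rather than `.get()` makes a missing required field fail loudly, which matches the commit message's note that the title is a required field.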
tests/test_dspace_dim.py → tests/sources/xml/test_dspace_dim.py (2 changes: 1 addition & 1 deletion)
tests/test_dspace_mets.py → tests/sources/xml/test_dspace_mets.py (2 changes: 1 addition & 1 deletion)
tests/test_springshare.py → tests/sources/xml/test_springshare.py (2 changes: 1 addition & 1 deletion)