Commit

wip: switch to generic ServiceUpdater class
There was nothing left after drying up the Feature/Table subclasses.
stdavis committed Aug 8, 2024
1 parent 53648da commit d288884
Showing 6 changed files with 260 additions and 458 deletions.
1 change: 1 addition & 0 deletions .vscode/settings.json
@@ -8,6 +8,7 @@
"astype",
"authed",
"auths",
"caplog",
"casefolded",
"castable",
"ceildiv",
4 changes: 2 additions & 2 deletions README.md
@@ -43,7 +43,7 @@ The `arcgis` library does all the heavy lifting for spatial data. If the `arcpy`

#: Truncate the existing feature service data and load the new data
gis = arcgis.gis.GIS('my_agol_org_url', 'username', 'super-duper-secure-password')
-updates = palletjack.load.FeatureServiceUpdater.truncate_and_load(
+updates = palletjack.load.ServiceUpdater.truncate_and_load(
gis, 'feature_service_item_id', cleaned_df, r'c:\directory\to\save\truncated\data\in\case\of\error'
)
```
@@ -59,7 +59,7 @@ The `arcgis` library does all the heavy lifting for spatial data. If the `arcpy`

### Troubleshooting Weird Append Errors

-If a `FeatureLayer.append()` call (within a load.FeatureServiceUpdater method) fails with an "Unknown Error: 500" error or something like that, you can query the results to get more info. The `urllib3` debug-level logs will include the HTTP GET or POST call, something like the following:
+If a `FeatureLayer.append()` call (within a load.ServiceUpdater method) fails with an "Unknown Error: 500" error or something like that, you can query the results to get more info. The `urllib3` debug-level logs will include the HTTP GET or POST call, something like the following:
`https://services1.arcgis.com:443 POST /<unique string>/arcgis/rest/services/<feature layer name>/FeatureServer/<layer id>/append/jobs/<job guid>?f=json token=<crazy long token string>`. The default `basicConfig` logger includes the `urllib3` logs (`logging.basicConfig(level=logging.DEBUG)`) and is great for development debugging, or you can add a specific logger for `urllib3` in your code and set its level to debug.
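
The dedicated-logger option described above could look roughly like the following. This is a minimal sketch that uses only the standard `logging` module; the helper name `enable_urllib3_debug_logging` and the message format are illustrative assumptions, not part of palletjack.

```python
import logging
import sys


def enable_urllib3_debug_logging(stream=sys.stderr):
    """Send DEBUG-level `urllib3` records to `stream` (hypothetical helper)."""
    #: Child loggers such as `urllib3.connectionpool` inherit this level
    #: and propagate their records up to this logger's handlers.
    logger = logging.getLogger("urllib3")
    logger.setLevel(logging.DEBUG)

    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)

    return logger
```

Calling `enable_urllib3_debug_logging()` once at startup leaves the levels of your other loggers untouched, which keeps the output smaller than a global `logging.basicConfig(level=logging.DEBUG)`.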

You can use this and a token from an AGOL tab to build a new job status url. To get the token, log into AGOL in a browser and open a private hosted feature layer item. Click the layer, and then open the developer console. With the Network tab of the console open, click on the "View" link for the service URL. You should see a document in the list whose name includes "?token=<really long token string>". Copy the name and then copy out the token string.
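
As a rough sketch, the pieces of the logged call and the copied token could be assembled into a job status URL like this. The function name and its parameters are hypothetical, and joining `f=json` and `token` with `&` is an assumption based on ordinary query-string syntax (the log line above shows them separated by a space).

```python
from urllib.parse import quote, urlencode


def build_append_job_status_url(host, unique_string, layer_name, layer_id, job_guid, token):
    """Assemble an append job status URL from the logged path parts (hypothetical helper)."""
    #: Path segments follow the order shown in the urllib3 log line
    segments = [
        unique_string,
        "arcgis",
        "rest",
        "services",
        layer_name,
        "FeatureServer",
        layer_id,
        "append",
        "jobs",
        job_guid,
    ]
    path = "/".join(quote(str(segment), safe="") for segment in segments)

    #: Assumption: standard query-string encoding for the two parameters
    query = urlencode({"f": "json", "token": token})

    return f"https://{host}/{path}?{query}"
```

Every argument is a placeholder you fill in from your own log output and browser session; none of the values are supplied by palletjack.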
6 changes: 3 additions & 3 deletions docs/README.md
@@ -14,7 +14,7 @@ Classes in `extract` handle the extract stage, pulling data in from external sou

There are a handful of classes in `transform` with methods for cleaning and preparing your dataframes for upload to AGOL. You may also need to modify your data to fit your specific business needs: calculating fields, renaming fields, performing quality checks, etc. Some classes have only static methods, which can be called directly without needing to instantiate the class.

-Once your dataframe is looking pretty, the `load` module will help you update a hosted feature service with your new data. The `FeatureServiceUpdater` class contains several class methods that handle the instantiation process for you, allowing you to make a single method call. The other classes in `load` require you to instantiate the class yourself.
+Once your dataframe is looking pretty, the `load` module will help you update a hosted feature service with your new data. The `ServiceUpdater` class contains several class methods that handle the instantiation process for you, allowing you to make a single method call. The other classes in `load` require you to instantiate the class yourself.

While many parts of the classes' functionality are hidden in private methods, commonly-used code is exposed publicly in the `utils` module. You will probably not need any of the methods provided, but they may be useful for other projects. This is palletjack's junk drawer.

@@ -42,7 +42,7 @@ Because the upload process uses geojsons, you **MUST** project your dataframe to

### OBJECTID and Join Keys

-If you want to update existing data without truncating and loading, you will need a join key between the incoming new data and the existing AGOL data. Do not use OBJECTID for this field; it may change at any time. Instead, use your own custom field that you have complete control over. You will perform the join manually in the transform step with pandas by loading the live AGOL data into a dataframe, joining the new data into the live data, and then passing the resulting dataframe to `palletjack.load.FeatureServiceUpdater.update_features`. This method uses the live data's OBJECTID to apply the edits to the proper rows.
+If you want to update existing data without truncating and loading, you will need a join key between the incoming new data and the existing AGOL data. Do not use OBJECTID for this field; it may change at any time. Instead, use your own custom field that you have complete control over. You will perform the join manually in the transform step with pandas by loading the live AGOL data into a dataframe, joining the new data into the live data, and then passing the resulting dataframe to `palletjack.load.ServiceUpdater.update`. This method uses the live data's OBJECTID to apply the edits to the proper rows.

## Error handling

@@ -83,7 +83,7 @@ The largest change is that the namespace has been refactored to match the ETL st

As a corollary to this, clients now import each module rather than palletjack exposing the classes directly. The recommended import is `from palletjack import extract, transform, load, utils` (omitting unused modules as necessary).

-Version 3 also introduces the use of class methods to take care of object instantiation for the client. These are used the most in `palletjack.load.FeatureServiceUpdater`, where the client just calls the relevant methods.
+Version 3 also introduces the use of class methods to take care of object instantiation for the client. These are used the most in `palletjack.load.ServiceUpdater`, where the client just calls the relevant methods.

### One Step at a Time

4 changes: 1 addition & 3 deletions docs/examples.py
@@ -107,9 +107,7 @@ def download_from_sftp_update_agol_reclassify_map():
update_df = transform.FeatureServiceMerging.update_live_data_with_new_data(live_df, dataframe, join_key_column)

#: Update the AGOL data
-number_of_rows_updated = load.FeatureServiceUpdater.update_features(
-gis, feature_layer_itemid, update_df, update_geometry=False
-)
+number_of_rows_updated = load.ServiceUpdater.update(gis, feature_layer_itemid, update_df, update_geometry=False)

#: Reclassify the break values on the webmap's color ramp
reclassifier = load.ColorRampReclassifier(webmap_item, gis)