add record count meta table #56
Comments
Erwin B showed me how to write the function, but pg_fs doesn't seem to like the regclass type. https://www.hillcrestgeo.ca/fwapg/functions/record_count/items.json I don't want to work around the security features of the above function, and a record count is only needed for a few tables whose counts generally do not change... a meta table seems easier to implement for now?
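For reference, a record_count function along those lines might look roughly like the sketch below. This is only an illustration based on the description above, not the actual function behind that URL; the parameter name and PL/pgSQL body are assumptions.

```sql
-- Sketch of a generic row-count function that takes a regclass argument,
-- which is the argument type pg_fs reportedly does not handle well.
CREATE OR REPLACE FUNCTION record_count(tbl regclass)
RETURNS bigint
LANGUAGE plpgsql AS
$$
DECLARE
    n bigint;
BEGIN
    -- regclass renders as a safely quoted, schema-qualified name in format()
    EXECUTE format('SELECT count(*) FROM %s', tbl) INTO n;
    RETURN n;
END;
$$;
```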
Also, getting the record count takes forever - much better to have this cached somewhere:
(PG 13, Ubuntu 20.04, 4 CPU, 8 GB RAM)
@smnorris Is it faster to get the feature count via Python and WFS? If so, I wonder why that is so much faster?
I don't have WFS set up on my db - my comment refers to getting a WFS feature count from DataBC, which I use for paging their collections. That request is pretty much instant, but it is (presumably) a GeoServer / Oracle back end.
Hmm, I'm not sure why counting the records is so slow on that db; I may want to adjust something in the config. A similar request on my local db takes <1 s.
Could it be something to do with the indexing, or the table statistics? Or perhaps the parallel workers allocated, or how warm the DB instance is? Doing an EXPLAIN (ANALYZE) of the count on both databases might show where the difference is.
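A minimal sketch of that comparison, using a placeholder table name (my_schema.my_table stands in for the real collection table):

```sql
-- Run on each instance and compare plans (e.g. index-only scan vs. seq scan)
-- and timing; my_schema.my_table is a placeholder name.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM my_schema.my_table;

-- If the plans differ, refreshing statistics and the visibility map can help,
-- since index-only scans depend on an up-to-date visibility map.
VACUUM (ANALYZE) my_schema.my_table;
```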
Maybe? The indexes should be the same, and autovacuum is on in both dbs. The query to the Ubuntu db does not speed up on repeated calls. Ubuntu 20.04 / PG 13.4:
macOS, PG 13.3 (Homebrew):
But this is getting off into the weeds. For paging, an estimate (maybe rounded up to the request limit) might be adequate?
Looks like the macOS query is using an index-only scan, and the Ubuntu instance is not. If the indexes are indeed defined in the same way, I'm not sure why that would be. Something about the table stats perhaps? Agreed that finding a faster way to get an approximate count would be nice. Perhaps using the planner's row estimate from the table statistics would work?
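For example, a statistics-based estimate might look like the sketch below (placeholder table name; reltuples is only as current as the last ANALYZE or autovacuum run):

```sql
-- Approximate row count from planner statistics instead of count(*).
-- my_schema.my_table is a placeholder collection table.
SELECT c.reltuples::bigint AS estimated_count
FROM pg_class c
WHERE c.oid = 'my_schema.my_table'::regclass;
```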
Well, they should be defined the same way, but that doesn't mean they are. The index being used has a different name in one db, but both are basic btree indexes and were named by the system. I might try rebuilding both at some point. To solve the fwapgr requirements in the short term, I could just create an FWA meta table and populate it manually (see the sketch below). The data rarely changes... and I don't want to get into the habit of serving huge collections either. Perhaps pg_fs handles this better, but paging through large DataBC WFS requests gets unreliable after a few 100k features.
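A minimal sketch of such a meta table, assuming it is refreshed manually or on a schedule; the table and column names here are illustrative, not part of fwapg:

```sql
-- Hypothetical cached record counts, refreshed manually or by a scheduled job.
CREATE TABLE IF NOT EXISTS fwa_record_counts (
    table_name text PRIMARY KEY,
    n_records  bigint NOT NULL,
    counted_at timestamptz NOT NULL DEFAULT now()
);

-- Example refresh for one collection (placeholder table name).
INSERT INTO fwa_record_counts (table_name, n_records)
SELECT 'my_schema.my_table', count(*) FROM my_schema.my_table
ON CONFLICT (table_name)
DO UPDATE SET n_records = EXCLUDED.n_records, counted_at = now();
```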
Would it be helpful to have some way in the request API to indicate that the total number of records matching a query should be calculated and returned? E.g. perhaps a (non-standard) query parameter? And how should the query result size be returned? As a special JSON document, or in the standard numberMatched property of the response?
Unreliable how? Missing records? The unreliability might be a function of how Oracle implements offsetting; Postgres may or may not be better in this respect. It is hard to see why a human would want to wade through a large number of pages, but I guess automated systems might do data extracts that way and thus require full reliability? I'm starting to think about implementing a chunk of the CQL filter spec, so maybe that will provide a better alternative for reducing the size of query results.
WFS reliability: I have not tried to debug it - the issue might just be that my script doesn't handle network interruptions. Files are generally available for these larger collections, and they are far faster to download and load to Postgres than requesting GeoJSON feature by feature.
API questions:
For pg_featureserv:
poissonconsulting/fwapgr#47