forked from PDAL/PDAL
GeoParquet and Arrow IPC read/write support (PDAL#4115)
* remove dead codepath
* fix initialization order
* implement writers.arrow for feather and parquet support
* use type_fwd provided from arrow
* add ORC output support too
* fix doc warnings
* add license
* add readers.arrow scaffolding
* readers.arrow implementation. fix writers.arrow to write dimensions in correct order
* parquet read support
* make sure to init m_formatType
* retab dependabot?
* geoparquet output
* configure CI to run arrow builds
* feather/parquet GeoParquet-style metadata reading
* missing file
* report read failure error information
* NOMINMAX for WIN32
* need NOMINMAX for tests too
* typo'd target names
* fix up geoparquet projjson output
* fix parquet reader
* remove extraneous Close()
* bump ci
* WIP
* arrow and parquet batch writing
* wip
* support pdal::Geometry creation from WKB
* set XYZ from GeoParquet wkb if it is there
* write XYZ as StructArray for GeoArrow compatibility
* warning nit
* GeoArrow support
* write arrow schema
* set 4326 for empty crs for geoparquet
* oops
* cleanups and docs
* add writers.arrow.write_pipeline_metadata to support writing final table metadata into the ARROW schema for the GeoArrow struct
Showing 28 changed files with 2,381 additions and 12 deletions.
@@ -0,0 +1,7 @@
#
# Arrow configuration.
#

find_package(Arrow REQUIRED)
find_package(Parquet REQUIRED)

@@ -0,0 +1,32 @@
.. _readers.arrow:

readers.arrow
==============

.. plugin::

.. streamable::

The Arrow reader reads Arrow- and Parquet-formatted data as written by
:ref:`writers.arrow`. It should also read point clouds written by other
writers, provided they follow either the `GeoArrow <https://github.com/geoarrow/geoarrow/>`__
or `GeoParquet <https://github.com/opengeospatial/geoparquet/>`__ specification.

Caveats:

* The schema to read is chosen by the file name extension, but this can be
  overridden by setting the ``format`` option to ``geoarrow`` or ``geoparquet``.

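Example
-------

A minimal pipeline sketch for this reader. The file names and the LAS output
stage are illustrative assumptions; the ``format`` entry is only needed when
the file name extension does not identify the schema.

.. code-block:: json

  [
      {
          "type":"readers.arrow",
          "filename":"inputfile.parquet",
          "format":"geoparquet"
      },
      {
          "type":"writers.las",
          "filename":"outputfile.las"
      }
  ]
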
Options
-------

filename
  GeoArrow or GeoParquet file to read. [Required]

format
  ``geoarrow`` or ``geoparquet``. Overrides the data type hinted by the file
  name extension. [Optional]

.. include:: reader_opts.rst

@@ -0,0 +1,73 @@
.. _writers.arrow:

writers.arrow
===============

The **Arrow Writer** supports writing to `Apache Arrow`_ `Feather`_
and `Parquet`_ file types.

.. plugin::

.. streamable::

Example
-------

.. code-block:: json

  [
      {
          "type":"readers.las",
          "filename":"inputfile.las"
      },
      {
          "type":"writers.arrow",
          "format":"feather",
          "filename":"outputfile.feather"
      }
  ]

.. code-block:: json

  [
      {
          "type":"readers.las",
          "filename":"inputfile.las"
      },
      {
          "type":"writers.arrow",
          "format":"parquet",
          "geoparquet":"true",
          "filename":"outputfile.parquet"
      }
  ]

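
A further sketch that combines the remaining writer options: it writes Parquet
in smaller batches, names the GeoArrow struct explicitly, and embeds the
pipeline metadata. The batch size, the struct name ``geometry``, and the file
names are illustrative assumptions, not recommended values.

.. code-block:: json

  [
      {
          "type":"readers.las",
          "filename":"inputfile.las"
      },
      {
          "type":"writers.arrow",
          "format":"parquet",
          "batch_size":65536,
          "geoarrow_dimension_name":"geometry",
          "write_pipeline_metadata":"true",
          "filename":"outputfile.parquet"
      }
  ]
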
Options
-------

batch_size
  Number of rows to write as a batch. [Default: 65536*64]

filename
  Output file to write. [Required]

format
  File type to write (``feather`` or ``parquet``). [Default: "feather"]

geoarrow_dimension_name
  Dimension name under which to write the GeoArrow struct. [Default: xyz]

geoparquet
  Write a WKB column and GeoParquet metadata when writing Parquet output.

write_pipeline_metadata
  Write PDAL pipeline metadata into ``PDAL:pipeline:metadata`` of the
  GeoArrow struct named by ``geoarrow_dimension_name``.

.. include:: writer_opts.rst

.. _Apache Arrow: https://arrow.apache.org/
.. _Feather: https://arrow.apache.org/docs/python/feather.html
.. _Parquet: https://arrow.apache.org/docs/cpp/parquet.html
