Changelog

8.15.4 (2024-10-17)

Revert "Allow reading Elasticsearch certs in Wolfi image" (#734)

8.15.3 (2024-10-09)

Added support for DeBERTa-V2 tokenizer (#717)
Fixed --ca-certs with a shared Elasticsearch Docker volume (#732)

8.15.2 (2024-10-02)

Fixed Docker image build (#728)

8.15.1 (2024-10-01)

Upgraded PyTorch to version 2.3.1, which is compatible with Elasticsearch 8.15.2 or above (#718)
Migrated to distroless Wolfi base Docker image (#720)

8.15.0 (2024-08-12)

Added a default truncation of "second" for text similarity (#713)
Added note about using text_similarity for rerank in the CLI (#716)
Added support for lists in result hits (#707)
Removed input fields from exported LTR models (#708)

8.14.0 (2024-06-10)

Added

Added Elasticsearch Serverless support in DataFrames (#690, contributed by @AshokChoudhary11) and eland_import_hub_model (#698)

Fixed

Fixed Python 3.8 support (#695, contributed by @bartbroere)
Fixed non _source fields missing from the results hits (#693, contributed by @bartbroere)

8.13.1 (2024-05-03)

Added

Added support for HTTP proxies in eland_import_hub_model (#688)

8.13.0 (2024-03-27)

Added

Added support for Python 3.11 (#681)
Added eland.DataFrame.to_json function (#661, contributed by @bartbroere)
Added override option to specify the model's max input size (#674)

Changed

Upgraded torch to 2.1.2 (#671)
Mirrored pandas' lineterminator instead of line_terminator in to_csv (#595, contributed by @bartbroere)

8.12.1 (2024-01-30)

Fixed

Fixed missing value support for XGBRanker (#654)

8.12.0 (2024-01-18)

Added

Added support for XGBRanker models (#649)
Added support for LTR (learning to rank) model config when importing a model (#645, #651)
Added LTR feature logger (#648)
Added prefix_string config option to the import model hub script (#642)
Made online retail analysis notebook runnable in Colab (#641)
Added new movie dataset to the tests (#646)

8.11.1 (2023-11-22)

Added

Made demo notebook runnable in Colab (#630)

Changed

Bumped SHAP version to 0.43 (#636)

Fixed

Fixed failed import of Sentence Transformer RoBERTa models (#637)

8.11.0 (2023-11-08)

Added

Added support for the E5 small multilingual model (#625)

Changed

Changed ed.DataFrame.to_csv() to stream writes (#579)
Improved memory estimation for NLP models (#568)

Fixed

Fixed deprecations in preparation of Pandas 2.0 support (#602, #603, contributed by @bartbroere)

8.10.1 (2023-10-11)

Fixed

Fixed direct usage of TransformerModel (#619)

8.10.0 (2023-10-09)

Added

Published pre-built Docker images to docker.elastic.co/eland/eland (#613)
Allowed importing private HuggingFace models (#608)
Added Apple Silicon (arm64) support to Docker image (#615)
Allowed importing some DPR models like ance-dpr-context-multi (#573)
Allowed using the Pandas API without monitoring/main permissions (#581)

Changed

Updated Docker image to Debian 12 Bookworm (#613)
Reduced Docker image size by not installing unused PyTorch GPU support on amd64 (#615)
Reduced model chunk size to 1MB (#605)
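
To illustrate the chunk size change above, here is a minimal sketch of splitting a serialized model into pieces of at most 1 MB. It is not Eland's implementation: the names iter_chunks and CHUNK_SIZE are hypothetical, and it assumes "1MB" means 1024 * 1024 bytes.

```python
from typing import Iterator

# Assumption: "1MB" is read as 1 MiB (1024 * 1024 bytes).
CHUNK_SIZE = 1024 * 1024


def iter_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of payload, each at most chunk_size bytes long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]


# Stand-in for a serialized model: two full chunks plus a 512-byte remainder.
model_bytes = bytes(2 * CHUNK_SIZE + 512)
chunks = list(iter_chunks(model_bytes))
chunk_lengths = [len(chunk) for chunk in chunks]
```

Only the final chunk can be shorter than the limit, and joining the chunks in order reproduces the original bytes.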

Fixed

Fixed deprecations in preparation of Pandas 2.0 support (#593, #596, contributed by @bartbroere)

8.9.0 (2023-08-24)

Added

Simplified embedding model support and loading (#569)
Made eland_import_hub_model easier to find on Windows (#559)
Updated trained model inference endpoint (#556)
Added BertJapaneseTokenizer support with bert_ja tokenization configuration (#534)
Added ability to upload xlm-roberta tokenized models (#518)
Tolerated different model output formats when measuring embedding size (#535)
Generated valid NLP model id from file path (#541)
Upgraded torch to 1.13.1 and added a check of the cluster version before uploading an NLP model (#522)
Set embedding_size config parameter for Text Embedding models (#532)
Added support for the pass_through task (#526)

Fixed

Fixed black to comply with the code style (#557)
Fixed No module named 'torch' (#553)
Fixed autosummary directive by removing hack autosummaries (#548)
Prevented TypeError with None check (#525)

8.7.0 (2023-03-30)

Added

Added a new NLP model task type "text_similarity" (#486)
Added a new NLP model task type "text_expansion" (#520)
Added support for exporting an Elastic ML model as a scikit-learn pipeline via MLModel.export_model() (#509)

Fixed

Fixed an issue that occurred when LightGBM was installed but libomp wasn't installed on the system. (#499)

8.3.0 (2022-07-11)

Added

Added a new NLP model task type "auto" which infers the task type based on model configuration and architecture (#475)

Changed

Changed required version of 'torch' package to >=1.11.0,<1.12 to match required PyTorch version for Elasticsearch 8.3 (was >=1.9.0,<2) (#479)
Changed the default value of the --task-type parameter for the eland_import_hub_model CLI to be "auto" (#475)

Fixed

Fixed decision tree classifier serialization to account for probabilities (#465)
Fixed PyTorch model quantization (#472)

8.2.0 (2022-05-09)

Added

Added support for passing Cloud ID via --cloud-id to eland_import_hub_model CLI tool (#462)
Added support for authenticating via --es-username, --es-password, and --es-api-key to the eland_import_hub_model CLI tool (#461)
Added support for XGBoost 1.6 (#458)
Added support for question_answering NLP tasks (#457)

8.1.0 (2022-03-31)

Added

Added support for eland.Series.unique() (#448, contributed by @V1NAY8)
Added --ca-certs and --insecure options to eland_import_hub_model for configuring TLS (#441)

8.0.0 (2022-02-10)

Added

Added support for Natural Language Processing (NLP) models using PyTorch (#394)
Added new extra eland[pytorch] for installing all dependencies needed for PyTorch (#394)
Added a CLI script eland_import_hub_model for uploading HuggingFace models to Elasticsearch (#403)
Added support for v8.0 of the Python Elasticsearch client (#415)
Added a warning if Eland detects it's communicating with an incompatible Elasticsearch version (#419)
Added support for number_samples to LightGBM and Scikit-Learn models (#397, contributed by @V1NAY8)
Added ability to use datetime types for filtering dataframes (#284, contributed by @Fju)
Added pandas datetime64 type to use the Elasticsearch date type (#425, contributed by @Ashton-Sidhu)
Added es_verify_mapping_compatibility parameter to disable schema enforcement with pandas_to_eland (#423, contributed by @Ashton-Sidhu)

Changed

Changed to_pandas() to only use Point-in-Time and search_after instead of using Scroll APIs for pagination.
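
The pagination change above replaces a server-side scroll cursor with a "resume after the last sort key" pattern. The following is a minimal in-memory sketch of that pattern only. It does not call Elasticsearch and omits the Point-in-Time snapshot; DOCS, search, and paginate are hypothetical names introduced for illustration.

```python
from typing import Dict, Iterator, List, Optional

# Hypothetical in-memory "index": each document has a unique, sortable id.
DOCS: List[Dict[str, int]] = [{"id": i, "value": i * i} for i in range(1, 8)]


def search(size: int, search_after: Optional[int] = None) -> List[Dict[str, int]]:
    """Stand-in for a sorted search request.

    Returns up to `size` hits, ordered by id, whose id is strictly greater
    than `search_after` (or from the start when it is None).
    """
    ordered = sorted(DOCS, key=lambda doc: doc["id"])
    if search_after is not None:
        ordered = [doc for doc in ordered if doc["id"] > search_after]
    return ordered[:size]


def paginate(size: int) -> Iterator[List[Dict[str, int]]]:
    """Yield batches of hits, resuming each request after the last sort key seen."""
    last_key: Optional[int] = None
    while True:
        hits = search(size, last_key)
        if not hits:
            return
        yield hits
        last_key = hits[-1]["id"]


batches = list(paginate(size=3))
batch_ids = [[hit["id"] for hit in batch] for batch in batches]
```

Because each request carries only the last sort key, the client holds the pagination state itself instead of relying on a cursor kept open on the server.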

7.14.1b1 (2021-08-30)

Added

Added support for DataFrame.iterrows() and DataFrame.itertuples() (#380, contributed by @kxbin)

Performance

Simplified result collectors to increase performance transforming Elasticsearch results to pandas (#378, contributed by @V1NAY8)
Changed search pagination function to yield batches of hits (#379)

7.14.0b1 (2021-08-09)

Added

Added support for Pandas 1.3.x (#362, contributed by @V1NAY8)
Added support for LightGBM 3.x (#362, contributed by @V1NAY8)
Added DataFrame.idxmax() and DataFrame.idxmin() methods (#353, contributed by @V1NAY8)
Added type hints to eland.ndframe and eland.operations (#366, contributed by @V1NAY8)

Removed

Removed support for Pandas <1.2 (#364)
Removed support for Python 3.6 to match Pandas (#364)

Changed

Changed paginated search function to use Point-in-Time and Search After features instead of Scroll when connected to Elasticsearch 7.12+ (#370 and #376, contributed by @V1NAY8)
Optimized the FieldMappings.aggregate_field_name() method (#373, contributed by @V1NAY8)

7.13.0b1 (2021-06-22)

Added

Added DataFrame.quantile(), Series.quantile(), and DataFrameGroupBy.quantile() aggregations (#318 and #356, contributed by @V1NAY8)

Changed

Changed the error raised when es_index_pattern doesn't point to any indices to be more user-friendly (#346)

Fixed

Fixed a warning about conflicting field types when wildcards are used in es_index_pattern (#346)
Fixed sorting when using DataFrame.groupby() with dropna (#322, contributed by @V1NAY8)
Fixed deprecated usage of numpy.int in favor of numpy.int_ (#354, contributed by @V1NAY8)

7.10.1b1 (2021-01-12)

Added

Added support for Pandas 1.2.0 (#336)
Added DataFrame.mode() and Series.mode() aggregation (#323, contributed by @V1NAY8)
Added support for pd.set_option("display.max_rows", None) (#308, contributed by @V1NAY8)
Added Elasticsearch storage usage to df.info() (#321, contributed by @V1NAY8)

Removed

Removed deprecated aliases read_es, read_csv, DataFrame.info_es, and MLModel(overwrite=True) (#331, contributed by @V1NAY8)

7.10.0b1 (2020-10-29)

Added

Added DataFrame.groupby() method with all aggregations (#278, #291, #292, #300, contributed by @V1NAY8)
Added es_match() method to DataFrame and Series for filtering rows with full-text search (#301)
Added support for type hints of the elasticsearch-py package (#295)
Added support for passing dictionaries to es_type_overrides parameter in the pandas_to_eland() function to directly control the field mapping generated in Elasticsearch (#310)
Added es_dtypes property to DataFrame and Series (#285)

Changed

Changed pandas_to_eland() to use the parallel_bulk() helper instead of single-threaded bulk() helper to improve performance (#279, contributed by @V1NAY8)
Changed the es_type_overrides parameter in pandas_to_eland() to raise ValueError if an unknown column is given (#302)
Changed DataFrame.filter() to preserve the order of items (#283, contributed by @V1NAY8)
Changed pandas_to_eland() so that setting es_type_overrides={"column": "text"} automatically adds the column.keyword sub-field, making aggregations available for the field as well (#310)

Fixed

Fixed Series.__repr__ when the series is empty (#306)

7.9.1a1 (2020-09-29)

Added

Added the predict() method and model_type, feature_names, and results_field properties to MLModel (#266)

Deprecated

Deprecated ImportedMLModel in favor of MLModel.import_model(...) (#266)

Changed

Changed DataFrame aggregations to use numeric_only=None instead of numeric_only=True by default. This is the same behavior as Pandas (#270, contributed by @V1NAY8)

Fixed

Fixed DataFrame.agg() to return a Series instead of a DataFrame when given a string instead of a list of aggregations (#263, contributed by @V1NAY8)

7.9.0a1 (2020-08-18)

Added

Added support for Pandas v1.1 (#253)
Added support for LightGBM LGBMRegressor and LGBMClassifier to ImportedMLModel (#247, #252)
Added support for multi:softmax and multi:softprob XGBoost operators to ImportedMLModel (#246)
Added column names to DataFrame.__dir__() for better auto-completion support (#223, contributed by @leonardbinet)
Added support for es_if_exists='append' to pandas_to_eland() (#217)
Added support for aggregating datetimes with nunique and mean (#253)
Added es_compress_model_definition parameter to ImportedMLModel constructor (#220)
Added .size and .ndim properties to DataFrame and Series (#231 and #233)
Added .dtype property to Series (#258)
Added support for using pandas.Series with Series.isin() (#231)
Added type hints to many APIs in DataFrame and Series (#231)

Deprecated

Deprecated the overwrite parameter in favor of es_if_exists in ImportedMLModel constructor (#249, contributed by @V1NAY8)

Changed

Changed aggregations for datetimes to be higher precision when available (#253)

Fixed

Fixed ImportedMLModel.predict() to fail when errors are present in the ingest.simulate response (#220)
Fixed Series.median() aggregation to return a scalar instead of pandas.Series (#253)
Fixed Series.describe() to return a pandas.Series instead of pandas.DataFrame (#258)
Fixed DataFrame.mean() and Series.mean() dtype (#258)
Fixed DataFrame.agg() aggregations when using extended_stats Elasticsearch aggregation (#253)

7.7.0a1 (2020-05-20)

Added

Added the package to Conda Forge; install via conda install -c conda-forge eland (#209)
Added DataFrame.sample() and Series.sample() for querying a random sample of data from the index (#196, contributed by @mesejo)
Added Series.isna() and Series.notna() for filtering out missing, NaN or null values from a column (#210, contributed by @mesejo)
Added DataFrame.filter() and Series.filter() for reducing an axis using a sequence of items or a pattern (#212)
Added DataFrame.to_pandas() and Series.to_pandas() for converting an Eland dataframe or series into a Pandas dataframe or series inline (#208)
Added support for XGBoost v1.0.0 (#200)

Deprecated

Deprecated info_es() in favor of es_info() (#208)
Deprecated eland.read_csv() in favor of eland.csv_to_eland() (#208)
Deprecated eland.read_es() in favor of eland.DataFrame() (#208)

Changed

Changed var and std aggregations to use sample instead of population in line with Pandas (#185)
Changed painless scripts to use source rather than inline to improve script caching performance (#191, contributed by @mesejo)
Changed minimum elasticsearch Python library version to v7.7.0 (#207)
Changed name of Index.field_name to Index.es_field_name (#208)
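
The switch of var and std from population to sample statistics, noted above, changes the divisor from n to n - 1, which matches the pandas default of ddof=1. The sketch below shows the difference using only Python's standard statistics module; the data values are made up for illustration and nothing here calls Eland.

```python
import statistics

# Made-up data: mean is 5.0 and the squared deviations sum to 32.0.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population statistics divide by n.
population_var = statistics.pvariance(values)
population_std = statistics.pstdev(values)

# Sample statistics divide by n - 1 (the pandas default, ddof=1).
sample_var = statistics.variance(values)
sample_std = statistics.stdev(values)
```

For this data the population variance is 32 / 8 and the sample variance is 32 / 7, so the sample figures are slightly larger. The gap shrinks as the number of rows grows.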

Fixed

Fixed DeprecationWarning raised from pandas.Series when an empty series was created without specifying dtype (#188, contributed by @mesejo)
Fixed a bug when filtering columns on complex combinations of and and or (#204)
Fixed an issue where DataFrame.shape would return a larger value than in the index if a sized operation like .head(X) was applied to the data frame (#205, contributed by @mesejo)
Fixed issue where both scikit-learn and xgboost libraries were required to use eland.ml.ImportedMLModel; now only one library is required to use this feature (#206)

7.6.0a5 (2020-04-14)

Added

Added support for Pandas v1.0.0 (#141, contributed by @mesejo)
Added use_pandas_index_for_es_ids parameter to pandas_to_eland() (#154)
Added es_type_overrides parameter to pandas_to_eland() (#181)
Added NDFrame.var(), .std() and .median() aggregations (#175, #176, contributed by @mesejo)
Added DataFrame.es_query() to allow modifying ES queries directly (#156)
Added eland.__version__ (#153, contributed by @mesejo)

Removed

Removed support for Python 3.5 (#150)
Removed eland.Client() interface, use elasticsearch.Elasticsearch() client instead (#166)
Removed all private objects from top-level eland namespace (#170)
Removed geo_points from pandas_to_eland() in favor of es_type_overrides (#181)

Changed

Changed ML model serialization to be slightly smaller (#159)
Changed minimum elasticsearch Python library version to v7.6.0 (#181)

Fixed

Fixed inference_config being required on ML models for ES >=7.8 (#174)
Fixed unpacking for DataFrame.aggregate("median") (#161)

7.6.0a4 (2020-03-23)

Changed

Changed requirement for xgboost from >=0.90 to ==0.90

Fixed

Fixed issue in DataFrame.info() when called on an empty frame (#135)
Fixed issues where many _source fields would generate a too_long_frame error (#135, #137)
