lee19840806/clickhouse2pandas

clickhouse2pandas

Select ClickHouse data, convert to pandas dataframes and various other formats, by using the ClickHouse HTTP interface.

Features

  • Transmitted data is compressed by default, which reduces network traffic and therefore download time.
  • Comes with a dynamic download label that shows how much data has been downloaded.
  • Converts the ClickHouse query result into the proper pandas data types, e.g., ClickHouse DateTime -> pandas datetime64.
  • Minimal dependencies: 5 Python standard library modules (urllib, http, gzip, json, time) and 1 external library (pandas).
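
To make the compression feature concrete, here is a minimal sketch of how a compressed request to the ClickHouse HTTP interface could be built with the standard library alone. It is not this package's actual code: `build_request` and `decode_body` are hypothetical helper names, and passing credentials through the `X-ClickHouse-User` and `X-ClickHouse-Key` headers is one option the HTTP interface accepts, chosen here for illustration.

```python
import gzip
from urllib.parse import urlencode, urlsplit
from urllib.request import Request


def build_request(connection_url, query, settings=None):
    """Build a POST request that asks ClickHouse for a gzip-compressed reply (hypothetical helper)."""
    parts = urlsplit(connection_url)
    # enable_http_compression=1 lets the server compress the response body.
    params = {"enable_http_compression": 1}
    params.update(settings or {})
    host = parts.hostname
    if parts.port:
        host = f"{host}:{parts.port}"
    url = f"{parts.scheme}://{host}/?{urlencode(params)}"
    # The server only compresses when the client advertises support for it.
    headers = {"Accept-Encoding": "gzip"}
    if parts.username:
        headers["X-ClickHouse-User"] = parts.username
    if parts.password:
        headers["X-ClickHouse-Key"] = parts.password
    return Request(url, data=query.encode("utf-8"), headers=headers, method="POST")


def decode_body(raw, content_encoding):
    """Decompress a response body if the server marked it as gzip (hypothetical helper)."""
    if content_encoding == "gzip":
        return gzip.decompress(raw).decode("utf-8")
    return raw.decode("utf-8")
```

The request object can then be passed to `urllib.request.urlopen`, and the `Content-Encoding` response header decides whether `decode_body` needs to decompress.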

Installation

pip install clickhouse2pandas

Usage

import clickhouse2pandas as ch2pd

connection_url = 'http://user:password@clickhouse_host:8123'

query = 'select * from system.numbers limit 1000000'

df = ch2pd.select(connection_url, query)
# df is a pandas DataFrame built from the ClickHouse query result
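
The type conversion behind the DataFrame result can be pictured roughly as follows. This is a sketch under assumptions, not the package's implementation: it assumes the data arrives in ClickHouse's JSONCompact format (which carries column names and types in a `meta` section), and both the `CLICKHOUSE_TO_PANDAS` table and the `parse_json_compact` helper are hypothetical names covering only a few types.

```python
import json

# Hypothetical, partial mapping from ClickHouse type names to pandas dtype strings.
CLICKHOUSE_TO_PANDAS = {
    "UInt8": "uint8", "UInt16": "uint16", "UInt32": "uint32", "UInt64": "uint64",
    "Int8": "int8", "Int16": "int16", "Int32": "int32", "Int64": "int64",
    "Float32": "float32", "Float64": "float64",
    "Date": "datetime64[ns]", "DateTime": "datetime64[ns]",
    "String": "object",
}


def parse_json_compact(text):
    """Split a JSONCompact payload into column names, target dtypes, and rows."""
    payload = json.loads(text)
    names = [col["name"] for col in payload["meta"]]
    # Types missing from the mapping fall back to the generic object dtype.
    dtypes = {
        col["name"]: CLICKHOUSE_TO_PANDAS.get(col["type"], "object")
        for col in payload["meta"]
    }
    return names, dtypes, payload["data"]
```

With pandas installed, the three return values could be combined along the lines of `pd.DataFrame(rows, columns=names).astype(dtypes)`.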

API Reference

clickhouse2pandas.select(connection_url, query = None, convert_to = 'DataFrame', settings = None)

Return the query result in the format specified by the "convert_to" parameter.

Parameters:

  • connection_url: the connection URL of the ClickHouse HTTP interface, e.g., http://user:password@clickhouse_host:8123
  • query: the SQL query; it should start with 'select'
  • convert_to: the format to convert the query result into (default 'DataFrame'). One of: 'DataFrame', 'TabSeparated', 'TabSeparatedRaw', 'TabSeparatedWithNames', 'TabSeparatedWithNamesAndTypes', 'CSV', 'CSVWithNames', 'Values', 'Vertical', 'JSON', 'JSONCompact', 'JSONEachRow', 'TSKV', 'Pretty', 'PrettyCompact', 'PrettyCompactMonoBlock', 'PrettyNoEscapes', 'PrettySpace', 'XML'. Refer to ClickHouse Input and Output Formats.
  • settings: a dict of ClickHouse setting key-value pairs. The default settings are {'enable_http_compression': 1, 'send_progress_in_http_headers': 0, 'log_queries': 1, 'connect_timeout': 10, 'receive_timeout': 300, 'send_timeout': 300, 'output_format_json_quote_64bit_integers': 0, 'wait_end_of_query': 0}. Refer to ClickHouse Settings.
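
As an illustration of the last two parameters, the sketch below builds a settings dict from the documented defaults and shows, in comments, how it might be passed together with a non-default output format. The `with_overrides` helper is hypothetical, and whether `select` merges a partial dict with its defaults or replaces them is not stated above, so the sketch passes a complete dict to be safe.

```python
# Defaults copied from the "settings" parameter description above.
DEFAULT_SETTINGS = {
    "enable_http_compression": 1,
    "send_progress_in_http_headers": 0,
    "log_queries": 1,
    "connect_timeout": 10,
    "receive_timeout": 300,
    "send_timeout": 300,
    "output_format_json_quote_64bit_integers": 0,
    "wait_end_of_query": 0,
}


def with_overrides(overrides=None):
    """Return a copy of the defaults with the given keys replaced (hypothetical helper)."""
    merged = dict(DEFAULT_SETTINGS)
    merged.update(overrides or {})
    return merged


# Allow a slow query up to 10 minutes to stream its result.
settings = with_overrides({"receive_timeout": 600})

# The call itself needs a reachable ClickHouse server, so it is left commented out:
# import clickhouse2pandas as ch2pd
# result = ch2pd.select(
#     "http://user:password@clickhouse_host:8123",
#     "select * from system.numbers limit 10",
#     convert_to="CSVWithNames",
#     settings=settings,
# )
```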
