[feat] Add HKV which is based on merlin HierarchicalKV #356

Closed
wants to merge 6 commits into from
2 changes: 1 addition & 1 deletion .github/workflows/make_wheel_Linux_x86.sh
@@ -13,7 +13,7 @@ fi

# if tensorflow version >= 2.6.0 and <= 2.11.9
if [[ "$TF_VERSION" =~ ^2\.([6-9]|10|11)\.[0-9]$ ]] ; then
-export BUILD_IMAGE="tfra/nosla-cuda11.2.1-cudnn8-ubuntu20.04-manylinux2014-python$PY_VERSION"
+export BUILD_IMAGE="tfra/nosla-cuda11.2.2-cudnn8-ubuntu20.04-manylinux2014-python$PY_VERSION"
export TF_CUDA_VERSION="11.2"
export TF_CUDNN_VERSION="8.1"
elif [ $TF_VERSION == "2.4.1" ] ; then
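The version gate in this hunk can be exercised outside the build script. Below is a minimal Python sketch, assuming Python's `re` module treats this pattern the same way Bash's `=~` does (the pattern uses only constructs common to both). The function name `uses_cuda_11_2_image` is hypothetical and not part of the repository. Note that the pattern accepts only a single-digit patch number, which matches the `<= 2.11.9` bound in the script comment.

```python
import re

# Same pattern as the Bash test in make_wheel_Linux_x86.sh.
TF_2_6_TO_2_11 = re.compile(r"^2\.([6-9]|10|11)\.[0-9]$")


def uses_cuda_11_2_image(tf_version: str) -> bool:
    """Return True when tf_version falls in the 2.6.x to 2.11.x branch."""
    return TF_2_6_TO_2_11.match(tf_version) is not None


if __name__ == "__main__":
    # Print which sample versions would take the CUDA 11.2 branch.
    for version in ["2.6.0", "2.8.3", "2.11.9", "2.4.1", "2.12.0", "2.11.10"]:
        print(version, uses_cuda_11_2_image(version))
```

A two-digit patch release such as `2.11.10` falls through this branch, so it would need its own case in the script if such a version were ever targeted.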
8 changes: 8 additions & 0 deletions WORKSPACE
@@ -54,6 +54,14 @@ http_archive(
url = "https://github.com/sewenew/redis-plus-plus/archive/refs/tags/1.2.3.zip",
)

http_archive(
name = "hkv",
build_file = "//build_deps/toolchains/hkv:hkv.BUILD",
sha256 = "3839f91b703b401fd6d2449c034662b6f8d6563e5b9b71b4c25b217cf1cd63fd",
strip_prefix = "HierarchicalKV-0.1.0-beta.8",
url = "https://github.com/NVIDIA-Merlin/HierarchicalKV/archive/refs/tags/v0.1.0-beta.8.tar.gz",
)

tf_configure(
name = "local_config_tf",
)
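The `http_archive` rule for `hkv` pins a `sha256` digest, which Bazel compares against the downloaded archive. For anyone who wants to run the same kind of check on a manually downloaded tarball, here is a minimal sketch using only the standard-library `hashlib`. The helper names `sha256_of` and `sha256_matches` are hypothetical, and the local file path is whatever you saved the archive as.

```python
import hashlib
from pathlib import Path

# Digest pinned for the hkv archive in WORKSPACE (copied from the diff above).
HKV_SHA256 = "3839f91b703b401fd6d2449c034662b6f8d6563e5b9b71b4c25b217cf1cd63fd"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path: Path, expected: str = HKV_SHA256) -> bool:
    """Return True when the file's digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

Calling `sha256_matches(Path("v0.1.0-beta.8.tar.gz"))` on a local copy would then report whether it matches the pinned digest.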
2 changes: 0 additions & 2 deletions build_deps/toolchains/gpu/cuda_configure.bzl
@@ -68,8 +68,6 @@ _DEFAULT_CUDA_COMPUTE_CAPABILITIES = {

_DEFAULT_CUDA_COMPUTE_CAPABILITIES.update(
{"11.{}".format(v): [
-"6.0",
-"6.1",
"7.0",
"7.5",
"8.0",
Empty file added build_deps/toolchains/hkv/BUILD
Empty file.
18 changes: 18 additions & 0 deletions build_deps/toolchains/hkv/hkv.BUILD
@@ -0,0 +1,18 @@
load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda", "if_cuda_is_configured")

package(default_visibility = ["//visibility:public"])

cc_library(
name = "hkv",
hdrs = glob([
"include/merlin/core_kernels/*.cuh",
"include/merlin/*.cuh",
"include/*.cuh",
"include/*.hpp",
]),
copts = [
"-Ofast",
],
include_prefix = "include",
includes = ["include"],
)
6 changes: 6 additions & 0 deletions docs/api_docs/tfra/dynamic_embedding.md
@@ -47,6 +47,12 @@ Export dynamic_embedding APIs.

[`class ModelMode`](../tfra/dynamic_embedding/ModelMode.md): The global config of model modes.

[`class HkvHashTable`](../tfra/dynamic_embedding/HkvHashTable.md): A generic mutable hash table implementation.

[`class HkvHashTableConfig`](../tfra/dynamic_embedding/HkvHashTableConfig.md): Configures the `init_capacity`, `max_capacity`, and `max_hbm_for_values` of an `HkvHashTable`.

[`class HkvHashTableCreator`](../tfra/dynamic_embedding/HkvHashTableCreator.md): A generic KV table creator.

[`class RedisTable`](../tfra/dynamic_embedding/RedisTable.md): A generic mutable hash table implementation.

[`class RedisTableConfig`](../tfra/dynamic_embedding/RedisTableConfig.md): RedisTableConfig config json file for connecting Redis service and
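The `CuckooHashTable.md` change in this PR states that, for the GPU table, HKV's `init_capacity` and `max_capacity` are both set to the caller's `init_size`, that `max_hbm_for_values` is set to a sufficiently large number, and that compute capability 8.0 or above is required. The plain-Python sketch below only illustrates that mapping. It does not call TFRA or HKV; `HkvSettings`, `settings_from_init_size`, and `gpu_is_supported` are hypothetical names, and the "large" value is an arbitrary placeholder because the PR text does not say which number is used.

```python
from dataclasses import dataclass

# Placeholder only: the PR does not state the actual "sufficiently large" value.
ASSUMED_LARGE_HBM_VALUE = 1 << 40

# Stated HKV requirement in the PR docs: compute capability 8.0 or above.
MIN_COMPUTE_CAPABILITY = (8, 0)


@dataclass(frozen=True)
class HkvSettings:
    """Hypothetical holder for the three settings named in the docs."""
    init_capacity: int
    max_capacity: int
    max_hbm_for_values: int

    @property
    def can_grow(self) -> bool:
        # The table could only resize if max_capacity exceeded init_capacity.
        return self.max_capacity > self.init_capacity


def settings_from_init_size(init_size: int) -> HkvSettings:
    """Map a CuckooHashTable init_size to the settings described in the PR."""
    if init_size <= 0:
        raise ValueError("init_size must be positive")
    return HkvSettings(init_size, init_size, ASSUMED_LARGE_HBM_VALUE)


def gpu_is_supported(compute_capability: str) -> bool:
    """Compare a 'major.minor' capability string against the 8.0 minimum."""
    major, minor = (int(part) for part in compute_capability.split("."))
    return (major, minor) >= MIN_COMPUTE_CAPABILITY
```

Because both capacities are equal, `can_grow` is always false for settings built this way, which is the "will not automatically resize" behavior the docs describe.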
7 changes: 6 additions & 1 deletion docs/api_docs/tfra/dynamic_embedding/CuckooHashTable.md
@@ -53,7 +53,8 @@ remove method. It does not support initialization via the init method.
```python
table = tfra.dynamic_embedding.CuckooHashTable(key_dtype=tf.string,
value_dtype=tf.int64,
-default_value=-1)
+default_value=-1,
+device=['/GPU:0'])
sess.run(table.insert(keys, values))
out = table.lookup(query_keys)
print(out.eval())
@@ -106,6 +107,10 @@ A `CuckooHashTable` object.
* <b>`ValueError`</b>: If checkpoint is True and no name was specified.


## <b>`Important update!!`</b>

The underlying implementation of `CuckooHashTable` has changed. The CPU table is unchanged, but the GPU table now uses the HKV implementation instead of nvhash. To keep the interface consistent, both `init_capacity` and `max_capacity` of HKV are set to the `init_size` value you pass in. As a result, the GPU hash table does not resize automatically, and its final capacity equals `init_size`. The `max_hbm_for_values` parameter of HKV is set to a sufficiently large number so that all of your data is stored in the GPU table. HKV also requires a GPU with compute capability 8.0 or above. For more details, refer to the HKV documentation.


## Properties

@@ -41,13 +41,10 @@ class for creating the real KV table backend(TF resource).

#### Example usage:


`CuckooHashTableConfig` has no required settings because the parameter defaults suffice, so setting only the `saver` parameter is enough.

```python
-redis_config1=tfra.dynamic_embedding.RedisTableConfig(
-redis_config_abs_dir="xx/yy.json"
-)
-redis_creator1=tfra.dynamic_embedding.RedisTableCreator(redis_config1)
+cuckoo_creator=tfra.dynamic_embedding.CuckooHashTableCreator(saver=de.FileSystemSaver())
```

<h2 id="__init__"><code>__init__</code></h2>