Clarify performance considerations for mappers like scale_to_z_score_per_key and key_vocabulary_filename #251

cyc · 2021-10-26T16:52:28Z

If I understand correctly, there are performance considerations when using mappers such as scale_to_z_score_per_key, depending on whether key_vocabulary_filename is set or left unset. The documentation makes it sound like the choice is simply a matter of whether the keys fit into memory. Please correct me if I am wrong, but it seems that if you leave key_vocabulary_filename=None, the lookups are done in memory via map_per_key_reductions, which can be very inefficient when there are more than a handful of keys. Setting key_vocabulary_filename, on the other hand, creates a StaticHashTable, and lookups are much more efficient.

If my understanding is correct, it would be good to note this in the docs so that other people can decide what is best for their use case.
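To make the concern concrete, here is a toy sketch in plain Python. It is not tf.Transform code and makes no claim about how map_per_key_reductions is actually implemented; the function names (per_key_stats, scale_with_scan, scale_with_table) are hypothetical, and returning 0.0 for a key with zero variance is an assumption made only for this example. It contrasts a per-element scan over all keys, which is the behavior I suspect, with a hash-table lookup. Both give the same result, but the scan costs time proportional to the number of keys for every element.

```python
import math
from collections import defaultdict


def per_key_stats(values, keys):
    """Return {key: (mean, stddev)} using the population variance."""
    grouped = defaultdict(list)
    for value, key in zip(values, keys):
        grouped[key].append(value)
    stats = {}
    for key, vs in grouped.items():
        mean = sum(vs) / len(vs)
        variance = sum((v - mean) ** 2 for v in vs) / len(vs)
        stats[key] = (mean, math.sqrt(variance))
    return stats


def z_score(value, mean, std):
    # Assumption for this toy only: a key with zero variance maps to 0.0.
    return (value - mean) / std if std > 0 else 0.0


def scale_with_scan(values, keys, stats):
    """Find each element's key by scanning every known key: O(num_keys) each."""
    key_list = list(stats)
    stat_list = [stats[k] for k in key_list]
    scaled = []
    for value, key in zip(values, keys):
        index = next(i for i, candidate in enumerate(key_list) if candidate == key)
        scaled.append(z_score(value, *stat_list[index]))
    return scaled


def scale_with_table(values, keys, stats):
    """Find each element's key with a hash lookup: O(1) on average."""
    return [z_score(value, *stats[key]) for value, key in zip(values, keys)]


values = [1.0, 3.0, 10.0, 30.0, 5.0]
keys = ["a", "a", "b", "b", "c"]

stats = per_key_stats(values, keys)
scanned = scale_with_scan(values, keys, stats)
looked_up = scale_with_table(values, keys, stats)
print(scanned == looked_up)  # both strategies agree on the scaled values
```

With three keys the difference is invisible, but with many thousands of keys the scan-style lookup does proportionally more work per element, which is the trade-off I think the docs should mention.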


UsharaniPagadala self-assigned this Nov 2, 2021

UsharaniPagadala added the type:feature label Nov 8, 2021

UsharaniPagadala assigned zoyahav and unassigned UsharaniPagadala Nov 8, 2021

UsharaniPagadala added the stat:awaiting tensorflower label Nov 8, 2021

zoyahav assigned iindyk Nov 8, 2021
