If I understand correctly, there may be performance considerations when using mappers such as `scale_to_z_score_per_key`, depending on whether `key_vocabulary_filename` is set or left unset. The documentation makes it sound like the choice is simply a matter of whether the keys fit into memory. Please correct me if I am wrong, but it seems that if you leave `key_vocabulary_filename=None`, the lookups are done in memory via `map_per_key_reductions`, which can be very inefficient when there are more than a handful of keys. Setting `key_vocabulary_filename`, on the other hand, creates a `StaticHashTable`, and lookups are much more efficient.
If my understanding is correct, it would be good to note this in the docs so that other people can decide what is best for their use case.
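To make the concern concrete, here is a minimal pure-Python sketch of the two lookup strategies as I understand them. It does not use TensorFlow Transform and is not its implementation; the function names (`per_key_stats`, `z_score_by_scan`, `z_score_by_table`) and the sample data are hypothetical. The first strategy compares every input key against every known key, so the work grows with batch size times number of keys. The second does one hash lookup per input.

```python
from statistics import mean, pstdev


def per_key_stats(values, keys):
    """Group values by key and return {key: (mean, population std)}."""
    grouped = {}
    for value, key in zip(values, keys):
        grouped.setdefault(key, []).append(value)
    return {k: (mean(vs), pstdev(vs)) for k, vs in grouped.items()}


def z_score_by_scan(values, keys, stat_keys, stat_means, stat_stds):
    """Assumed analogue of in-memory matching: O(batch * num_keys)."""
    out = []
    for value, key in zip(values, keys):
        m, s = 0.0, 0.0
        # Every input key is compared against every known key.
        for known, known_mean, known_std in zip(stat_keys, stat_means, stat_stds):
            if known == key:
                m, s = known_mean, known_std
        out.append((value - m) / s if s else 0.0)
    return out


def z_score_by_table(values, keys, table):
    """Assumed analogue of a hash-table lookup: O(batch)."""
    out = []
    for value, key in zip(values, keys):
        m, s = table.get(key, (0.0, 0.0))  # one hash lookup per input
        out.append((value - m) / s if s else 0.0)
    return out


# Hypothetical sample data: two keys with two values each.
values = [1.0, 3.0, 10.0, 20.0]
keys = ["a", "a", "b", "b"]

stats = per_key_stats(values, keys)
stat_keys = list(stats)
stat_means = [stats[k][0] for k in stat_keys]
stat_stds = [stats[k][1] for k in stat_keys]

scan_result = z_score_by_scan(values, keys, stat_keys, stat_means, stat_stds)
table_result = z_score_by_table(values, keys, stats)
```

Both functions return the same scaled values; only the cost of finding each key's statistics differs, which is the trade-off I think the docs should mention.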