Added idxmax implementation (#25)
* Added idxmax implementation, tests and documentation

* fix error change python version in documentation

* Change implementation, add tests and improve documentation

* fix typo error

* Alternative implementation of idxmax (#33)

Co-authored-by: Jesús López-González <[email protected]>

---------

Co-authored-by: Jesús López-González <[email protected]>
Co-authored-by: Jesús López-González <[email protected]>
Co-authored-by: Jesús López-González <[email protected]>
4 people authored Feb 13, 2024
1 parent 7ef4142 commit 4f59edd
Showing 3 changed files with 102 additions and 1 deletion.
68 changes: 67 additions & 1 deletion docs/user-guide/advanced/Pandas_API.ipynb
@@ -2685,7 +2685,7 @@
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Dictionary | A dictionary where the key represent the column name / row number and the values are the result of calling `max` on that column / row. |"
"| Dictionary | A dictionary where the key represent the column name / row number and the values are the result of calling `max` on that column / row. |"
]
},
{
@@ -2764,6 +2764,72 @@
"tab.idxmin(axis=1, numeric_only=True)"
]
},
{
"cell_type": "markdown",
"id": "d98b298c",
"metadata": {},
"source": [
"### Table.idxmax()\n",
"\n",
"```\n",
"Table.idxmax(axis=0, skipna=True, numeric_only=False)\n",
"```\n",
"\n",
"Return index of first occurrence of maximum over requested axis.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to calculate the idxmax across: 0 is columns, 1 is rows. | 0 |\n",
"| skipna | bool | Ignore any null values along the axis. | True |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Dictionary | A dictionary where the key represents the column name / row number and the values are the result of calling `idxmax` on that column / row. |"
]
},
{
"cell_type": "markdown",
"id": "143f5483",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"Calculate the idxmax across the columns of a table"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da7cbf8f",
"metadata": {},
"outputs": [],
"source": [
"tab.idxmax()"
]
},
{
"cell_type": "markdown",
"id": "fb531e00",
"metadata": {},
"source": [
"Calculate the idxmax across the rows of a table using only columns that are of a numeric data type"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9907226a",
"metadata": {},
"outputs": [],
"source": [
"tab.idxmax(axis=1, numeric_only=True)"
]
},
{
"cell_type": "markdown",
"id": "301ab2c2",
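
The documentation cells above describe `Table.idxmax()` in terms of the pandas method of the same name. As a rough illustration of the semantics being mirrored (first occurrence of the maximum, per column or per row), here is a small pandas-only sketch. The sample data is made up for this note, and `select_dtypes("number")` is used as a stand-in for `numeric_only=True` so the snippet does not depend on a particular pandas version.

```python
import pandas as pd

# Hypothetical sample data, not taken from the commit.
df = pd.DataFrame({
    "sym": ["foo", "bar", "baz", "bar"],
    "price": [1.5, 9.0, 9.0, -2.0],
    "ints": [7, 3, 12, 12],
})

numeric = df.select_dtypes("number")

# axis=0 (default): for each column, the row label of the FIRST maximum.
col_result = numeric.idxmax()

# axis=1: for each row, the name of the column holding the largest value.
row_result = numeric.idxmax(axis=1)

print(col_result.to_dict())   # ties resolve to the earliest row
print(list(row_result))
```

Both `price` and `ints` contain their maximum twice, so the column-wise result shows how ties resolve to the earliest row.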
12 changes: 12 additions & 0 deletions src/pykx/pandas_api/pandas_meta.py
@@ -283,6 +283,18 @@ def min(self, axis=0, skipna=True, numeric_only=False):
            res
        ), cols)

    @convert_result
    def idxmax(self, axis=0, skipna=True, numeric_only=False):
        tab = self
        axis = q('{$[11h~type x; `index`columns?x; x]}', axis)
        res, cols, ix = preparse_computations(tab, axis, skipna, numeric_only)
        return (q(
            '''{[row;tab;axis]
                row:{$[11h~type x; {[x1; y1] $[x1 > y1; x1; y1]} over x; max x]} each row;
                m:$[0~axis; (::); flip] value flip tab;
                $[0~axis; (::); cols tab] m {$[abs type y;x]?y}' row}
            ''', res, tab[ix], axis), cols)

    @convert_result
    def idxmin(self, axis=0, skipna=True, numeric_only=False):
        tab = self
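
The q lambda in `idxmax` appears to work in two steps: take the maximum of each column (or row), then look up the first position at which that maximum occurs, mapping positions to column names when `axis=1`. The following pure-Python sketch restates that idea under simplifying assumptions. The helper names `idxmax_columns` and `idxmax_rows` are hypothetical, the table is modelled as a plain dict of lists, and null handling (`skipna`) is not covered.

```python
def idxmax_columns(table):
    """Hypothetical helper: table maps column name -> list of values.

    Returns a dict of column name -> index of the first maximum.
    """
    result = {}
    for name, values in table.items():
        largest = max(values)                  # step 1: the column's maximum
        result[name] = values.index(largest)   # step 2: first position holding it
    return result


def idxmax_rows(table):
    """Hypothetical helper: for each row, the name of the column with the largest value."""
    names = list(table)
    out = []
    for row in zip(*table.values()):           # transpose columns into rows
        largest = max(row)
        out.append(names[row.index(largest)])  # map position back to a column name
    return out


# Illustrative data only.
table = {"price": [1.5, 9.0, 9.0, -2.0], "ints": [7, 3, 12, 12]}
print(idxmax_columns(table))
print(idxmax_rows(table))
```

Splitting the work into "find the maximum" and "find its first position" keeps tie-breaking simple: the lookup always returns the earliest match.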
23 changes: 23 additions & 0 deletions tests/test_pandas_api.py
@@ -1964,6 +1964,29 @@ def test_pandas_max(q):
        assert float(qmax[i]) == float(pmax[i])


def test_pandas_idxmax(q):
    tab = q('([] sym: 100?`foo`bar`baz`qux; price: 250.0f - 100?500.0f; ints: 100 - 100?200)')
    df = tab.pd()

    p_m = df.idxmax()
    q_m = tab.idxmax()
    for c in q.key(q_m).py():
        assert p_m[c] == q_m[c].py()

    q_m = tab.idxmax(axis=1, numeric_only=True, skipna=True)
    p_m = df.idxmax(axis=1, numeric_only=True, skipna=True)
    for c in q.key(q_m).py():
        assert p_m[c] == q_m[c].py()

    tab = q('([]price: 250.0f - 100?500.0f; ints: 100 - 100?200)')
    df = tab.pd()

    q_m = tab.idxmax(axis=1)
    p_m = df.idxmax(axis=1)
    for c in q.key(q_m).py():
        assert p_m[c] == q_m[c].py()


def test_pandas_idxmin(q):
    tab = q('([] sym: 100?`foo`bar`baz`qux; price: 250.0f - 100?500.0f; ints: 100 - 100?200)')
    df = tab.pd()