doc: add pandas code and use markdown tables

Lifemap-ToL · Apr 9, 2024 · commit 44be57a
1 parent 72e8b4a
Showing 1 changed file with 54 additions and 27 deletions.
diff --git a/doc/getting_started.qmd b/doc/getting_started.qmd
@@ -14,11 +14,15 @@ To create a Lifemap data visualization, you will have to follow these steps:
 
 ## Prepare your data
 
-The date you want to visualize on the Lifemap tree of life must be in a [pandas](https://pandas.pydata.org) or [polars](https://pola.rs) DataFrame. They must contain at least observations (species) as rows, and variables as columns, and at least one column must contain the NCBI taxonomy identifier of the species.
+The data you want to visualize on the Lifemap tree of life must be in a [pandas](https://pandas.pydata.org) or [polars](https://pola.rs) DataFrame. They must contain observations (species) as rows, and variables as columns, and one column must contain the NCBI taxonomy identifier of the species.
 
-`pylifemap` includes an example polars data file generated from [The IUCN Red List of Threatened Species](https://www.gbif.org/dataset/19491596-35ae-4a91-9a98-85cf505f1bd3). It is a CSV file with the Red List category (in 2022) of more than 84000 species.
+`pylifemap` includes an example dataset generated from [The IUCN Red List of Threatened Species](https://www.gbif.org/dataset/19491596-35ae-4a91-9a98-85cf505f1bd3). It is a CSV file with the Red List category (in 2022) of more than 84000 species.
 
-We can import it as a polars DataFrame with the following code:
+We can import it as a polars or pandas DataFrame with the following code:
+
+::: {.panel-tabset}
+
+## Polars
 
 ```{python}
 import polars as pl
@@ -28,7 +32,20 @@ iucn = pl.read_csv(
 )
 ```
 
-If we display the resulting table, we can see that it only has two columns, one called `taxid` which contains the species identifiers, and another called `status` with the Red List category of each species:
+## Pandas
+
+```{python}
+#| eval: false
+import pandas as pd
+
+iucn = pd.read_csv(
+    "https://raw.githubusercontent.com/juba/pylifemap/main/data/iucn.csv"
+)
+```
+
+:::
+
+The resulting table only has two columns: `taxid`, which contains the species identifiers, and `status`, with the Red List category of each species.
 
 ```{python}
 iucn
@@ -59,24 +76,19 @@ Lifemap(iucn, taxid_col="taxid", width="100%", height=800)
 
 After initializing our `Lifemap` object, we have to add visualization layers to create graphical representations. There are several different layers available:
 
-<table class="table">
-<thead>
-<tr><th>Layer</th><th>Description</th></tr>
-</thead>
-<tbody>
-<tr><td>[layer_points](layers/layer_points.qmd)</td><td>Displays each observation with a point. Radius and color can be dependent of an  attribute in the DataFrame.</td></tr>
-<tr><td>[layer_lines](layers/layer_lines.qmd)</td><td>Using aggregated data, highlights branches of the tree by lines of varying width and color.</td></tr>
-<tr><td>[layer_donuts](layers/layer_donuts.qmd)</td><td>Displays aggregated categorical data as donut charts.</td></tr>
-<tr><td>[layer_heatmap](layers/layer_heatmap.qmd)</td><td>Displays a heatmap of the observations distribution in the tree.</td></tr>
-<tr><td>[layer_screengrid](layers/layer_screengrid.qmd)</td><td>Displays the observations distribution with a colored grid with fixed-size cells..</td></tr>
-</tbody>
-</table>
+| Layer                                           | Description                                                                                                  |
+| :---------------------------------------------- | :----------------------------------------------------------------------------------------------------------- |
+| [layer_points](layers/layer_points.qmd)         | Displays each observation with a point. Radius and color can be dependent of an  attribute in the DataFrame. |
+| [layer_lines](layers/layer_lines.qmd)           | Using aggregated data, highlights branches of the tree with lines of varying width and color.                |
+| [layer_donuts](layers/layer_donuts.qmd)         | Displays aggregated categorical data as donut charts.                                                        |
+| [layer_heatmap](layers/layer_heatmap.qmd)       | Displays a heatmap of the observations distribution in the tree.                                             |
+| [layer_screengrid](layers/layer_screengrid.qmd) | Displays the observations distribution with a colored grid with fixed-size cells..                           |
 
 To add a layer, we just have to call the corresponding `layer_` method of our `Lifemap` object. For example, to add a points layer:
 
 ```{python}
 #| eval: false
-éLifemap(iucn, taxid_col="taxid").layer_points()
+Lifemap(iucn, taxid_col="taxid").layer_points()
 ```
 
 We can add several layers by calling several methods. For example we could display a heatmap layer, and a points layer above it:
@@ -110,6 +122,7 @@ Lifemap(iucn, taxid_col="taxid").layer_points().save("lifemap.html")
 
 Each layer accepts a certain number of arguments to customize its appearance. For example we can change the radius and opacity of our points and make their color depend on their `status` value:
 
+
 ```{python}
 (
     Lifemap(iucn, taxid_col="taxid")
@@ -124,23 +137,37 @@ Each layer accepts a certain number of arguments to customize its appearance. Fo
 
 `pylifemap` provides several aggregation functions that allow to aggregate data along the branches of the tree:
 
-<table class="table">
-<thead>
-<tr><th>Function</th><th>Description</th></tr>
-</thead>
-<tbody>
-<tr><td>[aggregate_count](`~pylifemap.aggregations.aggregate_count`)</td><td>Aggregates the number of children of each tree node.</td></tr>
-<tr><td>[aggregate_num](`~pylifemap.aggregations.aggregate_num`)</td><td>Aggregates a numerical variable along the tree branches with a given function (sum, mean, max...).</td></tr>
-<tr><td>[aggregate_freq](`~pylifemap.aggregations.aggregate_freq`)</td><td>Aggregates the frequencies of the levels of a categorical variable.</td></tr>
-</tbody>
-</table>
+
+
+| Function                                                     | Description                                                                                         |
+| :----------------------------------------------------------- | :-------------------------------------------------------------------------------------------------- |
+| [aggregate_count](`~pylifemap.aggregations.aggregate_count`) | Aggregates the number of children of each tree node.                                                |
+| [aggregate_num](`~pylifemap.aggregations.aggregate_num`)     | Aggregates a numerical variable along the tree branches with a given function (sum , mean, max...). |
+| [aggregate_freq](`~pylifemap.aggregations.aggregate_freq`)   | Aggregates the frequencies of the levels of a categorical variable.                                 |
+
+
 
 For example, we could filter out in our data set the species which have an "extinct" status:
 
+
+::: {.panel-tabset}
+
+## Polars
+
 ```{python}
 iucn_extinct = iucn.filter(pl.col("status") == "Extinct")
 ```
 
+## Pandas
+
+```{python}
+#| eval: false
+iucn_extinct = iucn[iucn["status"] == "Extinct"]
+```
+
+:::
+
+
 We can then aggregate their count along the branches with [aggregate_count](`~pylifemap.aggregations.aggregate_count`):
 
 ```{python}