diff --git a/README.md b/README.md
index aeec4ef70..b4d372356 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,8 @@ The goal is to offer high-level APIs for developers to quickly get started in wo
- [Scaling transforms from laptop to cluster](#laptop_cluster)
- [Repository Use and Navigation](doc/repo.md)
- [How to Contribute](CONTRIBUTING.md)
-- [Papers and Talks](#talks_papers)
+- [Talks and Papers](#talks_papers)
+- [Citations](#citations)
## 📖 About
@@ -131,7 +132,7 @@ The matrix below shows the the combination of modules and supported runtimes. Al
| **Data Ingestion** | | | | |
| [Code (from zip) to Parquet](transforms/code/code2parquet/python/README.md) | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| [PDF to Parquet](transforms/language/pdf2parquet/python/README.md) | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
-| [HTML to Parquet](transforms/universal/html2parquet/python/README.md) | :white_check_mark: | | | |
+| [HTML to Parquet](transforms/language/html2parquet/python/README.md) | :white_check_mark: | :white_check_mark: | | |
| **Universal (Code & Language)** | | | | |
| [Exact dedup filter](transforms/universal/ededup/ray/README.md) | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| [Fuzzy dedup filter](transforms/universal/fdedup/ray/README.md) | | :white_check_mark: | | :white_check_mark: |
@@ -220,3 +221,23 @@ You can run transforms via docker image or using virtual environments. This [doc
5. Talk on "Hands on session for fine tuning LLMs" [Video](https://www.youtube.com/watch?v=VEHIA3E64DM)
6. Talk on "Build your own data preparation module using data-prep-kit" [Video](https://www.youtube.com/watch?v=0WUMG6HIgMg)
+## Citations
+
+If you use Data Prep Kit in your research, please cite our paper:
+
+```bibtex
+@misc{wood2024dataprepkitgettingdataready,
+ title={Data-Prep-Kit: getting your data ready for LLM application development},
+ author={David Wood and Boris Lublinsky and Alexy Roytman and Shivdeep Singh
+ and Abdulhamid Adebayo and Revital Eres and Mohammad Nassar and Hima Patel
+ and Yousaf Shah and Constantin Adam and Petros Zerfos and Nirmit Desai
+ and Daiki Tsuzuku and Takuya Goto and Michele Dolfi and Saptha Surendran
+ and Paramesvaran Selvam and Sungeun An and Yuan Chi Chang and Dhiraj Joshi
+ and Hajar Emami-Gohari and Xuan-Hong Dang and Yan Koyfman and Shahrokh Daijavad},
+ year={2024},
+ eprint={2409.18164},
+ archivePrefix={arXiv},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/abs/2409.18164},
+}
+```
\ No newline at end of file