Algolia DocSearch results include duplicated words #1002

Open
Shinyaigeek opened this issue Feb 13, 2022 · 6 comments

@Shinyaigeek

Shinyaigeek commented Feb 13, 2022

Hi team 👋

I found an issue on the parceljs website. When you type a word into the search input, the results returned by Algolia DocSearch include duplicate hits. This is not critical, but it is annoying.

At a glance, this seems to be caused by the DocSearch config file. I think listing the entry points that serve the same content in the stop_urls field, so that they are ignored, will fix this.

FYI: https://docsearch.algolia.com/docs/legacy/faq/#why-do-i-have-duplicate-content-in-my-results

However, I do not have access to this config file, so I can neither check it nor fix it 😭. I’m sorry if my intuition is wrong 🙇‍♂️.
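To illustrate the stop_urls idea, here is a minimal Python sketch. It assumes a legacy DocSearch-style config in which each stop_urls entry is treated as a regular expression and any matching URL is skipped. The config values, the example.com URLs, and the `is_crawlable` helper are all hypothetical, since the real Parcel config is not shown in this thread.

```python
import re
from typing import Iterable

# Hypothetical legacy DocSearch-style config (NOT the real Parcel config).
config = {
    "index_name": "example",                      # assumed name
    "start_urls": ["https://example.com/docs/"],  # assumed entry point
    # Assumed to be regex patterns: a URL matching any of them is skipped.
    "stop_urls": [r"/index\.html$"],
}

def is_crawlable(url: str, stop_urls: Iterable[str]) -> bool:
    """Return False when the URL matches any stop pattern."""
    return not any(re.search(pattern, url) for pattern in stop_urls)

# Two hypothetical URLs that serve the same page.
urls = [
    "https://example.com/docs/",
    "https://example.com/docs/index.html",
]
kept = [u for u in urls if is_crawlable(u, config["stop_urls"])]
print(kept)
```

With one of the two equivalent entry points excluded, only a single copy of the page would end up in the index.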

Screenshot 📷

Screenshot 2022-02-14 2 17 47

@kidonng
Contributor

kidonng commented Feb 28, 2022

I posted a similar issue for Deno website as well: denoland/dotland#1879

But in the Parcel website's case, the config should have allowed only slashed (/foo/bar/) or only unslashed (/foo/bar) pathnames, not both.
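As a rough way to spot this kind of duplication, the Python sketch below groups crawled URLs by their path with any trailing slash removed. The URL list and the `group_by_normalized_path` helper are made up for illustration and are not part of DocSearch.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def group_by_normalized_path(urls):
    """Group URLs whose paths differ only by a trailing slash.

    Returns only the groups that contain more than one URL.
    """
    groups = defaultdict(list)
    for url in urls:
        path = urlsplit(url).path
        key = path.rstrip("/") or "/"  # keep the root path as "/"
        groups[key].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}

# Hypothetical crawl result containing a slashed and an unslashed variant.
crawled = [
    "https://example.com/foo/bar/",
    "https://example.com/foo/bar",
    "https://example.com/baz/",
]
duplicates = group_by_normalized_path(crawled)
print(duplicates)
```

Any key that maps to two URLs is a page that would show up twice in the search results.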

@mischnic
Member

@DeMoorJasper any ideas? I don't even know where the config for Algolia lives

@DeMoorJasper
Member

> @DeMoorJasper any ideas? I don't even know where the config for Algolia lives

Yes, I will fix this later today.

@DeMoorJasper
Member

DeMoorJasper commented Feb 28, 2022

The more recent DocSearch docs (https://docsearch.algolia.com/docs/crawler/#why-do-i-have-duplicate-content-in-my-results) mention that we can use canonical links to tell the Algolia crawler which pages are duplicates. For now, I've worked around it by changing the config to exclude all paths that don't end in a slash (at least I think it worked).
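Both ideas can be sketched in Python. This is only an illustration: the `CanonicalFinder` class, the `canonical_url` helper, the `NO_TRAILING_SLASH` pattern, and the example.com page are assumptions, not the actual crawler config or its API.

```python
import re
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attr.get("href")

def canonical_url(html):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Assumed pattern for the workaround: matches URLs that do NOT end in a slash.
NO_TRAILING_SLASH = re.compile(r"[^/]$")

# Hypothetical page declaring its slashed URL as canonical.
page = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/foo/bar/">'
    '</head></html>'
)
print(canonical_url(page))
print(bool(NO_TRAILING_SLASH.search("https://example.com/foo/bar")))
print(bool(NO_TRAILING_SLASH.search("https://example.com/foo/bar/")))
```

A page whose own URL differs from its canonical URL is a duplicate the crawler can skip. The exclusion pattern reaches a similar result without touching the site's HTML.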

@kidonng
Contributor

kidonng commented Feb 28, 2022

@DeMoorJasper Thanks for the info! That would also be useful for the Deno website.

@Shinyaigeek
Author

@DeMoorJasper Thank you for fixing and investigating!
