Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: "automagic" builder is broken for some Python versions #9562

Closed
humitos opened this issue Aug 30, 2022 · 7 comments
Closed

Build: "automagic" builder is broken for some Python versions #9562

humitos opened this issue Aug 30, 2022 · 7 comments
Assignees
Labels
Accepted Accepted issue on our roadmap Improvement Minor improvement to code

Comments

@humitos
Copy link
Member

humitos commented Aug 30, 2022

Read the Docs executes some code in the build process on behalf the user to facilitate the on boarding process. This means that you can just put some .rst or .md files in your repository and Read the Docs will create a nice documentation site for you without requiring any type of configuration.

This is a nice feature for new users without too much knowledge on Sphinx or MkDocs. However, it has bitten us and also users due that sometimes there are unexpected errors that are hard to debug due to a disparity between Read the Docs and a local build for the same exact commit.

That said, this "automagic" builder is currently broken 😞 . If you follow the example that I mentioned (just put some .rst files in your repository without a conf.py file), Read the Docs will create one for you. Well, that conf.py generated by Read the Docs contains a problematic line:

from recommonmark.parser import CommonMarkParser

that fails with the following error:

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/envs/no-conf-py/lib/python3.9/site-packages/sphinx/config.py", line 347, in eval_config_file
    exec(code, namespace)
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/checkouts/no-conf-py/docs/conf.py", line 7, in <module>
    from recommonmark.parser import CommonMarkParser
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/envs/no-conf-py/lib/python3.9/site-packages/recommonmark/parser.py", line 9, in <module>
    from commonmark import Parser
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/envs/no-conf-py/lib/python3.9/site-packages/commonmark/__init__.py", line 4, in <module>
    from commonmark.main import commonmark
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/envs/no-conf-py/lib/python3.9/site-packages/commonmark/main.py", line 14, in <module>
    from commonmark.blocks import Parser
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/envs/no-conf-py/lib/python3.9/site-packages/commonmark/blocks.py", line 5, in <module>
    from commonmark import common
  File "/home/docs/checkouts/readthedocs.org/user_builds/test-builds/envs/no-conf-py/lib/python3.9/site-packages/commonmark/common.py", line 14, in <module>
    HTMLunescape = html.parser.HTMLParser().unescape
AttributeError: 'HTMLParser' object has no attribute 'unescape'

Note that recommonmark has been deprecated since May 2021 (readthedocs/commonmark.py#308) and its repository got archived. So, I don't think this issue will be fixed.

Moving forward

There are different alternatives that we can follow here to workaround and/or fix the problem:

0. Force installing commonmark==0.9.1

1. Replace recommonmark by MyST in the conf.py.tmpl

  • this solution would allow us to keep the functionality around but using better and supported tools
  • we should check compatibility with old versions of Sphinx (does MyST work with Sphinx 1.8?)
  • remove recommonmark from Python packages installed and install myst-parser instead
  • this will have an impact on all the builds, even if they don't currently use recommonmark and/or myst-parser

2. Remove the auto-generated conf.py functionality

This is the solution I'd like to discuss a little more. IMO, it seems that not many users are using this feature currently. I took a quick look at the logs (https://onenr.io/0LwG6A5yvQ6) and I found some usage there, but not a ton. It seems the currently is based on the Python version used: it breaks with 3.9 but it works with 3.7.

I parsed the logs manually with grep on util01 and I found that 2543 projects used this feature in the last 90 days. However, I don't know how many of them had successful builds 😄 or if they are spam or not. We could get better data if we spend some more time on this, tho. On Read the Docs for Business, we have 135 projects using this feature in the last 90 days --which is probably where this feature is more useful, if any.

We have been talking about reducing the magic that Read the Docs does on behalf the user, and I think this is a good opportunity to remove this dark magic that produces unexpected errors for users on builds. Besides, we also talked about educating our users suggesting them to use a config file, a requirements file and pin their dependencies properly following our https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html

I think it's good to have a nice and quick on boarding experience. However, I think we should work on making the "the proper and recommended way" (reproducible builds) as part of the on boarding experience, making it simple, nice and quick. Instead of on boarding users in a dirty way (magic things) and then tell them that's not a good way to use our platform.

Note that yesterday I received a support request about this and I thought the user was using recommonmark and I told them to use something newer like MyST. Then the user replied that they are not using recommonmark and they don't know how that got installed. Turned out that we are installing it and configuring it on behalf the user 🙃 . I was so confused.

@humitos humitos added the Needed: design decision A core team decision is required label Aug 30, 2022
humitos added a commit that referenced this issue Aug 30, 2022
Currently, we are installing an older version of `commonmark` that has some
problems with newer versions of Python. Upgrading it to 0.9.1 seems to solve
this problem.

See #9562 for a detailed explanation of the problem
@humitos
Copy link
Member Author

humitos commented Aug 30, 2022

I parsed the logs manually with grep on util01 and I found that 2543 projects used this feature in the last 90 days. However, I don't know how many of them had successful builds smile or if they are spam or not. We could get better data if we spend some more time on this, tho. On Read the Docs for Business, we have 135 projects using this feature in the last 90 days --which is probably where this feature is more useful, if any.

I ended up using New Relic for this, which I think is more trust-able and accurate than my grepping, 😅

Projects in the last 3 months where Read the Docs auto-created the Sphinx/MkDocs configuration file magically:

Note that this includes spam projects and it does not differentiate failed from successful builds. The real number of project effectively using this feature is lower than these ones.

humitos added a commit that referenced this issue Aug 30, 2022
Currently, we are installing an older version of `commonmark` that has some
problems with newer versions of Python. Upgrading it to 0.9.1 seems to solve
this problem.

See #9562 for a detailed explanation of the problem
humitos added a commit that referenced this issue Aug 30, 2022
Currently, we are installing an older version of `commonmark` that has some
problems with newer versions of Python. Upgrading it to 0.9.1 seems to solve
this problem.

See #9562 for a detailed explanation of the problem
humitos added a commit that referenced this issue Sep 1, 2022
Currently, we are installing an older version of `commonmark` that has some
problems with newer versions of Python. Upgrading it to 0.9.1 seems to solve
this problem.

See #9562 for a detailed explanation of the problem
@humitos
Copy link
Member Author

humitos commented Sep 5, 2022

We implemented "0. Force installing commonmark==0.9.1" as a workaround for now in #9563

@agjohnson
Copy link
Contributor

I agree that option number 2 might be the best overall way to address this one. When I've run into builds that used this feature, the project always falls into one of these categories:

  • A project from a new user, who is very lost
  • A project squatting on a namespace
  • Spam project

Obviously, we shouldn't care about squatting or spam projects. For new users though, we now have much better resources, and this automatic content feature doesn't really serve much purpose.

With better educational resources like example projects, the experience of misconfiguring a project would be much nicer if we were explicit about how to proceed. Masking misconfiguration issues by generating output for the user doesn't help them solve their problem, where giving a build failure would almost certainly help the user more.

If the user really just wants to see how Read the Docs works, by not giving us input, we should point them to using an example project first.

@ericholscher
Copy link
Member

ericholscher commented Sep 6, 2022

Yea, 👍 on removing the conf.py generation. I don't think that's meaningfully helping anyone. I'd love to combine this with a nicer onboarding and error messages in this case though, if possible. It would be nice to at least link these users to our tutorial.

@humitos
Copy link
Member Author

humitos commented Sep 7, 2022

OK. I think we are all on board about moving forward with 2. Remove the auto-generated conf.py functionality. I'm marking this issue as "Accepted" and putting it under the "Roadmap" project so we prioritize in the following weeks/months.

It seems the required work would be:

  • use the data from NR combined to our database to know what are the exact projects using this feature (non-spam project with successful builds in last 6 months)
  • follow the not-yet-defined process to deprecate this feature
  • remove the auto-generated conf.py functionality from our codebase
  • make the build error pretty clear about what happened
  • point those users to the tutorial

Feel free to edit this list to add anything that I may be missing here.

@humitos humitos added Improvement Minor improvement to code Accepted Accepted issue on our roadmap and removed Needed: design decision A core team decision is required labels Sep 7, 2022
@humitos humitos moved this to Planned in 📍Roadmap Sep 7, 2022
@humitos
Copy link
Member Author

humitos commented Jun 5, 2023

I hope our situation regarding this issue will improve a lot due to the enforcing of a YAML config file we are doing at https://blog.readthedocs.com/migrate-configuration-v2/. I'm adding this issue to Q4, so we can review the situation at that point after the migrations "is done".

@humitos humitos self-assigned this Jun 30, 2023
humitos added a commit that referenced this issue Jul 4, 2023
I created a new feature flag called `INSTALL_LATEST_CORE_REQUIREMENTS` that does
not depend on any other feature flag. Its goal is to always install the latest
Python packages.

We are trying to move away from pinning the dependencies on our side and telling
users to do that on their side if they want to have reproducible builds over
time.

I think this is the best outcome for new projects since they will immediately
use the latest versions of everything instead of old (and maybe broken)
dependencies.

I propose to set a particular date for this feature flag and make new projects
to start building with all the latest dependencies. This will, at least, stop
making the problem bigger. Later, we can decide how to communicate to old
projects that we are going to install the latest requirements and if they don't
want that, they should pin their dependencies. We can follow our new and shiny
deprecation path we have built and tested in the last month.

Related readthedocs/meta#8
Related #9562
Related #10402
Related #9081
humitos added a commit that referenced this issue Jul 12, 2023
* Build: install all the latest Python "core requirements"

I created a new feature flag called `INSTALL_LATEST_CORE_REQUIREMENTS` that does
not depend on any other feature flag. Its goal is to always install the latest
Python packages.

We are trying to move away from pinning the dependencies on our side and telling
users to do that on their side if they want to have reproducible builds over
time.

I think this is the best outcome for new projects since they will immediately
use the latest versions of everything instead of old (and maybe broken)
dependencies.

I propose to set a particular date for this feature flag and make new projects
to start building with all the latest dependencies. This will, at least, stop
making the problem bigger. Later, we can decide how to communicate to old
projects that we are going to install the latest requirements and if they don't
want that, they should pin their dependencies. We can follow our new and shiny
deprecation path we have built and tested in the last month.

Related readthedocs/meta#8
Related #9562
Related #10402
Related #9081

* Build: remove deprecated dependencies

* Build: install latest core requirements when using Conda

* Docs: update default versions documentation to match the changes

* Feedback from review

Do not install `sphinx-rtd-theme` and clarify the docstring.

* Update documentation
@humitos
Copy link
Member Author

humitos commented Jul 12, 2023

remove the auto-generated conf.py functionality from our codebase
make the build error pretty clear about what happened

This is already happening at #2483 and readthedocs/blog#229

@humitos humitos closed this as completed Jul 12, 2023
@github-project-automation github-project-automation bot moved this from Planned to Done in 📍Roadmap Jul 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Accepted issue on our roadmap Improvement Minor improvement to code
Projects
Archived in project
Development

No branches or pull requests

3 participants