DNS native host resolving no longer works with vz VMs since v1.0.0 #2939

Open
dhealio opened this issue Nov 22, 2024 · 18 comments · Fixed by #2964
Labels
bug (Something isn't working), component/dns, component/vz, regression (Used to work but has been broken)

Comments

@dhealio

dhealio commented Nov 22, 2024

Description

When using the vz VM and not using user-v2 networking, the DNS behavior mentioned in the documentation is no longer working as advertised when hostResolver.enabled is true (or unassigned).

This prevents being able to use the native OS resolver, which is needed for VPN configurations, split-DNS setups, as well as mDNS, local /etc/hosts etc.

The breaking change seems to be here, which prevents args.UDPDNSLocalPort etc. from getting assigned.

dhealio changed the title from "DNS hostResolver no longer works with vz VMs since v1.0.0" to "DNS native host resolving no longer works with vz VMs since v1.0.0" on Nov 22, 2024
nirs added the bug, component/vz, and component/dns labels on Nov 23, 2024
@dhealio
Author

dhealio commented Nov 27, 2024

To be more specific, DNS resolving of VPN hostnames on macOS was working on v0.23.2 when starting lima with the default configuration/parameters, but no longer works with v1.0.0 or v1.0.1.

FWIW, I have also tried starting lima with user-v2 networking via limactl start --network=lima:user-v2, but I am still unable to resolve VPN hostnames within lima. This seems to contradict the behavior mentioned here, but perhaps I am doing something wrong.

If this is by design, then how would I go about configuring lima to allow DNS resolving of VPN hostnames? The documentation should be updated to reflect any changes of behavior in this regard.

Otherwise, this is a breaking change that I'm hoping can be addressed soon.

nirs added the regression label on Nov 28, 2024
@nirs
Member

nirs commented Nov 28, 2024

Based on the comment, this is a regression.

@balajiv113
Member

@dhealio
The reason for that condition is that vz uses the gvisor implementation for networking. DNS resolution is handled by gvisor-tap-vsock, which reads /etc/hosts to populate nameservers.

The change in behaviour is due to a breaking change in gvisor-tap-vsock, which was fixed later. It should work fine after upgrading gvisor-tap-vsock. I will check with gvisor-tap-vsock about a release:
containers/gvisor-tap-vsock#398

@dhealio
Author

dhealio commented Dec 3, 2024

Thank you for the quick response!

It's low priority, but the network documentation would ideally be updated to reflect the changes in behavior mentioned by @balajiv113. This may save others from wasting time troubleshooting configuration issues etc.

@dhealio
Author

dhealio commented Dec 3, 2024

I have updated to v1.0.2, and unfortunately I am still seeing the issue (with and without user-v2 networking).

In case it helps, here are some details on my host system:

  • macOS 15.1.1
  • OpenVPN connect client v3.4.9
  • The /etc/resolv.conf file on my host system does not contain any nameserver entries
  • I see a VPN specific resolver when running scutil --dns
  • I am testing with ping, e.g. lima ping host.vpn-domain.com; this returns "Name or service not known"
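The host details above point to a gap between what /etc/resolv.conf lists and what `scutil --dns` reports. A minimal shell sketch for comparing the two views follows. It assumes the usual `scutil --dns` layout, where each resolver block lists its servers as `nameserver[N] : ADDRESS`. Both helper names are made up for illustration and are not part of lima.

```shell
# Hypothetical helper: read `scutil --dns` output on stdin and print each
# unique nameserver address, one per line.
# Assumes lines of the form "  nameserver[0] : 10.8.0.1".
scutil_nameservers() {
    awk '$1 ~ /^nameserver\[[0-9]+\]$/ && $2 == ":" { print $3 }' | sort -u
}

# Hypothetical helper: print the nameserver addresses found in the
# resolv.conf-style file given as $1 (prints nothing if there are none;
# commented-out lines are ignored).
resolv_conf_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1" | sort -u
}

# On a macOS host, the two views could be compared like this:
#   scutil --dns | scutil_nameservers
#   resolv_conf_nameservers /etc/resolv.conf
```

Any address that shows up only in the first list (for example a VPN resolver) is one that a resolver reading only /etc/resolv.conf would never see.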

@subpop
Contributor

subpop commented Dec 3, 2024

Using v1.0.2 as well. When I start up my default guest, this gets printed:

INFO[0010] [hostagent] Not forwarding TCP 127.0.0.53:53

That's the address that ends up in /etc/resolv.conf in the guest. Is that relevant?

@jandubois
Member

127.0.0.53:53 is the address for systemd-resolved, which will be running inside the VM.

You can run resolvectl status inside the VM to see what it is using to resolve names.

@nirs
Member

nirs commented Dec 3, 2024

Should we reopen this issue or open a new one?

jandubois reopened this on Dec 3, 2024
@subpop
Contributor

subpop commented Dec 4, 2024

I've been experimenting with a guest and my hostagent resolver. When the hostagent starts, it successfully starts up the DNS resolver and listens on a random port (in this example, 50563). From my Mac, querying the resolver directly works for both VPN and non-VPN hosts:

dig @127.0.0.1 -p 50563 google.com

I can see the handleQuery received DNS query log entry when this happens. From within a guest, I can also succeed in querying the resolver if I dig at that port.

I modified the cloud-init generated systemd.network file (/run/systemd/network/10-netplan-eth0.network) and set the DNS= entry to 192.168.5.2:50563 (instead of just the IP address) and restarted systemd-networkd. After that, resolvectl query will successfully resolve VPN hosts and non-VPN hosts, and I can see handleQuery log entries.

So it appears that the hostagent resolver is working, but something is misconfigured with the ports so that the resolver is not listening on a port that the guest is querying.
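The manual change described above amounts to appending the resolver port to the matching `DNS=` entry. A minimal sketch of that rewrite follows. The function name `add_dns_port` is hypothetical, and the address and port in the usage line are the ones from this experiment. The port is chosen at random when the hostagent starts, so it would have to be looked up each time. This illustrates the experiment and is not a supported configuration.

```shell
# Hypothetical helper: read a systemd.network file on stdin and append
# ":PORT" to the DNS= entry that exactly matches ADDRESS.
# Usage: add_dns_port ADDRESS PORT < in.network > out.network
add_dns_port() {
    addr=$1
    port=$2
    awk -v addr="$addr" -v port="$port" '
        $0 == ("DNS=" addr) { print $0 ":" port; next }  # bare address: add the port
        { print }                                        # every other line unchanged
    '
}

# Example with the values from the comment above:
#   add_dns_port 192.168.5.2 50563 < 10-netplan-eth0.network
```

Entries that already carry a port do not match the bare address, so running the helper twice leaves the file unchanged. After writing the result back, systemd-networkd still has to be restarted, as in the comment above. The file lives under /run, so the change does not survive a reboot.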

@dhealio
Author

dhealio commented Dec 4, 2024

@subpop From my understanding of the comment made by @balajiv113, the functionality that you're referring to is intentionally no longer wired up, with the gvisor-tap-vsock DNS implementation taking over instead.

It seems as though the gvisor-tap-vsock DNS implementation still lacks functionality compared to the hostagent-based resolver. It doesn't appear to take into account all the resolvers that are available on the host system. It's also possible that it doesn't respect domain-specific resolver rules.

If an upstream fix is required, it may take some time. If this is the case, I'm hoping that lima offers an alternative solution, such as adding a config option that wires up the original hostagent-resolver, or at least rolls back the logic that prevents using it.

@subpop
Contributor

subpop commented Dec 4, 2024

I see. That makes sense in context now. Thank you.

+1 to a template option to use the hostagent resolver. That would be very useful.

@subpop
Contributor

subpop commented Dec 4, 2024

I don't know if it's a true workaround, but I was able to get my VPN hostnames resolving in a guest by creating a guest using:

networks:
- vzNAT: true
vmType: vz

This seems functional enough for my needs.

@dhealio
Author

dhealio commented Dec 4, 2024

This is another place where the documentation could be improved: the vzNAT section could be expanded to explain what it does differently and when to use it.

I can confirm that vzNAT does allow queries on the VM itself to work with common utilities such as ping and dig.

Unfortunately, this workaround is not viable for me, as the resolve behavior is somehow not being picked up by select things that are running on the VM (like k3s). This may require having to add custom configuration there to get things to work.

@subpop
Contributor

subpop commented Dec 4, 2024

In a Fedora guest using systemd-networkd and systemd-resolved, I have not had any issues. On guests running NetworkManager (like CentOS), I had to manually set the ipv4.dns-priority value of the lima0 interface to a number higher than 0 so that it would be listed earlier in the generated /etc/resolv.conf. I did this by adding a connection override in /etc/NetworkManager/conf.d/dnf.conf:

[connection-cloud-init-lima0]
match-device=interface-name:lima0
ipv4.dns-priority=50

@nirs
Member

nirs commented Dec 4, 2024

On guests running NetworkManager (like CentOS)

Does this apply to Ubuntu 24.04?

@subpop
Contributor

subpop commented Dec 4, 2024

On guests running NetworkManager (like CentOS)

Does this apply to Ubuntu 24.04?

I just created a 24.04 guest and it appears to be using systemd-networkd/systemd-resolved like my Fedora guest. As long as it's created with --network vzNAT and has both an eth0 and a lima0 Ethernet interface, both VPN and non-VPN domains resolve correctly. No additional configuration changes are required. I didn't even need to restart any services. I started the guest and then after it was up, connected to VPN. After connecting to VPN, the VPN domains started resolving within the guest after a second or two of uptime.

@dhealio
Author

dhealio commented Dec 12, 2024

Has there been any progress on this?

The previous behavior provided by the host resolver shouldn't be replaced by gvisor-tap-vsock until gvisor-tap-vsock can provide the same level of service.

It's unfortunate to lose functionality. Reverting to v0.23.2 is currently the only option to gain it back.

Every workaround mentioned here introduces new issues and limitations in some form or another. Just to name a few: VPN traffic breaks altogether, k8s implementations running on lima cannot access resolvers on lima's loopback adapter, the priority is wrong when using multiple resolvers, etc.

Could the behavior be reverted (with an option to opt-in to gvisor-tap-vsock)? If not, could an option be added to explicitly use the previous behavior?

tmeijn pushed a commit to tmeijn/dotfiles that referenced this issue Dec 13, 2024
@dhealio
Author

dhealio commented Dec 13, 2024

[...] vz uses the gvisor implementation for networking. DNS resolution is handled by gvisor-tap-vsock, which reads /etc/hosts to populate nameservers.

I think it's worth noting that reading /etc/hosts to populate nameservers is not sufficient in terms of picking up all the nameservers in use by macOS. In fact, my /etc/hosts file contains this notice:

# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.

For me, /etc/hosts only contains nameserver entries for active adapters that are listed in System Settings -> Network. It does not contain nameserver entries that are in use by my VPN connection.


5 participants