Document how mergerfs re-mount interacts with NFS #1221

Open
nohajc opened this issue Aug 4, 2023 · 19 comments

Comments

nohajc commented Aug 4, 2023

Hi, this isn't really a bug, but it's behavior I haven't seen mentioned in the documentation.

I noticed that each time a mergerfs pool is re-mounted (to upgrade or apply new configuration), my NFS shares stop working until I re-export them.

Specifically, Kodi hangs when trying to load files from a mergerfs-backed NFS share (already configured file source).
When I try to mount the same NFS share on a different machine (for the first time), it also hangs.

Running exportfs -ar after every re-mount solves this problem.
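
For reference, a minimal sketch of that workaround, assuming the kernel NFS server with shares defined in /etc/exports:

```sh
# After re-mounting the mergerfs pool (e.g. following the README's
# upgrade steps), refresh the export table so new NFS client mounts
# resolve to the new filesystem instance:
exportfs -ar

# Optionally confirm what is currently exported:
exportfs -v
```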

nohajc (Author) commented Aug 4, 2023

In contrast, SMB is not affected by this.

trapexit (Owner) commented Aug 4, 2023

Is the original mergerfs instance being umounted? That's not the default behavior. In fact lazy umount of the underlying mountpoint was only added recently.

I'm not entirely sure what is happening, but at the end of the day FUSE has no way to remount like a regular filesystem. Either the FUSE server has to support being changed at runtime (which mergerfs does for many options) or it has to be mounted again. But when an app is already using a filesystem and you mount on top of it, that app still sees the underlying filesystem, not the new one. So I wouldn't expect NFS to have any behavior change unless the new mount somehow confuses it because it reevaluates the paths. Maybe it does... but if that is the problem it would likely be an NFS issue rather than a mergerfs one, and it should happen if you did the same thing with any other filesystem.
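
A rough illustration of the "app still sees the underlying filesystem" point, using tmpfs instead of mergerfs (paths are made up; run as root):

```sh
mkdir -p /tmp/overmount-demo
mount -t tmpfs tmpfs /tmp/overmount-demo
echo old > /tmp/overmount-demo/marker

# a process whose working directory is inside the first mount
( cd /tmp/overmount-demo && sleep 300 ) &
OLDPID=$!

# mount a second filesystem over the same mountpoint
mount -t tmpfs tmpfs /tmp/overmount-demo

ls /tmp/overmount-demo     # new lookups see the new, empty filesystem
ls /proc/$OLDPID/cwd/      # the existing process still sees "marker"
```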

nohajc (Author) commented Aug 4, 2023

Ok, so to clarify: I'm following the steps described in the README's upgrade section, except I'm re-mounting with the same version and modified flags.

That means no umounting, just mounting over the previous mount. I also have lazy mount enabled.
My setup is a mergerfs pool mounted at /media/data with the same path being shared by both SMB and NFS.

While SMB shares can happily survive the remounting, NFS clients will experience the behavior I described (until you re-export the share).

I believe you can easily verify this. It's definitely not a mergerfs issue, but it's worth mentioning in the docs if the behavior really isn't consistent across network filesystems. Otherwise it's not obvious why NFS shares would stop working after upgrading mergerfs in the recommended way.
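
For concreteness, a sketch of that re-mount sequence, assuming the /media/data pool above; the branch paths and options are made up for the example:

```sh
# /etc/fstab entry for the pool (sketch):
# /mnt/disk1:/mnt/disk2  /media/data  fuse.mergerfs  cache.files=off,category.create=mfs  0  0

# after changing the options (or installing a new mergerfs version),
# mount a new instance over the top of the existing one:
mergerfs -o cache.files=off,category.create=mfs /mnt/disk1:/mnt/disk2 /media/data

# the NFS export of /media/data then needs refreshing:
exportfs -ar
```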

trapexit (Owner) commented Aug 4, 2023

> That means no umounting, just mounting over the previous mount. I also have lazy mount enabled.

It is lazy unmount, not mount. Is `lazy-umount-mountpoint=true` set? And if so, is the mount actually being umounted, or is it staying around because it is in use? Have you tried disabling lazy umount?
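
One way to check that, as a sketch (assumes the /media/data mountpoint from earlier):

```sh
# list what is currently mounted at the pool path; with lazy umount the
# old entry disappears from the mount table even if it is still in use
findmnt /media/data

# list running mergerfs processes; more than one means the old
# instance is still alive behind the scenes
pgrep -a mergerfs
```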

> While SMB shares can happily survive the remounting

SMB, if you are using Samba as the server, is very different in design from NFS. It's a purely userspace solution. NFS is (normally) a kernel solution and has a different way of interacting with the underlying filesystems. So this is no surprise.

nohajc (Author) commented Aug 4, 2023

Sorry, that was a typo. I can try without lazy-umount-mountpoint=true.

Sure, it's no surprise if you know this stuff. For the average user it might be confusing.

trapexit (Owner) commented Aug 4, 2023

> Sure, it's no surprise if you know this stuff. For the average user it might be confusing.

I've no problem adding details about external dependencies; I do it all the time. But there are basically infinite combinations of setups, so adding such info is typically on-demand and iterative. But we also need to understand exactly what is going on so it can be properly documented.

nohajc (Author) commented Aug 4, 2023

> But we also need to understand exactly what is going on so it can be properly documented.

I agree. To that end, could you also try to replicate this behavior?

I can certainly play with the flags and observe exactly when the old mount is removed, but I'm also curious whether this is how it works for everyone or whether it's something specific to my setup only...

trapexit (Owner) commented Aug 4, 2023

Sure. I'll try to take a look this weekend. I don't use any network filesystems with mergerfs for anything serious so I'll have to set it all up from scratch.

nohajc (Author) commented Aug 4, 2023

Thanks. I'll also provide the NFS flags I'm using.

nohajc (Author) commented Aug 5, 2023

So, interestingly, with lazy unmount enabled, the old mergerfs process will keep running until I call exportfs -ar.
Before re-exporting, all clients that were already connected can still use the share (probably via the old process), but any new mount attempt will hang.

If I try to umount the mergerfs pool immediately, it will report "target busy". There is nothing in lsof output but I guess it is used by the NFS server.
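
A sketch of how one might poke at that, assuming the kernel NFS server; nfsd holds references from kernel space, so they won't show up as ordinary processes:

```sh
# show userspace processes holding the mountpoint open (often nothing here)
fuser -vm /media/data

# check whether the path is still in the active export table instead
exportfs -v | grep /media/data
```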

NFS flags:

```
rw,async,no_root_squash,no_subtree_check,insecure,fsid=1
```

I cannot re-test with a Kodi client right now (Kodi is different because it uses the userspace libnfs), but that's where I noticed the problem in the first place, so I expect the behavior to be similar.
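
For completeness, a sketch of how those options sit in /etc/exports; the path is the /media/data pool from earlier and the client subnet is made up:

```sh
# /etc/exports (sketch)
# /media/data  192.168.1.0/24(rw,async,no_root_squash,no_subtree_check,insecure,fsid=1)

# verify the resulting export table
exportfs -v
```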

nohajc (Author) commented Aug 5, 2023

Now I tried with lazy unmount disabled, and the hanging issue disappeared. I'm able to mount regardless of the export status, but obviously two instances of mergerfs are then running at the same time. I don't know what would happen if I killed the old one manually. I could try that too...

trapexit (Owner) commented Aug 5, 2023

containers/oci-umount#48

I found random other comments around that suggest that lazy umount of exported NFS shares isn't a good idea. So I guess I'll put a note in the argument description about that.

nohajc (Author) commented Aug 5, 2023

> containers/oci-umount#48
>
> I found random other comments around that suggest that lazy umount of exported NFS shares isn't a good idea. So I guess I'll put a note in the argument description about that.

Isn't this a bit different? We're not doing lazy unmounts of NFS mounts (that would be done by the NFS client) but rather lazy unmounts of a mergerfs pool backing an NFS share (on the server).

trapexit (Owner) commented Aug 5, 2023

I didn't provide links to everything I found; it's related. Lazy umount removes the mount from the filesystem tree, as mentioned in that link. Clearly that leads to issues with NFS. While there are special issues between FUSE and NFS, I wouldn't be surprised if lazily umounting an exported filesystem always screws up NFS.
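
A quick illustration of that "removed from the filesystem tree" behavior, using tmpfs and a made-up path (run as root):

```sh
mkdir -p /mnt/lazy-demo
mount -t tmpfs tmpfs /mnt/lazy-demo

# lazy umount detaches the mount from the tree immediately, even if
# something still has files open on it
umount -l /mnt/lazy-demo

# the entry is already gone from the mount table
findmnt /mnt/lazy-demo || echo "no longer visible in the filesystem tree"
```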

nohajc (Author) commented Aug 5, 2023

Well, it seems it is still usable, as long as you remember to run exportfs after lazy unmount... though not ideal of course.

If I wanted to use NFS without mergerfs lazy unmount, does that mean I cannot do a live upgrade anymore? Having old mergerfs processes accumulate until reboot is not a solution either.

trapexit (Owner) commented Aug 5, 2023

You have to run exportfs again because it is a new filesystem, and the fact that the old filesystem is half removed clearly causes issues with new clients.

Depends on what "live upgrade" means. If you're changing settings then you can just use the runtime API. If you mean restarting mergerfs then yeah... nothing can be done about it really.

nohajc (Author) commented Aug 5, 2023

Yes, the runtime API works great for most things. (If it didn't depend on the xattr interface, it would be even better.)
I'm just used to making all modifications in /etc/fstab, and re-mounting also lets me verify that the configuration is correct.
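
For context, the xattr-based runtime interface referred to here works roughly like this (a sketch based on my reading of the mergerfs docs; the option name is just an example):

```sh
# read a runtime option from the control file at the pool's root
getfattr -n user.mergerfs.category.create /media/data/.mergerfs

# change it on the fly without re-mounting
setfattr -n user.mergerfs.category.create -v mfs /media/data/.mergerfs
```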

I can definitely live with this. It just took me some time to figure out where the problem was...

trapexit (Owner) commented Aug 5, 2023

> I can definitely live with this.

There really isn't any alternative. It's not impossible, but it is very much impractical to replace the running process. Even if I could, certain options can only be set during the initialization phase with the kernel; those are exactly the runtime options that are read-only. FUSE has no "remount" ability. It is up to the FUSE server to manage runtime changes, which I already do as much as I can.

trapexit (Owner) commented Aug 5, 2023

As for the runtime API: I already support some things over ioctl and will be deprecating the xattr interface come v3. But that will naturally require changes to all existing tooling.
