
fluctuating write performance #1332

Open
mdkeil opened this issue May 2, 2024 · 13 comments

@mdkeil

mdkeil commented May 2, 2024

Describe the bug

I can't remember the exact point in time, but since my Ubuntu upgrade from 20.04 to 22.04 (plus an NVMe upgrade) I've been getting fluctuating write performance after ~120-180 seconds: it starts at >200 MB/s and drops below 30 MB/s (copying from a single NVMe drive to mergerfs-pooled hard drives). I've played around with different mergerfs settings but always with the same result.

  • I used "old" mergerfs versions from the Ubuntu repo (2.28.1-1 through 2.33.3-1).
  • I don't think the behaviour is related to the mergerfs version.
  • I use mergerfs together with SnapRAID.

working mergerfs settings with v2.28 / v2.33
/mnt/data/* /mnt/mysnapraid fuse.mergerfs defaults,direct_io,allow_other,use_ino,category.create=epmfs,moveonenospc=true,minfreespace=20G,fsname=mergerfsPool 0 0

To Reproduce

Copy a number of large files (e.g. TV recordings) from a single NVMe drive to the mergerfs pool of hard disks (cp via Midnight Commander); after ~120-180 s the write performance problems appear.
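A rough way to reproduce this and watch the rate over time (a sketch only; assumes pv is installed, and /mnt/nvme/recording.ts is a placeholder for one of the large source files):

# Copy a single large file through pv so the instantaneous throughput is visible;
# the drop should show up after roughly 2-3 minutes of sustained writing.
pv /mnt/nvme/recording.ts > /mnt/mysnapraid/recording.ts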

System information:

  • OS, kernel version: Ubuntu 22.04.4 LTS, 5.15.0-92-generic
  • mergerfs version: v2.40.2
  • mergerfs settings:
/etc/fstab 
/mnt/data/* /mnt/mysnapraid fuse.mergerfs cache.files=partial,dropcacheonclose=true,noforget,inodecalc=path-hash,func.getattr=newest,category.create=epmfs,moveonenospc=true,minfreespace=20G,fsname=mergerfsPool,nonempty 0 0
  • List of drives, filesystems, & sizes:
df -h
/dev/sdd1       7,3T     38G  7,1T    1% /mnt/data/disk4
/dev/sde1        11T    7,2T  3,8T   66% /mnt/parity/1-parity
/dev/sdf1        11T    5,9T  4,9T   55% /mnt/data/disk1
/dev/sda1       7,3T    4,6T  2,7T   64% /mnt/data/disk3
/dev/sdb1       7,3T    4,6T  2,6T   65% /mnt/data/disk2
mergerfsPool     33T     16T   18T   47% /mnt/mysnapraid
@trapexit
Owner

trapexit commented May 2, 2024

Can you explain why you think this is a mergerfs issue? Drives have caches, the OS caches, and you have cache.files=partial, meaning you have double page caching going on... so inconsistent speed after the caches fill up is normal and expected. You could also have SMR drives, and those can slow down to single-digit MB/s speeds at times.

@mdkeil
Author

mdkeil commented May 2, 2024

It doesn't necessarily have to be an issue related to mergerfs, but I'm a little bit "lost in space" trying to find the real issue (no dmesg/syslog errors), because before the OS upgrade I never noticed this behaviour. I also started with cache.files=off, with the same write performance.
At the moment the write speed looks like a sine curve (without the negative half) after ~180 s, and the file transfer seems to stop completely for short periods. I don't see this behaviour if I copy directly to the data disk without going through mergerfs.

@trapexit
Owner

trapexit commented May 2, 2024

If it is related to anything I mentioned there wouldn't be any log errors. It would be normal, expected behavior.

Are you copying the same amounts of data? To the same branches? Are you copying large files or many small files? Are you copying to a filesystem that is fuller or emptier? Have you checked your buffer sizes when it gets slow? When it gets slow, have you run checks against all the branches directly? Have you created a minimal setup with only one branch and tried the same? Have you confirmed that it isn't one particular filesystem/drive? All of that matters. Every drive, every different interconnect, etc. matters. SMR drives will completely drop their write performance to single-digit speeds once their caches fill, they start flushing, and they need to write back to disk. You have 8TB drives, and SMR was pretty common for some brands of 8TB drives.
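A rough way to check the SMR question and the individual branches directly (a sketch; /dev/sdX and the test path are placeholders, and note that drive-managed SMR disks usually still report "none" for the zoned attribute, so looking up the exact model number is the more reliable check):

# Host-aware / host-managed SMR drives show up here; drive-managed ones usually do not.
cat /sys/block/sdX/queue/zoned
sudo smartctl -i /dev/sdX

# Write a large file to each branch directly and compare the sustained rates.
sudo dd if=/dev/zero of=/mnt/data/disk1/ddtest bs=1M count=8192 conv=fdatasync status=progress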

Having stops and goes sounds like a full cache being flushed and then normal operation resuming. I see this all the time on my HDDs, including writes directly to their filesystems. There is a reason why tools like nocache exist for use with rsync. Buffer cache bloat is a real problem in certain workloads like the one you describe.
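For illustration, the nocache approach looks roughly like this (assumes the nocache package is installed; paths are placeholders):

# Run the copy without polluting the page cache, so writeback bloat stays out of the picture.
nocache rsync -a --progress /mnt/nvme/recordings/ /mnt/mysnapraid/recordings/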

There really isn't anything I can do without a lot more info about the situation.

@trapexit
Owner

trapexit commented May 2, 2024

According to the strace of mergerfs, your branches were clearly busy at times: orders of magnitude between the fastest and slowest writes. mergerfs is just another app like any other, so this is representative of any random app trying to interact with those filesystems.

grep pwrite64 mergerfs_cp.strace.txt | awk '{print $NF}' | sort -r | head 
<0.134618>
<0.133240>
<0.127098>
<0.096753>
<0.083339>
<0.077059>
<0.075867>
<0.071593>
<0.066411>

Over a tenth of a second for some writes to complete,

vs. the following when it is fast:

<0.000124>
<0.000086>
<0.000093>
<0.000104>
<0.000216>
<0.000096>
<0.000144>
<0.000113>
<0.000097>
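For reference, a trace like this can be captured while the copy runs, roughly as follows (a sketch; assumes a single mergerfs process so pidof returns one PID):

# -f follows threads, -T appends the time spent in each syscall (the <0.000124> values),
# -e trace=pwrite64 restricts the output to the writes of interest.
sudo strace -f -T -e trace=pwrite64 -o mergerfs_cp.strace.txt -p "$(pidof mergerfs)"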

@mdkeil
Author

mdkeil commented May 2, 2024

Are you copying the same amounts of data? To the same branches?

yes

Are you copying large files or many small files?

large

Are you copying to a filesystem that is fuller or more empty?

No difference whether the filesystem (ext4) is about 50% full or nearly empty.

Have you checked your buffer sizes when it gets slow?

No, how can I check this?

Have you created a minimal setup with only one branch and tried the same?

No, because it worked in the past, and if I copy directly to the branch (with the same amount of data) without using mergerfs, I get the expected behaviour.

SMR drives will completely drop their perf to single digit write speeds once caches fill and it starts flushing and needs to write back to disk.

All my drives are CMR.

@trapexit
Owner

trapexit commented May 2, 2024

No, how can I check this?

Run free. It breaks down RAM usage.
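For example (the buff/cache column in free is the page cache; the Dirty/Writeback counters in /proc/meminfo show how much of it is still waiting to be flushed to disk):

free -h
# Watch dirty data build up and drain while the copy is running.
watch -n1 'grep -E "Dirty|Writeback" /proc/meminfo'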

@trapexit
Owner

trapexit commented May 2, 2024

To test properly you should use a single branch and disable page caching, because having it enabled will increase cache usage as mentioned in the docs; and since you have large files, dropcacheonclose won't mean as much.
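A throwaway single-branch mount for such a test could look roughly like this (a sketch; /mnt/data/disk4 and /mnt/test are placeholders):

mkdir -p /mnt/test
# One branch, page caching off, otherwise the same create policy as the real pool.
sudo mergerfs -o cache.files=off,category.create=epmfs,minfreespace=20G /mnt/data/disk4 /mnt/test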

@mdkeil
Author

mdkeil commented May 2, 2024

To properly test you should have a single branch,

I will create one with my spare disk in the evening.

disable page caching

done!

@mdkeil
Author

mdkeil commented May 2, 2024

No change with a single branch.

Some test results:

  • single disk
 sudo dd if=/dev/zero of=/mnt/testing/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
933232640 bytes (933 MB, 890 MiB) copied, 4 s, 232 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.69794 s, 229 MB/s

cache.files=off,dropcacheonclose=true,noforget,inodecalc=path-hash,func.getattr=newest,category.create=epmfs,moveonenospc=true

  • single branch
sudo dd if=/dev/zero of=/mnt/mytesting/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
569376768 bytes (569 MB, 543 MiB) copied, 1 s, 569 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 6.53895 s, 164 MB/s
  • 3-disk branch (all drives 55-65% full)
sudo dd if=/dev/zero of=/mnt/mysnapraid/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
571473920 bytes (571 MB, 545 MiB) copied, 1 s, 571 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.988 s, 119 MB/s
  • 4-disk branch (dd to an "empty" data disk)
sudo dd if=/dev/zero of=/mnt/mysnapraid/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
651165696 bytes (651 MB, 621 MiB) copied, 1 s, 651 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 6.36715 s, 169 MB/s

@mdkeil
Author

mdkeil commented May 26, 2024

I still have the problem. The write speed from mergerfs to an external drive is often limited to 40 MB/s after some time, starting at 100 MB/s+.
I think it's related to
https://www.reddit.com/r/DataHoarder/comments/1bujnka/mergerfs_pooling_external_drives_writes_extremely/

Overall it looks like a caching issue, but with Ubuntu 20.04 I never had that kind of issue.
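One caching-related knob that could be compared between the two Ubuntu installs is the kernel writeback threshold; the ratio-based defaults can let several gigabytes of dirty data accumulate before flushing starts, which produces exactly this stop-and-go pattern. A sketch (the byte values are examples only, not a recommendation):

# Show the current writeback thresholds.
sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_bytes vm.dirty_background_bytes

# Temporarily cap dirty data at ~1 GiB and start background writeback at ~256 MiB,
# then repeat the copy and see whether the pattern changes.
sudo sysctl -w vm.dirty_bytes=1073741824
sudo sysctl -w vm.dirty_background_bytes=268435456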

@trapexit
Owner

There's nothing I can do if I don't have information to work with. Even if I did... nothing has changed in mergerfs in years in terms of abilities or features related to writes. When it comes to writes there is almost zero logic. As I pointed out before, your system / filesystems are clearly busy if a single write takes over a tenth of a second.

@mdkeil
Author

mdkeil commented May 26, 2024

Yes, I know. But what is the cause of the "busy filesystem", if that is what causes the problem, and how can I find the app/process/whatever responsible?

@trapexit
Owner

You have to look at the running system when it is occurring. mergerfs is just another piece of software, like any other piece of software running on your machine in userspace as root. If it says that writes to the underlying filesystems are taking over a tenth of a second, then you should be able to run iotop or similar IO monitoring tools to see who is reading and writing. And if it is mergerfs, then your problem is probably your hardware. Lower the number of threads, change mergerfs priorities, or otherwise limit concurrency. Lots of consumer-grade hardware doesn't handle concurrency well.
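For example (iotop needs root; the threads option is documented for recent mergerfs releases and the value here is only an illustration):

# Show only the processes/threads currently doing IO, with accumulated totals.
sudo iotop -o -a
# If mergerfs itself dominates the IO, try capping its concurrency,
# e.g. by adding threads=4 to the mount options.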
