
fluctuating write performance #1332

Open
mdkeil opened this issue May 2, 2024 · 13 comments

@mdkeil

mdkeil commented May 2, 2024

Describe the bug

I can't remember the exact point in time, but since my Ubuntu upgrade from 20.04 to 22.04 (plus an NVMe upgrade) I've been getting fluctuating write performance after ~120-180 seconds: it starts at >200 MB/s and drops below 30 MB/s (copying from a single NVMe drive to mergerfs-pooled hard drives). I've played around with different mergerfs settings but always with the same result.

  • I used "old" mergerfs versions from the Ubuntu repo (2.28.1-1 through 2.33.3-1).
  • I don't think the behaviour is related to the mergerfs version.
  • I use mergerfs together with SnapRAID.

working mergerfs settings with v2.28 / v2.33
/mnt/data/* /mnt/mysnapraid fuse.mergerfs defaults,direct_io,allow_other,use_ino,category.create=epmfs,moveonenospc=true,minfreespace=20G,fsname=mergerfsPool 0 0

To Reproduce

Copy a number of large files (e.g. TV recordings) from a single NVMe drive to the mergerfs pool of hard disks (cp via Midnight Commander); after ~120-180 s the write performance problems appear.
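A rough way to reproduce this and watch the rate over time (a sketch only; assumes pv is installed, and /mnt/nvme/recording.ts is a placeholder for one of the large source files):

# Copy a single large file through pv so the instantaneous throughput is visible;
# the drop should show up after roughly 2-3 minutes of sustained writing.
pv /mnt/nvme/recording.ts > /mnt/mysnapraid/recording.ts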

System information:

  • OS, kernel version: Ubuntu 22.04.4 LTS, 5.15.0-92-generic
  • mergerfs version: v2.40.2
  • mergerfs settings:
/etc/fstab 
/mnt/data/* /mnt/mysnapraid fuse.mergerfs cache.files=partial,dropcacheonclose=true,noforget,inodecalc=path-hash,func.getattr=newest,category.create=epmfs,moveonenospc=true,minfreespace=20G,fsname=mergerfsPool,nonempty 0 0
  • List of drives, filesystems, & sizes:
df -h
/dev/sdd1       7,3T     38G  7,1T    1% /mnt/data/disk4
/dev/sde1        11T    7,2T  3,8T   66% /mnt/parity/1-parity
/dev/sdf1        11T    5,9T  4,9T   55% /mnt/data/disk1
/dev/sda1       7,3T    4,6T  2,7T   64% /mnt/data/disk3
/dev/sdb1       7,3T    4,6T  2,6T   65% /mnt/data/disk2
mergerfsPool     33T     16T   18T   47% /mnt/mysnapraid
@trapexit
Owner

trapexit commented May 2, 2024

Can you explain why you think this is a mergerfs issue? Drives have caches, the OS caches, and you have cache.files=partial, meaning you have double page caching going on... so inconsistent speed after the caches fill up is normal and expected. You could also have SMR drives, and those can slow down to single-digit MB/s speeds at times.

@mdkeil
Author

mdkeil commented May 2, 2024

It doesn't necessarily have to be an issue related to mergerfs, but I'm a little bit "lost in space" trying to find the real issue (no dmesg/syslog errors), because before the OS upgrade I never noticed this behaviour. I also started with cache.files=off, with the same write performance.
At the moment the write speed looks like a sine curve (without the negative half) after ~180 s, and the file transfer seems to stop completely for short periods. I don't see this behaviour if I copy directly to the data disk without going through mergerfs.

@trapexit
Owner

trapexit commented May 2, 2024

If it is related to anything I mentioned there wouldn't be any log errors. It would be normal, expected behavior.

Are you copying the same amounts of data? To the same branches? Are you copying large files or many small files? Are you copying to a filesystem that is fuller or emptier? Have you checked your buffer sizes when it gets slow? When it gets slow, have you run checks against all the branches directly? Have you created a minimal setup with only one branch and tried the same? Have you confirmed that it isn't one particular filesystem/drive? All of that matters. Every drive, every different interconnect, etc. matters. SMR drives will completely drop their write performance to single-digit speeds once their caches fill, they start flushing, and they need to write back to disk. You have 8TB drives, and SMR was pretty common for some brands of 8TB drives.
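A rough way to check the SMR question and the individual branches directly (a sketch; /dev/sdX and the test path are placeholders, and note that drive-managed SMR disks usually still report "none" for the zoned attribute, so looking up the exact model number is the more reliable check):

# Host-aware / host-managed SMR drives show up here; drive-managed ones usually do not.
cat /sys/block/sdX/queue/zoned
sudo smartctl -i /dev/sdX

# Write a large file to each branch directly and compare the sustained rates.
sudo dd if=/dev/zero of=/mnt/data/disk1/ddtest bs=1M count=8192 conv=fdatasync status=progress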

Having stops and goes sounds like a full cache being flushed and then normal operation resuming. I see this all the time on my HDDs, including writes directly to their filesystems. There is a reason why tools like nocache exist for use with rsync. Buffer cache bloat is a real problem in certain workloads like the one you describe.
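For illustration, the nocache approach looks roughly like this (assumes the nocache package is installed; paths are placeholders):

# Run the copy without polluting the page cache, so writeback bloat stays out of the picture.
nocache rsync -a --progress /mnt/nvme/recordings/ /mnt/mysnapraid/recordings/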

There really isn't anything I can do without a lot more info about the situation.

@trapexit
Owner

trapexit commented May 2, 2024

According to the strace of mergerfs, your branches were clearly busy at times: orders of magnitude between the fastest and slowest writes. mergerfs is just another app like any other, so this is representative of any random app trying to interact with those filesystems.

grep pwrite64 mergerfs_cp.strace.txt | awk '{print $NF}' | sort -r | head 
<0.134618>
<0.133240>
<0.127098>
<0.096753>
<0.083339>
<0.077059>
<0.075867>
<0.071593>
<0.066411>

Over a tenth of a second for some writes to complete,

vs. the following when it is fast:

<0.000124>
<0.000086>
<0.000093>
<0.000104>
<0.000216>
<0.000096>
<0.000144>
<0.000113>
<0.000097>
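For reference, a trace like this can be captured while the copy runs, roughly as follows (a sketch; assumes a single mergerfs process so pidof returns one PID):

# -f follows threads, -T appends the time spent in each syscall (the <0.000124> values),
# -e trace=pwrite64 restricts the output to the writes of interest.
sudo strace -f -T -e trace=pwrite64 -o mergerfs_cp.strace.txt -p "$(pidof mergerfs)"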

@mdkeil
Author

mdkeil commented May 2, 2024

Are you copying the same amounts of data? To the same branches?

yes

Are you copying large files or many small files?

large

Are you copying to a filesystem that is fuller or more empty?

No difference whether the filesystem (ext4) is about 50% full or nearly empty.

Have you checked your buffer sizes when it gets slow?

No, how can I check this?

Have you created a minimal setup with only one branch and tried the same?

No, because it worked in the past, and if I copy directly to the branch (with the same amount of data) without using mergerfs, I get the expected behaviour.

SMR drives will completely drop their perf to single digit write speeds once caches fill and it starts flushing and needs to write back to disk.

All my drives are CMR.

@trapexit
Owner

trapexit commented May 2, 2024

No, how can I check this?

Run free. It breaks down RAM usage.
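For example (the buff/cache column in free is the page cache; the Dirty/Writeback counters in /proc/meminfo show how much of it is still waiting to be flushed to disk):

free -h
# Watch dirty data build up and drain while the copy is running.
watch -n1 'grep -E "Dirty|Writeback" /proc/meminfo'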

@trapexit
Owner

trapexit commented May 2, 2024

To test properly you should use a single branch and disable page caching, because having it enabled will increase cache usage as mentioned in the docs; and since you have large files, dropcacheonclose won't mean as much.
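A throwaway single-branch mount for such a test could look roughly like this (a sketch; /mnt/data/disk4 and /mnt/test are placeholders):

mkdir -p /mnt/test
# One branch, page caching off, otherwise the same create policy as the real pool.
sudo mergerfs -o cache.files=off,category.create=epmfs,minfreespace=20G /mnt/data/disk4 /mnt/test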

@mdkeil
Author

mdkeil commented May 2, 2024

To properly test you should have a single branch,

I will create one with my spare disk in the evening.

disable page caching

done!

@mdkeil
Author

mdkeil commented May 2, 2024

No change with a single branch.

Some test results:

  • single disk
 sudo dd if=/dev/zero of=/mnt/testing/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
933232640 bytes (933 MB, 890 MiB) copied, 4 s, 232 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.69794 s, 229 MB/s

cache.files=off,dropcacheonclose=true,noforget,inodecalc=path-hash,func.getattr=newest,category.create=epmfs,moveonenospc=true

  • single branch
sudo dd if=/dev/zero of=/mnt/mytesting/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
569376768 bytes (569 MB, 543 MiB) copied, 1 s, 569 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 6.53895 s, 164 MB/s
  • 3-disk branch (all drives 55-65% full)
sudo dd if=/dev/zero of=/mnt/mysnapraid/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
571473920 bytes (571 MB, 545 MiB) copied, 1 s, 571 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.988 s, 119 MB/s
  • 4-disk branch (dd to an "empty" data disk)
sudo dd if=/dev/zero of=/mnt/mysnapraid/1GB.file bs=1M count=1024 oflag=nocache conv=fdatasync status=progress
651165696 bytes (651 MB, 621 MiB) copied, 1 s, 651 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 6.36715 s, 169 MB/s

@mdkeil
Author

mdkeil commented May 26, 2024

I still have the problem. The write speed from mergerfs to an external drive is often limited to 40 MB/s after some time, starting at 100 MB/s+.
I think it's related to
https://www.reddit.com/r/DataHoarder/comments/1bujnka/mergerfs_pooling_external_drives_writes_extremely/

Overall it looks like a caching issue, but with Ubuntu 20.04 I never had that kind of issue.
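One caching-related knob that could be compared between the two Ubuntu installs is the kernel writeback threshold; the ratio-based defaults can let several gigabytes of dirty data accumulate before flushing starts, which produces exactly this stop-and-go pattern. A sketch (the byte values are examples only, not a recommendation):

# Show the current writeback thresholds.
sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_bytes vm.dirty_background_bytes

# Temporarily cap dirty data at ~1 GiB and start background writeback at ~256 MiB,
# then repeat the copy and see whether the pattern changes.
sudo sysctl -w vm.dirty_bytes=1073741824
sudo sysctl -w vm.dirty_background_bytes=268435456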

@trapexit
Owner

There's nothing I can do if I don't have information to work with. Even if I did... nothing has changed in mergerfs in years in terms of abilities or features related to writes. When it comes to writes there is almost zero logic. As I pointed out before, your system / filesystems are clearly busy if a single write takes over a tenth of a second.

@mdkeil
Author

mdkeil commented May 26, 2024

Yes, I know. But what is the cause of the "busy filesystem", if that is what causes the problem, and how can I find the app/process/whatever responsible?

@trapexit
Owner

You have to look at the running system when it is occurring. mergerfs is just another piece of software, like any other piece of software running on your machine in userspace as root. If it says that writes to the underlying filesystems are taking over a tenth of a second, then you should be able to run iotop or similar IO monitoring tools to see who is reading and writing. And if it is mergerfs, then your problem is probably your hardware. Lower the number of threads, change mergerfs priorities, or otherwise limit concurrency. Lots of consumer-grade hardware doesn't handle concurrency well.
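For example (iotop needs root; the threads option is documented for recent mergerfs releases and the value here is only an illustration):

# Show only the processes/threads currently doing IO, with accumulated totals.
sudo iotop -o -a
# If mergerfs itself dominates the IO, try capping its concurrency,
# e.g. by adding threads=4 to the mount options.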
