Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bisync: another day, another set of integration test fixes #7795

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

nielash
Copy link
Collaborator

@nielash nielash commented Apr 22, 2024

What is the purpose of this change?

See #7743 (comment) and feel free to cherry-pick as you see fit if some of these warrant more discussion.

Was the change discussed in an issue or in the forum before?

Checklist

  • I have read the contribution guidelines.
  • I have added tests for all changes in this PR if appropriate.
  • I have added documentation for the changes if appropriate.
  • All commit messages are in house style.
  • I'm done, this Pull Request is ready for review :-)

@nielash
Copy link
Collaborator Author

nielash commented Apr 23, 2024

I wonder if the TestSFTPRclone: issue is related to the one identified on #7791?

And fix vfs cache encoding described in #7760
The problem is when we create and write cache, we use the "OS" encoded ("OS" encoding is hardcoded for different operating systems, not specified by users).
But when we copy it to the remote (backend), we use local.Fs, which accept a name already encoded in "Standard" way.
So we need to send a "Standard" encoded name.

I've reverted it for now.

@nielash nielash marked this pull request as ready for review April 23, 2024 00:51
@nielash
Copy link
Collaborator Author

nielash commented Apr 28, 2024

This was a real tricky one, but I think I finally figured out the TestSFTPRclone: issue. 15ff8ab

@nielash nielash requested a review from ncw April 29, 2024 08:07
Copy link
Member

@ncw ncw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @nielash there are some great fixes in there and apologies for taking ages to review them.

I'll cherry pick the ones that don't need discussion and we can discuss the rest!

See comments inline.

Thank you

@@ -939,6 +939,9 @@ func (b *bisyncTest) checkPreReqs(ctx context.Context, opt *bisync.Options) (con
b.fs2.Features().Disable("Copy") // API has longstanding bug for conflictBehavior=replace https://github.com/rclone/rclone/issues/4590
b.fs2.Features().Disable("Move")
}
if strings.Contains(strings.ToLower(fs.ConfigString(b.fs1)), "mailru") || strings.Contains(strings.ToLower(fs.ConfigString(b.fs2)), "mailru") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point in the future when we can specify global config in backend config we should move this config from here into the TestMailru remote. Let's see how well it works here.

@@ -47,7 +47,7 @@ var (
// ListRetries is the number of times to retry a listing to overcome eventual consistency
ListRetries = flag.Int("list-retries", 3, "Number or times to retry listing")
// MatchTestRemote matches the remote names used for testing
MatchTestRemote = regexp.MustCompile(`^rclone-test-[abcdefghijklmnopqrstuvwxyz0123456789]{24}$`)
MatchTestRemote = regexp.MustCompile(`^rclone-test-[abcdefghijklmnopqrstuvwxyz0123456789]{12}$`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to change

var MatchTestRemote = regexp.MustCompile(`^rclone-test-[abcdefghijklmnopqrstuvwxyz0123456789]{24}(_segments)?$`)

Though maybe that one should say {12,24} to clean the old and the new!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this change.

@@ -371,7 +371,9 @@ func (d *Dir) renameTree(dirPath string) {
// Make sure the path is correct for each node
if d.path != dirPath {
fs.Debugf(d.path, "Renaming to %q", dirPath)
delete(d.parent.items, name(d.path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good fix, but it really needs a test. I try to keep the VFS coverage up as all these corner cases are really difficult to track down!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try to add one! (Although FWIW, I found this because it was causing TestSFTPRclone: to fail!)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also btw, I am now seeing a bunch of expecting X to have MkdirMetadata method: optional feature not implemented and expecting X to have SetMetadata method errors that I'm pretty sure were not there before...maybe related to the recent --create-empty-src-dirs change?

https://pub.rclone.org/integration-tests/2024-05-13-010010/sftp-cmd.bisync-TestSFTPRclone-5.txt

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test.

func name(p string) string {
p = path.Base(p)
if p == "." {
p = "/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should probably be p = "" to fit with rclone conventions, not 100% sure!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I copied this from here:

rclone/vfs/dir.go

Lines 163 to 172 in e9e9feb

// Name (base) of the directory - satisfies Node interface
func (d *Dir) Name() (name string) {
d.mu.RLock()
name = path.Base(d.path)
d.mu.RUnlock()
if name == "." {
name = "/"
}
return name
}

// Call this whenever modtime is updated.
func (b *s3Backend) storeModtime(fp string, meta map[string]string, val string) {
meta["X-Amz-Meta-Mtime"] = val
meta["mtime"] = val
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should only be necessary to set X-Amz-Meta-Mtime as the s3 backend turns it into the mtime metadata.

So I'm not 100% sure what this is needed for.

I think the serve s3 code is a bit confused about exactly which parameter to use so we should probably unconfuse it rather than make it worse!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a test to reproduce:

go run ./fstest/test_all  -run '^TestBisyncRemoteRemote/ext_paths.*$' -timeout 30s -verbose -maxtries 1 -remotes TestS3Rclone:

If I remember correctly, I think the problem is that there are blocks like this which try X-Amz-Meta-Mtime first but fall back to using mtime:

if val, ok := meta["X-Amz-Meta-Mtime"]; ok {
ti, err := swift.FloatStringToTime(val)
if err == nil {
return result, b.vfs.Chtimes(fp, ti, ti)
}
// ignore error since the file is successfully created
}
if val, ok := meta["mtime"]; ok {
ti, err := swift.FloatStringToTime(val)
if err == nil {
return result, b.vfs.Chtimes(fp, ti, ti)
}
// ignore error since the file is successfully created
}

And previously it was not always consistently saving the one it ended up using in memory.

And then there's also Last-Modified just to make things even more fun!

meta := map[string]string{
"Last-Modified": node.ModTime().Format(timeFormat),
"Content-Type": fs.MimeType(context.Background(), fobj),
}

@@ -1088,7 +1088,7 @@ func (f *Fs) list(ctx context.Context, containerName, directory, prefix string,
isDirectory := isDirectoryMarker(*file.Properties.ContentLength, file.Metadata, remote)
if isDirectory {
// Don't insert the root directory
if remote == directory {
if remote == f.opt.Enc.ToStandardPath(directory) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

directory should already be in Standard encoding here since it ultimately came from the user rather than the backend, so I'm not sure what is going on here.

Same for the other backends.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that. For example here:

bucket, directory := f.split(dir)

directory is returned from f.split, which does FromStandardPath on it:

rclone/backend/s3/s3.go

Lines 2823 to 2826 in e9e9feb

func (f *Fs) split(rootRelativePath string) (bucketName, bucketPath string) {
bucketName, bucketPath = bucket.Split(bucket.Join(f.root, rootRelativePath))
return f.opt.Enc.FromStandardName(bucketName), f.opt.Enc.FromStandardPath(bucketPath)
}

Here's a test to reproduce the issue:

go test -timeout 3h -run '^TestBisyncRemoteRemote/extended_filenames$' github.com/rclone/rclone/cmd/bisync -remote TestS3,directory_markers: -v 

Use 12 random characters instead of 24, to avoid trouble on the bisync tests
Before this change, renaming a directory d failed to rename its key in
d.parent.items, which caused trouble later when doing Dir.Stat on a
subdirectory. This change fixes the issue.
Before this change, serve s3 did not consistently save the correct modtime value
in memory after putting or copying an object, which could sometimes cause an
incorrect modtime to be returned. This change fixes the issue by ensuring that
both "mtime" and "X-Amz-Meta-Mtime" are updated in b.meta when we have fresh data.

The issue was discovered on the TestBisyncRemoteRemote/ext_paths test.
…omparison

`remote` has been converted ToStandardPath a few lines above, so `directory`
needs to be converted the same way in order to be compared properly. This was
spotted on `TestBisyncRemoteRemote/extended_filenames` for
`TestS3,directory_markers:` and `TestGoogleCloudStorage,directory_markers:`
which tripped over a directory name containing a Line Feed symbol.
golangci-lint was complaining about this. `entry` can never be nil because
itemToDirEntry never returns a nil interface value
rclone@10eb474
introduced a bug that caused operations.CopyDirMetadata to sometimes be called
even when the dest Fs lacks metadata support. This resulted in errors such as
`expecting X to have MkdirMetadata method: optional feature not implemented` and
`expecting X to have SetMetadata method`. This change fixes the issue by instead
using operations.SetDirModTime when s.setDirMetadata is false.
@nielash
Copy link
Collaborator Author

nielash commented May 19, 2024

@ncw heads up -- potentially important bug fix here (1e8d42f) for a regression introduced by our recent --create-empty-src-dirs change (10eb474)

(See this log for many examples of the problem)

Previously these failed with:
directory "empty metadata sub dir" not found in "": directory not found
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
bisync
In progress
Development

Successfully merging this pull request may close these issues.

None yet

2 participants