Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework FileAttributes #1782

Open
SamMousa opened this issue Apr 17, 2024 · 1 comment
Open

Rework FileAttributes #1782

SamMousa opened this issue Apr 17, 2024 · 1 comment

Comments

@SamMousa
Copy link

SamMousa commented Apr 17, 2024

Feature Request

Currently file attribute retrieval in the adapter interface consists of several separate functions:

  • fileSize()
  • lastModified()
  • mimeType()
  • visibility()

All of these functions return a object of type FileAttributes.
There are several downsides to this approach:

  1. When you have a FileAttributes object you can not trust it unless you know via which function it was created (ie: is the lastModified property null because we obtained the object via mimeType() or is it null because it is not set in the underlying storage.
  2. The filesystem cannot assume that the object is fully hydrated, so calling fileSize() and lastModified() may lead to duplicate API calls to the underlying adapter.

In practice most adapters get all meta data in one request:

  • AsyncAwsS3Adapter
  • AwsS3V3Adapter
  • AzureBlobStorageAdapter
  • GoogleCloudStorageAdapter
  • SftpAdapter

Some don't:

  • FtpAdapter,
  • InMemoryFilesystemAdapter, note: this should not affect performance
  • Local, note: for listContents it is fully hydrated so performance wise this should be fine.
  • ZipArchiveAdapter
  • WebDAVAdapter, note: the underlying propFind method does support requesting multiple properties in one call

Possible solution

  1. Change the contract so that users may always assume that a StorageAttributes object is fully hydrated
  2. Create lazy loading implementations of StorageAttributes that load unset attributes if needed:
    class LazyFileAttributes {
        public function __construct(
            private string $path,
            private int|\Closure $fileSize,
            private string|\Closure $visibility,
            private int|\Closure $lastModified,
            private string|\Closure $mimeType,
            private array|\Closure $extraMetadata
        }
    }

This way we can:

  • Fully hydrate the object when the underlying API supports this efficiently
  • Fully hydrate the object when the filesystem is local (in the Local adapter we fully hydrate it for listContents but not for a single file)
  • Have a standard lazy implementation that passes simple closures for lazy loading attributes: new LazyFileAttributes($path, () => $this->fileSize($path), ...)
  1. Add a function metaData() to the adapter interface.
  2. Change FileSystem to use this new function to get the metadata and use it when specific properties are requested, since we know now the meta data is fully hydrated (lazily or greedily) it can reuse this metadata for other properties.
  3. Add a metaData() to the FilesystemReader that returns the StorageAttributes object for the path.
Q A
Flysystem Version v4
Adapter Name all
Adapter version n/a

Scenario / Use-case

This feature would reduce API calls when they matter most, since the remote APIs for cloud storage providers all support retrieving meta data in 1 call.

Summary

  1. Add a getMetaData(): StorageAttributes
  2. Rework internals to be more efficient with API calls if possible
  3. Make it explicit that a StorageAttributes method result of null means not set in the underlying storage instead of not set in the underlying storage, or not loaded by whatever method you used to create this object
@BilboTav
Copy link

BilboTav commented Jun 2, 2024

Yes, I second this!
Method getMetaData(): StorageAttributes to filesystem interface would be great. I wanted to implement custom metadata for my adapter and current design of interface does not support it.
Entity class FileAttributes has method extraMetadata(): array which seems similar idea was there, but it can not be used, because filesystem object can never return this full object via any of current interface methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants