Ferret File System: Difference between revisions

From Woozle Writes Code

Revision as of 01:05, 25 February 2024

Conceptual Design

Ferret File System (FFS) combines two ideas that can work together or separately: storage agnosticism and file format agnosticism.

Storage Agnosticism

Traditional filesystems present "drives" or "volumes" that more or less correspond to physical devices, each with its own fixed maximum capacity. This metaphor is used across all interfaces, both user-facing (GUI, CLI) and application-facing (API).

FFS presents the user with a single storage space that unifies all storage to which the system has access, while tracking the availability characteristics of each file. The most obvious attributes include:

  • location: What is the physical or metaphorical "place" from which a file is being accessed?
    • Examples: a particular device, a LAN, a building, "the internet"
  • frequency: How often does the file need to be accessed from that location?
    • Examples:
      • archival storage (rarely accessed)
      • social media archive (may vary dynamically)
      • local work (needed a lot by one or more devices, less in demand elsewhere)
      • playing media (immediate read access; write access can be slower or restricted; make local temporary copy)
  • criticality: How important is it to make sure that this file is not lost?
    • This can range from "temporary copy" to "mission-critical".
  • versioning: On what schedule/scheme (if any) should old versions of this file be kept?
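As a rough illustration, the four attributes listed above could be carried as a small per-file record. This is only a sketch: the names `Availability` and `Criticality`, the three criticality levels, the use of accesses per day as the frequency measure, and the versioning string are all assumptions made for the example, not part of any FFS design.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Criticality(IntEnum):
    """Hypothetical scale; the text only gives the two endpoints."""
    TEMPORARY = 0
    NORMAL = 1
    MISSION_CRITICAL = 2


@dataclass(frozen=True)
class Availability:
    """Availability attributes of one file as seen from one location."""
    location: str                      # e.g. a device, a LAN, "the internet"
    accesses_per_day: float            # assumed measure of "frequency"
    criticality: Criticality
    versioning: Optional[str] = None   # e.g. "keep-last-5"; None = keep no old versions


# Media being played: needed often at the player, but the local copy is disposable.
media_copy = Availability("living-room-player", 20.0, Criticality.TEMPORARY)

# Archival storage: rarely accessed, must not be lost, old versions kept.
archive = Availability("offsite-backup", 0.01, Criticality.MISSION_CRITICAL, "keep-all")
```

Because frequency depends on location, a file used from several places would carry one such record per location.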

Ideally, each file's attributes would be determined from usage patterns; this will be done by a separate agent rather than by FFS itself. There will also be an interface for setting the initial (expected) availability attributes and for overriding the agent's algorithmic determinations.

FFS will move files around as usage shifts and as the space available on each storage device changes (because a device fills up, becomes unreliable, or changes in some other way). It will also keep more copies of more critical data, on different storage devices.
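One way to picture the "more copies for more critical data" rule is a small placement function. The sketch below is purely illustrative: the policy of one extra copy per criticality level, the function names, and the "most free space first" ordering are assumptions, since the text does not specify a placement policy.

```python
from typing import Dict, List


def replica_count(criticality: int, devices_available: int) -> int:
    """Copies to keep for a file.

    Assumed policy: criticality 0 (temporary) gets 1 copy, and each
    further level adds one copy, capped by the number of devices.
    """
    wanted = 1 + criticality
    return max(1, min(wanted, devices_available))


def choose_devices(free_bytes: Dict[str, int], size: int, copies: int) -> List[str]:
    """Pick distinct devices with enough room, preferring the emptiest."""
    candidates = [name for name, free in free_bytes.items() if free >= size]
    candidates.sort(key=lambda name: free_bytes[name], reverse=True)
    return candidates[:copies]


devices = {"ssd": 100, "nas": 500, "usb-stick": 10}
copies = replica_count(criticality=2, devices_available=len(devices))
placement = choose_devices(devices, size=50, copies=copies)
```

Here the file asks for three copies but only two devices have room, so `placement` is shorter than `copies`; a real system would have to decide whether to free space, warn the user, or accept the reduced redundancy.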

File Format Agnosticism

Traditional filesystems make certain assumptions about what metadata they need to track for each file (typically creation date, modification date, and file "extension"), which leaves applications to work out for themselves how to represent any additional metadata they need to store. This results in a proliferation of "file formats", each of which must be understood in order to access that metadata or to alter the file's contents without making it unusable.

FFS provides a mechanism for storing arbitrary semantic metadata, and a template system for repackaging that data back into a canonical file format when needed. This allows applications and end users to read and modify existing metadata much more easily, and to add their own.
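The metadata-plus-template idea can be sketched with Python's standard `string.Template`. Everything specific here is an assumption for illustration: the field names, the front-matter style target format, and the `repackage` helper are not taken from any FFS specification.

```python
from string import Template

# Semantic metadata held by the filesystem rather than buried in a file format.
metadata = {
    "title": "Trip notes",
    "author": "example-user",
    "body": "Day 1: arrived.",
}

# An end user adds a field of their own; no file format has to change.
metadata["rating"] = "5"

# Hypothetical template describing one canonical on-disk format.
NOTE_TEMPLATE = Template("---\ntitle: $title\nauthor: $author\n---\n$body\n")


def repackage(fields: dict, template: Template) -> str:
    """Render stored metadata into the file format the template describes.

    Fields the template does not mention (such as "rating") are ignored;
    a field the template needs but the metadata lacks raises KeyError.
    """
    return template.substitute(fields)


rendered = repackage(metadata, NOTE_TEMPLATE)
```

The same metadata could be rendered through a different template to produce a different file format, which is the sense in which the stored data is format-agnostic.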

writing in progress

as an Application

Alternatively, these concepts can be implemented as a userland/desktop application rather than a core filesystem feature. This should make it possible to gain the benefits of these services without investing as much development time, allowing them to be tested and refined before being attempted as a kernel-level service.

The application would scan a given set of folders, recording in a database every file found and logging any changes noted. Each file-record would include:

  • full path/name
  • timestamps
  • file size
  • file ownership, system attributes, etc.
  • contents fingerprint (hash, presumed to be unique)
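A minimal sketch of such a scan, using only the Python standard library, might look like the following. The `FileRecord` layout, the choice of SHA-256 as the fingerprint, and the function names are assumptions; the text only asks for "a hash, presumed to be unique".

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class FileRecord:
    path: str      # full path/name
    mtime: float   # modification timestamp
    size: int      # file size in bytes
    mode: int      # permission and type bits
    uid: int       # owner
    sha256: str    # contents fingerprint


def fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: str) -> Iterator[FileRecord]:
    """Yield one record for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks, sockets, etc.
            info = os.stat(full)
            yield FileRecord(
                path=os.path.abspath(full),
                mtime=info.st_mtime,
                size=info.st_size,
                mode=info.st_mode,
                uid=info.st_uid,
                sha256=fingerprint(full),
            )
```

Hashing every file on every scan is expensive; a practical version would probably re-hash only files whose size or timestamp has changed, at the cost of missing edits that preserve both.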

During each scan, file entries would be checked against the DB by both hash and filespec, to identify changed contents or renaming. Every file found would be logged, along with any changes noted: whenever there is a partial match with an existing record (the hash or the filespec matches, but not both), the current file would be logged as a modification of that file.
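The hash-and-filespec check can be sketched as a small classification function. The two lookup dictionaries stand in for database queries, and the labels "unchanged", "modified", "renamed", and "new" are assumptions chosen for the example.

```python
from typing import Dict, Optional, Tuple


def classify(
    path: str,
    digest: str,
    by_path: Dict[str, str],
    by_hash: Dict[str, str],
) -> Tuple[str, Optional[str]]:
    """Compare a scanned file with the known records.

    by_path maps each known filespec to its last recorded hash;
    by_hash maps each known hash to the filespec it was recorded under.
    Returns (change kind, filespec of the matching record or None).
    """
    known_hash = by_path.get(path)
    if known_hash == digest:
        return ("unchanged", path)          # full match
    if known_hash is not None:
        return ("modified", path)           # same filespec, new contents
    if digest in by_hash:
        return ("renamed", by_hash[digest])  # same contents, new filespec
    return ("new", None)                    # no match at all


by_path = {"/docs/a.txt": "h1", "/docs/b.txt": "h2"}
by_hash = {"h1": "/docs/a.txt", "h2": "/docs/b.txt"}

result = classify("/docs/c.txt", "h2", by_path, by_hash)
```

A hash match at a new filespec cannot by itself distinguish a rename from a copy; telling them apart would need a further check that the old filespec no longer exists after the scan.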

There would also be semantic data tables, to allow users to enter arbitrary information about each file.
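One possible shape for those semantic data tables is a simple name/value table keyed to the file records. The sketch below uses Python's built-in `sqlite3` module; the table and column names, and the `annotate` helper, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE file (
        id     INTEGER PRIMARY KEY,
        path   TEXT NOT NULL UNIQUE,
        sha256 TEXT NOT NULL
    );
    -- Arbitrary user-entered information: one row per (file, name) pair.
    CREATE TABLE file_note (
        file_id INTEGER NOT NULL REFERENCES file(id),
        name    TEXT NOT NULL,
        value   TEXT NOT NULL,
        PRIMARY KEY (file_id, name)
    );
    """
)


def annotate(db: sqlite3.Connection, path: str, name: str, value: str) -> None:
    """Attach or overwrite one piece of user information on a known file."""
    row = db.execute("SELECT id FROM file WHERE path = ?", (path,)).fetchone()
    if row is None:
        raise KeyError(path)
    db.execute(
        "INSERT OR REPLACE INTO file_note (file_id, name, value) VALUES (?, ?, ?)",
        (row[0], name, value),
    )


conn.execute("INSERT INTO file (path, sha256) VALUES (?, ?)", ("/docs/a.txt", "h1"))
annotate(conn, "/docs/a.txt", "project", "tax return")
annotate(conn, "/docs/a.txt", "project", "tax return 2023")  # overwrites
```

Keying notes to the file record rather than to the path means they would survive a rename, provided the scan recognises the renamed file as the same record.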