The Human Futilities are a set of file-oriented command-line utilities, primarily useful for handling large filesets.

Three need-cases have been the primary drivers for developing these utilities: (1) Woozle-compatible grep/find, (2) merging an old folderset, and (3) replacing Nextcloud's desktop client.
* {{l/sub|goals}}
===1. Woozle Compatibility===
I find it very difficult to remember how to use the "{{l/htyp|grep}}" and "find" utilities, and I find the available help pages (e.g. {{fmt/code|man grep}} and {{fmt/code|grep --help}}) overwhelming. I'm not sure how the situation ended up this way, but I do know how I'd like and expect a file-finding application ({{l/sub|ui standards|or any CLI utility, really}}) to behave -- so I wrote {{l/sub/lc|FF}} to behave in that way.

Its main purpose is ''finding/identifying files that meet specific criteria'', and the input and output formats reflect that. If you're looking for text within a file, it won't go into a lot of detail about the matches it finds. If you're looking for a file by date, it will show the actual timestamps that matched, for each matching file it finds. It will also optionally show its progress in a non-overwhelming way that doesn't require any pre-crawling of the folderset.

The ''specific'' reason I needed this utility was to find a file I knew I had created on a specific date in 2017, but could not find in my Nextcloud folderset. I didn't know exactly what I had called it, but I knew the date it would have been created and what the file extension would almost certainly be.
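
As a rough illustration of that kind of search (by date and extension, without knowing the filename), here is a minimal Python sketch. It is not how FF itself works; the function name find_by_date_and_ext is hypothetical, and it matches on modification time as a stand-in for creation date, since creation time is not reliably available on every filesystem.

```python
from datetime import date, datetime
from pathlib import Path
from typing import Iterator, Tuple


def find_by_date_and_ext(root: Path, day: date, ext: str) -> Iterator[Tuple[Path, datetime]]:
    """Yield (path, mtime) for files under root that end in ext and whose
    modification time (local time) falls on the given calendar day.

    Assumption: modification time is used in place of creation time.
    """
    # rglob walks the tree lazily, so results appear without a pre-crawl.
    for path in root.rglob(f"*{ext}"):
        if not path.is_file():
            continue
        mtime = datetime.fromtimestamp(path.stat().st_mtime)
        if mtime.date() == day:
            # Report the timestamp that actually matched, not just the name.
            yield path, mtime
```

A call might look like find_by_date_and_ext(Path("/data/cloud"), date(2017, 3, 14), ".odt"), where the folder, date, and extension are made-up values for illustration only.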

===2. Archive Merging===
When I found it, it was in an archive of a folderset from OwnCloud (the predecessor to Nextcloud) which had apparently never been completely merged into our current (Nextcloud) folderset. Since the contents had been rearranged since the time when we last used OwnCloud, I couldn't just merge the folders in by the usual method (using the same relative paths) without ending up with a lot of duplication and/or misplaced files. The archive is 651.2 GB, according to Caja, so we can't really afford to just have duplicates until we can manually sort things out. (At ~1 TB, the current Nextcloud folderset is already straining or past the limits of various devices that use it.)

FTI, FIC, and FCC were developed as a way of accomplishing this kind of merge. By indexing foldersets, comparing them ("what's in A but missing from B"), and then being able to do a folder-relative copy on the comparison results, we can accomplish this in a series of relatively simple and transparent (and therefore debuggable) steps.
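
The three steps (index, compare, folder-relative copy) can be sketched in Python as below. This is only an illustration under stated assumptions, not the actual FTI/FIC/FCC code or index format: the function names are hypothetical, and the comparison is keyed on a SHA-256 content hash so that files which were merely moved or renamed in B do not count as missing.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, Iterable, List


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root: Path) -> Dict[str, List[Path]]:
    """Step 1 (index): map content hash -> relative paths under root."""
    index: Dict[str, List[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            index.setdefault(file_hash(path), []).append(path.relative_to(root))
    return index


def missing_from(index_a: Dict[str, List[Path]], index_b: Dict[str, List[Path]]) -> List[Path]:
    """Step 2 (compare): relative paths in A whose content is nowhere in B."""
    missing: List[Path] = []
    for digest, paths in index_a.items():
        if digest not in index_b:
            missing.extend(paths)
    return sorted(missing)


def copy_relative(paths: Iterable[Path], src_root: Path, dst_root: Path) -> None:
    """Step 3 (copy): copy each file to the same relative path under dst_root."""
    for rel in paths:
        target = dst_root / rel
        if target.exists():
            # Assumption: never overwrite; leave name clashes for manual review.
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_root / rel, target)
```

Because each step produces a plain result (an index, then a list of paths), the output of one step can be inspected before the next is run, which is what makes the process debuggable.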

===3. Firing the Nextcloud Client===
While I was in the process of working all this out, it became apparent that the Nextcloud desktop client simply wasn't able to keep up -- and even when it is, the UI it provides is often deeply problematic. Initially, it kept crashing part way through the "checking" process (which, I'm guessing, is where it compares the local folderset to the one on the server in order to determine what needs syncing from A to B and vice-versa) and always had to start over. Weeks went by when no new files on either side were being copied to the other.

After an upgrade of my local system (from Mint 20 to Mint 21), it appeared to start working again, and did in fact synchronize at least ''some'' files from the server, but it got stuck this time dealing with "file conflicts" which, for some reason, it couldn't resolve. There seemed to be two main cases:
* files with the same name, timestamp, and size
** Nextcloud does not indicate whether it checked the contents, but I have no reason to think they aren't identical. Why couldn't Nextcloud determine this, and just skip them?
* files where one version is zero bytes
** In this case, I could see maybe being a little cautious -- but I'd think a reasonable default would be to assume the file with zero bytes should be overwritten, as it's very easy to reconstruct a zero-byte file. Perhaps the sync client could write out a list of such overwrites, in case it was important to know ''which'' files were zero bytes... but this seems like a very unlikely edge-case.

In any case, there was a ''very large number'' of these files -- more than would fit in the non-resizable dialog box they provide -- and there are a number of issues with the UI provided for resolving the problem:
* Each file has to be examined individually, in a separate dialog (there's no way to select a group of files and say which version (server vs. local) to use).
* The popup dialog seems to crash a lot -- comes up blank, goes into the background, is invisible...
* After approving a single file, the whole process seems to reset -- making it impossible to approve additional files until it regenerates the list. This makes the whole process very slow, to say the least.

I want something which will (a) not crash, (b) let me write rules for handling "conflicts", and (c) actually compare contents to see if there even really ''is'' a conflict.
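
To make (b) and (c) concrete, here is a minimal Python sketch of rule-based conflict handling for the two cases described above. It is an assumption-laden illustration, not existing code: the function name and the result labels are hypothetical, and it assumes both versions of the file are readable as local paths (a real sync tool would more likely compare hashes taken from each side's index).

```python
import filecmp
from pathlib import Path


def classify_conflict(local: Path, remote: Path) -> str:
    """Apply simple rules to a reported conflict and return a label.

    Labels (hypothetical): "identical", "use-remote", "use-local", "needs-review".
    """
    local_size = local.stat().st_size
    remote_size = remote.stat().st_size

    # Rule 1: same size and byte-for-byte equal content means no real conflict.
    # shallow=False forces an actual content comparison, not just a stat check.
    if local_size == remote_size and filecmp.cmp(local, remote, shallow=False):
        return "identical"

    # Rule 2: a zero-byte version loses to the non-empty one.
    if local_size == 0:
        return "use-remote"
    if remote_size == 0:
        return "use-local"

    # Anything else is a genuine difference that a person should look at.
    return "needs-review"
```

Writing each "use-remote" or "use-local" decision to a log would cover the unlikely case where it matters ''which'' files were zero bytes.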

Development to address this need-case is still in progress, but the basic idea is that we will have FTI running on both sides (server and client), a process will download the server's latest index and use FIC to compare it to the local index, and FCC will somehow provide a means to synchronize the necessary files (there are at least a couple of different ways to do this).
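
One possible shape for the comparison step is sketched below in Python. Since the design is still open, everything here is an assumption: the names SyncPlan and plan_sync are hypothetical, and each index is simplified to a mapping from relative path to content hash, which is not necessarily what FTI writes or FIC reads.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    download: List[str]   # paths present only in the server index
    upload: List[str]     # paths present only in the local index
    conflicts: List[str]  # paths on both sides whose hashes differ


def plan_sync(local_index: Dict[str, str], server_index: Dict[str, str]) -> SyncPlan:
    """Compare two {relative path: content hash} indexes and plan the transfers."""
    local_paths = set(local_index)
    server_paths = set(server_index)
    shared = local_paths & server_paths
    return SyncPlan(
        download=sorted(server_paths - local_paths),
        upload=sorted(local_paths - server_paths),
        conflicts=sorted(p for p in shared if local_index[p] != server_index[p]),
    )
```

Note that a comparison of two snapshots like this cannot tell a newly created file from one that was deleted on the other side; handling deletions would need some record of the previous state.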
Future
Thinking about how Nextcloud works has led me to realize its shortcomings and how overspecialized it is. I'm thinking that each piece of it can eventually be replaced by much more flexible tools.
I'm also repeatedly seeing the usefulness of having a searchable index of files that includes filename/path, timestamps, hash, and ideally the ability to tag files and folders with keywords -- all of which would be a key part of FileFerret -- and how relatively easy it is to create and maintain such an index using these utilities. (It also makes me wonder why there are, apparently, no filesystems with a queryable file database built in -- or, at least, a way to add one on.)
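
As a sketch of what such a searchable index could look like, the Python example below stores path, timestamp, hash, and tags in SQLite. This is purely illustrative: the schema and the function names (open_index, add_file, find_by_tag) are assumptions, not FileFerret's design or the index format these utilities use.

```python
import sqlite3
from typing import Iterable, List

# Hypothetical schema: one row per file, plus a many-to-one table of tags.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path   TEXT PRIMARY KEY,
    mtime  REAL NOT NULL,
    sha256 TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tags (
    path TEXT NOT NULL REFERENCES files(path),
    tag  TEXT NOT NULL,
    PRIMARY KEY (path, tag)
);
"""


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) an index database and make sure the tables exist."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def add_file(conn: sqlite3.Connection, path: str, mtime: float,
             sha256: str, tags: Iterable[str] = ()) -> None:
    """Insert or update one file record and attach any keyword tags."""
    conn.execute(
        "INSERT OR REPLACE INTO files (path, mtime, sha256) VALUES (?, ?, ?)",
        (path, mtime, sha256),
    )
    conn.executemany(
        "INSERT OR IGNORE INTO tags (path, tag) VALUES (?, ?)",
        [(path, tag) for tag in tags],
    )
    conn.commit()


def find_by_tag(conn: sqlite3.Connection, tag: str) -> List[str]:
    """Return the paths of all indexed files carrying the given tag."""
    rows = conn.execute(
        "SELECT f.path FROM files f JOIN tags t ON t.path = f.path "
        "WHERE t.tag = ? ORDER BY f.path",
        (tag,),
    )
    return [row[0] for row in rows]
```

The same tables can answer date or duplicate-hash questions with ordinary SQL queries, which is the kind of flexibility a built-in filesystem database would offer.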
Commands
- FF: find files by mask, date, contents
- FCC: file collection copy
- FIC: file index comparison
- FTI: file tree index
- FTS: file tree sync
Other