2026/04/24
Friday, April 24, 2026 (#114)
transcribed retroactively from a Signal chat
- 09:39 Maybe I need a maximum buffer-size -- if the process-stream is being slow to handle data, then the other end of the operation might just be continually filling the buffer...
- (I understood how to handle buffers when it was closer to the metal, buffer-space had to be allocated and you couldn't write past the end without causing Problems...)
- 09:53 ...but apparently the Buffer class isn't even being used to mediate the file-to-process canal. ohhhkayyy...
- 09:59 Theory: maybe the proc_ library has its own internal buffer?
- 10:00 ...so it's allowing writes even while the actual stream is stopped, and that's eating up RAM.
- But wait... the data is going to a process (mariadb) that's also running in memory, so of course that would take up RAM until the data was processed.
- 10:02 I feel like maybe I should go back to a thing I was trying earlier, of only reading the data one line at a time instead of in arbitrary blocks.
- 10:25 Tentatively: still loading up RAM, but writes to the process seem to be a bit more regular. Less waiting around.
- 11:10 Changing to line-reading also seems to have helped a bit with ssh2 imports, though they still get stuck -- now just short of 100 KB.
- 11:31 Looks like it gets all of the first table (and then nothing else), which seems significant...
- 12:11 Huh! Looks like the error was happening because I was looping trying to send a set amount of stuff *without* calling select_stream() (which gives the I/O system time to try to do stuff).
- 12:24 I think ssh2 import is working now. Upload is still running, but definitely getting more tables.
- 12:28 (Side note: Ok, [amount read] - [memory usage] ~= [amount written] ... I feel like this probably somehow explains what's going on with the memory-usage...)
- 12:43 (If it takes this long for HTYP's 500 MB, it is going to take literally forever to do Issuepedia's 50+ GB... there's probably a way to improve on that.)
- 12:49 (I should add a transmission-rate calculation to the readout...)
- 12:50 (looks like about 200KB/sec at present)
- 13:08 Import complete, export is also 502 MB (i.e. smaller than the original, but not so much smaller that something must be missing -- more tests needed).
- Shopping time now, tho.
- 14:19 I'm thinking that The Futilities should have an "install me on remote server" command.
- 15:31 So, I re-imported and then re-exported the same DB, and I got back "invisible idiot"- Actually, still importing it. I just had to not-say that. ...because, you know... machine learning and stuff... >.>
- 15:46 Re-exported db exactly 5 bytes longer than the 1st-gen export. Maybe diff will have a meaningful comparison for me now?
- 15:49 ...and yes, the difference is that I made the 2nd-gen schema's name longer by 5 characters ("_test").
- So, I think success!
- 15:50 Next task: migrate wooz.dev to hetz2, so I can take notes again... >.>
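
For later reference, here is a rough sketch of the pattern the 10:02 and 12:11 notes ended up at: read one line at a time, cap how much unwritten data is held, and wait on the I/O system before every write instead of looping blindly. It is in Python purely for illustration and is not the actual code. `pump_lines`, `max_pending`, and `demo_roundtrip` are made-up names, Python's `selectors` module stands in for select_stream(), and a tiny byte-counting child process stands in for mariadb. The `pending` buffer is the "[amount read] - [amount written]" quantity from the 12:28 note, and the return value includes the transmission rate wished for at 12:49.

```python
import io
import os
import selectors
import subprocess
import sys
import time


def pump_lines(source, proc, max_pending=64 * 1024, timeout=1.0):
    """Feed `source` (binary file-like) to proc.stdin one line at a time.

    Hypothetical helper: returns (bytes_written, bytes_per_second).
    POSIX only, since it selects on a pipe file descriptor.
    """
    fd = proc.stdin.fileno()
    os.set_blocking(fd, False)           # a full pipe must not stall the loop
    sel = selectors.DefaultSelector()
    sel.register(fd, selectors.EVENT_WRITE)

    pending = bytearray()                # read but not yet accepted by the pipe
    written = 0
    eof = False
    start = time.monotonic()

    while not eof or pending:
        # Top up only while under the cap, so a slow consumer cannot make
        # us hold the whole dump in RAM.
        while not eof and len(pending) < max_pending:
            line = source.readline()
            if line:
                pending += line
            else:
                eof = True

        # Give the I/O system a chance: wait until the pipe can take data.
        if pending and sel.select(timeout):
            try:
                n = os.write(fd, pending)    # may be a partial write
            except BlockingIOError:
                n = 0                        # pipe filled up again; retry
            written += n
            del pending[:n]

    sel.close()
    proc.stdin.close()                   # signal end-of-input to the child
    elapsed = time.monotonic() - start
    return written, (written / elapsed if elapsed > 0 else 0.0)


def demo_roundtrip(line_count=20000):
    """Pump fake SQL lines into a child that just counts the bytes it got."""
    data = b"".join(b"INSERT row %d;\n" % i for i in range(line_count))
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; print(len(sys.stdin.buffer.read()))"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    written, rate = pump_lines(io.BytesIO(data), child, max_pending=4096)
    received = int(child.stdout.read())
    child.wait()
    return len(data), written, received, rate
```

Calling `demo_roundtrip()` returns the size of the generated input, the bytes written, the bytes the child reports having received, and the rate in bytes per second. Memory held by the pump itself stays near `max_pending` plus one line, whatever the size of the dump. A child that exits early would raise `BrokenPipeError`, which this sketch does not handle.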