Brainstorming/HostFerret/SQL/hf checkins

From Woozle Writes Code
Jump to navigation Jump to search

About

Purpose: to keep a record of (1) changing IP addresses and (2) internet outages, by location.

  • I had previously just dumped timestamp, IP address, and "from" string to a textfile, but this limited how often checkins could be done without creating an overwhelmingly large textfile. (I generally set the remotes to check in once per hour.)
  • So now, rather than log each and every single checkin, this system instead keeps statistics about stretches of checkins that fall within expected time-limits. That is, as long as a checkin isn't too late, we'll just treat it as part of a stretch of checkins and keep stats about the stretch.
  • (Each record is actually a stretch of checkins; maybe I should rename the table.)
  • At present, the system compares the current interval (current time minus time of previous checkin) with the mean; if the interval is larger than X (a preset constant) times the mean, then a new record is started. This provides an ongoing record of outages.

Security

For now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). We will need security in order to implement DNS host record updating (see below), otherwise any client which guesses an active name could cause a domain to be redirected to their IP.

Future

Ultimately, I'd like to be able to automatically update DNS records for the locations in question if their IP addresses change. This will require some security, however, to prevent spoofers from causing the system to point domains at their IP addresses.

Fields

  • ClientAddr: IP address of caller (IPv6 can be up to 61 characters; allowing 63 because it's a nice round number and JIC)
  • Stats*: These collect enough information to allow mean and variance/standard deviation to be calculated.
    • Stats_Count (N): number of checkins total
    • Stats_DiffSum: total of all time-intervals (note that this should be divided by (N-1) to produce mean
    • Stats_DiffSSq: sum of time-interval squares

History

  • 2017-12-10 Removed ID_Session, because (mainly) we'd have to save cookies somehow on the wget side or else we'd create a new session each time.
    • Also added missing "NOT" to NameGiven
  • 2017-12-11
    • Added WhenLast field, so we can keep track of the length of the run
    • made Stats_DiffSum and Stats_DiffSq default to NULL because they can't start accumulating until the 2nd check-in
    • added ClientAddr because (duh) we need to record the IP address
    • added WhenFinal so we can filter for active checkins

SQL

CREATE TABLE `hf_checkins` (
  `ID`                 INT     NOT NULL AUTO_INCREMENT,
  `ClientAddr` VARCHAR(63)     NOT NULL COMMENT "IP address (v4 or v6)",
  `NameGiven` VARCHAR(255)     NOT NULL COMMENT "self-identifying string offered by remote",
  `WhenStart`     DATETIME     NOT NULL COMMENT "when the stretch started",
  `WhenLast`      DATETIME     NOT NULL COMMENT "when the last checkin was recorded",
  `WhenFinal`     DATETIME DEFAULT NULL COMMENT "when this checkin series was stopped",
  `Stats_Count`        INT     NOT NULL COMMENT "number of checkins in this stretch",
  `Stats_DiffSum`      INT DEFAULT NULL COMMENT "SUM(X) where X is interval since previous checkin",
  `Stats_DiffSSq`      INT DEFAULT NULL COMMENT "SUM(X^2)",
   PRIMARY KEY(`ID`)
)
ENGINE = InnoDB;