Brainstorming/HostFerret/SQL/hf checkins: Difference between revisions

From Woozle Writes Code
Jump to navigation Jump to search
(some corrections)
(schema changes)
Line 4: Line 4:
Also for now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). '''We will need security''' in order to implement DNS host record updating, otherwise any client which guesses an active name could cause a domain to be redirected to their IP.
Also for now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). '''We will need security''' in order to implement DNS host record updating, otherwise any client which guesses an active name could cause a domain to be redirected to their IP.
==Fields==
==Fields==
* '''ClientAddr''': IP address of caller (IPv6 can be up to 61 characters; allowing 63 because it's a nice round number and JIC)
* '''Stats*''': These collect enough information to allow mean and [[variance]]/[[standard deviation]] to be calculated.
* '''Stats*''': These collect enough information to allow mean and [[variance]]/[[standard deviation]] to be calculated.
** '''Stats_Count''' (N): number of checkins total
** '''Stats_Count''' (N): number of checkins total
Line 11: Line 12:
* '''2017-12-10''' Removed ID_Session, because (mainly) we'd have to save cookies somehow on the wget side or else we'd create a new session each time.
* '''2017-12-10''' Removed ID_Session, because (mainly) we'd have to save cookies somehow on the wget side or else we'd create a new session each time.
** Also added missing "NOT" to NameGiven
** Also added missing "NOT" to NameGiven
* '''2017-12-11''' Added WhenLast field, so we can keep track of the length of the run; made Stats_DiffSum and Stats_DiffSq default to NULL because they can't start accumulating until the 2nd check-in; added ClientAddr because (duh) we need to record the IP address
==SQL==
==SQL==
<syntaxhighlight lang=mysql>CREATE TABLE `hf_checkins` (
<syntaxhighlight lang=mysql>CREATE TABLE `hf_checkins` (
   `ID`                INT NOT NULL AUTO_INCREMENT,
   `ID`                INT NOT NULL AUTO_INCREMENT,
  `ClientAddr` VARCHAR(63) NOT NULL COMMENT "IP address (v4 or v6)",
   `NameGiven` VARCHAR(255) NOT NULL COMMENT "self-identifying string offered by remote",
   `NameGiven` VARCHAR(255) NOT NULL COMMENT "self-identifying string offered by remote",
   `WhenStart`    DATETIME NOT NULL COMMENT "when the stretch started",
   `WhenStart`    DATETIME NOT NULL COMMENT "when the stretch started",
  `WhenLast`      DATETIME NOT NULL COMMENT "when the last checkin was recorded",
   `Stats_Count`        INT NOT NULL COMMENT "number of checkins in this stretch",
   `Stats_Count`        INT NOT NULL COMMENT "number of checkins in this stretch",
   `Stats_DiffSum`      INT NOT NULL COMMENT "SUM(X) where X is interval since previous checkin",
   `Stats_DiffSum`      INT DEFAULT NULL COMMENT "SUM(X) where X is interval since previous checkin",
   `Stats_DiffSSq`      INT NOT NULL COMMENT "SUM(X^2)",
   `Stats_DiffSSq`      INT DEFAULT NULL COMMENT "SUM(X^2)",
   PRIMARY KEY(`ID`)
   PRIMARY KEY(`ID`)
)
)
ENGINE = InnoDB;</syntaxhighlight>
ENGINE = InnoDB;</syntaxhighlight>

Revision as of 15:57, 11 December 2017

About

Purpose: Rather than log each and every single checkin, we'll just keep statistics about stretches of checkins that fall within expected time-limits. That is, as long as a checkin isn't too late, we'll just treat it as part of a stretch of checkins and keep stats about the stretch. Each record is actually a stretch; maybe I should rename the table.

Also for now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). We will need security in order to implement DNS host record updating, otherwise any client which guesses an active name could cause a domain to be redirected to their IP.

Fields

  • ClientAddr: IP address of caller (IPv6 can be up to 61 characters; allowing 63 because it's a nice round number and JIC)
  • Stats*: These collect enough information to allow mean and variance/standard deviation to be calculated.
    • Stats_Count (N): number of checkins total
    • Stats_DiffSum: total of all time-intervals (note that this should be divided by (N-1) to produce mean
    • Stats_DiffSSq: sum of time-interval squares

History

  • 2017-12-10 Removed ID_Session, because (mainly) we'd have to save cookies somehow on the wget side or else we'd create a new session each time.
    • Also added missing "NOT" to NameGiven
  • 2017-12-11 Added WhenLast field, so we can keep track of the length of the run; made Stats_DiffSum and Stats_DiffSq default to NULL because they can't start accumulating until the 2nd check-in; added ClientAddr because (duh) we need to record the IP address

SQL

CREATE TABLE `hf_checkins` (
  `ID`                 INT NOT NULL AUTO_INCREMENT,
  `ClientAddr` VARCHAR(63) NOT NULL COMMENT "IP address (v4 or v6)",
  `NameGiven` VARCHAR(255) NOT NULL COMMENT "self-identifying string offered by remote",
  `WhenStart`     DATETIME NOT NULL COMMENT "when the stretch started",
  `WhenLast`      DATETIME NOT NULL COMMENT "when the last checkin was recorded",
  `Stats_Count`        INT NOT NULL COMMENT "number of checkins in this stretch",
  `Stats_DiffSum`      INT DEFAULT NULL COMMENT "SUM(X) where X is interval since previous checkin",
  `Stats_DiffSSq`      INT DEFAULT NULL COMMENT "SUM(X^2)",
   PRIMARY KEY(`ID`)
)
ENGINE = InnoDB;