Brainstorming/HostFerret/SQL/hf checkins: Difference between revisions

From Woozle Writes Code
Jump to navigation Jump to search
No edit summary
(content somehow got lost during import/export; here it is)
 
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
==About==
==About==
'''Purpose''': Rather than log each and every single checkin, we'll just keep statistics about stretches of checkins that fall within expected time-limits. That is, as long as a checkin isn't too late, we'll just treat it as part of a stretch of checkins and keep stats about the stretch. Each record is actually a stretch; maybe I should rename the table.
'''Purpose''': to keep a record of (1) changing IP addresses and (2) internet outages, by location.
 
* I had previously just dumped timestamp, IP address, and "from" string to a textfile, but this limited how often checkins could be done without creating an overwhelmingly large textfile. (I generally set the remotes to check in once per hour.)
Also for now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). '''We will need security''' in order to implement DNS host record updating; otherwise an attacker could claim to be one of our hosts and put up a page which looks just like one of our login pages.
* So now, rather than log each and every single checkin, this system instead keeps statistics about stretches of checkins that fall within expected time-limits. That is, as long as a checkin isn't too late, we'll just treat it as part of a stretch of checkins and keep stats about the stretch.
 
* (Each record is actually a stretch of checkins; maybe I should rename the table.)
* At present, the system compares the current interval (current time minus time of previous checkin) with the mean; if the interval is larger than X (a preset constant) times the mean, then a new record is started. This provides an ongoing record of outages.
==Security==
For now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). '''We will need security''' in order to implement DNS host record updating (see below), otherwise any client which guesses an active name could cause a domain to be redirected to their IP.
==Future==
Ultimately, I'd like to be able to automatically update DNS records for the locations in question if their IP addresses change. This will require some security, however, to prevent spoofers from causing the system to point domains at their IP addresses.
==Fields==
==Fields==
* '''ClientAddr''': IP address of caller (IPv6 can be up to 61 characters; allowing 63 because it's a nice round number and JIC)
* '''Stats*''': These collect enough information to allow mean and [[variance]]/[[standard deviation]] to be calculated.
* '''Stats*''': These collect enough information to allow mean and [[variance]]/[[standard deviation]] to be calculated.
** '''Stats_Count''' (N): number of checkins total
** '''Stats_Count''' (N): number of checkins total
** '''Stats_DiffSum''': total of all time-intervals (note that this should be divided by (N-1) to produce [[mean]]
** '''Stats_DiffSum''': total of all time-intervals (note that this should be divided by (N-1) to produce [[mean]]
** '''Stats_DiffSSq''': sum of time-interval squares
** '''Stats_DiffSSq''': sum of time-interval squares
==History==
* '''2017-12-10''' Removed ID_Session, because (mainly) we'd have to save cookies somehow on the wget side or else we'd create a new session each time.
** Also added missing "NOT" to NameGiven
* '''2017-12-11'''
** Added WhenLast field, so we can keep track of the length of the run
** made Stats_DiffSum and Stats_DiffSq default to NULL because they can't start accumulating until the 2nd check-in
** added ClientAddr because (duh) we need to record the IP address
** added WhenFinal so we can filter for active checkins
==SQL==
==SQL==
<mysql>CREATE TABLE `hf_checkins` (
<source lang=mysql>CREATE TABLE `hf_checkins` (
   `ID`                   INT NOT NULL AUTO_INCREMENT,
   `ID`                 INT     NOT NULL AUTO_INCREMENT,
   `NameGiven`       VARCHAR(255) NULL COMMENT "self-identifying string offered by remote",
  `ClientAddr` VARCHAR(63)    NOT NULL COMMENT "IP address (v4 or v6)",
   `WhenStart`       DATETIME NOT NULL COMMENT "when the stretch started",
   `NameGiven` VARCHAR(255)     NOT NULL COMMENT "self-identifying string offered by remote",
   `ID_Session`           INT NOT NULL COMMENT "session.ID of checkin connection",
   `WhenStart`     DATETIME     NOT NULL COMMENT "when the stretch started",
   `Stats_Count`         INT NOT NULL COMMENT "number of checkins in this stretch",
   `WhenLast`     DATETIME    NOT NULL COMMENT "when the last checkin was recorded",
   `Stats_DiffSum`       INT NOT NULL COMMENT "SUM(X) where X is interval since previous checkin",
  `WhenFinal`    DATETIME DEFAULT NULL COMMENT "when this checkin series was stopped",
   `Stats_DiffSSq`       INT NOT NULL COMMENT "SUM(X^2)",
   `Stats_Count`       INT     NOT NULL COMMENT "number of checkins in this stretch",
   `Stats_DiffSum`     INT DEFAULT NULL COMMENT "SUM(X) where X is interval since previous checkin",
   `Stats_DiffSSq`     INT DEFAULT NULL COMMENT "SUM(X^2)",
   PRIMARY KEY(`ID`)
   PRIMARY KEY(`ID`)
)
)
ENGINE = InnoDB;</mysql>
ENGINE = InnoDB;</source>

Latest revision as of 16:39, 29 February 2020

About

Purpose: to keep a record of (1) changing IP addresses and (2) internet outages, by location.

  • I had previously just dumped timestamp, IP address, and "from" string to a textfile, but this limited how often checkins could be done without creating an overwhelmingly large textfile. (I generally set the remotes to check in once per hour.)
  • So now, rather than log each and every single checkin, this system instead keeps statistics about stretches of checkins that fall within expected time-limits. That is, as long as a checkin isn't too late, we'll just treat it as part of a stretch of checkins and keep stats about the stretch.
  • (Each record is actually a stretch of checkins; maybe I should rename the table.)
  • At present, the system compares the current interval (current time minus time of previous checkin) with the mean; if the interval is larger than X (a preset constant) times the mean, then a new record is started. This provides an ongoing record of outages.

Security

For now, there will be no security – i.e. you don't have to have an account or permission to log a check-in (though you will need to have permission to access the logs). We will need security in order to implement DNS host record updating (see below), otherwise any client which guesses an active name could cause a domain to be redirected to their IP.

Future

Ultimately, I'd like to be able to automatically update DNS records for the locations in question if their IP addresses change. This will require some security, however, to prevent spoofers from causing the system to point domains at their IP addresses.

Fields

  • ClientAddr: IP address of caller (IPv6 can be up to 61 characters; allowing 63 because it's a nice round number and JIC)
  • Stats*: These collect enough information to allow mean and variance/standard deviation to be calculated.
    • Stats_Count (N): number of checkins total
    • Stats_DiffSum: total of all time-intervals (note that this should be divided by (N-1) to produce mean
    • Stats_DiffSSq: sum of time-interval squares

History

  • 2017-12-10 Removed ID_Session, because (mainly) we'd have to save cookies somehow on the wget side or else we'd create a new session each time.
    • Also added missing "NOT" to NameGiven
  • 2017-12-11
    • Added WhenLast field, so we can keep track of the length of the run
    • made Stats_DiffSum and Stats_DiffSq default to NULL because they can't start accumulating until the 2nd check-in
    • added ClientAddr because (duh) we need to record the IP address
    • added WhenFinal so we can filter for active checkins

SQL

CREATE TABLE `hf_checkins` (
  `ID`                 INT     NOT NULL AUTO_INCREMENT,
  `ClientAddr` VARCHAR(63)     NOT NULL COMMENT "IP address (v4 or v6)",
  `NameGiven` VARCHAR(255)     NOT NULL COMMENT "self-identifying string offered by remote",
  `WhenStart`     DATETIME     NOT NULL COMMENT "when the stretch started",
  `WhenLast`      DATETIME     NOT NULL COMMENT "when the last checkin was recorded",
  `WhenFinal`     DATETIME DEFAULT NULL COMMENT "when this checkin series was stopped",
  `Stats_Count`        INT     NOT NULL COMMENT "number of checkins in this stretch",
  `Stats_DiffSum`      INT DEFAULT NULL COMMENT "SUM(X) where X is interval since previous checkin",
  `Stats_DiffSSq`      INT DEFAULT NULL COMMENT "SUM(X^2)",
   PRIMARY KEY(`ID`)
)
ENGINE = InnoDB;