Ferreteria/v0.6/clade/Sys/Data/Codec

From Woozle Writes Code
clade: Sys\Data\Codec
Clade Family
StandardBase Codec
Clade Aliases
Alias Clade
Base* [c,i] Aux\StandardBase
DadexIface Sys\Data\Things\Array\aux\Index
QStr* [c,i] Data\Mem\QVar\Str
SpecIface Sys\Data\Codec\aux\CommSpec

About

  • Purpose: in this context, a Codec is a thing which converts between a string and structured data (typically an array of key/value pairs)
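
As a rough illustration of that conversion, here is a minimal standalone sketch, not the clade's actual code. It assumes '/' as the separator mark and ':' as the assignment mark; the real marks come from the CommSpec's Marks() object and may differ. The names SEP_MARK, SET_MARK, sketchDecode() and sketchEncode() are made up for this example, and escaping is left out.

```php
<?php
// Hypothetical marks for this sketch only; the real ones come from CommSpec()->Marks().
const SEP_MARK = '/';   // separates segments
const SET_MARK = ':';   // joins a key to its value

// string -> structured data (segments without a value are stored as NULL)
function sketchDecode(string $s) : array {
    $arOut = [];
    foreach (explode(SEP_MARK, trim($s, SEP_MARK)) as $sSeg) {
        if ($sSeg === '') { continue; }           // ignore empty segments
        $arPiece = explode(SET_MARK, $sSeg, 2);   // split out key and value
        $arOut[$arPiece[0]] = $arPiece[1] ?? NULL;
    }
    return $arOut;
}

// structured data -> string
function sketchEncode(array $ar) : string {
    $sOut = '';
    foreach ($ar as $sKey => $sVal) {
        $sOut .= SEP_MARK.$sKey;
        if ($sVal !== NULL) { $sOut .= SET_MARK.$sVal; }
    }
    return $sOut;
}

$arData = sketchDecode('/page:home/edit/id:42');
var_export($arData);                    // keys: page, edit, id
echo "\n".sketchEncode($arData)."\n";   // prints "/page:home/edit/id:42"
```

The value-less "edit" segment corresponds to what the class below stores with AddSlug(), and the key/value segments to what it stores with AddPair().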

Thinking

  • 2024-08-07 New premise: all Codecs can, in theory, use a string for the encoded output.
    • Where needed, other encodings can be used (need case: $argv[]).
    • Where not needed, EncodeString() or DecodeString() can be stubbed off with a code prompt.
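
A minimal sketch of what "stubbing off with a code prompt" might look like. This is an assumption for illustration: the real PromptForMethod() is not shown on this page, so the stand-in here just throws an exception naming the method that needs writing. SketchCodec and SketchArgvCodec are hypothetical names.

```php
<?php
abstract class SketchCodec {
    // Hypothetical stand-in for the framework's PromptForMethod():
    // reports which method the concrete class still has to implement.
    static protected function PromptForMethod() : void {
        $arCaller = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 2)[1];
        throw new LogicException(static::class.' needs to implement '.$arCaller['function'].'()');
    }

    // Stubbed off by default; a descendant overrides only what its venue needs.
    public function EncodeString() : string { self::PromptForMethod(); }
    public function DecodeString(string $s) : void { self::PromptForMethod(); }
    public function DecodeArray(array $ar) : void { self::PromptForMethod(); }
}

// A command-line codec reads $argv[] and has no use for the string methods.
class SketchArgvCodec extends SketchCodec {
    public $arArgs = [];
    public function DecodeArray(array $ar) : void {
        $this->arArgs = array_slice($ar, 1);  // skip [0], the executable path
    }
}

$oCodec = new SketchArgvCodec;
$oCodec->DecodeArray(['script.php', 'page:home', 'edit']);
try {
    $oCodec->EncodeString();  // not needed for this venue, so it was never written
} catch (LogicException $e) {
    echo $e->getMessage()."\n";
}
```

The point of the pattern is that an unneeded method fails loudly, with a hint about what to write, rather than silently returning an empty string.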

Namespaces

  • aux: auxiliary clades

History

when what
2024-01-28 reparenting this to descend from DataMarks, because... what are they without each other?
2024-01-29 commenting out AssembleSegment(), because apparently nothing uses it.
2024-02-15 [Codec\String] changing some names so we can use MarksIface:
  • ItemSep -> SepMark
  • AssignOp -> SetMark
  • EscapeOp -> EscMark
2024-02-19 DecodeFromString() xTODO: rename to StringToRequest()
2024-07-11 [Codec\String] much rearranging; I hope I don't regret this... but I need clearer nomenclature/structure.
  • Renamed DecodeFromString() -> StringToRequest()
  • Renamed EncodeToString() -> RequestToString()
2024-07-30
  • moving from [WFe]Sys\Kiosk -> [WFe]Data\Mem
  • Renamed RequestToString() -> PiecesToString() -> PieceArrayToString()
2024-08-03 [Codec\String] renamed from Marked to String because I need another family that uses an array for extern data
2024-08-07 much rearrangement of things --
  • moved from [WFe]Data to [WFe]Sys\Data
  • consolidated Codec\String back up into Codec
  • String documentation WAS:
    • PURPOSE: base clades for Codecs that use a marked (demarcated) string as the encoded form
    • THINKING: I'm not envisioning any other ways of decoding/encoding that wouldn't need this set of marks, but it seems possible -- so let's just put all the marks-related stuff in a descendant (i.e. this) trait (formerly class).
2024-08-10 Renamed PiecesToString(PiecesIface) -> DataToString(DataIface)
2024-08-21 more switching around: Codec objects now create Spec objects on demand, class configurable; Spec objects cannot create Codec objects.
2024-09-21 changed SplitIntoSegments() and SplitSegment() from PUBLIC to PROTECTED because nothing external seems to be using them, and it seems like that's how it should be
  • Added ArrayToRequest()
2024-09-22 Method-names ending in "ToRequest" actually operate on a DataIndex (from a Request), not the Request itself -- so renaming:
  • ArrayToRequest() -> ArrayToDadex()
  • StringToRequest() -> StringToDadex()
  • DataIface (alias) -> DadexIface
2024-12-13 Removing all the logging stuff because it's broken and just doesn't seem worth maintaining at this point.

Subspaces

  • aux: auxiliary clades

Code

interface iCodec extends BaseIface {
    // SETUP
    static function FromCommSpec(SpecIface $o) : self;
    // I/O
    function DecodeString(string $s); // string -> internal storage
    function EncodeString() : string; // internal storage -> string
    function DecodeArray(array $ar);  // array -> internal storage
    // ENCODE
    function DataToString(DadexIface $oData) : string;
    function PieceArrayToString(array $arSeq) : string;
    // DECODE
    function StringToDadex(string $uri, DadexIface $oReq);
    // DIAGS
    function QErrorMessage() : QStrIface;
}
abstract class caCodec extends BaseClass implements iCodec {

    // ++ CONFIG ++ //

    protected function CommSpecClass() : string { self::PromptForMethod(); }

    protected function VenueEncode(string $s) : string { self::PromptForMethod(); }
    protected function VenueDecode(string $s) : string { self::PromptForMethod(); }

    // -- CONFIG -- //
    // ++ SETUP ++ //

    protected function __construct(){}

    static public function FromCommSpec(SpecIface $o) : self {
        $oThis = new static;
        $oThis->WithCommSpec($o);
        return $oThis;
    }

    // ++ SETUP: dynamic ++ //

    private $oSpec;
    protected function WithCommSpec(SpecIface $o) { $this->oSpec = $o; }
    protected function CommSpec() : SpecIface { return $this->oSpec; }

    // -- SETUP -- //
    // ++ I/O API ++ //

    public function DecodeString(string $s) { $this->StringToDadex($s,$this->CommSpec()->Reader()->Dadex()); }
    public function EncodeString() : string { return $this->DataToString($this->CommSpec()->Writer()->Dadex()); }
    public function DecodeArray(array $ar) { $this->ArrayToDadex($ar,$this->CommSpec()->Reader()->Dadex()); }

    // -- I/O API -- //
    // ++ DECODE (string->data) - API ++ //

    /**
     * USAGE: Caller passes string to decode and a writable Dadex object;
     *  the decoded pieces are added to the object.
     *  Passing the Seq() part of a Request Reader seems to work ok.
     * ASSUMES: $oDx is blank, or at any rate it's ok to not clear it before writing.
     */
    public function StringToDadex(string $uri,DadexIface $oDx) {
        $arSegs = $this->SplitIntoSegments($uri);
        foreach ($arSegs as $idx => $sSeg) {
            if ($sSeg != '') {  // ignore empty segments
                $this->ProcessSegment($sSeg,$oDx);
            }
        }
    }
    /**
     * THINKING: The main difference between this and StringToDadex() is in:
     *  * How $sSeg is derived (Here: array element-value; There: split from string)
     *  * Which segments we ignore (Here: 0th element; There: empty values)
     */
    public function ArrayToDadex(array $ar,DadexIface $oDx) {
        foreach ($ar as $idx => $sSeg) {
            if ($idx > 0) { // skip [0], which is just the executable path
                #echo "[#$idx]=[$sSeg]".CRLF;
                $this->ProcessSegment($sSeg,$oDx);
            }
        }
    }

    // ++ DECODE - internal helpers ++ //

    /**
     * NOTE: Not sure if this properly handles escaped SepMark() characters.
     */
    protected function SplitIntoSegments(string $s) : array {
        $cSep = $this->CommSpec()->Marks()->SepMark();
        $sTidy = trim($s,$cSep." \n\r\t\v\x00");  // mainly to remove lead/trail slashes, but also whitespace
        return explode($cSep, $sTidy);
    }
    protected function ProcessSegment(string $sSeg,DadexIface $oReq) {
        $arPiece = $this->SplitSegment($sSeg);
        $nPieces = count($arPiece);
        if ($nPieces == 1) {
            $oReq->AddSlug($arPiece[0]);
            #$sPlur = '';
        } else {
            $sKey = $arPiece[0];
            $sEncoded = $arPiece[1];
            $sDecoded = $this->DecodePart($sEncoded);
            $oPair = $oReq->AddPair($sKey,$sDecoded);
            if ($nPieces > 2) {
                for ($idx=2; $idx<$nPieces; $idx++) {
                    $oPair->AddExtra($arPiece[$idx]);
                }
            }
        }
    }
    /**
     * INPUT: an unescaped URI segment (string)
     * OUTPUT: an enumerated array containing each element found
     *  For key:value segments:
     *    array[0] = key
     *    array[1] = value, if it exists
     *      Caller should check length of array, rather than assuming this element will exist.
     * OPPOSITE: AssembleSegment()
     */
    protected function SplitSegment(string $s) : array {
        $chOp = $this->CommSpec()->Marks()->SetMark();
        return explode($chOp,$s);  // split out key and value
    }
    /**
     * OPPOSITE: EncodePart()
     */
    protected function DecodePart(string $s) : string {
        $oMarks = $this->CommSpec()->Marks();
        $sEsc = $oMarks->EscMark();
        $sOut = $this->VenueDecode($s); // do any venue-specific unescaping
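        // Decode all escape sequences in a single pass: chained str_replace() calls would
        // misread an escaped escape followed by 'S' or 'A' as a separator or assign-operator.
        $sOut = strtr($sOut, [
            $sEsc.$sEsc => $sEsc,               // double-escape -> single
            $sEsc.'S'   => $oMarks->SepMark(),
            $sEsc.'A'   => $oMarks->SetMark(),
        ]);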
        return $sOut;
    }

    // -- DECODE -- //
    // ++ ENCODE (data->string) - API ++ //

    public function DataToString(DadexIface $oData) : string {
        return $this->PieceArrayToString($oData->Seq()->GetStore());
    }

    // ++ ENCODE - utility helpers ++ //

    /**
     * INPUT: array of Piece objects
     * CEMENT root class/iface
     */
    public function PieceArrayToString(array $arSeq) : string {
        $oMarks = $this->CommSpec()->Marks();
        $sSep = $oMarks->SepMark();
        $sSet = $oMarks->SetMark();
        $sOut = '';
        foreach ($arSeq as $nIdx => $oPiece) {
            $sOut .= $sSep.$oPiece->Name();
            if ($oPiece->HasIt()) {
                $sEncoded = $this->EncodePart($oPiece->Value());
                $sOut .= $sSet.$sEncoded;
            }
        }
        return $sOut;
    }

    // ++ ENCODE - internal helpers ++ //

    /**
     * OPPOSITE: DecodePart()
     */
    protected function EncodePart(string $s) : string {
        $oMarks = $this->CommSpec()->Marks();
        $sEsc = $oMarks->EscMark();
        // escape any bare escape characters
        $sOut = str_replace($sEsc, $sEsc.$sEsc, $s);
        // escape any assign-operator chars
        $sOut = str_replace($oMarks->SetMark(), $sEsc.'A', $sOut);
        // escape any item-separator chars
        $sOut = str_replace($oMarks->SepMark(), $sEsc.'S', $sOut);
        // do any necessary venue-specific encoding
        $uriOut = $this->VenueEncode($sOut);
        return $uriOut;
    }
    protected function AppendSegment(string $sSequence, string $sSegment) : string {
        if (strlen($sSegment) > 0) {
            // if the result isn't empty, separate and append it:
            if (strlen($sSequence) > 0) {
                // if the existing string isn't empty, add the separator; otherwise, it's not needed:
                $sSequence .= $this->CommSpec()->Marks()->SepMark();
            }
            $sSequence .= $sSegment;
        }
        return $sSequence;
    }

    // -- ENCODE -- //
    // ++ DIAGS ++ //

    private $osErr = NULL;
    public function QErrorMessage() : QStrIface { return $this->osErr ?? ($this->osErr = new QStrClass); }

    // -- DIAGS -- //
}
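
The escaping scheme used by EncodePart() and DecodePart() can be tried in isolation. The following is a standalone sketch under assumed marks ('/' separator, ':' assignment, '~' escape; the real values come from the Marks object), with the venue-specific step omitted. All names in it are hypothetical. It also compares two ways of decoding, a chain of str_replace() calls and a single strtr() pass, because they give different results when the original value contains the escape character followed by 'S' or 'A'.

```php
<?php
// Hypothetical marks for this sketch only.
const MARK_SEP = '/';
const MARK_SET = ':';
const MARK_ESC = '~';

// Same order as EncodePart(): escape the escape char first, then the other marks.
function sketchEncodePart(string $s) : string {
    $s = str_replace(MARK_ESC, MARK_ESC.MARK_ESC, $s);
    $s = str_replace(MARK_SET, MARK_ESC.'A', $s);
    return str_replace(MARK_SEP, MARK_ESC.'S', $s);
}

// Chained replacement: each call rescans the output of the previous one.
function sketchDecodeChained(string $s) : string {
    $s = str_replace(MARK_ESC.MARK_ESC, MARK_ESC, $s);
    $s = str_replace(MARK_ESC.'S', MARK_SEP, $s);
    return str_replace(MARK_ESC.'A', MARK_SET, $s);
}

// Single pass: strtr() never rescans text it has already replaced.
function sketchDecodeSinglePass(string $s) : string {
    return strtr($s, [
        MARK_ESC.MARK_ESC => MARK_ESC,
        MARK_ESC.'S'      => MARK_SEP,
        MARK_ESC.'A'      => MARK_SET,
    ]);
}

foreach (['a/b:c', '~S'] as $sPlain) {
    $sEnc = sketchEncodePart($sPlain);
    echo "[$sPlain] -> [$sEnc]"
        .' chained: ['.sketchDecodeChained($sEnc).']'
        .' single-pass: ['.sketchDecodeSinglePass($sEnc).']'."\n";
}
```

For 'a/b:c' both decoders recover the input. For '~S', the encoded form is '~~S'; the chained version first collapses it to '~S' and then turns that into '/', while the single-pass version returns '~S' as intended.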