Class TMultiPageTemplate

Unit

Declaration

type TMultiPageTemplate = class(TObject)

Description

A multi-page template, which defines which web pages are processed and how.

A multi-page template defines a list of actions, each action listing webpages to download and queries to run on those webpages.
You can then call an action, let it run its queries, and read the result as variables.

(In the past, patterns were also called templates, but they are very different from the multi-page template of this unit.
A multi-page template is a list of explicit actions that are performed in order, like an algorithm or script.
A pattern (single-page template) is an implicit pattern that is matched against the page, like a regular expression.)

The syntax of a multi-page template is inspired by the XSLT/XProc syntax and looks like this:

<actions>
<action id="action-1">
  <variable name="foobar" value="xyz"/>

  <page url="url to send the request to">
    <header name="header name">value...</header>
    <post name="post variable name"> value... </post>
  </page>
  <pattern> ...to apply to the previous page (inline)... </pattern>
  <pattern href="to apply to the previous page (from a file)"/>

  ...

</action>
<action id="action-2">

  ...
</action>
 ...
</actions>

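<actions> contains a list/map of named actions. Each <action> can contain the following elements: <page>, <pattern>, <variable>, <loop>, <call>, <if>, <choose>, <s>, <try> and <include>.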

Details for each element:

<page url="request url">

Specifies a page to download and process.
You can use <post name="..name.." value="..value..">..value..</post> child elements under <page> to add variables for a POST request to send to the url.
If the name attribute exists, the content is url-encoded; otherwise it is sent as is.
(Currently, the value attribute and the contained text are treated as a string to send. In future versions, the contained text will be evaluated as an XPath expression.)
If no <post> children exist, a GET request is sent.

The patterns that should be applied to the downloaded page can be given directly in a <pattern> element, or in a separate file linked by the pattern-href attribute. (See THtmlTemplateParser for a description of the pattern-matching single-page template.)

The attribute test="xpath" can be used to skip a page if the condition in the attribute evaluates to false().
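As a rough illustration of the <page> element described above, the following sketch sends a POST request with two variables and one extra header, and skips the page when a variable is empty. The URL, the action id, the variable name and the header/post names are made up for this example and are not part of the library.

```xml
<action id="login">
  <!-- hypothetical variable used by the page below -->
  <variable name="user" value="alice"/>

  <!-- the page is skipped if the test evaluates to false() -->
  <page url="https://example.org/login" test="$user != ''">
    <header name="Accept-Language">en</header>
    <!-- both <post> elements have a name, so their content is url-encoded -->
    <post name="username" value="{$user}"/>
    <post name="remember">1</post>
  </page>
</action>
```

Removing both <post> children would turn this into a GET request to the same url.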

<pattern href="file" name=".."> inline pattern </pattern>

This applies a pattern to the last page.

The pattern can be given inline or loaded from a file in the href attribute.

The name attribute is only used for debugging.

<variable name="name" value="str value">xpath expression</variable>

This sets the value of the variable with name $name.

If the value attribute is given, the variable is set to the string value of the attribute; otherwise, the XPath expression is evaluated and its result is used.

The last downloaded webpage is available as the root element in the XPath expression.
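A minimal sketch of both forms of <variable>, one taking a string from the value attribute and one evaluating an XPath expression against the last downloaded page. The URL and the variable names are placeholders chosen for this example.

```xml
<action id="read-title">
  <!-- string form: $base is set to the attribute text -->
  <variable name="base" value="https://example.org"/>

  <page url="{$base}/index.html"/>

  <!-- XPath form: evaluated with the page above as root -->
  <variable name="title">string(//title)</variable>
</action>
```

After the action has run, $base and $title can be read as variables by the caller.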

<loop var="variable name" list="list (xpath)" test="condition (xpath)">

Repeats the children of this element.
It can be used like a foreach loop by giving the var/list attributes, like a while loop by using test, or like a combination of both.
In the first case, the expression in list is evaluated, each element of the resulting sequence is assigned once to the variable with the name $var, and the loop body is evaluated each time.
In the second case, the loop is repeated until the expression in the test attribute evaluates to false.
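The two loop styles might look as follows. This is only a sketch: the URLs, the action ids, the variable names and the rel='next' link convention are assumptions made for the example, and the second loop assumes that the pages link to their successor with an absolute url.

```xml
<actions>
  <action id="visit-list">
    <!-- foreach style: $path takes each item of the sequence in turn -->
    <loop var="path" list="('a.html', 'b.html', 'c.html')">
      <page url="https://example.org/{$path}"/>
    </loop>
  </action>

  <action id="follow-next-links">
    <variable name="next" value="https://example.org/page1"/>
    <!-- while style: stops once no further link is found -->
    <loop test="$next != ''">
      <page url="{$next}"/>
      <variable name="next">string(//a[@rel='next']/@href)</variable>
    </loop>
  </action>
</actions>
```

In the second action, string() of an empty result is the empty string, so the test becomes false on the last page.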

<call action="name">

Calls the action of the given name.

<if test="...">

Evaluates the children of this element if the test evaluates to true().

<choose> <when test="..."/> <otherwise/> </choose>

Evaluates the tests of the when-elements and the children of the first <when> that is true.
If no test evaluates to true(), the children of <otherwise> are evaluated.
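The control elements <call>, <if> and <choose> can be combined as in the sketch below. It assumes that the tests are evaluated against the last downloaded page, as described for <variable>, and that an action with the hypothetical id login exists elsewhere in the template. The URL, the form id and the class name are likewise invented for the example.

```xml
<action id="open-account">
  <page url="https://example.org/account"/>

  <choose>
    <!-- only the children of the first true <when> are evaluated -->
    <when test="exists(//form[@id='login'])">
      <call action="login"/>
      <page url="https://example.org/account"/>
    </when>
    <otherwise>
      <variable name="status" value="already logged in"/>
    </otherwise>
  </choose>

  <if test="exists(//div[@class='error'])">
    <variable name="status" value="error page"/>
  </if>
</action>
```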

<s>...</s>

Evaluates an XPath/XQuery expression (which can set global variables with :=).

<try> ... <catch errors="...">...</catch> </try>

Iff an error occurs during the evaluation of the non-<catch> children of the <try> element, the children of the matching <catch> element are evaluated. This behaves similarly to the try-except statement in Pascal and <try><catch> in XSLT.

The errors attribute is a whitespace-separated list of error codes caught by that <catch> element. XPath/XQuery errors have the form err:* with the value of * given in the XQuery standard.
HTTP errors have the internal form pxp:http123 where pxp: is the default prefix. Nevertheless, they can be matched using the namespace prefix http as http:123. Partial wildcards are accepted like http:4* to match the range 400 to 499.
pxp:pattern is used for pattern matching failures.
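A sketch of error handling with the codes described above. The first <catch> handles a single HTTP status, the second one lists two codes separated by whitespace. The URL, the pattern file name and the variable name are placeholders, and http:5* is assumed to work by analogy with the http:4* wildcard.

```xml
<action id="fetch-details">
  <try>
    <page url="https://example.org/maybe-missing"/>
    <pattern href="details.html"/>

    <!-- page not found -->
    <catch errors="http:404">
      <variable name="status" value="missing"/>
    </catch>

    <!-- server errors, or the pattern did not match the page -->
    <catch errors="http:5* pxp:pattern">
      <variable name="status" value="failed"/>
    </catch>
  </try>
</action>
```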

<include href="filename">

Includes another XML file. It behaves as if the elements of the other file were copy-pasted here.

Within all string attributes, you can access the previously defined variables by writing {$variable} .
Within an XPath expression, you can access the variable with $variable.

Hierarchy

Overview

Fields

Public baseActions: TTemplateAction;
Public name:string;

Methods

Public constructor create();
Public procedure loadTemplateFromDirectory(_dataPath: string; aname: string = 'unknown');
Public procedure loadTemplateFromString(template: string; aname: string = 'unknown'; path: string = '');
Public procedure loadTemplateWithCallback(loadSomething: TLoadTemplateFile; _dataPath: string; aname: string = 'unknown');
Public destructor destroy; override;
Public function findAction(_name:string): TTemplateAction;
Public function findVariableValue(aname: string): string;
Public function clone: TMultiPageTemplate;

Description

Fields

Public baseActions: TTemplateAction;

The primary <actions> element (or the first <action> element, if only one exists).

Public name:string;

A name for the template, used for debugging.

Methods

Public constructor create();
 
Public procedure loadTemplateFromDirectory(_dataPath: string; aname: string = 'unknown');

Loads a template from a directory.
The multi-page template is read from the file named template.

Public procedure loadTemplateFromString(template: string; aname: string = 'unknown'; path: string = '');

Loads a template directly from a string.
If the template loads additional files like include files, you need to give a path.

Public procedure loadTemplateWithCallback(loadSomething: TLoadTemplateFile; _dataPath: string; aname: string = 'unknown');

Loads a template using a callback function. The callback function is called with different file names to load the corresponding files.

Public destructor destroy; override;
 
Public function findAction(_name:string): TTemplateAction;

Returns the <action> element with the given id.

Public function findVariableValue(aname: string): string;

Finds the first <variable> element defining a variable with the given name.
Only returns the value of the value attribute, ignoring any contained XPath expression.

Public function clone: TMultiPageTemplate;
 

Generated by PasDoc 0.16.0.