
This page is for a data modeller who needs to load data from an external source via HTTP.

Overview

An HTTP collector reads data from an HTTP datasource. The collector defines how the data needed from the datasource is extracted for use in PhixFlow.

To add a new HTTP collector to an analysis model:

  1. Go to the model's toolbar → Create group.
  2. Click HTTP to expand the menu.
  3. Drag an HTTP Collector onto the analysis model.

To add an existing HTTP collector to an analysis model, in the model diagram toolbar:

  1. Go to the model toolbar → List group.
  2. Click HTTP to expand the menu.
  3. Click HTTP Collector to open the list of available collectors.
  4. Drag an HTTP collector into the analysis model.


Table Values in an HTTP Collector

To drive the lookups made by an HTTP collector from a table, the two must be connected using a lookup pipe. For example, a URL for a server can either be captured, or calculated, on a table in an attribute called "ServerUrl". The URL is then passed via a lookup pipe to the HTTP collector to be used in its URL Expression.

If the pipe is called in, here is how the URL Expression would be written on the HTTP collector:

  {substring(in.ServerUrl,9)}
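
As an illustration only, the following Python sketch mimics what the expression above appears to do. It assumes that substring takes a 1-based start position and returns the rest of the string, so that position 9 skips an 8-character https:// prefix. The helper substring_from and the sample URL are hypothetical and are not part of PhixFlow.

```python
def substring_from(text: str, start: int) -> str:
    """Return text from a 1-based start position to the end (assumed semantics)."""
    return text[start - 1:]


# Hypothetical value of the ServerUrl attribute passed down the lookup pipe.
server_url = "https://api.example.com/v1/orders"

# Position 9 is the first character after the 8-character "https://" prefix.
stripped = substring_from(server_url, 9)
print(stripped)
```

Stripping the scheme in this way matches the requirement that the URL is supplied without its leading prefix.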


HTTP Collector Properties



Basic Settings

Field | Description
Name | Enter the name of the HTTP collector.
Enabled | Tick when the configuration is complete and the collector is ready to be used.

Send Message

Define details of the HTTP request sent to the HTTP datasource to get the data required.

Field | Description
HTTP Request Method | Select one of the following HTTP methods to use for the request:

  • GET or POST
  • GET
  • POST
  • DELETE
  • OPTIONS
  • PUT
  • PATCH

We recommend that you select a method. If you do not, PhixFlow uses GET or POST by default, which means that if the Send Message → Statement Expression:

  • evaluates to null or an empty string, PhixFlow uses GET
  • is not empty, PhixFlow uses POST.

For information, see the w3schools page about HTTP methods.

URL | The URL to be used, without the leading http:// prefix. The URL may contain embedded expressions within { }. If this field is blank, the URL field on the HTTP datasource instance is used directly.

For example, this expression adds to the base URL provided by the HTTP datasource instance:

{_url}/sub1/sub2?param1=3

The HTTP collector follows any HTTP redirections and returns the final response.
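
The default rule for HTTP Request Method described above can be sketched in Python as follows. This is a minimal illustration and not PhixFlow code: the function name effective_method is hypothetical, and it assumes that explicitly choosing "GET or POST" behaves the same as leaving the method unset.

```python
from typing import Optional


def effective_method(selected: Optional[str], statement: Optional[str]) -> str:
    """Pick the HTTP method for a request.

    selected:  the configured method, or None / "GET or POST" for the default.
    statement: the evaluated Statement Expression (may be None or empty).
    """
    if selected and selected != "GET or POST":
        # An explicit method always wins.
        return selected
    # Default rule: a null or empty statement means GET, anything else POST.
    return "POST" if statement else "GET"


print(effective_method(None, ""))
print(effective_method("GET or POST", "<results/>"))
print(effective_method("PUT", ""))
```

Selecting a method explicitly avoids the request type changing when a statement expression happens to evaluate to an empty string.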

Statement Expression | An expression to generate the data that will be sent by the collector to the datasource. Expressions should be embedded in ${value}.

Example

XML Statement
<?xml version="1.0"?>
<!DOCTYPE CORPORATE DASHBOARD "corpDash.dtd">
<results user="%USERNAME%" password="%PASSWORD%">
	<monthlyTotals region=${'"' + Region + '"'} division=${'"' + Division + '"'}>
		<totalBilled>${'"' + TotalBilled + '"'}</totalBilled>
		<totalCollected>${'"' + TotalCollected + '"'}</totalCollected>
	</monthlyTotals>
</results>

The username and password for the HTTP datasource instance are available as %USERNAME% and %PASSWORD%.

The data will be encoded using the charset parameter specified by the Content-Type header if one is present. If no Content-Type header is set, ISO-8859-1 will be used. If the Content-Type header is set but does not specify a charset, PhixFlow will use a default character set dependent on the content type.


JSON Statement
{
	user: '${user}',
	code:  ${'{size: "big"}'},
	price: ${price},
	currency: '${currency}'
}



See also HTTP datasource instance and Expressions and PhixScripts.
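
The character set rule described above for encoding the statement can be sketched as follows. This is an illustrative Python sketch, not PhixFlow's implementation: the function name request_charset is hypothetical, and because the page does not list the content-type-specific defaults, the sketch returns None for that case instead of guessing.

```python
from email.message import Message
from typing import Optional


def request_charset(content_type: Optional[str]) -> Optional[str]:
    """Return the charset used to encode the request body.

    - No Content-Type header at all: ISO-8859-1.
    - Content-Type with a charset parameter: that charset.
    - Content-Type without a charset: None (default depends on the
      content type and is not documented here).
    """
    if content_type is None:
        return "ISO-8859-1"
    header = Message()
    header["Content-Type"] = content_type
    return header.get_param("charset")


print(request_charset(None))
print(request_charset("application/json; charset=UTF-8"))
print(request_charset("text/xml"))
```

Setting an explicit charset in the Content-Type header removes any dependence on the defaults.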

HTTP Headers

This section has a toolbar with standard buttons. The grid contains a list of the HTTP headers defined for this collector. To add an HTTP header to the list, click New. PhixFlow opens a new HTTP Header properties tab. To remove an HTTP header, use Delete in the toolbar.

Some headers will be set to default values if not provided. Automatically added headers may not appear in the debug log. The Content-Length header will be added to all requests and cannot be overridden by providing a value.

Response

Define the data response type/format that will be returned and the desired location of the data.

See Response Examples below for how the returned data can be used and evaluated in the corresponding attribute expressions.

Field | Description
Return Type | Select the data type expected as the response from the API call:

  • String: returns the full data as a string value. This is available as value (for example in.value) in a subsequent table.
  • XML: allows an XPath expression to be specified in order to retrieve just specified sections of the data into XML structures. XML response types also support XML namespaces. The XML Namespaces tab is available when this response type is chosen.
  • HTML: allows an XPath expression to be specified in order to retrieve just specified sections of the data into XML structures.
  • JSON: allows a JSONPath expression to be specified in order to retrieve the desired JSON values.

XPath or JSONPath | Available when Return Type is XML, HTML or JSON. The expression used to resolve or filter the data that comes back in the selected format.

Note: Use XPath namespace syntax for XML response types only.

Lenient | Allows for leniency in the interpretation of the JSON data returned by an API. This helps to interpret poorly formed JSON data.

XML Namespaces

Available when Return Type is XML or HTML. This section has a toolbar with standard buttons. The grid contains a list of the namespaces defined in an XML response.

To add a namespace to the list, click New. PhixFlow opens a new XML Namespace properties tab. To remove a namespace, use Delete in the toolbar.


Advanced

Field | Description
HTTP Datasource | Select the HTTP datasource that this collector will collect from. For how to add a new one, see HTTP Datasource. To select from a list, click the HTTP datasource list icon.
Datasource Instance Expression | The datasource to which this collector is connected may list multiple instances from which the data can be accessed. Each HTTP datasource instance is identified by a unique string. This expression should evaluate to a string that allows the collector to determine which instance to use. If the expression is blank, the collector assumes that there is only one instance and uses it by default. If there is more than one instance and no expression is provided, an error is thrown during analysis because the collector cannot determine which source to use.

See also HTTP datasource instance and Expressions and PhixScripts.
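
The instance selection rule for the Datasource Instance Expression can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function pick_instance, the dictionary of instances and the sample URLs are hypothetical and only model the behaviour described in this section.

```python
from typing import Dict, Optional


def pick_instance(instances: Dict[str, str], expression_value: Optional[str]) -> str:
    """Choose a datasource instance URL.

    instances:        maps each instance's unique string to its base URL.
    expression_value: the evaluated Datasource Instance Expression, or
                      None / "" when the expression is blank.
    """
    if expression_value:
        # The expression names the instance to use.
        return instances[expression_value]
    if len(instances) == 1:
        # Blank expression and a single instance: use it by default.
        return next(iter(instances.values()))
    # Blank expression with several instances is ambiguous.
    raise ValueError("cannot determine which datasource instance to use")


single = {"live": "api.example.com"}
several = {"live": "api.example.com", "test": "test.example.com"}

print(pick_instance(single, None))
print(pick_instance(several, "test"))
```

Calling pick_instance(several, None) raises an error, which mirrors the analysis error described above.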

Timeout (secs) | The number of seconds to wait for a response from the corresponding HTTP datasource before a timeout is recorded.
Collector Icon | PhixFlow displays this icon in controls when the HTTP collector is used.

An icon can be set by uploading an image or using an image already uploaded to the application. If this field is left blank, PhixFlow checks for an icon set on the associated HTTP datasource and uses that icon. If that is also empty, the default icon is used.


Allow Non-Scheduled Collection | Tick to run the HTTP collector as part of any ad-hoc analysis run that requires this data. If not ticked, it will only run as part of a scheduled Task Plan.

Use Raw URL | If enabled, the URL Template value is sent exactly as it is provided to the HTTP Node. If not enabled, PhixFlow converts values to form a valid URL, for example replacing spaces with %20.
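
The effect of Use Raw URL can be illustrated with Python's standard percent-encoding. This is a sketch only: PhixFlow's exact encoding rules are not documented on this page, so the set of characters left unencoded, the function name encode_url and the sample URL are all assumptions.

```python
from urllib.parse import quote


def encode_url(url: str, use_raw: bool) -> str:
    """Return the URL as it would be sent.

    use_raw=True  sends the value unchanged.
    use_raw=False percent-encodes unsafe characters such as spaces,
                  leaving common URL delimiters untouched (assumed set).
    """
    if use_raw:
        return url
    return quote(url, safe=":/?&=%")


url = "api.example.com/find?name=Bert Brown"
print(encode_url(url, use_raw=True))
print(encode_url(url, use_raw=False))
```

Enabling the raw option is mainly useful when the value has already been encoded and must not be altered.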

Log Traffic | Note: PhixFlow always logs HTTP responses and requests for HTTP collectors, whatever is set here.



Response Examples

This example uses the following example data.

XML Data

Standard
<root>
	<main page="PF Main Page">
		<title name="PF Title">PF Title Text
			<datarow>
				<data initials="AA">Alistair Andrews</data>
				<data initials="BB">Bert Brown</data>
			</datarow>
		</title>
	</main>
</root>

With Namespace
<root xmlns:h="http://example.com/schema">
	<main page="PF Main Page">
		<h:title h:name="PF Title">PF Title Text
			<h:datarow>
				<h:data h:initials="AA">Alistair Andrews</h:data>
				<h:data h:initials="BB">Bert Brown</h:data>
			</h:datarow>
		</h:title>
		<title name="Non namespace Title">Non namespace Title Text</title>
	</main>
</root>

HTML Data

<html>
	<body nodename="Html Body">
		<table>
			<tbody>
				<tr>
					<td initials="AA">Alistair Andrews</td>
					<td initials="BB">Bert Brown</td>
				</tr>
			</tbody>
		</table>
	</body>
</html>

JSON

{
	"main_page": {
		"page": "PF Main Page",
		"title" : {
			"name" : "PF Title Text",
			"data" : [
				{"initials": "AA", "value" : "Alistair Andrews"},
				{"initials": "BB", "value" : "Bert Brown"}
			]
		}
	}
}


The data is being pointed to by either HTTP datasources or XML/HTML File collectors respectively.

The following table shows the different types of responses that can be returned from an HTTP collector and how these can be used in the corresponding attribute expressions. An HTTP collector response type of XML/HTML will mimic the responses from XML/HTML collectors respectively.

Response Type | Path Expression | Explanation

String | n/a

A String response should be referenced in the attribute expressions as in.value. Note that in.value will contain the complete string data referenced above.

XML | /root/main/title

This path expression will bring back all elements matching the XPath expression, including the parent/grandparents and all child elements/sub elements.

The following examples show how to reference the returned XPath HTML/XML data structure in attribute expressions:

  • XPath element attributes: in.name -> returns 'PF Title'
  • XPath parent attributes: in.^.page -> returns 'PF Main Page'
  • XPath child attributes: listToString(in.datarow.data.initials) -> returns 'AA,BB'
  • XPath child attribute text values: listToString(in.datarow.data.value) -> returns 'Alistair Andrews,Bert Brown'

Note the use of:

  • ^ to traverse to the immediate parent element.
  • different separators to traverse to the immediate child element:
    • XPath uses /
    • attribute expressions use the dot notation .
  • the listToString function to handle multiple matching child elements/attributes.

Namespaces

XML documents containing namespaces are supported.

  • Within path expressions they are referred to using colons:
    • /root/main/h:title
  • Within attribute expressions a $ is used instead of the normal : namespace notation:
    • XPath element attributes: in.h$name -> returns 'PF Title'
    • XPath child attributes: listToString(in.h$datarow.h$data.h$initials) -> returns 'AA,BB'
    • XPath child attribute text values: listToString(in.h$datarow.h$data.value) -> returns 'Alistair Andrews,Bert Brown'

Note: The namespace prefix used here, 'h', must be configured in XML Namespaces.
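
For readers who want to check the XML examples outside PhixFlow, the following Python sketch reproduces the same lookups with the standard library. It is an analogy rather than PhixFlow's implementation: ElementTree paths are relative to the root element, and the parent lookup uses a hand-built map in place of the ^ operator.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <main page="PF Main Page">
    <title name="PF Title">PF Title Text
      <datarow>
        <data initials="AA">Alistair Andrews</data>
        <data initials="BB">Bert Brown</data>
      </datarow>
    </title>
  </main>
</root>"""

root = ET.fromstring(XML)

# /root/main/title, written relative to the root element.
title = root.find("./main/title")

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

name = title.get("name")                      # like in.name
page = parent_of[title].get("page")           # like in.^.page
rows = title.findall("./datarow/data")
initials = ",".join(d.get("initials") for d in rows)  # like listToString(in.datarow.data.initials)
names = ",".join(d.text for d in rows)                # like listToString(in.datarow.data.value)

print(name, page, initials, names, sep=" | ")
```

The join over all matching data elements plays the role of listToString when several child elements match.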



HTML | /html/body/table

Note: Namespaces are not supported in the XPath expression for HTML response types.

This path expression will bring back all elements matching the XPath expression, including the parent/grandparents and all child elements/sub elements.

The following examples show how to reference the returned XPath HTML/XML data structure in attribute expressions:

  • XPath parent attributes: in.^.nodename -> returns 'Html Body'
  • XPath child attributes: listToString(in.tbody.tr.td.initials) -> returns 'AA,BB'
  • XPath child attribute text values: listToString(in.tbody.tr.td.value) -> returns 'Alistair Andrews,Bert Brown'

Note the use of:

  • ^ to traverse to the immediate parent element.
  • the listToString function to handle multiple matching child elements/attributes.
  • the optional HTML <tbody> tags. If these are not in your HTML data, PhixFlow inserts them to conform with the HTML standards. Therefore, when using absolute XPath expressions, the tbody tags need to be included even if they are not present in the incoming HTML data. That means /html/body/table/tr should be replaced with /html/body/table/tbody/tr, or use //tr. The same applies when referencing parent/child nodes within the attribute expressions.

JSON | $.main_page.title

This path expression will bring back all elements matching the JSON path expression, including the parent/grandparents and all child elements/sub elements.

The following examples show how to reference the returned JSON path data structure in attribute expressions:

  • Immediate properties text value: in.name -> returns 'PF Title Text'
  • Parent properties: in.^.page -> returns 'PF Main Page'
  • Array properties: listToString(in.data.initials) -> returns 'AA,BB'
  • Array index: in.data.initials.1 -> returns 'AA'

Note the use of:

  • ^ to traverse to the immediate parent element.
  • the listToString function to handle multiple matching child elements/attributes.
  • array indexes, which are 1-based.
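
The JSON examples can be checked in the same way with a short Python sketch. This is an analogy using plain dictionary access rather than a JSONPath library, and the variable names are illustrative only. Python lists are 0-based, so the 1-based index used in attribute expressions is shifted by one.

```python
import json

DATA = json.loads("""
{
  "main_page": {
    "page": "PF Main Page",
    "title": {
      "name": "PF Title Text",
      "data": [
        {"initials": "AA", "value": "Alistair Andrews"},
        {"initials": "BB", "value": "Bert Brown"}
      ]
    }
  }
}
""")

parent = DATA["main_page"]
title = parent["title"]          # the object selected by $.main_page.title

name = title["name"]             # like in.name
page = parent["page"]            # like in.^.page
initials = ",".join(item["initials"] for item in title["data"])  # like listToString(in.data.initials)

one_based_index = 1
first_initials = title["data"][one_based_index - 1]["initials"]  # like in.data.initials.1

print(name, page, initials, first_initials, sep=" | ")
```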




Learn More

For links to all pages in this topic, see Analysis Models.