This page is for a data modeller who needs to load data from an external source via HTTP.
Overview
An HTTP collector reads data from an HTTP datasource. The collector defines how the required data is extracted from the datasource for use in PhixFlow.
To add a new HTTP collector to an analysis model:
- Go to the model's toolbar → Create group.
- Click to expand the menu.
- Drag a
Insert excerpt |
---|
| _http_collector |
---|
| _http_collector |
---|
nopanel | true |
---|
|
onto the analysis model.
To add an existing HTTP collector to an analysis model, in the model diagram toolbar:
- Go to the model toolbar → List group.
- Click to expand the menu.
- Click
Insert excerpt |
---|
| _http_collector |
---|
| _http_collector |
---|
nopanel | true |
---|
|
to open the list of available collectors.
- Drag an HTTP collector into the analysis model.
Insert excerpt |
---|
| _http_newlines |
---|
| _http_newlines |
---|
nopanel | true |
---|
|
Table Values in a HTTP Collector
To drive the lookups made by an HTTP collector from a table, connect the two with a lookup pipe. For example, the URL for a server can be either captured or calculated in an attribute called "ServerUrl". The URL is then passed through the lookup pipe to the HTTP collector, which uses it in its URL Expression.
If the pipe is called "in", the URL Expression on the HTTP collector would be written as:
{substring(in.ServerUrl,9)}
Insert excerpt |
---|
| _property_toolbar |
---|
| _property_toolbar |
---|
nopanel | true |
---|
|
Insert excerpt |
---|
| _property_tabs |
---|
| _property_tabs |
---|
name | basic-h |
---|
nopanel | true |
---|
|
Insert excerpt |
---|
| _parent |
---|
| _parent |
---|
nopanel | true |
---|
|
Basic Settings
Field | Description |
---|
Name | Enter the name of the HTTP collector. |
Enabled | Insert excerpt |
---|
| _check_box_tick |
---|
| _check_box_tick |
---|
nopanel | true |
---|
| when the configuration is complete and the collector is ready to be used. |
Send Message
Define details of the HTTP request sent to the HTTP Datasource to get the data required.
Field | Description |
---|
HTTP Request Method |
Excerpt |
---|
Select one of the following HTTP methods to use for the request:
- GET or POST
- GET
- POST
- DELETE
- OPTIONS
- PUT
- PATCH
We recommend that you select a method. If you do not, PhixFlow uses GET or POST by default. If the Send Message → Statement Expression:
- evaluates to null or an empty string, PhixFlow uses GET
- is not empty, PhixFlow uses POST.
For information, see the w3schools page about HTTP methods. |
|
URL Expression | Insert excerpt |
---|
| _url_expression |
---|
| _url_expression |
---|
nopanel | true |
---|
|
The HTTP collector follows any HTTP redirections and returns the final response. |
Statement Expression | An expression to generate the data that will be sent by the collector to the datasource.
Example Code Block |
---|
| <?xml version ="1.0"?>
<!DOCTYPE CORPORATE DASHBOARD "corpDash.dtd">
<results user="%USERNAME%" password="%PASSWORD%">
<monthlyTotals region=${'"' + Region + '"'} division=${'"' + Division + '"'}>
<totalBilled>${'"' + TotalBilled + '"'}</totalBilled>
<totalCollected>${'"' + TotalCollected + '"'}</totalCollected>
</monthlyTotals>
</results> |
Code Block |
---|
title | Example JSON Statement |
---|
| {
user: '${user}',
code: ${'{size: "big"}'},
price: ${price},
currency: '${currency}'
} |
Insert excerpt |
---|
| _expression_lang |
---|
| _expression_lang |
---|
nopanel | true |
---|
|
Insert excerpt |
---|
| _secret |
---|
| _secret |
---|
nopanel | true |
---|
|
See also HTTP datasource instance and Expressions and PhixScripts. |
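The default described for HTTP Request Method amounts to a small decision rule. The following Python sketch is illustrative only and is not PhixFlow code; the function name `choose_http_method` and its parameters are hypothetical.

```python
from typing import Optional


def choose_http_method(statement: Optional[str], selected: Optional[str] = None) -> str:
    """Return the HTTP method to use for a request.

    statement: the evaluated Statement Expression (may be None or empty).
    selected:  the method explicitly chosen on the collector, if any.
    """
    # An explicitly selected method always wins.
    if selected:
        return selected.upper()
    # No method selected: a null or empty statement means GET ...
    if statement is None or statement == "":
        return "GET"
    # ... and any non-empty statement means POST.
    return "POST"


print(choose_http_method(None))          # no method selected, no statement
print(choose_http_method("<results/>"))  # no method selected, statement present
print(choose_http_method("", "put"))     # explicit selection
```

Selecting a method explicitly, as recommended above, avoids depending on whether the statement happens to evaluate to an empty string.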
This section has a toolbar with standard buttons. The grid contains a list of the HTTP headers defined for this collector. To add an HTTP header to the list, click the add icon in the toolbar. PhixFlow opens a new HTTP Header properties tab. To remove an HTTP header, use the
Insert excerpt |
---|
| _delete |
---|
| _delete |
---|
nopanel | true |
---|
|
in the toolbar.
Some headers will be set to default values if not provided. Automatically added headers may not appear in the debug log. The Content-Length header will be added to all requests and cannot be overridden by providing a value.
Response
Define the type or format of the response data that will be returned:
- HTML: allows an XPath expression to be specified in order to retrieve only the specified sections of the data into XML structures.
- XML: allows an XPath expression to be specified in order to retrieve only the specified sections of the data into XML structures. XML response types also support XML namespaces. The XML Namespaces tab is available when this response type is chosen.
- JSON: allows a JsonPath expression to be specified in order to retrieve only the specified sections of the data.
- String: returns the full data as a string value.
See Response Examples for how the returned data can be used and evaluated in the corresponding attribute expressions.
Field | Description |
---|
Return Type | Select the type of the expected response: String, XML, HTML or JSON. |
XPath or JsonPath | Available when Return Type is XML, HTML or JSON. The expression used to resolve or filter the data that comes back in the selected format. Note |
---|
Use XPath namespaces syntax for XML response types only. |
|
XML Namespaces
Available when Return Type is XML or HTML. This section has a toolbar with standard buttons. The grid contains a list of the namespaces defined in an XML response.
To add a namespace to the list, click the add icon in the toolbar. PhixFlow opens a new XML Namespace panel. To remove a namespace, use the
Insert excerpt |
---|
| _delete |
---|
| _delete |
---|
nopanel | true |
---|
|
in the toolbar.
Insert excerpt |
---|
| _model_prop |
---|
| _model_prop |
---|
nopanel | true |
---|
|
Advanced
Field | Description |
---|
HTTP Datasource | Select the HTTP datasource that this collector will collect from. For how to add a new one, see HTTP Datasource. To select from a list, click Insert excerpt |
---|
| _http_datasource |
---|
| _http_datasource |
---|
name | list |
---|
nopanel | true |
---|
| . |
Datasource Instance Expression | The datasource to which this collector is connected may list multiple instances from which the data may be accessed. Each HTTP datasource instance is identified by a unique string. This expression should evaluate to a string which allows the collector to determine the specific instance to use. If the expression is blank then the collector will assume that there is only one instance and will use that one by default. If there is more than one instance and no expression is provided here then an error will be thrown during analysis since the collector will be unable to determine which source to use. See also HTTP datasource instance and Expressions and PhixScripts. |
Timeout (secs) | The number of seconds to wait for a response from the corresponding HTTP datasource before a timeout is recorded. |
Icon | Enter the path for an image file that has been uploaded to the PhixFlow database. PhixFlow displays this icon in controls when the HTTP collector is used. |
Allow Non-Scheduled Collection | Insert excerpt |
---|
| _check_box_tick |
---|
| _check_box_tick |
---|
nopanel | true |
---|
|
to run the HTTP collector as part of any ad-hoc analysis run that requires this data. If not, it will only run as part of a scheduled Task Plan. |
Log Traffic | Insert excerpt |
---|
| _log_traffic2 |
---|
| _log_traffic2 |
---|
nopanel | true |
---|
|
Note |
---|
PhixFlow always logs HTTP responses and requests for HTTP collectors, whatever is set here. |
|
Insert excerpt |
---|
| _model_prop |
---|
| _model_prop |
---|
nopanel | true |
---|
|
Insert excerpt |
---|
| _description |
---|
| _description |
---|
nopanel | true |
---|
|
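The rules given for Datasource Instance Expression can be sketched as follows. This Python function is illustrative only and is not PhixFlow code; the name `choose_instance`, the error type, and the handling of an unknown or missing instance are assumptions made for the example.

```python
from typing import Optional, Sequence


def choose_instance(instances: Sequence[str], expression_result: Optional[str]) -> str:
    """Pick the datasource instance a collector should use.

    instances:         the unique strings identifying each instance.
    expression_result: the evaluated Datasource Instance Expression,
                       or None/"" when the expression is blank.
    """
    if expression_result:
        # Assumption: a value that matches no instance is treated as an error.
        if expression_result not in instances:
            raise LookupError(f"unknown instance: {expression_result!r}")
        return expression_result
    # Blank expression: only valid when there is exactly one instance.
    if len(instances) == 1:
        return instances[0]
    raise LookupError("expression is blank and the instance cannot be determined")


print(choose_instance(["live"], None))            # single instance, blank expression
print(choose_instance(["live", "test"], "test"))  # expression selects an instance
```

With two or more instances and a blank expression the sketch raises an error, mirroring the error thrown during analysis.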
Anchor |
---|
| responseExamples |
---|
| responseExamples |
---|
|
Response Examples
These examples use the following XML, HTML and JSON example data.
XML Data
Code Block |
---|
|
<root>
<main page="PF Main Page">
<title name="PF Title">PF Title Text
<datarow>
<data initials="AA">Alistair Andrews</data>
<data initials="BB">Bert Brown</data>
</datarow>
</title>
</main>
</root> |
With Namespace
Code Block |
---|
|
<root xmlns:h="http://example.com/schema">
<main page="PF Main Page">
<h:title h:name="PF Title">PF Title Text
<h:datarow>
<h:data h:initials="AA">Alistair Andrews</h:data>
<h:data h:initials="BB">Bert Brown</h:data>
</h:datarow>
</h:title>
</main>
</root> |
HTML Data
Code Block |
---|
|
<html>
<body nodename="Html Body">
<table>
<tbody>
<tr>
<td initials="AA">Alistair Andrews</td>
<td initials="BB">Bert Brown</td>
</tr>
</tbody>
</table>
</body>
</html> |
JSON
Code Block |
---|
|
{
"main_page": {
"page": "PF Main Page",
"title" : {
"name" : "PF Title Text",
"data" : [
{"initials": "AA", "value" : "Alistair Andrews"},
{"initials": "BB", "value" : "Bert Brown"}
]
}
}
}
|
The data is pointed to by HTTP datasources or by XML/HTML file collectors, respectively.
The following table shows the different types of response that can be returned from an HTTP collector and how they can be used in the corresponding attribute expressions. An HTTP collector response type of XML or HTML mimics the responses from XML or HTML collectors, respectively.
Response Type | Path Expression | Explanation |
---|
String | n/a | Reference a String response in attribute expressions as in.value. Note that in.value contains the complete response data as a single string. |
XML | /root/main/h:title | This path expression brings back all elements matching the XPath expression, including the parents and grandparents and all child elements and subelements.
The following examples show how to reference the returned XML data structure in attribute expressions:
- XPath element text value: in.value -> returns 'PF Title Text'
- XPath element attributes: in.name -> returns 'PF Title'
- XPath parent attributes: in.^.page -> returns 'PF Main Page'
- XPath child attributes: listToString(in.datarow.data.initials) -> returns 'AA,BB'
- XPath child attribute text values: listToString(in.datarow.data.value) -> returns 'Alistair Andrews,Bert Brown'
Note on the use of namespaces: XML documents containing namespaces are supported.
- Within path expressions, namespaces are referred to using a colon, for example h:title.
- Within attribute expressions, a $ is used instead of the normal : namespace notation.
- XPath element attributes: in.h$name -> returns 'PF Title'
- XPath child attributes: listToString(in.h$datarow.h$data.h$initials) -> returns 'AA,BB'
- XPath child attribute text values: listToString(in.h$datarow.h$data.value) -> returns 'Alistair Andrews,Bert Brown'
Note |
---|
The namespace prefix used here, 'h', must be configured in the XML Namespaces section. |
|
HTML | /html/body/table
Note |
---|
Namespaces are not supported in the XPath expression for HTML response types. |
| This path expression brings back all elements matching the XPath expression, including the parents and grandparents and all child elements and subelements.
The following examples show how to reference the returned HTML data structure in attribute expressions:
- XPath parent attributes: in.^.nodename -> returns 'Html Body'
- XPath child attributes: listToString(in.tbody.tr.td.initials) -> returns 'AA,BB'
- XPath child attribute text values: listToString(in.tbody.tr.td.value) -> returns 'Alistair Andrews,Bert Brown'
Note the use of:
- ^ to traverse to the immediate parent element.
- the listToString function to handle multiple matching child elements/attributes.
- the optional HTML <tbody> tags. If these are not in your HTML data, PhixFlow inserts them to conform with the HTML standards. Therefore, when using absolute XPath expressions, the tbody tags must be included even if they are not present in the incoming HTML data. That means /html/body/table/tr should be replaced with /html/body/table/tbody/tr, or use //tr. The same applies when referencing parent/child nodes within the attribute expressions.
|
JSON | $.main_page.title | This path expression brings back all elements matching the JsonPath expression, including the parents and grandparents and all child elements and subelements.
The following examples show how to reference the returned JSON data structure in attribute expressions:
- Immediate properties text value: in.name -> returns 'PF Title Text'
- Parent properties: in.^.page -> returns 'PF Main Page'
- Array properties: listToString(in.data.initials) -> returns 'AA,BB'
- Array index: in.data.initials.1 -> returns 'AA'
Note the use of:
- ^ to traverse to the immediate parent element.
- the listToString function to handle multiple matching child elements/attributes.
- array indexes, which are 1-based.
|
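To make the lookups in the table more concrete, the following Python sketch reproduces the same values from the namespaced XML sample and the JSON sample using only the standard library. It is an illustration of what the path and attribute expressions select, not a description of how PhixFlow evaluates them; the sample data is restated inline so the block is self-contained, and all variable names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

XML_TEXT = """<root xmlns:h="http://example.com/schema">
  <main page="PF Main Page">
    <h:title h:name="PF Title">PF Title Text
      <h:datarow>
        <h:data h:initials="AA">Alistair Andrews</h:data>
        <h:data h:initials="BB">Bert Brown</h:data>
      </h:datarow>
    </h:title>
  </main>
</root>"""

JSON_TEXT = """{"main_page": {"page": "PF Main Page", "title": {
  "name": "PF Title Text",
  "data": [{"initials": "AA", "value": "Alistair Andrews"},
           {"initials": "BB", "value": "Bert Brown"}]}}}"""

# The prefix 'h' must be mapped to its URI, much like the XML Namespaces grid.
NS = {"h": "http://example.com/schema"}
H = "{http://example.com/schema}"  # ElementTree's form for namespaced names

root = ET.fromstring(XML_TEXT)
main = root.find("main")
title = main.find("h:title", NS)              # /root/main/h:title
rows = title.findall("h:datarow/h:data", NS)

xml_text = title.text.strip()                            # in.value
xml_name = title.get(H + "name")                         # in.h$name
xml_parent_page = main.get("page")                       # in.^.page
xml_initials = ",".join(r.get(H + "initials") for r in rows)
xml_values = ",".join(r.text for r in rows)

doc = json.loads(JSON_TEXT)
node = doc["main_page"]["title"]                         # $.main_page.title
json_name = node["name"]                                 # in.name
json_parent_page = doc["main_page"]["page"]              # in.^.page
json_initials = ",".join(d["initials"] for d in node["data"])
# PhixFlow array indexes are 1-based, so index 1 there is index 0 in Python.
json_first = node["data"][0]["initials"]                 # in.data.initials.1

print(xml_text, xml_name, xml_parent_page, xml_initials, xml_values)
print(json_name, json_parent_page, json_initials, json_first)
```

ElementTree keeps no parent pointer, so the sketch reads the parent's attribute from the `main` element directly rather than traversing upwards as `^` does.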