...

Form: HTTP Collector Details

The form provides the standard form icons

Configure the following fields to set up an HTTP Collector:

...

Excerpt

Select one of the following HTTP methods to use for the request:

GET or POST
GET
POST
PUT
DELETE
OPTIONS

We recommend that you select a method, but if you do not, PhixFlow uses GET or POST by default. If the Send Message → Statement Expression:

evaluates to null or empty string, PhixFlow uses GET
is not empty, PhixFlow uses POST.

For information, see the w3schools page about HTTP methods.

...

Send Message

Define details of the HTTP request sent to the HTTP Datasource to get the data required.

...

The URL to be used, without the leading http:// prefix. The URL may contain embedded expressions within { }. If this field is blank, the url field on the httpDatasourceInstance is used directly.

For Example, this expression adds to the base url provided by the HTTP Datasource Instance :

{_url}/sub1/sub2?param1=3

...

<?xml version ="1.0"?>
<!DOCTYPE CORPORATE DASHBOARD "corpDash.dtd">
<results user="%USERNAME%" password="%PASSWORD%">
	<monthlyTotals region={'"' + Region + '"'} division={'"' + Division + '"'}>
		<totalBilled>{'"' + TotalBilled + '"'}</totalBilled>
		<totalCollected>{'"' + TotalCollected + '"'}</totalCollected>
	</monthlyTotals>
</results>

The username and password for the HTTP Datasource Instance are available as %USERNAME% and %PASSWORD%.

The data will be encoded using the charset parameter specified by the Content-Type header if one is present. If no Content-Type header is set, then ISO-8859-1 will be used. If the Content-Type header is set but does not specify a charset, then PhixFlow will use a default character set dependent on the content type.
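The charset rules above can be sketched in Python as follows. This is only an illustration of the three cases described, not PhixFlow's implementation; the function name `request_charset` is hypothetical, and because this page does not list the per-content-type defaults, the sketch returns `None` for that case.

```python
from email.message import Message
from typing import Optional


def request_charset(content_type: Optional[str]) -> Optional[str]:
    """Return the charset used to encode the request body (illustrative only)."""
    # Case 1: no Content-Type header at all, so ISO-8859-1 is used.
    if content_type is None:
        return "ISO-8859-1"

    # Parse the header value with the standard library's header parser.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")

    # Case 2: the header names a charset explicitly.
    if charset is not None:
        return str(charset)

    # Case 3: header present without a charset. The default depends on the
    # content type and is not listed on this page, so nothing is assumed here.
    return None


print(request_charset("text/xml; charset=UTF-8"))
print(request_charset(None))
```

Setting an explicit charset in the Content-Type header, as in the first call, avoids relying on either default.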

...

HTTP Headers

For an HTTP request, define name value pairs to include as part of the HTTP header. (e.g. content-type)

...

Name of the HTTP Collector HTTP Header Item. You must not include a colon after the name. For example, Content-Type is a valid name, whereas Content-Type: is not. For example:

Code Block
Content-Type

...

Value of the HTTP Collector HTTP Header Item. For example:

Code Block
text/xml; charset=UTF-8

...

Response

Define the data response type/format that will be returned:

HTML: response type allows an XPath Expression to be specified in order to retrieve just specified sections of the data into XML structures.
XML: response type allows an XPath Expression to be specified in order to retrieve just specified sections of the data into XML structures. XML response types also support XML namespaces. The Xml Namespaces tab will be available when this response type is chosen.
String: response type will return the full data as a string value.

Please see Response Examples for how the returned data can be used and evaluated in the corresponding stream attribute expressions.

...

Xml Namespaces

The namespaces defined in an XML response. The names given to these namespaces must match those used in any Xpath expressions used to extract data from an XML response. See examples below.

...

Name of the XML namespace. By convention, it is recommended that you use the name used in the XML response. E.g. if the XML response contains xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" then make this soapenv.

However, this is not mandatory - you can give any namespace any name - all that matters is that the names defined here match those you use in XPath expressions to extract data from the XML response.

In particular, default namespaces, e.g. xmlns="urn:xmlns:company-com:message" can be given any name, providing that you use this name in XPath expressions.

See HTTP Collectors for examples of using namespaces in XPath expressions to extract data from XML responses.

For example:

Code Block
soapenv

...

Value of the XML namespace. For example:

Code Block
http://schemas.xmlsoap.org/soap/envelope/

...

Stream Values in a HTTP Collector

To drive the lookups made by an HTTP Collector from a stream, the two must be connected using a lookup pipe. For example, if a URL for a server is captured or calculated on a stream in an attribute called "ServerURL" and passed to the HTTP Collector to be used in its URL Expression, the pipe connecting the two must be a lookup pipe. If the pipe is called "in", the URL Expression on the HTTP Collector would be written as: {substring(in.ServerUrl,9)}

...

Given the following XML and HTML data that is being pointed to by either HTTP datasources or XML/HTML File collectors respectively.

XML Data <?xml version ="1.0"?> <root xmlns:h="http://www.w3.org/TR/html4/"> <main page="PF Main Page" > <h:title h:name="PF Title">PF Title Text <h:datarow> <h:data h:initials="AD">Ali Dawson</h:data> <h:data h:initials="GP">Gary Parden</h:data> </h:datarow> </h:title> <title name="Non namespace Title">Non namespace Title Text</title> </main> </root>
HTML Data <html> <body nodename="Html Body"> <table> <tbody> <tr> <td initials="AD">Ali Dawson</td> <td initials="GP">Gary Parden</td> </tr> </tbody> </table> </body> </html>

The following table shows the different types of responses that can be returned from an HTTP Collector and how these can be used in the corresponding stream attribute expressions. A HTTP Collector response type of XML/HTML will mimic the responses from XML/HTML Collectors respectively.

...

Note that the namespace prefix used here, 'h', must be configured in the HTTP XML Namespaces form.

...

<root xmlns:h="http://www.w3.org/TR/html4/"> <main page="PF Main Page" > <h:title h:name="PF Title">PF Title Text <h:datarow> <h:data h:initials="AD">Ali Dawson</h:data> <h:data h:initials="GP">Gary Parden</h:data> </h:datarow> </h:title> </main> </root>

...


This page is for a data modeller who needs to load data from an external source via HTTP.

Overview

An HTTP collector reads data from an HTTP Datasource. The collector defines how the data needed from the datasource is extracted for use in PhixFlow.

To add a new HTTP collector to an analysis model:

Go to the model's toolbar → Create group.
Click
Insert excerpt
_http
_http
nopanel true
to expand the menu.
Drag a
Insert excerpt
_http_collector
_http_collector
nopanel true
onto the analysis model.

To add an existing HTTP collector to an analysis model, in the model diagram toolbar:

Go to the model toolbar → List group.
Click
Insert excerpt
_http
_http
nopanel true
to expand the menu.
Click
Insert excerpt
_http_collector
_http_collector
nopanel true
to open the list of available collectors.
Drag an HTTP collector into the analysis model.

Insert excerpt

	_http_newlines
	_http_newlines
nopanel	true

Table Values in a HTTP Collector

To drive the lookups made by an HTTP collector from a table, the two must be connected using a lookup pipe. For example, a URL for a server can either be captured or calculated in an attribute called "ServerURL". The URL is then passed via a lookup pipe to the HTTP collector to be used in its URL Expression.

If the pipe is called in, here is how the URL Expression would be written on the HTTP collector:
{substring(in.ServerUrl,9)}
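As a rough illustration of what this URL Expression does, the Python sketch below assumes that `substring` takes a 1-based start position, so that starting at position 9 drops an 8-character `https://` prefix. Both the 1-based indexing and the sample URL are assumptions made for this example, and `substring_from` is a hypothetical stand-in rather than a PhixFlow function.

```python
def substring_from(text: str, start: int) -> str:
    """Hypothetical stand-in for substring(text, start), assuming a 1-based start."""
    # Convert the assumed 1-based position to Python's 0-based slicing.
    return text[start - 1:]


# Assumed sample value for the ServerURL attribute read through the "in" pipe.
server_url = "https://api.example.com/v1/items"

# "https://" is 8 characters long, so position 9 is the first character of the host.
print(substring_from(server_url, 9))  # api.example.com/v1/items
```

If the attribute held an `http://` URL instead, the prefix would be 7 characters long and the start position would need adjusting to match.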

Insert excerpt

	_property_toolbar
	_property_toolbar
nopanel	true

Insert excerpt
_property_tabs
_property_tabs
name basic-h
nopanel true

Insert excerpt

	_parent
	_parent
nopanel	true

Basic Settings

Field

Description

Name

Enter the name of the HTTP collector.

Enabled

Insert excerpt

	_check_box_tick
	_check_box_tick
nopanel	true

when the configuration is complete and the collector is ready to be used.

Send Message

Define details of the HTTP request sent to the HTTP Datasource to get the data required.

Field Description

HTTP Request Method

Excerpt

Select one of the following HTTP methods to use for the request:

GET or POST
GET
POST
DELETE
OPTIONS
PUT
PATCH

We recommend that you select a method but if you do not, PhixFlow uses GET or POST by default. If the Send Message → Statement Expression:

evaluates to null or empty string, PhixFlow uses GET
is not empty, PhixFlow uses POST.
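The default rule above can be summarised in a short Python sketch. It only restates the documented behaviour; `default_http_method` is a hypothetical name, and treating a whitespace-only statement as non-empty is an assumption, since the page mentions only null and the empty string.

```python
from typing import Optional


def default_http_method(statement: Optional[str]) -> str:
    """Return the method used when no HTTP Request Method is selected."""
    # A null or empty Statement Expression result means there is no body to send.
    if statement is None or statement == "":
        return "GET"
    # Any other result is sent as the request body.
    return "POST"


print(default_http_method(None))          # GET
print(default_http_method(""))            # GET
print(default_http_method("<results/>"))  # POST
```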

For information, see the w3schools page about HTTP methods.

URL

Insert excerpt

	_url_expression
	_url_expression
nopanel	true

The HTTP collector follows any HTTP redirections and returns the final response.

Statement Expression

An expression to generate the data that will be sent by the collector to the datasource. Expressions should be embedded in ${value}.

Examples

Code Block

title	XML Statement

<?xml version ="1.0"?>
<!DOCTYPE CORPORATE DASHBOARD "corpDash.dtd">
<results user="%USERNAME%" password="%PASSWORD%">
	<monthlyTotals region=${'"' + Region + '"'} division=${'"' + Division + '"'}>
		<totalBilled>${'"' + TotalBilled + '"'}</totalBilled>
		<totalCollected>${'"' + TotalCollected + '"'}</totalCollected>
	</monthlyTotals>
</results>

Code Block

title	JSON Statement

{
	user: '${user}',
	code:  ${'{size: "big"}'},
	price: ${price},
	currency: '${currency}'
}
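To show the general idea of `${...}` substitution, here is a minimal Python sketch using the standard library's `string.Template`, which happens to use the same `${name}` placeholder syntax. It is an analogy only: PhixFlow placeholders can contain full expressions, whereas `string.Template` accepts plain names, and the field values used here are made up for the example.

```python
import json
from string import Template

# A statement template with three placeholders, similar in shape to the JSON example.
STATEMENT = Template('{"user": "${user}", "price": ${price}, "currency": "${currency}"}')


def render_statement(user: str, price: float, currency: str) -> str:
    """Fill each ${...} placeholder with the supplied value."""
    return STATEMENT.substitute(user=user, price=price, currency=currency)


body = render_statement("alice", 12.5, "GBP")
# Parsing the result confirms that the rendered statement is valid JSON.
print(json.loads(body))
```

Note that string values are quoted in the template itself, while the numeric placeholder is left unquoted so that it renders as a JSON number.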

Insert excerpt

	_expression_lang
	_expression_lang
nopanel	true

Insert excerpt

	_secret
	_secret
nopanel	true

HTTP Headers

This section has a toolbar with standard buttons. The grid contains a list of the HTTP headers defined for this collector. To add an HTTP header to the list, click

Insert excerpt

	_new
	_new
nopanel	true

. PhixFlow opens a new HTTP Header properties tab. To remove an HTTP header, use the

Insert excerpt

	_delete
	_delete
nopanel	true

in the toolbar.

Some headers will be set to default values if not provided. Automatically added headers may not appear in the debug log. The Content-Length header will be added to all requests and cannot be overridden by providing a value.

Response

Defines the data response type/format that will be returned and the desired location of the data.

See Response Examples below for how the returned data can be used and evaluated in the corresponding attribute expressions.

Field

Description

Return Type

Select the data type expected as the response from the API call:

String: response type will return the full data as a string value. This is available using the value in a subsequent table.
XML: response type allows an XPath Expression to be specified in order to retrieve just specified sections of the data into XML structures. XML response types also support XML namespaces. The XML Namespaces tab will be available when this response type is chosen.
HTML: response type allows an XPath Expression to be specified in order to retrieve just specified sections of the data into XML structures.
JSON: response type allows a (JSON) Path expression to be specified in order to retrieve the desired JSON values.

XPath or JSONPath

Available when Return Type is XML, HTML or JSON.

The expression used to resolve or filter the data that comes back in the selected format.

Note
Use XPath namespaces syntax for XML response types only.

Lenient

Allows for leniency in the interpretation of the JSON data returned by an API. This helps interpret poorly formed JSON data.

XML Namespaces

Available when Return Type is XML or HTML. This section has a toolbar with standard buttons. The grid contains a list of the namespaces defined in an XML response.

To add a namespace to the list, click

Insert excerpt

	_new
	_new
nopanel	true

. PhixFlow opens a new XML Namespace property pane. To remove a namespace, use the

Insert excerpt

	_delete
	_delete
nopanel	true

in the toolbar.

Insert excerpt

	_model_prop
	_model_prop
nopanel	true

Advanced

Field

Description

HTTP Datasource

Select the HTTP datasource that this collector will collect from. For how to add a new one, see HTTP Datasource. To select from a list, click

Insert excerpt

	_http_datasource
	_http_datasource
name	list
nopanel	true

.

Datasource Instance Expression

The datasource to which this collector is connected may list multiple instances from which the data may be accessed. Each HTTP datasource instance is identified by a unique string. This expression should evaluate to a string which allows the collector to determine the specific instance to use. If the expression is blank then the collector will assume that there is only one instance and will use that one by default. If there is more than one instance and no expression is provided here then an error will be thrown during analysis since the collector will be unable to determine which source to use.
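The selection rules described above can be sketched in Python as follows. The function `pick_instance`, the mapping of instance names to URLs, and the error message are all hypothetical; the sketch only mirrors the three outcomes stated in the description.

```python
from typing import Mapping, Optional


def pick_instance(instances: Mapping[str, str], expression_value: Optional[str]) -> str:
    """Choose a datasource instance from the expression result (illustrative only)."""
    # A non-blank expression result names the instance to use.
    if expression_value:
        return instances[expression_value]
    # A blank expression is fine when exactly one instance exists.
    if len(instances) == 1:
        return next(iter(instances.values()))
    # Otherwise the collector cannot tell which instance to use.
    raise ValueError("Datasource Instance Expression is blank; cannot choose an instance")


# Hypothetical instance names and URLs.
instances = {"live": "api.example.com", "test": "test.example.com"}
print(pick_instance(instances, "test"))
print(pick_instance({"live": "api.example.com"}, None))
```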

Timeout (secs)

The number of seconds to wait for a response from the corresponding HTTP datasource before a timeout is recorded.

Collector Icon

PhixFlow displays this icon in controls when the HTTP collector is used.

Excerpt

name	icon

An icon can be set by uploading an

Insert excerpt

	_image_list
	_image_list
name	image
nopanel	true

or using an image already uploaded to the application. If this field is left blank, PhixFlow checks for an icon set on the associated HTTP Datasource and uses that icon, and if that too is empty, then the default icon is used.

Allow Non-Scheduled Collection

Insert excerpt

	_check_box_tick
	_check_box_tick
nopanel	true

to run the HTTP collector as part of any ad-hoc analysis run that requires this data. If not, it will only run as part of a scheduled Task Plan.

Use Raw URL

If enabled the URL Template value is sent in exactly the format it is provided to the HTTP Node. If not enabled PhixFlow will transpose values to form a valid URL, such as replacing spaces with %20.
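The difference between the two settings can be illustrated with Python's `urllib.parse.quote`. This is a sketch of the general idea only: the exact set of characters PhixFlow transposes is not listed here, so the `safe` characters below are an assumption, and `encode_url` is a hypothetical name.

```python
from urllib.parse import quote


def encode_url(url_template: str, use_raw_url: bool) -> str:
    """Return the URL as it would be sent (illustrative only)."""
    # Use Raw URL enabled: send the value exactly as provided.
    if use_raw_url:
        return url_template
    # Otherwise percent-encode unsafe characters such as spaces.
    # The safe set keeps URL delimiters and existing % escapes intact (assumed).
    return quote(url_template, safe=":/?&=%")


url = "api.example.com/find?name=John Smith"
print(encode_url(url, use_raw_url=False))  # api.example.com/find?name=John%20Smith
print(encode_url(url, use_raw_url=True))   # api.example.com/find?name=John Smith
```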

Log Traffic

Insert excerpt

	_log_traffic2
	_log_traffic2
nopanel	true

Note
PhixFlow always logs HTTP responses and requests for HTTP collectors, whatever is set here.

Insert excerpt

	_model_prop
	_model_prop
nopanel	true

Insert excerpt

	_description
	_description
nopanel	true

Insert excerpt

	_audit
	_audit
nopanel	true

Anchor
responseExamples
responseExamples
Response Examples

This example uses the following example data.

XML Data

Standard

Code Block

language	xml

<root> 
	<main page="PF Main Page"> 
		<title name="PF Title">PF Title Text
			<datarow> 
				<data initials="AA">Alistair Andrews</data>
				<data initials="BB">Bert Brown</data> 
			</datarow> 
		</title> 
	</main> 
</root>

With Namespace

Code Block

language	xml

<root xmlns:h="http://example.com/schema"> 
	<main page="PF Main Page"> 
		<h:title h:name="PF Title">PF Title Text
			<h:datarow> 
				<h:data h:initials="AA">Alistair Andrews</h:data>
				<h:data h:initials="BB">Bert Brown</h:data>
			</h:datarow> 
		</h:title> 
	</main> 
</root>

HTML Data

Code Block

language	xml

<html>
	<body nodename="Html Body"> 
		<table> 
			<tbody> 
				<tr> 
					<td initials="AA">Alistair Andrews</td> 
					<td initials="BB">Bert Brown</td> 
				</tr> 
			</tbody> 
		</table> 
	</body> 
</html>

JSON

Code Block

language	xml

{
	"main_page": {
		"page": "PF Main Page",
		"title" : {
			"name" : "PF Title Text",
			"data" : [
				{"initials": "AA", "value" : "Alistair Andrews"},
				{"initials": "BB", "value" : "Bert Brown"}
			]
		}
	}
}

The data is being pointed to by either HTTP datasources or XML/HTML File collectors respectively.

The following table shows the different types of responses that can be returned from an HTTP Collector and how these can be used in the corresponding attribute expressions. A HTTP Collector response type of XML/HTML will mimic the responses from XML/HTML Collectors respectively.

Response Type Path Expression Explanation

String n/a A String response should be referenced in the attribute expressions as in.value. Note that in.value will contain the complete string data referenced above.

XML

/root/main/title

Excerpt

name	_xmlXPath

This path expression will bring back all elements matching the XPath expression including the parent/grandparents and all child elements/sub elements.

The following examples show how to reference the returned XPath HTML/XML data structure in attribute expressions:

Xpath element attributes: in.name -> returns 'PF Title'
Xpath parent attributes: in.^.page -> returns 'PF Main Page'
Xpath child attributes: listToString(in.datarow.data.initials) -> returns 'AA,BB'
Xpath child attribute text values: listToString(in.datarow.data.value) -> returns 'Alistair Andrews,Bert Brown'

Note the use of:

^ to traverse to the immediate parent element.
a separator to traverse to the immediate child element:
- XPath uses /
- Attribute expressions use the dot notation .
the listToString function to handle multiple matching child elements/attributes.
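For readers who want to check the values above outside PhixFlow, the following Python sketch reproduces the same four lookups with the standard library's `xml.etree.ElementTree`. It is an analogy, not PhixFlow syntax: ElementTree has no `^` operator, so a parent map is built by hand, and `",".join` stands in for `listToString`.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <main page="PF Main Page">
    <title name="PF Title">PF Title Text
      <datarow>
        <data initials="AA">Alistair Andrews</data>
        <data initials="BB">Bert Brown</data>
      </datarow>
    </title>
  </main>
</root>"""

root = ET.fromstring(XML)
# ElementTree elements do not know their parent, so record it explicitly.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Equivalent of the path expression /root/main/title (relative to the root element).
title = root.find("main/title")
cells = title.findall("datarow/data")

name = title.get("name")                               # in.name
page = parent_of[title].get("page")                    # in.^.page
initials = ",".join(c.get("initials") for c in cells)  # listToString(in.datarow.data.initials)
values = ",".join(c.text for c in cells)               # listToString(in.datarow.data.value)

print(name, page, initials, values, sep=" | ")
```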

Namespaces

XML documents containing namespaces are supported.

Within path expressions they are referred to using colons.
- /root/main/h:title
Within attribute expressions a $ is used instead of the normal : namespace notation.
- XPath element attributes: in.h$name -> returns 'PF Title'
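The way a prefix maps to a namespace URI can be illustrated with Python's `xml.etree.ElementTree`. This sketch is an analogy rather than PhixFlow behaviour: the sample document assumes the `name` attribute carries the `h` prefix, and the second lookup shows that the prefix used in a path only needs to match the name registered for the URI, not the prefix in the document.

```python
import xml.etree.ElementTree as ET

XML = """<root xmlns:h="http://example.com/schema">
  <main page="PF Main Page">
    <h:title h:name="PF Title">PF Title Text</h:title>
  </main>
</root>"""

URI = "http://example.com/schema"
root = ET.fromstring(XML)

# Register the prefix "h" for the namespace URI, then use it in the path.
title = root.find("main/h:title", {"h": URI})

# Namespaced attributes are addressed by their full {uri}name form in ElementTree.
name = title.get("{" + URI + "}name")

# Any other prefix works as long as it is mapped to the same URI.
same_title = root.find("main/x:title", {"x": URI})

print(name)
print(same_title is title)
```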

...

Note the use of a $ instead of our usual : namespace notation.

Note the use of a ^ to traverse to the immediate parent element.

...

Note the use of a ^ to traverse to the immediate parent element.

Note the use of the listToString function to handle multiple matching child elements/attributes.

Note the use of the optional HTML <tbody> tags. If these are not in your HTML data, then PhixFlow will insert them to conform with the HTML standards. Therefore, when using absolute XPath expressions, the tbody tags need to be included even if they are not present in the incoming HTML data.
That is, /html/title/table/tr should be replaced with /html/title/table/tbody/tr. Alternatively, you can use //tr.
The same applies when referencing parent/child nodes within the stream attribute expressions.

...

The following example illustrates how a default namespace defined in an XML response is handled.

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns="urn:xmlns:company-com:message" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:pf="urn:phixflow.message.com">
	<soapenv:Body>
		<queryResponse>
			<result xsi:type="QueryResult">
				<done>true</done>
				<queryLocator xsi:nil="true"/>
				<record xsi:type="pf:sObject">
					<pf:type>Account</pf:type>
					<pf:Name>Company1</pf:Name>
					<pf:CreatedDate>2013-10-31T11:26:21.000Z</pf:CreatedDate>
				</record>
				<record xsi:type="pf:sObject">
					<pf:type>Account</pf:type>
					<pf:Name>Company2</pf:Name>
					<pf:CreatedDate>2013-10-31T11:26:21.000Z</pf:CreatedDate>
				</record>
				<size>2</size>
			</result>
		</queryResponse>
	</soapenv:Body>
</soapenv:Envelope>

Since this document uses a default namespace ("urn:xmlns:company-com:message"), to refer to any element in an XPath expression that does not explicitly use a namespace you must use the default namespace. So to extract all record elements, you must first define a namespace in the collector for the default namespace. Suppose that you do and give this the name def. Then the XPath (defined in the XPath field on the Response tab) to extract all record elements will be //def:record

In a Stream that reads from this collector, any other namespaces defined in the document are used as normal. For example, to write the value from the element Name within each record element to a Stream attribute, you would use the attribute expression in.pf$Name
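The default-namespace behaviour can be reproduced with Python's `xml.etree.ElementTree`, which follows the same XPath rule: an unprefixed element in a document with a default namespace is only matched when the path uses a prefix mapped to that namespace. The sketch below uses a shortened version of the response, and the prefix `def` is the hypothetical name chosen in the text above.

```python
import xml.etree.ElementTree as ET

XML = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns="urn:xmlns:company-com:message"
                  xmlns:pf="urn:phixflow.message.com">
  <soapenv:Body>
    <queryResponse>
      <result>
        <record><pf:Name>Company1</pf:Name></record>
        <record><pf:Name>Company2</pf:Name></record>
      </result>
    </queryResponse>
  </soapenv:Body>
</soapenv:Envelope>"""

# "def" is an arbitrary name given to the document's default namespace.
NS = {
    "def": "urn:xmlns:company-com:message",
    "pf": "urn:phixflow.message.com",
}

root = ET.fromstring(XML)

# Without a prefix, nothing matches, because record lives in the default namespace.
unprefixed = root.findall(".//record")

# Equivalent of the XPath //def:record.
records = root.findall(".//def:record", NS)
names = [record.find("pf:Name", NS).text for record in records]

print(len(unprefixed), names)
```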

Html Body'
Xpath child attributes: listToString(in.tbody.tr.td.initials) -> returns 'AA,BB'
Xpath child attribute text values: listToString(in.tbody.tr.td.value) -> returns 'Alistair Andrews,Bert Brown'

Note the use of:

`^` to traverse to the immediate parent element.
the `listToString` function to handle multiple matching child elements/attributes.
the optional HTML `<tbody>` tags. If these are not in your HTML data, then PhixFlow will insert them to conform with the HTML standards. Therefore, when using absolute XPath expressions, the tbody tags need to be included even if they are not present in the incoming HTML data. That means `/html/title/table/tr` should be replaced with `/html/title/table/tbody/tr`, or use `//tr`. The same applies when referencing parent/child nodes within the attribute expressions.
JSON

$.main_page.title

This path expression will bring back all elements matching the JSON path expression including the parent/grandparents and all child elements/sub elements.

The following examples show how to reference the returned JSON path data structure in attribute expressions:

Immediate properties text value: in.name -> returns 'PF Title Text'
Parent properties: in.^.page -> returns 'PF Main Page'
Array properties: listToString(in.data.initials) -> returns 'AA,BB'
Array index: in.data.initials.1 -> returns 'AA'

Note the use of:

`^` to traverse to the immediate parent element.
the `listToString` function to handle multiple matching child elements/attributes.
array indexes, which are 1-based.
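The JSON lookups above can be checked with a few lines of Python. This is an analogy using plain dictionary access rather than a JSONPath engine or PhixFlow syntax; `",".join` stands in for `listToString`, and the 1-based index is converted to Python's 0-based indexing by hand.

```python
import json

DATA = json.loads("""
{
  "main_page": {
    "page": "PF Main Page",
    "title": {
      "name": "PF Title Text",
      "data": [
        {"initials": "AA", "value": "Alistair Andrews"},
        {"initials": "BB", "value": "Bert Brown"}
      ]
    }
  }
}
""")

# Equivalent of the path expression $.main_page.title.
title = DATA["main_page"]["title"]

name = title["name"]                                             # in.name
page = DATA["main_page"]["page"]                                 # in.^.page
initials = ",".join(item["initials"] for item in title["data"])  # listToString(in.data.initials)
first = title["data"][1 - 1]["initials"]                         # in.data.initials.1 (1-based)

print(name, page, initials, first, sep=" | ")
```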
