PhixFlow Help


Forms: HTTP Collector

An HTTP Collector reads data from an HTTP Datasource. The collector defines how the required data is extracted from the datasource for use in PhixFlow.

Form: HTTP Collector Details

The HTTP Collector form contains a number of tabs:

Field | Description
Details | The main details required for HTTP Collector configuration.
Send Message | Define details of the HTTP request sent to the HTTP Datasource to get the data required.
HTTP Headers | For an HTTP request, define name-value pairs to include as part of the HTTP header (e.g. Content-Type).
Response | Define the type or format of the data that will be returned:
  • HTML: an XPath expression can be specified to retrieve only selected sections of the data into XML structures.
  • XML: an XPath expression can be specified to retrieve only selected sections of the data into XML structures. XML response types also support XML namespaces. The XML Namespaces tab is available when this response type is chosen.
  • String: the full data is returned as a string value.

See Response Examples for how the returned data can be used and evaluated in the corresponding stream attribute expressions.

XML Namespaces | The namespaces defined in an XML response. The names given to these namespaces must match those used in any XPath expressions used to extract data from an XML response. See examples below.
Description | A free text field for you to enter a description of the HTTP Collector.

The following fields are configured on the Details tab:

Field | Description
Name | Name of the HTTP Collector.
HTTP Data Source | The HTTP datasource that this collector will collect from.
Enabled | Tick when the configuration is complete and the collector is ready to be used.
Allow Non-Scheduled Collection | If this is turned on, the collector will run as part of any ad-hoc Analysis Engine run which requires this data. If not, it will only run as part of a scheduled Task Plan under the Analysis Engine.
Icon | The icon to display in controls when this collector is used.
Timeout (secs) | The number of seconds to wait for a response from the corresponding HTTP datasource before a timeout is recorded.
Datasource Instance Expression | The datasource to which this collector is connected may list multiple instances from which the data may be accessed. Each HTTP Datasource Instance is identified by a unique string. This expression should evaluate to a string which allows the collector to determine the specific instance to use. If the expression is blank, the collector assumes that there is only one instance and uses that one by default. If there is more than one instance and no expression is provided here, an error is thrown during analysis because the collector cannot determine which instance to use.
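
The selection rule described for the Datasource Instance Expression can be sketched outside PhixFlow. The following Python is only an illustration of that rule, not PhixFlow code: the function name select_instance, the dictionary of instances and the example URLs are all assumptions made for the sketch.

```python
from typing import Mapping, Optional


def select_instance(instances: Mapping[str, str],
                    expression_result: Optional[str]) -> str:
    """Pick the datasource instance a collector should use.

    `instances` maps each instance's unique string to its URL (hypothetical).
    `expression_result` is the string the Datasource Instance Expression
    evaluated to, or None when the expression is blank.
    """
    if expression_result:
        if expression_result not in instances:
            raise KeyError(f"no datasource instance named {expression_result!r}")
        return instances[expression_result]
    if len(instances) == 1:
        # Blank expression and exactly one instance: use it by default.
        return next(iter(instances.values()))
    # Blank expression and no single obvious instance: cannot choose.
    raise ValueError("cannot determine which datasource instance to use")


instances = {"live": "example.com/live", "test": "example.com/test"}
print(select_instance(instances, "test"))
print(select_instance({"live": "example.com/live"}, None))
```

Calling select_instance(instances, None) with the two-instance dictionary raises ValueError, which mirrors the error described above for a blank expression with more than one instance.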

The following fields are configured on the Send Message tab:

Field | Description
URL Expression

The URL to be used, without the leading http:// prefix. The URL may contain embedded expressions within { }. If this field is blank, the url field on the HTTP Datasource Instance is used directly.

For example, this expression appends to the base URL provided by the HTTP Datasource Instance:

{_url}/sub1/sub2?param1=3

Statement Expression | An expression to generate the data that will be sent by the collector to the datasource. For example:

<?xml version="1.0"?>
<!DOCTYPE CORPORATE DASHBOARD "corpDash.dtd">
<results user="%USERNAME%" password="%PASSWORD%">
  <monthlyTotals region={'"' + Region + '"'} division={'"' + Division + '"'}>
    <totalBilled>{'"' + TotalBilled + '"'}</totalBilled>
    <totalCollected>{'"' + TotalCollected + '"'}</totalCollected>
  </monthlyTotals>
</results>
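
To illustrate how an embedded { } expression in a URL Expression might resolve, here is a minimal Python sketch. It is a simplification: it treats each { } as a plain variable lookup, whereas PhixFlow evaluates full expressions. The function name expand_url and the base URL example.com/api are assumptions made for the example.

```python
import re
from typing import Mapping

# Matches one embedded expression such as {_url}.
_EMBEDDED = re.compile(r"\{([^{}]+)\}")


def expand_url(template: str, values: Mapping[str, str]) -> str:
    """Replace each {name} in `template` with its value from `values`."""
    def substitute(match):
        # Look up the name between the braces; a missing name raises KeyError.
        return str(values[match.group(1).strip()])
    return _EMBEDDED.sub(substitute, template)


# _url stands for the base URL held on the HTTP Datasource Instance.
context = {"_url": "example.com/api"}
print(expand_url("{_url}/sub1/sub2?param1=3", context))
# prints: example.com/api/sub1/sub2?param1=3
```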

 

 

The following fields are configured on the Response tab:

Field | Description
Response Type | The type of the expected response: XML, HTML or String.
XPath | The XPath expression used to resolve or filter the data that comes back in XML or HTML format. Note that XPath namespace syntax can only be used for XML response types.

The following fields are configured on the Description tab:

Field | Description
Description | A freeform description of the HTTP Collector.

Forms: HTTP Collector HTTP Header

Use the HTTP Collector HTTP Header form to create name-value pairs that are included in the HTTP header of the request.

Form: HTTP Headers Detail

The following fields are configured on this form:

Field | Description
Name | Name of the HTTP Collector HTTP Header Item. You must not include a colon after the name. For example, Content-Type is a valid name, whereas Content-Type: is not.
Value | Value of the HTTP Collector HTTP Header Item.

Examples

Name | Value
Content-Type | text/xml; charset=UTF-8
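
The rule that a header name must not include a colon can be expressed as a small check. This Python sketch is illustrative only; build_headers is a hypothetical helper, not part of PhixFlow.

```python
from typing import Dict, Iterable, Tuple


def build_headers(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Collect name-value pairs into a header dictionary.

    Rejects any name containing a colon, since the colon is the separator
    between name and value on the wire and is not part of the name itself.
    """
    headers: Dict[str, str] = {}
    for name, value in pairs:
        if ":" in name:
            raise ValueError(f"header name must not contain a colon: {name!r}")
        headers[name] = value
    return headers


print(build_headers([("Content-Type", "text/xml; charset=UTF-8")]))
```

Passing ("Content-Type:", "text/xml") instead raises ValueError.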

Forms: HTTP Collector XML Namespace


Use the HTTP Collector XML Namespace form to declare the namespaces used in an XML response.


Form: HTTP Headers Detail


The following fields are configured on this form:


Field | Description
Name

Name of the XML namespace. By convention, use the prefix used in the XML response. E.g. if the XML response contains xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" then make this soapenv.

However, this is not mandatory: you can give any namespace any name. All that matters is that the names defined here match those you use in XPath expressions to extract data from the XML response.

In particular, default namespaces, e.g. xmlns="urn:xmlns:company-com:message", can be given any name, provided that you use this name in XPath expressions.

See HTTP Collectors for examples of using namespaces in XPath expressions to extract data from XML responses.

Value | Value of the XML namespace, e.g. http://schemas.xmlsoap.org/soap/envelope/
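
The point that a namespace name is only a local label for the namespace value can be checked with any namespace-aware XPath tool. The sketch below uses Python's standard xml.etree.ElementTree module rather than PhixFlow. The tiny SOAP envelope and the alternative name env are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SOAP_URI = "http://schemas.xmlsoap.org/soap/envelope/"
DOC = (
    f'<soapenv:Envelope xmlns:soapenv="{SOAP_URI}">'
    "<soapenv:Body>payload</soapenv:Body>"
    "</soapenv:Envelope>"
)
root = ET.fromstring(DOC)

# The conventional name, matching the prefix used in the response.
by_convention = root.find("soapenv:Body", {"soapenv": SOAP_URI})

# An arbitrary name works too, as long as the path uses the same name.
by_other_name = root.find("env:Body", {"env": SOAP_URI})

print(by_convention.text, by_other_name.text)
```

Both lookups find the same Body element, because matching is done on the namespace value, not on the name chosen for it.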


Examples



Response Examples

 

The examples in this section use the following XML and HTML data, as returned by an HTTP datasource or read by an XML/HTML File Collector.

XML Data

<?xml version="1.0"?>
<root xmlns:h="http://www.w3.org/TR/html4/">
  <main page="PF Main Page">
    <h:title h:name="PF Title">PF Title Text
      <h:datarow>
        <h:data h:initials="AD">Ali Dawson</h:data>
        <h:data h:initials="GP">Gary Parden</h:data>
      </h:datarow>
    </h:title>
    <title name="Non namespace Title">Non namespace Title Text</title>
  </main>
</root>

HTML Data

<html>
  <body nodename="Html Body">
    <table>
      <tbody>
        <tr>
          <td initials="AD">Ali Dawson</td>
          <td initials="GP">Gary Parden</td>
        </tr>
      </tbody>
    </table>
  </body>
</html>

The following table shows the different types of responses that can be returned from an HTTP Collector and how these can be used in the corresponding stream attribute expressions. An HTTP Collector response type of XML or HTML mimics the responses from XML and HTML Collectors respectively.

Response Type | XPath Expression | Explanation
String | n/a | A String response is referenced in stream attribute expressions as in.value. Note that in.value contains the complete string data shown above.
XML | /root/main/h:title

Note that the namespace prefix used here, 'h', must be configured in the HTTP Collector XML Namespace form.

This XPath expression returns every element that matches it, together with its parent and grandparent elements and all of its child elements, i.e.

<root xmlns:h="http://www.w3.org/TR/html4/">
  <main page="PF Main Page">
    <h:title h:name="PF Title">PF Title Text
      <h:datarow>
        <h:data h:initials="AD">Ali Dawson</h:data>
        <h:data h:initials="GP">Gary Parden</h:data>
      </h:datarow>
    </h:title>
  </main>
</root>

The following examples show how to reference the returned XML data structure in stream attribute expressions:
  • XPath element text value: in.value -> returns 'PF Title Text'
  • XPath element attributes: in.h$name -> returns 'PF Title'
  • XPath parent attributes: in.^.page -> returns 'PF Main Page'
  • XPath child attributes: listToString(in.h$datarow.h$data.h$initials) -> returns 'AD,GP'
  • XPath child element text values: listToString(in.h$datarow.h$data.value) -> returns 'Ali Dawson,Gary Parden'

Note the use of $ instead of the usual : namespace notation.

Note the use of ^ to traverse to the immediate parent element.

Note the use of the listToString function to handle multiple matching child elements or attributes.

HTML | /html/body/table

Note that namespaces are not supported in the XPath expression for HTML response types.

This XPath expression returns every element that matches it, together with its parent and grandparent elements and all of its child elements, i.e.

<html>
  <body nodename="Html Body">
    <table>
      <tbody>
        <tr>
          <td initials="AD">Ali Dawson</td>
          <td initials="GP">Gary Parden</td>
        </tr>
      </tbody>
    </table>
  </body>
</html>

The following examples show how to reference the returned HTML data structure in stream attribute expressions:
  • XPath parent attributes: in.^.nodename -> returns 'Html Body'
  • XPath child attributes: listToString(in.tbody.tr.td.initials) -> returns 'AD,GP'
  • XPath child element text values: listToString(in.tbody.tr.td.value) -> returns 'Ali Dawson,Gary Parden'

Note the use of ^ to traverse to the immediate parent element.

Note the use of the listToString function to handle multiple matching child elements or attributes.

Note the use of the optional HTML <tbody> tags. If these are not in your HTML data, PhixFlow inserts them to conform with the HTML standards. Absolute XPath expressions must therefore include tbody even if it is not present in the incoming HTML data, i.e. /html/body/table/tr should be replaced with /html/body/table/tbody/tr. Alternatively, you can use //tr.
The same applies when referencing parent or child nodes within stream attribute expressions.
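
The values listed in the XML and HTML rows can be cross-checked outside PhixFlow. The following sketch uses Python's standard xml.etree.ElementTree module as a stand-in for the collector. The Python code only approximates the PhixFlow expressions named in the comments, and ElementTree does not insert missing tbody tags, so the sample HTML already contains them.

```python
import xml.etree.ElementTree as ET

H = "http://www.w3.org/TR/html4/"
NS = {"h": H}  # plays the role of the collector's XML Namespaces entry

XML_DATA = """<?xml version="1.0"?>
<root xmlns:h="http://www.w3.org/TR/html4/">
  <main page="PF Main Page">
    <h:title h:name="PF Title">PF Title Text
      <h:datarow>
        <h:data h:initials="AD">Ali Dawson</h:data>
        <h:data h:initials="GP">Gary Parden</h:data>
      </h:datarow>
    </h:title>
    <title name="Non namespace Title">Non namespace Title Text</title>
  </main>
</root>"""

root = ET.fromstring(XML_DATA)
main = root.find("main")              # parent of the matched element
title = main.find("h:title", NS)      # stands in for /root/main/h:title

text_value = title.text.strip()                   # like in.value
name_attr = title.get(f"{{{H}}}name")             # like in.h$name
parent_page = main.get("page")                    # like in.^.page
data = title.findall("h:datarow/h:data", NS)
initials = ",".join(d.get(f"{{{H}}}initials") for d in data)
names = ",".join(d.text for d in data)

HTML_DATA = """<html><body nodename="Html Body"><table><tbody><tr>
<td initials="AD">Ali Dawson</td><td initials="GP">Gary Parden</td>
</tr></tbody></table></body></html>"""

html = ET.fromstring(HTML_DATA)
body = html.find("body")
table = body.find("table")            # stands in for /html/body/table
cells = table.findall("tbody/tr/td")  # the path must go through tbody
without_tbody = table.findall("tr")   # empty: tr is not a direct child
anywhere = html.findall(".//tr")      # like //tr, independent of tbody

print(text_value, name_attr, parent_page, initials, names, sep=" | ")
print(body.get("nodename"), ",".join(c.get("initials") for c in cells))
print(len(without_tbody), len(anywhere))
```

The last line prints 0 and 1: the path that skips tbody matches nothing, while the descendant search finds the row.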

 

Default XML namespaces

 

The following example illustrates how a default namespace defined in an XML response is handled.

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns="urn:xmlns:company-com:message"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xmlns:pf="urn:phixflow.message.com">
  <soapenv:Body>
    <queryResponse>
      <result xsi:type="QueryResult">
        <done>true</done>
        <queryLocator xsi:nil="true"/>
        <record xsi:type="pf:sObject">
          <pf:type>Account</pf:type>
          <pf:Name>Company1</pf:Name>
          <pf:CreatedDate>2013-10-31T11:26:21.000Z</pf:CreatedDate>
        </record>
        <record xsi:type="pf:sObject">
          <pf:type>Account</pf:type>
          <pf:Name>Company2</pf:Name>
          <pf:CreatedDate>2013-10-31T11:26:21.000Z</pf:CreatedDate>
        </record>
        <size>2</size>
      </result>
    </queryResponse>
  </soapenv:Body>
</soapenv:Envelope>

This document declares a default namespace ("urn:xmlns:company-com:message"), so every element written without a prefix, such as record, belongs to that namespace. An XPath expression must therefore refer to these elements through the default namespace. To extract all record elements, first define a namespace in the collector for the default namespace. Suppose that you give it the name def. The XPath (defined in the XPath field on the Response tab) to extract all record elements is then //def:record

In a Stream that reads from this collector, any other namespaces defined in the document are used as normal. For example, to write the value of the Name element within each record element to a Stream attribute, use the attribute expression in.pf$Name
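
The default namespace behaviour can be reproduced with any namespace-aware parser. This sketch uses Python's standard xml.etree.ElementTree module on a shortened copy of the response. The names def and pf are the labels chosen for the two namespace values, in the same way as they would be entered on the XML Namespaces tab, and the Python itself is illustrative rather than PhixFlow code.

```python
import xml.etree.ElementTree as ET

# Name -> value pairs, as they would be declared for the collector.
NS = {
    "def": "urn:xmlns:company-com:message",  # the document's default namespace
    "pf": "urn:phixflow.message.com",
}

RESPONSE = """<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns="urn:xmlns:company-com:message"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:pf="urn:phixflow.message.com">
  <soapenv:Body>
    <queryResponse>
      <result xsi:type="QueryResult">
        <record xsi:type="pf:sObject"><pf:Name>Company1</pf:Name></record>
        <record xsi:type="pf:sObject"><pf:Name>Company2</pf:Name></record>
        <size>2</size>
      </result>
    </queryResponse>
  </soapenv:Body>
</soapenv:Envelope>"""

root = ET.fromstring(RESPONSE)

# Without a prefix the search looks for 'record' in no namespace: no match.
unqualified = root.findall(".//record")

# With the default namespace mapped to 'def', both records are found.
records = root.findall(".//def:record", NS)

# Name is in the pf namespace, so it is read through the pf prefix.
company_names = [r.find("pf:Name", NS).text for r in records]

print(len(unqualified), len(records), company_names)
```

The unqualified search returns nothing, which is why a name for the default namespace has to be defined before record elements can be extracted.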

Form Icons

The form provides the standard form icons.
