By the end of this page you will be able to:
- Create File Collectors and Streams to read data from files into PhixFlow
- Run Analysis on Streams
- View Stream data
Create a new file collector
Before starting this exercise, you will need to download train.zip. Extract the files to a convenient location on your local drive. Throughout the course, these will be referred to as the input files.
A file containing details of customer addresses can be found in the input files, at: …
[unzipped location]\inputData\AddressCheck\custAddrFiles\input\custAddr_20090322_1.txt
Open this file and have a look at the data in it.
Notice, in particular, that the file has a header line containing the column names of the data.
You will now add a new file collector to your model to read this file into PhixFlow:
- In the model toolbar, drag the file collector icon into the model.
- In the new file collector settings tab that opens, enter the Name: Customer Addresses.
- Click the finish button. You will now see the new file collector on the model.
In the model toolbar, click the save button to make sure that the new file collector will be shown when you re-open this model. Remember to save the model layout every time you add a new modelling component.
Set up a file collector and stream
- Hover over your new file collector, and click the upload file button.
- In the file explorer, go to: …\inputData\AddressCheck\custAddrFiles\input.
- Select the file custAddr_20090322_1.txt and click Open.
- In the Upload Managed File window, click the upload button.
- PhixFlow adds a new stream to your model. PhixFlow has automatically configured the file collector and the stream to load the file into PhixFlow.
Pause file data
In this course you will build up a model by adding components and running them. In PhixFlow, running a component causes each of its inputs to run as well. This means you only need to run the final component in a model: it will, in turn, run everything needed to populate it, all the way back to the raw inputs to PhixFlow (via file collectors and database collectors).
In this case, once you have loaded the file, you will "pause" the file data so that you do not need to reload the file from your computer every time you run the model.
We will cover pausing components in PhixFlow in more detail in the Modelling Concepts course.
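The run-and-pause behaviour described above can be pictured with a small sketch in Python. This is only an illustration of the idea, not how PhixFlow is implemented: the `Component` class, the `run` function and the component names are all hypothetical, and the sketch assumes that a paused component keeps the data it already holds and does not trigger its own inputs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """Hypothetical stand-in for a model component (not a PhixFlow class)."""
    name: str
    inputs: List["Component"] = field(default_factory=list)
    paused: bool = False  # assumption: a paused component keeps its existing data


def run(component: Component, log: List[str]) -> List[str]:
    """Run a component after first running each of its inputs."""
    if component.paused:
        # Assumption: a paused component is not re-run, and neither are its inputs.
        log.append(f"skip {component.name} (paused)")
        return log
    for upstream in component.inputs:
        run(upstream, log)
    log.append(f"run {component.name}")
    return log


# A three-step chain: file collector -> stream -> a later, downstream stream.
collector = Component("file collector")
stream = Component("stream", inputs=[collector])
downstream = Component("downstream stream", inputs=[stream])

first_run = run(downstream, [])   # everything runs, back to the file collector
stream.paused = True
second_run = run(downstream, [])  # the paused stream and its collector are skipped
```

In this sketch, running only the last component is enough to trigger the whole chain, and pausing the stream stops the file from being read again on later runs.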
To pause the data loaded from the file:
- In your model, hover your mouse pointer over the stream icon for CustomerAddresses.
- In the pop-up toolbar, click the button that pauses the data.
Run analysis on the stream
You will now run analysis on your new stream. Analysis is the process that does all of PhixFlow's data processing.
To run analysis:
- In your model, hover your mouse pointer over the stream icon.
- In the pop-up toolbar, select the run analysis button. PhixFlow runs the model, reading in data from the file of customer addresses via the file collector.
- You will get a confirmation message when this has completed. We will look at messages in the console later.
View stream data
To view the data you just loaded:
- In your model, hover your mouse pointer over the stream icon.
- In the pop-up toolbar, click the stream views button.
- In the drop-down list, select the default view.