Scenario

...

What counts as a duplicate?

There are three types of duplicate record:

  1. Two or more records have identical values in each and every field (true duplicates).
  2. Two or more records have identical values in some fields, and the fields that do not have matching values are of no consequence (it does not matter which value we take).
  3. Two or more records have identical values in some fields, and one of the variable fields gives us a value we can select on (in practice, usually a datetime field like 'last updated time').
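
The first two cases can be illustrated with a minimal Python sketch. This is not PhixFlow syntax; the function name `drop_duplicates`, the `key_fields` parameter, and the sample field names are assumptions made only for illustration. Comparing every field handles case 1, and comparing only the chosen key fields handles case 2, where it does not matter which of the differing values is kept.

```python
def drop_duplicates(records, key_fields=None):
    """Keep the first record seen for each key.

    key_fields=None compares every field (case 1, true duplicates).
    A list of field names compares only those fields (case 2).
    """
    seen = set()
    unique = []
    for record in records:
        fields = sorted(record) if key_fields is None else key_fields
        key = tuple((name, record[name]) for name in fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical sample data.
records = [
    {"id": 1, "name": "Ann", "region": "North"},
    {"id": 1, "name": "Ann", "region": "North"},  # identical in every field
    {"id": 1, "name": "Ann", "region": "N"},      # same id, different region
    {"id": 2, "name": "Bob", "region": "South"},
]

all_field_unique = drop_duplicates(records)               # case 1
id_unique = drop_duplicates(records, key_fields=["id"])   # case 2
```

In the sketch, the first record seen for each key wins. That is an arbitrary choice, which is acceptable only because case 2 says the differing values are of no consequence.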

Files (or database records) often contain duplicate data. Ignoring duplicate records is often acceptable, and sometimes it is required.

Duplicate data, or data with duplicate keys, is a feature of most enterprise systems. PhixFlow has many ways of dealing with duplicated data, and which of them a model uses depends entirely on the system requirements. In this case, we just want to ignore duplicates.

...

What counts as a duplicate?

If this is case 3, sort on another attribute, depending on which record you want. For example, to get the latest record, sort by the last updated date.
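
The sort-then-select idea for case 3 can be sketched in Python as follows. This is not PhixFlow configuration; the function name `latest_per_key` and the fields `id`, `status`, and `last_updated` are hypothetical. The records are sorted newest first by the last updated date, and the first record seen for each key is kept.

```python
from datetime import datetime


def latest_per_key(records, key_field, updated_field):
    """Return one record per key: the one with the latest update time."""
    # Sort newest first, so the first record seen for a key is the latest.
    ordered = sorted(records, key=lambda r: r[updated_field], reverse=True)
    latest = {}
    for record in ordered:
        # setdefault stores a record only if the key has not been seen yet.
        latest.setdefault(record[key_field], record)
    return list(latest.values())


# Hypothetical sample data.
records = [
    {"id": 1, "status": "open", "last_updated": datetime(2024, 1, 5)},
    {"id": 1, "status": "closed", "last_updated": datetime(2024, 3, 1)},
    {"id": 2, "status": "open", "last_updated": datetime(2024, 2, 10)},
]

result = latest_per_key(records, "id", "last_updated")
```

To keep the earliest record instead, the same sketch works with `reverse=False`.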

...