Posts

Showing posts with the label Remove Duplicates

T) Explain the run-time behaviour of the DEDUP SORTED component

  Step-by-Step Process:

  1. Reads grouped data: takes in records that are already sorted by a specific key (such as customer ID or date).
  2. Optional filter check: if you have a filter expression:
     Filter returns FALSE (0): the record is skipped.
     Filter returns NULL: the record goes to the reject port with an error message.
     Filter returns TRUE: the record continues processing.
     If there is no filter, all records are processed.
  3. Handles record groups: groups consecutive records with the same key value together.
     For single-record groups: the record goes directly to output.
     For multi-record groups (duplicates): uses your "keep" setting to decide which record to keep.
     If keep = "first" or "last": keeps one record (sends it to output) and sends the others to the duplicate port.
     If keep = "unique-only": sends ALL records from duplicate groups to the duplicate port (output gets nothing from those groups).

  Key Points:

  Data MUST be sorted first for this to work correctly.
  Works with consecutive duplicate records.
  You control which duplicate t...
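The flow described above can be sketched in Python. This is only an illustrative simulation, not the component's actual implementation: the function name `dedup_sorted`, the `select` callback standing in for the filter expression, and the three returned lists standing in for the output, duplicate and reject ports are all assumptions made for the example.

```python
from itertools import groupby


def dedup_sorted(records, key, keep="first", select=None):
    """Simulate the described behaviour; returns (out, dup, reject) lists.

    Hypothetical helper: `records` must already be sorted by `key`.
    """
    out, dup, reject = [], [], []

    # Optional filter check
    selected = []
    for rec in records:
        if select is None:
            selected.append(rec)        # no filter: every record is processed
            continue
        verdict = select(rec)
        if verdict is None:
            reject.append(rec)          # NULL: record goes to the reject port
        elif verdict:
            selected.append(rec)        # TRUE: record continues processing
        # FALSE (0): record is skipped

    # Group consecutive records that share the same key value
    for _, group in groupby(selected, key=key):
        group = list(group)
        if len(group) == 1:
            out.append(group[0])        # single-record group: straight to output
        elif keep == "first":
            out.append(group[0])
            dup.extend(group[1:])
        elif keep == "last":
            out.append(group[-1])
            dup.extend(group[:-1])
        elif keep == "unique-only":
            dup.extend(group)           # whole duplicate group goes to dup
        else:
            raise ValueError(f"unknown keep setting: {keep}")
    return out, dup, reject


rows = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
first_out, first_dup, _ = dedup_sorted(rows, key=lambda r: r["id"], keep="first")
uniq_out, uniq_dup, _ = dedup_sorted(rows, key=lambda r: r["id"], keep="unique-only")
```

Because `groupby` only merges adjacent records, feeding this sketch unsorted data splits one key into several groups, which mirrors why the input has to be sorted first.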

S) Scan, Rollup and Dedup with null key and unique key

  Scan with null key: 11 records (one output record per input record)
  Rollup with null key: 1 record (which record depends on sorting: the last record if the data is sorted, the first record if it is not)
  Dedup with null key (keep first): 1 record
  Dedup with null key (keep last): 1 record
  Dedup with null key (keep unique): 0 records, because a null key treats every record as part of one group, so no record counts as unique
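The record counts above follow from the null key placing every record in a single group. A minimal Python sketch of that reasoning, assuming an 11-record input (inferred from the Scan count) and using made-up integer records in place of real data:

```python
from itertools import groupby


def null_key(_record):
    # A null key gives every record the same (empty) key value
    return ()


records = list(range(1, 12))  # 11 hypothetical input records

# With a null key, all records fall into one group
groups = [list(g) for _, g in groupby(records, key=null_key)]

scan_count = sum(len(g) for g in groups)              # one output per input record
rollup_count = len(groups)                            # one output per group
dedup_first = [g[0] for g in groups]                  # keep first
dedup_last = [g[-1] for g in groups]                  # keep last
dedup_unique = [g[0] for g in groups if len(g) == 1]  # keep unique-only
```

The sketch only models the counts; which field values a real Rollup emits depends on its transform and is not modelled here.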

S) How to remove duplicate records without using Dedup Sorted?

  Ans: Rollup
  Key: id
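The idea of a Rollup keyed on id can be approximated in Python as grouping by id and emitting one record per group. This is a sketch under assumptions: the function name `rollup_dedup` is hypothetical, and keeping the first record seen for each id is a choice made for the example, since the post does not say what the Rollup transform emits.

```python
def rollup_dedup(records, key="id"):
    """Emit one record per distinct key value (hypothetical helper).

    Assumption: the first record seen for each key is the one kept.
    """
    first_seen = {}
    for rec in records:
        # setdefault stores rec only if this key has not been seen yet
        first_seen.setdefault(rec[key], rec)
    # dicts preserve insertion order, so keys come out in first-seen order
    return list(first_seen.values())


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
deduped = rollup_dedup(rows)
```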