Given a summary DataFrame of computed stats, add derived data (for example: null rate, median, etc.).
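As a minimal sketch of what "derived data" can mean here, the snippet below computes a null rate from raw counts. It uses a plain dictionary in place of a DataFrame row, and the names `total_count`, `<col>_null_count`, and `<col>_null_rate` are hypothetical, chosen only for illustration. Median is left out, since it would typically come from a quantile sketch rather than from simple counts.

```python
from typing import Dict


def add_null_rates(summary: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of `summary` with a null rate added per null-count entry.

    Assumes (hypothetically) that the summary holds a `total_count` entry and
    one `<col>_null_count` entry per column.
    """
    total = summary["total_count"]
    derived = dict(summary)  # copy, so the raw summary is left untouched
    for key, value in summary.items():
        if key.endswith("_null_count"):
            col = key[: -len("_null_count")]
            # Guard against an empty partition to avoid dividing by zero.
            derived[f"{col}_null_rate"] = value / total if total else 0.0
    return derived
```

The raw counts are kept alongside the derived rate, so the summary stays mergeable while the rate can be recomputed after any merge.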
Navigate the dataframe and compute statistics partitioned by date stamp
Day-partitioned version of the normalized summary. Useful for scheduling a job that computes daily stats. Returns a KvRdd that can be pushed into a KvStore for fetching and merging, as well as a DataFrame for storing in Hive.
For entities on the left, the daily partition is used as the key. For events, rows are bucketed by timeBucketMinutes (default: 1 hour). Since the stats are mergeable, coarser granularities can be obtained through fetcher merging.
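The bucketing and merging described above can be sketched as follows. This is an illustration under stated assumptions, not the job's actual implementation: timestamps are taken to be epoch milliseconds, the helper names `bucket_start` and `merge_buckets` are hypothetical, and a plain count stands in for the mergeable statistic (real stats may be sketches with their own merge operation).

```python
from collections import Counter
from typing import Dict

MILLIS_PER_MINUTE = 60 * 1000


def bucket_start(ts_millis: int, time_bucket_minutes: int = 60) -> int:
    """Floor a timestamp to the start of its bucket (default bucket: 1 hour)."""
    width = time_bucket_minutes * MILLIS_PER_MINUTE
    return ts_millis - ts_millis % width


def merge_buckets(bucket_counts: Dict[int, int], coarse_minutes: int) -> Dict[int, int]:
    """Roll fine-grained bucket counts up into coarser buckets.

    Counts merge by addition, so the sum over the hourly buckets in a day
    equals the count for that day.
    """
    merged: Counter = Counter()
    for start, count in bucket_counts.items():
        merged[bucket_start(start, coarse_minutes)] += count
    return dict(merged)
```

For example, hourly buckets keyed by their start time can be rolled up to daily granularity with `merge_buckets(hourly_counts, 24 * 60)`, which is the kind of merge a fetcher could perform at read time.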