Package com.whylogs.core
Class DatasetProfile
- java.lang.Object
-
- com.whylogs.core.DatasetProfile
-
- All Implemented Interfaces:
java.io.Serializable
public class DatasetProfile extends java.lang.Object implements java.io.Serializable
Representing a DatasetProfile that tracks
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type Field Description
static java.lang.String TAG_PREFIX
-
Constructor Summary
Constructors
Constructor Description
DatasetProfile(@NonNull java.lang.String sessionId, @NonNull java.time.Instant sessionTimestamp, @NonNull java.util.Map<java.lang.String,java.lang.String> tags)
Create a new dataset profile.
DatasetProfile(@NonNull java.lang.String sessionId, @NonNull java.time.Instant sessionTimestamp, java.time.Instant dataTimestamp, @NonNull java.util.Map<java.lang.String,java.lang.String> tags, @NonNull java.util.Map<java.lang.String,ColumnProfile> columns)
DEVELOPER API.
DatasetProfile(java.lang.String sessionId, java.time.Instant sessionTimestamp)
-
Method Summary
Modifier and Type Method Description
static DatasetProfile
fromProtobuf(com.whylogs.core.message.DatasetProfileMessage message)
java.util.Map<java.lang.String,ColumnProfile>
getColumns()
ModelProfile
getModelProfile()
DatasetProfile
merge(@NonNull DatasetProfile other)
Merge the data of another DatasetProfile into this one.
DatasetProfile
mergeStrict(@NonNull DatasetProfile other)
static DatasetProfile
parse(java.io.InputStream in)
byte[]
toBytes()
java.util.Iterator<com.whylogs.core.message.MessageSegment>
toChunkIterator()
com.whylogs.core.message.DatasetProfileMessage.Builder
toProtobuf()
com.whylogs.core.message.DatasetSummary
toSummary()
void
track(java.lang.String columnName, java.lang.Object data)
void
track(java.util.Map<java.lang.String,?> columns)
DatasetProfile
withAllMetadata(java.util.Map<java.lang.String,java.lang.String> metadata)
DatasetProfile
withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score)
DatasetProfile
withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score, java.lang.Iterable<java.lang.String> additionalOutputFields)
Returns a new dataset profile with the same backing data structure.
DatasetProfile
withMetadata(java.lang.String key, java.lang.String value)
DatasetProfile
withRegressionModel(java.lang.String prediction, java.lang.String target)
DatasetProfile
withRegressionModel(java.lang.String prediction, java.lang.String target, java.lang.Iterable<java.lang.String> additionalOutputFields)
DatasetProfile
withTag(java.lang.String key, java.lang.String value)
void
writeTo(java.io.OutputStream out)
-
-
-
Field Detail
-
TAG_PREFIX
public static final java.lang.String TAG_PREFIX
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
DatasetProfile
public DatasetProfile(@NonNull java.lang.String sessionId, @NonNull java.time.Instant sessionTimestamp, @Nullable java.time.Instant dataTimestamp, @NonNull java.util.Map<java.lang.String,java.lang.String> tags, @NonNull java.util.Map<java.lang.String,ColumnProfile> columns)
DEVELOPER API. DO NOT USE DIRECTLY.
- Parameters:
sessionId - dataset name
sessionTimestamp - the timestamp for the current profiling session
dataTimestamp - the timestamp for the dataset. Used to aggregate across different cadences
tags - tags of the dataset
columns - the columns that we're copying over. Note that the source of the columns should stop using these column objects, as they will back this DatasetProfile instead
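The note on the columns parameter describes an ownership hand-off: the map passed in becomes the backing store rather than being copied. The sketch below illustrates that general aliasing behaviour with a hypothetical Holder class and plain java.util maps. It is not the whylogs implementation, and the class and method names are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class BackingMapSketch {
    // Hypothetical stand-in for an object that keeps the caller's map as its backing store.
    static final class Holder {
        private final Map<String, Integer> columns;

        Holder(Map<String, Integer> columns) {
            // No defensive copy: the caller's map now backs this object.
            this.columns = columns;
        }

        int columnCount() {
            return columns.size();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new HashMap<>();
        source.put("age", 1);

        Holder holder = new Holder(source);
        // Mutating the source after handing it over is visible through the holder.
        source.put("income", 2);
        System.out.println(holder.columnCount()); // prints 2

        // Handing over a copy keeps later changes to the source isolated.
        Holder isolated = new Holder(new HashMap<>(source));
        source.put("zip", 3);
        System.out.println(isolated.columnCount()); // prints 2
    }
}
```

If the caller still needs its own column objects, passing a copy is one way to avoid the sharing shown in the first half of the example.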
-
DatasetProfile
public DatasetProfile(@NonNull java.lang.String sessionId, @NonNull java.time.Instant sessionTimestamp, @NonNull java.util.Map<java.lang.String,java.lang.String> tags)
Create a new dataset profile.
- Parameters:
sessionId - the name of the dataset profile
sessionTimestamp - the timestamp for this run
tags - the tags to track the dataset with
-
DatasetProfile
public DatasetProfile(java.lang.String sessionId, java.time.Instant sessionTimestamp)
-
-
Method Detail
-
getColumns
public java.util.Map<java.lang.String,ColumnProfile> getColumns()
-
getModelProfile
public ModelProfile getModelProfile()
-
withTag
public DatasetProfile withTag(java.lang.String key, java.lang.String value)
-
withMetadata
public DatasetProfile withMetadata(java.lang.String key, java.lang.String value)
-
withAllMetadata
public DatasetProfile withAllMetadata(java.util.Map<java.lang.String,java.lang.String> metadata)
-
track
public void track(java.lang.String columnName, java.lang.Object data)
-
track
public void track(java.util.Map<java.lang.String,?> columns)
-
withClassificationModel
public DatasetProfile withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score, java.lang.Iterable<java.lang.String> additionalOutputFields)
Returns a new dataset profile with the same backing data structure. However, this new object contains a ClassificationMetrics object.
- Returns:
- a new DatasetProfile object
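The description above says that the returned object is new but shares its backing data structure with the original. A minimal sketch of that pattern follows, using a hypothetical MiniProfile class. The field names and the withModel method are assumptions for illustration and do not mirror the whylogs internals.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedBackingSketch {
    // Hypothetical miniature profile; not the whylogs class.
    static final class MiniProfile {
        final Map<String, Long> counts;  // backing structure, shared by derived instances
        final String predictionField;    // stands in for the attached model metrics

        MiniProfile(Map<String, Long> counts, String predictionField) {
            this.counts = counts;
            this.predictionField = predictionField;
        }

        // Returns a new object that reuses the same backing map.
        MiniProfile withModel(String predictionField) {
            return new MiniProfile(this.counts, predictionField);
        }

        // Counts one observation for the given column.
        void track(String column) {
            counts.merge(column, 1L, Long::sum);
        }
    }

    public static void main(String[] args) {
        MiniProfile base = new MiniProfile(new HashMap<>(), null);
        MiniProfile derived = base.withModel("prediction");

        // Tracking through the derived object is visible through the original,
        // because both point at the same map.
        derived.track("age");
        System.out.println(base.counts.get("age")); // prints 1
        System.out.println(base == derived);        // prints false
    }
}
```

Under this reading, callers would typically keep using only the returned object, since data tracked through either reference lands in the same structure.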
-
withClassificationModel
public DatasetProfile withClassificationModel(java.lang.String prediction, java.lang.String target, java.lang.String score)
-
withRegressionModel
public DatasetProfile withRegressionModel(java.lang.String prediction, java.lang.String target)
-
withRegressionModel
public DatasetProfile withRegressionModel(java.lang.String prediction, java.lang.String target, java.lang.Iterable<java.lang.String> additionalOutputFields)
-
toSummary
public com.whylogs.core.message.DatasetSummary toSummary()
-
toChunkIterator
public java.util.Iterator<com.whylogs.core.message.MessageSegment> toChunkIterator()
-
mergeStrict
public DatasetProfile mergeStrict(@NonNull DatasetProfile other)
-
merge
public DatasetProfile merge(@NonNull DatasetProfile other)
Merge the data of another DatasetProfile into this one. Only the shared tags and shared metadata are retained. The timestamps are copied over from this dataset. It is the responsibility of the user to ensure that the two datasets are matched on important grouping information.
- Parameters:
other - a DatasetProfile
- Returns:
- a merged DatasetProfile with summed up columns
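The merge semantics described above (keep only what both sides share, sum the columns) can be sketched with plain maps. This is an illustration under stated assumptions, not the library's code: "shared" is read here as "same key with an equal value in both maps", per-column state is reduced to a single count, and the helper names sharedEntries and sumCounts are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedTagsSketch {
    // Keeps entries present in both maps with equal values (an assumed reading of "shared").
    static Map<String, String> sharedEntries(Map<String, String> a, Map<String, String> b) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : a.entrySet()) {
            String value = e.getValue();
            if (value != null && value.equals(b.get(e.getKey()))) {
                out.put(e.getKey(), value);
            }
        }
        return out;
    }

    // Sums per-column counts; a stand-in for merging real column profiles.
    static Map<String, Long> sumCounts(Map<String, Long> a, Map<String, Long> b) {
        Map<String, Long> out = new HashMap<>(a);
        b.forEach((column, count) -> out.merge(column, count, Long::sum));
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> tagsA = new HashMap<>();
        tagsA.put("env", "prod");
        tagsA.put("region", "us");
        Map<String, String> tagsB = new HashMap<>();
        tagsB.put("env", "prod");
        tagsB.put("region", "eu");

        // Only "env" survives: "region" differs between the two sides.
        System.out.println(sharedEntries(tagsA, tagsB)); // prints {env=prod}

        Map<String, Long> countsA = new HashMap<>();
        countsA.put("age", 10L);
        Map<String, Long> countsB = new HashMap<>();
        countsB.put("age", 5L);
        countsB.put("zip", 1L);

        Map<String, Long> merged = sumCounts(countsA, countsB);
        System.out.println(merged.get("age")); // prints 15
        System.out.println(merged.get("zip")); // prints 1
    }
}
```

The dropped "region" tag shows why the two profiles should already agree on important grouping information before merging: anything that differs is simply lost from the result.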
-
toProtobuf
public com.whylogs.core.message.DatasetProfileMessage.Builder toProtobuf()
-
writeTo
public void writeTo(java.io.OutputStream out) throws java.io.IOException
- Throws:
java.io.IOException
-
toBytes
public byte[] toBytes() throws java.io.IOException
- Throws:
java.io.IOException
-
fromProtobuf
@Nullable public static DatasetProfile fromProtobuf(@Nullable com.whylogs.core.message.DatasetProfileMessage message)
-
parse
public static DatasetProfile parse(java.io.InputStream in) throws java.io.IOException
- Throws:
java.io.IOException
-
-