ksb.csle.didentification.privacy
Object that contains the message ksb.csle.common.proto.StreamDidentProto.PartialAggrInfo. PartialAggrInfo contains the following attributes:
enum AggregationMethod {
  MIN = 0;
  AVG = 1;
  MAX = 2;
  STD = 3;
  COUNT = 4;
  MANUAL = 5;
}
enum OutlierMethod {
  ZSCORE = 0;
  BOXPLOT = 1;
}
message PartialAggrInfo {
  repeated int32 selectedColumnId = 1;
  required AggregationMethod method = 2 [default = AVG];
  required OutlierMethod outlierMethod = 3 [default = ZSCORE];
  repeated FieldInfo fieldInfo = 4;
  optional PrivacyCheckInfo check = 5;
}
Anonymizes the specified column of the src dataframe using the generic 'Type' method. The 'Type' is determined by the inheriting object module.
Dataframe to anonymize
Column to be anonymized
DataFrame: the dataframe in which the original column is replaced with the anonymized column
Identifies outliers among the records in the column of the src dataframe and then replaces them with statistical information.
Dataframe to anonymize
Column to be anonymized
DataFrame: anonymized dataframe
Identifies outliers in a column containing numerical data and then replaces only those outliers with statistical information.
Dataframe to anonymize
Column to be anonymized
Statistic to use as the replacement, e.g., min, max, avg, std, or count
Outlier detection method, either z-score or boxplot
DataFrame: anonymized dataframe
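The numeric case described above can be sketched in plain Scala, without Spark and without the actual ksb API. Everything in this sketch is an assumption made for illustration: the object and function names are hypothetical, the z-score threshold of 3.0 and the 1.5 x IQR boxplot fences are conventional choices rather than values confirmed by this documentation, and the replacement statistic is computed over all values in the column, outliers included.

```scala
// Hypothetical sketch; not the ksb.csle implementation.
object OutlierSketch {
  sealed trait OutlierMethod
  case object ZScore extends OutlierMethod
  case object Boxplot extends OutlierMethod

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Population standard deviation.
  def std(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
  }

  // Quantile by linear interpolation; expects non-empty, sorted input.
  def quantile(sorted: Seq[Double], p: Double): Double = {
    val pos = p * (sorted.size - 1)
    val lo = math.floor(pos).toInt
    val hi = math.ceil(pos).toInt
    sorted(lo) + (sorted(hi) - sorted(lo)) * (pos - lo)
  }

  // Builds an outlier test for the column using the chosen method.
  def isOutlier(xs: Seq[Double], method: OutlierMethod): Double => Boolean =
    method match {
      case ZScore =>
        val m = mean(xs)
        val s = std(xs)
        // Assumed threshold: |z| > 3.0
        (x: Double) => s > 0 && math.abs(x - m) / s > 3.0
      case Boxplot =>
        val sorted = xs.sorted
        val q1 = quantile(sorted, 0.25)
        val q3 = quantile(sorted, 0.75)
        val iqr = q3 - q1
        // Assumed fences: 1.5 * IQR beyond the quartiles
        (x: Double) => x < q1 - 1.5 * iqr || x > q3 + 1.5 * iqr
    }

  // Replaces only the outliers with the given statistic (e.g. _.min, _.max, mean);
  // all other values are left unchanged.
  def replaceOutliers(
      xs: Seq[Double],
      method: OutlierMethod,
      stat: Seq[Double] => Double): Seq[Double] = {
    if (xs.isEmpty) xs
    else {
      val out = isOutlier(xs, method)
      val replacement = stat(xs)
      xs.map(x => if (out(x)) replacement else x)
    }
  }
}
```

For example, `OutlierSketch.replaceOutliers(Seq(10.0, 11.0, 12.0, 13.0, 100.0), OutlierSketch.Boxplot, _.min)` yields `Seq(10.0, 11.0, 12.0, 13.0, 10.0)`: only the value outside the boxplot fences is replaced.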
Identifies outliers in a column containing text data and then replaces them with the most frequent string in that column.
Dataframe to anonymize
Column to be anonymized
Outlier detection method, either z-score or boxplot
DataFrame: anonymized dataframe
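A minimal plain-Scala sketch of the text case follows. This documentation does not say how z-score or boxplot is applied to strings, so the sketch assumes the outlier test is run on each value's frequency and leaves that test as a caller-supplied function. The names `TextOutlierSketch`, `replaceWithMode`, and `isOutlierCount` are hypothetical.

```scala
// Hypothetical sketch; not the ksb.csle implementation.
object TextOutlierSketch {
  // Replaces every value whose frequency is judged an outlier with the
  // most frequent string in the column.
  // isOutlierCount is an assumed hook: it receives how often a value
  // occurs and decides whether that value counts as an outlier.
  def replaceWithMode(
      values: Seq[String],
      isOutlierCount: Int => Boolean): Seq[String] = {
    if (values.isEmpty) values
    else {
      // Frequency of each distinct string.
      val counts: Map[String, Int] =
        values.groupBy(identity).map { case (v, vs) => v -> vs.size }
      // String with maximum frequency (ties are broken arbitrarily).
      val mode = counts.maxBy(_._2)._1
      values.map(v => if (isOutlierCount(counts(v))) mode else v)
    }
  }
}
```

With `Seq("seoul", "seoul", "seoul", "busan", "busan", "jeju")` and the rule `_ < 2`, only `"jeju"` is replaced, giving `Seq("seoul", "seoul", "seoul", "busan", "busan", "seoul")`.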
Returns the name of the column in the src dataframe identified by the column ID defined in the protobuf message.
Dataframe from which to get column names.
Column ID to anonymize.
String.
Returns the names of the columns in the src dataframe identified by the given column IDs. Note that invalid column IDs are ignored.
Dataframe from which to get column names.
Array[String].
Checks whether the given column ID is valid.
Dataframe from which to get column names.
Boolean.
Checks whether the given column name is valid.
Dataframe from which to get column names.
Column name.
Boolean.
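The four column helpers above can be illustrated with a small plain-Scala sketch that works on a list of column names instead of a Spark DataFrame. It assumes column IDs are zero-based positions, which this documentation does not confirm, and it returns an `Option` for a single lookup because the behavior for an invalid ID is not specified here. All names in the sketch are hypothetical.

```scala
// Hypothetical sketch; not the ksb.csle implementation.
object ColumnLookupSketch {
  // Assumption: a column ID is a zero-based index into the column list.
  def isValidColumnId(columns: Seq[String], id: Int): Boolean =
    id >= 0 && id < columns.length

  def isValidColumnName(columns: Seq[String], name: String): Boolean =
    columns.contains(name)

  // Single lookup; None stands in for the unspecified invalid-ID case.
  def getColumnName(columns: Seq[String], id: Int): Option[String] =
    if (isValidColumnId(columns, id)) Some(columns(id)) else None

  // Batch lookup; invalid IDs are silently dropped.
  def getColumnNames(columns: Seq[String], ids: Seq[Int]): Array[String] =
    ids.filter(id => isValidColumnId(columns, id)).map(id => columns(id)).toArray
}
```

Given the columns `Seq("name", "age", "zip")` and the IDs `Seq(0, 2, 7)`, `getColumnNames` returns the names for IDs 0 and 2 and drops 7.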
Runs the partial aggregation module for basic de-identification.
Input dataframe
DataFrame: anonymized dataframe
:: ApplicationDeveloperApi ::
Operator that implements the partial aggregation module of the Aggregation algorithm. It identifies outliers (the boxplot and z-score methods are currently supported) and then replaces only those outliers with statistical information (e.g., min, max, avg, std, or count).