ksb.csle.didentification.privacy
Object that contains the message ksb.csle.common.proto.StreamDidentProto.HidingInfo. HidingInfo contains the following attributes:
enum AggregationMethod {
  MIN = 0;
  AVG = 1;
  MAX = 2;
  STD = 3;
  COUNT = 4;
  MANUAL = 5;
}

message HidingInfo {
  repeated int32 selectedColumnId = 1;
  required AggregationMethod method = 2 [default = AVG];
  optional bool isDataRange = 3 [default = false];
  repeated FieldInfo fieldInfo = 4;
  optional PrivacyCheckInfo check = 5;
}
Anonymizes the column specified in src dataframe using generic 'Type' method. The 'Type' is decided by inherited object module.
Dataframe to anonymize
Column to be anonymized
DataFrame The dataframe in which the original column is replaced with the anonymized column
Anonymizes the column in src dataframe using 'hidingType' method.
- If the data range mode is on, it calls the DataRange module internally.
- If the specified column is of numerical type, it calls the function in the Aggregation module internally.
- If the specified column contains only strings, it calls the function in the RecordReduction module internally.
- If the specified column mixes numeric and string content, it separates the numeric values, applies the hiding function only to them, and then recombines the hidden values with the separated strings.
Dataframe to anonymize
Column to be anonymized
DataFrame Anonymized dataframe
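The dispatch rules above can be sketched in plain Scala. This is only an illustration under stated assumptions, not the module's actual code: the real operator works on Spark DataFrames, and every name here (HidingPath, HidingDispatch, choosePath) is hypothetical. A column is modeled as a Seq[String], a value is treated as numeric if it parses as a Double, and a value is treated as mixed if it contains at least one digit.

```scala
// Hypothetical model of the four internal paths described above.
sealed trait HidingPath
case object DataRangePath extends HidingPath
case object AggregationPath extends HidingPath
case object RecordReductionPath extends HidingPath
case object MixedPath extends HidingPath

object HidingDispatch {
  // Assumption: a value is numeric if it parses as a Double.
  private def isNumeric(s: String): Boolean =
    scala.util.Try(s.trim.toDouble).isSuccess

  // Assumption: a value is "mixed" if it contains at least one digit (e.g. "20K", "$40").
  private def hasDigits(s: String): Boolean = s.exists(_.isDigit)

  // Picks a path in the same order as the rules: data range first,
  // then purely numeric, then mixed, and finally string-only.
  def choosePath(values: Seq[String], isDataRange: Boolean): HidingPath =
    if (isDataRange) DataRangePath
    else if (values.forall(isNumeric)) AggregationPath
    else if (values.exists(hasDigits)) MixedPath
    else RecordReductionPath
}
```

The data range check comes first because, according to the description, that mode overrides the type-based choice.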
Returns column name from src dataframe specified by the column ID defined by protobuf.
dataframe to get names of columns.
column ID to anonymize.
String.
Returns column names from src dataframe specified by column IDs. Note that columns with invalid IDs are ignored.
dataframe to get names of columns.
Array[String].
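A minimal sketch of this lookup, assuming column IDs are 0-based positions in the dataframe's column list (the source does not state the indexing convention). The object and method names are hypothetical. With Spark, the columns array would typically come from src.columns, but the sketch takes a plain Array[String] so that it runs without Spark.

```scala
object ColumnLookup {
  // Assumption: an ID is valid when it is a 0-based index into the column list.
  def isValidColumnId(columns: Array[String], id: Int): Boolean =
    id >= 0 && id < columns.length

  // Returns the names for the given IDs, silently dropping invalid IDs.
  def columnNames(columns: Array[String], ids: Seq[Int]): Array[String] =
    ids.filter(isValidColumnId(columns, _)).map(columns(_)).toArray
}
```

For example, ColumnLookup.columnNames(Array("name", "age", "salary"), Seq(2, 7, -1, 0)) keeps only the IDs 2 and 0 and returns the names in the order the IDs were given.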
Hides the column containing both numerical and string values in src dataframe using 'aggrType' method.
Dataframe to anonymize
Column to be anonymized
Method used to hide the values, e.g., min, max, avg, or std
DataFrame Anonymized dataframe
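One way to picture hiding a mixed column is to split each value into a non-numeric prefix, a number, and a non-numeric suffix, aggregate the numbers, and put the statistic back between the original prefix and suffix. The sketch below is an assumption about that behavior, not the module's implementation: the names (MixedHiding, hide, aggregate, render) are hypothetical, the column is a plain Seq[String], values without a number are left unchanged, and std is omitted because the source does not say which variant is used.

```scala
object MixedHiding {
  // Optional non-digit prefix, one number, optional non-digit suffix, e.g. "$40" or "20K".
  private val Mixed = """^(\D*?)(-?\d+(?:\.\d+)?)(\D*)$""".r

  // Computes the requested statistic over the extracted numbers.
  def aggregate(nums: Seq[Double], method: String): Double = method match {
    case "min" => nums.min
    case "max" => nums.max
    case "avg" => nums.sum / nums.size
    case other => throw new IllegalArgumentException(s"unsupported method: $other")
  }

  // Prints whole numbers without a trailing ".0".
  private def render(d: Double): String =
    if (d == d.toLong) d.toLong.toString else d.toString

  // Replaces the numeric part of every matching value with the statistic,
  // keeping each value's own prefix and suffix.
  def hide(values: Seq[String], method: String): Seq[String] = {
    val nums = values.collect { case Mixed(_, n, _) => n.toDouble }
    if (nums.isEmpty) values
    else {
      val stat = aggregate(nums, method)
      values.map {
        case Mixed(pre, _, suf) => s"$pre${render(stat)}$suf"
        case other              => other
      }
    }
  }
}
```

Under these assumptions, hiding Seq("20K", "40K", "n/a") with "avg" replaces both numeric parts with 30 and leaves "n/a" untouched.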
Hides the column containing only string values in src dataframe.
Dataframe to anonymize
Column to be anonymized
DataFrame Anonymized dataframe
Hides the string column using 'aggrType' method. The values in this column may consist of strings only, or of mixed numerical and string content.
Dataframe to anonymize
Column to be anonymized
Method used to hide the values, e.g., min, max, avg, or std
Checks whether the given column ID is valid.
dataframe to get names of columns.
Boolean.
Checks whether the given column name is valid.
dataframe to get names of columns.
column Name.
Boolean.
Runs the hiding module for basic de-identification.
Input dataframe
DataFrame Anonymized dataframe
:: ApplicationDeveloperApi ::
Operator that implements the hiding module in the Data Suppression algorithm. It replaces (or hides) data values with statistical values such as min, max, or avg. Unlike the aggregation module, which applies only to numerical data, this module can also be applied to string data containing numerical values, such as 20K or $40.