ksb.csle.didentification.privacy
Object that contains the message ksb.csle.common.proto.StreamDidentProto.HeuristicInfo. HeuristicInfo contains the following attributes:
enum GenHeuristicTableMethod {
  HEUR_RANDOM = 0;
  HEUR_MANUAL = 1;
  HEUR_FILE = 2; // not used
}

message ManualInfo {
  required string value = 1;
  required string replaceValue = 2;
}

message RandomInfo {
  required RandomMethod randMethod = 1 [default = MIXED];
  required int32 length = 2 [default = 6];
}

message HeuristicInfo {
  repeated int32 selectedColumnId = 1;
  required GenHeuristicTableMethod method = 2 [default = HEUR_RANDOM];
  repeated ManualInfo manualInfo = 3;
  optional RandomInfo randInfo = 4;
  repeated FieldInfo fieldInfo = 5;
  optional PrivacyCheckInfo check = 6;
}
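To make the shape of HeuristicInfo easier to read, the following is a minimal plain-Scala mirror of the message with one random and one manual configuration. It is only a sketch: the case classes and the object name HeuristicInfoSketch are hypothetical stand-ins, not the classes generated from StreamDidentProto. The fieldInfo and check fields are omitted and randMethod is held as a String, because FieldInfo, PrivacyCheckInfo and RandomMethod are not defined in this section.

```scala
// Hypothetical plain-Scala mirror of the protobuf messages above.
// The real classes are generated from StreamDidentProto and are not shown here.
object HeuristicInfoSketch {
  sealed trait GenHeuristicTableMethod
  case object HEUR_RANDOM extends GenHeuristicTableMethod
  case object HEUR_MANUAL extends GenHeuristicTableMethod

  final case class ManualInfo(value: String, replaceValue: String)

  // Defaults follow the proto definition: MIXED characters, length 6.
  final case class RandomInfo(randMethod: String = "MIXED", length: Int = 6)

  final case class HeuristicInfo(
      selectedColumnId: Seq[Int],
      method: GenHeuristicTableMethod = HEUR_RANDOM,
      manualInfo: Seq[ManualInfo] = Seq.empty,
      randInfo: Option[RandomInfo] = None)

  // Random table for column 1, using the default RandomInfo.
  val randomConfig: HeuristicInfo =
    HeuristicInfo(selectedColumnId = Seq(1), randInfo = Some(RandomInfo()))

  // Manually listed replacements for column 2 (example values are made up).
  val manualConfig: HeuristicInfo = HeuristicInfo(
    selectedColumnId = Seq(2),
    method = HEUR_MANUAL,
    manualInfo = Seq(ManualInfo("Seoul", "CityA"), ManualInfo("Busan", "CityB")))
}
```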
Anonymizes the specified column of the src dataframe using a generic 'Type' method. The 'Type' is determined by the inheriting object.
Dataframe to anonymize
Column to be anonymized
DataFrame The dataframe in which the original column is replaced by the anonymized column
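The idea that the inheriting object decides the 'Type' of anonymization can be sketched as a base trait with one abstract step. Everything here is an assumption made for illustration: the names Anonymizer, anonymizeValue and MaskingAnonymizer are hypothetical, rows are modelled as plain Scala maps instead of a real dataframe, and masking is only an example technique that the source does not mention.

```scala
object GenericAnonymizeSketch {
  // Stand-in for a dataframe row: column name -> value.
  type Row = Map[String, String]

  // Hypothetical base trait: subclasses decide how a single value is anonymized.
  trait Anonymizer {
    protected def anonymizeValue(value: String): String

    // Replaces the given column in every row; rows without that column are kept as-is.
    def anonymize(rows: Seq[Row], column: String): Seq[Row] =
      rows.map(row => row.get(column).fold(row)(v => row.updated(column, anonymizeValue(v))))
  }

  // Example subclass (illustrative only): masks every character with '*'.
  object MaskingAnonymizer extends Anonymizer {
    protected def anonymizeValue(value: String): String = "*" * value.length
  }
}
```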
Pseudo-anonymizes the column, that is, replaces its values in the src dataframe using a heuristic table. The 'heuristicType' method denotes how this table is built.
Dataframe to anonymize
Column to be pseudo-anonymized
How to build the heuristic table
DataFrame Anonymized dataframe
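The table-based replacement described above can be sketched with plain Scala collections. This is a minimal illustration under stated assumptions, not the library's implementation: rows are modelled as maps rather than a real dataframe, the name pseudoAnonymize is hypothetical, and leaving values without a table entry unchanged is a choice made for this sketch.

```scala
object PseudoAnonymizeSketch {
  // Stand-in for a dataframe row: column name -> value.
  type Row = Map[String, String]

  // Replaces each value of `column` that has an entry in `table`
  // ([original -> anonymized]). Values without an entry stay unchanged,
  // which is an assumption of this sketch.
  def pseudoAnonymize(rows: Seq[Row], column: String, table: Map[String, String]): Seq[Row] =
    rows.map { row =>
      row.get(column) match {
        case Some(v) => row.updated(column, table.getOrElse(v, v))
        case None    => row
      }
    }
}
```

Only the selected column is touched; all other columns pass through as they are.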
Pseudo-anonymizes the column, that is, replaces its values in the src dataframe using a heuristic table. The 'heuristicType' method denotes how this table is built.
Dataframe to anonymize
Column to be pseudo-anonymized
DataFrame Anonymized dataframe
De-anonymizes the anonymized dataframe by using the given heuristic table
Dataframe to de-anonymize
Column name to be de-anonymized
The heuristic table which was used to anonymize the records
DataFrame The de-anonymized dataframe
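De-anonymization with a given heuristic table amounts to applying the table in the reverse direction. The sketch below assumes the table maps original values to distinct replacement values, so that it can be inverted. The name deAnonymize and the map-based row model are hypothetical stand-ins for the real dataframe API.

```scala
object DeAnonymizeSketch {
  // Stand-in for a dataframe row: column name -> value.
  type Row = Map[String, String]

  def deAnonymize(rows: Seq[Row], column: String, table: Map[String, String]): Seq[Row] = {
    // Invert [original -> anonymized] into [anonymized -> original].
    val inverse = table.map { case (original, anonymized) => anonymized -> original }
    // If two originals shared a replacement, the inverse would lose an entry.
    require(inverse.size == table.size, "heuristic table must use distinct replacement values")
    // Values not found in the inverse table are left unchanged.
    rows.map(row => row.get(column).fold(row)(v => row.updated(column, inverse.getOrElse(v, v))))
  }
}
```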
Returns column name from src dataframe specified by the column ID defined by protobuf.
Dataframe from which to get the column names.
Column ID to anonymize.
String The column name.
Returns column names from src dataframe specified by column IDs. Note that columns with invalid IDs are ignored.
dataframe to get names of columns.
Array[String].
Checks whether the given column ID is valid.
dataframe to get names of columns.
Boolean.
Checks whether the given column name is valid.
dataframe to get names of columns.
Column name.
Boolean.
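The column lookup and validity checks described above can be sketched over a plain list of column names. The source does not state how column IDs are numbered, so this sketch assumes an ID is the zero-based position of the column. The object and method names are hypothetical and only mirror the descriptions.

```scala
object ColumnLookupSketch {
  // Assumption: a column ID is the zero-based position of the column.
  def isValidColumnId(columns: Seq[String], id: Int): Boolean =
    id >= 0 && id < columns.length

  def isValidColumnName(columns: Seq[String], name: String): Boolean =
    columns.contains(name)

  // Returns the names for the given IDs, silently skipping invalid IDs.
  def getColumnNames(columns: Seq[String], ids: Seq[Int]): Array[String] =
    ids.filter(id => isValidColumnId(columns, id)).map(id => columns(id)).toArray
}
```

Filtering before the lookup is what makes invalid IDs drop out instead of raising an index error.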
Generates a heuristic table that is used to replace the original data by lookup. The table is composed of [original data, anonymized data] pairs and can be built by the following methods.
- RANDOM: builds the table from randomly generated strings.
- DESIGNATED: TO BE DEFINED
Dataframe to anonymize
Column to be pseudo-anonymized
How to make heuristic table
Map[String, String] Generated heuristic table
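A RANDOM heuristic table of the kind described above might be built as follows. This is a sketch under assumptions: the name makeRandomTable is hypothetical, MIXED is taken to mean letters and digits, the default length of 6 follows the RandomInfo default, and redrawing on collisions is a choice made here so that the table stays invertible.

```scala
import scala.util.Random

object HeuristicTableSketch {
  // Assumption: MIXED means lower-case letters, upper-case letters and digits.
  private val Mixed: IndexedSeq[Char] = ('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9')

  private def randomString(length: Int, rng: Random): String =
    Seq.fill(length)(Mixed(rng.nextInt(Mixed.length))).mkString

  // Builds [original -> random string] for each distinct value of a column.
  // Note: `length` must be large enough to give every distinct value its own
  // string, otherwise the redraw loop below cannot terminate.
  def makeRandomTable(
      values: Seq[String],
      length: Int = 6,
      rng: Random = new Random()): Map[String, String] = {
    require(length > 0, "length must be positive")
    val used = scala.collection.mutable.Set.empty[String]
    values.distinct.map { v =>
      // Redraw on collision so that no two originals share a replacement.
      var candidate = randomString(length, rng)
      while (!used.add(candidate)) candidate = randomString(length, rng)
      v -> candidate
    }.toMap
  }
}
```

Passing a seeded Random makes the generated table reproducible, which can help when testing.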
Operates heuristic module for basic de-identification
Input dataframe
DataFrame Anonymized dataframe
:: ApplicationDeveloperApi ::
Operator that implements the heuristic module in the PseudoAnonymization algorithm. It replaces the values of data according to the generated heuristic tables. These tables can be generated by the following methods.
- Randomized Table: generated from random strings (numbers, alphabet characters)
- Manual: the heuristic table entries are inserted manually.
It manages the heuristic table internally and hence is also capable of de-anonymizing the anonymized records.