ksb.csle.didentification.privacy
Object that contains the message ksb.csle.common.proto.StreamDidentProto.SwappingInfo. SwappingInfo contains the following attributes:
enum GenSwappingTableMethod {
  SWAP_RANDOM = 0;
  SWAP_FILE = 1;
}
enum FileLocationType {
  DIRECTORY = 0;
  URL = 1;
}
message SwappingFileInfo {
  required string filePath = 1;
  required int32 columnIndex = 2;
  optional FileLocationType fileType = 3 [default = DIRECTORY];
}
message SwappingInfo {
  repeated int32 selectedColumnId = 1;
  required GenSwappingTableMethod method = 2 [default = SWAP_RANDOM];
  optional SwappingFileInfo fileInfo = 3;
  repeated FieldInfo fieldInfo = 4;
  optional PrivacyCheckInfo check = 5;
}
Anonymizes the specified column of the src dataframe using the generic 'Type' method. The 'Type' is determined by the inheriting object.
Dataframe to anonymize
Column to be anonymized
DataFrame in which the original column is replaced by the anonymized column
Swaps the original records with pre-defined values. To do this, it builds the swap tables from the content of the given file.
Dataframe to anonymize
Column to be pseudo-anonymized
DataFrame Anonymized dataframe
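The swap step described above can be sketched without Spark. This is a minimal illustration, not the module's implementation: rows are modeled as plain `Vector[String]` values instead of a DataFrame, and the names `SwapColumnSketch` and `swapColumn` are hypothetical. It also assumes that values without an entry in the swap table are left unchanged, which the source does not state.

```scala
object SwapColumnSketch {
  // A row is modeled as a Vector of strings; this stands in for a DataFrame row.
  type Row = Vector[String]

  /** Replaces the value at `columnId` in every row with its swap-table entry.
    * Assumption: values missing from the table are kept as they are.
    */
  def swapColumn(rows: Seq[Row], columnId: Int, swapTable: Map[String, String]): Seq[Row] =
    rows.map { row =>
      val original = row(columnId)
      row.updated(columnId, swapTable.getOrElse(original, original))
    }
}
```

For example, swapping column 0 of `Seq(Vector("alice", "30"))` with the table `Map("alice" -> "P001")` yields `Seq(Vector("P001", "30"))`.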
Returns the name of the column in the src dataframe identified by the column ID defined in the protobuf message.
Dataframe from which to get the column names.
ID of the column to anonymize.
String.
Returns the names of the columns in the src dataframe identified by the given column IDs. Columns with invalid IDs are ignored.
Dataframe from which to get the column names.
Array[String].
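The "invalid IDs are ignored" behavior can be illustrated as follows. This is a sketch under assumptions: `columns` stands in for the array of column names of the dataframe (in Spark, `DataFrame.columns` returns an `Array[String]`), IDs are assumed to be 0-based, and `ColumnNameSketch` and `columnNames` are hypothetical names.

```scala
object ColumnNameSketch {
  /** Returns the names at the given column IDs, skipping any ID outside
    * the valid range. Assumption: IDs are 0-based indices into `columns`.
    */
  def columnNames(columns: Array[String], columnIds: Seq[Int]): Array[String] =
    columnIds
      .filter(id => id >= 0 && id < columns.length) // drop invalid IDs
      .map(id => columns(id))
      .toArray
}
```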
Checks whether the given column ID is valid.
Dataframe from which to get the column names.
Boolean.
Checks whether the given column name is valid.
Dataframe from which to get the column names.
Column name.
Boolean.
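Both validity checks can be expressed against the array of column names. The sketch below assumes that a valid ID is a 0-based index within that array and that a valid name is one that appears in it; the source does not spell out either rule, and `ColumnCheckSketch` and its method names are hypothetical.

```scala
object ColumnCheckSketch {
  /** True when `columnId` is an index into `columns` (assumed 0-based). */
  def isValidColumnId(columns: Array[String], columnId: Int): Boolean =
    columnId >= 0 && columnId < columns.length

  /** True when `columnName` matches one of the column names exactly. */
  def isValidColumnName(columns: Array[String], columnName: String): Boolean =
    columns.contains(columnName)
}
```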
Makes swap tables by referring to the given file information.
Dataframe to anonymize
Column to be pseudo-anonymized
the information containing how to make swap tables
Map[String, String] Generated heuristic table
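One way to build such a table from file content is sketched below. The pairing rule is an assumption, since the source does not describe it: the distinct original values, in order of first appearance, are paired with the values found at `columnIndex` in each delimited line of the file. The names `FileSwapTableSketch` and `makeSwapTable`, and the `delimiter` parameter, are hypothetical.

```scala
import java.util.regex.Pattern

object FileSwapTableSketch {
  /** Pairs each distinct original value with a replacement read from the
    * `columnIndex`-th field of the file lines (assumed pairing rule).
    * Lines that have no field at `columnIndex` are skipped.
    */
  def makeSwapTable(
      originals: Seq[String],
      fileLines: Seq[String],
      columnIndex: Int,
      delimiter: String = ","
  ): Map[String, String] = {
    // Quote the delimiter so it is matched literally, not as a regex.
    val splitter = Pattern.quote(delimiter)
    val replacements = fileLines
      .map(_.split(splitter, -1))
      .collect {
        case fields if columnIndex >= 0 && columnIndex < fields.length =>
          fields(columnIndex).trim
      }
    // zip stops at the shorter side, so surplus originals get no entry.
    originals.distinct.zip(replacements).toMap
  }
}
```

The lines could be read with `scala.io.Source.fromFile(path).getLines().toSeq`; file handling is left out here to keep the sketch focused on the table construction.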
Makes swap tables consisting of [original data, random string] pairs.
Dataframe to anonymize
Column to be pseudo-anonymized
How to make swap lists
Map[String, String] Generated heuristic table
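A table of [original data, random string] pairs can be sketched with `scala.util.Random`. This is an illustration only: the string length, the alphanumeric alphabet, and the names `RandomSwapTableSketch` and `makeRandomSwapTable` are assumptions, and the sketch does not guard against two originals receiving the same random string.

```scala
import scala.util.Random

object RandomSwapTableSketch {
  /** Maps each distinct original value to a random alphanumeric string.
    * Assumption: a fixed `length` per string; collisions are not checked.
    */
  def makeRandomSwapTable(
      originals: Seq[String],
      length: Int = 8,
      rng: Random = new Random()
  ): Map[String, String] =
    originals.distinct
      .map(value => value -> rng.alphanumeric.take(length).mkString)
      .toMap
}
```

Passing a seeded `Random` makes the table reproducible, which can help when the same mapping has to be regenerated for a later run.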
Runs the heuristic module for basic de-identification.
Input dataframe
DataFrame Anonymized dataframe
:: ApplicationDeveloperApi ::
Operator that implements the swapping module of the PseudoAnonymization algorithm. Unlike the heuristic module, which builds the heuristic table manually or from randomized strings, this module builds the table by referring to the given input file. It replaces the data values with random strings.