ksb.csle.didentification.privacy
Object that contains the message ksb.csle.common.proto.StreamDidentProto.RandomRoundingInfo. RandomRoundingInfo contains the following attributes:
enum RoundingMethod {
  ROUND = 0;
  ROUND_UP = 1;
  ROUND_DOWN = 2;
}

message RandomRoundingInfo {
  repeated int32 selectedColumnId = 1;
  required int32 roundStep = 2 [default = 10];
  required RoundingMethod method = 3 [default = ROUND];
  repeated FieldInfo fieldInfo = 4;
  optional PrivacyCheckInfo check = 5;
}
Anonymizes the specified column of the src dataframe using the generic 'Type' method. The 'Type' is determined by the inheriting object.
Dataframe to anonymize
Column to be anonymized
DataFrame The dataframe in which the original column is replaced with the anonymized column
Rounds the values of the column in the src dataframe according to the 'roundType' method.
Dataframe to anonymize
Column to apply rounding
DataFrame Anonymized dataframe
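The effect of the three rounding methods on a single value can be sketched in plain Scala, without Spark. This is a minimal illustration and not the operator's actual implementation: the names RoundingSketch and roundToStep, and the case objects mirroring the RoundingMethod enum, are hypothetical, and the step is assumed to be a positive integer.

```scala
// Hypothetical mirror of the RoundingMethod enum from the protobuf definition.
sealed trait RoundingMethod
case object Round extends RoundingMethod
case object RoundUp extends RoundingMethod
case object RoundDown extends RoundingMethod

object RoundingSketch {
  // Rounds `value` to a multiple of `step` (assumed to be positive).
  def roundToStep(value: Double, step: Int, method: RoundingMethod): Double = {
    require(step > 0, "step must be positive")
    val scaled = value / step
    val rounded = method match {
      case Round     => math.round(scaled).toDouble // nearest multiple, halves go up
      case RoundUp   => math.ceil(scaled)           // next multiple at or above the value
      case RoundDown => math.floor(scaled)          // previous multiple at or below the value
    }
    rounded * step
  }
}
```

With a step of 10, the value 23 becomes 20 under Round and RoundDown, and 30 under RoundUp.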
Returns the name of the column in the src dataframe identified by the column ID defined in the protobuf message.
dataframe to get names of columns.
column ID to anonymize.
String.
Returns the names of the columns in the src dataframe identified by the given column IDs. Columns with invalid IDs are ignored.
dataframe to get names of columns.
Array[String].
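The lookup described above, including how invalid IDs might be skipped, can be sketched on a plain list of column names. This is only an illustration under stated assumptions: ColumnLookupSketch, isValidColumnId, and columnNames are hypothetical names, and a column ID is assumed to be a 0-based index into the dataframe's column list, which the source does not confirm.

```scala
object ColumnLookupSketch {
  // Assumption: an ID is valid when it is a 0-based index into the column list.
  def isValidColumnId(columns: IndexedSeq[String], id: Int): Boolean =
    id >= 0 && id < columns.length

  // Maps IDs to column names, silently dropping IDs that are out of range.
  def columnNames(columns: IndexedSeq[String], ids: Seq[Int]): Array[String] =
    ids.filter(id => isValidColumnId(columns, id)).map(id => columns(id)).toArray
}
```

For columns Vector("name", "age", "salary") and IDs Seq(1, 5, 2), this sketch returns Array("age", "salary"), because ID 5 is out of range.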
Checks whether the given column ID is valid.
dataframe to get names of columns.
Boolean.
Checks whether the given column name is valid.
dataframe to get names of columns.
column Name.
Boolean.
Runs the random rounding module for basic de-identification.
Input dataframe
DataFrame Anonymized dataframe
Same as roundingStringColumn(src, columnName, roundType).
Dataframe to anonymize
Column to apply rounding
Rounding unit; the default value is 10.
Rounding method (e.g., round, round-up, round-down)
DataFrame Anonymized dataframe
Some columns may contain values that mix numerical and non-numerical parts, e.g., 21K, $45, etc. In this case, this function extracts the numerical parts and replaces only those parts with the value computed by the 'roundType' method.
Dataframe to anonymize
Column to apply rounding
Rounding method (e.g., round, round-up, round-down)
DataFrame Anonymized dataframe
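The idea of rounding only the numerical part of a mixed value can be sketched on a single string. This is a minimal sketch under stated assumptions, not the operator's implementation: StringRoundingSketch and roundNumericParts are hypothetical names, the numerical part is assumed to be an unsigned decimal number, and only rounding to the nearest multiple is shown.

```scala
import scala.util.matching.Regex

object StringRoundingSketch {
  // Assumption: a numerical part is a run of digits with an optional fraction.
  private val NumberPattern: Regex = """\d+(\.\d+)?""".r

  // Replaces every numerical part with its nearest multiple of `step`
  // and leaves the surrounding characters untouched.
  def roundNumericParts(value: String, step: Int): String = {
    require(step > 0, "step must be positive")
    NumberPattern.replaceAllIn(value, m => {
      val rounded = math.round(m.matched.toDouble / step) * step
      rounded.toString
    })
  }
}
```

With a step of 10, "21K" becomes "20K" and "$45" becomes "$50", while a value without digits is returned unchanged.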
:: ApplicationDeveloperApi ::
Operator that implements the random rounding module in the Data Suppression algorithm. Compared to the rounding module in Algorithm algorithm, which applies only to numerical values, this operator can be applied to variables containing both numerical and string values.