ksb.csle.didentification.privacy

HeuristicOperator

class HeuristicOperator extends BasePrivacyAnonymizer with DeAnonymizer

:: ApplicationDeveloperApi ::

Operator that implements the heuristic module in the PseudoAnonymization algorithm. It replaces the values of the data according to a generated heuristic table. The table can be generated by either of the following methods:

- Randomized table: generated from random strings (digits, letters).
- Manual: the heuristic table entries are inserted manually.

The operator manages the heuristic table internally, so it is also capable of de-anonymizing the anonymized records.

Linear Supertypes
DeAnonymizer, BasePrivacyAnonymizer, DataFrameCheck, BaseDataOperator[StreamOperatorInfo, DataFrame], BaseGenericOperator[StreamOperatorInfo, DataFrame], BaseGenericMutantOperator[StreamOperatorInfo, DataFrame, DataFrame], BaseDoer, Logging, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new HeuristicOperator(o: StreamOperatorInfo)

    o

    Object that contains the message ksb.csle.common.proto.StreamDidentProto.HeuristicInfo. HeuristicInfo contains the following attributes:

    • selectedColumnId: IDs of the columns to which the heuristic function is applied
    • method: how the heuristic function is applied
    • manualInfo: replaces the data according to the given manual entries
    • randInfo: replaces the data with randomly generated strings
    • fieldInfo: information about the column attributes (identifier, sensitive, ...)
    • check: how to verify the performance of the anonymized data

    HeuristicInfo

    enum GenHeuristicTableMethod {
      HEUR_RANDOM = 0;
      HEUR_MANUAL = 1;
      HEUR_FILE = 2; // not used
    }
    message ManualInfo {
      required string value = 1;
      required string replaceValue = 2;
    }
    message RandomInfo {
      required RandomMethod randMethod = 1 [default = MIXED];
      required int32 length = 2 [default = 6];
    }
    message HeuristicInfo {
      repeated int32 selectedColumnId = 1;
      required GenHeuristicTableMethod method = 2 [default = HEUR_RANDOM];
      repeated ManualInfo manualInfo = 3;
      optional RandomInfo randInfo = 4;
      repeated FieldInfo fieldInfo = 5;
      optional PrivacyCheckInfo check = 6;
    }
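
To make the shape of this configuration concrete, the following is a minimal sketch that mirrors the messages above with plain Scala case classes. These types are hypothetical stand-ins, not the classes generated from the protobuf definition, and fieldInfo, check and randMethod are left out for brevity. The helper manualTable is also an assumption: it shows one plausible way the repeated manualInfo entries could be read as a lookup table.

```scala
object HeuristicConfigSketch {
  // Hypothetical stand-ins for the protobuf messages; not the generated classes.
  sealed trait GenHeuristicTableMethod
  case object HEUR_RANDOM extends GenHeuristicTableMethod
  case object HEUR_MANUAL extends GenHeuristicTableMethod

  final case class ManualInfo(value: String, replaceValue: String)
  final case class RandomInfo(length: Int = 6) // default length taken from the proto
  final case class HeuristicInfo(
      selectedColumnId: Seq[Int],
      method: GenHeuristicTableMethod = HEUR_RANDOM, // default taken from the proto
      manualInfo: Seq[ManualInfo] = Seq.empty,
      randInfo: Option[RandomInfo] = None)

  // Reads the repeated manualInfo entries as an [original -> replacement] table.
  def manualTable(info: HeuristicInfo): Map[String, String] =
    info.manualInfo.map(m => m.value -> m.replaceValue).toMap

  // Example configuration; whether column IDs start at 0 or 1 is not stated here.
  val example: HeuristicInfo = HeuristicInfo(
    selectedColumnId = Seq(1),
    method = HEUR_MANUAL,
    manualInfo = Seq(ManualInfo("Alice", "P001"), ManualInfo("Bob", "P002")))
}
```

With this model, HeuristicConfigSketch.manualTable(HeuristicConfigSketch.example) yields a two-entry map from the original values to their replacements.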

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def anonymize(src: DataFrame, columnNames: Array[String]): DataFrame

    Definition Classes
    BasePrivacyAnonymizer
  5. def anonymize(src: DataFrame, columnName: String): DataFrame

    Anonymizes the specified column of the src dataframe using a generic 'Type' method. The 'Type' is decided by the inheriting object module.

    src

    Dataframe to anonymize

    columnName

    Column to be anonymized

    returns

    DataFrame The dataframe which replaces original column with anonymized column

    Definition Classes
    BasePrivacyAnonymizer
  6. def anonymizeColumn(src: DataFrame, columnName: String, heuristicType: GenHeuristicTableMethod): DataFrame

    Pseudo-anonymizes the column, that is, replaces the values of the column in the src dataframe by means of a heuristic table. The 'heuristicType' argument denotes how this table is made.

    src

    Dataframe to anonymize

    columnName

    Column to be pseudo-anonymized

    heuristicType

    How to make table

    returns

    DataFrame Anonymized dataframe
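
The replacement step described above can be sketched without Spark. The example below is an illustration only: rows are modeled as plain Map[String, String] values instead of a DataFrame, and replaceColumn is a hypothetical helper, not part of this class. Leaving values unchanged when they are missing from the table is an assumption; the behavior of the actual operator in that case is not documented here.

```scala
object AnonymizeColumnSketch {
  // A row is modeled as column name -> value; a stand-in for a DataFrame row.
  type Row = Map[String, String]

  // Replaces the values of one column by looking them up in the heuristic table.
  // Assumption: values without a table entry are kept as they are.
  def replaceColumn(rows: Seq[Row], columnName: String, table: Map[String, String]): Seq[Row] =
    rows.map { row =>
      row.get(columnName) match {
        case Some(v) => row.updated(columnName, table.getOrElse(v, v))
        case None    => row // the column is absent from this row
      }
    }
}
```

Only the named column is touched; every other column of each row is passed through unchanged.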

  7. def anonymizeColumn(src: DataFrame, columnName: String): DataFrame

    Pseudo-anonymizes the column, that is, replaces the values of the column in the src dataframe by means of a heuristic table. The 'heuristicType' method denotes how this table is made.

    src

    Dataframe to anonymize

    columnName

    Column to be pseudo-anonymized

    returns

    DataFrame Anonymized dataframe

    Definition Classes
    HeuristicOperator → BasePrivacyAnonymizer
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. def deAnonymize(src: DataFrame, columnName: String, heuristicTable: Map[String, String]): DataFrame

    De-anonymizes the anonymized dataframe by using the given heuristic table.

    src

    Dataframe to de-anonymize

    columnName

    Column name to be de-anonymized

    heuristicTable

    The heuristic table which was used to anonymize the records

    returns

    DataFrame The de-anonymized dataframe

    Definition Classes
    DeAnonymizer
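
De-anonymization with a heuristic table amounts to applying the table in the reverse direction. The following is a minimal sketch under the assumption that the table maps original values to replacement values and that the replacement values are unique; invert and restore are hypothetical helpers operating on plain sequences, not the method documented above.

```scala
object DeAnonymizeSketch {
  // Inverts an [original -> replacement] table.
  // Fails if two originals share a replacement, since that cannot be reversed.
  def invert(table: Map[String, String]): Map[String, String] = {
    val inverted = table.map { case (original, replacement) => replacement -> original }
    require(inverted.size == table.size, "replacement values must be unique to be reversible")
    inverted
  }

  // Restores original values; values without a table entry are kept (an assumption).
  def restore(values: Seq[String], table: Map[String, String]): Seq[String] = {
    val back = invert(table)
    values.map(v => back.getOrElse(v, v))
  }
}
```

The uniqueness check matters for manually supplied tables, where nothing prevents two original values from being mapped to the same replacement.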
  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  13. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def getColumnName(src: DataFrame, columnId: Int): String

    Returns the name of the column in the src dataframe that is specified by the column ID defined in the protobuf message.

    src

    dataframe to get names of columns.

    columnId

    column ID to anonymize.

    returns

    String.

    Definition Classes
    DataFrameCheck
  16. def getColumnNames(src: DataFrame, columnIDs: Array[Int]): Array[String]

    Returns the names of the columns in the src dataframe specified by the column IDs. Note that columns with invalid IDs are ignored.

    src

    dataframe to get names of columns.

    returns

    Array[String].

    Definition Classes
    DataFrameCheck
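
The "invalid IDs are ignored" behavior can be illustrated on a plain array of column names. This is a sketch only: it takes the column list directly instead of a DataFrame, and it assumes that column IDs are 0-based positions in that list, which is not confirmed by this page.

```scala
object ColumnNameSketch {
  // Assumption: an ID is valid when it is a 0-based index into the column list.
  def isValidColumnID(columns: Array[String], id: Int): Boolean =
    id >= 0 && id < columns.length

  // Maps IDs to names, silently dropping IDs that are out of range.
  def getColumnNames(columns: Array[String], ids: Array[Int]): Array[String] =
    ids.filter(isValidColumnID(columns, _)).map(columns(_))
}
```

Under these assumptions, requesting the IDs 2, 7, 0 and -1 from the columns name, age and zip returns zip and name, in request order.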
  17. def getQuasiColumnIDs(fieldInfos: Array[FieldInfo]): Array[Int]

    Definition Classes
    DataFrameCheck
  18. def getSensColumnIDs(fieldInfos: Array[FieldInfo]): Array[Int]

    Definition Classes
    DataFrameCheck
  19. def getValidColumnIDs(src: DataFrame, columnIDs: Array[Int]): Array[Int]

    Definition Classes
    DataFrameCheck
  20. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. def isValidColumnID(src: DataFrame, columnID: Int): Boolean

    Checks whether the given column ID is valid.

    src

    dataframe to get names of columns.

    returns

    Boolean.

    Definition Classes
    DataFrameCheck
  23. def isValidColumnName(src: DataFrame, columnName: String): Boolean

    Checks whether the given column name is valid.

    src

    dataframe to get names of columns.

    columnName

    column Name.

    returns

    Boolean.

    Definition Classes
    DataFrameCheck
  24. val logger: Logger

    Attributes
    protected
    Definition Classes
    Logging
  25. def makeHeuristicTable(src: DataFrame, columnName: String, heuristicType: GenHeuristicTableMethod): Map[String, String]

    Generates the heuristic table that is referred to when changing the original data. The table is composed of [original data, anonymized data] pairs and can be made by the following methods:

    - RANDOM: makes the table from randomly generated strings.
    - DESIGNATED: TO BE DEFINED

    src

    Dataframe to anonymize

    columnName

    Column to be pseudo-anonymized

    heuristicType

    How to make heuristic table

    returns

    Map[String, String] Generated heuristic table
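
The RANDOM method can be sketched as follows. This is an illustration under stated assumptions, not the implementation of this class: the input is a plain sequence of column values, the replacement alphabet is mixed letters and digits (a guess at what the MIXED default means), and makeTable retries until every replacement is unique so that the table stays reversible.

```scala
import scala.util.Random

object RandomTableSketch {
  // Assumed alphabet for replacement strings: letters and digits.
  private val Alphabet: String =
    (('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9')).mkString

  def randomString(rng: Random, length: Int): String =
    Seq.fill(length)(Alphabet.charAt(rng.nextInt(Alphabet.length))).mkString

  // Builds an [original -> random replacement] table for the distinct values.
  def makeTable(values: Seq[String], length: Int = 6, rng: Random = new Random()): Map[String, String] = {
    val distinctValues = values.distinct
    // Guard against an endless retry loop when there are too few possible strings.
    require(length > 0 && BigInt(Alphabet.length).pow(length) >= distinctValues.size,
      "not enough distinct random strings of this length")
    distinctValues.foldLeft(Map.empty[String, String]) { (table, v) =>
      val used = table.values.toSet
      // Draw candidates until one is found that is not already a replacement.
      val fresh = Iterator.continually(randomString(rng, length)).find(s => !used.contains(s)).get
      table + (v -> fresh)
    }
  }
}
```

Passing a seeded Random makes the table reproducible, which can be convenient in tests; the default length of 6 follows the RandomInfo default shown in the constructor documentation.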

  26. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  27. final def notify(): Unit

    Definition Classes
    AnyRef
  28. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  29. def operate(df: DataFrame): DataFrame

    Runs the heuristic module for basic de-identification.

    df

    Input dataframe

    returns

    DataFrame Anonymized dataframe

    Definition Classes
    HeuristicOperator → BaseGenericOperator → BaseGenericMutantOperator
  30. val p: HeuristicInfo

  31. val privacy: PrivacyCheckInfo

    Definition Classes
    BasePrivacyAnonymizer
  32. def stop: Unit

    Definition Classes
    BaseGenericOperator → BaseGenericMutantOperator
  33. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  34. def toString(): String

    Definition Classes
    AnyRef → Any
  35. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  36. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  37. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
