Gets the average of the given list of values.
the list of values
Double The average value
Gets the average of the values in this column.
Dataframe
Column
String The average value, as a string
Returns the name of the column in the src dataframe identified by the given column ID, as defined by protobuf.
dataframe to get names of columns.
column ID to anonymize.
String.
Returns the names of the columns in the src dataframe identified by the given column IDs. Columns with invalid IDs are ignored.
dataframe to get names of columns.
Array[String].
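The lookup described above (resolve IDs to names, skip invalid IDs) can be sketched with plain Scala collections. This is only an illustration: the object name ColumnNameSketch, both function names, and the assumption that a column ID is a zero-based index into the dataframe's column array are hypothetical and not taken from this library.

```scala
object ColumnNameSketch {
  // Assumption: a column ID is a zero-based index into the column array
  // (the array stands in for the names a dataframe would report).
  def isValidColumnId(columns: Array[String], id: Int): Boolean =
    id >= 0 && id < columns.length

  // Resolves each ID to its column name; invalid IDs are silently dropped.
  def columnNames(columns: Array[String], ids: Seq[Int]): Array[String] =
    ids.filter(id => isValidColumnId(columns, id)).map(id => columns(id)).toArray
}
```

For example, with columns Array("name", "age", "zip") and IDs Seq(1, 5, 0), the out-of-range ID 5 is ignored and the names for IDs 1 and 0 are returned in that order.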
Gets the number of tuples in the given list of values.
Double The number of tuples
Gets the number of tuples in this column.
Dataframe
Column
Double The number of tuples
A statistic table is composed of [Interval, statistic] entries. This function returns the Interval that includes the given 'value'.
Statistic table
Value to search for
Interval
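A minimal sketch of such an interval lookup follows. The Interval case class here is a hypothetical stand-in (the library's own Interval type is not shown in this document), and treating the lower bound as inclusive and the upper bound as exclusive is an assumption.

```scala
object IntervalLookupSketch {
  // Hypothetical interval type: lower bound inclusive, upper bound exclusive.
  final case class Interval(lower: Double, upper: Double) {
    def contains(v: Double): Boolean = v >= lower && v < upper
  }

  // Returns the first interval key that contains the value, if any.
  def findInterval(table: Map[Interval, Double], value: Double): Option[Interval] =
    table.keys.find(_.contains(value))
}
```

Returning an Option makes the "no matching interval" case explicit; how the real function reports a miss is not stated in this document.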
Gets the maximum of the given list of values.
the list of values
Double Maximum value
Gets the maximum of the values in this column.
Dataframe
Column
String The maximum value, as a string
Gets the minimum of the given list of values.
the list of values
Double Minimum value
Gets the minimum of the values in this column.
Dataframe
Column
String The minimum value, as a string
Gets the representative value of the 'columnName' column in the 'src' dataframe.
Dataframe
Column name
Statistic method, e.g., min, max, or avg
Double The representative value
Gets the representative value of the 'columnName' column in the 'src' dataframe.
Dataframe
Column name
Statistic method, e.g., min, max, or avg
Double The representative value
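The idea of choosing a representative value by method name can be sketched as follows. This is an illustration over a plain Seq[Double] rather than a dataframe column; the object and function names are hypothetical, and only the three methods named above (min, max, avg) are handled.

```scala
object RepresentativeValueSketch {
  // Picks one value that summarizes the input, according to the method name.
  def representative(values: Seq[Double], method: String): Double = {
    require(values.nonEmpty, "values must not be empty")
    method.toLowerCase match {
      case "min" => values.min
      case "max" => values.max
      case "avg" => values.sum / values.size
      case other => throw new IllegalArgumentException(s"unsupported method: $other")
    }
  }
}
```

Rejecting unknown method names with an exception is a design choice of this sketch; the behaviour of the documented functions for unsupported methods is not stated here.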
Gets the standard deviation of the given list of values.
the list of values
Double The standard deviation value
Gets the standard deviation of the values in this column.
Dataframe
Column
String The standard deviation value, as a string
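The basic statistics documented above (average, count, and standard deviation; minimum and maximum are available directly on Scala collections) can be illustrated on a plain list. The names below are hypothetical, and the sketch computes the population standard deviation (dividing by n); whether the documented functions use the population or the sample form is not stated in this document.

```scala
object BasicStatsSketch {
  // Arithmetic mean of the values.
  def avg(values: Seq[Double]): Double = values.sum / values.size

  // Number of tuples, returned as Double to mirror the documented return type.
  def count(values: Seq[Double]): Double = values.size.toDouble

  // Population standard deviation: sqrt of the mean squared deviation.
  def std(values: Seq[Double]): Double = {
    val mean = avg(values)
    math.sqrt(values.map(v => (v - mean) * (v - mean)).sum / values.size)
  }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the population standard deviation is 2.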
Checks whether the given column ID is valid.
dataframe to get names of columns.
Boolean.
Checks whether the given column name is valid.
dataframe to get names of columns.
Column name.
Boolean.
Same as makeAgeStatTable(src: DataFrame, columnName: String, method: String), but the method is of type AggregationMethod.
Dataframe
Column
AggregationMethod, e.g., min, max, avg, or median
Map[Interval, Double] Statistic table
For an age-related column, the statistics may be computed per decade: 10s, 20s, and so on.
Dataframe
Column
Statistic method, e.g., min, max, avg, or median
Map[Interval, Double] Statistic table
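Grouping ages by decade can be sketched as below. This is not the library's implementation: Interval is a hypothetical stand-in type, ages are assumed to be non-negative integers, and the aggregation is passed in as a plain function instead of a method name.

```scala
object AgeStatTableSketch {
  // Hypothetical interval type: [lower, upper).
  final case class Interval(lower: Double, upper: Double)

  // Groups ages by decade (20s, 30s, ...) and aggregates each group.
  def ageStatTable(ages: Seq[Int], aggregate: Seq[Double] => Double): Map[Interval, Double] =
    ages.groupBy(age => age / 10 * 10).map { case (decade, group) =>
      Interval(decade.toDouble, decade + 10.0) -> aggregate(group.map(_.toDouble))
    }
}
```

With an averaging function, the ages 21, 25, 29, 34, 36 yield one entry for the 20s and one for the 30s; decades with no ages produce no entry.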
Same as makeMixedStatTable(src: DataFrame, columnName: String, method: String), but the method is of type AggregationMethod.
Dataframe
Column
AggregationMethod, e.g., min, max, avg, or median
Map[Interval, Double] Statistic table
Some columns may contain values that mix numerical and non-numerical parts, e.g., 21K, $45. In this case, this function calculates the statistics from the numerical parts only, and then replaces only the numerical parts with the calculated statistic while preserving the other parts. It builds the statistic table as a map of [numerical interval, statistics info].
Dataframe
Column
Statistic method, e.g., min, max, avg, or median
Map[Interval, Double] Statistic table
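The two string-level steps described above (extract the numerical part, then substitute a statistic while keeping the rest) can be sketched with the standard library's Regex. The helper names and the pattern, which matches the first run of digits with an optional decimal part, are assumptions of this sketch; how the library actually parses mixed values is not stated here.

```scala
import scala.util.matching.Regex

object MixedValueSketch {
  // Assumption: the numerical part is the first run of digits, optionally with decimals.
  private val Numeric: Regex = """\d+(?:\.\d+)?""".r

  // Extracts the numerical part of a mixed value such as "21K" or "$45".
  def numericPart(raw: String): Option[Double] =
    Numeric.findFirstIn(raw).map(_.toDouble)

  // Replaces only the numerical part with the statistic, keeping the other characters.
  def replaceNumericPart(raw: String, stat: Double): String =
    Numeric.replaceFirstIn(raw, Regex.quoteReplacement(stat.toString))
}
```

The extracted numbers would then feed an interval-based statistic table, and each original value would have its numerical part replaced by the statistic of the interval it falls into.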
Makes the statistic table, which holds statistic information for each numerical interval, as a map of [numerical interval, statistics info].
Dataframe
Column
Statistic method, e.g., min, max, avg, or median
The number of intervals in a column
Map[Interval, Double] Statistic table
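One way to build such a table is to split the value range into nSteps equal-width intervals and aggregate the values in each. The sketch below works on a plain Seq[Double]; the equal-width split, the hypothetical Interval type, and passing the aggregation as a function are all assumptions, since this document does not say how the library chooses interval boundaries.

```scala
object NumericStatTableSketch {
  // Hypothetical interval type: [lower, upper), except the last one also holds the maximum.
  final case class Interval(lower: Double, upper: Double)

  def numericStatTable(values: Seq[Double], nSteps: Int,
                       aggregate: Seq[Double] => Double): Map[Interval, Double] = {
    require(values.nonEmpty && nSteps > 0, "need at least one value and one step")
    val lo = values.min
    val width = (values.max - lo) / nSteps
    // Index of the interval a value falls into; the maximum is clamped into the last interval.
    def bucket(v: Double): Int =
      if (width == 0.0) 0 else math.min(((v - lo) / width).toInt, nSteps - 1)
    values.groupBy(v => bucket(v)).map { case (i, group) =>
      Interval(lo + i * width, lo + (i + 1) * width) -> aggregate(group)
    }
  }
}
```

Intervals that contain no values do not appear in the resulting map; a caller that needs every interval present would have to fill those in separately.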
Same as makeNumericStatTable(src: DataFrame, columnName: String, method: String, nSteps: Int), but the method is of type AggregationMethod.
Dataframe
Column
AggregationMethod, e.g., min, max, avg, or median
The number of intervals in a column
Map[Interval, Double] Statistic table
Makes the statistic table, which holds statistic information for each numerical interval, as a map of [numerical interval, statistics info]. The default number of intervals in a column is 10.
Dataframe
Column
Statistic method, e.g., min, max, avg, or median
Map[Interval, Double] Statistic table
Same as makeNumericStatTable(src: DataFrame, columnName: String, method: String), but the method is of type AggregationMethod.
Dataframe
Column
AggregationMethod, e.g., min, max, avg, or median
Map[Interval, Double] Statistic table
This object provides some functions to manage statistic information.