Converts all numerical variables into factor or character, depending on the 'stringsAsFactors' parameter, based on an equal frequency criterion. The thresholds for each segment of each variable are taken from the output of the discretize_get_bins function, which returns a data frame containing the thresholds for each variable. That result must be passed as the 'data_bins' parameter. Note that the returned data frame contains the non-transformed variables plus the transformed ones. More info about converting numerical into categorical variables can be found at:

discretize_df(data, data_bins, stringsAsFactors = T)



data: Input data frame.


data_bins: Data frame generated by the 'discretize_get_bins' function. It contains the variable name and the thresholds for each bin, or segment.


stringsAsFactors: Logical value indicating whether the discretization result is returned as factor (TRUE) or character (FALSE). When TRUE, the segments are ordered. TRUE by default.


Data frame with the transformed variables


if (FALSE) {
  library(funModeling)

  # Get the bin thresholds for each variable.
  # If 'input' is missing, it runs for all numerical variables.
  d_bins = discretize_get_bins(data = heart_disease,
                               input = c("resting_blood_pressure", "oldpeak"),
                               n_bins = 5)

  # Now the bins can be applied to the same data frame, or to a new one
  # (for example, in a predictive model whose data changes over time).
  heart_disease_discretized = discretize_df(data = heart_disease,
                                            data_bins = d_bins,
                                            stringsAsFactors = TRUE)
}
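To make the equal frequency idea more concrete, the following is a minimal base R sketch of the same two-step pattern: compute thresholds once, then apply them to the original data or to new data. It is not the package's implementation. It assumes quantile-based thresholds and uses hypothetical names (x, new_x, n_bins, breaks) and made-up values purely for illustration.

```r
# Sketch only: equal frequency binning with base R (quantile + cut).
# All names and values here are hypothetical, not part of discretize_df.
x <- c(3, 8, 1, 15, 9, 12, 6, 20, 4, 18, 7, 11, 2, 14, 5, 17, 10, 13, 16, 19)
n_bins <- 5

# Step 1: thresholds at evenly spaced quantiles (comparable in spirit
# to what a "get bins" step would store for later use).
thresholds <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1), names = FALSE)

# Open the outer bins so values outside the original range still get a segment.
breaks <- c(-Inf, thresholds[2:n_bins], Inf)

# Step 2: apply the stored thresholds; ordered_result = TRUE gives an ordered factor.
x_binned <- cut(x, breaks = breaks, ordered_result = TRUE)
print(table(x_binned))

# The same thresholds can be reused on new data.
new_x <- c(0, 5, 25)
new_binned <- cut(new_x, breaks = breaks, ordered_result = TRUE)
print(new_binned)

# Character output instead of factor, analogous to stringsAsFactors = FALSE.
new_binned_chr <- as.character(new_binned)
```

With these 20 distinct values, each of the 5 segments receives 4 observations. When a variable has many tied values, quantile-based thresholds can coincide and the segments will not be perfectly balanced.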