All functions

auto_grouping()

Reduce cardinality of a categorical variable by automatic grouping

categ_analysis()

Profiling analysis of categorical vs. target variable

compare_df()

Compare two data frames by keys

concatenate_n_vars()

Concatenate 'N' variables

convert_df_to_categoric()

Convert every column in a data frame to character

coord_plot()

Coordinate plot

correlation_table()

Get correlation against target variable

cross_plot()

Cross-plotting input variable vs. target variable

data_country

People with flu data

data_golf

Play golf

data_integrity()

Data integrity

data_integrity_model()

Check data integrity model

desc_groups()

Profiling categorical variable

desc_groups_rank()

Profiling categorical variable (rank)

df_status()

Get a summary for the given data frame (or vector).

discretize_df()

Discretize a data frame

discretize_get_bins()

Get the data frame thresholds for discretization

discretize_rgr()

Variable discretization by gain ratio maximization

entropy_2()

Computes the entropy between two variables

equal_freq()

Equal frequency binning

export_plot()

Export plot to a JPEG file

fibonacci()

Fibonacci series

freq()

Frequency table for categorical variables

funModeling-package

funModeling: Exploratory data analysis, data preparation and model performance

gain_lift()

Generates lift and cumulative gain performance table and plot

gain_ratio()

Gain ratio

get_sample()

Sampling training and test data

hampel_outlier()

Hampel Outlier Threshold

heart_disease

Heart Disease Data

infor_magic()

Computes several information theory metrics between two vectors

information_gain()

Information gain

metadata_models

Metadata models data integrity

plot_num()

Plotting numerical data

plotar()

Correlation plots

prep_outliers()

Outliers Data Preparation

profiling_num()

Profiling numerical data

range01()

Transform a variable into the [0, 1] range

status()

Get a summary for the given data frame (or vector).

tukey_outlier()

Tukey Outlier Threshold

v_compare()

Compare two vectors

var_rank_info()

Variable importance ranking based on information theory