Splits the input data into a training set and a test set. Setting the seed makes the function return the same sample on every call.

get_sample(data, percentage_tr_rows = 0.8, seed = 987)

Arguments

data

input data source

percentage_tr_rows

proportion of rows assigned to the training set, from 0.1 to 0.99. Default value = 0.8 (80 percent of the rows are used for training)

seed

seed for the random number generator used to draw the sample, so the split is reproducible. Default value = 987

Value

TRUE/FALSE vector with one element per row of 'data'. TRUE marks a row position that belongs to the training data
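
A seeded split that yields a TRUE/FALSE vector of this kind can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: split_mask_sketch is a hypothetical name, and the built-in mtcars data set stands in for the input data. It also shows how the complement is selected for a logical vector (with !) compared with a vector of row positions (with a minus sign).

```r
# Hypothetical helper for illustration only; this is NOT get_sample() itself.
split_mask_sketch <- function(data, percentage_tr_rows = 0.8, seed = 987) {
  stopifnot(percentage_tr_rows >= 0.1, percentage_tr_rows <= 0.99)
  set.seed(seed)                    # fixed seed: same rows drawn on every call
  n <- nrow(data)
  tr_rows <- sample(n, round(percentage_tr_rows * n))  # training row positions
  seq_len(n) %in% tr_rows           # TRUE where the row is a training row
}

mask <- split_mask_sketch(mtcars)   # mtcars is an assumed stand-in data set
data_tr <- mtcars[mask, ]           # rows flagged TRUE
data_ts <- mtcars[!mask, ]          # logical vector: complement with !

# The same split expressed as row positions; here the complement uses a minus.
idx <- which(mask)
data_ts_idx <- mtcars[-idx, ]
```

Calling split_mask_sketch twice with the same seed gives the same vector, which is what makes the training and test sets repeatable across runs.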

Examples

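# Training and test data. The default proportion of training cases is 80%.
index_sample = get_sample(data = heart_disease, percentage_tr_rows = 0.8)

# Generating the samples.
# Note: negative indexing assumes index_sample holds row positions; with a
# TRUE/FALSE vector, select the test rows with heart_disease[!index_sample, ].
data_tr = heart_disease[index_sample, ]
data_ts = heart_disease[-index_sample, ]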