Essential data preprocessing steps for ML