Text data cleaning strategy