4.5 Tokenization

Tokenization is used to split text into a sequence of words or sentences. In the output below, only the element at index 1 is shown; you can use any index.

4.6 Stemming

Stemming removes suffixes such as "ing", "ional", "ly", and "s" using a simple rule-based approach. We use PorterStemmer from the NLTK library. In the output below, we can see that "following" becomes "follow" and "trending" becomes "trend".
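A minimal sketch with NLTK's `PorterStemmer`; the word list is illustrative, chosen to carry the suffixes mentioned above:

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Illustrative words ending in "ing" and "s".
words = ["following", "trending", "dogs"]

# Apply the rule-based Porter algorithm to each word.
stems = [stemmer.stem(w) for w in words]
print(stems)  # "following" -> "follow", "trending" -> "trend"
```

Because stemming is purely rule-based, the result is not always a dictionary word; lemmatization is the usual alternative when valid words are required.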