Twitter Data Analytics of Announcement of Team India for ICC T20 World Cup
I have to start off by admitting that my domain knowledge on the subject (BCCI and the Indian Cricket Team) is not what it used to be. However, the announcement of the Indian Cricket Team roster for the ICC T20 World Cup on September 12th, 2022 left a bad taste with many of the people I follow on social media. Most of the posts were about how the team selection was not merit-based (a particular highlight being the exclusion of #SanjuSamson from the roster). I found this an interesting topic to pursue and started collecting Twitter data from September 12th until today, October 23rd. 325,847 tweets later, this is what I have. I did not collect data for the lull period between September 27th and October 4th. A lot of spurious data was removed in R, and the tweets were further filtered on source to make sure bot entries did not make it into the data. It took quite a bit of manual stemming and stop-word removal to filter out the noise (BigBoss, Movie advertisements