To analyze our data, we primarily used pandas, a well-known Python library for data analysis. We also use Google Colab to show how we applied pandas in our code.
Results and Discussion Summary
As discussed above, we classified the data by their implied reasoning in two ways: manually and with a machine learning model. Below, we can see the breakdown for each set of categories.
The bar graph above shows the distribution of tweets across our specified categories under the manual classification. Of these tweets, 62 implied that the main reason FEM’s regime was a "Golden era" was the flourishing economy of that time, 56 claimed peace and a good quality of life, 22 pointed to being saved from communism and terrorism, and 10 cited infrastructure and other projects such as the nutribun.
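As a minimal sketch of how a tally like this can be produced with pandas, the snippet below rebuilds one row per tweet from the counts reported above and summarizes them. The column name `manual_category` and the shortened category labels are hypothetical placeholders for illustration, not names taken from our actual notebook.

```python
import pandas as pd

# Counts reported in the text for the manual classification.
# The short labels are illustrative shorthand (an assumption).
manual_counts = {
    "Flourishing economy": 62,
    "Peace and quality of life": 56,
    "Saved from communism/terrorism": 22,
    "Infrastructure and projects": 10,
}

# Rebuild one row per tweet so the data resembles a labeled dataset.
labels = [category for category, n in manual_counts.items() for _ in range(n)]
tweets = pd.DataFrame({"manual_category": labels})

# value_counts() tallies tweets per category, most frequent first.
distribution = tweets["manual_category"].value_counts()

# Express each category as a percentage of all labeled tweets.
share = (distribution / len(tweets) * 100).round(1)

summary = pd.DataFrame({"tweets": distribution, "percent": share})
print(summary)
```

With a real labeled dataset, only the `value_counts()` step onward would be needed, and `distribution.plot(kind="bar")` is one way to draw a bar graph from the result.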
Machine Learning Topic Clustering Result
For our machine learning model, we used a topic clustering model in which the topics are the different reasons given for supposing the "Golden era." Specifically, we used LDA (Latent Dirichlet Allocation), which automatically groups the keywords from our tweets into topics. Since the keywords mostly overlapped when we used 4 topics, we decided to cut the number down to 3. With these results, we labeled each topic according to the theme its group of keywords seemed to suggest:
- Topic 1: Personal Experience
- Topic 2: Prosperity
- Topic 3: Communism/Economy
On the other hand, we have this distribution of tweets as generated by the machine learning model. In this categorization, the tweets are spread more evenly across the categories: 48 cited personal experience in supposing the "Golden era" during FEM's regime, 47 claimed prosperity during this time, and 55 implied communism and the economy as the reason. These counts are much closer to one another than those from the manual classification.
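To put a number on how much closer the model's counts are, one rough option is to compare the spread of the two distributions. The sketch below uses the counts reported in this section; the helper `spread` and the shortened manual labels are hypothetical names introduced for illustration. Since the two classifications use different category sets (4 versus 3), the comparison is only indicative.

```python
import pandas as pd

# Counts reported in the text for each classification.
manual = pd.Series({
    "Flourishing economy": 62,
    "Peace and quality of life": 56,
    "Saved from communism/terrorism": 22,
    "Infrastructure and projects": 10,
})
model = pd.Series({
    "Personal Experience": 48,
    "Prosperity": 47,
    "Communism/Economy": 55,
})

def spread(counts):
    """Return (largest minus smallest count, population standard deviation)."""
    return int(counts.max() - counts.min()), float(counts.std(ddof=0))

manual_range, manual_std = spread(manual)
model_range, model_std = spread(model)

print(f"Manual: range={manual_range}, std={manual_std:.1f}")
print(f"Model:  range={model_range}, std={model_std:.1f}")
```

The gap between the largest and smallest category is 52 tweets for the manual classification and 8 for the model, which matches the observation that the model's categories are more evenly filled.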