Using a soft-margin kernel Support Vector Machine to classify newspaper articles and model an Economic Policy Uncertainty Index for India.
Plain-text data was collected by scraping articles from The Hindu newspaper over a four-year period. After data transformation and feature selection, the processed text was fed to a soft-margin RBF-kernel SVM classifier. We read 1,100 newspaper articles and tested the classifier on the remaining 47,875. The resulting index, after normalization, was compared with benchmarks such as the Volatility Index (VIX) and past government securities. The index had a linear correlation of 0.61 with the Indian VIX, and it also picked up important national events such as political elections, terrorism, economic crises, and the budget season.
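The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The abstract does not say which data transformation, feature selection method, or normalization was used, so the TF-IDF weighting, the chi-squared selector, and the "scale to mean 100" step are assumptions. The function names `build_classifier`, `uncertainty_index`, and `pearson` are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def build_classifier(k_features="all", C=1.0, gamma="scale"):
    """TF-IDF -> chi-squared feature selection -> soft-margin RBF SVM.

    TF-IDF and chi-squared are assumed stand-ins for the unspecified
    transformation and selection steps. C controls the soft margin:
    smaller values tolerate more misclassified training articles.
    """
    return Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english")),
        ("select", SelectKBest(chi2, k=k_features)),
        ("svm", SVC(kernel="rbf", C=C, gamma=gamma)),
    ])


def uncertainty_index(labels, periods):
    """Share of articles flagged 1 in each period, scaled to a mean of 100.

    The mean-100 scaling is an assumed normalization.
    """
    labels = np.asarray(labels, dtype=float)
    periods = np.asarray(periods)
    keys = sorted(set(periods.tolist()))
    shares = np.array([labels[periods == p].mean() for p in keys])
    if shares.mean() == 0:
        raise ValueError("no articles were flagged; index is undefined")
    return keys, 100.0 * shares / shares.mean()


def pearson(a, b):
    """Linear (Pearson) correlation between two equally long series."""
    return float(np.corrcoef(np.asarray(a, float), np.asarray(b, float))[0, 1])


# Toy, made-up data purely to show the call sequence.
train_texts = [
    "government policy uncertainty worries investors",
    "budget deficit raises economic uncertainty",
    "cricket team wins the final match",
    "new film released in theatres this week",
]
train_labels = [1, 1, 0, 0]  # 1 = economic policy uncertainty, 0 = other

clf = build_classifier()
clf.fit(train_texts, train_labels)
predicted = clf.predict(["uncertainty over tax policy", "match ends in a draw"])

keys, index = uncertainty_index([1, 0, 1, 1],
                                ["2015-01", "2015-01", "2015-02", "2015-02"])
```

In practice the hand-read articles would play the role of `train_texts`, the remaining articles would be passed to `predict`, and `pearson` would be applied to the resulting index and a benchmark series such as the VIX over matching periods.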
The link to the report can be found here