The competition asks for a solution that analyzes the effect of news sentiment on the daily stock returns of top companies in the oil industry, so I selected the ten companies in the given sample dataset and wrote the Python code around them. The solution includes a Panel OLS regression on the full dataset of ten companies and a simple OLS regression for a single company (Exxon).
Panel regression is used to analyze the relationship because the dataset is a panel: a cross-section of 10 entities observed over a time series of 5 years. A traditional pooled linear regression can produce biased estimates when unobserved, entity-specific factors are ignored, whereas panel data regression can control for the effect of such unobserved variables on the dependent variable.
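To illustrate what controlling for unobserved effects means, here is a minimal sketch on synthetic data. It does not reproduce the Panel OLS model used in this solution. It hand-rolls a fixed-effects (within) estimator with pandas and NumPy, and the column names (`firm`, `sentiment`, `ret`), the true slope of 0.5, and the helper `within_estimator` are all assumptions made up for the example.

```python
import numpy as np
import pandas as pd


def within_estimator(df, entity_col, y_col, x_cols):
    """Fixed-effects (within) slopes: demean each variable by entity, then run OLS."""
    cols = [y_col] + list(x_cols)
    # Subtracting entity means removes any effect that is constant within an entity.
    demeaned = df[cols] - df.groupby(entity_col)[cols].transform("mean")
    X = demeaned[list(x_cols)].to_numpy()
    y = demeaned[y_col].to_numpy()
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(x_cols, beta))


# Synthetic panel (assumption): 10 firms, 250 days, true sentiment slope = 0.5.
rng = np.random.default_rng(0)
n_firms, n_days = 10, 250
firm = np.repeat(np.arange(n_firms), n_days)
firm_effect = rng.normal(0, 1, n_firms)[firm]  # unobserved, constant per firm
# Sentiment is correlated with the unobserved firm effect, which biases pooled OLS.
sentiment = 0.8 * firm_effect + rng.normal(0, 1, n_firms * n_days)
ret = 0.5 * sentiment + firm_effect + rng.normal(0, 0.1, n_firms * n_days)
panel = pd.DataFrame({"firm": firm, "sentiment": sentiment, "ret": ret})

fe_slope = within_estimator(panel, "firm", "ret", ["sentiment"])["sentiment"]
pooled_slope = np.polyfit(panel["sentiment"], panel["ret"], 1)[0]  # ignores firm effects

print(f"pooled OLS slope:    {pooled_slope:.3f}")
print(f"fixed-effects slope: {fe_slope:.3f}")
```

In this toy setup the pooled slope is pushed away from 0.5 because the firm effect is left in the error term, while the within estimator recovers a value close to 0.5.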
First, the stock return data, firm-specific financial data, interest rate, oil demand, and crisis variable are gathered for each company in separate data frames. These data frames are then combined into one full data frame, on which the Panel OLS regression is built.
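The combining step might look roughly like the sketch below. The company labels, column names, and values are placeholders invented for illustration, not the competition data, and the actual code may organize the frames differently.

```python
import pandas as pd

dates = pd.date_range("2020-01-01", periods=3, freq="D")

# Hypothetical per-company frames (illustrative columns and values only).
company_frames = {
    "Exxon": pd.DataFrame({
        "date": dates,
        "stock_return": [0.010, -0.020, 0.005],
        "sentiment": [0.3, -0.1, 0.2],
    }),
    "CompanyB": pd.DataFrame({
        "date": dates,
        "stock_return": [0.004, 0.012, -0.008],
        "sentiment": [0.1, 0.4, -0.3],
    }),
}

# Market-wide variables shared by every company (illustrative values).
macro = pd.DataFrame({
    "date": dates,
    "interest_rate": [1.50, 1.50, 1.25],
    "oil_demand": [99.1, 98.7, 97.9],
    "crisis": [0, 0, 1],
})

frames = []
for name, frame in company_frames.items():
    merged = frame.merge(macro, on="date", how="left")  # attach shared variables by date
    merged["company"] = name                            # tag rows with their entity
    frames.append(merged)

# Stack all companies and index by (entity, time), the usual layout for panel models.
full = (
    pd.concat(frames, ignore_index=True)
    .set_index(["company", "date"])
    .sort_index()
)
print(full)
```

Indexing by entity first and date second keeps each company's time series contiguous, which is the layout panel estimators generally expect.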
The ordinary least squares (OLS) method is employed to study the relationship between the independent variables and the dependent variable for a single entity. Any company can run this single-company OLS regression to see the results and interpret how the independent variables affect its stock return. The regression on the full dataset, by contrast, provides a statistical overview of the oil industry as a whole.
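A single-company OLS fit can be sketched as follows. This is an illustration on synthetic data using NumPy's least-squares solver, not the exact code of this solution. The helper `ols_fit`, the column names, and the coefficients used to generate the data are assumptions.

```python
import numpy as np
import pandas as pd


def ols_fit(df, y_col, x_cols):
    """OLS with an intercept; returns the coefficients and R-squared."""
    X = np.column_stack([np.ones(len(df)), df[list(x_cols)].to_numpy(dtype=float)])
    y = df[y_col].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    coefs = pd.Series(beta, index=["const"] + list(x_cols))
    return coefs, r2


# Synthetic data for one company (assumed true coefficients: 0.2, 0.5, -0.3).
rng = np.random.default_rng(1)
n = 500
one_firm = pd.DataFrame({
    "sentiment": rng.normal(size=n),
    "interest_rate": rng.normal(size=n),
})
one_firm["stock_return"] = (
    0.2
    + 0.5 * one_firm["sentiment"]
    - 0.3 * one_firm["interest_rate"]
    + rng.normal(0, 0.1, n)
)

coefs, r2 = ols_fit(one_firm, "stock_return", ["sentiment", "interest_rate"])
print(coefs.round(3))
print(f"R-squared: {r2:.3f}")
```

Each coefficient is read as the expected change in the stock return for a one-unit change in that variable, holding the others fixed. A statistics package would also report standard errors and p-values, which this sketch omits.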
|Are you a contestant for RMDS 2021 Data Science Competition?
||Mar 19, 2021
||Mar 21, 2021