In this day and age, information spreads at the click of a button, and its ripple effects are felt throughout economies and the world at large. We are pleased to announce the 2021 RMDS Data Science Competition: Predicting Impact of News Sentiment on the Stock Market, in collaboration with our partner WorldData.AI. Contestants will develop a model to predict daily stock performance using sentiment analysis. A grant of $1,000 will be awarded to the grand prize winner and $500 to the runner-up.
RMDS Lab presents The 2021 Challenge in partnership with WorldData.AI. This three-week data science competition seeks to predict the impact of news sentiment on the stock market. You may register as an individual or as a team; only one representative from each team is required to fill out this form.
Winning Solutions (see Award Ceremony)
Team 5 Aces: Vishal Kapadia, Dhruvil Trivedi, Deep Amin, Kartik Patel, Saurav Borse
Team Data Garage: Kalyan Kumar Alisetty, Maruthi Sankar Nanduri, Vinil Rayala
This data science competition seeks an innovative solution for analyzing the effects of news sentiment and bias on daily stock performance for top companies in the oil and gas industry. News sentiment and bias have a significant impact on stock prices and consumer behavior. Contestants will be provided with the necessary news data, stock market data, macro data and company financial data.
Participants are encouraged to explore the 10+ datasets provided in the sample dataset, which mostly cover 2020. There is also a recommended page where contestants can preview the related full datasets. Participants are free to use other datasets from WorldData.AI and can apply filters within WorldData.AI to refine the data.
The related datasets include:
- NORTH AMERICA NEWS SOURCES
This database contains a sentiment analysis of each news item from past years. Participants can query keywords to find the relevant sentiment index. We have preselected "Exxon" as a relevant indicator.
- COMMODITY PRICES
This database covers global prices for ICE Brent Crude Oil and WTI Crude Oil, with the measures Change, Volume, Settle, Open, Low and Value.
- GLOBAL STOCK EXCHANGE DATA
This dataset comprises daily stock market data from the New York Stock Exchange and covers closing prices for Exxon, BP and Chevron.
- STOCK MARKET INDICATORS & INDEX
This covers the daily index for Dow Jones Transportation, Dow Jones Utility, S&P 500 and others.
- FINANCIAL STATEMENT
This dataset contains company financial statements filed with the SEC. It covers the Balance Sheet, Cash Flow Statement and Income Statement.
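As a starting point, the sketch below shows one possible way to line up a daily sentiment index with daily closing prices so that each day's sentiment is paired with the following trading day's return. It is only an illustration under stated assumptions: the dictionaries, the numbers, and the function names `next_day_returns` and `build_rows` are hypothetical, and the actual column names and formats should be taken from the competition's data dictionary.

```python
from datetime import date

# Hypothetical sample values, made up for illustration only.
# Real data comes from the competition datasets.
sentiment = {  # date -> daily average news sentiment for the keyword "Exxon"
    date(2020, 3, 2): 0.10,
    date(2020, 3, 3): -0.25,
    date(2020, 3, 4): 0.05,
}
closes = {  # date -> daily closing price
    date(2020, 3, 2): 50.0,
    date(2020, 3, 3): 52.0,
    date(2020, 3, 4): 51.0,
    date(2020, 3, 5): 51.0,
}


def next_day_returns(closes):
    """Map each trading day to the simple return realised on the next trading day."""
    days = sorted(closes)
    return {
        today: closes[tomorrow] / closes[today] - 1.0
        for today, tomorrow in zip(days, days[1:])
    }


def build_rows(sentiment, closes):
    """Pair day-t sentiment with the day t -> t+1 return, keeping dates present in both."""
    returns = next_day_returns(closes)
    return [
        (day, sentiment[day], returns[day])
        for day in sorted(sentiment)
        if day in returns
    ]


rows = build_rows(sentiment, closes)
for day, score, ret in rows:
    print(day.isoformat(), f"sentiment={score:+.2f}", f"next_day_return={ret:+.4f}")
```

Pairing sentiment with the next day's return, rather than the same day's, is one way to avoid using information that would not have been available when the prediction was made. Whether a one-day lag is the right choice is a modeling decision left to each team.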
RMDS Lab offers our community a variety of educational resources focusing on data science applications and techniques. You may explore the RMDS learning portal containing various data science courses at learn.grmds.org.
Competitors may use the code “COMPETITION2021” to get complimentary access to our online course on Big Data and AI to Improve Competency and Employability.
Below are additional free resources:
If you have any questions about access to training materials or want to learn more about RMDS educational resources, you may use the Forum.
Submission Deliverables
- Source code in Python or R (required)
- README explaining how to run your code, plus a requirements.txt for your development environment
- Datasets used, in a .zip folder
- CSV of results
- Technical report in PDF, with the team name and the names of all team members (required)
- Optional: a working prototype such as a map, web page, or app
Impact: What useful business insights are acquired from the proposal? How does this submitted model benefit (or cost) businesses, and what actionable steps are recommended to improve their work?
Methodology Validity: Document the methodology, mathematics, and economic principles behind the proposal and provide the references or reasoning for your approach. How is the prediction generated, and are the factors weighted sensibly? Are the assumptions and limitations of the methodology clearly outlined with suggestions to improve the proposal? Are the quantitative steps of data ingestion, feature engineering, model architecture, and performance optimization valid? How robust is your model?
Reproducibility: Does the solution use coding best practices with workflows and documentation to reproduce one’s work? Are the data ingress and egress pipelines reproducible? Is there a clear presentation of data science work in the documentation?
Usability: Is the information presented in a way that is actionable? Would a member of the general public understand the model, what it means, and what actions to take?
Ability to Deploy: Is access to the data realistic? What is the computation time? Is the solution a good fit within the existing system, and can it scale to take new data sources into account? Does it come with feasible suggestions? How often does it need to be maintained, how easy is it to maintain and update, and how much manpower, time, and resources must be allocated to keep it functioning?
Fair and Ethical Use of Data: Does the solution take into account biases in data? Is the data from open and trusted sources?
Innovation: Will the idea have a big impact? How innovative is the approach, selection and weighting of various factors, or how information is displayed and communicated?
Stage 1: Registration
Participants will register on GRMDS. We will send a confirmation email to all participants upon successful registration. Once you form your team, one representative from your team must fill out the Team Registration Form. Please note that this competition is open to all participants globally. If you have any questions, you may ask them on the Forum.
Stage 2: Team work and submission
Submissions must include all deliverables and are due Sunday, March 21, at 11:59 PDT. Please upload all deliverables to GRMDS. Place the team name and the names of all team members on the technical report. A submission by any individual team member represents the whole team.
Stage 3: Evaluation and Final Presentation
Our expert committee will evaluate all project deliverables and select the finalist teams at the Awards Ceremony. WorldData.AI and RMDS Lab may work with partners to deploy and use the winning models to score risks to guide our communities in the form of alerts accessed via map, website, or app.
Cash prizes, internship opportunities, and certificates of recognition will be awarded for first and second place.
First prize is $1,000
Second prize is $500
Consideration for internship positions at RMDS Lab, WorldData.AI and other partner organizations
Data Science Competency Certification
Complimentary premium membership at RMDS Lab
Complimentary data subscription at WorldData.AI
Publishing opportunities with partners
Invitation to present at IM Data 2021
Code of Conduct
All use of data must be ethical and must protect individual data privacy. Find the Code of Conduct here.
Frequently Asked Questions
The registration form can be found here. (You must be signed in to view the form.)
Participants are welcome either as individuals or as teams. In the case of teams, one person must be designated as the team leader and will be solely responsible for communications with the organizers.
Submissions can be made here. See above section “Submission Deliverables” to see what must be included in your submission.
There is no registration deadline, but we strongly recommend registering no later than March 8 so that you have enough time to prepare your work.
No minimum or maximum number.
The number of team members will not impact potential prize offerings. The prize offerings will remain the same.
Yes. We welcome people from different cities or countries to join our competition. This competition is open to the global community.
If you have any questions, you may ask in the Forum. We’ll get back to you as soon as we can.
If you need to update your team roster, fill out the form here.
Please see resources listed on this page, including recordings of competition training sessions. There is also a dataset sample, data dictionary and further reading material.