diff --git a/project_proposal.tex b/project_proposal.tex
index a890a0a..1ee5d48 100644
--- a/project_proposal.tex
+++ b/project_proposal.tex
@@ -44,8 +44,46 @@ LLM which gives web scrapped articles a score how good the news is for a company
 information, I need to reevaluate which algorithm is the best.
 
 \section{Libraries and Tools}
+The project will be implemented in Python using \texttt{gym-anytrading} to build the trading
+environment. For initial experiments, I will use the built-in datasets from
+\texttt{gym\_anytrading.datasets} such as \texttt{STOCKS\_GOOGL}, and later switch to real
+historical stock data via \texttt{yfinance}.
+
+The reinforcement learning algorithms will be implemented using the \texttt{stable-baselines3}
+library. I will start with the standard DQN algorithm and experiment with different epsilon decay
+strategies. Since \texttt{stable-baselines3} does not directly support Double DQN, I plan to modify
+the DQN implementation myself. Specifically, I will adjust the target calculation so that the action
+is selected using the online network but evaluated using the target network, as required in Double
+DQN. This will allow me to better understand the internal workings of the algorithm and directly
+control its behavior.
+
+In addition to DQN and Double DQN, I will also train PPO using the standard implementation in
+\texttt{stable-baselines3}.
+
+After training, I will evaluate all models using backtesting and performance metrics like total
+profit, Sharpe ratio, and maximum drawdown. Later, I plan to extend the observation space with
+technical indicators, volume data, or sentiment features. For technical indicators, I will use the
+\texttt{pandas-ta} library since it is easy to install, well integrated with \texttt{pandas}, and
+provides a wide range of indicators sufficient for prototyping and research. Alternatively,
+\texttt{TA-Lib} is an option if higher performance is needed, but it has more complex installation
+requirements.
+
+After adding these features, I will retrain the models and compare their performance again.
+
 
 \section{Development plan}
+Depending on when my presentation is scheduled, I have about 9--10 weeks.
+
+\subsection{Week 1--3}
+I will integrate the DQN algorithm as a first example and train it on historical data.
+
+\subsection{Week 4--6}
+I plan to implement the other RL algorithms and their variations and evaluate which works best. I
+will also experiment with changes to the reward function.
+
+\subsection{Week 7 to the presentation}
+I will add the technical indicators and market volume to the environment. If there is time left, I
+can try news analysis.
 
 \section{Availability}
 I am on vacation from the 04.08 to 13.08. On the 15. I am on an event, but I have time on the 14.
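
The Double DQN change described in the Libraries and Tools section (select the action with the online network, evaluate it with the target network) can be sketched for a single transition as follows. This is only an illustration in plain Python, not code from \texttt{stable-baselines3}; the function names and the list-based Q-value inputs are hypothetical and stand in for the network outputs at the next state.

```python
from typing import Sequence


def vanilla_dqn_target(reward: float, done: bool, gamma: float,
                       q_target_next: Sequence[float]) -> float:
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float]) -> float:
    """Double DQN target (hypothetical helper, not a library API).

    The online network picks the greedy action at the next state,
    and the target network supplies the value of that action.
    """
    if done:
        return reward
    # Action selection: argmax over the ONLINE network's Q-values.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation: Q-value of that action from the TARGET network.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Made-up Q-values for three actions at the next state.
    q_online = [1.0, 3.0, 2.0]
    q_target = [5.0, 0.5, 4.0]
    print(vanilla_dqn_target(1.0, False, 0.9, q_target))
    print(double_dqn_target(1.0, False, 0.9, q_online, q_target))
```

With these made-up values the vanilla target bootstraps from the largest target-network value, while the Double DQN target uses the target-network value of the action the online network prefers, which is the mechanism intended to reduce overestimation. In an actual modification of the DQN implementation the same two steps would be applied to batched tensors.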