project_proposal finished
LLM which gives web-scraped articles a score for how good the news is for a company

information, I need to reevaluate which algorithm is the best.

\section{Libraries and Tools}

The project will be implemented in Python using \texttt{gym-anytrading} to build the trading
environment. For initial experiments, I will use the built-in datasets from
\texttt{gym\_anytrading.datasets} such as \texttt{STOCKS\_GOOGL}, and later switch to real
historical stock data via \texttt{yfinance}.
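As a rough sketch, the environment setup could look as follows, assuming a recent
\texttt{gym-anytrading} release based on \texttt{gymnasium}; the \texttt{window\_size},
\texttt{frame\_bound}, ticker, and date range are placeholders I still need to tune:
\begin{verbatim}
import gymnasium as gym
import gym_anytrading  # registers the "stocks-v0" environment
from gym_anytrading.datasets import STOCKS_GOOGL
import yfinance as yf

# First experiments: built-in GOOGL dataset shipped with gym-anytrading
env = gym.make(
    "stocks-v0",
    df=STOCKS_GOOGL,        # OHLCV DataFrame
    window_size=10,         # past time steps visible in one observation
    frame_bound=(10, 300),  # slice of the data used for this run
)

# Later: real historical data via yfinance (same OHLCV column layout;
# depending on the yfinance version, the columns may need flattening)
df = yf.download("GOOGL", start="2015-01-01", end="2024-01-01")
env = gym.make("stocks-v0", df=df, window_size=10,
               frame_bound=(10, len(df)))
\end{verbatim}
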
The reinforcement learning algorithms will be implemented using the \texttt{stable-baselines3}
library. I will start with the standard DQN algorithm and experiment with different epsilon-decay
strategies. Since \texttt{stable-baselines3} does not directly support Double DQN, I plan to modify
the DQN implementation myself. Specifically, I will adjust the target calculation so that the action
is selected using the online network but evaluated using the target network, as required by Double
DQN. This will allow me to better understand the internal workings of the algorithm and directly
control its behavior.
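The core of that change is small. A sketch of the intended target computation is shown below; the
function and argument names are my own, but they mirror the quantities used in
\texttt{stable-baselines3}'s DQN training step:
\begin{verbatim}
import torch

def double_dqn_targets(q_net, q_net_target, next_obs, rewards, dones, gamma):
    """Double DQN: the online network selects the next action,
    the target network evaluates it."""
    with torch.no_grad():
        # Online network picks the greedy action for the next state
        next_actions = q_net(next_obs).argmax(dim=1, keepdim=True)
        # Target network supplies the value of exactly that action
        next_q = q_net_target(next_obs).gather(1, next_actions)
        # One-step bootstrapped target, cut off at episode ends
        return rewards + (1.0 - dones) * gamma * next_q
\end{verbatim}
In standard DQN both the selection and the evaluation use the target network, which is the part I
will replace.
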
In addition to DQN and Double DQN, I will also train PPO using the standard implementation in
\texttt{stable-baselines3}.
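For DQN and PPO this amounts to the standard \texttt{stable-baselines3} calls, reusing the
environment from the sketch above; the exploration schedule and timestep counts below are
placeholders:
\begin{verbatim}
from stable_baselines3 import DQN, PPO

# DQN with a linear epsilon decay over the first 20% of training
dqn_model = DQN(
    "MlpPolicy",
    env,
    exploration_fraction=0.2,
    exploration_initial_eps=1.0,
    exploration_final_eps=0.05,
    verbose=1,
)
dqn_model.learn(total_timesteps=100_000)

# PPO with the standard implementation and default hyperparameters
ppo_model = PPO("MlpPolicy", env, verbose=1)
ppo_model.learn(total_timesteps=100_000)
\end{verbatim}
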
After training, I will evaluate all models using backtesting and performance metrics such as total
profit, Sharpe ratio, and maximum drawdown. Later, I plan to extend the observation space with
technical indicators, volume data, or sentiment features. For technical indicators, I will use the
\texttt{pandas-ta} library since it is easy to install, integrates well with \texttt{pandas}, and
provides a wide range of indicators sufficient for prototyping and research. Alternatively,
\texttt{TA-Lib} is an option if higher performance is needed, but it has more complex installation
requirements.
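For the evaluation metrics, I will most likely compute them directly from the equity curve recorded
during backtesting, roughly along these lines (risk-free rate assumed to be zero, 252 trading days
per year):
\begin{verbatim}
import numpy as np

def sharpe_ratio(step_returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-step returns (risk-free rate = 0)."""
    r = np.asarray(step_returns)
    if r.std() == 0:
        return 0.0
    return np.sqrt(periods_per_year) * r.mean() / r.std()

def max_drawdown(equity_curve):
    """Largest peak-to-trough drop of the account value, as a fraction."""
    equity = np.asarray(equity_curve)
    running_peak = np.maximum.accumulate(equity)
    return ((equity - running_peak) / running_peak).min()

# Total profit is simply the last value of the equity curve minus the first.
\end{verbatim}
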
After adding these features, I will retrain the models and compare their performance again.
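A sketch of how these extra features could be prepared with \texttt{pandas-ta}; the chosen
indicators and column names are only examples, and feeding them into the agent will require
overriding \texttt{gym-anytrading}'s signal-feature extraction:
\begin{verbatim}
import pandas_ta as ta  # importing registers the DataFrame ".ta" accessor

# df is the OHLCV DataFrame from yfinance used above
df.ta.rsi(length=14, append=True)   # adds an "RSI_14" column
df.ta.sma(length=20, append=True)   # adds an "SMA_20" column
df.ta.macd(append=True)             # adds MACD_* columns

# Candidate observation features for the extended environment
feature_columns = ["Close", "Volume", "RSI_14", "SMA_20"]
signal_features = df[feature_columns].dropna().to_numpy()
\end{verbatim}
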
\section{Development Plan}

Depending on exactly when my presentation is scheduled, I have about 9--10 weeks of time.

\subsection{Week 1--3}
I want to integrate the DQN algorithm as a first example and already train it on historical data.

\subsection{Week 4--6}
I plan to implement the other RL algorithms and their variations and evaluate which works best. I
also plan to change the reward function.

\subsection{Week 7 to the presentation}
Add the technical indicators and market volume to the environment. If I have enough time left, I
can try news analysis.

\section{Availability}

I am on vacation from 04.08 to 13.08. On the 15th I am at an event, but I have time on the 14th.