LLM that gives web-scraped articles a score for how good the news is for a company. Based on that
information, I need to re-evaluate which algorithm is the best.
\section{Libraries and Tools}
The project will be implemented in Python using \texttt{gym-anytrading} to build the trading
environment. For initial experiments, I will use the built-in datasets from
\texttt{gym\_anytrading.datasets} such as \texttt{STOCKS\_GOOGL}, and later switch to real
historical stock data via \texttt{yfinance}.
The reinforcement learning algorithms will be implemented using the \texttt{stable-baselines3}
library. I will start with the standard DQN algorithm and experiment with different epsilon decay
strategies. Since \texttt{stable-baselines3} does not directly support Double DQN, I plan to modify
the DQN implementation myself. Specifically, I will adjust the target calculation so that the action
is selected using the online network but evaluated using the target network, as required in Double
DQN. This will allow me to better understand the internal workings of the algorithm and directly
control its behavior.
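The intended change can be illustrated independently of \texttt{stable-baselines3}. The following pure-Python sketch (the function name and list-based inputs are mine, for illustration only) shows the Double DQN target: the greedy action is selected from the online network's Q-values but evaluated with the target network's Q-values:

```python
def double_dqn_targets(rewards, dones, gamma, q_online_next, q_target_next):
    """Double DQN targets for a batch of transitions.

    q_online_next / q_target_next: per-transition lists of Q-values for the
    next state, from the online and target networks respectively.
    """
    targets = []
    for r, d, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        a = max(range(len(q_on)), key=q_on.__getitem__)  # online net selects...
        targets.append(r + (1.0 - d) * gamma * q_tg[a])  # ...target net evaluates
    return targets
```

In standard DQN both steps use the target network (a plain maximum over its Q-values); that is the part of the target computation the modification replaces.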
In addition to DQN and Double DQN, I will also train PPO using the standard implementation in
\texttt{stable-baselines3}.
After training, I will evaluate all models using backtesting and performance metrics such as total
profit, Sharpe ratio, and maximum drawdown. Later, I plan to extend the observation space with
technical indicators, volume data, or sentiment features. For technical indicators, I will use the
\texttt{pandas-ta} library since it is easy to install, well integrated with \texttt{pandas}, and
provides a wide range of indicators sufficient for prototyping and research. Alternatively,
\texttt{TA-Lib} is an option if higher performance is needed, but it has more complex installation
requirements.
After adding these features, I will retrain the models and compare their performance again.
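The comparison itself can rely on small, self-contained helpers. A sketch of two of the metrics named above, assuming a risk-free rate of zero and an annualisation factor of 252 trading days:

```python
import numpy as np

def sharpe_ratio(step_returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-step returns (risk-free rate 0)."""
    r = np.asarray(step_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def max_drawdown(equity_curve):
    """Largest peak-to-trough loss of an equity curve, as a fraction of the peak."""
    eq = np.asarray(equity_curve, dtype=float)
    running_peak = np.maximum.accumulate(eq)
    return ((running_peak - eq) / running_peak).max()
```

Total profit comes directly from the environment's episode info, so no helper is needed for it.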
\section{Development Plan}
Depending on when exactly my presentation is scheduled, I have about 9--10 weeks.
\subsection{Weeks 1--3}
I want to integrate the DQN algorithm as a first baseline and train it on historical data.
\subsection{Weeks 4--6}
I plan to implement the remaining RL algorithms and their variations and evaluate which works best. I
will also experiment with changes to the reward function.
\subsection{Week 7 until the presentation}
I will add the technical indicators and market volume to the environment. If there is time left, I will
try news analysis.
\section{Availability}
I am on vacation from 04.08 to 13.08. On 15.08 I am at an event, but I am available on 14.08.