project_proposal finished
@ -44,8 +44,46 @@ LLM which gives web-scraped articles a score for how good the news is for a company
information, I need to reevaluate which algorithm is the best.

\section{Libraries and Tools}
The project will be implemented in Python using \texttt{gym-anytrading} to build the trading environment. For initial experiments, I will use the built-in datasets from \texttt{gym\_anytrading.datasets}, such as \texttt{STOCKS\_GOOGL}, and later switch to real historical stock data via \texttt{yfinance}.
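
A minimal sketch of the intended setup, assuming a recent \texttt{gym-anytrading} release that registers its environments with \texttt{gymnasium}; the \texttt{window\_size} and \texttt{frame\_bound} values are placeholders, not final choices:

\begin{verbatim}
import gymnasium as gym
import gym_anytrading  # importing registers "stocks-v0" and "forex-v0"
from gym_anytrading.datasets import STOCKS_GOOGL

# Built-in GOOGL dataset for the first experiments; window_size is the
# number of past bars that form one observation.
env = gym.make(
    "stocks-v0",
    df=STOCKS_GOOGL,
    window_size=10,
    frame_bound=(10, 300),
)
obs, info = env.reset()
\end{verbatim}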

The reinforcement learning algorithms will be implemented using the \texttt{stable-baselines3} library. I will start with the standard DQN algorithm and experiment with different epsilon-decay strategies. Since \texttt{stable-baselines3} does not directly support Double DQN, I plan to modify the DQN implementation myself. Specifically, I will adjust the target calculation so that the action is selected with the online network but evaluated with the target network, as required by Double DQN. This will let me better understand the internal workings of the algorithm and directly control its behavior.
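
A sketch of the changed target computation, written as a standalone function; the names \texttt{q\_net} and \texttt{q\_net\_target} follow \texttt{stable-baselines3}'s DQN (online and target network, respectively), but the surrounding training loop is omitted:

\begin{verbatim}
import torch

def double_dqn_target(q_net, q_net_target, rewards,
                      next_observations, dones, gamma):
    """Double DQN target: the next action is chosen by the ONLINE
    network, but its value is read from the TARGET network (plain DQN
    instead takes the max over the target network's Q-values)."""
    with torch.no_grad():
        next_actions = q_net(next_observations).argmax(dim=1, keepdim=True)
        next_q = q_net_target(next_observations).gather(1, next_actions)
        # Bootstrapped one-step target; (1 - dones) masks terminal states.
        return rewards + (1.0 - dones) * gamma * next_q
\end{verbatim}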

In addition to DQN and Double DQN, I will also train PPO using the standard implementation in \texttt{stable-baselines3}.
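
Training PPO is then only a few lines; this sketch assumes \texttt{env} is the trading environment created above, and the timestep budget and filename are placeholders:

\begin{verbatim}
from stable_baselines3 import PPO

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("ppo_trading")  # illustrative filename
\end{verbatim}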

After training, I will evaluate all models using backtesting and performance metrics such as total profit, Sharpe ratio, and maximum drawdown. Later, I plan to extend the observation space with technical indicators, volume data, or sentiment features. For technical indicators, I will use the \texttt{pandas-ta} library, since it is easy to install, integrates well with \texttt{pandas}, and provides a wide enough range of indicators for prototyping and research. Alternatively, \texttt{TA-Lib} is an option if higher performance is needed, but it has more complex installation requirements.
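
A sketch of both steps, assuming \texttt{pandas-ta}'s convention of appending named indicator columns (e.g.\ \texttt{RSI\_14}, \texttt{SMA\_20}) and an equity curve produced by backtesting; the ticker and date range are placeholders:

\begin{verbatim}
import numpy as np
import pandas as pd
import pandas_ta as ta  # importing registers the DataFrame .ta accessor
import yfinance as yf

df = yf.download("GOOGL", start="2020-01-01", end="2023-01-01")
# (Newer yfinance versions may return MultiIndex columns; flatten if needed.)
df.ta.rsi(length=14, append=True)  # appends an RSI_14 column
df.ta.sma(length=20, append=True)  # appends an SMA_20 column

def sharpe_ratio(returns: pd.Series, periods_per_year: int = 252) -> float:
    # Annualised ratio of mean per-period return to its volatility.
    return np.sqrt(periods_per_year) * returns.mean() / returns.std()

def max_drawdown(equity: pd.Series) -> float:
    # Largest peak-to-trough loss of the equity curve, as a fraction.
    return (equity / equity.cummax() - 1.0).min()
\end{verbatim}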

After adding these features, I will retrain the models and compare their performance again.

\section{Development plan}
Depending on when exactly my presentation is scheduled, I have about 9--10 weeks.

\subsection{Week 1--3}
I want to integrate the DQN algorithm as a first example and train it on historical data.

\subsection{Week 4--6}
I plan to implement the other RL algorithms and their variations and evaluate which works best. I also want to change the reward function, as sketched below.
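
A minimal sketch of a custom reward, assuming \texttt{gym-anytrading}'s convention that subclasses override \texttt{\_calculate\_reward}; the scaling factor is a placeholder, not a designed reward:

\begin{verbatim}
from gym_anytrading.envs import StocksEnv

class CustomRewardEnv(StocksEnv):
    def _calculate_reward(self, action):
        # Placeholder shaping: reuse the default price-difference
        # reward and rescale it; the real reward design goes here.
        return 0.1 * super()._calculate_reward(action)
\end{verbatim}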

\subsection{Week 7 to the presentation}
Add the technical indicators and market volume to the environment, extending the observation space as sketched below. If time remains, I can try news analysis.
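
A sketch of the extended observation space, assuming \texttt{gym-anytrading}'s convention that \texttt{\_process\_data} returns \texttt{(prices, signal\_features)}; the column names assume the \texttt{pandas-ta} indicators added earlier:

\begin{verbatim}
import numpy as np
from gym_anytrading.envs import StocksEnv

class IndicatorEnv(StocksEnv):
    def _process_data(self):
        prices, signal_features = super()._process_data()
        # Slice the extra columns to the same window the base class uses.
        start = self.frame_bound[0] - self.window_size
        end = self.frame_bound[1]
        extra = self.df[["Volume", "RSI_14", "SMA_20"]].to_numpy()[start:end]
        return prices, np.column_stack((signal_features, extra))
\end{verbatim}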

\section{Availability}
I am on vacation from 04.08 to 13.08. On the 15th I am at an event, but I am available on the 14th.