project_proposal finished
LLM which gives web-scraped articles a score for how good the news is for a company

information, I need to reevaluate which algorithm is the best.

\section{Libraries and Tools}

The project will be implemented in Python using \texttt{gym-anytrading} to build the trading
environment. For initial experiments, I will use the built-in datasets from
\texttt{gym\_anytrading.datasets} such as \texttt{STOCKS\_GOOGL}, and later switch to real
historical stock data via \texttt{yfinance}.
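As a rough sketch, the environment setup could look as follows, assuming a recent
\texttt{gym-anytrading} release based on \texttt{gymnasium}; the \texttt{window\_size},
\texttt{frame\_bound}, ticker, and date range are placeholders I still need to tune:
\begin{verbatim}
import gymnasium as gym
import gym_anytrading  # registers the "stocks-v0" environment
from gym_anytrading.datasets import STOCKS_GOOGL
import yfinance as yf

# First experiments: built-in GOOGL dataset shipped with gym-anytrading
env = gym.make(
    "stocks-v0",
    df=STOCKS_GOOGL,        # OHLCV DataFrame
    window_size=10,         # past time steps visible in one observation
    frame_bound=(10, 300),  # slice of the data used for this run
)

# Later: real historical data via yfinance (same OHLCV column layout;
# depending on the yfinance version, the columns may need flattening)
df = yf.download("GOOGL", start="2015-01-01", end="2024-01-01")
env = gym.make("stocks-v0", df=df, window_size=10,
               frame_bound=(10, len(df)))
\end{verbatim}
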
The reinforcement learning algorithms will be implemented using the \texttt{stable-baselines3}
library. I will start with the standard DQN algorithm and experiment with different epsilon-decay
strategies. Since \texttt{stable-baselines3} does not directly support Double DQN, I plan to modify
the DQN implementation myself. Specifically, I will adjust the target calculation so that the action
is selected using the online network but evaluated using the target network, as required by Double
DQN. This will allow me to better understand the internal workings of the algorithm and directly
control its behavior.
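The core of that change is small. A sketch of the intended target computation is shown below; the
function and argument names are my own, but they mirror the quantities used in
\texttt{stable-baselines3}'s DQN training step:
\begin{verbatim}
import torch

def double_dqn_targets(q_net, q_net_target, next_obs, rewards, dones, gamma):
    """Double DQN: the online network selects the next action,
    the target network evaluates it."""
    with torch.no_grad():
        # Online network picks the greedy action for the next state
        next_actions = q_net(next_obs).argmax(dim=1, keepdim=True)
        # Target network supplies the value of exactly that action
        next_q = q_net_target(next_obs).gather(1, next_actions)
        # One-step bootstrapped target, cut off at episode ends
        return rewards + (1.0 - dones) * gamma * next_q
\end{verbatim}
In standard DQN both the selection and the evaluation use the target network, which is the part I
will replace.
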
In addition to DQN and Double DQN, I will also train PPO using the standard implementation in
\texttt{stable-baselines3}.
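For DQN and PPO this amounts to the standard \texttt{stable-baselines3} calls, reusing the
environment from the sketch above; the exploration schedule and timestep counts below are
placeholders:
\begin{verbatim}
from stable_baselines3 import DQN, PPO

# DQN with a linear epsilon decay over the first 20% of training
dqn_model = DQN(
    "MlpPolicy",
    env,
    exploration_fraction=0.2,
    exploration_initial_eps=1.0,
    exploration_final_eps=0.05,
    verbose=1,
)
dqn_model.learn(total_timesteps=100_000)

# PPO with the standard implementation and default hyperparameters
ppo_model = PPO("MlpPolicy", env, verbose=1)
ppo_model.learn(total_timesteps=100_000)
\end{verbatim}
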
After training, I will evaluate all models using backtesting and performance metrics such as total
profit, Sharpe ratio, and maximum drawdown. Later, I plan to extend the observation space with
technical indicators, volume data, or sentiment features. For technical indicators, I will use the
\texttt{pandas-ta} library since it is easy to install, integrates well with \texttt{pandas}, and
provides a wide range of indicators sufficient for prototyping and research. Alternatively,
\texttt{TA-Lib} is an option if higher performance is needed, but it has more complex installation
requirements.
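For the evaluation metrics, I will most likely compute them directly from the equity curve recorded
during backtesting, roughly along these lines (risk-free rate assumed to be zero, 252 trading days
per year):
\begin{verbatim}
import numpy as np

def sharpe_ratio(step_returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-step returns (risk-free rate = 0)."""
    r = np.asarray(step_returns)
    if r.std() == 0:
        return 0.0
    return np.sqrt(periods_per_year) * r.mean() / r.std()

def max_drawdown(equity_curve):
    """Largest peak-to-trough drop of the account value, as a fraction."""
    equity = np.asarray(equity_curve)
    running_peak = np.maximum.accumulate(equity)
    return ((equity - running_peak) / running_peak).min()

# Total profit is simply the last value of the equity curve minus the first.
\end{verbatim}
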
After adding these features, I will retrain the models and compare their performance again.
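A sketch of how these extra features could be prepared with \texttt{pandas-ta}; the chosen
indicators and column names are only examples, and feeding them into the agent will require
overriding \texttt{gym-anytrading}'s signal-feature extraction:
\begin{verbatim}
import pandas_ta as ta  # importing registers the DataFrame ".ta" accessor

# df is the OHLCV DataFrame from yfinance used above
df.ta.rsi(length=14, append=True)   # adds an "RSI_14" column
df.ta.sma(length=20, append=True)   # adds an "SMA_20" column
df.ta.macd(append=True)             # adds MACD_* columns

# Candidate observation features for the extended environment
feature_columns = ["Close", "Volume", "RSI_14", "SMA_20"]
signal_features = df[feature_columns].dropna().to_numpy()
\end{verbatim}
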
\section{Development Plan}

Depending on exactly when my presentation is scheduled, I have about 9--10 weeks of time.

\subsection{Week 1--3}
I want to integrate the DQN algorithm as a first example and already train it on historical data.

\subsection{Week 4--6}
I plan to implement the other RL algorithms and their variations and evaluate which works best. I
also plan to change the reward function.

\subsection{Week 7 to the presentation}
Add the technical indicators and market volume to the environment. If I have enough time left, I
can try news analysis.

\section{Availability}

I am on vacation from 04.08 to 13.08. On the 15th I am at an event, but I have time on the 14th.