Research · Springer 2022 · Co-Author · Not on GitHub

Deep RL Stock Trading

Springer 2022 — Co-Author — DRL with short selling

Rutgers

6 technologies

2 key decisions

4 results

Problem

Most deep RL stock trading research restricts agents to long-only positions: buy, hold, or sell what you own. Real trading also includes short selling, which requires different risk management and introduces asymmetric reward dynamics, since a long position can lose at most its purchase cost while a short position's loss is unbounded as the price rises. The research question: can a DRL agent learn profitable long-short strategies, and how does allowing short positions affect policy behavior and risk-adjusted returns?

Approach

We built a custom OpenAI Gym environment modeling a trading account with margin requirements for short positions. The action space has three position states per asset: long, flat, or short. The state combines a window of price history, technical indicators (RSI, MACD, Bollinger Bands), and the current portfolio state. We trained and evaluated multiple DRL algorithms (DQN, PPO, A3C) with and without short-selling capability, measuring cumulative return, Sharpe ratio, and maximum drawdown across multiple market regimes.
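
To make the long/flat/short action space and the margin check concrete, here is a minimal single-asset sketch. It is not the environment from the paper: the class name `LongShortEnv`, the `margin_rate` parameter, the one-unit position size, and the price-only observation are all assumptions made for illustration, and the class only mirrors the Gym `reset`/`step` interface rather than subclassing `gym.Env`.

```python
# Target position for each action index: 0 -> short, 1 -> flat, 2 -> long.
POSITIONS = (-1, 0, 1)


class LongShortEnv:
    """Single-asset, Gym-style trading environment (illustrative sketch only)."""

    def __init__(self, prices, window=3, margin_rate=0.5, cash=1000.0):
        self.prices = list(prices)
        self.window = window            # number of past prices in the observation
        self.margin_rate = margin_rate  # hypothetical: equity needed per unit shorted
        self.initial_cash = cash

    def reset(self):
        self.t = self.window - 1
        self.position = 0
        self.equity = self.initial_cash
        return self._obs()

    def _obs(self):
        # Observation: recent price window followed by the current position.
        start = self.t - self.window + 1
        return self.prices[start:self.t + 1] + [self.position]

    def step(self, action):
        target = POSITIONS[action]
        price = self.prices[self.t]
        # Margin check: stay flat if equity cannot cover the short requirement.
        if target == -1 and self.equity < self.margin_rate * price:
            target = 0
        self.position = target
        self.t += 1
        # Reward: one-step profit or loss from holding one unit of the position.
        pnl = self.position * (self.prices[self.t] - price)
        self.equity += pnl
        done = self.t >= len(self.prices) - 1 or self.equity <= 0
        return self._obs(), pnl, done, {"equity": self.equity}
```

A short position earns a positive reward when the next price is lower, which is the mirror image of the long case. A fuller version would append the technical indicators to the observation and handle multiple assets and position sizing.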

Architecture

Deep RL Stock Trading — system diagram

Key Technical Decisions

Sharpe ratio as secondary reward signal

Optimizing only for cumulative return produced agents with extremely high volatility. Adding a Sharpe ratio component to the reward steered agents toward risk-adjusted returns. The risk-tolerance weighting became a hyperparameter, and varying it produced a meaningful spectrum of agent personalities.
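
One way such a blended reward could look is sketched below. The exact formula used in the paper is not given here, so the linear blend, the rolling `lookback` window, the clipping bound, and the name `SharpeShapedReward` are all assumptions for illustration.

```python
import math
from collections import deque


class SharpeShapedReward:
    """Blend the raw step return with a rolling Sharpe ratio (illustrative)."""

    def __init__(self, risk_weight=0.5, lookback=20, clip=10.0, eps=1e-8):
        self.risk_weight = risk_weight  # 0.0 = pure return, 1.0 = pure Sharpe term
        self.clip = clip                # bound on the Sharpe term (assumed)
        self.eps = eps                  # avoids division by zero volatility
        self.returns = deque(maxlen=lookback)

    def __call__(self, step_return):
        self.returns.append(step_return)
        n = len(self.returns)
        sharpe = 0.0
        if n >= 2:
            mean = sum(self.returns) / n
            var = sum((r - mean) ** 2 for r in self.returns) / (n - 1)
            sharpe = mean / (math.sqrt(var) + self.eps)
            # Clip so near-constant return streams do not dominate the reward.
            sharpe = max(-self.clip, min(self.clip, sharpe))
        return (1.0 - self.risk_weight) * step_return + self.risk_weight * sharpe
```

Sweeping `risk_weight` from 0 to 1 moves the objective from pure return toward pure risk-adjusted return, which is the kind of knob the risk-tolerance weighting describes.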

Multiple DRL algorithms for comparison

Rather than championing a single algorithm, the paper provides comparative analysis across DQN, PPO, and A3C. This empirically grounded the finding that policy gradient methods (PPO, A3C) adapt more gracefully to the non-stationary nature of financial time series than value-based DQN.

Results

✓ Short-selling capability improved risk-adjusted returns (Sharpe ratio) vs. long-only agents
✓ PPO outperformed DQN and A3C on out-of-sample test periods
✓ Published in Springer 2022 as co-author
✓ Custom trading environment handles margin requirements and position sizing
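
For reference, the three evaluation metrics named in the Approach section (cumulative return, Sharpe ratio, maximum drawdown) can be computed from an equity curve roughly as follows. This is a generic sketch, not the paper's evaluation code: the 252-period annualization and the zero default risk-free rate are assumptions, and the function names are illustrative.

```python
import math


def period_returns(equity):
    # Simple return between consecutive points on the equity curve.
    return [b / a - 1.0 for a, b in zip(equity, equity[1:])]


def cumulative_return(equity):
    return equity[-1] / equity[0] - 1.0


def sharpe_ratio(equity, periods_per_year=252, risk_free=0.0):
    # Annualized Sharpe ratio of excess per-period returns (sample std).
    rets = [r - risk_free / periods_per_year for r in period_returns(equity)]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    std = math.sqrt(var)
    if std == 0.0:
        return 0.0  # undefined for zero volatility; report 0 by convention here
    return math.sqrt(periods_per_year) * mean / std


def max_drawdown(equity):
    # Largest peak-to-trough decline, as a fraction of the running peak.
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst
```

Calling these on the equity series of a long-only agent and a long-short agent over the same test period gives a like-for-like comparison of the kind summarized above.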

Tech Stack

PyTorch, OpenAI Gym, Python, Pandas, NumPy, yfinance

Links

Springer 2022