Find the Optimal Stock Portfolio with a Genetic Algorithm

Published: May 10, 2024

Note: This article was published on Medium. You can find the original article here.

If, like me, you are scratching your head about how to invest your money, this article might be for you!

This article is not investment advice, so please use it at your own discretion. I accept no liability for your financial decisions.

Background

Is there a way to invest in a smaller group of stocks to obtain an above-average risk-adjusted return? For example, can we identify 10 stocks (e.g. 1. Tesla + 2. IBM + 3. Chevron + … + 10. GameStop) that may each perform averagely on their own, but yield strong results when held together?

If we were to try all 10-stock combinations from the S&P 500, that would be roughly 2.46 × 10^20 (about 246 quintillion) combinations, and finding the “optimal” 10-stock portfolio by brute force would take an impractical amount of time. In such a scenario, a Genetic Algorithm comes in handy, as it takes significantly less time to find a “good-enough” solution.
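As a quick sanity check on that count, here is a small sketch that computes the number of 10-stock combinations, assuming the index holds exactly 500 stocks. The evaluation speed of one million portfolios per second is a made-up figure, used only to put the number in perspective.

```python
import math

# Number of ways to choose 10 stocks out of 500, ignoring order.
n_portfolios = math.comb(500, 10)
print(f"{n_portfolios:.3e} possible portfolios")

# Hypothetical evaluation speed, chosen only for illustration.
evals_per_second = 1_000_000
seconds_per_year = 60 * 60 * 24 * 365
years_needed = n_portfolios / evals_per_second / seconds_per_year
print(f"{years_needed:,.0f} years at {evals_per_second:,} evaluations per second")
```

Even at that optimistic speed, an exhaustive search would run for millions of years.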

Problem Statement

We want to find the 10-stock combination, from the stocks listed in the S&P 500, that yields the best risk-adjusted return. Return is the % growth of the portfolio over the investment period, while risk is the standard deviation observed over the same period. Risk-adjusted return is simply return divided by risk.
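The definition above can be sketched in a few lines of plain Python. Two things here are assumptions rather than details taken from the original code: risk is computed as the sample standard deviation of day-over-day percentage changes, and the portfolio values are made up for illustration.

```python
import statistics

def risk_adjusted_return(values):
    """Return (return_pct, risk_pct, ratio) for a series of portfolio values."""
    # Return: percentage growth from the first to the last value.
    total_return = (values[-1] / values[0] - 1) * 100
    # Risk (assumption): sample standard deviation of day-over-day % changes.
    daily_changes = [(b / a - 1) * 100 for a, b in zip(values, values[1:])]
    risk = statistics.stdev(daily_changes)
    return total_return, risk, total_return / risk

# Made-up portfolio values, for illustration only.
values = [100.0, 101.0, 100.5, 102.0, 103.5, 103.0, 105.0]
ret, risk, ratio = risk_adjusted_return(values)
print(f"return={ret:.2f}%  risk={risk:.2f}%  ratio={ratio:.2f}")
```

A portfolio that climbs steadily scores higher than one that reaches the same end value through large swings, which is exactly the behavior we want the search to reward.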

Method

We will use a Genetic Algorithm to iterate through 10-stock combinations and search for the best risk-adjusted return. The video below gives a good overview of Genetic Algorithms.

At a high level, Genetic Algorithms simulate evolution by iterating through generations of portfolios and keeping only the ones with strong “fitness” (here, the risk-adjusted return of a portfolio). Mechanisms like “crossover” (combining 2 existing stock portfolios to generate a new one) and “mutation” (randomly swapping a stock in a portfolio with another one) introduce diversity into the search and help prevent the model from getting stuck in local optima.

We will use the PyGAD library in Python, which makes genetic algorithms easy to implement.
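To make crossover and mutation concrete, here is a library-independent sketch in plain Python. It is not PyGAD's internal implementation: the union-based crossover, the function names, and the universe size of 500 are all assumptions chosen so that a portfolio never holds the same stock twice.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Build a child portfolio by drawing stocks from the union of two parents."""
    pool = sorted(set(parent_a) | set(parent_b))
    return sorted(rng.sample(pool, len(parent_a)))

def mutate(portfolio, n_stocks, rng):
    """Swap one randomly chosen stock for a stock not yet in the portfolio."""
    child = list(portfolio)
    candidates = [i for i in range(n_stocks) if i not in child]
    child[rng.randrange(len(child))] = rng.choice(candidates)
    return sorted(child)

rng = random.Random(0)   # fixed seed so the run is repeatable
n_stocks = 500           # assumed universe size
parent_a = rng.sample(range(n_stocks), 10)
parent_b = rng.sample(range(n_stocks), 10)
crossed = crossover(parent_a, parent_b, rng)
child = mutate(crossed, n_stocks, rng)
print(child)
```

Portfolios are represented as lists of integer stock indices, which is why each ticker needs an index before it can be handed to the algorithm.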

Data

We get the list of stocks in the S&P 500 (see the end for the data source) and query Yahoo Finance for the historical performance of each stock on the list. All stocks are then combined into one master database.

We arbitrarily fix our investment period to April 2021 through March 2022. “Adjusted close” is used as the stock price, and all stock prices are normalized to $10 based on their price at the beginning of the investment period. This is because we want to invest equally ($10 per stock) in each stock and not be swayed by the nominal share price. Each stock ticker is assigned a stock index, as this is required by the Genetic Algorithm library we will be using. Below is what the data looks like when it is fed to the Genetic Algorithm:

Final data
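The normalization and indexing steps described above can be sketched as follows. The tickers and prices are placeholders, and the helper name `normalize_to` is hypothetical; the actual code in the repository may be organized differently.

```python
def normalize_to(prices, start_value=10.0):
    """Rescale a price series so its first entry equals start_value."""
    factor = start_value / prices[0]
    return [p * factor for p in prices]

# Made-up adjusted-close prices; tickers and numbers are placeholders.
adj_close = {
    "AAA": [50.0, 55.0, 60.0],
    "BBB": [200.0, 190.0, 210.0],
}
normalized = {t: normalize_to(p) for t, p in adj_close.items()}

# Stock indices, since the GA works with integers rather than ticker strings.
ticker_to_index = {t: i for i, t in enumerate(sorted(normalized))}

# Value of an equal-weight portfolio on each day: sum of the normalized prices.
portfolio_value = [sum(day) for day in zip(*normalized.values())]
print(ticker_to_index)
print(portfolio_value)
```

After normalization, a $50 stock and a $200 stock both start at $10, so summing the series is equivalent to investing the same amount in each.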

Training

As a GA has many parameters (number of generations, number of solutions per population, percentage of mutating genes, etc.), we do a quick hyperparameter optimization to find the best parameters. Then we train the algorithm to find the best stock portfolio. As seen in the chart below, the algorithm doubles the fitness (a.k.a. risk-adjusted return) within about 20 generations, after which the fitness no longer improves.
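To illustrate how fitness typically rises quickly and then flattens, here is a toy training loop in plain Python. Everything in it is an assumption made for illustration: the 50-stock universe, the random per-stock scores standing in for real risk-adjusted returns, and the parameter values. It does not use PyGAD and will not reproduce the article's numbers.

```python
import random

rng = random.Random(42)
N_STOCKS, PORTFOLIO_SIZE = 50, 5                  # small toy universe
scores = [rng.random() for _ in range(N_STOCKS)]  # stand-in for real data

def fitness(portfolio):
    # Toy fitness; in the article this would be the risk-adjusted return.
    return sum(scores[i] for i in portfolio)

def make_child(a, b):
    # Crossover: sample from the parents' union. Mutation: swap one stock.
    child = rng.sample(sorted(set(a) | set(b)), PORTFOLIO_SIZE)
    if rng.random() < 0.2:                        # assumed mutation probability
        new = rng.choice([i for i in range(N_STOCKS) if i not in child])
        child[rng.randrange(PORTFOLIO_SIZE)] = new
    return child

def run_ga(generations=30, pop_size=20, n_keep=5):
    population = [rng.sample(range(N_STOCKS), PORTFOLIO_SIZE)
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))    # best fitness this generation
        parents = population[:n_keep]             # elitism: the best survive
        children = [make_child(*rng.sample(parents, 2))
                    for _ in range(pop_size - n_keep)]
        population = parents + children
    return history

history = run_ga()
print([round(h, 3) for h in history])
```

Because the best portfolios are carried over unchanged, the best fitness can never decrease from one generation to the next. The arguments of `run_ga` are the kind of hyperparameters one would vary in a quick search.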

Results - Benchmark 1: S&P 500

First, since our goal was to improve on the performance of the S&P 500 market index, let’s take a look at how the index performed during the investment time frame:

The S&P 500’s performance was unremarkable during this period: ~7% annual return and ~1% daily risk (standard deviation), which yields a risk-adjusted return of ~8%. As this benchmark is too easy to beat, let’s find ourselves another one:

Results - Benchmark 2: Top 10 performers

For the second benchmark, we selected the 10 stocks that achieved the highest return during the investment time frame (note that this is ONLY return, which does not take “risk” into account). The top 10 stocks were ‘DVN’, ‘APA’, ‘MRO’, ‘COP’, ‘CF’, ‘FTNT’, ‘OXY’, ‘FANG’, ‘NVDA’, and ‘MOS’. Most of these companies are in energy-related businesses, with exceptions like NVIDIA.

As seen above, the portfolio of these 10 companies returned ~90% growth with ~2% risk. Overall, the risk-adjusted return was 42.8%.
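The selection step behind this benchmark is a plain ranking by return. A minimal sketch, using placeholder tickers and made-up prices rather than the real data:

```python
def pct_return(prices):
    """Percentage growth from the first to the last price."""
    return (prices[-1] / prices[0] - 1) * 100

# Made-up price series; tickers and numbers are placeholders.
prices = {
    "AAA": [10.0, 12.0, 19.0],
    "BBB": [10.0, 9.0, 11.0],
    "CCC": [10.0, 15.0, 14.0],
    "DDD": [10.0, 10.5, 9.5],
}

def top_performers(prices, k):
    # Rank tickers by return alone; risk is deliberately ignored here.
    ranked = sorted(prices, key=lambda t: pct_return(prices[t]), reverse=True)
    return ranked[:k]

print(top_performers(prices, 2))
```

Since this ranking ignores volatility entirely, a portfolio built this way can post a high return and still lose on a risk-adjusted basis.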

Our Results

Our model identified the combination of ‘CERN’, ‘DVN’, ‘DRE’, ‘ABBV’, ‘SEE’, ‘ORLY’, ‘WST’, ‘COP’, ‘ED’, and ‘PSA’ as the portfolio with the maximum risk-adjusted return. Some of these companies (‘DVN’ and ‘COP’) also appear in Benchmark 2, the top performers, but we see new entries such as AbbVie.

The model’s raw return, at around 52%, is below that of the top performers. However, it achieves a significantly lower risk: 0.8%. As you can see in the above graph, the portfolio experiences no large swings and keeps increasing in a consistent manner. Overall, the portfolio achieves a risk-adjusted return of 62%, about 20 percentage points above the top performers’ portfolio.

Conclusion

We trained a genetic algorithm to find the portfolio with the maximum risk-adjusted return. The genetic algorithm helped us find a “good enough” solution among the roughly 246 quintillion possible portfolios. Higher risk-adjusted returns may be achievable with further training or tuning.

You can find the data and code in this GitHub Repository.

Happy hacking!
