Compare commits

...

66 Commits

Author SHA1 Message Date
burkkyy 9fde484d2f flashcards how to 2026-06-24 17:42:09 -07:00
burkkyy 0d27ec7dea More progression 2026-06-24 00:25:59 -07:00
burkkyy d109a5adf8 Part 1 prog 2026-06-23 01:55:09 -07:00
burkkyy 2ccde288ff part 1 progress 2026-06-22 18:16:36 -07:00
burkkyy dc22ebd3fa part1 fixes 2026-06-22 17:58:42 -07:00
burkkyy 669271901d Started new tutorial. Added a bunch of python packages such as pytorch 2026-06-22 17:13:57 -07:00
burkkyy c965a11c7d Moving stuff around 2026-06-22 16:38:14 -07:00
burkkyy b9018b3ae5 moving stuff around 2026-06-22 16:37:57 -07:00
burkkyy 4ab87d74c2 Trying out new directorys 2026-06-22 16:34:11 -07:00
burkkyy 9fc8299af9 some pages read 2026-06-22 12:28:20 -07:00
burkkyy d48f8094a7 more renaming 2026-06-22 01:27:19 -07:00
burkkyy f7a72f2cf5 more renaming 2026-06-22 01:26:17 -07:00
burkkyy cd86381692 renaming 2026-06-22 01:25:52 -07:00
burkkyy 8600d97d13 Moving done book to archive 2026-06-22 01:15:15 -07:00
burkkyy ecd9031f7d Done book 2026-06-22 01:14:35 -07:00
burkkyy bb37776915 README update 2026-06-22 01:14:23 -07:00
burkkyy 751ffc9cc3 adding potencial book to read 2026-06-21 17:03:45 -07:00
burkkyy 47e6513e37 big prog on 001 2026-06-18 01:34:12 -07:00
burkkyy be271e98f0 big prog 2026-06-17 01:44:14 -07:00
burkkyy 72fabfcb14 swapping reading order 2026-06-17 00:35:43 -07:00
burkkyy f6791e8aa5 Moving textbook to last read 2026-06-16 01:49:43 -07:00
burkkyy b2c1339e8b prog 2026-06-16 01:47:29 -07:00
burkkyy e19b8ac412 Moving around textbooks 2026-06-16 01:39:52 -07:00
burkkyy 702b0a1f58 Setting up more books 2026-06-16 01:37:56 -07:00
burkkyy c817c43072 got more stuff done 2026-06-15 00:50:24 -07:00
burkkyy 8de85110c3 fixed note 2026-06-15 00:32:30 -07:00
burkkyy 26fa6b9723 adding flashcards! 2026-06-15 00:31:45 -07:00
burkkyy 713576cc83 README update 2026-06-08 02:03:18 -07:00
burkkyy 3ca496fdfa more improved notes 2026-06-08 01:46:12 -07:00
burkkyy b2463fa2ab ch1 notes update 2026-06-05 01:26:45 -07:00
burkkyy a7d65a9c82 file renaming 2026-06-04 21:32:04 -07:00
burkkyy a32ab3d1ac updated naming schema 2026-06-04 21:29:33 -07:00
burkkyy d1b14d77f8 AI disclaimer 2026-06-04 21:27:53 -07:00
burkkyy 434881d109 Moving around stuff 2026-06-04 21:26:32 -07:00
burkkyy 4813288c97 Update README 2026-06-04 21:24:41 -07:00
burkkyy 2004dab515 More big changes 2026-06-04 21:22:50 -07:00
burkkyy d59927bbcb A TON OF NEW TEXTBOOKS 2026-06-04 21:16:45 -07:00
burkkyy be01665784 organizang books 2026-06-04 20:09:31 -07:00
burkkyy 091c7450ea moved textbook 2026-06-04 19:50:00 -07:00
burkkyy e955e56c36 renamed dir 2026-06-04 19:49:16 -07:00
burkkyy 77fd1312ef done reading chapter 3 2026-06-01 23:51:56 -07:00
burkkyy 8727808495 big progress on chapter 3 2026-06-01 01:05:08 -07:00
burkkyy 4cf8cfd079 small chapter 3 prog 2026-05-30 00:14:44 -07:00
burkkyy 25139c8008 adding todo 2026-05-29 02:00:30 -07:00
burkkyy 6ad73e1b7e more chapter 3 progress 2026-05-28 01:35:48 -07:00
burkkyy 96d5314edf Chapter 3 progress 2026-05-28 01:04:32 -07:00
burkkyy 90397250e6 test 2026-05-27 18:00:59 -07:00
burkkyy 00bd6e7220 note was incorrect 2026-05-27 17:52:03 -07:00
burkkyy 12668a41cc quick link to pdf 2026-05-26 13:31:37 -07:00
burkkyy a7a9e85ec0 more progress! 2026-05-26 01:46:35 -07:00
burkkyy f44358b1bd setting up chapter 3,4 2026-05-26 01:28:28 -07:00
burkkyy 3c8e3de2ea header rename 2026-05-26 01:22:23 -07:00
burkkyy 7be4bae6d1 main reading from chapter two glossed through 2026-05-26 01:21:53 -07:00
burkkyy 0a07c07746 re organizating 2026-05-26 00:50:36 -07:00
burkkyy d6c77d09c4 Setting up chapter 2 2026-05-26 00:48:29 -07:00
burkkyy 6b3d6a4d38 Ending reading of chapter 1 2026-05-26 00:44:57 -07:00
burkkyy 2b936d82c5 8 pages read 2026-05-26 00:39:57 -07:00
burkkyy c16372206c MORE chapter 1 progress 2026-05-25 23:13:21 -07:00
burkkyy 3959b06668 chapter 1 progress 2026-05-25 18:03:02 -07:00
burkkyy ab387f08ad moving dir 2026-05-24 10:15:06 -07:00
burkkyy 778c9d8b49 more books 2026-05-23 02:34:49 -07:00
burkkyy b8dfb0feed new textbook to eventually study 2026-05-23 02:19:26 -07:00
burkkyy 1d2d09faf7 chapter 1 major progress 2026-05-23 01:42:51 -07:00
burkkyy 5dba7cd2f7 added uv and dotenv 2026-05-22 17:01:27 -07:00
burkkyy 362679ff86 Moving stuff around 2026-05-22 16:55:45 -07:00
burkkyy d176120b7e chapter one 2026-05-19 23:33:50 -07:00
53 changed files with 123023 additions and 0 deletions
+6
@@ -0,0 +1,6 @@
.env
__pycache__/
*.pyc
*.pyo
cache/
*.parquet
+1
@@ -0,0 +1 @@
3.14
+70
@@ -4,3 +4,73 @@ My roadmap for becoming competent quant and capable of creating automated tradin
Current Goal: **Learn Quantitative Foundations**
## Study
Current reading [Introduction to Probability, Statistics, and Random Processes - Hossein Pishro-Nik](./textbooks/reading/20260622012450_Introduction%20to%20Probability,%20Statistics,%20and%20Random%20Processes/Introduction%20to%20Probability,%20Statistics,%20and%20Random%20Processes%20-%20Hossein%20Pishro-Nik.pdf) and [Algorithmic Trading](./textbooks/reading/20260622012451_Algorithmic%20Trading%20Winning%20Strategies%20and%20their%20rationale/Algorithmic%20Trading%20Winning%20Strategies%20and%20their%20rationale.pdf)
## Phases
Recommended path, generated by CLAUDE
### Phase 1 — Math Foundation (*)
1. [ ] Pishro-Nik (*)
2. [x] Quantitative Trading
3. [ ] Elementary Stochastic Calculus
4. [ ] A guide to Brownian motion
### Phase 2 — Get Practical Early
1. [ ] Algorithmic Trading — Chan ← read this soon; short, orients everything else
2. [ ] Analysis of Financial Time Series — Tsay ← essential for price/return modeling
### Phase 3 — Core Quant ML
1. [ ] Data-Driven Science and Engineering — Brunton & Kutz
2. [ ] Advances in Financial Machine Learning — Lopez de Prado ← most important book in your backlog
3. [ ] Stochastic Calculus: An Introduction with Applications ← now the theory lands better
4. [ ] Detecting Regime Change in Computational Finance
### Phase 4 — Portfolio, Risk & Systems
1. [ ] Active Portfolio Management — Grinold & Kahn
2. [ ] The Microstructure of Financial Markets — de Jong & Rindi
3. [ ] Trading Systems and Methods — Kaufman (use as reference)
4. [ ] The Mathematics of Money Management — Vince
5. [ ] The Leverage Space Trading Model — Vince
6. [ ] Testing and Tuning Market Trading Systems
### Phase 5 — Specialized / Optional
1. [ ] Assessing and Improving Prediction and Classification (C++ heavy, niche)
2. [ ] Trading on Sentiment (alt data / NLP, very specialized)
3. [ ] Numerical Recipes (reference only, don't read cover to cover)
4. [ ] Measure Theory (only if you want pure math depth — lowest ROI for bots)
### Missing textbooks
- Systematic Trading: A unique new method for designing trading and investing systems
- Permutation and Randomization Tests for Trading System Development: Algorithms in C++
## Textbooks
- *Introduction to Probability, Statistics, and Random Processes* - Hossein Pishro-Nik
- *Quantitative Trading* - Ernest P. Chan
- *Algorithmic Trading: Winning Strategies and Their Rationale* - Ernest P. Chan
- *Elementary Stochastic Calculus* - Thomas Mikosch
- *A Guide to Brownian Motion and Related Stochastic Processes* - Jim Pitman & Marc Yor
- *Stochastic Calculus: An Introduction with Applications* - Bernt Øksendal
- *Active Portfolio Management* - Richard Grinold & Ronald Kahn
- *Probability and Statistics: The Science of Uncertainty* - Michael J. Evans & Jeffrey S. Rosenthal
- *Analysis of Financial Time Series* - Ruey S. Tsay
- *Data-Driven Science and Engineering* - Steven L. Brunton & J. Nathan Kutz
- *Advances in Financial Machine Learning* - Marcos Lopez de Prado
- *Detecting Regime Change in Computational Finance* - Timothy Masters
- *Trading Systems and Methods* - Perry J. Kaufman
- *The Mathematics of Money Management* - Ralph Vince
- *The Leverage Space Trading Model* - Ralph Vince
- *Testing and Tuning Market Trading Systems* - Timothy Masters
- *Assessing and Improving Prediction and Classification* - Timothy Masters
- *Trading on Sentiment* - Richard L. Peterson
- *Numerical Recipes: The Art of Scientific Computing* - Press, Teukolsky, Vetterling & Flannery
- *An Introduction to Measure Theory* - Terence Tao
+34
@@ -0,0 +1,34 @@
[project]
name = "study"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.14"
dependencies = [
"aiohttp>=3.14.1",
"altair>=6.2.1",
"cryptography>=49.0.0",
"jupyter>=1.1.1",
"matplotlib>=3.10.9",
"numpy>=2.4.6",
"polars>=1.41.2",
"python-dotenv>=1.2.2",
"seaborn>=0.13.2",
"torch>=2.12.1",
"tqdm>=4.68.3",
"vegafusion>=2.0.3",
"vl-convert-python>=1.9.0.post1",
]
[[tool.uv.index]]
name = "pytorch-cu132"
url = "https://download.pytorch.org/whl/cu132"
explicit = true
[tool.uv.sources]
torch = [
{ index = "pytorch-cu132", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]
torchvision = [
{ index = "pytorch-cu132", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
File diff suppressed because one or more lines are too long
@@ -0,0 +1,5 @@
# ELEMENTARY STOCHASTIC CALCULUS with Finance in View
total pages=217
**Currently reading:** page 1
@@ -0,0 +1,5 @@
# A guide to Brownian motion and related stochastic processes
total pages=101
**Currently reading:** page 1
@@ -0,0 +1,11 @@
# Quantitative Trading
[pdf](./Quantitative_Trading_Ernest_P_Chan.pdf)
I will be reading through this book loosely. Many of its concepts are covered more rigorously in other textbooks.
## Post read notes
Not a lot of math, and the math that did show up was hand-wavy. (The book seems to assume you already understand certain concepts.)
The main takeaway is to use the Kelly formula for sizing positions.
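
As a rough illustration of the Kelly sizing mentioned above, the sketch below computes the single-strategy Kelly leverage $f = m / s^2$ (mean return divided by the variance of returns), which is the form the formula takes when strategies are treated as independent. The function name `kelly_leverage`, the choice of the sample variance, and the return series are assumptions made for illustration; none of them come from the book.

```python
from statistics import mean, variance


def kelly_leverage(returns: list[float]) -> float:
    """Kelly leverage f = m / s^2 for one strategy, from one-period simple returns."""
    if len(returns) < 2:
        raise ValueError("need at least two returns to estimate the variance")
    s2 = variance(returns)  # sample variance (an assumption, see the lead-in)
    if s2 == 0:
        raise ValueError("variance is zero, so the Kelly leverage is undefined")
    return mean(returns) / s2


# Made-up daily returns, purely for illustration.
daily_returns = [0.010, -0.005, 0.007, 0.002, -0.003, 0.004]
print(f"Kelly leverage: {kelly_leverage(daily_returns):.1f}")
```

Because the mean and variance are noisy estimates, a common practical adjustment is to trade at a fraction of this number (for example half-Kelly) rather than the full value.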
@@ -0,0 +1,460 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ec753176",
"metadata": {},
"source": [
"# Notes"
]
},
{
"cell_type": "markdown",
"id": "b13688b9",
"metadata": {},
"source": [
"When loss of money occurs, rationality is often the first victim.\n",
"\n",
"As long as financial markets demand instant liquidity, however, there will always be a profitable\n",
"niche for quantitative trading."
]
},
{
"cell_type": "markdown",
"id": "5df7046f",
"metadata": {},
"source": [
"## Daily routine"
]
},
{
"cell_type": "markdown",
"id": "5a622bb3",
"metadata": {},
"source": [
"\"The largest block of time I need to spend is in the morning\n",
"before the market opens: I typically need to run various programs to\n",
"download and process the latest historical data, read company news\n",
"that comes up on my alert screen, run programs to generate the orders for the day, and then launch a few baskets of orders before\n",
"the market opens and start a program that will launch orders automatically throughout the day. I would also update my spreadsheet\n",
"to record the previous day's profit and loss (P&L) of the different\n",
"strategies I run based on the brokerages' statements. All of this takes\n",
"about two hours.\" - Quant trader morning routine"
]
},
{
"cell_type": "markdown",
"id": "f0855890",
"metadata": {},
"source": [
"## Definitions"
]
},
{
"cell_type": "markdown",
"id": "8ec7516e",
"metadata": {},
"source": [
"**Defn** Information Ratio (the Sharpe ratio is the special case where the benchmark is the risk-free rate):\n",
"\n",
"$$\\text{Information Ratio} = \\frac{\\text{Average of Excess Returns}}{\\text{Standard Deviation of Excess Returns}}$$\n",
"\n",
"$$\\text{Excess Returns} = \\text{Portfolio Returns} - \\text{Benchmark Returns}$$"
]
},
{
"cell_type": "markdown",
"id": "7aab1f0d",
"metadata": {},
"source": [
"- ***directional trades*** - long or short only\n",
"- ***dollar-neutral trades*** - hedged or pair trades\n",
"- ***dollar neutral portfolio*** - The market value of the long positions equals the market value of the short positions\n",
"- ***market neutral portfolio*** - The beta of the portfolio with respect to a market index is close to zero, where beta measures the ratio between the expected returns of the portfolio and the expected returns of the market index\n",
"- ***Leverage*** - Borrowing funds to buy an investment\n",
"- ***slippage*** - The difference between the price that triggers the trading signal and the average execution price of the entire order"
]
},
{
"cell_type": "markdown",
"id": "865b04b1",
"metadata": {},
"source": [
"- ***equity curve***: A line chart of your portfolio's value over time\n",
"- ***drawdown***: The decline from a peak to a subsequent trough, expressed as a percentage. It's a measure of loss from a prior high, not from your starting point.\n",
"- ***maximum drawdown***: The largest peak-to-trough decline over the entire history of a strategy. It answers: \"What's the worst loss someone could have experienced if they invested at the worst possible time?\" It's one of the most common risk metrics used to evaluate strategies.\n",
"- ***high watermark***: The highest portfolio value ever reached\n",
"- ***maximum drawdown duration***: The longest amount of time spent below the high watermark.\n",
"- ***basis points***: 1 basis point is 0.01%\n",
"- ***regime shift*** - Situation when the financial market structure or the macroeconomic environment undergoes a drastic change so much so that trading strategies that were profitable before may not be profitable now"
]
},
{
"cell_type": "markdown",
"id": "076ad86c",
"metadata": {},
"source": [
"- ***Risk-Adjusted returns*** - "
]
},
{
"cell_type": "markdown",
"id": "a845580c",
"metadata": {},
"source": [
"## Ruling out bad strategies"
]
},
{
"cell_type": "markdown",
"id": "303f8dc1",
"metadata": {},
"source": [
"\n",
"- If a strategy trades only a few times a year, chances are its\n",
"Sharpe ratio won't be high. This does not prevent it from being\n",
"part of your multistrategy trading business, but it does disqualify\n",
"the strategy from being your main profit center.\n",
"- If a strategy has deep (e.g., more than 10 percent) or lengthy\n",
"(e.g., four or more months) drawdowns, it is unlikely that it will\n",
"have a high Sharpe ratio. I will explain the concept of drawdown\n",
"in the next section, but you can just visually inspect the equity\n",
"curve (which is also the cumulative profit-and-loss curve, assuming no redemption or cash infusion) to see if it is very bumpy\n",
"or not. Any peak-to-trough of that curve is a drawdown. (See\n",
"Figure 2.1 for an example.)\n",
"\n",
"As a rule of thumb, any strategy that has a Sharpe ratio of less\n",
"than 1 is not suitable as a stand-alone strategy.\n",
"\n",
"For a given strategy, it's important to ask the following:\n",
"\n",
"- Does it outperform a benchmark?\n",
"- Does it have a high enough Sharpe ratio?\n",
"- Does it have a small enough drawdown and short enough drawdown duration?\n",
"- Does the backtest suffer from survivorship bias?\n",
"- Does the strategy lose steam in recent years compared to its earlier years?\n",
"- Does the strategy suffer from data-snooping bias?\n",
"- Does the strategy have its own “niche” that protects it from intense competition from large institutional money managers?"
]
},
{
"cell_type": "markdown",
"id": "ecc17253",
"metadata": {},
"source": [
"## Backtesting\n"
]
},
{
"cell_type": "markdown",
"id": "98567746",
"metadata": {},
"source": [
"- When gathering data for backtesting, ensure data is split and dividend adjusted.\n",
"- A backtest that relies on high and low data is less reliable than one that relies on the open and close\n",
"- Typically, an extreme return should be accompanied by a news announcement, or should occur on a day when the market index also experienced extreme returns. If not, then your data is suspect."
]
},
{
"cell_type": "markdown",
"id": "0dbd8630",
"metadata": {},
"source": [
"$$\\text{Annualized Sharpe Ratio} = \\sqrt{N_T} \\times \\text{Sharpe Ratio Based on }T$$"
]
},
{
"cell_type": "markdown",
"id": "319f28df",
"metadata": {},
"source": [
"Incorporate transaction costs into backtests. Transaction costs include:\n",
"\n",
"- Commission\n",
"- Liquidity cost\n",
"- Opportunity cost\n",
"- Market Impact\n",
"- Slippage\n",
"\n",
"Try to combine all of these into a \"one way transaction cost\" (onewaytcost)"
]
},
{
"cell_type": "markdown",
"id": "1cbf1e0a",
"metadata": {},
"source": [
"Potential backtest performance issues:\n",
"\n",
"- Data: Split/dividend adjustments, noise in daily high/low, and\n",
"survivorship bias.\n",
"- Performance measurement: Annualized Sharpe ratio and maximum drawdown.\n",
"- Look-ahead bias: Using unobtainable future information for past\n",
"trading decisions.\n",
"- Data-snooping bias: Using too many parameters to fit historical\n",
"data, and avoiding it using large enough sample, out-of-sample\n",
"testing, and sensitivity analysis.\n",
"- Transaction cost: Impact of transaction costs on performance.\n",
"- Strategy refinement: Common ways to make small variations on\n",
"the strategy to optimize performance."
]
},
{
"cell_type": "markdown",
"id": "8047f4f0",
"metadata": {},
"source": [
"## Execution Systems: Why does actual performance diverge from expectations?"
]
},
{
"cell_type": "markdown",
"id": "30a1531a",
"metadata": {},
"source": [
"- Do you have bugs in your ATS software?\n",
"- Do the trades generated by your ATS match the ones generated\n",
"by your backtest program?\n",
"- Are the execution costs much higher than what you expected?\n",
"- Are you trading illiquid stocks that caused a lot of market impact?\n",
"- Strategy may have suffered from data-snooping bias or regime shift"
]
},
{
"cell_type": "markdown",
"id": "d5ea890d",
"metadata": {},
"source": [
"## Money and Risk Management"
]
},
{
"cell_type": "markdown",
"id": "41ddff80",
"metadata": {},
"source": [
"### The Kelly Formula"
]
},
{
"cell_type": "markdown",
"id": "169dc996",
"metadata": {},
"source": [
"Let the column vector $F^* = (f_1^*, f_2^*, \\dots, f_n^*)^T$ denote the optimal fractions of your equity that you should allocate to each of your $n$ strategies.\n",
"\n",
"Let $C$ be the covariance matrix such that matrix element $C_{ij}$ is the covariance of the returns of the $i^\\text{th}$ and $j^\\text{th}$ strategies.\n",
"\n",
"Let $M = (m_1, m_2, \\dots, m_n)^T$ be the column vector of mean returns of the strategies, where $m_i$ is a one-period, simple (uncompounded), unlevered return.\n",
"\n",
"Given our optimization objective and the Gaussian assumption, Dr. Thorp has shown that the optimal allocation is given by\n",
"\n",
"$$F^* = C^{-1}M$$\n",
"\n",
"If we assume that the strategies are all statistically independent, the covariance matrix becomes a diagonal matrix, with the diagonal elements equal to the variance of the individual strategies. This leads to an especially simple formula\n",
"\n",
"$$f_i^* = \\frac{m_i}{s_i^2}$$\n",
"\n",
"where $s_i^2$ is the variance of the returns of the $i^\\text{th}$ strategy.\n",
"\n",
"This is the famous Kelly formula as applied to continuous finance as opposed to gambling with discrete outcomes, and it gives the optimal leverage one should employ for a particular trading strategy.\n",
"\n",
"As a practical procedure, this continuous updating of the capital allocation should occur at least once at the end of each trading\n",
"day. In addition to updating the capital allocation, one should also\n",
"periodically update F* itself by recalculating the most recent trailing mean return and standard deviation. What should the lookback\n",
"period be and how often do you need to update these inputs to the\n",
"Kelly formula? These depend on the average holding period of your\n",
"strategy. If you hold your positions for only one day or so, then as a\n",
"rule of thumb, I would advise using a lookback period of six months.\n",
"Using a relatively short lookback period has the advantage of allowing you to gradually reduce your exposure to strategies that have\n",
"been losing their performance. As for the frequency of update, it\n",
"should not be a burden to update F* daily once you have written a\n",
"program to do so."
]
},
{
"cell_type": "markdown",
"id": "270f0944",
"metadata": {},
"source": [
"Model risk simply refers to the possibility that trading losses are\n",
"not due to the statistical vagaries of the market, but to the fact that\n",
"the trading model is wrong\n"
]
},
{
"cell_type": "markdown",
"id": "b6709290",
"metadata": {},
"source": [
"the one golden rule in risk management is to keep the size of your portfolio under control at all times\n",
"\n",
"Do not succumb to either despair or greed"
]
},
{
"cell_type": "markdown",
"id": "a6edc9d9",
"metadata": {},
"source": [
"# Mean-reverting versus Momentum strategies"
]
},
{
"cell_type": "markdown",
"id": "63f1028e",
"metadata": {},
"source": [
"Security prices are either mean reverting or trending. Otherwise they follow a random walk, and trading them is futile.\n",
"\n",
"- ***Mean reverting***: Prices tend to return to an average (mean) over time\n",
"- ***Trending***: Prices move persistently in one direction. Momentum builds and continues.\n",
"\n",
"Sometimes (usually) a security is both mean reverting and trending."
]
},
{
"cell_type": "markdown",
"id": "084a3a0b",
"metadata": {},
"source": [
"## Stationarity and Cointegration\n",
"\n",
"- ***cointegrated***: Most stock price series are not stationary; they follow a geometric random walk that takes them farther and farther away from their starting (i.e., initial public offering) values. However, you can often find a pair of stocks such that, if you go long one and short the other, the market value of the pair is stationary. Such a pair of stocks is said to be cointegrated.\n",
"\n",
"If a price series (of a stock, a pair of stocks, or, in general, a portfolio of stocks) is stationary, then a mean-reverting strategy is guaranteed to be profitable, as long as the stationarity persists into the future (which is by no means guaranteed)\n",
"\n",
"**Cointegration Is Not Correlation**"
]
},
{
"cell_type": "markdown",
"id": "a60d882c",
"metadata": {},
"source": [
"## Factor Models"
]
},
{
"cell_type": "markdown",
"id": "2d21e3c0",
"metadata": {},
"source": [
"- ***Factor returns***: The common drivers of stock returns\n",
"- ***Factor exposures***: The sensitivities to each of these common drivers\n",
"- ***Specific return***: Any part of a stock's return that cannot be explained by these common factor returns is deemed a specific return\n",
"\n",
"Each stock's specific return is assumed to be uncorrelated with another stock's."
]
},
{
"cell_type": "markdown",
"id": "6e48e218",
"metadata": {},
"source": [
"## Exit Strategy"
]
},
{
"cell_type": "markdown",
"id": "e61edcb1",
"metadata": {},
"source": [
"- A fixed holding period\n",
"- A target price or profit cap\n",
"- The latest entry signals\n",
"- A stop price\n",
"\n",
"The mean reversion of a time series can be modeled by an equation called the Ornstein-Uhlenbeck formula. See page 163 for more info.\n",
"\n",
"The properties of the Ornstein-Uhlenbeck formula can inform us about the exit strategies.\n",
"\n",
"If you believe that your security is mean reverting, then you also have a ready-made target price—the mean value of the historical prices of the security, or µ in the Ornstein-Uhlenbeck formula.\n",
"\n",
"Target prices can also be used in the case of momentum models if you have a fundamental valuation model of a company. But as fundamental valuation is at best an inexact science, target prices are not as easily justified in momentum models as in mean-reverting models.\n",
"\n",
"Exiting a position based on running an entry model also tells us whether a stop-loss strategy is recommended"
]
},
{
"cell_type": "markdown",
"id": "eb884fe5",
"metadata": {},
"source": [
"## High Frequency Trading Strategies"
]
},
{
"cell_type": "markdown",
"id": "22ad0ba8",
"metadata": {},
"source": [
"Requires low-level programming and a lot of resources to be viable."
]
},
{
"cell_type": "markdown",
"id": "9c8eeee5",
"metadata": {},
"source": [
"## Other topics / notes\n",
"\n",
"- Markov regime switching / hidden Markov models\n",
"- Kalman filter\n",
"- neural networks\n",
"\n",
"Empirical studies have found that a portfolio that consists of low-beta stocks generally has lower risk and thus a higher Sharpe ratio"
]
},
{
"cell_type": "markdown",
"id": "bdd23956",
"metadata": {},
"source": [
"- Mean-reverting regimes are more prevalent than trending regimes.\n",
"- There are some tricky data issues involved with backtesting mean-reversion strategies: Outlier quotes and survivorship bias are among them.\n",
"- Trending regimes are usually triggered by the diffusion of new\n",
"information, the execution of a large institutional order, or\n",
"“herding” behavior.\n",
"- Competition between traders tends to reduce the number of\n",
"mean-reverting trading opportunities.\n",
"- Competition between traders tends to reduce the optimal holding period of a momentum trade.\n",
"- Regime switching can sometimes be detected using a data-mining approach with numerous input features.\n",
"- A stationary price series is ideal for a mean-reversion trade.\n",
"- Two or more nonstationary price series can be combined to form a stationary one if they are “cointegrating.”\n",
"- Cointegration and correlation are different things: Cointegration\n",
"is about the long-term behavior of the prices of two or more\n",
"stocks, while correlation is about the short-term behavior of\n",
"their returns.\n",
"- Factor models, or arbitrage pricing theory, are commonly used\n",
"for modeling how fundamental factors affect stock returns linearly.\n",
"- One of the most well-known factor models is the Fama-French\n",
"Three-Factor model, which postulates that stock returns are\n",
"proportional to their beta and book-to-price ratio, and negatively\n",
"to their market capitalizations.\n",
"- Factor models typically have a relatively long holding period and\n",
"long drawdowns due to regime switches.\n",
"- Exit signals should be created differently for mean-reversion versus momentum strategies.\n",
"- Estimation of the optimal holding period of a mean-reverting strategy can be quite robust, due to the Ornstein-Uhlenbeck formula.\n",
"- Estimation of the optimal holding period of a momentum strategy can be error prone due to the small number of signals.\n",
"- Stop loss can be suitable for momentum strategies but not reversal strategies.\n",
"- Seasonal trading strategies for stocks (i.e., calendar effect) have\n",
"become unprofitable in recent years.\n",
"- Seasonal trading strategies for commodity futures continue to\n",
"be profitable.\n",
"- High-frequency trading strategies rely on the “law of large numbers” for their high Sharpe ratios.\n",
"- High-frequency trading strategies typically generate the highest\n",
"long-term compounded growth due to their high Sharpe ratios.\n",
"- High-frequency trading strategies are very difficult to backtest\n",
"and very technology reliant for their execution.\n",
"- Holding a highly leveraged portfolio of low-beta stocks should\n",
"generate higher long-term compounded growth than holding an unleveraged portfolio of high-beta stocks."
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
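
The drawdown definitions and the annualized Sharpe formula from the notes above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the book's code: the function names are hypothetical, `periods_per_year=252` assumes daily data with roughly 252 trading days a year, drawdown duration is counted in periods spent below the high watermark, and the equity values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def annualized_sharpe(excess_returns: list[float], periods_per_year: int = 252) -> float:
    """sqrt(N_T) times the per-period ratio mean(excess) / stdev(excess)."""
    return sqrt(periods_per_year) * mean(excess_returns) / stdev(excess_returns)


def max_drawdown(equity: list[float]) -> tuple[float, int]:
    """Return (maximum drawdown as a fraction, maximum drawdown duration in periods)."""
    if not equity:
        raise ValueError("equity curve is empty")
    high_watermark = equity[0]
    worst = 0.0      # largest peak-to-trough decline seen so far
    duration = 0     # periods spent below the current high watermark
    longest = 0      # longest such stretch seen so far
    for value in equity:
        if value >= high_watermark:
            high_watermark = value
            duration = 0
        else:
            duration += 1
            longest = max(longest, duration)
            worst = max(worst, (high_watermark - value) / high_watermark)
    return worst, longest


# Made-up equity curve, purely for illustration.
equity_curve = [100.0, 120.0, 90.0, 110.0, 130.0, 125.0]
drawdown, periods_below = max_drawdown(equity_curve)
print(f"max drawdown: {drawdown:.0%}, longest stretch below high watermark: {periods_below} periods")
```

Note that the drawdown is measured from the prior high (120 in the sample curve), not from the starting value, which matches the definition in the notes.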
@@ -0,0 +1,203 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a6732353-51d5-4478-9cf8-5834e57e5a4e",
"metadata": {},
"source": [
"# Chapter 1 Notes"
]
},
{
"cell_type": "markdown",
"id": "48d9ec9e-83da-40ca-ae79-3c45f8af137c",
"metadata": {},
"source": [
"## Main Concepts\n",
"\n",
"Outcome: A result of a random experiment.\n",
"\n",
"Sample Space: The set of all possible outcomes.\n",
"\n",
"Event: A subset of the sample space.\n",
"\n",
"The inclusion-exclusion principle holds for probabilities.\n",
"\n",
"Consider a sample space S. If S is a countable set, this refers to a discrete probability\n",
"model.\n"
]
},
{
"cell_type": "markdown",
"id": "7ac122be-50b2-423c-b88f-e4b3327b21bd",
"metadata": {},
"source": [
"## Example Problems"
]
},
{
"cell_type": "markdown",
"id": "7fb87a35-a470-4d98-935f-80c814e3f95d",
"metadata": {},
"source": [
"Example 1.5 - soln\n",
"\n",
"- there are 10 people with white shirts and 8 people with red shirts;\n",
"- 4 people have black shoes and white shirts\n",
"- 3 people have black shoes and red shirts\n",
"- the total number of people with white or red shirts or black shoes is 21\n",
"\n",
"Let A be the set of people with white shirts, B be the set of people with red shirts and let C be the set of people with black shoes.\n",
"\n",
"\\begin{align*}\n",
"|A|=10 \\\\\n",
"|B|=8 \\\\\n",
"|A \\cap C| = 4 \\\\\n",
"|B \\cap C| = 3 \\\\\n",
"|A \\cup B \\cup C| = 21\n",
"\\end{align*}\n",
"\n",
"Now we solve for $|C|$:\n",
"\n",
"\\begin{align*}\n",
"|A| + |B| + |C| - |A \\cap B| - |A \\cap C| - |B \\cap C| + |A \\cap B \\cap C| = 21 \\\\\n",
"10 + 8 + |C| - 0 - 4 - 3 + 0 = 21 \\\\\n",
"18 + |C| - 7 = 21 \\\\\n",
"|C| + 11 = 21 \\\\\n",
"|C| = 10\n",
"\\end{align*}\n",
"\n",
"$\\therefore$ number of people with black shoes is 10\n"
]
},
{
"cell_type": "markdown",
"id": "72b40733-531a-48f3-9879-75601684afc2",
"metadata": {},
"source": [
"Example 1.11 - soln\n",
"\n",
"Suppose we have the following information:\n",
"1. There is a 60 percent chance that it will rain today.\n",
"2. There is a 50 percent chance that it will rain tomorrow.\n",
"3. There is a 30 percent chance that it does not rain either day.\n",
"\n",
"T = rains\n",
"F = no rain\n",
"\n",
"$S = \\{(F, F), (F, T), (T, F), (T, T)\\}$\n",
"\n",
"$P(\\{(T, F)\\} \\cup \\{(T, T)\\}) = 0.6$\n",
"\n",
"$P(\\{(F, T)\\} \\cup \\{(T, T)\\}) = 0.5$\n",
"\n",
"$P((F, F)) = 0.3$\n",
"\n",
"\\begin{align*}\n",
"P(S) = 1 \\\\\n",
"P(\\{(F, F)\\} \\cup \\{(F, T)\\} \\cup \\{(T, F)\\} \\cup \\{(T, T)\\}) = 1 \\\\\n",
"P((F,F)) + P(\\{(F, T)\\} \\cup \\{(T, F)\\} \\cup \\{(T, T)\\}) = 1 \\\\\n",
"0.3 + P(\\{(F, T)\\} \\cup \\{(T, F)\\} \\cup \\{(T, T)\\}) = 1 \\\\\n",
"P(\\{(F, T)\\} \\cup \\{(T, F)\\} \\cup \\{(T, T)\\}) = 0.7 \\\\\n",
"P(\\{(F, T)\\} \\cup \\{(T, T)\\}) + P((T, F)) = 0.7 \\\\\n",
"0.5 + P((T, F)) = 0.7 \\\\\n",
"P((T, F)) = 0.2 \\\\\n",
"P(\\{(T, F)\\} \\cup \\{(T, T)\\}) + P((F, T)) = 0.7 \\\\\n",
"0.6 + P((F, T)) = 0.7 \\\\\n",
"P((F, T)) = 0.1\n",
"\\end{align*}\n",
"\n",
"Find the following probabilities:\n",
"\n",
"a. The probability that it will rain today or tomorrow.\n",
"\n",
"\\begin{align*}\n",
"P(\\{(T, F)\\} \\cup \\{(F, T)\\} \\cup \\{(T, T)\\}) = 0.7\n",
"\\end{align*}\n",
"\n",
"b. The probability that it will rain today and tomorrow.\n",
"\n",
"\\begin{align*}\n",
"P((T, T)) = 1 - 0.3 - 0.2 - 0.1 = 0.4\n",
"\\end{align*}\n",
"\n",
"c. The probability that it will rain today but not tomorrow.\n",
"\n",
"\\begin{align*}\n",
"P((T, F)) = 0.2\n",
"\\end{align*}\n",
"\n",
"d. The probability that it either will rain today or tomorrow, but not both.\n",
"\n",
"\\begin{align*} \n",
"P(\\{(T, F)\\} \\cup \\{(F, T)\\}) = P((T, F)) + P((F, T)) = 0.2 + 0.1 = 0.3\n",
"\\end{align*}\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "8b5131dd-5ebd-4156-b808-f8df273317fb",
"metadata": {},
"source": [
"Example 1.12 - soln\n",
"\n",
"$S = \\{ -1, 0, 1, 2, 3, \\dots \\}$\n",
"\n",
"$\\forall x \\in S, P(x) = \\frac{1}{2^{x + 2}}$\n",
"\n",
"What is the probability that I win more than or equal to 1 dollar and less than 4 dollars?\n",
"\n",
"\\begin{align*} \n",
"P(\\{1, 2, 3\\}) = P(1) + P(2) + P(3) \\\\\n",
"= 1/8 + 1/16 + 1/32 \\\\\n",
"= 7/32\n",
"\\end{align*}\n",
"\n",
"What is the probability that I win more than 2 dollars?\n",
"\n",
"\\begin{align*} \n",
"\\sum_{i=3}^{\\infty} P(i) = P(3) + P(4) + P(5) + P(6) + ... \\\\\n",
"= 1/32 + 1/64 + 1/128 + 1/256 + ... \\\\\n",
"=\\frac{\\frac{1}{32}}{1 - \\frac{1}{2}} \\\\\n",
"=\\frac{1}{16}\n",
"\\end{align*}"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69962960-18a6-436b-b9eb-9a6dfae7559a",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "11163495-a6cc-43b9-bf41-02748b13d210",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,28 @@
# Introduction to Probability, Statistics, and Random Processes - Hossein Pishro-Nik
[pdf](./Introduction%20to%20Probability,%20Statistics,%20and%20Random%20Processes%20-%20Hossein%20Pishro-Nik.pdf)
total pages=1007
**Currently reading:** chapter 3, page 236
TODO:
- 3.2.5 problems
- ch3 end of chapter problems
- 4.1.4 problems
- 4.2.6 problems
- 4.3.3 problems
- ch4 end of chapter problems
## Flashcards
Flashcards can be found for each chapter under `chX/flashcards.md`
Flashcard usage:
1. Open `mochi`
2. Click `Import`
3. Import by `Markdown`
4. Select `Multiple cards per .md file`
5. Set string delimiter as `===`
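
The layout these import settings assume can be sketched in Python. This is a rough sketch, not Mochi's own parser: it assumes cards are separated by a line containing only `===` and the sides of a card by a line containing only `---`. The `parse_cards` helper and the sample text are hypothetical.

```python
def parse_cards(text: str) -> list[list[str]]:
    """Split flashcard markdown into cards, each a list of sides."""
    cards: list[list[str]] = []
    sides: list[str] = []
    current: list[str] = []

    def close_side() -> None:
        # Finish the side being collected and start a new one.
        sides.append("\n".join(current).strip())
        current.clear()

    for line in text.splitlines():
        marker = line.strip()
        if marker == "---":      # assumed side separator
            close_side()
        elif marker == "===":    # assumed card separator
            close_side()
            cards.append(list(sides))
            sides.clear()
        else:
            current.append(line)

    # The last card has no trailing "===" delimiter.
    close_side()
    if any(sides):
        cards.append(list(sides))
    return cards


sample = "What is $P(A^c)$?\n---\n$1 - P(A)$\n===\nState axiom 2\n---\n$P(S) = 1$\n"
for front, *backs in parse_cards(sample):
    print(front, "->", backs)
```

Running this on a `chX/flashcards.md` file before importing is one way to check that every card has a front and at least one back.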
@@ -0,0 +1,222 @@
State De Morgan's Law for $n$ sets
---
For any sets $A_1, A_2, \dots, A_n$:
$$\left(\bigcup_{i=1}^n A_i\right)^c = \bigcap_{i=1}^n A_i^c$$
$$\left(\bigcap_{i=1}^n A_i\right)^c = \bigcup_{i=1}^n A_i^c$$
The complement of a union is the intersection of complements, and vice versa.
===
State the Distributive Law for sets $A$, $B$, and $C$
---
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$$
===
State the inclusion-exclusion principle for $n = 2$, for $n = 3$, and for a finite collection of sets $A_1, A_2, A_3, \dots, A_n$
---
$n = 2$ case:
$|A \cup B| = |A| + |B| - |A \cap B|$
---
$n = 3$ case:
$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|$
---
For a finite collection of sets $A_1, A_2, A_3, \dots A_n$, we have
$\left| \bigcup_{i=1}^n A_i \right| = \sum_{i=1}^n |A_i| - \sum_{i < j} |A_i \cap A_j| + \sum_{i < j < k} |A_i \cap A_j \cap A_k| - \dots + (-1)^{n-1} |A_1 \cap A_2 \cap \dots \cap A_n|$
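
The general formula can be checked numerically. This is a small sketch, assuming three made-up sets `A`, `B`, `C`; the `union_size` helper is hypothetical and simply evaluates the alternating sum term by term.

```python
from itertools import combinations


def union_size(sets: list[set]) -> int:
    """Compute |A_1 ∪ ... ∪ A_n| with the inclusion-exclusion sum."""
    total = 0
    for r in range(1, len(sets) + 1):
        sign = (-1) ** (r - 1)  # + for single sets, - for pairs, + for triples, ...
        for group in combinations(sets, r):
            total += sign * len(set.intersection(*group))
    return total


# Illustrative sets, not taken from the book.
A, B, C = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}
print(union_size([A, B, C]), len(A | B | C))
```

Both printed numbers should agree, since the second is the size of the union computed directly.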
===
For the function
$$f: A \to B$$
State the following of $f$:
1. Domain
2. Co-domain
3. Range
---
1. $A$
2. $B$
3. Set of all possible outputs of $f$ (not necessarily $B$)
===
State the definition of each of the following:
1. Random experiment
2. Outcome
3. Sample space
4. Event
---
1. A **random experiment** is a process by which we observe something uncertain
2. An **outcome** is a result of a random experiment
3. The **sample space** $S$ is the set of all possible outcomes
4. An **event** is a subset of the sample space
===
State the Axioms of Probability
---
**Axioms of Probability**
1. For any event $A$, $P(A) \geq 0$
2. $P(S) = 1$
3. If $A_1, A_2, A_3, \dots$ are disjoint events, then $P(A_1 \cup A_2 \cup A_3 \cup \dots) = P(A_1) + P(A_2) + P(A_3) + \dots$
===
In a finite sample space $S$, where all outcomes are equally likely, what is the probability of any event $A$?
---
$P(A) = \frac{|A|}{|S|}$
===
What is $P(A^c)$ in terms of $P(A)$?
---
$$P(A^c) = 1 - P(A)$$
Follows from the axioms: $A$ and $A^c$ are disjoint, and $A \cup A^c = S$, so $P(A) + P(A^c) = P(S) = 1$.
===
Define conditional probability $P(A|B)$
---
The probability of $A$ given $B$ (when $P(B) > 0$):
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
In the equally-likely case:
$$P(A|B) = \frac{|A \cap B|}{|B|}$$
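
Both forms can be compared on a small equally likely sample space. This is a sketch under assumed events (two fair dice, $A$ = "the sum is 8", $B$ = "the first die is even"); the events and the `prob` helper are illustrative, not from the book.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered rolls of two fair dice (36 equally likely outcomes).
S = set(product(range(1, 7), repeat=2))
A = {s for s in S if sum(s) == 8}    # assumed event: sum is 8
B = {s for s in S if s[0] % 2 == 0}  # assumed event: first die is even


def prob(event: set) -> Fraction:
    """P(E) = |E| / |S| for equally likely outcomes."""
    return Fraction(len(event), len(S))


p_definition = prob(A & B) / prob(B)       # P(A ∩ B) / P(B)
p_counting = Fraction(len(A & B), len(B))  # |A ∩ B| / |B|
print(p_definition, p_counting)
```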
===
State the chain rule of probability for $n$ events
---
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1,A_2)\cdots P(A_n|A_1,A_2,\dots,A_{n-1})$$
Each factor conditions on all previously listed events.
===
What conditions must hold for three events $A$, $B$, $C$ to be independent?
---
All four conditions must hold:
1. $P(A \cap B) = P(A)P(B)$
2. $P(A \cap C) = P(A)P(C)$
3. $P(B \cap C) = P(B)P(C)$
4. $P(A \cap B \cap C) = P(A)P(B)P(C)$
Pairwise independence alone is **not** sufficient.
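
A standard counterexample, sketched in Python: flip two fair coins and let $A$ = "first flip is heads", $B$ = "second flip is heads", $C$ = "both flips match". The choice of events is an illustrative assumption; the `prob` helper is hypothetical.

```python
from fractions import Fraction
from itertools import product

S = set(product("HT", repeat=2))    # two fair coin flips
A = {s for s in S if s[0] == "H"}   # first flip is heads
B = {s for s in S if s[1] == "H"}   # second flip is heads
C = {s for s in S if s[0] == s[1]}  # both flips match


def prob(event: set) -> Fraction:
    """P(E) = |E| / |S| for equally likely outcomes."""
    return Fraction(len(event), len(S))


pairwise = all(
    prob(X & Y) == prob(X) * prob(Y) for X, Y in [(A, B), (A, C), (B, C)]
)
triple = prob(A & B & C) == prob(A) * prob(B) * prob(C)
print(pairwise, triple)
```

Conditions 1 to 3 hold here, but condition 4 fails because $P(A \cap B \cap C) = 1/4$ while $P(A)P(B)P(C) = 1/8$.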
===
What does it mean for $n$ events $A_1, A_2, \dots, A_n$ to be independent?
---
Every subset of two or more of the events must satisfy the product rule. That is, for all indices $i < j < k < \dots$:
$$P(A_i \cap A_j) = P(A_i)P(A_j)$$
$$P(A_i \cap A_j \cap A_k) = P(A_i)P(A_j)P(A_k)$$
$$\vdots$$
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = \prod_{i=1}^n P(A_i)$$
===
If $A_1, A_2, \dots, A_n$ are independent, what is $P(A_1 \cup A_2 \cup \cdots \cup A_n)$?
---
$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = 1 - \prod_{i=1}^n (1 - P(A_i))$$
Derived by taking the complement (none of the events occur) and using independence.
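
The formula can be checked against a brute-force sum. This is a minimal sketch, assuming three independent events with made-up probabilities 1/2, 1/3 and 1/4; independence is used to multiply the probabilities of each occur/not-occur pattern.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Assumed probabilities of three independent events (illustrative values).
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]

# Formula: 1 - P(none of the events occur).
formula = 1 - prod(1 - pi for pi in p)

# Brute force: add up the probability of every pattern in which
# at least one event occurs.
brute = sum(
    prod(pi if occurs else 1 - pi for pi, occurs in zip(p, pattern))
    for pattern in product([True, False], repeat=len(p))
    if any(pattern)
)
print(formula, brute)
```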
===
What is the difference between disjoint and independent?
---
- Disjoint: $A$ and $B$ cannot occur at the same time, i.e. $A \cap B = \emptyset$
- Independent: knowing that $A$ occurred gives no information about $B$, i.e. $P(A \cap B) = P(A)P(B)$
===
State the Law of Total Probability using event $B$ and its complement
---
$$P(A) = P(A|B)\,P(B) + P(A|B^c)\,P(B^c)$$
===
State the general Law of Total Probability for a partition of the sample space
---
If $B_1, B_2, B_3, \dots$ is a partition of $S$, then for any event $A$:
$$P(A) = \sum_i P(A \cap B_i) = \sum_i P(A|B_i)\,P(B_i)$$
===
State Bayes' Rule for two events $A$ and $B$
---
For $P(A) \neq 0$:
$$P(B|A) = \frac{P(A|B)\,P(B)}{P(A)}$$
===
State Bayes' Rule when $B_1, B_2, \dots$ form a partition of $S$
---
For any event $A$ with $P(A) \neq 0$:
$$P(B_j|A) = \frac{P(A|B_j)\,P(B_j)}{\displaystyle\sum_i P(A|B_i)\,P(B_i)}$$
The denominator expands $P(A)$ via the Law of Total Probability.
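
A worked instance, sketched in Python. The numbers are assumptions made up for illustration: three machines $B_1, B_2, B_3$ produce 50%, 30% and 20% of all items, with defect rates of 1%, 2% and 3%, and $A$ is the event "the item is defective".

```python
from fractions import Fraction

# Assumed prior P(B_i): share of items made by each machine.
prior = {"B1": Fraction(50, 100), "B2": Fraction(30, 100), "B3": Fraction(20, 100)}
# Assumed likelihood P(A | B_i): defect rate of each machine.
likelihood = {"B1": Fraction(1, 100), "B2": Fraction(2, 100), "B3": Fraction(3, 100)}

# Denominator: P(A) via the Law of Total Probability.
p_a = sum(likelihood[b] * prior[b] for b in prior)

# Bayes' Rule for every B_j in the partition.
posterior = {b: likelihood[b] * prior[b] / p_a for b in prior}
print(p_a, posterior)
```

The posteriors sum to 1 because the $B_i$ partition the sample space.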
===
State what it means for events $A$ and $B$ to be conditionally independent given event $C$. Note $P(C) \gt 0$.
---
$$P(A \cap B \mid C) = P(A \mid C)P(B \mid C)$$
===
State the multiplication rule for $P(A \cap B)$
---
$$P(A \cap B) = P(A|B)\,P(B) = P(B|A)\,P(A)$$
Rearrangement of the definition of conditional probability.
@@ -0,0 +1,17 @@
For any event $A$, prove $P(A^c) = 1 - P(A)$
---
$$
1 = P(S)
= P(A \cup A^c)
= P(A) + P(A^c)
$$
===
Prove $P(A \setminus B) = P(A) - P(A \cap B)$
---
TODO
@@ -0,0 +1,216 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "206bf674",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"sns.set_theme(style=\"whitegrid\", context=\"notebook\")"
]
},
{
"cell_type": "markdown",
"id": "612bd02c",
"metadata": {},
"source": [
"# Chapter 1 Summary Notes"
]
},
{
"cell_type": "markdown",
"id": "be70f5df",
"metadata": {},
"source": [
"## Intro (1.0.0 - 1.3.1)\n",
"\n",
"**Theorem 1.1: De Morgan's law** \\\n",
"For any sets $A_1, A_2, A_3, \\dots A_n$, we have\n",
"- $(A_1 \\cup A_2 \\cup A_3 \\cup \\dots A_n)^c = A_1^c \\cap A_2^c \\cap A_3^c \\cap \\dots A_n^c$\n",
"- $(A_1 \\cap A_2 \\cap A_3 \\cap \\dots A_n)^c = A_1^c \\cup A_2^c \\cup A_3^c \\cup \\dots A_n^c$\n",
"\n",
"**Theorem 1.2: Distributive law** \\\n",
"For any sets $A, B,$ and $C$ we have\n",
"- $A \\cap (B \\cup C) = (A \\cap B)\\cup(A \\cap C)$\n",
"- $A \\cup (B \\cap C) = (A \\cup B)\\cap(A \\cup C)$\n",
"\n",
"**Inclusion-exclusion principle** \\\n",
"For a finite collection of sets $A_1, A_2, A_3, \\dots A_n$, we have\n",
"\n",
"$\\left| \\bigcup_{i=1}^n A_i \\right| = \\sum_{i=1}^n |A_i| - \\sum_{i < j} |A_i \\cap A_j| + \\sum_{i < j < k} |A_i \\cap A_j \\cap A_k| - \\dots + (-1)^{n-1} |A_1 \\cap A_2 \\cap \\dots \\cap A_n|$\n",
"\n",
"$n = 2$ case:\n",
"\n",
"$|A \\cup B| = |A| + |B| - |A \\cap B|$\n",
"\n",
"$n = 3$ case:\n",
"\n",
"$|A \\cup B \\cup C| = |A| + |B| + |C| - |A \\cap B| - |A \\cap C| - |B \\cap C| + |A \\cap B \\cap C|$\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "c7475d4f",
"metadata": {},
"source": [
"## Random experiments (1.3.1 - 1.4)\n",
"\n",
"- A **random experiment** is a process by which we observe something uncertain\n",
"- An **outcome** is a result of a random experiment\n",
"- The **sample space** $S$ is the set of all possible outcomes\n",
"- An **event** $A$ is any subset of $S$\n",
"\n",
"> In the context of a random experiment, the sample space is our *universal set*\n",
"\n",
"**Axioms of Probability**\n",
"\n",
"1. For any event $A$, $P(A) \\geq 0$\n",
"2. $P(S) = 1$\n",
"3. If $A_1, A_2, A_3, \\dots$ are disjoint events, then $P(A_1 \\cup A_2 \\cup A_3 \\cup \\dots) = P(A_1) + P(A_2) + P(A_3) + \\dots$\n",
"\n",
"**Some notation**\n",
"\n",
"- $P(A \\cap B) = P(A$ and $B) = P(A,B)$\n",
"- $P(A \\cup B) = P(A$ or $B)$\n",
"\n",
"In a finite sample space $S$, where all outcomes are equally likely, the probability of any event $A$ can be found by\n",
"\n",
"\\begin{align*}\n",
"P(A) = \\frac{|A|}{|S|}\n",
"\\end{align*}"
]
},
{
"cell_type": "markdown",
"id": "b705ef32",
"metadata": {},
"source": [
"## Conditional probability (1.4.0)\n",
"\n",
"If $A$ and $B$ are twos events in sample space $S$, then the **conditional probability of $A$ given $B$** is defined as\n",
"\n",
"\\begin{align*}\n",
"P(A|B) = \\frac{P(A \\cap B)}{P(B)}, \\text{when } P(B) > 0\n",
"\\end{align*}\n",
"\n",
"or\n",
"\n",
"\\begin{align*}\n",
"P(A|B) = \\frac{|A \\cap B|}{|B|}, \\text{when } P(B) > 0\n",
"\\end{align*}\n",
"\n",
"For events $A, B,$ and $C$, with $P(C) \\gt 0$, we have\n",
"\n",
"- $P(A^c|C) = 1 - P(A|C)$\n",
"- $P(\\empty|C) = 0$\n",
"- $P(A|C) \\leq 1$\n",
"- $P(A \\setminus B|C) = P(A|C) - P(A \\cap B|C)$\n",
"- $P(A \\cup B|C) = P(A|C) + P(B|C) - P(A \\cap B|C)$\n",
"- if $A \\subset B$ then $P(A|C) \\leq P(B|C)$\n",
"\n",
"![](../public/conditional_prob_tree.png)"
]
},
{
"cell_type": "markdown",
"id": "188a8fc2",
"metadata": {},
"source": [
"## Independence (1.4.1)\n",
"\n",
"**Definition.** Two events $A$ and $B$ are *independent* if $P(A \\cap B) = P(A)P(B)$. AKA $P(A|B) = P(A)$\n",
"\n",
"**Definition.** Three events $A, B,$ and $C$ are independent if **all** of the following conditions hold:\n",
"- $P(A \\cap B) = P(A)P(B)$\n",
"- $P(A \\cap C) = P(A)P(C)$\n",
"- $P(B \\cap C) = P(B)P(C)$\n",
"- $P(A \\cap B \\cap C) = P(A)P(B)P(C)$\n",
"\n",
"**Definition.** $N$ events $A_1, A_2, A_3, \\dots, A_n$ are independent if **all** the following conditions holds:\n",
"- $P(A_i \\cap B_j) = P(A_i)P(A_j)$\n",
"- $P(A_i \\cap A_j \\cap A_k) = P(A_i)P(A_j)P(A_k)$ where $i \\in [1:n+1]$, $j \\in [i:n+1]$, $k \\in [j:n+1]$\n",
"- $\\dots$\n",
"- $P(A_1 \\cap A_2 \\cap A_3 \\cap \\dots \\cap A_n) = \\prod_{i=1}^nP(A_i)$\n",
"\n",
"**Lemma.** \\\n",
"If $A$ and $B$ are independent then\n",
"- $A$ and $B^c$ are independent\n",
"- $A^c$ and $B$ are independent\n",
"- $A^c$ and $B^c$ are independent\n",
"\n",
"**Definition.** If $A_1, A_2, \\dots, A_n$ are independent then\n",
"$$P(A_1 \\cup A_2 \\cup \\dots \\cup A_n) = 1 - (1 - P(A_1))(1 - P(A_2))\\dots(1 - P(A_n))$$"
]
},
{
"cell_type": "markdown",
"id": "90cf5b00",
"metadata": {},
"source": [
"## Law of Total Probability (1.4.2)\n",
"\n",
"$$P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)$$\n",
"\n",
"**Definition.** Law of Total Probability: \\\n",
"If $B_1, B_2, B_3, \\dots $ is a partition of the sample space $S$, then for any event $A$ we have\n",
"\n",
"$$P(A) = \\sum_i P(A \\cap B_i) = \\sum_i P(A|B_i)P(B_i)$$"
]
},
{
"cell_type": "markdown",
"id": "c33df800",
"metadata": {},
"source": [
"## Bayes' Rule (1.4.3)\n",
"\n",
"**Definition.** Bayes' Rule\n",
"\n",
"- For any two events $A$ and $B$, where $P(A) \\neq 0$, we have\n",
"\n",
"$$P(B|A) = \\frac{P(A|B)P(B)}{P(A)}$$\n",
"\n",
"- If $B_1, B_2, B_3, \\dots$ form a partition of the sample space $S$, and $A$ is any event with $P(A) \\neq 0$, we have\n",
"\n",
"$$P(B_j|A) = \\frac{P(A|B_j)P(B_j)}{\\sum_i P(A|B_i)P(B_i)}$$"
]
},
{
"cell_type": "markdown",
"id": "849d7c99",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,139 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"id": "c58309b2",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"sns.set_theme(style=\"whitegrid\", context=\"notebook\")"
]
},
{
"cell_type": "markdown",
"id": "a6732353-51d5-4478-9cf8-5834e57e5a4e",
"metadata": {},
"source": [
"# Chapter 2 Notes"
]
},
{
"cell_type": "markdown",
"id": "9f0046c2",
"metadata": {},
"source": [
"## Counting (2.0.0 - 2.1.5)\n",
"\n",
"***Definition.*** Multiplication Principle: \\\n",
"Suppose that we perform $r$ experiments such that the $k\\text{th}$ experiment has $n_k$ possible outcomes, for $k=1,2,\\dots,r$. Then there are a total of $n_1 \\times n_2 \\times n_3 \\times \\dots \\times n_r$ possible outcomes for the sequence of $r$ experiments\n",
"\n",
"### Terminology\n",
"\n",
"- **Sampling**: Sampling from a set means choosing an element from that set. We\n",
"often **draw** a sample at random from a given set in which each element of the\n",
"set has equal chance of being chosen\n",
"\n",
"- **With or without replacement**: Usually we draw multiple samples from a set. If\n",
"we put each object back after each draw, we call this sampling with\n",
"replacement. In this case a single object can be possibly chosen multiple times.\n",
"For example, if A = {a1, a2, a3, a4} and we pick 3 elements with replacement, a\n",
"possible choice might be (a3, a1, a3). Thus \"with replacement\" means \"repetition\n",
"is allowed.\" On the other hand, if repetition is not allowed, we call it sampling\n",
"without replacement\n",
"\n",
"- **Ordered or unordered**: If ordering matters (i.e.: a1, a2, a3 ≠ a2, a3, a1), this is\n",
"called ordered sampling. Otherwise, it is called unordered\n",
"\n",
"### Counting Formulas\n",
"\n",
"- **ordered sampling with replacement:** $n^k$\n",
"\n",
"- **ordered sampling without replacement:** $n$ permute $k$ $\\quad$ ie $P^n_k = \\frac{n!}{(n - k)!}$\n",
"\n",
"- **unordered sampling without replacement:** $n$ choose $k$ $\\quad$ ie $\\binom{n}{k} = \\frac{n!}{k!(n-k)!}$\n",
"\n",
"- **unordered sampling with replacement:** $\\binom{n + k - 1}{k}$"
]
},
{
"cell_type": "markdown",
"id": "2e93e0fe",
"metadata": {},
"source": [
"## Problem Solving Principles"
]
},
{
"cell_type": "markdown",
"id": "87279e59",
"metadata": {},
"source": [
"When solving a combinatorics problem, consider:\n",
"1. Does order matter?\n",
" - Yes → Permutations\n",
" - No → Combinations\n",
"\n",
" - \"Are HHHTT and THHHT the same outcome to me?\"\n",
"\n",
"2. Are we sampling with of without replacement?\n",
" - Without replacement → Hypergeometric (phone problem)\n",
" - With replacement → Binomial (coin flips)\n",
" \n",
" \"Can the same item be chosen twice?\"\n",
"3. Are the \"groups\" labeled or unlabeled?\n",
" - Labeled/distinguishable → Just multiply combinations\n",
" - Unlabeled/interchangeable → Divide by k!\n",
"\n",
" \"Does it matter which group is called group 1?\"\n",
"4. Are the items distinguishable?\n",
" - Distinguishable → Each item is unique, classical probability applies\n",
" - Indistinguishable → Outcomes are not equally likely, be careful\n",
"\n",
" \"Could I label these items 1 to n?\"\n",
"5. Is complement of inclusion-exclusion easier?\n",
" - Complement → When \"at least\" or \"at most\" language appears\n",
" - Inclusion-Exclusion → When events overlap\n",
"\n",
" \"Is the opposite event simpler to count?\"\n",
"6. Am I counting each outcome exactly once? \n",
" - If yes, done. Otherwhise we are overcounting or undercounting"
]
},
{
"cell_type": "markdown",
"id": "9c5783dc",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,101 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "206bf674",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"sns.set_theme(style=\"whitegrid\", context=\"notebook\")"
]
},
{
"cell_type": "markdown",
"id": "612bd02c",
"metadata": {},
"source": [
"# Chapter 4 Example Problems"
]
},
{
"cell_type": "markdown",
"id": "5ce47b88",
"metadata": {},
"source": [
"4.2 \n",
"\n",
"a. \n",
"\n",
"$\\int_{-\\infty}^{\\infty}f_X(u)du = 1$\n",
"\n",
"ie\n",
"\n",
"\\begin{align*}\n",
"\\int_{-\\infty}^{0^+}f_X(u)du + \\int_{0}^{\\infty}f_X(x)dx = 1 \\\\\n",
"0 + \\int_{0}^{\\infty}ce^{-x}dx = 1 \\\\\n",
"c\\cdot\\lim_{t \\to \\infty }\\int_{0}^{t}e^{-x}dx = 1 \\\\\n",
"c\\cdot\\lim_{t \\to \\infty } [-e^{-x}]_0^t = 1 \\\\\n",
"c\\cdot\\lim_{t \\to \\infty } ((-e^{-t}) - (-e^{-0})) = 1 \\\\\n",
"c\\cdot\\lim_{t \\to \\infty } (-e^{-t} + 1) = 1 \\\\\n",
"c\\cdot 1 = 1 \\\\\n",
"c = 1\n",
"\\end{align*}\n",
"\n",
"b.\n",
"\n",
"Note that\n",
"\n",
"$$F_X(x) = \\int_{-\\infty}^x f_X(u)du$$\n",
"\n",
"and $c=1$\n",
"\n",
"so\n",
"\n",
"\\begin{align*}\n",
"F_X(x) = \\begin{cases} \n",
"1-e^{-x} & \\text{for } x \\geq 0 \\\\ \n",
"0 & \\text{otherwise} \n",
"\\end{cases}\n",
"\\end{align*}\n",
"\n",
"c. $F_X(3) - F_X(1) = $"
]
},
{
"cell_type": "markdown",
"id": "95a3b21c",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,176 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"id": "c58309b2",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"sns.set_theme(style=\"whitegrid\", context=\"notebook\")"
]
},
{
"cell_type": "markdown",
"id": "a6732353-51d5-4478-9cf8-5834e57e5a4e",
"metadata": {},
"source": [
"# Chapter 4 Notes"
]
},
{
"cell_type": "markdown",
"id": "9f0046c2",
"metadata": {},
"source": [
"## Continuous Random Variables and their Distributions (4.1.0)"
]
},
{
"cell_type": "markdown",
"id": "ea9b1f96",
"metadata": {},
"source": [
"***Definition*** A random variable $X$ with CDF $F_X(x)$ is said to be continuous if $F_X(x)$ is a continuous for all $x \\in \\mathbb{R}$"
]
},
{
"cell_type": "markdown",
"id": "a96da4d3",
"metadata": {},
"source": [
"## Probability Density Function (PDF) (4.1.1)"
]
},
{
"cell_type": "markdown",
"id": "c62129a5",
"metadata": {},
"source": [
"***Definition*** Consider a continuous random variable $X$ with an absolutely continuous CDF $F_X(x)$. The function $f_X(x)$ defined by\n",
"\n",
"$$f_X(x) = \\frac{dF_X(x)}{dx} = F'_X(x) \\quad \\text{if } F_X(x) \\text{ is differentiable at } x$$\n",
" \n",
"is called the probability density function (PDF) of $X$."
]
},
{
"cell_type": "markdown",
"id": "df411869",
"metadata": {},
"source": [
"NOTE: The PDF being constant implies uniformity\n",
"\n",
"NOTE: For small values of $\\delta$,\n",
"\n",
"$$P(x \\lt X \\leq x + \\delta) \\approx f_X(x)\\delta$$\n",
"\n",
"Thus if $f_X(x_1) \\gt f_X(x_2)$, we can say $P(x_1 \\lt X \\leq x_1 + \\delta) \\gt P(x_2 \\lt X \\leq x_2 + \\delta)$, ie the value of $X$ is more likely to be around $x_1$ then $x_2$\n",
"\n",
"NOTE: The CDF can be obtained from the PDF via (assuming absolute continuity)\n",
"\n",
"$$F_X(x) = \\int_{-\\infty}^x f_X(u)du$$"
]
},
{
"cell_type": "markdown",
"id": "f29a3bfb",
"metadata": {},
"source": [
"***Properties*** Consider a continuous random variable $X$ with PDF $f_X(x)$. We have\n",
"- $f_X(x) \\geq 0, \\forall x \\in \\mathbb{R}$\n",
"- $\\int_{-\\infty}^{\\infty}f_X(u)du = 1$\n",
"- $P(a \\lt X \\leq b) = F_X(b) - F_X(a) = \\int_a^bf_X(u)du$\n",
"- For a set $A$, $P(X \\in A) = \\int_Af_X(u)du$. However, set $A$ must satisfy:"
]
},
{
"cell_type": "markdown",
"id": "6f546fed",
"metadata": {},
"source": [
"***Definition*** If $X$ is a continuous random variable, we can write the range of $X$ as\n",
"\n",
"$$R_X = \\{ x \\mid f_X(x) \\gt 0 \\}$$"
]
},
{
"cell_type": "markdown",
"id": "170db3a0",
"metadata": {},
"source": [
"***Property*** The expected value if a continuous random variable $X$ is\n",
"\n",
"$$E[X] = \\int_{-\\infty}^{\\infty}xf_X(x)dx$$"
]
},
{
"cell_type": "markdown",
"id": "deccea51",
"metadata": {},
"source": [
"***Property*** Law of the unconscious statistician (LOTUS) for continuous random variables\n",
"\n",
"$$E[g(X)] = \\int_{-\\infty}^{\\infty}g(x)f_X(x)dx$$\n"
]
},
{
"cell_type": "markdown",
"id": "c0a20195",
"metadata": {},
"source": [
"***Property*** Variance for a continuous random variable $X$, we can write\n",
"\n",
"\\begin{align*}\n",
"\\text{Var}(X) &= E[(X - E[x])^2] = \\int_{-\\infty}^{\\infty}(x - E[X])^2f_X(x)dx \\\\\n",
"&= E[X^2] - E[X]^2 = \\int_{-\\infty}^{\\infty}x^2f_X(x)dx - E[X]^2\n",
"\\end{align*}"
]
},
{
"cell_type": "markdown",
"id": "6f2f6b72",
"metadata": {},
"source": [
"## Functions of Continuous Random Variables"
]
},
{
"cell_type": "markdown",
"id": "f4a95738",
"metadata": {},
"source": [
"***Theorem***"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,7 @@
# Algorithmic Trading - Winning Strategies and their rationale
[pdf](./Algorithmic%20Trading%20Winning%20Strategies%20and%20their%20rationale.pdf)
total pages=225
**Currently reading:** chapter 1, page 19
@@ -0,0 +1,51 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 3,
"id": "93b4302b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"sns.set_theme(style=\"whitegrid\", context=\"notebook\")"
]
},
{
"cell_type": "markdown",
"id": "bdd46cba",
"metadata": {},
"source": [
"# Chapter 1 Notes"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,51 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 3,
"id": "93b4302b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"sns.set_theme(style=\"whitegrid\", context=\"notebook\")"
]
},
{
"cell_type": "markdown",
"id": "bdd46cba",
"metadata": {},
"source": [
"# Notes"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "roadmap (3.14.5)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,3 @@
# Youtube Playlist - Let's Build a Quant Trading Strategy
<https://www.youtube.com/playlist?list=PLdsqLas3-Kg3fM8VgykXBJ2gRIFBpKHjY>
@@ -0,0 +1,148 @@
from typing import List
import requests
import zipfile
from pathlib import Path
import polars as pl
from datetime import datetime, timedelta
from tqdm import tqdm
import research
MAKER_FEE = 0.000450
TAKER_FEE = 0.000450


def download_and_unzip(symbol: str, date: str | datetime,
                       download_dir: str = "data", cache_dir: str = "cache") -> pl.DataFrame:
    """
    Download and unzip Binance futures trade data for a given symbol and date.
    Caches results as parquet files to avoid repeated downloads.
    """
    # Normalize date to string
    date_str = date.strftime('%Y-%m-%d') if isinstance(date, datetime) else date

    cache_dir = Path(cache_dir)
    cache_dir.mkdir(exist_ok=True)
    cache_path = cache_dir / f"{symbol}-trades-{date_str}.parquet"
    if cache_path.exists():
        return pl.read_parquet(cache_path)

    url = f"https://data.binance.vision/data/futures/um/daily/trades/{symbol}/{symbol}-trades-{date_str}.zip"
    download_dir = Path(download_dir)
    download_dir.mkdir(exist_ok=True)
    zip_path = download_dir / f"{symbol}-trades-{date_str}.zip"

    # Download zip
    response = requests.get(url, stream=True)
    response.raise_for_status()
    with open(zip_path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

    # Extract
    with zipfile.ZipFile(zip_path, 'r') as zf:
        zf.extractall(download_dir)
    csv_path = download_dir / f"{symbol}-trades-{date_str}.csv"

    # Load into Polars
    df = pl.read_csv(
        csv_path,
        schema={
            "id": pl.Int64,
            "price": pl.Float64,
            "qty": pl.Float64,
            "quoteQty": pl.Float64,
            "time": pl.Int64,
            "isBuyerMaker": pl.Boolean,
        }
    ).with_columns(
        pl.from_epoch("time", time_unit="ms").alias("datetime")
    )

    # Cache and clean
    df.write_parquet(cache_path)
    zip_path.unlink(missing_ok=True)
    csv_path.unlink(missing_ok=True)
    return df


def download_date_range(symbol: str, start_date: str | datetime, end_date: str | datetime,
                        download_dir: str = "data", cache_dir: str = "cache") -> None:
    """
    Download trade data for a range of dates with a progress bar.
    Each day is cached by download_and_unzip; nothing is returned.
    """
    if isinstance(start_date, str):
        start_date = datetime.strptime(start_date, '%Y-%m-%d')
    if isinstance(end_date, str):
        end_date = datetime.strptime(end_date, '%Y-%m-%d')

    # Inclusive range: both start_date and end_date are downloaded
    num_days = (end_date - start_date).days + 1
    for i in tqdm(range(num_days), desc=f"Downloading {symbol}"):
        current_date = start_date + timedelta(days=i)
        try:
            download_and_unzip(symbol, current_date, download_dir, cache_dir)
        except Exception as e:
            # Log the failed day and continue with the next one
            tqdm.write(f"[ERROR] {symbol} {current_date.date()}: {e}")


def download_trades(symbol: str, no_days: int,
                    download_dir: str = "data", cache_dir: str = "cache",
                    return_trades: bool = False) -> pl.DataFrame | None:
    """
    Download trades for the last N days up to yesterday with a progress bar.
    Returns the concatenated trades if return_trades is True, otherwise None.
    """
    yesterday = datetime.now() - timedelta(days=1)
    start_date = yesterday - timedelta(days=no_days - 1)
    dfs = []
    for i in tqdm(range(no_days), desc=f"Downloading {symbol}"):
        current_date = start_date + timedelta(days=i)
        try:
            if return_trades:
                dfs.append(download_and_unzip(symbol, current_date, download_dir, cache_dir))
            else:
                download_and_unzip(symbol, current_date, download_dir, cache_dir)
        except Exception as e:
            # Log the failed day and continue with the next one
            tqdm.write(f"[ERROR] {symbol} {current_date.date()}: {e}")
    return pl.concat(dfs) if return_trades else None


def download_ohlc_timeseries(symbol: str, no_days: int, time_interval: str,
                             download_dir: str = "data", cache_dir: str = "cache") -> pl.DataFrame:
    """
    Download trades for the last N days up to yesterday with a progress bar
    and aggregate each day into an OHLC time series at time_interval.
    """
    yesterday = datetime.now() - timedelta(days=1)
    start_date = yesterday - timedelta(days=no_days - 1)
    time_series = []
    for i in tqdm(range(no_days), desc=f"Downloading {symbol}"):
        current_date = start_date + timedelta(days=i)
        try:
            trades = download_and_unzip(symbol, current_date, download_dir, cache_dir)
            time_series.append(research.timeseries(trades, time_interval, research.OHLC_AGGS))
        except Exception as e:
            # Log the failed day and continue with the next one
            tqdm.write(f"[ERROR] {symbol} {current_date.date()}: {e}")
    return pl.concat(time_series)


def download_timeseries(symbol: str, no_days: int, time_interval: str, aggs: List[pl.Expr],
                        download_dir: str = "data", cache_dir: str = "cache") -> pl.DataFrame:
    """
    Download trades for the last N days up to yesterday with a progress bar
    and aggregate each day into a time series at time_interval using aggs.
    """
    yesterday = datetime.now() - timedelta(days=1)
    start_date = yesterday - timedelta(days=no_days - 1)
    time_series = []
    for i in tqdm(range(no_days), desc=f"Downloading {symbol}"):
        current_date = start_date + timedelta(days=i)
        try:
            trades = download_and_unzip(symbol, current_date, download_dir, cache_dir)
            time_series.append(research.timeseries(trades, time_interval, aggs))
        except Exception as e:
            # Log the failed day and continue with the next one
            tqdm.write(f"[ERROR] {symbol} {current_date.date()}: {e}")
    return pl.concat(time_series)
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large