With the Yahoo! Finance market data downloader (yfinance), we can pull historical data on virtually any stock with a single line of code.
You can install yfinance with pip:
pip install yfinance
From there, simply import the library and pull a .Ticker(). Let’s do the New York Times Company:
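A minimal sketch of that call (the variable name nyt is our choice):

import yfinance as yf

nyt = yf.Ticker('NYT')  # the New York Times Company’s ticker symbol
nyt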
That outputs a yfinance.Ticker object, which holds historical $NYT data accessible with .history(). Setting period=’max’ will return all data:
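For example, continuing the sketch above (hist is the name used below):

hist = nyt.history(period='max')  # every trading day of $NYT data available
hist.tail()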
Prophet expects input data to have 2 columns, ds and y, so let’s just copy the historical dates (hist.index) and adjusted closing prices (hist[‘Close’]) to a new DataFrame.
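That copy might look like this (a sketch; Prophet only needs the two renamed columns):

import pandas as pd

df = pd.DataFrame()
df['ds'] = hist.index           # historical dates
df['y'] = hist['Close'].values  # adjusted closing prices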
…
“Time series forecasting is the use of a model to predict future values based on previously observed values.”
— Wikipedia
In this story, we’ll break down and examine the R API of Prophet, a procedure for forecasting time series data that Facebook open-sourced in February 2017, with v0.6 released in March 2020.
While advancements in data science often increase the infamous “skills…
OpenCV’s CUDA Python module is a lot of fun, but it’s a work in progress. For starters, we have to load in the video on CPU before passing it (frame-by-frame) to GPU; cv.cuda.imread() has not been built yet, so each frame has to be .upload()ed after it’s read.
cv.VideoCapture() can be used to load and iterate through video frames on CPU. Let’s read the corn.mp4 file with it:
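A sketch of that (variable names are ours; corn.mp4 is the file from this story):

import cv2 as cv

cap = cv.VideoCapture('corn.mp4')  # open the video on CPU
ret, frame = cap.read()            # grab the 1st frame; ret flags whether a frame came back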
After .read()ing the 1st image, we’re ready to make a GPU matrix (picture frame) so that image can be .upload()ed to our GPU.
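Continuing the sketch:

gpu_frame = cv.cuda_GpuMat()  # empty GPU matrix (the “picture frame”)
gpu_frame.upload(frame)       # move the 1st image onto the GPU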
Great! But what about the 2nd image? Well, you probably noticed .read() output 2 variables, ret…
First, we need to create GPU space (gpu_frame) to hold an image (as a picture frame holds a picture) before we can upload our image to the GPU. Next, load the image into memory with CPU (screenshot), and .upload() it to the gpu_frame (frame the image):
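Put together, that might look like this (the screenshot.jpg path is a placeholder of ours):

import cv2 as cv

gpu_frame = cv.cuda_GpuMat()              # 1. create GPU space to hold an image
screenshot = cv.imread('screenshot.jpg')  # 2. load the image into memory with CPU
gpu_frame.upload(screenshot)              # 3. frame the image (move it to the GPU)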
Image now in frame, we can start having fun. OpenCV CUDA functions return cv2.cuda_GpuMat (GPU matrices), so each result can be operated on without the user having to re-.upload().
Let’s convert the image…
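A sketch of one such conversion (the color code and the follow-up resize are illustrative choices of ours; both calls hand back cv2.cuda_GpuMat objects, and .download() brings the final result back to CPU as a NumPy array):

rgb = cv.cuda.cvtColor(gpu_frame, cv.COLOR_BGR2RGB)  # result is already a GpuMat
small = cv.cuda.resize(rgb, (640, 360))              # so it can be passed straight to the next call
result = small.download()                            # copy the finished image back to CPU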
Install PyCUDA with pip:
pip install pycuda
If you don’t have pip, get pip.
nvcc Path
nvcc comes preinstalled, but your Nano isn’t exactly told about it. Use sudo to open your bashrc file:
sudo gedit ~/.bashrc
Add a blank line, then these 2 lines (letting your Nano know where CUDA is) to the bottom of the file:
export PATH=${PATH}:/usr/local/cuda/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
Save, close, then (back in Terminal) source the bashrc file:
source ~/.bashrc
You can now check your nvcc version with:
nvcc --version
By using PyCUDA’s SourceModule to create a function (add_them) with CUDA C code, we can simply .get_function()…
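A minimal sketch of that workflow, closely following PyCUDA’s documented SourceModule example (the kernel body and array size here are ours):

import numpy as np
import pycuda.autoinit              # sets up a CUDA context on import
import pycuda.driver as drv
from pycuda.compiler import SourceModule

# CUDA C source, compiled at runtime by SourceModule
mod = SourceModule("""
__global__ void add_them(float *dest, float *a, float *b)
{
    const int i = threadIdx.x;
    dest[i] = a[i] + b[i];
}
""")

add_them = mod.get_function("add_them")  # pull the compiled kernel into Python

a = np.random.randn(400).astype(np.float32)
b = np.random.randn(400).astype(np.float32)
dest = np.zeros_like(a)

# one block of 400 threads; each thread adds one pair of elements
add_them(drv.Out(dest), drv.In(a), drv.In(b), block=(400, 1, 1), grid=(1, 1))
print(np.allclose(dest, a + b))  # True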
sudo pip install selenium
or
sudo pip3 install selenium
Assuming you have pip on your system, you can install or upgrade Selenium’s Python bindings with one of the above (the latter worked for me). If you don’t have pip, get pip.
sudo apt-get install chromium-chromedriver
To interface with a browser (Chromium in our case), Selenium requires a driver (Chromium Chromedriver in our case).
Paste the following into your favorite editor or Python terminal, and if it runs, you’re good to go!
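A minimal check along these lines should do it (an illustrative sketch, not necessarily the story’s exact snippet; the URL is arbitrary):

from selenium import webdriver

driver = webdriver.Chrome()           # uses the chromedriver installed above (must be on PATH)
driver.get('https://www.python.org')
print(driver.title)                   # prints the page title if everything is wired up
driver.quit()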
Thanks for reading! Please feel free to respond with any questions.
Cross validating Prophet with Dask is done the same as cross validating Prophet without Dask, but you pass parallel=’dask’ into the cross_validation() function, like so:
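For example (a sketch; the fitted model m and the cutoff windows are assumptions, and older Prophet versions import from fbprophet instead of prophet):

from dask.distributed import Client
from prophet.diagnostics import cross_validation

client = Client()  # start (or connect to) a Dask cluster before using parallel='dask'

# assumes m is a Prophet model that has already been fit
df_cv = cross_validation(m, initial='90 days', period='30 days',
                         horizon='30 days', parallel='dask')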
In this story, we’ll use Prophet to forecast the average distance of a NYC yellow cab trip by day. To quickly judge our model’s performance, we’ll call on Dask to parallelize cross-validation across your system’s CPUs.
Afterwards, we’ll apply this parallelized cross_validation() to perform hyperparameter optimization (HPO) and fine-tune that model.
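That HPO step might be sketched like this (the parameter grid and the cutoff windows are illustrative, not the story’s exact values):

import itertools
import numpy as np
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics

param_grid = {
    'changepoint_prior_scale': [0.001, 0.01, 0.1, 0.5],
    'seasonality_prior_scale': [0.01, 0.1, 1.0, 10.0],
}
all_params = [dict(zip(param_grid.keys(), v))
              for v in itertools.product(*param_grid.values())]
rmses = []

for params in all_params:
    m = Prophet(**params).fit(df)  # df holds the daily 'ds' / 'y' trip-distance data
    df_cv = cross_validation(m, initial='90 days', period='30 days',
                             horizon='30 days', parallel='dask')
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p['rmse'].values[0])

best_params = all_params[np.argmin(rmses)]
print(best_params)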
Logistic regression is an algorithm used to predict the probability of events, given some other measurements. It is used when the dependent variable (“target”) is categorical. For example, predicting whether a dessert is cake (1) or pie (0).
Logistic regression can also be used in non-binary situations, but let’s cover that in a later post and stick to binary logistic regression for now.
Essentially, the logistic regression function takes examples with known classes (e.g. cake (1) or pie (0)), fits an S-shaped (sigmoid) curve to their distribution, and…
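A quick sketch with scikit-learn (the library choice and the toy numbers are ours) shows that fit-then-probability workflow:

import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data: one measurement per dessert, labeled cake (1) or pie (0)
X = np.array([[2.1], [1.8], [3.5], [0.9], [4.2], [1.2]])
y = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X, y)
print(model.predict_proba([[3.0]]))  # [P(pie = 0), P(cake = 1)] for a new dessert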
K-Means is an easy way to cluster data. It randomly selects K points in a given dataset, then computes which of the dataset’s instances are closest to each point (making clusters).
For every cluster, the mean of its values (instances) is computed, and this mean becomes that cluster’s new point (centroid).
Once a cluster’s centroid has moved, its distance from the dataset’s instances has changed, and instances may be added to or removed from that cluster. The mean is recalculated & replaced until it stops moving or has hit a given maximum number of iterations (max_iter), whichever comes first.
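With scikit-learn (an illustrative choice; the toy points are ours), the same idea looks like:

import numpy as np
from sklearn.cluster import KMeans

# toy 2-D points with two obvious groups
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

kmeans = KMeans(n_clusters=2, max_iter=300, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # which cluster each point ended up in
print(kmeans.cluster_centers_)  # the final centroids (each cluster’s mean)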
K-Nearest Neighbors (KNN) is a simple way to determine the value of something by asking what the values of the K nearest things to it are.
So if K=3, what do the 3 nearest things look like?
KNN is a non-parametric, lazy learning algorithm.
Non-parametric models differ from parametric models in that the model structure is not specified a priori [(beforehand)] but is instead determined from data.
Basically, KNN makes no assumption on the data’s underlying distribution, and builds itself by analyzing the data. The one set “parameter” is K.
The term non-parametric is not meant to imply that such…
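As a quick illustration (scikit-learn and the toy data are our choices, with K=3 from above):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# toy data: two measurements per thing, two known classes
X = np.array([[1, 1], [1, 2], [2, 2], [8, 8], [8, 9], [9, 9]])
y = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)  # K = 3
print(knn.predict([[2, 1]]))  # the 3 nearest neighbors all have class 0, so it predicts 0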
Friend links: https://gumdropsteve.github.io/blog — “Energy may be likened to the bending of a crossbow, decision to the releasing of a trigger.”