Blog
Data Science & ML
Random Forest on GPUs: 2000x Faster than Apache Spark
This blog post compares RAPIDS and Dask against Apache Spark for random forest model training.
Supercharging Hyperparameter Tuning with Dask
The distributed computing framework Dask is great for hyperparameter tuning, since you can train different parameter sets concurrently.
Practical Issues Setting up Kubernetes for Data Science on AWS
Data science workflows don't always match those of software engineering, and they require special setup on Kubernetes.
Setting Up Your Data Science & Machine Learning Capability in Python
Python is a great language to base your DS/ML framework on, and it lets you avoid being locked into one vendor-specific framework.
Snowflake and Dask
This article covers efficient ways to load data from Snowflake into a Dask distributed cluster.