This is the story: we are investors, and we have, let’s say, 1,199 USD (soon you will understand why I’m using 1,199 and not 1,000), and our goal is to make the most of it (what else could we want?). We decided to go into the crypto market because of its volatility. To maintain our partnership, we need to make consistent profits. We want to use machine learning without forgetting about the business understanding used by traders. Now, the only thing we need to know is HOW (Wow… I’m definitely using that in my next freestyle competition). Will we accomplish this? …
What about starting with a quote?
“In the past, I’ve tried to teach machine learning using […] different programming languages […], and what I found is that students were able to learn the most productively […] using a relatively high level language like Octave.” (Andrew Ng)
Building something from scratch was the method Andrew Ng used to teach his famous Coursera machine learning course (in plain Octave 😂), one of the highest-rated courses on the platform. Like Andrew, I truly believe that building things is the best way to learn because it forces us to understand every step of the algorithm. Unlike Andrew, I prefer to use Python and NumPy 😎 because of their simplicity and massive adoption. …
If you always wanted to learn decision trees, just by reading this, you’ve received a beautiful opportunity, a stroke of luck, I might say. But as entrepreneurs declare, “it’s not enough to be in the right place; you also need to take the opportunity when it comes”. So, consider giving yourself some time and get comfortable, because what’s coming is not the shortest guide, but it might be the best.
Decision trees are the foundation of the best-performing algorithms in Kaggle competitions and in real life. An indicator of this is that you are certainly going to run into a “max_depth” parameter in almost every tree-based ensemble. …
If that sounds like a dream to you, let me tell you that it sounded like a dream to product developers too. The result: Azure ML SDK.
Today we will be exploring how to deal with infrastructure, environments, and deployments in the cloud. We will dig deeper into concepts and the underlying structures required for you to master these skills.
I know that concepts by themselves might feel like castles in the air. But don’t worry, there will be enough code for you to feel that the castles are actually on the ground.
If we talk about cloud providers, there are three leading players: AWS, GCP, and Azure. When it comes to cloud computing, it’s not rocket science to figure out that data scientists are among their main customers. Maybe not directly them, but, surprise! Companies are, still, made up of people. …
A few years ago, Docker started gaining popularity. Everyone claimed that this tool was saving them incredible amounts of time. All you needed to solve the messy process of building, deploying, and managing apps was Docker. It was like heaven.
But… could such a wonderful thing be real? Well, it turns out that it was all true. Might a little piece of paradise have mistakenly fallen to earth? …
One of the most significant concerns in this data science era is operationalizing the full lifecycle of artificial intelligence. As you might know, the foundation for machine learning is data. If you want to be sure that your project has full traceability, you can’t forget about its less sexy component.
This article explains why data versioning is essential and how to do it in Azure Machine Learning.
But first, let’s dive a little bit into what MLOps actually is. An excellent way to understand this is by looking at the infographic created by Microsoft. …
Before you start judging me and worrying that I might be suffering from “confirmation bias syndrome” (I took the exam and passed, so I could be tempted to find every possible argument that this is the best certificate ever), I want to tell you that you don’t need to!
Here you won’t encounter statements like “study hard, become the best, like me!” or “It was incredibly hard, only experienced data scientists will pass.” …
Imagine that you get a great job as the head of the data science team at a new e-commerce company mainly focused on selling men’s clothes. During the interview process, you noticed that the questions focused heavily on clustering, and now you get why: the CEO of the company is pressuring the marketing area to elaborate campaigns targeting the most representative groups. The marketing team knows that you are the best fit to help them! So they set up a meeting to figure this out. After a short introduction to the situation, they quickly get to the point. “We are experts at defining well-suited campaigns when we know the people we are trying to reach, but we first need to identify the most representative groups of clients with a description of their behavior. …
Yup, you read that correctly: Optimizing the Target Variable. You might be thinking, “but, that’s like cheating” or “can you change the past?”. Stop worrying! I’m neither a liar nor a time traveler who attempts to change the past just to improve his machine learning models (there are plenty of more exciting things to do with that power).
If you went for your first dataset, learning about Titanic, where you tried everything to predict if the passenger survived, or the MNIST database, where there were numbers that even you couldn’t recognize, you might be a little confused. …
Data Science Project: Cryptocurrencies Part 1 — Motivation —
Data Science Project: Cryptocurrencies Part 2 — Volume and Data Source —
Data Science Project: Cryptocurrencies Part 3 — Becoming a Trader Data Scientist —
Today I’m going to introduce you to one of the variables that we will be using in our models. I will describe the whole process, from picking up a trading concept to creating an insightful variable. You will understand how to transform your business insights into real machine learning material.
The horizon that I would like to evaluate is 5-minute data aggregation. This means that I will collect data at 5-minute intervals, with the standard OHLCV information. I ran into a little problem with Cryptocompare. I just found out that gathering more than a week of minute data requires an enterprise account. …
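To make the 5-minute aggregation a bit more concrete, here is a minimal sketch of how 1-minute OHLCV bars could be rolled up into 5-minute bars with pandas. It is not the code used in the project: the function name `to_five_minute_bars`, the column names, and the sample prices are all assumptions made up for illustration, and no Cryptocompare call is involved.

```python
import pandas as pd


def to_five_minute_bars(minute_bars: pd.DataFrame) -> pd.DataFrame:
    """Roll 1-minute OHLCV bars up into 5-minute OHLCV bars.

    Assumes a DatetimeIndex and the (hypothetical) column names
    open, high, low, close, volume.
    """
    rules = {
        "open": "first",   # first traded price in the window
        "high": "max",     # highest price in the window
        "low": "min",      # lowest price in the window
        "close": "last",   # last traded price in the window
        "volume": "sum",   # total volume in the window
    }
    bars = minute_bars.resample("5min").agg(rules)
    # Windows with no trades at all have no open price; drop them.
    return bars.dropna(subset=["open"])


# Made-up sample: ten consecutive 1-minute bars.
index = pd.date_range("2021-01-01 00:00", periods=10, freq="min")
opens = [100.0 + i for i in range(10)]
minute_bars = pd.DataFrame(
    {
        "open": opens,
        "high": [o + 2 for o in opens],
        "low": [o - 2 for o in opens],
        "close": [o + 1 for o in opens],
        "volume": [10.0] * 10,
    },
    index=index,
)

five_minute_bars = to_five_minute_bars(minute_bars)
print(five_minute_bars)
```

Ten minutes of sample data collapse into two 5-minute bars. The same rules apply to any other horizon by changing the `"5min"` frequency string.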