So you want to build a machine learning startup? Here’s what you must do first

BOULDER — For startups interested in using data and machine learning, it might be tempting to jump right into crunching the information. But Sara Bates, co-founder and data scientist at Bates Analytics, a consulting firm that creates machine learning strategy for startups, shared some advice at Boulder Startup Week’s “practical intro to machine learning” on what to do before diving into data.

“This is what you need to know before you implement machine learning in your startup,” Bates said. “A lot of companies focus on machine learning, and are not thinking about the strategy as a whole. They’re missing out on benefits and risking a lot of time and resources.”

Before starting any data-centric startup, Bates encourages founders to make sure they really understand the elements of data science. First comes extracting the data set that’s needed, transforming it into a form that works for your purposes and loading it into whatever algorithms are being used. Then analysis and machine learning can take place. But just as important as the analysis is what comes after: communicating at every step what the analysis means for your line of business, visualizing the results so the point of the project is clear and then deploying the finished product.
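To make the extract, transform, load and analyze sequence concrete, here is a minimal sketch in Python using only the standard library. It is not drawn from Bates’ talk: the sample data, the column names and the choice of a simple least-squares line as the "analysis" step are all assumptions made for illustration.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data: monthly ad spend and sign-ups, with one incomplete row.
RAW_CSV = """month,ad_spend,signups
2024-01,1000,52
2024-02,1500,71
2024-03,,64
2024-04,2000,98
"""


def extract(text):
    """Extract: read raw rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop incomplete rows and convert strings to numbers."""
    clean = []
    for row in rows:
        if row["ad_spend"] and row["signups"]:
            clean.append((float(row["ad_spend"]), float(row["signups"])))
    return clean


def fit_line(pairs):
    """Load and analyze: least-squares fit of signups = slope * spend + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in pairs) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


rows = extract(RAW_CSV)
pairs = transform(rows)
slope, intercept = fit_line(pairs)
print(f"{len(pairs)} of {len(rows)} rows usable; fitted slope = {slope:.3f} sign-ups per dollar")
```

Even in a toy like this, the final print statement stands in for the communication step: a number only matters once someone explains what it means for the business.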

The strategy considerations, Bates said, are crucial to laying the foundation for a successful machine learning project, just as they are for any business.

“Before you think of how you’re going to implement machine learning, you have to answer three questions,” she said. “Who does the work, what are the machine-learning goals and what data do you have? The answers to these will influence the methods you use to carry out your work.”

For example, Bates said a founder has to decide whether to hire a data-science team to run the project, have existing engineers work on it in their spare time or outsource it to an outside data-science team. While using existing engineers might be the most attractive and affordable option, Bates warns against it, because those engineers may not have the background to fully understand the algorithms and the issues that come up. In deciding between building a team in-house and outsourcing, one thing to consider is whether the project could exist without machine learning. If machine learning is crucial to the identity of the project, it’s worthwhile to invest in building a dedicated in-house team.

Being heavily involved in data, machine learning or even artificial intelligence doesn’t exempt a startup from first deciding what the goals and priorities are for the project. Just as important is assessing the quantity and quality of the data before starting.
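A quick way to assess quantity and quality is to count the rows and measure how often key fields are blank. The sketch below is one illustrative approach, not a method Bates described; the function name, the sample records and both thresholds are hypothetical and would need to be set for a real project.

```python
def assess_dataset(rows, required_fields, min_rows=1000, max_missing=0.05):
    """Report row count and per-field missing rates against assumed thresholds."""
    total = len(rows)
    missing_rate = {}
    for field in required_fields:
        # Treat absent keys, None and empty strings as missing values.
        blanks = sum(1 for row in rows if row.get(field) in (None, ""))
        missing_rate[field] = blanks / total if total else 1.0
    return {
        "rows": total,
        "enough_rows": total >= min_rows,
        "missing_rate": missing_rate,
        "fields_too_sparse": [
            field for field, rate in missing_rate.items() if rate > max_missing
        ],
    }


# Hypothetical customer records with gaps in two fields.
sample = [
    {"user_id": "a1", "plan": "pro", "churned": "no"},
    {"user_id": "a2", "plan": "", "churned": "yes"},
    {"user_id": "a3", "plan": "free", "churned": None},
    {"user_id": "a4", "plan": "free", "churned": "no"},
]

report = assess_dataset(sample, ["plan", "churned"], min_rows=3, max_missing=0.2)
print(report)
```

A report like this will not say whether the data can support a given goal, but it flags obvious gaps before any modeling time is spent.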

Only then can you tackle how you’re going to do the work.

“There’s no one-size-fits-all on how to do this,” Bates said. “Not only do you get to choose the languages, algorithms and tools you use, but it’s open-ended on how to implement it. But once you figure out your overall strategy, figuring out your tools is much easier.”