Hi friends,
We all know that machine learning projects are quite different from standard software development projects. For example, estimating an ML project is surprisingly challenging, to the point of being impossible at times. Maintaining an ML project, specifically deciding when to update a model, a pipeline, or a dataset, is also quite different from maintaining a standard dev project.
This of course comes as no surprise, since developing a classic system requires code, code, and a bit more code sprinkled around as microservices, whereas a machine learning system is based on data, lots of data, all the data if possible. One is deterministic, the other less so.
But I digress.
One thing I really enjoy when doing work in, say, C#, is how easy it is to use best practices such as automatic builds. Creating a build that runs every time I push something to main, compiles my code, and deploys it to Azure in a blaze of glory, is quite straightforward. It's not as straightforward for machine learning projects though, so I've written a guide on doing the same with Azure ML Pipelines. Hope you find it useful!
5 things to read
I'm also including five of the most interesting things I've read/listened to on the web lately, which I think you'll enjoy:
- Erik Bernhardsson's story on building a data team. It's well written, it's funny, and most of all it rings true.
- Uber Engineering on tuning model performance
- A summary of Cal Newport's So Good They Can't Ignore You that's better than the book imho
- Derek Sivers on letting go of your projects & ideas and seeing which ones come back
- And finally, how Team Wiz hacked thousands of Azure customers’ databases. Don't miss their post on protecting your environments, too.
In other news
Sandro Mancuso and I are talking about craftsmanship in AI projects at next week's Codecamp on Architecture & Design. It's free for everyone, so I'm counting on you to join us!
Yours truly,
Vlad