This installment of Learning from Machine Learning features Sebastian Raschka, lead AI educator at Lightning AI. A passionate coder and proponent of open-source software, Sebastian is the creator of the mlxtend library and a contributor to Scikit-Learn and dozens of other open-source projects. He is the author of Machine Learning with PyTorch and Scikit-Learn and Machine Learning Q and AI.
I was fortunate to have the opportunity to pick the brain of Sebastian, a kind and thoughtful machine learning expert with over a decade of experience. During our discussion, Sebastian generously shared his insights, forged from years of teaching, tinkering and building in AI. He offers enlightening advice for navigating the field on everything from best practices to the pursuit of Artificial General Intelligence (AGI).

Our wide-ranging discussion yielded many insights, which I’ve summarized into 13 key lessons:
Start simple and be patient
Learn by doing
Always get a baseline
Embrace change
Find balance between specialized and general systems
Implement from scratch when learning
Use proven libraries in production
It’s the last mile that counts
Use the right tool for the job
Seek diversity when ensembling models
Beware of overconfidence (overconfident models :)
Leverage Large Language Models responsibly
Have fun!
💡 Lessons from a Pioneering Expert 💡
1. Start simple and be patient
Approach machine learning with patience, taking concepts step by step to build a solid foundation. “You should make sure you understand the bigger picture and intuition.” Grasp the high-level concepts before getting bogged down in implementation details. Sebastian explains, “I would start with a book or a course and just work through that, almost with a blindness on not getting distracted by other resources.”
Borrowing from Andrew Ng, Sebastian shares, “If we don’t understand a certain thing, maybe let’s not worry about it. Just yet.” Getting stuck on unclear details can slow you down. Move forward when needed rather than obsessing over gaps. Sebastian expands, “It happens to me all the time. I get distracted by something else, I look it up and then it’s like a rabbit hole and you feel, ‘wow, there’s so much to learn’ and then you’re frustrated and overwhelmed because the day only has twenty-four hours, you can’t possibly learn it all.”
Remember it’s about “doing one thing at a time, step by step. It’s a marathon, not a sprint.” For early data scientists, he stresses building strong fundamentals before diving into the specifics of advanced techniques.
2. Learn by doing
“Finding a project you’re interested in is the best way to get involved in machine learning and to learn new skills.” He recalled getting hooked while building a fantasy sports predictor, combining his soccer fandom with honing his data abilities. Sebastian explains, “That’s how I taught myself pandas.” Tackling hands-on projects and solving real problems that you feel passionate about accelerates learning.

3. Always get a baseline
When beginning a new ML project, you should always establish a baseline. For example, when starting a text classification project, Sebastian says, “Even if you know more sophisticated techniques, even if it makes sense to use a Large Language Model… Start with a simple logistic regression, maybe a bag of words to get a baseline.”
By building a baseline before trying more advanced techniques, you gain a better understanding of the problem and the data. If you run into issues when implementing more advanced techniques, having a baseline model for which you have already read and processed the data can help you debug. And if an advanced model underperforms the baseline, that may indicate data issues rather than model limitations.
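To make this concrete, here is a minimal sketch of such a baseline using scikit-learn; the tiny `texts`/`labels` dataset is a placeholder for your own data:

```python
# Minimal text-classification baseline: bag of words + logistic regression.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data -- substitute your own texts and labels.
texts = ["great product", "terrible service", "loved it", "awful experience",
         "works perfectly", "broke after a day", "highly recommend", "never again"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42)

vectorizer = CountVectorizer()                    # bag-of-words features
X_train_bow = vectorizer.fit_transform(X_train)
X_test_bow = vectorizer.transform(X_test)

baseline = LogisticRegression(max_iter=1000)
baseline.fit(X_train_bow, y_train)
print("Baseline accuracy:", accuracy_score(y_test, baseline.predict(X_test_bow)))
```

Any more sophisticated model you try later should have to beat this number.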

4. Embrace change
The field is changing quickly. While it’s important to start slow and take things step by step it is equally important to stay flexible and open to adopting new methods and ideas.
Sebastian stresses the importance of adaptability amid relentless change. “Things change completely. We were using [Generative Adversarial Networks] GANs two years ago and now we’re using diffusion models… [be] open to change.” Machine learning rewards the nimble. He emphasizes being open to new experiences both in machine learning and life.
5. Find balance between specialized and general systems
The pursuit of Artificial General Intelligence (AGI) is a worthy goal, but specialized systems often provide better results. Depending on the use case, a specialized system may be more appropriate than a one-size-fits-all approach. Sebastian discusses how a system may be a combination of smaller models, where a first model determines which specialized model each task should be routed to.
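As a hypothetical illustration of that routing pattern (the task labels, training phrases, and stand-in specialists below are illustrative, not from the episode), a first model can classify an incoming request and dispatch it to a dedicated model:

```python
# Hypothetical router: a first model decides which specialist handles a request.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Router: a small text classifier over task types (illustrative labels).
router = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
router.fit(
    ["translate this to French", "translate to German please",
     "summarize this article", "give me a short summary"],
    ["translation", "translation", "summarization", "summarization"],
)

# Specialists: stand-ins for dedicated, task-specific models.
specialists = {
    "translation": lambda text: f"[translation model handles] {text}",
    "summarization": lambda text: f"[summarization model handles] {text}",
}

def handle(text: str) -> str:
    task = router.predict([text])[0]   # the first model picks the route
    return specialists[task](text)     # dispatch to the specialized model

print(handle("please summarize the quarterly report"))
```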
Regardless, the pursuit of AGI is an incredible motivator and has led to many breakthroughs. As Sebastian explains, the quest for AGI pushed breakthroughs like DeepMind’s AlphaGo beating the best humans at Go. And while AlphaGo itself may not be directly useful, “it ultimately led to AlphaFold, the first version, for protein structure prediction.”
The dream of AGI serves as inspiration, but specialized systems focused on narrow domains currently provide the most value. Still, the race towards AGI has led to advances that found practical application.
6. When learning, implement from scratch
Coding algorithms without depending on external libraries (e.g., using just Python) helps build a better understanding of the underlying concepts. Sebastian explains, “Implementing algorithms from scratch helps build intuition and peel back the layers to make things more understandable.”
Fortunately, Sebastian shares many of these educational implementations through posts and tutorials. We dove into his article Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch, where he breaks down the “self-attention” mechanism, a cornerstone of both transformers and Stable Diffusion.
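For a taste of the exercise, here is a minimal sketch of scaled dot-product self-attention written with plain NumPy; the dimensions and random weights are toy values, not taken from Sebastian’s tutorial:

```python
# Scaled dot-product self-attention from scratch, using only NumPy.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8                 # toy dimensions

X = rng.normal(size=(seq_len, d_model))         # token embeddings for one sequence

# Query/key/value projections (random here; learned in a real model).
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Attention scores: how strongly each token attends to every other token.
scores = Q @ K.T / np.sqrt(d_k)
scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax

output = weights @ V                            # weighted sum of value vectors
print(output.shape)                             # (seq_len, d_k)
```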
7. In production, don’t reinvent the wheel!
In real-world applications, you don’t have to reinvent the wheel. For rebuilding things that already exist, Sebastian notes, “I think that is a lot of work and also risky.” While building from scratch is enlightening, production-ready applications should rely on proven, battle-tested libraries.
8. It’s the last mile that counts
Getting a model to relatively high performance is much easier than squeezing out the last few percentage points to reach extremely high performance. But that final push is vital — it’s the difference between an impressive prototype and a production-ready system. Even if rapid progress was made initially, the final seemingly marginal gains to reach “perfection” are very challenging.
Sebastian uses self-driving cars to drive this point home. “Five years ago, they already had pretty impressive demos… but I do think it’s the last few percent that are crucial.” He continues, “Five years ago, it was almost let’s say 95% there, almost ready. Now five years later, we are maybe 97–98%, but can we get the last remaining percent points to really nail it and have them on the road reliably.”

Sebastian draws a comparison between ChatGPT and self-driving cars. While astounding demos of both technologies exist, getting those last few percentage points of performance to reach full reliability has proven difficult and vital.
9. Use the right tool for the job
Sebastian cautions against forcing ML everywhere, stating “If you have a hammer, everything looks like a nail… the question becomes when to use AI and when not to use AI.” The trick is often knowing when to use rules, ML, or other tools. Sebastian shares, “Right now, we are using AI for a lot of things because it is exciting, and we want to see how far we can push it until it breaks or doesn’t work… sometimes we have nonsensical applications of AI because of that.”
Automation has limits. Sometimes rules and human expertise outperform AI. It’s important to pick the best approach for each task. Just because we can use AI/ML as a solution doesn’t mean we should for every problem.
10. Seek Diversity in Model Ensembles
Ensemble methods like model stacking can improve prediction robustness, but diversity is key — combining correlated models that make similar types of errors won’t provide much upside.
As Sebastian explains, “Building an ensemble of different methods is usually something to make [models] more robust and [produce] accurate predictions. And ensemble methods usually work best if you have an ensemble of different methods. If there’s no correlation in terms of how they work. So they are not redundant, basically.”
The goal is to have a diverse set of complementary models. For example, you might ensemble a random forest model with a neural network, or a gradient boosting machine with a k-nearest neighbors model. Stacking models that have high diversity improves the ensemble’s ability to correct errors made by individual models.
So when building ensembles, seek diversity — use different algorithms, different feature representations, different hyperparameters, etc. Correlation analysis of predictions can help identify which models provide unique signal vs redundancy. The key is having a complementary set of models in the ensemble, not just combining slight variations of the same approach.
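A minimal sketch of both ideas with scikit-learn, assuming a synthetic tabular dataset in place of real data:

```python
# Stack a diverse set of models, then check how correlated their predictions are.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

base_models = [
    ("rf", RandomForestClassifier(random_state=42)),
    ("gb", GradientBoostingClassifier(random_state=42)),
    ("knn", KNeighborsClassifier()),
]

stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression())
stack.fit(X_train, y_train)
print("Stacked accuracy:", stack.score(X_test, y_test))

# Pairwise correlation of predictions: highly correlated models add little diversity.
preds = {name: model.fit(X_train, y_train).predict(X_test)
         for name, model in base_models}
names = list(preds)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"corr({a}, {b}) = {np.corrcoef(preds[a], preds[b])[0, 1]:.2f}")
```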
11. Beware of overconfidence
“There’s a whole branch of research on [how] neural networks are typically overconfident on out-of-distribution data.” ML predictions can be misleadingly confident on unusual data. Sebastian describes, “So what happens is, if you have data that is slightly different from your training data, or let’s say out of the distribution, the network, if you program it to give a confidence score as part of the output, [will produce a] score [that] for the data where it’s especially wrong is usually overconfident… which makes it even more dangerous.” Confidence scores can be high even for wrong predictions, making them misleading on unfamiliar data. Validate reliability before deployment rather than blindly trusting confidence scores.
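A small sketch of the phenomenon, assuming a toy setup: a small neural network trained on two well-separated clusters will typically report near-certain probabilities even on points far outside anything it saw in training:

```python
# Overconfidence on out-of-distribution data: a toy demonstration.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

# Training data: two well-separated 2D clusters.
X_train, y_train = make_blobs(n_samples=200, centers=[[-2, 0], [2, 0]],
                              cluster_std=0.5, random_state=42)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=42)
clf.fit(X_train, y_train)

# Out-of-distribution points, nowhere near the training clusters.
X_ood = np.array([[50.0, 50.0], [-40.0, 30.0]])
probs = clf.predict_proba(X_ood)
print(probs.max(axis=1))   # typically close to 1.0: confident, but meaningless
```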
12. Leverage Large Language Models responsibly
ChatGPT and other generative models are good brainstorming partners and can be used for ideation when “it doesn’t need to be 100% correct.” Sebastian warns that a model’s output should not be used as the final product. Large language models can generate text to accelerate drafting, but they require human refinement. It’s important to be fully aware of the limitations of LLMs.
13. Remember to have fun!
“Make sure you have fun. Try not to do [it] all at once.” Learning is most effective and sustainable when it’s enjoyable. Passion for the process itself, not just outcomes, leads to mastery. Sebastian emphasizes remembering to recharge and to connect with others who inspire you. He shares, “Whatever you do, have fun, enjoy, share the joy… things are sometimes complicated and work can be intense. We want to get things done, but don’t forget… to stop and enjoy sometimes.”
Conclusion
The path to understanding machine learning and AI can be arduous and filled with obstacles. At times it can be overwhelming, and frankly it’s impossible to explore, consume and absorb it all. The sheer scope of the field stretches boundlessly in all directions, and at times the dizzying pace can be disorienting. With perseverance and the right teachers and mentors, we can traverse the ever-changing landscape of machine learning.
Sebastian’s wisdom highlights that patience, passion, simplicity and openness can help unlock machine learning for beginners and experts alike. We delve deepest when standing on firm fundamentals, innovating through openness and diversity, embracing the journey and welcoming change. Sebastian’s insights help unravel knotty concepts, prioritize core fundamentals and rekindle the spirit of curiosity that drew learners to AI in the first place. Educators like Sebastian Raschka make the ML journey easier and help light the way.
Listen on your favorite podcast platform:
https://rss.com/podcasts/learning-from-machine-learning/
Resources to learn more about Sebastian Raschka and his work:
Machine Learning with PyTorch and Scikit-Learn
Resources to learn more about Learning from Machine Learning and the host: https://www.linkedin.com/company/learning-from-machine-learning
https://www.linkedin.com/in/sethplevine/
https://medium.com/@levine.seth.p
References from Episode
https://scikit-learn.org/stable/
http://rasbt.github.io/mlxtend/
https://github.com/BioPandas/biopandas
Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch
Andrew Ng - https://www.andrewng.org/
Andrej Karpathy - https://karpathy.ai/
Paige Bailey - https://github.com/dynamicwebpaige
Contents
01:15 - Career Background
05:18 - Industry vs. Academia
08:18 - First Project in ML
15:04 - Open Source Projects Involvement
20:00 - Machine Learning: Q&AI
24:18 - ChatGPT as Brainstorm Assistant
25:38 - Hype vs. Reality
27:55 - AGI
31:00 - Use Cases for Generative Models
34:01 - Should the goal be to replicate human intelligence?
39:18 - Delegating Tasks using LLM
42:26 - ML Models are Overconfident on Out-of-Distribution Data
44:54 - Responsible AI and ML
45:59 - Complexity of ML Systems
47:26 - Trend for ML Practitioners to move to AI Ethics
49:27 - What advice would you give to someone just starting out?
52:20 - Advice that you’ve received that has helped you
54:08 - Andrew Ng Advice
55:20 - Exercise of Implementing Algorithms from Scratch
59:00 - Who else has influenced you?
01:01:18 - Production and Real-World Applications - Don’t reinvent the wheel
01:03:00 - What has a career in ML taught you about life?