
What really makes cars polluting? An in-depth analysis of CO₂ emissions through data science

Last updated on June 14, 2026 by the editorial team

Author: Sai Bhargav Rallapalli

Originally published on Towards AI.

Understanding Car Pollution: A Data Science Perspective

When considering how to tackle vehicle emissions, the Global Automotive Council faces a complex dilemma. Should the focus be on fuel types, engine displacements, or vehicle classes? The answer is not straightforward. Through data science, we can uncover insights that might defy common intuition and lead to more effective strategies.


Building a Predictive Model to Analyze CO₂ Emissions

The article walks through building a CO₂ emissions prediction model with a reported 98.8% accuracy. Starting from a dataset of over 7,000 vehicles, the process involved removing duplicate records and examining the distribution of the target variable. To address multicollinearity, redundant fuel consumption columns were identified with the variance inflation factor (VIF) and removed, and Ridge regression was used to keep the model stable.
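The multicollinearity check described above can be sketched as follows. This is a minimal illustration, not the author's code: the column names (engine_size, fuel_city, fuel_hwy, fuel_comb) and the synthetic data are assumptions made for the example, and the VIF is computed directly from its definition, 1 / (1 - R²), where R² comes from regressing one feature on all the others.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X (2-D array, one row per vehicle)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Include an intercept so R^2 is measured around the column mean.
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(np.inf if r_squared >= 1.0 else 1.0 / (1.0 - r_squared))
    return np.array(vifs)


# Synthetic data for illustration only (not the article's dataset).
rng = np.random.default_rng(0)
n = 200
engine_size = rng.uniform(1.0, 6.0, size=n)
fuel_city = 4.0 + 2.5 * engine_size + rng.normal(0, 2.0, size=n)
fuel_hwy = 3.0 + 1.6 * engine_size + rng.normal(0, 1.5, size=n)
# The combined figure is almost a weighted average of city and highway,
# so it is nearly redundant once both of those columns are present.
fuel_comb = 0.55 * fuel_city + 0.45 * fuel_hwy + rng.normal(0, 0.05, size=n)

X_all = np.column_stack([engine_size, fuel_city, fuel_hwy, fuel_comb])
X_reduced = np.column_stack([engine_size, fuel_comb])

vif_all = variance_inflation_factors(X_all)
vif_reduced = variance_inflation_factors(X_reduced)
print("VIF with all fuel columns:", vif_all.round(1))
print("VIF after dropping city/highway:", vif_reduced.round(1))
```

On data like this, the fuel consumption columns show very large VIFs while all three are present, and the values fall sharply once the redundant ones are dropped. Which columns to keep is a modelling choice; the sketch keeps the combined figure purely as an example.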

Notably, high-emissions outliers were kept in the dataset: these “top 1 percent” vehicles are exactly the ones policymakers need to see in order to regulate emissions effectively. The analysis also uncovers a reversal in how fuel types rank, an instance of Simpson’s Paradox. Ethanol (E85) appears to have higher emissions on average, but once engine size and fuel consumption are controlled for, it emerges as the cleanest fuel. The raw average hides this because Ethanol is predominantly used in larger, more fuel-consuming engines.
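The reversal can be reproduced on a tiny made-up example. The numbers below are invented for illustration and are not taken from the article's dataset, and the sketch controls for engine size by simple stratification rather than by the regression the article describes.

```python
from collections import defaultdict
from statistics import mean

# Invented toy records: (fuel, engine_class, co2_g_per_km).
# Ethanol is deliberately concentrated in large engines.
records = (
    [("gasoline", "small", 180)] * 8
    + [("gasoline", "large", 300)] * 2
    + [("ethanol", "small", 170)] * 2
    + [("ethanol", "large", 285)] * 8
)


def mean_co2_by(rows, key):
    """Average CO2 for each group produced by key(fuel, engine_class)."""
    groups = defaultdict(list)
    for fuel, engine_class, co2 in rows:
        groups[key(fuel, engine_class)].append(co2)
    return {group: mean(values) for group, values in groups.items()}


# Raw comparison: average by fuel only.
overall = mean_co2_by(records, lambda fuel, engine_class: fuel)
# Controlled comparison: average by fuel within each engine class.
stratified = mean_co2_by(records, lambda fuel, engine_class: (fuel, engine_class))

print(overall)
print(stratified)
```

In this toy data ethanol has the higher overall average, yet it is lower than gasoline within both the small and the large engine class. The overall gap comes entirely from ethanol appearing mostly in large engines.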

The Role of Data Science in Policy Recommendations

Using a scikit-learn pipeline with one-hot encoding, the study evaluates model performance (a high R² with low error) and identifies areas for policy improvement. The model is weakest on rare alternative fuel categories, and the analysis points to targeted action against super-emitters. Recommendations include linking fuel mandates to vehicle and engine constraints, a more nuanced approach to reducing emissions.
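A pipeline of the kind described might look like the sketch below. It is an illustration under stated assumptions rather than the study's actual code: the column names, the synthetic data, the choice of StandardScaler for numeric features, the Ridge alpha, and the 75/25 split are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the vehicle dataset (hypothetical columns).
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "engine_size": rng.uniform(1.0, 6.0, n),
    "fuel_type": rng.choice(["gasoline", "diesel", "ethanol"], size=n),
})
df["fuel_comb"] = 4.0 + 1.8 * df["engine_size"] + rng.normal(0, 1.0, n)
offset = df["fuel_type"].map({"gasoline": 0.0, "diesel": 15.0, "ethanol": -20.0})
df["co2"] = 23.0 * df["fuel_comb"] + offset + rng.normal(0, 5.0, n)

numeric = ["engine_size", "fuel_comb"]
categorical = ["fuel_type"]

# Scale numeric columns and one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("ridge", Ridge(alpha=1.0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["co2"], test_size=0.25, random_state=0
)
model.fit(X_train, y_train)
pred = model.predict(X_test)

r2 = r2_score(y_test, pred)
mae = mean_absolute_error(y_test, pred)
print(f"R^2 = {r2:.3f}, MAE = {mae:.1f} g/km")
```

Keeping the encoder inside the pipeline means it is fitted on the training split only, so the test score is not contaminated by categories seen during preprocessing. Breaking the error down per fuel type would be one way to surface the weak spots in rare categories that the study mentions.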

For a comprehensive understanding, read the full blog for free on Medium.



Note: The article reflects the views of the contributing authors and not necessarily those of Towards AI.

