Many engineering students in India learn to build lab-perfect models to capture data, but few are taught how to audit that data for real-world application and measurable socio-economic impact. A case study identifying a 30% error margin in Delhi’s 2023 air quality index (AQI) data shows exactly why ‘Forensic Data Literacy’ needs to become part of the core engineering curriculum in the country.
The importance of accurate AQI readings
Each winter, instead of enjoying misty mornings with tea on their balconies, residents of India’s national capital region (Delhi-NCR) are forced to arm themselves with masks, air purifiers and AQI apps to survive (or slowly die) through yet another day in one of the world’s most polluted cities. AQI readings assume greater significance at this time, as they are the benchmark for the government’s restrictions on citizens’ day-to-day activities in general, and on the region’s larger economic activities in particular, under Graded Response Action Plans (GRAP) of varying intensity. Yet a structural vacuum at the level of AQI monitoring itself, a “Truth Gap”, adds to the crisis instead of mitigating it, because it leads to restrictions being imposed prematurely or too late. Hence the importance of accurate AQI readings, something that traditional AQI monitoring systems have arguably failed to deliver.
The truth gap
For decades, traditional AQI monitoring systems have been based on a linear principle: mechanically measure fine particulate matter of a specific size (PM 2.5) and report that reading as accurate. However, AQI is not just a number, and the real world is not an ideal lab setting. Parameters such as Relative Humidity (RH), barometric pressure and even temperature constantly interact with particulate matter and change the measured value of the pollutants actually present in the air.
The pollution lock effect
Traditional AQI monitoring systems did not account for these interactions at all, and so showed inaccurate AQI readings. For instance, monitoring traditionally treated relative humidity and atmospheric pressure as separate pieces of information. “Pollution Lock”, however, is the point at which the air stops dispersing the pollutant and starts trapping it, and the AI detects this trapping through the interaction between variables like relative humidity and atmospheric pressure. The risk contour map thus visualizes the non-linear ‘Red Zone’ where specific thresholds of atmospheric pressure and relative humidity converge to create a structural trapping effect for urban pollutants. By extension, a correlation matrix quantifies the dynamic bonds between weather variables. It identifies the precise strength of the relationship between humidity, pressure, temperature, aerosol hygroscopicity (a substance’s ability to attract, adsorb or absorb water molecules directly from the surrounding atmosphere) and the resulting AQI deviations. The outcome is a correctional framework that can give a holistic and accurate view of the AQI at a given time.
Forensic audit of Delhi’s air quality in 2023-2024 for actual AQI readings
As a case study, to bridge this ‘truth gap’ in Delhi’s AQI readings, the Polynomial-Enhanced Random Forest Regression (PERFR) model was applied to Delhi’s air quality data for the 2023-2024 period. Unlike the earlier approach, which treated real environmental factors affecting air quality only as secondary noise, this model tried to capture the interaction between the different factors affecting the aerosol mix being monitored by the AQI sensors.
Interaction terms such as (Humidity * Pressure) proved more accurately predictive than individual weather variables. The feature-importance ranking table illustrates how the interaction between secondary variables (such as temperature and humidity) mathematically outweighs individual metrics in determining predictive accuracy. The model attributed the error in approximately 60% of Delhi’s AQI data to Relative Humidity (RH). It then plots the hygroscopic growth curve for the aerosol mixture found in the Delhi region, showing how water molecules coat dust particles and increase their apparent size as the sensor’s laser measures them, which directly distorts AQI readings. Ultimately, the sensors were found to over-report consistently in Delhi’s winter months, on 4 out of 5 high-humidity days in the 2023-2024 audit, resulting in a “ghost spike”. On the other hand, the AQI monitors under-reported in extremely arid conditions (RH < 25%), although such conditions rarely occur in the peak winter smog season.
The impact of the ghost spike
The ultimate test of this model came on November 3-5, 2023, when Delhi’s AQI hovered around the hazardous saturation level of 479-500, triggering drastic Stage III/IV GRAP restrictions and causing severe economic disturbance. Forensic results showed that humidity peaked on these mornings, with relative humidity measured at 85%-92%. Applying the AI correctional framework showed that the sensors were over-estimating pollutant levels by 30%. Ghost data: approximately 150 points of the hard spike were recognized as water molecules fooling the sensors (hygroscopic growth). In effect, the emergency economic decisions taken under GRAP-III/IV, which affected the livelihoods of millions of Delhi-NCR residents, were based on readings distorted by atmospheric interference.
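The ghost-spike figures above can be checked with simple arithmetic. The sketch below assumes that the 30% over-estimate is a share of the reported reading, which is the interpretation under which a reading of 500 yields the roughly 150 ghost points quoted. The function name `split_ghost_component` is hypothetical, and treating the index as linearly scalable is a simplification, since a real correction would act on the underlying concentrations.

```python
def split_ghost_component(reported_aqi, ghost_fraction=0.30):
    """Split a reported reading into (ghost_points, corrected_reading).

    ghost_fraction is the assumed share of the reported value caused by
    humidity interference; 0.30 is the figure quoted for the audit.
    """
    if not 0.0 <= ghost_fraction < 1.0:
        raise ValueError("ghost_fraction must be in [0, 1)")
    ghost = reported_aqi * ghost_fraction
    return ghost, reported_aqi - ghost


# The two ends of the 479-500 range reported for November 3-5, 2023.
for reported in (479, 500):
    ghost, corrected = split_ghost_component(reported)
    print(f"reported {reported}: ghost ~{ghost:.0f}, corrected ~{corrected:.0f}")
```

Under this assumption, a reported 500 separates into about 150 ghost points and a corrected reading of about 350.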
Real-world deployment
Using AI and the PERFR model, an industrially applicable invention was developed. If deployed within real-world air quality monitoring infrastructure, it would directly influence socio-economic decisions affecting the daily lives of a city’s residents, an indispensable need for Smart Cities. The system can be applied to government air quality monitoring networks, to smart city infrastructure (where real-time, reliable pollution measurements are critical for public safety and urban planning), to low-cost sensor networks to upgrade their effective accuracy, and to environmental compliance systems used by regulatory agencies for emissions monitoring and enforcement.
Why engineers need forensic data literacy
This is an achievement most engineers can replicate, provided they are adequately trained in forensic data literacy and in integrating AI to develop real-world applications. By integrating AI architectures that filter environmental “noise” at the point of collection, the “Truth Gap” can be eliminated, and this is only one example that can open doors to several inventions. There is a growing need in developing economies like India to train aspiring engineers and researchers to move from academic publication to measurable social impact, because in these countries the tightrope walk between economic success and environmental hazards is much longer than in developed nations. Every invention that can reduce this distance contributes directly to India’s development journey, and no one is better placed to contribute than India’s current and future engineers.
(Kunal Goyal is an engineer and researcher specializing in advanced sensor technologies, energy systems, and industrial materials strategy.)