data-science

Introduction Before I enrolled in this data analytics program, Excel was something I associated mostly with tables and basic arithmetic. I knew it existed, I had opened it a few times, and I had absolutely no idea what it was truly capable of. After just one week of structured learning, I can confidently say that my perception of Excel has shifted entirely. It is not just a spreadsheet tool; it i…

Does the First Dragon Help Scaling Compositions Win? Every League of Legends player has heard some version of this argument: "We're a scaling comp, just don't feed and we win late game." But what actually accelerates that scaling? Is the first dragon one of those levers? My hypothesis was simple: Does securing the first dragon before 10 minutes improve the win rate of scaling compositions more th…

Enterprise customers increasingly face data sovereignty, compliance, and regulatory requirements, which makes AI systems integration more important. To support these needs, IBM SQL Data Insights Pro (SQL DI Pro), generally available as of March 2026, introduces semantic search, similarity discovery, anomaly detection, and unified analysis of structured and unstructured data in Db…
Before this week, I thought Excel was just a fancy calculator with boxes. But after three days of my Data Science & Analytics course, I realise I was wrong. Really wrong. Excel is a spreadsheet tool used by millions of people from small business owners to data analysts at giant companies. And the best part? You don’t need to be a programmer to use it. You just need to know a few tricks. Here’s ho…
This project introduces SECS, a system that for the first time can work on raw spectroscopic data: data that contains impurities, solvents, and experimental artefacts.
When you build an AI system for marketing performance monitoring, one tempting idea is to let the LLM decide everything. Campaign pacing is off. Creative frequency is too high. A product category is spending inefficiently. So the natural thought is: Let’s send the current issue and previous issue to the LLM and ask if this is new, recurring, worsening, or improving. Something like this: state = l…
Introduction Hello, beginners! I'm somewhere between a beginner and an intermediate in data science; I pursued this path after graduating in Applied Statistics with Computing. While on campus, I didn't do much Excel in my coursework. We were only introduced to other database software and statistical tools such as SQL and STATA. That led me to underestimate the power of Excel, and I missed out on a lot. I th…
There's a normal way to analyze your YouTube watch history. You export it from Google Takeout, open a Jupyter notebook, pd.read_json(), run a couple of value_counts(), feel a brief flicker of shame, and close the laptop. I did not do that. Instead I built a full Bronze → Silver → Gold medallion lakehouse on Databricks: Delta Lake, PySpark, an enrichment layer that calls the YouTube Data API, a…
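The "normal way" mentioned above can be sketched in a few lines. This is a minimal illustration rather than the author's code: it swaps pandas' value_counts() for the standard library's Counter so it runs without extra packages, and the sample records and field names ("title", "subtitles", "name") are assumptions about what a Takeout watch-history export looks like.

```python
import json
from collections import Counter

# Hypothetical sample shaped like a watch-history export.
# The field names here are assumptions, not a documented schema.
raw = """[
  {"title": "Watched Video A", "subtitles": [{"name": "Channel X"}]},
  {"title": "Watched Video B", "subtitles": [{"name": "Channel Y"}]},
  {"title": "Watched Video C", "subtitles": [{"name": "Channel X"}]},
  {"title": "Watched a removed video"}
]"""

history = json.loads(raw)


def channel_counts(entries):
    """Count watched videos per channel, like value_counts() on a channel column."""
    counts = Counter()
    for entry in entries:
        subtitles = entry.get("subtitles") or []
        # Entries without channel info are grouped under "unknown".
        name = subtitles[0].get("name", "unknown") if subtitles else "unknown"
        counts[name] += 1
    return counts


counts = channel_counts(history)
top_channels = counts.most_common()  # most-watched channels first
```

With a real export, the inline string would be replaced by the contents of the exported JSON file, and the field names would need checking against that file.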

The International Monetary Fund (IMF) has announced applications for its prestigious Research Analyst Program (RAP), an exceptional global career opportunity designed for highly talented and motivated recent graduates who are passionate about economics, finance, statistics, data science, and quantitative research. This highly competitive program gives young professionals the chance to work direct…

Mayo Clinic Comprehensive Cancer Center researchers will present more than 30 studies at the 2026 American Society of Clinical Oncology (ASCO) Annual Meeting, highlighting advances in precision oncology, early cancer detection, artificial intelligence (AI) and personalized cancer care.
Originally published at https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-24-fabric-ai-functions-data-workflows.html Most enterprise GenAI demos start in the wrong place. They start with a chat window. The more useful place is usually earlier: inside the data workflow, before the dashboard, before the semantic model, before the analyst has to clean the same messy text for the tenth time. T…
Scientific Reports, Published online: 26 May 2026; doi:10.1038/s41598-026-54119-6 Data-driven prediction of micro-piled raft load–settlement using machine learning and Monte Carlo simulation

What 3.9M powerlifting records tell us about competition strategy: an EDA with Python When I started this EDA project for my Data Science Master at Evolve, I picked the Open Powerlifting dataset because, besides being a gym rat, I've always been curious about competition strategy in powerlifting. The dataset Open Powerlifting is an open-source project that tracks powerlifting competition resu…
Most financial institutions already operate across a complex mix of legacy systems, core banking tools, payment providers, customer communication channels, data warehouses, and manual workflows. These systems still perform critical work, holding account histories, customer records, balances, statuses, treatment logic, and reporting data used every day by collections teams.
Data has become one of the most powerful drivers of business success. However, managing data remains a significant challenge. Organizations require reliable analytical systems that can store, process, and interpret vast volumes of information. No wonder Analytics & Data Management SaaS led the industry in 2025. Whether you are aiming to disrupt the financial sector or embarking on complex healthc…
Can an AI agent handle a 10M-row dataset in under 2 seconds? By leveraging DuckDB and a columnar vectorized execution engine, I built a conversational BI agent that bridges the gap between raw data and executive decisions. In this post, I explore how we integrated generative AI with high-performance analytical databases to create a seamless experience for data-driven teams. Originally published at …
In a hospital room, a nurse scans an IV bag. A computer instantly logs the moment, creating a timestamp in a patient’s electronic record. In the data trail, it now appears that treatment has begun. But the drug hasn’t reached the patient yet. In a humanitarian group’s warehouse, another system is logging data. A shipment of supplies is recorded as received. But the system doesn’t say which donor …
Short-term forecasting of the Air Quality Index (AQI) can support public health risk management and real-time environmental decision-making. In this study, we propose a multivariate, one-step-ahead time-series forecasting approach based on a Transformer encoder. The model predicts the AQI at the next time point using an observation sequence within a fixed historical window (five time steps in thi…
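The one-step-ahead setup described above, a fixed window of five past observations used to predict the next value, can be illustrated with a small windowing helper. This is only a sketch of the data preparation step, not the study's Transformer model: it is univariate for brevity (the study is multivariate), the function name make_windows is hypothetical, and the AQI readings are made up.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window: int = 5
) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive values with the value that follows it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    # Stop early enough that every window still has a next value to predict.
    for start in range(len(series) - window):
        history = list(series[start:start + window])
        target = series[start + window]
        pairs.append((history, target))
    return pairs


# Made-up AQI readings, for illustration only.
aqi = [42.0, 45.0, 50.0, 48.0, 47.0, 55.0, 60.0]
pairs = make_windows(aqi, window=5)
```

Each pair holds the model input (five past readings) and the target (the reading at the next time point); a forecasting model would be trained on pairs like these.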
Over several months, we audited 1,000 French SMB sites using a unified protocol — same criteria, same tools, same thresholds. The goal: identify what actually separates sites capturing organic traffic from sites that stagnate. The results are sharper than we expected. The headline number Technically top-tier sites — those that pass Core Web Vitals, carry valid structured data, and use a coherent …



