This project demonstrates a complete data analytics pipeline starting from raw real-world flight data to professional visual dashboards using SQL Server and Power BI. It showcases data import, cleaning, optimization, transformation, and dynamic DAX-based visual reporting.
- Title: Flight Delay and Cancellation Dataset (2019–2023)
- Source: https://www.kaggle.com/datasets/patrickzel/flight-delay-and-cancellation-dataset-2019-2023
- Size: 6+ million rows
- SQL Server 2022 – For data import, cleaning, transformation
- Power BI Desktop – For data visualization and dashboard creation
- DAX (Data Analysis Expressions) – For creating measures and calculated columns
- Windows 10 Pro – Local development environment
- Downloaded the CSV dataset from Kaggle.
- The dataset includes millions of flight records from 2019 to 2023.
- Created database and raw tables using SQL Server Management Studio (SSMS).
- Used
BULK INSERTto load data from CSV files. - Due to Power BI performance limitations on my system, extracted a clean 100,000 row sample into a new table:
Flights_1Lakh.
- Removed rows where important columns like
FL_DATE,AIRLINE,FL_NUMBER,ORIGIN,DEST,DEP_DELAYare null. - Converted and standardized time formats (e.g., extracting hours from
DEP_TIME). - Removed unwanted characters (e.g., double quotes) from text columns.
- Used
CAST,TRY_CASTto handle type mismatches. - Dropped columns with too many nulls or less relevance.
- Removed duplicate records using composite keys.
- Connected Power BI to SQL Server.
- Imported
Flights_1Lakhtable. - Created calculated columns and DAX measures.
- Built visuals including bar charts, KPIs, and slicers.
- Total Flights
- Cancelled Flights
- Average Departure Delay
- Average Arrival Delay
- On-Time Departure Percentage
- Departure Hour (extracted from
DEP_TIME)
- Bar chart: Flights by Hour of Day
- Line chart: Average Delay by Airline
- Pie chart: Cancellations by Airline
- Table: Route performance
- KPI Cards: Total Flights, Average Delays, % On-Time, Cancellations
view.pbix– Power BI project fileview.pdf– Dashboard exported as PDFSQLQuery1.sql– Complete SQL data cleaning scriptlink.txt– Kaggle dataset URL
Developed a complete data analysis pipeline using a real-world flight dataset. Cleaned over 6 million records in SQL Server and prepared a 100K row sample for Power BI. Built a fully interactive dashboard with DAX measures and data modeling, covering performance metrics, airline delays, and time-based trends.
- Clone or download this repository.
- Open
view.pbixin Power BI Desktop. - Review
SQLQuery1.sqlto understand SQL cleaning steps. - Connect to your own SQL Server database if you want to recreate from scratch.
- Use
view.pdfto preview the dashboard without opening Power BI.
- Add Power BI Gateway for scheduled refresh
- Create geospatial maps for airport-to-airport routes
- Add airline metadata for more filter options
For feedback or collaboration:
- Lochan Singhal
- www.github.com/Analyst-Lochan
"Clean data fuels clean insights."
This project highlights my ability to perform end-to-end data analysis using SQL Server and Power BI.