This project is a case study from the Google Data Analytics Professional Certificate Capstone course. The main objective is to address a key business question to guide the company's marketing strategy: How do annual members (Subscribers) and casual riders (Customers) use Cyclistic bikes differently?
Cyclistic is a fictional bike-share company based in Chicago. Historically, Cyclistic's marketing strategy has focused on building general brand awareness. However, the Director of Marketing believes that the company's future success depends on maximizing the number of annual memberships.
To design effective marketing campaigns aimed at converting casual riders, we need to analyze historical bike trip data to understand how these two user segments behave differently.
- Casual Riders (Customers): Users who purchase single-ride or full-day passes.
- Annual Members (Subscribers): Users who purchase annual memberships.
- BigQuery SQL (Python Client API): Used for data exploration, integrity checks, data cleaning, and aggregation.
- Tableau: The primary tool used to design interactive dashboards and visualize data trends.
- Jupyter Notebook: Python environment utilized to connect with the BigQuery Client API, run SQL queries, and document the analysis process.
The analysis uses trip data from Q1 2019 and Q1 2020:
- Q1 2019 Dataset:
trips_2019_Q1(365,069 rows) - Q1 2020 Dataset:
trips_2020_Q1(426,887 rows)
Technical Note: Column names differed between the two datasets (e.g., trip_id corresponds to ride_id, and usertype corresponds to member_casual). The preprocessing step standardized these columns before merging.
- Duplicate Check: Verified that there are no duplicate entries in either dataset by comparing
COUNT(id)andCOUNT(DISTINCT id). - Time Range Verification: Confirmed that all trip start times fell precisely within Q1 2019 and Q1 2020.
- Logic Errors & Data Correction:
- Detected 24 records in the Q1 2019 dataset where the recorded
tripdurationdid not match the calculated duration (end_time-start_time). - Corrected these records by recalculating the duration using the
TIMESTAMP_DIFFfunction.
- Detected 24 records in the Q1 2019 dataset where the recorded
- Data Aggregation: Merged the datasets using a
UNION ALLquery, standardized user categories toSubscriberandCustomer, and filtered out trips with durations less than or equal to 0.- Total clean dataset size: 791,746 rows (saved as
CleanedTripData).
- Total clean dataset size: 791,746 rows (saved as
SELECT
trip_id, start_time, end_time, start_station_id, start_station_name, end_station_id, end_station_name,
CASE
WHEN usertype = 'member' THEN 'Subscriber'
WHEN usertype = 'casual' THEN 'Customer'
ELSE usertype
END AS cleaned_usertype,
ROUND(TIMESTAMP_DIFF(end_time, start_time, SECOND), 1) AS trip_duration,
FORMAT_DATE('%A', start_time) AS trip_start_weekday
FROM (
SELECT
CAST(trip_id AS STRING) AS trip_id, start_time, end_time,
from_station_id AS start_station_id, from_station_name AS start_station_name,
to_station_id AS end_station_id, to_station_name AS end_station_name, usertype
FROM `bike-usage-differences.bike_usage_2019.trips_2019_Q1`
UNION ALL
SELECT
ride_id, started_at, ended_at,
start_station_id, start_station_name,
end_station_id, end_station_name, member_casual
FROM `bike-usage-differences.bike_usage_2019.trips_2020_Q1`
) AS union_table
WHERE TIMESTAMP_DIFF(end_time, start_time, SECOND) > 0;- Average trip duration across all users: 1,184.13 seconds (~19.7 minutes).
- Most active day of the week (Mode of Weekday): Tuesday.
| User Type | Number of Trips | Percentage | Average Trip Duration (Seconds) | Average Trip Duration (Minutes) |
|---|---|---|---|---|
| Subscriber (Members) | 720,313 | 90.98% | 795.25 | ~13.25 mins |
| Customer (Casual Riders) | 71,433 | 9.02% | 5,105.49 | ~85.09 mins |
Key Takeaway: Subscribers represent the vast majority of trips (~91%), but Customers have average trip durations that are nearly 6.5 times longer than Subscribers.
| User Type | Day of the Week | Number of Trips | Average Trip Duration (Seconds) |
|---|---|---|---|
| Customer | Monday | 6,694 | 3,969.67 |
| Customer | Tuesday | 7,972 | 4,183.81 |
| Customer | Wednesday | 8,363 | 4,120.09 |
| Customer | Thursday | 7,771 | 7,773.23 |
| Customer | Friday | 8,508 | 5,736.57 |
| Customer | Saturday | 13,473 | 4,950.77 |
| Customer | Sunday | 18,652 | 5,061.30 |
| Subscriber | Monday | 110,430 | 822.31 |
| Subscriber | Tuesday | 127,974 | 769.44 |
| Subscriber | Wednesday | 121,903 | 711.98 |
| Subscriber | Thursday | 125,228 | 707.21 |
| Subscriber | Friday | 115,168 | 796.73 |
| Subscriber | Saturday | 59,413 | 974.07 |
| Subscriber | Sunday | 60,197 | 972.94 |
Key Behavioral Insights:
- Subscribers (Annual Members): Primarily use bikes for daily commuting (to work or school) from Monday to Friday, resulting in high weekday trip volumes and short, consistent trip times.
- Customers (Casual Riders): Primarily use bikes for leisure and exercise, with trip volumes peaking significantly on weekends (Saturday and Sunday) and featuring much longer durations.
Tableau dashboards highlighted two main trends:
- Trip Distribution: A clear breakdown showing the high volume of weekday trips by Subscribers compared to the leisure-oriented rides by Customers.
- Weekly Usage Patterns: Graphical trends showing Subscribers' activity dipping on weekends while Customers' activity surges.
Based on the insights derived from the data, we recommend the following marketing strategies to convert casual riders into annual members:
- "Commute Benefits" Marketing Campaign:
- Design marketing materials highlighting the financial savings, health benefits, environmental impact, and convenience of using Cyclistic bikes for daily commuting.
- Target casual riders who frequently rent bikes during typical weekday commute hours.
- Flexible/Weekend Membership Packages:
- Introduce a new membership tier tailored for weekend riders (e.g., Weekend-only Annual Membership) or a loyalty program that rewards week-day commuting.
- Send promotional upgrade offers to casual riders who show high weekend usage patterns.