The visualization protocol consists of the following:
1. Timespan:
The data used was collected during 2018. Because of the COVID-19 pandemic that followed, this data may no longer reflect the current situation.
2. Data source:
The data was collected from the United States Department of Transportation, more precisely from its Bureau of Transportation Statistics. It can be found at this link: Bureau of Transportation Statistics
3. Link to the dataset used:
The specific dataset was taken from a Kaggle page containing a collection of air traffic data spanning 2009 to 2018. It can be found here: Airline Delay and Cancellation Data, 2009 - 2018
4. Metadata of the main dataset (all route-related data derives from this):
FL_DATE (date of the flight),
OP_CARRIER (name of the carrier operator),
OP_CARRIER_FL_NUM (flight number of the carrier operator),
ORIGIN (origin airport IATA code),
DEST (destination airport IATA code),
DEP_TIME (time of departure),
DEP_DELAY (departure delay),
TAXI_OUT (time spent taxiing between leaving the gate and takeoff),
WHEELS_OFF (time when the wheels leave the ground),
WHEELS_ON (time when wheels touch the ground),
TAXI_IN (time spent taxiing between landing and arrival at the gate),
ARR_TIME (time of arrival at the gate),
ARR_DELAY (delay on arrival),
CANCELLED (true or false),
CANCELLATION_CODE (determines the cancellation reason),
DIVERTED (true or false),
ACTUAL_ELAPSED_TIME (total time elapsed),
AIR_TIME (time in the air),
DISTANCE (distance between the two airports; the Bureau of Transportation Statistics reports this field in miles),
DELAY_REASON (carrier_delay, weather_delay, nas_delay, security_delay, late_aircraft_delay).
5. Short abstract of the data visualization process:
The dataset was processed using Python and did not require any pre-processing. It was combined with the Airports Dataset, which contains IATA codes and coordinates and is available on the Bureau of Transportation Statistics site. For each visualization we saved a new file containing only the columns needed for that specific use case.
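The column-selection step described above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual script: the function name `save_column_subset`, the file paths and the `ROUTE_COLUMNS` selection are assumptions made for illustration.

```python
import csv

# Hypothetical column selection for a route-level visualization.
# The column names follow the metadata list of the main dataset.
ROUTE_COLUMNS = ["FL_DATE", "ORIGIN", "DEST"]


def save_column_subset(src_path, dst_path, columns):
    """Copy only `columns` from the CSV at src_path into a new CSV at dst_path.

    Returns the number of data rows written.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        missing = [c for c in columns if c not in (reader.fieldnames or [])]
        if missing:
            raise KeyError(f"columns not found in source file: {missing}")
        # extrasaction="ignore" drops every column that was not requested.
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        rows = 0
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

A call such as `save_column_subset("2018.csv", "routes.csv", ROUTE_COLUMNS)` would then produce the reduced file for one visualization; both file names here are placeholders.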
6. Actions performed to obtain each visualization:
An ordered list of all the actions performed, together with the parameters and scripts used to transform the raw data into the final visualization, is available in the MAP PROTOCOL section of each map/plot.
7. Data used to obtain this visualization:
This visualization uses only the positions of the airports, the number of flights on each route and the number of flights at each airport.
The visualization shown in the Flight Routes Map represents the volume of airline traffic on each air route in the USA.
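The per-route and per-airport flight counts could be derived from the ORIGIN and DEST fields along these lines. This is a sketch under stated assumptions rather than the script actually used: routes are treated as directed (origin to destination), and an airport's total is assumed to count both its departures and its arrivals. The function name `count_flights` is hypothetical.

```python
from collections import Counter


def count_flights(flights):
    """Count flights per directed route and per airport.

    `flights` is an iterable of dicts with ORIGIN and DEST keys (IATA codes).
    Returns (route_counts, airport_counts).
    """
    route_counts = Counter()
    airport_counts = Counter()
    for flight in flights:
        origin, dest = flight["ORIGIN"], flight["DEST"]
        route_counts[(origin, dest)] += 1  # directed: A->B differs from B->A
        # Assumption: an airport's total includes departures and arrivals.
        airport_counts[origin] += 1
        airport_counts[dest] += 1
    return route_counts, airport_counts
```

The rows can come straight from `csv.DictReader`, since only the two airport-code columns are read.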
Its features and visual variables are:
Yellow dots (shape, colour and size): depict the airport locations. The bigger the dot, the higher the airport's relevance (total number of flights).
Thickness of segments (size, orientation): the thicker the segment, the greater the number of flights in that direction.
Colour of segments (colour): has no precise meaning. It is just a palette of colours that helps the user tell the departure airport from the arrival airport.
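To illustrate how a flight count can drive dot size or segment thickness, here is a minimal sketch. The source does not state which mapping the map actually uses, so the linear scaling, the function name `scale_linear` and the default size range are all assumptions.

```python
def scale_linear(count, max_count, min_size=1.0, max_size=10.0):
    """Map a flight count in [0, max_count] to a size in [min_size, max_size].

    Assumption: linear scaling; the real map may use another mapping.
    """
    if max_count <= 0:
        return min_size
    # Clamp so that out-of-range counts never exceed the size limits.
    fraction = min(max(count / max_count, 0.0), 1.0)
    return min_size + fraction * (max_size - min_size)
```

The same helper can serve both visual variables: pass an airport's total for the dot size, or a route's count for the segment thickness, each with its own maximum.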
The goal of this visualization is to show the distribution of the flights covered by our dataset.
A single airport can be selected, either by clicking on the map or by typing the airport code/name in the designated field, to show only the connections it has with other airports.
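The airport selection could be backed by logic like the following. This is a sketch only, since the source does not describe how the map resolves the typed text. The names `find_airport` and `connections_of`, and the matching rule (exact IATA code first, then a case-insensitive substring of the airport name), are assumptions.

```python
def find_airport(query, airport_names):
    """Resolve user input to an IATA code, or None if nothing matches.

    `airport_names` maps IATA code -> airport name.
    """
    q = query.strip().lower()
    if not q:
        return None
    # An exact code match takes priority over a name match.
    for code in airport_names:
        if q == code.lower():
            return code
    for code, name in airport_names.items():
        if q in name.lower():
            return code
    return None


def connections_of(code, route_counts):
    """Keep only the routes that depart from or arrive at `code`.

    `route_counts` maps (origin, destination) tuples to flight counts.
    """
    return {route: n for route, n in route_counts.items() if code in route}
```

Restricting the drawn segments to the result of `connections_of` would show only the selected airport's connections.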
Zoom in on the map to discover smaller airports and connections.