Improving Management of Aquatic Invasions by Integrating Shipping Network, Ecological and Environmental Data: Data Mining for Social Good

Authors: 
J Xu, TL Wickramarathne, NV Chawla, E Grey, K Steinhaeuser, RP Keller, JM Drake and DM Lodge
Citation: 
Proc. of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
Publication Date: 
August, 2014

The unintentional transport of invasive species (i.e., nonnative
and harmful species that adversely affect habitats
and native species) through the Global Shipping Network
(GSN) causes substantial losses to social and economic welfare
(e.g., annual losses due to ship-borne invasions in the
Laurentian Great Lakes are estimated to be as high as USD
800 million). Despite the huge negative impacts, management
of such invasions remains challenging because of the
complex processes that lead to species transport and establishment.
Numerous difficulties associated with quantitative
risk assessments (e.g., inadequate characterizations
of invasion processes, lack of crucial data, large uncertainties
associated with available data, etc.) have hampered the
usefulness of such estimates in the task of supporting the
authorities who are battling to manage invasions with limited
resources. We present here an approach for addressing
the problem at hand via creative use of computational techniques
and multiple data sources, thus illustrating how data
mining can be used for solving crucial, yet very complex
problems towards social good. By modeling implicit species
exchanges as a network that we refer to as the Species Flow
Network (SFN), large-scale species flow dynamics are studied
via a graph clustering approach that decomposes the SFN
into clusters of ports and inter-cluster connections. We then
exploit this decomposition to discover crucial knowledge on
how patterns in the GSN affect aquatic invasions, and then illustrate
how such knowledge can be used to devise effective and
economical invasive species management strategies. By experimenting
on actual GSN traffic data for years 1997–2006,
we have discovered crucial knowledge that can significantly
aid the management authorities.
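
The decomposition described in the abstract (an SFN split into clusters of ports plus inter-cluster connections) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the edge-weight definition, so everything here is an assumption: the port names and flow weights are made-up toy data, and a simple weighted label propagation stands in for whatever graph clustering method the paper actually uses.

```python
from collections import defaultdict

# Hypothetical toy SFN: (source_port, destination_port, species_flow_weight).
# Port names and weights are invented for illustration only.
EDGES = [
    ("A", "B", 0.9), ("B", "C", 0.8), ("C", "A", 0.7),
    ("D", "E", 0.9), ("E", "F", 0.8), ("F", "D", 0.7),
    ("C", "D", 0.1),  # weak link between the two groups
]


def cluster_ports(edges, max_rounds=100):
    """Assign each port a cluster label via weighted label propagation.

    This is a stand-in technique, not the method from the paper.
    """
    # For clustering, treat flow as symmetric: add weights in both directions.
    weight = defaultdict(lambda: defaultdict(float))
    for src, dst, w in edges:
        weight[src][dst] += w
        weight[dst][src] += w

    # Every port starts in its own cluster.
    labels = {port: port for port in weight}
    for _ in range(max_rounds):
        changed = False
        for port in sorted(weight):  # fixed order keeps the result deterministic
            totals = defaultdict(float)
            for neighbour, w in weight[port].items():
                totals[labels[neighbour]] += w
            # Adopt the label carrying the most flow; break ties by label name.
            best = min(totals, key=lambda lab: (-totals[lab], lab))
            if best != labels[port]:
                labels[port] = best
                changed = True
        if not changed:
            break
    return labels


def inter_cluster_flows(edges, labels):
    """Sum directed flow weights between different clusters."""
    flows = defaultdict(float)
    for src, dst, w in edges:
        if labels[src] != labels[dst]:
            flows[(labels[src], labels[dst])] += w
    return dict(flows)


if __name__ == "__main__":
    labels = cluster_ports(EDGES)
    print(labels)
    print(inter_cluster_flows(EDGES, labels))
```

On this toy input the two tightly connected triangles of ports end up in separate clusters, and the single weak edge between them is reported as the only inter-cluster connection. In a management setting, such inter-cluster connections would be the natural candidates for targeted intervention, though how the paper ranks or selects them is not stated in the abstract.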