Data and Modeling

Using machine learning to understand and make predictions about landslide activity

Data Collection

A robust dataset collected over an extended period of time is critical to modeling landslide susceptibility.

With the help of AirWorks, a team of local pilots is flying UAVs equipped with LIDAR and multi-spectral sensors to collect data about Mocoa's upper watershed area. This feature-rich data provides a comprehensive picture of the terrain across many dimensions and will be used to train a machine learning algorithm to predict where landslides might happen and why.

Model Inputs

Data and layers used to train the algorithm

LIDAR data and multi-spectral imagery captured during the drone flights are used alongside datasets provided by the Mocoa Municipal Government and CorpoAmazonia to produce a comprehensive dataset to train the model.

LIDAR-Derived Topographic Layers

Researchers at MIT will use the LIDAR point clouds to create a Digital Terrain Model (DTM). Thirteen features can be extracted from the DTM and combined to build a detailed understanding of the terrain, from altitude and slope to the location and accumulation of water.
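
As an illustration of how one such feature can be derived, the sketch below computes slope from a gridded DTM using central differences. It is a minimal example under stated assumptions, not the project's actual processing chain: the function name slope_degrees, the tiny 3x3 tile, and the 1 m cell size are all hypothetical.

```python
import math

def slope_degrees(dtm, cell_size):
    """Return slope in degrees for each interior cell of a gridded DTM.

    dtm is a list of rows of elevations in metres; cell_size is the grid
    spacing in metres. Border cells are left as None because a central
    difference needs a neighbour on each side.
    """
    rows, cols = len(dtm), len(dtm[0])
    slope = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Elevation change per metre in the x and y directions.
            dz_dx = (dtm[r][c + 1] - dtm[r][c - 1]) / (2 * cell_size)
            dz_dy = (dtm[r + 1][c] - dtm[r - 1][c]) / (2 * cell_size)
            # Slope is the angle of the steepest gradient.
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope

# Hypothetical tile: a plane rising 1 m for every 1 m travelled east.
tile = [[0.0, 1.0, 2.0],
        [0.0, 1.0, 2.0],
        [0.0, 1.0, 2.0]]
centre_slope = slope_degrees(tile, cell_size=1.0)[1][1]
```

In practice a GIS library would typically be used for this step; the loop form is shown only to make the calculation explicit.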

Hydrology Features

Multi-spectral imagery collected from the drones is used to map and measure hydrological features in the upper watershed area. Model-training layers such as flow accumulation, flow direction, stream network, topographic wetness index, sediment transport index, stream power index, and catchment areas are all critical to understanding how water moves in this area.
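
Two of these layers have simple closed forms in their commonly used definitions: the topographic wetness index is ln(a / tan b) and the stream power index is a * tan b, where a is the specific catchment area (upslope area per unit contour width) and b is the local slope. The sketch below applies those textbook definitions; the function names, the slope floor for flat cells, and the sample values are assumptions, and the project's own formulation may differ.

```python
import math

# Assumed lower bound on tan(slope) so that flat cells do not cause a
# division by zero. The value is illustrative, not from the project.
MIN_TAN_SLOPE = 1e-6

def topographic_wetness_index(specific_catchment_area, slope_radians):
    """TWI = ln(a / tan(b)); higher values suggest wetter ground."""
    tan_slope = max(math.tan(slope_radians), MIN_TAN_SLOPE)
    return math.log(specific_catchment_area / tan_slope)

def stream_power_index(specific_catchment_area, slope_radians):
    """SPI = a * tan(b); higher values suggest more erosive flow."""
    return specific_catchment_area * math.tan(slope_radians)

# Hypothetical cell: 100 m^2 of upslope area per metre of contour, 45 degree slope.
example_twi = topographic_wetness_index(100.0, math.radians(45))
example_spi = stream_power_index(100.0, math.radians(45))
```

For the same catchment area, a gentler slope raises the wetness index and lowers the stream power index, which is why the two layers carry different information.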

Thermal Features

Satellite imagery is used to map the temperature of Earth's surface. Land surface temperature governs evapotranspiration, the process by which water moves from soil and vegetation into the atmosphere. Because Mocoa is so humid, the soils and vegetation are constantly waterlogged. This creates a problem when it rains because the water has nowhere to go but downhill.

Geological Features

The team will collect and test soil and rock samples at specific locations throughout the area to study how they absorb water. Soils with a lot of clay absorb and hold a lot of water, whereas sandy soils drain with ease. Understanding how soils and rocks absorb water is critical to understanding the conditions that lead to landslides.

Recorded Landslide Events

By learning about landslides that have occurred in the past, the algorithm can detect patterns linking site conditions to the landslides that occur there. The model's predictions can later be tested against what actually occurs to assess how accurate they are.

Triggering Events

Heavy, sustained rainfall and seismic activity are known to trigger landslides in Mocoa. Seismic activity resulting from mining blasts is also being investigated as a potential trigger. Studying these events and their relationship to past landslides helps refine the model's predictions.

Modeling Techniques

Different techniques are explored to find a path towards the most accurate predictions.

Different neural networks do not produce the same results, so trying a range of techniques over time helps analysts find those that perform best.

ACCESS TO DOCUMENTATION

Interpretable Machine Learning Methods for Landslide Analysis by Deepankar Gupta

Landslide Susceptibility Prediction Adaptive to Triggering Events by Ikechukwu Daniel Adebi

Additional documentation will be made available here soon.

Determine feature importance

By using machine learning to compare the susceptibility indexes derived from the LIDAR capture with historic landslide event data, analysts can determine how strongly each feature contributes to actual landslide activity.
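
One common way to estimate how much a feature contributes is permutation importance: shuffle one feature's values and measure how much the model's accuracy drops. The sketch below illustrates that idea and is not necessarily the method used in this project. The toy data, the feature names, and the threshold rule standing in for a trained model are all hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_features, repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only feature j shuffled.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances

# Hypothetical toy data: each row is [slope_degrees, distance_to_road_m].
rows = [[float(5 + 2 * i), float((i * 37) % 100)] for i in range(20)]
labels = [1 if row[0] > 25 else 0 for row in rows]

def toy_model(row):
    """Stand-in for a trained model; it only looks at slope."""
    return 1 if row[0] > 25 else 0

importances = permutation_importance(toy_model, rows, labels, n_features=2)
```

Here the first importance is positive and the second is exactly zero, because the stand-in model ignores distance to road. With a real model, correlated features can share or mask each other's importance, so results need careful reading.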

Split the data to train, test, and validate

Before landslides occur, each model is trained to make predictions given a triggering event. After landslides occur, analysts will compare how closely each model's predictions match what actually happened. The model whose predictions come closest will be used moving forward.
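
A simple way to hold data back for this kind of comparison is a three-way split into training, validation, and test sets. The sketch below is a minimal example; the 70/15/15 proportions, the fixed seed, and the integer cell identifiers are assumptions rather than the project's settings.

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle samples and split them into train, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    # Whatever remains after the first two slices becomes the test set.
    test = shuffled[n_train + n_val:]
    return train, validation, test

# Hypothetical dataset: 100 terrain cells identified by index.
cells = list(range(100))
train, validation, test = split_dataset(cells)
```

Neighbouring terrain cells tend to resemble each other, so a purely random split can overstate accuracy. Splitting by spatial block or by event date is a common alternative.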

Classification

Using techniques like logistic regression, decision trees, random forests, and convolutional neural networks, analysts can determine which combination of inputs is most significantly linked to landslide susceptibility.
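
To make the simplest of these techniques concrete, the sketch below fits a one-feature logistic regression by gradient descent. It is an illustration only: the standardized slope values, the labels, the learning rate, and the number of epochs are hypothetical, and a real analysis would use many features and an established library.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.5, epochs=500):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample under the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b

# Hypothetical training data: standardized slope and whether a landslide occurred.
slopes = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
landslide = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(slopes, landslide)
steep_probability = sigmoid(w * 1.5 + b)
gentle_probability = sigmoid(w * -1.5 + b)
```

The fitted weight is positive, so steeper cells receive a higher predicted probability. The sign and size of each weight are part of what makes logistic regression easy to interpret compared with the other techniques listed.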

Future Work

After repeated observations (thorough data collection over time), models will be refined to provide more accurate predictions.

Once there is confidence in the susceptibility results, models can be used to understand risk, and the nature of that risk, at a more detailed and specific level.

ACCESS TO DATA

Map layers generated from this work will be made available in an interactive form on this website. Data and map layers will be offered for download as soon as they are ready.

Susceptibility Map

Once the LIDAR and multi-spectral data is collected, susceptibility maps will be generated right away.

Early Warning Map

This map could suggest which areas in the watershed are at the most imminent risk.

Landslide Intervention Map

By isolating only high-susceptibility areas, modeling can be performed to identify the features and infrastructure in and around Mocoa that are most at risk.

Population Impact Map

By comparing high-susceptibility areas to densely populated areas, analysts can help estimate the number of people and buildings that could be affected.
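
The comparison described above can be sketched as a grid overlay: sum the population in every cell whose susceptibility meets a chosen threshold. The 0.7 threshold, the 2x2 grids, and the population counts below are hypothetical values chosen only to show the calculation.

```python
def exposed_population(susceptibility, population, threshold=0.7):
    """Sum the population of cells whose susceptibility is at or above threshold.

    Both grids must have the same shape and cover the same cells.
    """
    total = 0
    for s_row, p_row in zip(susceptibility, population):
        for s, p in zip(s_row, p_row):
            if s >= threshold:
                total += p
    return total

# Hypothetical grids: susceptibility scores (0 to 1) and residents per cell.
susceptibility = [[0.9, 0.2],
                  [0.75, 0.4]]
population = [[120, 300],
              [45, 10]]

exposed = exposed_population(susceptibility, population)  # 120 + 45 = 165
```

The result depends heavily on the threshold, so reporting the estimate at several thresholds gives a fairer picture than a single number.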

Want to learn more about how to get involved in the project?

Community Engagement