Modeling of Cu-Au prospectivity in the Carajás mineral province (Brazil) through machine learning : dealing with imbalanced training data

Elias Martins Guerra Prado; Carlos Roberto de Souza Filho; Emmanuel John M. Carranza; João Gabriel Motta

Material type

Article

Language

English

Abstract

Machine learning (ML) is becoming an appealing tool in various fields of Earth Sciences, especially in mineral prospectivity mapping (MPM) to support mineral exploration. ML algorithms are designed to assume a relatively balanced amount of training data when estimating the decision boundaries between the classes of interest (in MPM: mineralized and non-mineralized locations). However, in MPM the numbers of mineralized and non-mineralized locations are naturally imbalanced, because the number of known mineral deposit occurrences (a proxy for the mineralized, or positive, class) is much smaller than the number of non-mineralized locations (the negative class). The use of imbalanced data leads to difficulties in training ML models for MPM, owing to the learning bias towards the features of the predominant (i.e., negative) class. In the present study, using a support vector machine for Cu-Au prospectivity modeling in the Carajás mineral province (Brazil), we evaluated the effects of the Synthetic Minority Over-sampling Technique (SMOTE), which addresses the issue of imbalanced training data, on the performance of MPM. The original training data were modified by over-sampling the mineralized locations (the positive, minority class) using SMOTE and by randomly under-sampling the non-mineralized locations at different proportions, producing 400 training datasets with proportions of mineralized-to-non-mineralized samples ranging from 600:30 to 30:600. Each of these training datasets was used to evaluate the performance of MPM under a different proportion of mineralized-to-non-mineralized samples. The performance of each prospectivity model was objectively evaluated using the F1 score and the success-rate curve. The results show that SMOTE can significantly increase the performance and the spatial efficiency of MPM. The main differences between the performances of the derived prospectivity models illustrate the sensitivity of MPM to the number of samples and the distribution of classes in the training data. According to the results, better performance is achieved using SMOTE when the prospectivity models are trained with an equal number of mineralized and non-mineralized samples. The best prospectivity model, trained with a modified dataset with a 600:600 proportion of mineralized to non-mineralized samples, correctly classified 100% of the training mineralized locations and almost 80% of the testing mineralized locations, and outlined only 7% of the study area as prospective.
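The resampling and scoring steps named in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (`smote`, `rebalance`, `f1_score`), the neighbour count `k=5`, and the use of plain tuples as feature vectors are all assumptions made for illustration, and the support vector machine itself is left out.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours.
    (k=5 is an assumed default, not a value taken from the abstract.)"""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = rng or random.Random(0)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def rebalance(positives, negatives, n_pos, n_neg, rng=None):
    """Over-sample positives with SMOTE up to n_pos and randomly
    under-sample negatives down to n_neg."""
    rng = rng or random.Random(0)
    extra = max(0, n_pos - len(positives))
    pos = list(positives) + smote(list(positives), extra, rng=rng)
    neg = rng.sample(list(negatives), min(n_neg, len(negatives)))
    return pos, neg


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Calling `rebalance(positives, negatives, 600, 600)` on, say, 30 positive and 1000 negative feature vectors would yield a 600:600 training set of the kind the abstract reports as performing best; the input counts here are hypothetical.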

Funding information

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

401316/2014-9; 309712/2017-3

Access rights

Closed

Subjects

Machine learning

Original article

Authors

Prado, Elias Martins Guerra, 1988-

Souza Filho, Carlos Roberto de, 1965-

Motta, João Gabriel, 1988-

Links

DOI: https://doi.org/10.1016/j.oregeorev.2020.103611

Full text: https://www.sciencedirect.com/science/article/pii/S0169136819308819

Sources

Ore Geology Reviews (standalone source)
