Manager   •   11 months ago

Challenge 3: Model Inference or Predictions?

Great question from one of the participants in this challenge: it sounds like the model will not really be used for its predictions, but rather for analyzing the features and their impact or importance on the pollutant it predicts. Is that a fair assumption? Opening the discussion here for a response from CMC experts (and any other participants with similar questions, if they want to weigh in!)

  • 1 comment

  • Manager   •   10 months ago

    This answer is from Peter: That is a great question, and one that is actually not for me to decide. If, for example, you develop a spatial model that relates land use to bacteria concentrations and it performs well, then the model could be used to predict which places should have low or high bacteria concentrations according to your model findings.

    If you decide to start with hundreds of potential explanatory variables and evaluate each for its contribution to predicting something in the environment (e.g., bacteria levels, nitrogen concentrations, or benthic macroinvertebrate community health), then that modeling work might indicate which measures are important influences on what you are trying to predict. Those would be correlations, and correlation is not causation. However, a good set of metrics that correlate with what you are predicting forms the foundation for hypotheses that could be investigated to see whether those factors are causal. The model is not the end of the story; it is a step in our constant effort to better understand relationships and cause and effect, testing our assumptions and retesting them when we get new information.

    Therefore, it really depends on which path your group pursues and how it approaches the problem.
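
For participants who want to try the screening step Peter describes, here is a minimal sketch in Python of ranking candidate explanatory variables by how strongly they correlate with a target. Everything in it is an assumption made for illustration: the variable names (impervious_cover, forest_cover, unrelated_noise), the synthetic data, and the choice of Pearson correlation as the screening measure. It is not the challenge's method, and the ranking it produces shows association only, not causation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rank_by_correlation(candidates, target):
    """Return (name, r) pairs sorted by absolute correlation, strongest first."""
    scores = {name: pearson(values, target) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic, made-up data: 500 hypothetical monitoring sites.
rng = random.Random(0)
n_sites = 500
candidates = {
    "impervious_cover": [rng.random() for _ in range(n_sites)],
    "forest_cover": [rng.random() for _ in range(n_sites)],
    "unrelated_noise": [rng.random() for _ in range(n_sites)],
}

# Assumed relationship, chosen only so the example has a known answer:
# bacteria rise with impervious cover and fall with forest cover.
bacteria = [
    2.0 * imp - 1.0 * forest + rng.gauss(0, 0.3)
    for imp, forest in zip(candidates["impervious_cover"], candidates["forest_cover"])
]

ranking = rank_by_correlation(candidates, bacteria)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

A variable that ranks highly here is a candidate for a hypothesis to investigate further, in line with the point above that the model is a step rather than the end of the story.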
