Pecan Street Inc.
Vision: Apply machine learning techniques to improve soil carbon content estimates and promote greater use of regenerative farming practices. Provide a computationally efficient method for simple “what-if” scenario-planning tools for farmers deciding what to plant, and when, to maximize soil carbon sequestration.
Read about Soil Carbon Sequestration and Regenerative Agriculture from the NRDC >
Identify potential pathways to reduce and optimize physical sampling with model estimates that improve soil carbon measurement and verification
PSI Project Team
2-Person Tech Team
Applying artificial intelligence approaches to soil carbon modeling could yield a reliable, low-cost tool for estimating soil carbon sequestration, improving the market utility of these approaches; an artificial neural network (ANN) proxy model could unlock efficiencies for agricultural users who want to run what-if scenarios.
Research through the PJMF Accelerator was not a slam dunk
“The Patrick J. McGovern Foundation (PJMF) Accelerator created an exciting opportunity for Pecan Street’s data team to develop foundational skills in using AI/ML techniques and expand our experience with Amazon Web Services (AWS). With the advice and support of Chelsey Walden-Schreiner and others on the PJMF team, we built our first artificial neural network (ANN)!”
Soil Carbon Capture . . .
Soil carbon capture has become a hot topic in recent years as countries seek to fulfill their Paris Agreement commitments to reduce emissions of carbon dioxide, the most abundant human-emitted greenhouse gas. Soil does not simply trap carbon by itself. It needs help from plants: chlorophyll-driven photosynthesis captures atmospheric CO2 and converts that carbon dioxide into complex sugars that feed plant growth. Over time, as plants die and decay, the molecules containing that captured carbon in their roots, stems, and leaves make their way back into the soil, buried underground.
The impacts of farming on soil carbon retention . . .
Left alone, landscapes can sequester massive amounts of atmospheric carbon in their soil each year, acting as evergreen carbon sinks as long as they remain untouched. But feeding a growing population means a significant proportion of land cannot simply be left alone, and modern farming practices have a significant impact on soil’s ability to retain that captured carbon. As farmers till soil with their tractors before planting crops, pockets of air form deep underground, providing the conditions for microbes to consume the decaying plant matter, which in turn releases that captured carbon as CO2 back into the atmosphere. Some estimates indicate that soil carbon emissions account for up to 20% of yearly human-induced greenhouse gas emissions globally. For some perspective, that’s more than all the vehicles on Earth emit annually.
Adoption of new farming practices to reduce carbon emissions . . .
Pioneering regenerative farmers, with support from climate-friendly policy and companies looking to offset their carbon footprints, are adopting practices that help keep more of the carbon in their fields from ending up back in our air. Low- or no-till agriculture, in combination with planting certain cover crops, can make a huge difference in how much carbon stays underground. Yet it’s extremely difficult for a farmer to know with any accuracy how much carbon is sequestered in their fields. Farmers face complex decisions on what and when to plant to increase carbon storage while keeping production yields high. Finding an answer to the soil carbon capture equation is becoming increasingly important as carbon credit markets develop and farmers have new monetary incentives from companies and governments ready to pay them to offset emissions by the metric ton.
Current decision-making tools are lacking . . .
Not enough user-friendly tools exist to help farmers make these tradeoffs and communicate the environmental benefits of their choices confidently and accurately. The team at Pecan Street Inc. sought to create a tool that helps farmers model ‘what-if’ scenarios to aid in the decision-making process, helping them understand just how much carbon they are capturing in their fields and what they can do to keep it in the ground.
How It Started:
“The work was so interesting, I’d sit and wait for results and try to tune data, and then I would hear my wife behind me and I didn’t even know she was home yet … This was one of the more fun projects I’ve worked on the past couple years.”
Pecan Street wanted to improve the Decision Support System for Agrotechnology Transfer (DSSAT) by establishing the sensitivity of each of the approximately 140 inputs and providing viable recommendations to fill data gaps.
The team set out to understand the sensitivity of the industry-standard, computationally intensive DayCENT model, which requires specialized computing knowledge to run, to its inputs.
In reviewing how this model worked, and what resources and knowledge it took to run, the team quickly determined that a small farmer would never be able to use it at their kitchen table.
“Perhaps we were mistaken in our original assumption that DayCENT or other big agricultural models can be treated as a black box for analysis like this.”
The team wondered if it would be possible to create a proxy model with just 10 inputs that might give them a useful result that helps farmers make better decisions to maximize soil carbon storage.
The project plan was well aligned with the PSI mission of driving carbon emission reduction technologies by identifying missing datasets and trying to fill gaps. There was enthusiastic support from their board, as it seemed like the perfect opportunity to ‘do AI stuff’ for the first time.
Trust in results was a major concern the whole time!
“We looked at input/output histograms and said ‘I don’t buy it.’”
It is often difficult to understand, let alone communicate, the complexity behind a ‘black-box’ ML model, which requires users to confront the possibility that the outputs generated may not be as anticipated, stemming from an incomplete or flawed representation of the system being modeled.
Also, data scaling was not always handled correctly across toolsets in the early trials; errors in scaling input/output variables happened frequently.
Learn more about data scaling here >
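The scaling errors described above are easy to reproduce. A minimal NumPy sketch (the scaler class and variable names here are illustrative, not the project's code): fit the scaling parameters on the training data only, reuse the same transform for new inputs, and invert the scaling before interpreting outputs. Refitting the scaler on each new batch is a common way these input/output errors creep in.

```python
import numpy as np

class MinMaxScaler1D:
    """Scale each column to [0, 1] using statistics from the fitting data only."""
    def fit(self, X):
        self.lo = X.min(axis=0)
        self.span = X.max(axis=0) - self.lo
        return self

    def transform(self, X):
        return (X - self.lo) / self.span

    def inverse_transform(self, Xs):
        return Xs * self.span + self.lo

gen = np.random.default_rng(0)
train = gen.uniform(10.0, 50.0, size=(100, 3))   # e.g. three soil/climate inputs
new = gen.uniform(10.0, 50.0, size=(5, 3))       # new what-if scenarios

scaler = MinMaxScaler1D().fit(train)             # fit on training data ONLY
new_scaled = scaler.transform(new)               # reuse the SAME parameters here

# Inverting the transform recovers the original units; skipping this step
# (or refitting on the new batch) silently changes what the numbers mean.
roundtrip = scaler.inverse_transform(new_scaled)
print(np.allclose(roundtrip, new))               # True
```

Scikit-learn's `MinMaxScaler`, `PowerTransformer`, and `RobustScaler` follow this same fit/transform/inverse_transform pattern.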
Studying the underlying agronomy and model and choosing variables
Determining tools for use and development of variable distribution/sampling
- The team dug into the math and realized the potential of Latin Hypercube sampling to exercise the model fully, ensuring equal coverage at the extremes and center of each variable’s range
- Experimented with different optimizers (e.g., SGD, Adam, and SGD with momentum), activation functions for the hidden layers (Tanh and ReLU), and scalers (MinMaxScaler, PowerTransformer, and RobustScaler)
- Explored PyTorch, TensorFlow, and NumPy arrays to generate tensor output files for import into the ANN training software; the ultimate decision was to use PyTorch
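The steps above can be sketched end to end. This is a hedged illustration, not the project's actual code: the input count, network width, and stand-in target function are all assumptions. It draws a Latin Hypercube sample with SciPy so every input's range is covered evenly, then trains a small PyTorch proxy network with Tanh hidden layers and the Adam optimizer.

```python
import torch
from torch import nn
from scipy.stats import qmc

torch.manual_seed(0)

# 1. Latin Hypercube sample: one stratum per sample in every dimension,
#    so extremes and center of each input's range get equal coverage.
d = 10                                    # number of model inputs (illustrative)
sampler = qmc.LatinHypercube(d=d, seed=0)
X = torch.tensor(sampler.random(n=512), dtype=torch.float32)  # points in [0, 1)^d

# 2. Stand-in target the proxy will learn; in the real workflow these labels
#    would come from runs of the expensive native model (e.g. DayCENT).
y = (X[:, 0] * X[:, 1] + torch.sin(3.0 * X[:, 2])).unsqueeze(1)

# 3. A small ANN proxy: two Tanh hidden layers, trained with Adam and MSE loss.
proxy = nn.Sequential(
    nn.Linear(d, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)
opt = torch.optim.Adam(proxy.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

first_loss = None
for step in range(300):
    opt.zero_grad()
    loss = loss_fn(proxy(X), y)
    loss.backward()
    opt.step()
    if first_loss is None:
        first_loss = loss.item()

print(first_loss > loss.item())   # loss should have dropped from its starting value
```

Swapping `nn.Tanh()` for `nn.ReLU()`, or `Adam` for `torch.optim.SGD(..., momentum=0.9)`, reproduces the kinds of comparisons listed above.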
Before the Program:
The PSI team was already experienced with ‘Big Data’ and fluent in the language of supercomputing, but had not worked with machine learning.
The PSI team’s big data fluency was already strong; what changed was their analysis fluency. Before, when reading papers across sectors, they had to trust what they were learning from other ML teams. Now, when they reach those sections, they can judge which ML techniques were used and whether they were a good choice.
Creating an effective proxy of the complex DayCENT model was challenging
The processing speed of an ANN is extraordinary: it’s possible to explore what-if scenarios millions of times faster than by running the native DayCENT model.
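That speed comes from batched, gradient-free inference: once trained, a proxy can evaluate huge numbers of scenarios in a single forward pass. A minimal sketch, where the network shape and input count are assumptions rather than the project's actual configuration:

```python
import torch
from torch import nn

# A stand-in proxy network with 10 inputs (shapes are illustrative).
proxy = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))
proxy.eval()

# 100,000 candidate what-if scenarios (e.g. planting and cover-crop choices),
# each encoded as 10 scaled input values.
scenarios = torch.rand(100_000, 10)

with torch.no_grad():                      # no gradients needed at inference time
    estimates = proxy(scenarios)           # one batched forward pass

best = torch.argmax(estimates)             # index of the highest-scoring scenario
print(estimates.shape)                     # torch.Size([100000, 1])
```

Running the native process-based model once per scenario would take orders of magnitude longer than this single batched evaluation.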
Learned a lot about running jobs in AWS
“We had run Docker before, in a more manual process. Building the Docker container was a pain, but really fun.”
Fail early, often, and just get over it!
“Failing means you need to pivot”
Not all data explorations produce the hoped-for results
“This project was pure research, and with that work sometimes you’re not going to get to the answer you think you’ll get to. Not all data explorations produce the hoped-for results, especially when trying to apply AI/ML techniques for the first time.”
Be cautious in overinterpreting results!
Promising results from a simple ANN began to break down under closer scrutiny, and as the number and complexity of inputs and hidden layers increased.
Data was not always correct for various toolsets
Data scaling was not always handled correctly across toolsets in the early trials, and PSI ran into errors scaling input/output variables.
Don’t make assumptions
Perhaps we were mistaken in our original assumption that DayCENT or other big agricultural models can be treated as a black box for analysis like this.
We can’t build soil carbon sensors
The simplest theoretical solution, building soil carbon sensors, was not possible.
Tough to build trust in outputs
Trust in results was a major concern the whole time! “Sometimes we looked at input/output histograms and said ‘I don’t buy it.’”
Accelerating the adoption of regenerative practices requires reliable low-cost soil carbon monitoring and verification (M&V). Pecan Street’s ultimate goal is to use the research outputs to lower the cost of M&V, which will allow more farmers to participate in programs to help them transition to regenerative practices.
“We included some of what we’ve learned on proposals we’re working on now – and we have no fear in proposing some of these and other similar techniques.”
“The PJMF Data Practice’s Data to Drive Climate Action Accelerator was a great program. The way they organized it, they let you flounder for a while and then pull you back out of the ditch. I’d do it again in a heartbeat if I could.”
There are currently no users for their product, but the team is committed to troubleshooting the model.
- PSI is still trying to understand where major issues are coming from
- Looking at methods for error collection in their data; they currently have a simple rule-based approach
- A data analyst looks for errors each week
- The team has not yet built another ANN from scratch on other datasets; instead, they have started putting together a larger convolutional neural network with different training assumptions, which is giving them hints at better results.