Centro para la Biodiversidad Marina y la Conservación
Vision: Create spatially explicit models that adjust fishing quotas to ecological productivity limits under a variety of climate scenarios, with the goal of building an open-source platform that can monitor extreme environmental events and advise the Mexican fishing sector on adaptive management
Data preparation and Cloud Infrastructure setup
More computing power, Cloud infrastructure
Citizen Science/Crowdsourcing, Open Source, Predictive Modelling, GIS Mapping
Fishery managers, Policy makers
CBMC Project Team
Six members were on the Data Team:
Integrating environmental, ecological, and economic traits into a single model will yield an adaptive management strategy superior to the Maximum Sustainable Yield model currently in use. Such a system could help Mexico move toward its sustainability goals while saving the fishing sector millions of dollars in potential losses.
The “Maximum Sustainable Yield” model >
Within the Accelerator program, we are tackling a problem of worldwide importance: how fisheries will react to climate change. We are applying semantic trajectory modeling to the satellite geolocations of industrial fishing vessels to define the fleet's critical fishing grounds. We then model the fishing data against ocean variables and predict future fishing effort and catches on those grounds under climate change models. As temperatures rise, marine productivity will change, and the fishing industry and the millions of people living in coastal areas will need to adapt to maintain their livelihoods and sustainably extract marine resources. Our model outputs will better inform the fishing industry about future climatic scenarios.
Fisheries are increasingly impacted by climate change . . .
Fisheries worldwide are under threat from climate change. The CO2 released by rising greenhouse gas emissions doesn't just trap heat in the atmosphere and warm the air. Much of that heat is absorbed by the oceans covering 70% of our planet, producing marine heatwaves at and below the surface. These heat waves can fuel harmful, toxin-releasing algal blooms, contribute to die-offs in species maladapted to rapid temperature change, and ultimately drive the relocation and decline of populations. Marine heatwaves have the potential to destabilize fragile ecosystems and the communities that rely on them for a sustainable food source.
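Researchers commonly operationalize a marine heatwave as a run of at least five consecutive days with sea-surface temperature above a seasonal climatological threshold (typically the day-of-year 90th percentile). A minimal sketch of that detection rule, using a synthetic temperature series rather than real SST data:

```python
def marine_heatwave_days(sst, threshold, min_run=5):
    """Flag days belonging to a marine heatwave: runs of at least
    `min_run` consecutive days where SST exceeds the climatological
    threshold (e.g. the day-of-year 90th percentile)."""
    hot = [s > t for s, t in zip(sst, threshold)]
    mask = [False] * len(hot)
    start = None
    for i, h in enumerate(hot + [False]):  # trailing sentinel closes the final run
        if h and start is None:
            start = i                      # a warm run begins
        elif not h and start is not None:
            if i - start >= min_run:       # long enough to count as a heatwave
                for j in range(start, i):
                    mask[j] = True
            start = None
    return mask

# Synthetic series: one 7-day warm spell and one 3-day spell (too short to qualify)
sst = [17.0] * 3 + [19.0] * 7 + [17.0] * 4 + [19.0] * 3 + [17.0] * 3
threshold = [18.0] * len(sst)
mhw = marine_heatwave_days(sst, threshold)
print(sum(mhw))  # 7 -- only the 7-day spell is flagged
```

The minimum-run requirement is what separates a heatwave from ordinary day-to-day temperature noise: brief excursions above the threshold are ignored.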
Fisheries managers need new insights to adapt . . .
A prolonged warming period between 2014 and 2019 was particularly destructive to the fisheries industry along the North American Pacific coast. Crucially, given the vastness of the Pacific Ocean, this warming was not uniform across the ocean's surface. Underwater currents, prevailing winds, and other factors influence how much, and where, surface temperatures rise. If fishery managers can determine with greater certainty where the ocean is warming fastest, they may find clues as to where fish populations will migrate as they seek refuge from a warming climate.
Quantifying impact with vessel monitoring systems . . .
The CBMC team sought to better understand the impacts of climate change on fisheries in the Mexican Pacific, and in doing so, to provide a forecast of spatiotemporal changes in a vessel’s catch according to different climate warming scenarios.
To do this, the team had to curate a dataset from the Mexican government's Vessel Monitoring System (VMS), which uses satellites and GPS to track the latitude, longitude, speed, and direction of each vessel at an hourly interval from 2008 to 2021. The full VMS dataset reports each vessel's name, unique ID, speed, and navigation bearing. After cleaning and standardization, the dataset holds roughly 150 million rows representing the tracks of 2,287 industrial vessels, totaling approximately 35 gigabytes. This is far too big to store and work with efficiently on Dropbox.
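At this scale, curation largely means streaming through the raw pings and filtering out malformed records before any analysis. A minimal sketch of the kind of validation pass involved; the column names and speed cutoff here are illustrative assumptions, not CBMC's actual schema:

```python
import csv
import io

def clean_vms_rows(rows):
    """Drop malformed VMS pings: unparsable fields, coordinates outside
    plausible bounds, or impossible speeds. Yields standardized dicts."""
    for row in rows:
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
            speed = float(row["speed"])
        except (KeyError, TypeError, ValueError):
            continue  # unparsable record
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue  # corrupt GPS fix
        if not (0.0 <= speed <= 30.0):
            continue  # implausible speed (knots) for an industrial vessel
        yield {"vessel_id": row["vessel_id"].strip(),
               "lat": lat, "lon": lon, "speed": speed,
               "timestamp": row["timestamp"]}

# Small in-memory sample; the real pipeline would stream file by file
sample = io.StringIO(
    "vessel_id,latitude,longitude,speed,timestamp\n"
    "MX-001,24.15,-110.32,4.2,2019-06-01T00:00:00\n"
    "MX-001,999.0,-110.32,4.1,2019-06-01T01:00:00\n"   # bad latitude
    "MX-002,23.90,-109.80,abc,2019-06-01T02:00:00\n"   # bad speed
)
clean = list(clean_vms_rows(csv.DictReader(sample)))
print(len(clean))  # 1 valid ping survives
```

Because the function is a generator, it never needs the full 35 GB in memory at once; cleaned batches can be written straight to a compact columnar format for querying.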
How It Started:
A “leap of faith”
The CBMC team came across the call for proposals and was excited to apply because it sounded like something different from the norm. Fabio was the organization's data science champion and led the charge, first convincing program director Marisol that investing time in this initiative would reap future benefits. They then turned to the program's board for formal approval.
“CBMC Board members were initially reluctant to participate, and the team had to push hard for this. At first they didn’t see what we saw as a transformational opportunity. We kept working on this, redefining what we could do with the organization with all of the data we had, and recruited fiscal sponsors to help.”
Joining the PJMF Data Practice’s Data to Drive Climate Action Accelerator “was a leap of faith and a really good one. It was the first grant that CBMC ran by itself, received, managed and reported by the organization. Ultimately, it enabled us to get other funding.”
Constructed datasets to merge with the vessel tracks: each vessel’s metadata, including the species and quantity it declared to have caught upon reaching port, its fishing permits, and its fishing gear. This data wrangling produces a database of semantic trajectories enriched with the information needed to interpret results.
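The enrichment step described above boils down to joining each ping against a registry keyed by vessel ID. A toy sketch under assumed field names (the real permit and landings tables are far richer than this):

```python
def enrich_trajectories(pings, registry):
    """Attach declared species and fishing gear to each vessel ping,
    turning raw positions into 'semantic' trajectory points. Vessels
    missing from the registry are kept but flagged, so registry gaps
    stay visible instead of being silently dropped."""
    enriched = []
    for ping in pings:
        meta = registry.get(ping["vessel_id"])
        enriched.append({
            **ping,
            "species": meta["species"] if meta else None,
            "gear": meta["gear"] if meta else None,
            "matched": meta is not None,
        })
    return enriched

pings = [
    {"vessel_id": "MX-001", "lat": 24.15, "lon": -110.32},
    {"vessel_id": "MX-099", "lat": 23.90, "lon": -109.80},  # not in registry
]
registry = {"MX-001": {"species": "yellowfin tuna", "gear": "purse seine"}}
out = enrich_trajectories(pings, registry)
print(sum(p["matched"] for p in out))  # 1 of 2 pings matched
```

Keeping unmatched trajectories flagged rather than discarded is a deliberate choice: the match rate itself is a useful quality metric when reconciling tracks with government databases.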
“The C4 model – I apply it all over the place now. Planning these stages was always part of our logical workflow. The C4 model on paper is very important, not only for this project but for the others we ran in parallel. Most of the time we have raw data that needs to be wrangled and associated with government databases, and this approach can be applied to all the projects we do. It is very important to structure the work this way.”
- Serverless queries on AWS with Athena – to minimize the risk of overspend by paying only for the queries you run, rather than for standing server capacity
- “Docker containers were a problem. We had heard of them before the project but never really had to use them. We were fighting with dependencies. Trying to make spatial packages work with R was painful. I hated containers at the time, and now I love them.”
- “The McGovern Data Practice team really helped us here – we were struggling. We were about to transfer everything from one AWS server to another. We were going crazy because we were still generating costs; the technical team helped us cancel this, and with a lot of smaller things.”
- “At the beginning we were supposed to use the Cloudera data platform. We had problems with AWS – that was a major delay in the timeline – AWS didn’t want to increase our limits. It was crazy difficult to understand the combination. We paid for additional AWS support to solve the problem.”
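The serverless Athena pattern in the first bullet works because Athena bills per byte of data scanned, so restricting queries to partition columns (e.g. year/month folders in S3) keeps costs down. A hypothetical sketch of composing such a query; the table and column names are assumptions, not CBMC's actual schema:

```python
def effort_query(year, month, table="vms_pings"):
    """Compose a partition-pruned aggregation query. Filtering on the
    year/month partition columns limits the bytes Athena scans, which
    is what you are billed for."""
    return (
        "SELECT vessel_id, COUNT(*) AS pings, AVG(speed) AS mean_speed\n"
        f"FROM {table}\n"
        f"WHERE year = {year} AND month = {month}\n"
        "GROUP BY vessel_id"
    )

print(effort_query(2019, 6))
```

In practice the string would be submitted through boto3's Athena client (`start_query_execution`, with an S3 output location for results). The appeal for a small nonprofit is that no server runs between queries, so idle time costs nothing.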
Read the Final Insight Report
Before the program:
Dropbox folder, 1 laptop, Excel Spreadsheets
After the program:
Reinvented data pipeline with AWS, applied novel ML methods to fisheries analysis (SageMaker with RMSProp optimization)
Before the program:
“We were at level zero, doing descriptive statistics with a bunch of data in Excel spreadsheets. Before, our strength was bringing the data back to the community. Having a digital record for our users was and still is important. If we want to create impacts on a higher level with decision making, spreadsheets are not enough.”
After the program:
“We have a lot of data, and when we put this into the blender of analysis, we are now able to come up with different solutions and recommendations. We went up the ladder on obstacles and answers. Now we know ‘why more data’ and ‘what to do’ with more data. That’s what’s making a difference with what we’re doing. Now science is being used in lawmaking in Baja California state.”
Take the time to figure out data preparation and infrastructure
“Gathering data is not most important. Invest in what you can do with the data.”
“For small organizations, it’s difficult to focus on improving data practices. We now have much more data than we’ll ever be able to analyze. It’s worth it to stop gathering and start reorganizing. Think about questions that help you move forward from what data you have.”
Materials and Methods of Analysis – Read the CBMC Final Insights Report >
Infrastructure and human capital are big hurdles to start making progress with data work
In the context of this project, machine learning work has been limited by infrastructure. For a nonprofit, sustaining cloud expenses is tough: hosting data in the cloud is still very expensive, and scaling the work up to serve more users is difficult under budgetary constraints. Development funders tend not to give money for server space; they ask, “$30,000 for what?” To overcome this challenge, the team built partnerships with academic institutions to gain access to servers and infrastructure.
For talent acquisition in the marine sciences, the team knows that young people want to go into the field and dive rather than stay in the office and program; few choose the latter path. The team leveraged a government scholarship program in data science and now has four interns directly helping carry the work forward.
Big data analysis can’t just be on personal computers
- Not everything can be stored on Dropbox, especially considering the size of the datasets the government compiles on industrial fishing fleets
- Previously, the team was running models on laptops without enough RAM
Documentation is key, through the C4 model tool and a tactical roadmap
“Before this project, programming was in people’s heads, and if our programmers were missing or left the project, we would lose everything and all progress made”
– Fabio Favoretto
Local and national implementation strategy is founded on good science communication and organizing
“For us to provide general reports for fishermen, NGOs, and government agencies to make decisions, we were looking for a program able to run this analysis with lots of robust data, forecasts, and predictions”
Engaging decision makers has been a long and hard process, involving many meetings. The team hosted a forum in 2022 to present their science in general language. “When you ask scientists why they don’t relate to government, it is typically because they don’t understand laws and don’t know where their data can help. Government says it can’t understand what scientists are doing that could help it make policy.”
“We started building the bridges here.”
The team conducted an experiment with 50 scientists, asking them to speak about their science at a high level and to answer questions like “Why don’t you make it available…?” and “Why don’t you share your data?” Many people from government agencies came to the event, and afterward they started inviting the scientists to help on ocean and climate policies. The state is now more concerned with taking a step forward on what the country is asking for.
“We’ve gone through the door fully and have a seat at the table to find ways to influence laws”
how PJMF’s Data Practice Accelerator grant helped
“The McGovern Data Practice Accelerator was incredible in changing how we manage all our programs across the portfolio. It ushered in a revolution in how we use data; for the organization it was a tipping point. We needed to reinvent ourselves and gain clarity on our goals and partnerships.”
“Now the results from our work will be published in Science Advances! A top-1% journal by impact factor.”
“We weren’t able to say what we do as an organization before this experience. This work defined us, and it came at a very interesting time, at our 10-year anniversary, to redefine our goals and scope. We didn’t know we were able to do all of the things we now have in our toolkit”
3 Primary Project Areas:
A blue carbon program in estuaries, to understand how to protect more mangroves and seagrasses, along with the delta habitats that can be game-changing in tackling climate change
An aquatic atlas project estimating the value of the scuba industry in Mexico, which is worth more than the fisheries industry. Given how critically at risk Mexican reefs are, the team started asking why more isn't being done to protect them. The team will partner with NatGeo, Scripps, and UBC on a worldwide study, and with PADI on a second phase, to try to scale to other areas.
Working with industrial and local-scale fisheries to figure out how Mexico can reach the 30 by 30 target. The team will partner with Skylight and ProtectedSeas, using other tech platforms to find new ways to protect oceans with science-based solutions.