Climate Risks to Artisan Enterprises
Vision: Derive critical insights from climate impacts experienced by the global informal handworker economy by analyzing trends around climate risks to artisan enterprises and their workers in order to highlight opportunities and innovative solutions to mitigate the threats.
Data management, accessibility, utilization
“We collect hundreds of data points on a weekly basis; however, we face challenges in synthesizing the data effectively, leveraging all the data points, and deriving key highlights from exploratory analyses rather than targeted analyses. Due to the sheer size of the datasets we have and the type of methodology we have used in the past, we tend to focus on key indicators and reactive data analysis. We were eager to set up a more proactive approach to analyzing all the data we collect and we believe there is a lot more we can do with the right tools.”
Mixed methods surveys, Data Management
key governmental, private, and public stakeholders
A project manager from the Nest team identified the needs and managed technical support through an external contract with BlueOrange Digital.
Understanding climate impacts on the artisan sector is critical to designing and scaling programs that emphasize risk mitigation and solution building
“While there are countless anecdotes about how artisan businesses have experienced negative impacts on production and worker well-being from recent natural disasters across the globe, there is no comprehensive dataset highlighting these impacts on an aggregate level.”
Globally, the informal handworker economy is estimated to include some 45 million people, most of whom are women, with limited or no access to social protections or support. Nest was founded on the belief that bringing these workers together and equipping them with training and resources will unlock previously unattainable opportunities for social inclusion and growth. The Nest guild is a network of more than 2,000 artisans around the globe whose products, from textiles to crafts to jewelry, have reached over a million people.
Many of these small and medium enterprises are deeply embedded within their local economies; as such, they rely on local materials and depend on what is happening in their environment. Extreme drought, heavy rain, wildfires, and heat exacerbated by climate change threaten to destabilize these businesses and the livelihoods that depend on them.
Given their broad international network of artisan partners, Nest has a unique ability to collect data on the impacts of climate-related disasters on these home-based artisans and handcraft producers.
In 2022, Nest joined the PJMF Data Practice’s Data to Drive Climate Action Accelerator cohort and received technical support and funding to develop a data infrastructure that advanced the systems needed to collect, retain, and analyze climate-related data appropriately.
How It Started:
In 2018, Nest experienced rapid growth in their programs. Until then, they had an easy time working with their data in Excel spreadsheets, but as the program grew, those datasets became too big to work with using their existing set of tools. There was no specific moment when the team realized they needed a modern data pipeline; the need built slowly over time.
“We were not being proactive with our data. By setting up a data pipeline we realized we could activate this data and extract useful insights. Seeing what other nonprofits could do with their data increased our desire to be a more data-driven organization. We had a lot of access to data that we weren’t leveraging effectively.”
Based on the mapping of data sources and strategy discussions completed throughout the Accelerator, an initial data architecture was designed that aggregates multiple data sources using Simple Storage Service (S3) running in the AWS cloud.
Data is pulled from various sources, including Qualtrics survey software, Mailchimp, Google Sheets, Ulula (a bespoke project management application), and Docebo (a learning management system), either into an S3 bucket or directly into a Structured Query Language (SQL) data warehouse.
All data sources are refreshed and reloaded once weekly to accommodate any changes or updates to the data. The S3 bucket and other data sources are then connected to Snowflake, a cloud-based data warehousing and analytics platform that handles large amounts of structured and semi-structured data.
Snowflake provides a range of features and capabilities to support data warehousing and analytics, including support for SQL queries, data transformation and preparation, data sharing, and real-time data ingestion. It was selected as the warehousing and analytics platform over Amazon Redshift because it offers automatic optimization and tuning and a more user-friendly interface, whereas Redshift requires more technical expertise and familiarity with data warehousing concepts to manage successfully. Snowflake further connects to AWS QuickSight to provide near-real-time analytics and data visualization as information is processed and updated.
Nest and BlueOrange elected to use Fivetran for most data integration and connection tasks, as the interface allows data maintenance within Nest’s skillset and capabilities. Fivetran simplifies the process by automating the setup, maintenance, and monitoring of data pipelines, eliminating the need for manual coding or scripting. It also provides built-in data transformations and normalization, ensuring data is accurately and consistently represented in the destination system.
Grouparoo, an open-source data synchronization tool, was used for pipelines that could not be supported through Fivetran’s platform.
Nest Data Pipeline:
Step 1. Collecting the data
All 2,000+ members were invited to participate in a survey, administered through an online platform, to understand the impacts of climate change on Nest’s artisan business network. The survey consisted of a series of closed- and open-ended questions to gather information on extreme weather events experienced by the business leader and/or their workers, their knowledge and awareness of climate change, and revenue changes caused by climate-related damages.
Step 2. Consolidating the data
The collected data is consolidated into a data lake that stores the raw ingested data.
Step 3. Pre-processing
Pre-processing includes binning and creating additional and inferred variables from the raw dataset.
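As a rough illustration of this step, the sketch below bins a numeric survey answer and derives two inferred variables using only the Python standard library. The field names (revenue_change_pct, events_experienced), the bin edges, and the labels are hypothetical, chosen for illustration rather than taken from Nest's survey.

```python
from bisect import bisect_right

# Hypothetical bin edges (percent change in revenue) and labels.
# Bins: pct < -50, -50 <= pct < -10, -10 <= pct < 0, pct >= 0.
REVENUE_BIN_EDGES = [-50, -10, 0]
REVENUE_BIN_LABELS = ["severe loss", "moderate loss", "slight loss", "no loss or gain"]


def bin_revenue_change(pct):
    """Map a percent revenue change to a labeled bin; None stays None."""
    if pct is None:
        return None
    return REVENUE_BIN_LABELS[bisect_right(REVENUE_BIN_EDGES, pct)]


def preprocess(record):
    """Return a copy of a raw survey record with binned and inferred fields."""
    events = record.get("events_experienced") or []
    out = dict(record)
    out["revenue_change_bin"] = bin_revenue_change(record.get("revenue_change_pct"))
    # Inferred variables: derived from the raw answers, not asked directly.
    out["event_count"] = len(events)
    out["multi_hazard"] = len(set(events)) >= 2
    return out


# Made-up example record, not real survey data.
raw = {
    "business_id": "B-001",
    "revenue_change_pct": -25,
    "events_experienced": ["drought", "heat"],
}
processed = preprocess(raw)
print(processed["revenue_change_bin"], processed["multi_hazard"])
```

Keeping the bin edges in one named constant makes it easier to apply the same cut-offs to every weekly load, so that binned values stay comparable over time.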
Step 4. Data processing
Data processing is accomplished by Python scripts configured to run using AWS Lambda, AWS Glue, or EC2 instances before the data is written to Amazon S3.
Before the program:
Google Sheets, Qualtrics, Mailchimp
After the program:
Stata, AWS SageMaker, Tableau, S3 data lake, ArcGIS, Python scripts configured to run using AWS Lambda and AWS Glue, Snowflake data warehouse, AWS QuickSight, Fivetran for data integration, Docebo (LMS), Ulula (PM)
“Data fluency is important to us as an organization to learn from our programs and communicate our impact; there’s been a shift in thinking at Nest as we’ve gone through the Accelerator program. Our conceptual understanding a year ago was that data was survey information, with analog or manual storage and processing. We now know that there are so many different types of data (structured, unstructured, employee data, etc.) and our data is much more all-encompassing than we thought.”
It’s important to have an organizational approach to data governance and management.
These conversations had been very hard because of a lack of data fluency and understanding, but we’re now better equipped to have them thanks to our experience in the Accelerator.
Prioritize interoperability in creating effective data systems
Our storage solution of choice was AWS S3, and as we started working with it, we realized that it was important to organize it so that the climate datasets could meaningfully interact with Nest’s other extensive datasets. This is essential for drawing correlational findings, for example, understanding variations in demographics among businesses that have faced different climate impacts.
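The kind of link described here can be sketched as a join on a shared business identifier followed by a simple cross-tabulation. This is a minimal sketch, assuming both datasets carry a common business_id key; the field names and sample rows are made up for illustration and are not Nest data.

```python
from collections import Counter

# Hypothetical sample rows; real datasets would be loaded from the data lake.
climate_responses = [
    {"business_id": "B-001", "climate_impact": "drought"},
    {"business_id": "B-002", "climate_impact": "flooding"},
    {"business_id": "B-003", "climate_impact": "drought"},
]
business_profiles = [
    {"business_id": "B-001", "region": "East Africa", "woman_led": True},
    {"business_id": "B-002", "region": "South Asia", "woman_led": True},
    {"business_id": "B-003", "region": "East Africa", "woman_led": False},
]


def link_on_business_id(responses, profiles):
    """Inner-join survey responses to business profiles on business_id."""
    profiles_by_id = {p["business_id"]: p for p in profiles}
    linked = []
    for response in responses:
        profile = profiles_by_id.get(response["business_id"])
        if profile is not None:  # skip responses with no matching profile
            linked.append({**profile, **response})
    return linked


def crosstab(rows, row_field, col_field):
    """Count rows for each (row_field, col_field) pair of values."""
    return Counter((r[row_field], r[col_field]) for r in rows)


linked = link_on_business_id(climate_responses, business_profiles)
counts = crosstab(linked, "climate_impact", "region")
for (impact, region), n in sorted(counts.items()):
    print(f"{impact:10s} {region:12s} {n}")
```

The join only works if every source uses the same identifier for a business, which is one reason to settle on shared keys and a consistent storage layout before the datasets grow.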
Consider the future state
Consider the future state of your datasets and interactions between varied internal and external databases.
“When we first started we were just looking at a climate dataset from a single data source; we had a ton of data on their business functions that we have collected elsewhere. We wanted to know how specific types of businesses were doing and if linking these datasets would lead to more interesting results.”
Lambda functions save time and compute power when used in survey data processing
“We are using Lambda functions to process raw survey data from our beneficiaries, and as such, we had to select between A) creating a Docker Image for the data or B) using a layer with AWS Data Wrangler. We chose to go with option B) – using AWS Data Wrangler, because it already contains most of the functionalities that would be needed for any data processing. While Docker allows for 10GB of space, we did not feel this was necessary for any of the data processing environments and there were no use-cases that would require that much memory. This also allows us to get around implementing the Lambda Runtime API.”
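To make option B more concrete, here is a minimal sketch of a Lambda handler that relies on the AWS Data Wrangler layer (the awswrangler package, now published as AWS SDK for pandas). It is not Nest's code: the raw/ and processed/ prefixes, the cleaning rules, and the helper names processed_path and clean are assumptions for illustration. The calls wr.s3.read_csv and wr.s3.to_parquet are part of awswrangler.

```python
from urllib.parse import unquote_plus


def processed_path(bucket, raw_key):
    """Map raw/<name>.csv in a bucket to s3://<bucket>/processed/<name>.parquet."""
    name = raw_key.split("/", 1)[1] if raw_key.startswith("raw/") else raw_key
    stem = name.rsplit(".", 1)[0]
    return f"s3://{bucket}/processed/{stem}.parquet"


def clean(df):
    """Hypothetical cleaning on a pandas DataFrame: tidy column names, drop duplicates."""
    df = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
    return df.drop_duplicates()


def lambda_handler(event, context):
    # Imported inside the handler so this module still loads where the
    # layer is absent; in Lambda, awswrangler comes from the attached layer.
    import awswrangler as wr

    written = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        df = wr.s3.read_csv(path=f"s3://{bucket}/{key}")
        target = processed_path(bucket, key)
        wr.s3.to_parquet(df=clean(df), path=target)
        written.append(target)
    return {"written": written}
```

Because this sketch writes its output to the same bucket it reads from, the S3 trigger would need to be limited to the raw/ prefix so that writing a processed file does not invoke the function again.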
Organizing Content of Lambda Functions
The majority of our data cleaning and processing has been done manually: we download the raw data, clean it by hand (for example, checking data consistency and formatting), and merge the new data into the existing dataset. These processing and transformation functions will need to be replicated on our AWS platform so that its output meets the same formatting needs.
As such, Nest set up Lambda so that each major dataset has its own dedicated Lambda function, which saves computation and memory; when a dataset is updated, the processed data it maps to is updated as well. The processed data is then loaded into a visualization tool such as Tableau to build dashboards that present key insights visually and are used to monitor key metrics on a regular basis.
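One way to wire up a dedicated function per dataset is to route S3 "object created" events by key prefix, so that an upload under a given dataset's prefix invokes only that dataset's function. The sketch below builds such a notification configuration as a plain dictionary. The dataset names, the raw/<dataset>/ layout, and the ARNs are hypothetical; the source does not describe how Nest's triggers are configured.

```python
def build_notification_config(dataset_functions):
    """Build an S3 notification configuration with one Lambda target per prefix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": f"process-{dataset}",
                "LambdaFunctionArn": arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": f"raw/{dataset}/"}
                        ]
                    }
                },
            }
            for dataset, arn in sorted(dataset_functions.items())
        ]
    }


# Hypothetical dataset names and function ARNs, for illustration only.
DATASET_FUNCTIONS = {
    "climate_survey": "arn:aws:lambda:us-east-1:123456789012:function:process-climate-survey",
    "training": "arn:aws:lambda:us-east-1:123456789012:function:process-training",
}

config = build_notification_config(DATASET_FUNCTIONS)

# With boto3 installed and credentials configured, the dictionary could be
# applied to a bucket (the bucket name here is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="example-bucket", NotificationConfiguration=config)
```

Each function also needs a resource-based permission that allows S3 to invoke it. Non-overlapping prefixes keep a change in one dataset from triggering processing for the others.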
“Identifying these solutions has been innovative and groundbreaking for our data systems – both with scaling our internal capacity and the potential it unlocks for our organization to share unprecedented data on the handworker sector to external stakeholders.”
“We quickly realized with AWS that we could create efficiencies in our systems organization-wide. There were different perspectives on use and access, but we took the time to work with the data lead managing Mailchimp and Salesforce, and the proposed solutions were implemented, accomplishing our goal of reducing manual effort and improving efficiency.”
The project team are comfortable interacting with their data pipeline and are continuing their partnership with BlueOrange Digital for ongoing maintenance and future improvements.
“We’re starting to implement a learning management system (LMS) for the businesses in the guild and a project management (PM) tool for a subset of them that are fully interoperable with our data pipeline. We also aspire to implement single sign-on (SSO) technology, improving both security and the user interface of the web platforms.”
Nest are working with the Environmental Defense Fund (EDF) on climate solutions for artisan businesses within the US, creating step-by-step guides for responding to climate disasters and ensuring a just transition. They are also highlighting the lived experience of female climate leaders globally and taking into account the climate footprint of their own work.
Nest’s other climate work includes in-flight research projects using data from their overhauled pipeline. Outside of their climate work, the team is focused on improving economic equality in maker businesses. Their data pipeline is surfacing further insights, including that the vast majority of businesses are female owned or run, and the team is excited to see how they can elevate the stories of these women and equip them with the tools to succeed.