City of Boston Office of Participatory Budgeting

Outcomes Delivered

Voyatek’s analysis identified nine main topics reflecting community priorities, including Expanding Economic Opportunities, Housing Support and Resources, and Community Health and Wellbeing. Each topic was further divided into subtopics that gave a detailed breakdown of specific resident concerns. The topics and subtopics formed the basis for three resident forums, where discussion centered on these themes and residents voted on which ideas should be implemented. This structured approach turned community feedback into a prioritized list of ideas and aligned the budget allocation with the needs residents expressed.

Background

The City of Boston’s Office of Participatory Budgeting (OPB) launched a Participatory Budgeting (PB) project to gather input from residents on how to allocate $2 million to city programs, services, and projects. The goal was to ensure that Boston’s diverse communities had a voice in deciding how resources would be distributed. OPB collected proposal ideas through multiple channels, including an online portal, workshops, and PB Corners set up in public libraries. 

Challenge

After collecting feedback, OPB needed to transform the public’s comments into an actionable list of priorities. Residents submitted a moderate volume of proposals totaling about 60,000 words, roughly the length of an average adult fiction book. Unlike a coherent narrative, however, these proposals were unstructured and varied widely in content and format. They ranged from as few as two words to as many as 300, and some included harder-to-process elements such as emojis. The task resembled sorting through thousands of tweets or product reviews.

Manually processing and categorizing this unstructured data would have been time-consuming and prone to errors. OPB needed a systematic way to group similar proposals efficiently. Traditional topic modeling methods like Non-Negative Matrix Factorization (NMF) and Latent Dirichlet Allocation (LDA) are relatively fast and easy to implement, but they often struggle to maintain semantic coherence, especially when analyzing short or complex text.
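One way to see why word-based methods struggle with short text is sketched below. NMF and LDA are typically fit on word-count (or TF-IDF) matrices, so two short proposals that express the same idea with different words can look entirely unrelated. The sketch is not NMF or LDA itself; it only compares raw word-count vectors, and the three proposals are hypothetical examples, not actual resident submissions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical proposals, invented for illustration only.
housing_a = "more affordable apartments"
housing_b = "cheaper rental housing"
transport = "more bike lanes"

# Same theme, no shared words: the count vectors do not overlap at all.
sim_same_theme = cosine_similarity(bag_of_words(housing_a), bag_of_words(housing_b))

# Different themes, one shared filler word ("more"): the vectors overlap.
sim_shared_word = cosine_similarity(bag_of_words(housing_a), bag_of_words(transport))

print(f"same theme, different words: {sim_same_theme:.2f}")
print(f"different theme, shared word: {sim_shared_word:.2f}")
```

In this toy case the two housing proposals score lower than the housing and transport pair, which is the kind of mismatch that embedding-based representations are meant to avoid.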

Solution

Voyatek implemented a more advanced topic modeling approach, BERTopic, to capture the full context of each idea. Each proposal was first converted into a dense vector using OpenAI’s text embeddings, encoding its semantic meaning. Uniform Manifold Approximation and Projection (UMAP) then reduced the dimensionality of these vectors while retaining key information. Finally, Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) grouped similar proposals into clusters. This method allowed OPB to identify patterns and connections between ideas, even when they were phrased differently.
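The pipeline described above could be wired together roughly as follows. This is a minimal sketch, not Voyatek’s actual configuration: the embedding model name and every hyperparameter are assumptions chosen for illustration, and the helper names `build_topic_model` and `cluster_proposals` are hypothetical. It assumes the `bertopic`, `umap-learn`, `hdbscan`, and `openai` packages are installed and that an OpenAI API key is available in the environment.

```python
# Illustrative settings only; the source does not state the values used.
EMBEDDING_MODEL = "text-embedding-3-small"  # assumed OpenAI embedding model

UMAP_PARAMS = {
    "n_neighbors": 15,     # size of the local neighborhood UMAP preserves
    "n_components": 5,     # target dimensionality after reduction
    "min_dist": 0.0,       # allow tight packing, which helps clustering
    "metric": "cosine",    # a common choice for text embeddings
    "random_state": 42,    # makes the reduction reproducible
}

HDBSCAN_PARAMS = {
    "min_cluster_size": 10,              # smallest group treated as a topic
    "metric": "euclidean",
    "cluster_selection_method": "eom",
    "prediction_data": True,
}


def build_topic_model():
    """Assemble a BERTopic model from the three stages described above."""
    # Imported here so the settings above can be inspected without
    # the third-party packages installed.
    import openai
    from bertopic import BERTopic
    from bertopic.backend import OpenAIBackend
    from hdbscan import HDBSCAN
    from umap import UMAP

    client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
    return BERTopic(
        embedding_model=OpenAIBackend(client, EMBEDDING_MODEL),
        umap_model=UMAP(**UMAP_PARAMS),
        hdbscan_model=HDBSCAN(**HDBSCAN_PARAMS),
    )


def cluster_proposals(proposals):
    """Embed, reduce, and cluster a list of proposal strings."""
    topic_model = build_topic_model()
    topics, _ = topic_model.fit_transform(proposals)
    return topic_model, topics
```

Calling `cluster_proposals(list_of_proposal_texts)` would return the fitted model and one topic id per proposal. BERTopic assigns the id -1 to proposals that HDBSCAN treats as noise, so those would still need separate review.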