The “budgeting for SDGs” (B4SDGs) paradigm seeks to coordinate the budgeting process of the fiscal cycle with the Sustainable Development Goals (SDGs) set by the United Nations. Integrating the goals into public financial management systems is crucial for effectively aligning national development priorities with the objectives set in the 2030 Agenda. Within the dynamic process defined in the B4SDGs framework, SDG budget tagging is a precondition for subsequent budget diagnostics. However, developing a national SDG taxonomy requires a substantial investment of time and of human and administrative resources. Such costs are exacerbated in least-developed countries, which are often characterized by constrained institutional capacity. Automating SDG budget tagging could represent a cost-effective solution. We use well-established text analysis and machine learning techniques to explore the scope and scalability of automatically labeling budget programs within the B4SDGs framework. The results show that, while our classifiers can achieve high accuracy, they face limitations when trained on data that is not representative of the institutional setting considered. These findings imply that a national government trying to integrate SDGs into its planning and budgeting practices cannot rely solely on artificial intelligence (AI) tools and off-the-shelf coding schemes. Our results are relevant to academics and the broader policymaking community, and they contribute to the debate on the strengths and weaknesses of adopting computer algorithms to assist decision-making processes.