Mapping PhD Theses to UN Sustainable Development Goals: A Global Knowledge Analysis
Every PhD thesis represents three to four years of hyper-specialized research, a deep dive into uncharted intellectual territory. But while academic papers are frequently analyzed, doctoral theses remain a hidden trove of expertise. This project asks: how much of this global knowledge production aligns with humanity's most urgent priorities, the UN Sustainable Development Goals (SDGs)?
The project aims to map millions of doctoral theses to the SDGs using NLP and geospatial analysis. The goal is a first-of-its-kind "expertise atlas" that reveals where, when, and how academic training intersects with sustainability challenges.
Bridging the Expertise-SDG Gap
The SDGs demand interdisciplinary collaboration, yet we lack visibility into whether academia's most intensive training programs (PhD research) are cultivating the right expertise. This project tackles three critical questions:
- Measurement: How do we reliably classify doctoral work into SDGs?
- Temporal Shifts: Did the 2015 SDG adoption catalyze new research directions?
- Geographic Equity: Are regions facing acute sustainability challenges producing relevant expertise?
Methodology
1. SDG Classification
- NLP Pipeline: Fine-tuning transformer models (BERT, SciBERT) on thesis abstracts/titles from OpenAlex, combined with expert-curated SDG keyword lists.
- Validation: Cross-referencing classifications with Elsevier's SDG mapping framework to refine precision.
- Language Equity: Using multilingual models (e.g., mBERT) to analyze non-English theses and reduce geographic bias.
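As a rough illustration of the keyword side of this pipeline, the sketch below tags an abstract with SDG labels and computes precision against a reference mapping. It is a minimal baseline under stated assumptions, not the fine-tuned transformer pipeline itself: the keyword lists, the `classify` and `precision` helpers, and the sample abstract are all hypothetical.

```python
import re

# Hypothetical keyword lists; real lists would be expert-curated and far longer.
SDG_KEYWORDS = {
    "SDG6": ["drinking water", "sanitation", "wastewater"],
    "SDG7": ["solar", "battery storage", "renewable energy"],
    "SDG13": ["climate change", "greenhouse gas", "carbon emissions"],
}


def classify(text, keywords=SDG_KEYWORDS):
    """Return the sorted SDG labels whose keywords appear in the text."""
    lowered = text.lower()
    labels = set()
    for sdg, terms in keywords.items():
        for term in terms:
            # Word boundaries avoid matching inside longer words.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                labels.add(sdg)
                break
    return sorted(labels)


def precision(predicted, reference):
    """Share of predicted (thesis_id, sdg) pairs also present in the reference."""
    predicted, reference = set(predicted), set(reference)
    if not predicted:
        return 0.0
    return len(predicted & reference) / len(predicted)


# Made-up abstract, for illustration only.
abstract = "We model battery storage for solar microgrids under climate change."
print(classify(abstract))
```

A keyword baseline like this is mainly useful as a sanity check and as a source of weak labels; the reference set passed to `precision` could come from an external mapping such as the one mentioned under Validation.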
2. Tracking Research Evolution
- Pre- vs. Post-2015 Analysis: Comparing thesis topics before/after SDG adoption to identify accelerations (e.g., clean energy) or stagnation (e.g., marine conservation).
- Interdisciplinary Trends: Detecting SDG pairs (e.g., SDG2 + SDG13 for climate-smart agriculture) through co-classification analysis.
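The two analyses above can be sketched with plain counting. The example below compares the share of theses per SDG before and after a cutoff year and counts how often two SDGs are assigned to the same thesis. The records are toy data, the helper names are hypothetical, and treating 2015 itself as "post" is an assumption.

```python
from collections import Counter
from itertools import combinations

# Toy records: (completion year, SDG labels assigned to the thesis).
theses = [
    (2011, ["SDG2"]),
    (2013, ["SDG2", "SDG13"]),
    (2014, ["SDG7"]),
    (2014, ["SDG3"]),
    (2016, ["SDG7"]),
    (2017, ["SDG7", "SDG13"]),
    (2019, ["SDG2", "SDG13"]),
    (2020, ["SDG3", "SDG11"]),
]


def sdg_shares(records):
    """Fraction of theses in `records` that carry each SDG label."""
    total = len(records)
    if total == 0:
        return {}
    counts = Counter(sdg for _, labels in records for sdg in set(labels))
    return {sdg: n / total for sdg, n in counts.items()}


def pair_counts(records):
    """Count how often each unordered SDG pair is assigned to the same thesis."""
    counts = Counter()
    for _, labels in records:
        counts.update(combinations(sorted(set(labels)), 2))
    return counts


CUTOFF = 2015  # assumption: theses completed in 2015 or later count as "post"
pre = [r for r in theses if r[0] < CUTOFF]
post = [r for r in theses if r[0] >= CUTOFF]
pre_shares, post_shares = sdg_shares(pre), sdg_shares(post)

# Change in share (post minus pre) for every SDG seen in either period.
change = {
    sdg: post_shares.get(sdg, 0.0) - pre_shares.get(sdg, 0.0)
    for sdg in set(pre_shares) | set(post_shares)
}
print(change)
print(pair_counts(theses).most_common(3))
```

Comparing shares rather than raw counts keeps overall growth in thesis output from looking like an SDG-specific acceleration. Because a thesis takes several years to complete, a lagged cutoff may be more appropriate than 2015 itself.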
3. Mapping Expertise Geographically
- Institutional Heatmaps: Visualizing clusters of SDG-aligned expertise using latitude/longitude data from university affiliations.
- Relevance Scoring: Flagging mismatches (e.g., water-scarce regions with minimal SDG6 research) and centers of excellence.
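One simple way to flag mismatches is to compare a regional need indicator with the share of theses on the matching SDG. The sketch below does this for a single SDG. The `RegionProfile` fields, the need index, both thresholds, and the region values are hypothetical placeholders rather than the project's actual scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class RegionProfile:
    name: str
    need: float        # hypothetical need index in [0, 1], e.g. water stress for SDG6
    sdg_theses: int    # theses classified under the SDG of interest
    total_theses: int  # all theses produced in the region


def expertise_share(region):
    """Fraction of the region's theses that address the SDG of interest."""
    if region.total_theses == 0:
        return 0.0
    return region.sdg_theses / region.total_theses


def flag_mismatches(regions, need_threshold=0.6, share_threshold=0.02):
    """Names of regions with high need but a low share of relevant theses."""
    return [
        r.name
        for r in regions
        if r.need >= need_threshold and expertise_share(r) < share_threshold
    ]


# Made-up values, for illustration only.
regions = [
    RegionProfile("Region A", need=0.85, sdg_theses=4, total_theses=1000),
    RegionProfile("Region B", need=0.80, sdg_theses=90, total_theses=1200),
    RegionProfile("Region C", need=0.20, sdg_theses=3, total_theses=900),
]
print(flag_mismatches(regions))
```

With these values only Region A is flagged: Region B has high need but substantial output, and Region C has low output but also low need. Inverting the second condition would surface candidate centers of excellence instead.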
Data Insights
- Temporal Shifts: Early data show a 40% increase in SDG7 (Affordable and Clean Energy) theses post-2015, driven by battery storage and solar tech research.
- Geographic Gaps: SDG14 (Life Below Water) expertise concentrates in coastal nations, while landlocked countries contribute <5% of relevant theses.
- Interdisciplinary Surges: Theses linking SDG3 (Health) and SDG11 (Cities) tripled between 2010 and 2020, reflecting urban health crises.
Critical Gaps to Address
1. Ethical Priorities
Analysis has revealed complex tensions in how different regions prioritize SDGs. For instance, developing nations often place stronger emphasis on economic growth (SDG8), while developed nations focus more on climate action (SDG13). We need better frameworks to understand and address these prioritization conflicts, so that the analysis acknowledges both immediate socioeconomic needs and long-term ecological imperatives.
2. Impact Pathways
A key challenge is understanding how thesis research translates into real-world impact. This requires frameworks that correlate thesis clusters with policy outcomes, for example by tracking how concentrations of SDG7 (Affordable and Clean Energy) theses relate to national renewable energy investments and adoption rates. This analysis will help bridge the gap between academic research and the practical implementation of sustainability solutions.
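A first pass at this kind of correlation could use a plain Pearson coefficient across countries. The sketch below is only illustrative: the `pearson` helper is written out by hand, and the thesis counts and investment figures are made-up numbers, not project data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-country values, for illustration only.
sdg7_theses = [12, 30, 45, 80, 95]                # SDG7 thesis counts
renewable_investment = [0.4, 1.1, 1.3, 2.9, 3.2]  # hypothetical investment figures

r = pearson(sdg7_theses, renewable_investment)
print(round(r, 3))
```

A high coefficient here would only indicate association. Country size, research funding, and time lags between training and policy would all need to be controlled for before reading it as an impact pathway.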
Why This Matters
For Academia
- Identify mismatches between PhD training and global needs
- Guide curriculum development for "gap" SDGs (e.g., SDG12: Responsible Consumption)
For Policymakers
- Pinpoint regions needing expertise imports (e.g., drought-prone areas lacking SDG6 researchers)
- Benchmark national research investments against SDG progress
For Researchers
- Discover understudied SDG intersections (e.g., AI for SDG14 monitoring)
- Map potential collaborators across institutions
Skills Applied
- Python: NLP pipelines (Hugging Face, spaCy), geospatial analysis (GeoPandas), large-scale data processing (Dask)
- Scientometrics: Citation network analysis, expertise flow mapping
- Sustainability Science: SDG indicator frameworks, policy relevance evaluation
Relevant Research Papers
Mapping Research to the Sustainable Development Goals: A Contextualised Approach
Authors: Weiwei Wang, Weihao Kang, Jingwen Mu
This paper introduces the "Auckland Approach," an innovative text-mining technique using n-gram analysis to map research publications to SDGs. The methodology addresses a crucial challenge in SDG mapping: the need to account for cultural, linguistic, and regional differences in how sustainability research is conducted and described. While focused on bibliometric mapping, their approach offers valuable insights that could be extended to doctoral thesis analysis.
Key Contribution: The paper's emphasis on contextual understanding aligns closely with this project's goals, particularly in addressing the challenges of multilingual thesis analysis and regional research variations.
Mapping Scholarly Publications Related to the Sustainable Development Goals
Authors: Caroline S. Armitage, Marta Lorenz, Susanne Mikki (University of Bergen; the "Bergen Approach")
This comprehensive study compares various bibliometric approaches for mapping scholarly work to SDGs. It reveals important discrepancies between different methodologies, such as Elsevier's SciVal and independent query approaches. The research emphasizes the critical need for transparent bibliometric tools to ensure accurate SDG alignment.
Key Contribution: Their findings on how query structure influences results have directly informed our thesis classification methodology, helping us develop more robust and accurate mapping techniques.
Sustainable Development Goals: A Bibliometric Analysis of Literature Reviews
This insightful analysis employs cluster analysis and visualization techniques to map thematic currents in SDG research. Using methods like co-authorship analysis and keyword co-occurrence, it identifies dominant research fields such as environmental sciences and energy studies, while tracking trends in diversifying SDG research areas.
Key Contribution: The paper's thematic clustering approach has proven invaluable for our work on identifying expertise gaps across different SDG domains in doctoral research.