Panel Data Analysis
For countries with panel surveys, household IDs are automatically harmonized across waves, enabling longitudinal analysis without manual matching.
Tracking Households Over Time
import lsms_library as ll
uga = ll.Country('Uganda')
# Get food expenditures across all waves
food_exp = uga.food_expenditures()
# The MultiIndex includes household ID (i) and wave (t), so you can follow one household across waves
household_over_time = food_exp.xs('00c9353d8ebe42faabf5919b81d7fae7', level='i')
# Compare specific waves
wave_2015 = food_exp.xs('2015-16', level='t')
wave_2019 = food_exp.xs('2019-20', level='t')
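Once two waves are selected, a common next step is to compare the same households across them. The following is a minimal sketch on synthetic data, not output from the library: the index levels (`i`, `t`, `j`), the `value` column, and the household IDs are assumptions made for illustration, and real tables may carry different levels and columns.

```python
import pandas as pd

# Synthetic stand-in for a food expenditure table. The level names
# (i = household, t = wave, j = item) are assumed for this sketch.
index = pd.MultiIndex.from_tuples(
    [
        ("hh1", "2015-16", "maize"),
        ("hh1", "2019-20", "maize"),
        ("hh2", "2015-16", "maize"),
        ("hh3", "2019-20", "maize"),
    ],
    names=["i", "t", "j"],
)
food_exp = pd.DataFrame({"value": [100.0, 130.0, 80.0, 60.0]}, index=index)

# Total spending per household in each wave.
totals = food_exp["value"].groupby(level=["i", "t"]).sum()

# One column per wave; households missing from a wave get NaN.
wide = totals.unstack("t")

# Keep only households observed in both waves, then take the difference.
both = wide.dropna(subset=["2015-16", "2019-20"])
change = both["2019-20"] - both["2015-16"]
```

Restricting to households present in both waves keeps the comparison on a balanced panel, at the cost of dropping households that entered or left the survey.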
Panel Attrition
Check how many households appear across wave pairs:
from lsms_library import tools
panel = tools.panel_attrition(uga.household_characteristics(), uga.waves)
# Returns a matrix:
#            2005-06  2009-10  2010-11  ...
# 2005-06       3122     2606     2386
# 2009-10        NaN     2974     2617
# Diagonal = total households per wave
# Off-diagonal = overlap between waves
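To make the shape of that matrix concrete, here is a small sketch of how an overlap matrix of this kind could be computed from sets of household IDs. It is not the implementation of `tools.panel_attrition`; the function name `overlap_matrix` and the sample IDs are hypothetical.

```python
def overlap_matrix(ids_by_wave):
    """Count households shared by each (earlier, later) pair of waves.

    ids_by_wave maps wave labels, in chronological order, to sets of
    household IDs. Cells below the diagonal are left as None.
    """
    waves = list(ids_by_wave)
    matrix = {}
    for r, earlier in enumerate(waves):
        row = {}
        for c, later in enumerate(waves):
            if c < r:
                row[later] = None  # lower triangle is not filled in
            else:
                row[later] = len(ids_by_wave[earlier] & ids_by_wave[later])
        matrix[earlier] = row
    return matrix


# Hypothetical household IDs, already harmonized across waves.
ids_by_wave = {
    "2005-06": {"a", "b", "c", "d"},
    "2009-10": {"a", "b", "c", "e"},
    "2010-11": {"a", "b", "f"},
}
matrix = overlap_matrix(ids_by_wave)

# Share of 2005-06 households that are still present in 2009-10.
retention = matrix["2005-06"]["2009-10"] / matrix["2005-06"]["2005-06"]
```

Dividing an off-diagonal cell by the diagonal cell in the same row gives a retention rate for that pair of waves.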
Panel IDs
The panel_ids and updated_ids properties provide the raw ID mappings:
# Computed lazily on first access
uga.panel_ids # dict mapping waves to ID tables
uga.updated_ids # dict of {old_id: new_id} per wave
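As an illustration of how a per-wave `{old_id: new_id}` mapping can be used, the sketch below relabels raw records by hand. The mapping contents, the record layout, and the `relabel` helper are all hypothetical, and the sketch assumes the keys are the IDs as recorded in each wave.

```python
# Hypothetical per-wave mappings in the {old_id: new_id} shape.
updated_ids = {
    "2009-10": {"H-0101": "hh1", "H-0102": "hh2"},
    "2010-11": {"X17": "hh1"},
}

# Hypothetical raw records: (wave, household ID as recorded in that wave).
records = [
    ("2009-10", "H-0101"),
    ("2009-10", "H-0102"),
    ("2010-11", "X17"),
    ("2010-11", "X99"),  # no mapping available, so the raw ID is kept
]


def relabel(wave, raw_id, mappings):
    """Return the mapped ID for a wave, falling back to the raw ID."""
    return mappings.get(wave, {}).get(raw_id, raw_id)


harmonized = [(wave, relabel(wave, raw_id, updated_ids)) for wave, raw_id in records]
```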
To eagerly preload panel IDs at construction time:
How ID Harmonization Works
Different surveys handle panel IDs differently:
- Stable IDs -- the same household keeps the same ID across waves
- Backward links -- each wave provides a mapping to the previous wave's ID
- Composite IDs -- IDs are constructed from multiple survey fields
The library's updated_ids mechanism walks the ID chain so that a single
canonical ID refers to the same household across all waves. This happens
transparently when you call any table method.
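The backward-link case can be sketched as follows. This is a simplified illustration of walking an ID chain, not the library's code: the `resolve` function, the link tables, and the choice of the earliest known ID as the canonical one are assumptions.

```python
def resolve(wave, raw_id, waves, backward_links):
    """Follow backward links from `wave` until the chain ends.

    waves: wave labels in chronological order.
    backward_links: {wave: {id_in_wave: id_in_previous_wave}}.
    Returns the earliest ID reachable from (wave, raw_id).
    """
    position = waves.index(wave)
    current = raw_id
    while position > 0:
        links = backward_links.get(waves[position], {})
        if current not in links:
            break  # no earlier link: the household first appears here
        current = links[current]
        position -= 1
    return current


waves = ["2005-06", "2009-10", "2010-11"]

# Hypothetical link tables: each wave maps its IDs to the previous wave's IDs.
backward_links = {
    "2009-10": {"B1": "A1", "B2": "A2"},
    "2010-11": {"C1": "B1", "C9": "B9"},
}

tracked = resolve("2010-11", "C1", waves, backward_links)  # C1 -> B1 -> A1
joined_late = resolve("2010-11", "C9", waves, backward_links)  # C9 -> B9, chain stops
```

With stable IDs the link tables are empty and every ID resolves to itself, so the same walk covers that case as well.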