Statistical aspects of modeling panel data originating from smallholder household surveys in Sub-Saharan Africa and Vietnam

Abstract: This thesis focuses on multivariate multilevel modeling of panel data originating from smallholder farm surveys in Sub-Saharan African (SSA) countries and Vietnam. It investigates the driving forces behind maize and rice production in SSA and the diversification of farms in the rice sector in Vietnam. We also present and evaluate different tests for endogeneity of binary explanatory variables in multivariate linear regression models. Paper I discusses the drivers behind changes in rice production in SSA. The drivers of production were a combination of area expansion, market integration, farm technology in the form of fertilizer use and tractor ploughing, village elite membership, and key macro-level conditions such as rice imports and GDP per capita. The increase in rice production during the period is primarily associated with area expansion and commercial drivers. The role of commercialization in these production changes suggests that policies hold great potential for driving rice production in SSA. Paper II focuses on the drivers behind changes in maize production. It shows that the drivers of production were a combination of area expansion, market integration, farm technology in the form of fertilizer use and ploughing, village elite membership, and key macro-level conditions, including the share of the budget allocated to agriculture and maize imports, with some exceptions within the time frame of 2002 and 2008. Increases in maize production are primarily associated with area expansion, ploughing, commercial drivers, GDP per capita, and village elite membership. Paper III discusses different tests for endogeneity of binary and count explanatory variables that take into account different aspects of the possible dependence structure between the variable and the disturbance in the regression model.
The study shows how approximately equivalent tests can be obtained using the commonly used technique of ‘adding residuals’ to the regression model, which corresponds to tests based on differences between regression parameter estimates obtained under different assumptions. Paper IV investigates the power of some test statistics for testing endogeneity in the case of binary endogenous variables. Statistics based on correlations between the residuals in the studied regression and the endogenous variable are studied and compared with the technique of testing the significance of the added residual in the studied regression and with the standard Hausman test of endogeneity. Paper V aims to analyse the income diversification patterns of rice farming households in two regions of Vietnam: the Mekong River Delta (MRD) and the Red River Delta (RRD). Two indices, less dominance of rice income and less dominance of non-farm income in total household income, together with their changes over time, define four categories of farmers: 1) High diversified, 2) Ascenders, 3) Descenders, and 4) Low diversified. In the MRD, we found that education as a resource/wealth and differences in rice production and incomes between households had a significant effect on total income in the later time frame. Ascenders increased their share of all incomes other than rice, mostly shifting to non-farm incomes, i.e., out of rice dependence. In the RRD, only a higher initial income level affected income in 2005. The ascenders increased their incomes substantially, while the descenders, who were more dependent on rice income, saw a sharp fall in income. Higher education levels and smaller farm sizes reduce the importance of education and rice productivity changes in the RRD.
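As an illustration of the ‘adding residuals’ technique mentioned for Papers III and IV, the following is a minimal sketch, not the procedure used in the thesis. It assumes a single binary regressor `d`, a single instrument `z`, a linear first stage fitted by ordinary least squares, and simulated data; the function names `added_residual_test` and `simulate` and all parameter values are hypothetical. The first-stage residual is added to the regression of interest, and the usual t-statistic on its coefficient is used as the endogeneity test.

```python
import numpy as np


def added_residual_test(y, d, z):
    """Regression-based endogeneity test by 'adding residuals'.

    Step 1: regress the binary regressor d on the instrument z (linear first stage).
    Step 2: add the first-stage residual to the regression of y on d and
            return its coefficient and conventional OLS t-statistic.
    """
    n = len(y)

    # First stage: d on a constant and the instrument (assumed linear here).
    Z = np.column_stack([np.ones(n), z])
    pi_hat, *_ = np.linalg.lstsq(Z, d, rcond=None)
    v_hat = d - Z @ pi_hat

    # Augmented regression: y on a constant, d and the added residual.
    X = np.column_stack([np.ones(n), d, v_hat])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Conventional OLS covariance matrix; adequate under the null of exogeneity.
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stat = beta[2] / np.sqrt(cov[2, 2])
    return float(beta[2]), float(t_stat)


def simulate(rho, n=5000, seed=0):
    """Simulated data (illustrative assumption): rho controls the dependence
    between the latent error behind d and the disturbance of y."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)            # instrument
    u = rng.standard_normal(n)            # latent error driving d
    eta = rng.standard_normal(n)
    e = rho * u + np.sqrt(1.0 - rho**2) * eta   # disturbance of y
    d = (z + u > 0).astype(float)         # binary, endogenous when rho != 0
    y = 1.0 + 2.0 * d + e
    return y, d, z


if __name__ == "__main__":
    for rho in (0.0, 0.8):
        coef, t_stat = added_residual_test(*simulate(rho))
        print(f"rho={rho}: coefficient on added residual={coef:.3f}, t={t_stat:.2f}")
```

In this setup a large absolute t-statistic on the added residual indicates dependence between the binary regressor and the disturbance, whereas a t-statistic near zero is consistent with exogeneity. With exactly one instrument the test is closely related to a comparison of the OLS and instrumental-variable estimates of the coefficient on `d`.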

