Introduction (5 pt)

Marriage plays an important and integral role in our society. Despite recent waves of societal, cultural, and generational change, the family remains the bedrock unit for studying the individual and society, as well as a prime focus of study in its own right (Peterson and Bush (2012)). It is widely believed that marriage is associated with an overall increase in happiness and life satisfaction. Indeed, empirical research on marital satisfaction has shown that in stable marriages, spouses are healthier, happier, and live longer (Abreu-Afonso et al. (2022)). Moreover, marriage has also been found to be correlated with higher levels of happiness in Taiwan, where the results of most samples showed that happiness levels were significantly higher than the baseline within 3 years of marriage (Tao (2019)).


However, the increase in cohabitation raises the question of whether only marriage has beneficial effects (Perelli-Harris et al. (2019)). In addition, with current trends towards self-realization and personal independence, more and more individuals are opting out of marriage and prefer to be single and happy. Marriage entails financial costs, effort and commitment. More important, and more doubtful, is whether being married actually brings happiness and life satisfaction. Indeed, several studies report that marital satisfaction decreases over time and is highest in the first years of the relationship (Abreu-Afonso et al. (2022)). Regarding gender differences in the context of happiness and marriage, one study found a higher separation risk among men who are happy with their relationship, but not among women (Perelli-Harris and Blom (2021)).

This paper analyzes the connection between marital status and happiness (including overall life satisfaction) among individuals in Germany, based on the SOEP teaching data. The main objective is to investigate whether there is a strong correlation between marriage and the feeling of happiness.

Data and Method (5 pt)

The data used for this Final Project are selected variables from the SOEP teaching data.

The Socio-Economic Panel (SOEP) is a German panel survey that was introduced in 1984. Individuals across the country answer a questionnaire on various aspects of their lives, such as education, employment, health and happiness. Because the same people are surveyed every year, it is possible to track long-term psychological, economic, societal, and social developments. The teaching version of the SOEP contains about 50% of the original SOEP data and is therefore well suited to a small analysis.

To investigate the correlation between marital status and happiness, selected tables from the teaching SOEP database were chosen and an Exploratory Data Analysis (EDA) was conducted.

Sample

The selected sample contains 128,752 observations from 2007 to 2018. To obtain this sample, we explored the 7 datasets provided in the teaching data and chose the most suitable ones for further analysis. The basic exclusion criterion was the relevance of the data to the topic.

Variables

The selected dataset soep contains eight variables, which are the following:

  • pid - unique person identifier
  • syear - year of the observation
  • gender
  • age
  • marital status
  • life satisfaction
  • family satisfaction
  • frequency of being happy in the last 4 weeks

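Marital status, life satisfaction, family satisfaction and frequency of being happy were filtered to values greater than zero. Negative values in the SOEP are missing-value codes (either not provided or not applicable), so they are treated as NA and are not relevant for our analysis.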
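
The handling of negative missing-value codes can be sketched on toy data. This is a minimal base-R illustration, not the pipeline used in this report: the toy data frame and the helper vector `vars` are hypothetical. It also assumes that a score of 0 is a valid answer on the satisfaction scale, in which case recoding negative values to NA keeps such answers, whereas a strict "greater than zero" filter would drop them.

```r
# Hypothetical toy data: negative numbers stand for missing-value codes
toy <- data.frame(
  pid               = 1:5,
  life_satisfaction = c(8, 0, -1, 7, -5),
  marital           = c(1, 2, -2, 3, 1)
)

vars <- c("life_satisfaction", "marital")

# Recode every negative code to NA instead of filtering on "> 0"
toy[vars] <- lapply(toy[vars], function(x) replace(x, x < 0, NA))

# Keep only rows that are complete on the analysis variables
clean <- toy[complete.cases(toy[vars]), ]
print(clean)
```

Whether a zero score should be kept depends on the scale of each variable, so the choice needs to be checked against the SOEP documentation.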

From our perspective, the selected variables could provide the most suitable results in regard to the correlation between marriage and the level of happiness. In this analysis, we assume that happiness can be measured by the level of life satisfaction along with family satisfaction and frequency of being happy in the last 4 weeks.

Empirical Model

In this analysis two multiple regression models were applied:

The first one: \[ \text{Life Satisfaction (Happiness)} = \text{Marriage Status} + \text{Gender} + \text{Marriage Status} \times \text{Gender} \] This model shows whether men or women report higher life satisfaction, and in which marital status.

The second one: \[ \text{Life Satisfaction (Happiness)} = \text{Marriage Status} + \text{Age} + \text{Marriage Status} \times \text{Age} \] This model shows how life satisfaction changes with age for people of different marital statuses.

Furthermore, we calculated Fixed Effects (FE) and First Difference (FD) estimators of how the marriage status affects life satisfaction level.
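
For reference, the two panel estimators can be written out in their textbook form. The notation is an assumption introduced here for illustration: \(y_{it}\) is the life satisfaction of person \(i\) in year \(t\), \(x_{it}\) collects the marital status dummies, \(\alpha_i\) is a time-constant person effect, and bars denote person-specific means.

```latex
% Panel model with a person-specific effect
y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}

% Fixed Effects (within) transformation: subtract person means, which removes \alpha_i
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)

% First Difference transformation: subtract the previous year, which also removes \alpha_i
y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1})
```

In both cases \(\beta\) is identified only from people whose marital status changes during the observation period.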

Data Analysis (10 pt)

Load data

Firstly, we need to load all the necessary libraries for our data analysis.

library(tidyverse)
library(sjlabelled)
library(huxtable)
library(stargazer)
library(gtsummary)
library(plm)
library(haven)
library(interactions)

We use the tidyverse to get access to packages such as dplyr and ggplot2, which help us clean and visualize the data. The sjlabelled package is useful for working with labelled data. gtsummary, huxtable and stargazer allow us to report model estimates in regression-specific table formats. plm makes the estimation of linear panel models straightforward. haven is used for reading .dta files. Finally, the interactions package provides functions for plotting interaction effects.

Now we can import the necessary data.

#pid, syear, Age of Individual, Gender of Individual, Marital Status of Individual, Overall life satisfaction
pequiv <- read_dta("pequiv.dta", 
                   col_select = c("pid", "syear", "d11101",
                                  "d11102ll", "d11104", "p11101"))

#pid, syear, Frequency of being happy in the last 4 weeks; Satisfaction with family life
pl <- read_dta("pl.dta", 
               col_select = c("pid", "syear", "plh0186", "plh0180"))

Data Management

Here we merge the two tables into one by person and survey year. Only person-years that appear in both tables are kept.

master <- merge(pequiv, pl, by = c("pid", "syear"))
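
The behaviour of the merge step can be checked on a small example. This is a sketch with hypothetical toy tables (`toy_a`, `toy_b`), not SOEP data. It shows that base R `merge()` performs an inner join by default, and that `all = TRUE` would keep unmatched person-years instead.

```r
# Hypothetical toy tables sharing the keys pid and syear
toy_a <- data.frame(pid = c(1, 1, 2), syear = c(2007, 2008, 2007), age = c(30, 31, 45))
toy_b <- data.frame(pid = c(1, 2, 3), syear = c(2007, 2007, 2007), happy = c(4, 3, 5))

# Default: inner join, only person-years present in both tables survive
inner <- merge(toy_a, toy_b, by = c("pid", "syear"))

# all = TRUE: full outer join, unmatched rows are kept and filled with NA
full <- merge(toy_a, toy_b, by = c("pid", "syear"), all = TRUE)

print(nrow(inner))
print(nrow(full))
```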

After that, the Stata labels are dropped, the variables are renamed, and factor variables are created. These steps are necessary for the analysis that follows.

soep <- remove_all_labels(master)

soep <- soep %>%
  rename(age = d11101,
         gender = d11102ll,
         marital = d11104,
         life_satisfaction = p11101,
         freq_happy_4_w = plh0186,
         family_satisfaction = plh0180) %>%
  filter(marital > 0, life_satisfaction > 0, freq_happy_4_w > 0,
         family_satisfaction > 0) %>%
  mutate(marital = factor(marital, levels = c(1, 2, 3, 4, 5)), 
         gender = factor(gender, levels = c(1, 2)),
         freq_happy_4_w = factor(freq_happy_4_w, levels = c(1, 2, 3, 4, 5)))

levels(soep$marital) <- c("married", "single", "widowed", "divorced", "separated")
levels(soep$gender) <- c("male", "female")
levels(soep$freq_happy_4_w) <- c("very seldom", "seldom", "sometimes", "often", "very often")

Summary statistics

Let’s look at the summary statistics of our data set, excluding the columns pid and syear, with the help of the gtsummary package.

tbl_summary(soep[-c(1,2)])
Characteristic           N = 128,752¹
gender
  male                   59,922 (47%)
  female                 68,830 (53%)
age                      49 (36, 62)
marital
  married                77,533 (60%)
  single                 29,925 (23%)
  widowed                7,584 (5.9%)
  divorced               10,730 (8.3%)
  separated              2,980 (2.3%)
life_satisfaction        8.00 (6.00, 8.00)
family_satisfaction      8.00 (7.00, 9.00)
freq_happy_4_w
  very seldom            2,348 (1.8%)
  seldom                 10,008 (7.8%)
  sometimes              39,986 (31%)
  often                  64,033 (50%)
  very often             12,377 (9.6%)
¹ n (%); Median (IQR)


The sample is 47% male and 53% female. The majority of respondents are married (60%), followed by single people (23%), so the main groups of interest are well represented for the analysis that follows.

Moreover, respondents are generally satisfied with their family life (median 8), and most report having been happy often (50%) or very often (9.6%) over the last four weeks.

To understand the linkage between marriage and happiness in depth we should explore these data and analyze if this level of happiness is influenced by the marital status or not.

Data Visualizations

In this section we visualize the given data and find some insights regarding the effect of marriage on happiness among German individuals.

soep %>%
  group_by(marital) %>%
  summarise(life_satisfaction = round(mean(life_satisfaction), 2)) %>%
  ggplot(aes(x = marital, y = life_satisfaction, group = marital)) +
  geom_col(fill = "lightblue", width = 0.5) +
  geom_text(aes(label = life_satisfaction), vjust = 3) +
  ggtitle("Average level of life satisfaction among people of different marriage statuses") +
  xlab("Marriage Status") +
  ylab("Life Satisfaction")

On this bar chart we can clearly see that married people have the highest level of life satisfaction, followed by single individuals. What stands out is that the difference between these two groups is small, only 0.13.

It is worth mentioning, however, that these are averages over the whole observed period, which give only a rough picture of the data. They still show the differences in satisfaction between marital statuses clearly, but understanding the relationship in detail requires further investigation.

soep %>%
  # %in% tests membership; == would recycle the vector and silently drop rows
  filter(marital %in% c("married", "single", "divorced")) %>%
  group_by(syear, marital) %>%
  summarise(life_satisfaction = round(mean(life_satisfaction), 2),
            family_satisfaction = round(mean(family_satisfaction), 2)) %>%
  ggplot(aes(x = syear, y = life_satisfaction, color=marital)) +
  ggtitle("Comparison of life satisfaction level of married, single and divorced people") +
  geom_line() +
  labs(x = "Year", y = "Life Satisfaction", color = "Marriage Status")
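
A note on selecting several categories in a filter: comparing a column with a vector through `==` recycles the vector element by element, while `%in%` tests membership. The following sketch uses a hypothetical toy vector (not the SOEP data) to show how the two differ.

```r
# Hypothetical toy vector of marital statuses
status <- c("married", "single", "divorced", "widowed", "single", "married")
keep   <- c("married", "single", "divorced")

# == compares position by position against the recycled vector:
# element 6 ("married") is compared with "divorced" and is missed
by_equals <- status == keep

# %in% asks for each element whether it is one of the wanted categories
by_in <- status %in% keep

print(sum(by_equals))
print(sum(by_in))
```

With `==`, rows that belong to a wanted category can be dropped depending on their position, so group means computed afterwards rest on an arbitrary subset of the data.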

In the line graph, a clear gap between divorced and non-divorced people can be observed. Although overall life satisfaction increased in all three groups over the period, divorced people report a markedly lower level throughout. More importantly, although single people initially had a slightly higher level of life satisfaction than married people, this reversed after less than a year. This may hint that being married becomes more important for happiness at a later stage of life: young people may be happy without being married, but this may no longer be the case a couple of years later.

soep %>%
  ggplot(aes(x=marital,y=life_satisfaction,fill=marital)) +
  geom_boxplot() +
  ggtitle("Life Satisfaction level among marriage statuses") +
  labs(x = "Marriage Status", y = "Life Satisfaction", fill = "Marriage Status")

Here we see further evidence that married and single people are happier than widowed, divorced or separated people. This suggests that being either single or married is associated with higher happiness. Also of interest is the smaller interquartile range of married and single people, which means that their life satisfaction is less spread out and concentrated at the upper end of the scale.

Main analysis

Now we move on to the main analysis and estimate our models.

Interaction (categorical * dummy)

interact_life_marital_gender <- lm(life_satisfaction ~ marital * gender, data = soep)
interact_life_marital_gender
## 
## Call:
## lm(formula = life_satisfaction ~ marital * gender, data = soep)
## 
## Coefficients:
##                   (Intercept)                  maritalsingle  
##                       7.31190                       -0.09228  
##                maritalwidowed                maritaldivorced  
##                      -0.34467                       -0.46368  
##              maritalseparated                   genderfemale  
##                      -0.47829                        0.07093  
##    maritalsingle:genderfemale    maritalwidowed:genderfemale  
##                      -0.05053                       -0.14084  
##  maritaldivorced:genderfemale  maritalseparated:genderfemale  
##                      -0.13837                       -0.13604

Distribution of life satisfaction level among males and females of different marriage statuses

cat_plot(interact_life_marital_gender, pred = gender, modx = marital, legend.main = "Marriage Status") +
  labs(x = "Gender", y = "Life Satisfaction")

This plot visualizes the model results. It is consistent with the previous finding that married and single people are happier than the other groups. It also adds information on gender: married and single women are generally slightly happier than their male counterparts, whereas widowed, divorced and separated men generally report higher life satisfaction than women in the same status. One possible reason is the presence of children, who often stay with the mother after a divorce.

Interaction (categorical * continuous)

interact_life_marital_age <- lm(life_satisfaction ~ marital * age, data = soep)
interact_life_marital_age
## 
## Call:
## lm(formula = life_satisfaction ~ marital * age, data = soep)
## 
## Coefficients:
##          (Intercept)         maritalsingle        maritalwidowed  
##             7.840417              0.021497             -0.683758  
##      maritaldivorced      maritalseparated                   age  
##            -1.082701             -1.093432             -0.009217  
##    maritalsingle:age    maritalwidowed:age   maritaldivorced:age  
##            -0.011025              0.005869              0.010127  
## maritalseparated:age  
##             0.010171

Change of life satisfaction level of people with marriage statuses with age

interact_plot(interact_life_marital_age, pred = age, modx = marital, legend.main = "Marriage Status") +
  labs(x = "Age", y = "Life Satisfaction")

The line graph again highlights an interesting relationship between life satisfaction and age among Germans grouped by marital status. Life satisfaction falls by around 25% over the age range among single individuals; married individuals show a similar but slower decline (~ 11%). The changes in the other groups are negligible, and life satisfaction there remains at roughly the same level. From our perspective, this suggests that single people become less happy as they age, possibly because they feel a need to be with somebody else, compared to married people.

Models comparison

huxreg("interact_life_marital_gender" = interact_life_marital_gender,
       "interact_life_marital_age" = interact_life_marital_age,
       coefs = c("married" = "(Intercept)", "single" = "maritalsingle", "widowed" = "maritalwidowed",
                 "divorced" = "maritaldivorced", "separated" = "maritalseparated",
                 "female" = "genderfemale", "single female" = "maritalsingle:genderfemale",
                 "widowed female" = "maritalwidowed:genderfemale", 
                 "divorced female" = "maritaldivorced:genderfemale",
                 "separated female" = "maritalseparated:genderfemale", "age" = "age",
                 "age single" = "maritalsingle:age", "age widowed" = "maritalwidowed:age",
                 "age divorced" = "maritaldivorced:age", "age separated" = "maritalseparated:age"),
       statistics = c("N. obs." = "nobs", "R squared" = "r.squared", "F statistic" = "statistic"),
       align = "center")
                    interact_life_marital_gender   interact_life_marital_age
married                      7.312 ***                     7.840 ***
                            (0.009)                       (0.023)
single                      -0.092 ***                     0.021
                            (0.016)                       (0.034)
widowed                     -0.345 ***                    -0.684 ***
                            (0.041)                       (0.121)
divorced                    -0.464 ***                    -1.083 ***
                            (0.028)                       (0.079)
separated                   -0.478 ***                    -1.093 ***
                            (0.049)                       (0.119)
female                       0.071 ***
                            (0.012)
single female               -0.051 *
                            (0.023)
widowed female              -0.141 **
                            (0.047)
divorced female             -0.138 ***
                            (0.035)
separated female            -0.136 *
                            (0.063)
age                                                       -0.009 ***
                                                          (0.000)
age single                                                -0.011 ***
                                                          (0.001)
age widowed                                                0.006 ***
                                                          (0.002)
age divorced                                               0.010 ***
                                                          (0.001)
age separated                                              0.010 ***
                                                          (0.002)
N. obs.                     128752                        128752
R squared                    0.012                         0.021
F statistic                177.538                       311.477
*** p < 0.001; ** p < 0.01; * p < 0.05.

From the first column (first regression model) of the table above, the expected level of life satisfaction for a married man is 7.312. For a married woman it is higher by 0.071. Relative to a married person of the same gender, the expected level of life satisfaction is lower by 0.092 for a single man and by 0.143 for a single woman, by 0.345 for a widowed man and by 0.486 for a widowed woman, by 0.464 for a divorced man and by 0.602 for a divorced woman, and finally by 0.478 for a separated man and by 0.614 for a separated woman. So, from the first regression model we can see that married or single women are more satisfied with their lives than men in the same status and are therefore happier. However, divorced, widowed or separated men have a higher level of life satisfaction than women in the same status. This is consistent with the previous results of the analysis.
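
The figures quoted for the first model can be reproduced by adding up coefficients. The sketch below copies the estimates from the printed lm() output (rounded to five decimals); the vector `b` and the table `cell_means` are names introduced here for illustration only.

```r
# Coefficients copied from the printed output of the marital * gender model
b <- c(intercept = 7.31190,
       single = -0.09228, widowed = -0.34467, divorced = -0.46368, separated = -0.47829,
       female = 0.07093,
       single_female = -0.05053, widowed_female = -0.14084,
       divorced_female = -0.13837, separated_female = -0.13604)

status <- c("married", "single", "widowed", "divorced", "separated")

# Main effects and interaction terms, with 0 for the reference category "married"
main  <- c(0, b[status[-1]])
inter <- c(0, b[paste0(status[-1], "_female")])

# Expected life satisfaction for each status and gender
cell_means <- data.frame(
  status = status,
  male   = unname(b[["intercept"]] + main),
  female = unname(b[["intercept"]] + main + b[["female"]] + inter)
)
cell_means$female_minus_male <- cell_means$female - cell_means$male
print(cell_means)
```

A positive value in the last column means that women in that status have a higher expected life satisfaction than men, and a negative value means the opposite.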

From the second column (second regression model) of the table, the intercept for married people is 7.84. Because the model includes age, this and the following status coefficients refer to the extrapolated level at age 0. In contrast to the first regression model, the figure for single people is higher by 0.021, although this difference is not statistically significant. For widowed people the level of life satisfaction is lower by 0.684, for divorced people by 1.083 and for separated people by 1.093. Moreover, life satisfaction decreases as a person gets older. For instance, the expected life satisfaction of a married 25 year old is 7.84 - 0.009 * 25 = 7.615, while for a married 80 year old it is 7.84 - 0.009 * 80 = 7.12. Life satisfaction also declines with age for widowed people (by 0.003 per year) and for single people, for whom the decline is steeper at 0.02 per year (-0.009 - 0.011). Unlike for single and married people, the life satisfaction of divorced and separated people tends to rise with age, but only marginally, by 0.001 per year. The second model thus tells us that the life satisfaction of married, single and widowed people tends to decrease over their lifetime, while for divorced and separated people it gradually increases.
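
The age slopes of the second model can be worked out in the same way. The sketch below copies the coefficients from the printed lm() output; the names `b`, `age_profile` and `predict_ls` are introduced here for illustration only. Results can differ slightly from hand calculations with three-decimal coefficients because of rounding.

```r
# Coefficients copied from the printed output of the marital * age model
b <- c(intercept = 7.840417,
       single = 0.021497, widowed = -0.683758, divorced = -1.082701, separated = -1.093432,
       age = -0.009217,
       single_age = -0.011025, widowed_age = 0.005869,
       divorced_age = 0.010127, separated_age = 0.010171)

status <- c("married", "single", "widowed", "divorced", "separated")

# Extrapolated level at age 0 and yearly slope for each marital status
level_at_0 <- unname(b[["intercept"]] + c(0, b[status[-1]]))
slope      <- unname(b[["age"]] + c(0, b[paste0(status[-1], "_age")]))

# Expected life satisfaction at a given age, one value per status
predict_ls <- function(age) level_at_0 + slope * age

age_profile <- data.frame(status = status, slope = slope,
                          at_25 = predict_ls(25), at_80 = predict_ls(80))
print(age_profile)
```

A negative slope means that expected life satisfaction falls with each additional year of age, and a positive slope means that it rises.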

FE and FD estimators of how the marital category affects life satisfaction

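# index names the person and year identifiers explicitly instead of relying on column order
FE <- plm(life_satisfaction ~ marital, data = soep, index = c("pid", "syear"), model = "within")
FD <- plm(life_satisfaction ~ marital, data = soep, index = c("pid", "syear"), model = "fd")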

stargazer(FE, FD,
          type = "text",
          column.labels = c("FE", "FD"),
          dep.var.labels.include = FALSE,
          omit = "Constant",
          covariate.labels = c("single", "widowed", "divorced", "separated"),
          omit.stat = "F",
          model.numbers = FALSE)
## 
## =========================================
##                  Dependent variable:     
##              ----------------------------
##                    FE            FD      
## -----------------------------------------
## single         -0.169***      -0.100**   
##                 (0.031)        (0.050)   
##                                          
## widowed        -0.389***      -0.749***  
##                 (0.045)        (0.075)   
##                                          
## divorced         -0.001        -0.105*   
##                 (0.038)        (0.060)   
##                                          
## separated      -0.311***      -0.243***  
##                 (0.039)        (0.051)   
##                                          
## -----------------------------------------
## Observations    128,752        104,414   
## R2               0.002          0.001    
## Adjusted R2      -0.231         0.001    
## =========================================
## Note:         *p<0.1; **p<0.05; ***p<0.01

Both columns measure differences relative to being married. The FE estimate reflects the average within-person difference in life satisfaction between a given status and marriage, while the FD estimate reflects the year-to-year change when the status changes. For single and separated people the FD estimates (-0.100 and -0.243) are smaller in magnitude than the FE estimates (-0.169 and -0.311). Thus, when a person becomes single or separated, life satisfaction is on average higher than when this person has already been single or separated for some time. The opposite holds for widowed and divorced people: the immediate drop on becoming widowed (-0.749) or divorced (-0.105) is larger than the average difference for those who are already widowed (-0.389) or divorced (-0.001).
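
To make the two estimators more concrete, here is a base-R sketch on simulated toy data. Everything in it is hypothetical: the toy panel, the true effect of 0.5 and the variable names. It uses a single 0/1 regressor instead of the five-level marital factor and, unlike the default in plm, omits the intercept in the first-difference regression.

```r
set.seed(1)

# Hypothetical toy panel: 4 persons observed over 5 years
toy <- data.frame(pid = rep(1:4, each = 5), syear = rep(2007:2011, times = 4))
toy$married <- c(0,0,1,1,1,  0,1,1,1,1,  0,0,0,0,1,  1,1,1,0,0)

# Life satisfaction = person-specific level + 0.5 * married + noise
toy$ls <- c(6, 7, 8, 5)[toy$pid] + 0.5 * toy$married + rnorm(nrow(toy), sd = 0.1)

# Fixed Effects: subtract person means (within transformation), then run OLS
toy$ls_w      <- toy$ls - ave(toy$ls, toy$pid)
toy$married_w <- toy$married - ave(toy$married, toy$pid)
fe_within <- coef(lm(ls_w ~ 0 + married_w, data = toy))[["married_w"]]

# The same estimate from OLS with one dummy per person
fe_dummies <- coef(lm(ls ~ married + factor(pid), data = toy))[["married"]]

# First Difference: subtract the previous year's value within each person
lag_diff <- function(x, id) ave(x, id, FUN = function(v) c(NA, diff(v)))
toy$d_ls      <- lag_diff(toy$ls, toy$pid)
toy$d_married <- lag_diff(toy$married, toy$pid)
fd <- coef(lm(d_ls ~ 0 + d_married, data = toy))[["d_married"]]

print(c(FE = fe_within, FE_dummies = fe_dummies, FD = fd))
```

The first-difference estimate uses only the years in which the status changes, while the within estimate uses all years of a person who changes status at some point, which is why the two can differ in real data.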

Conclusion (10 pt)

Marriage is an essential part of life for the majority of people. Although it is generally believed that marriage makes life better and contributes to greater overall happiness, some studies dispute this view and provide valid arguments. In this study we analyzed real data provided by the SOEP and found the following:

  1. married and single people have a higher level of satisfaction than the rest of the observed group;
  2. divorced people have a markedly lower level of life satisfaction over the years;
  3. married and single women are happier than men; the opposite can be seen among widowed, divorced and separated individuals;
  4. single people become far less happy over the years than married people.

To summarize, the results indicate that marital status is clearly related to the level of happiness. Although this was not obvious at first, because the level of happiness among married and single people was similar, after model estimation and visualization it was evident that single people become far less happy with age than married ones. We therefore consider the findings of this analysis meaningful.

However, this research has certain limitations. We used the lm() function, which fits an ordinary least squares (OLS) regression. OLS gives reliable results only under certain conditions, in particular when all OLS assumptions are fulfilled and the relationship is indeed linear. We did not conduct a preliminary analysis of these assumptions or a model comparison, so this remains a task for future research.

Possible next steps are to split the data into training and test sets, calculate the mean squared error (MSE) on both, and compare the results with, for instance, Ridge, Lasso or Elastic Net regression or a neural network.
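
A train and test comparison of this kind could look as follows. This is only a sketch on simulated data: the data frame `sim`, its coefficients and the 80/20 split are assumptions made for illustration, not results from the SOEP.

```r
set.seed(42)

# Hypothetical simulated data standing in for the real sample
n   <- 200
sim <- data.frame(age = runif(n, 18, 90), married = rbinom(n, 1, 0.6))
sim$life_satisfaction <- 7.5 - 0.01 * sim$age + 0.3 * sim$married + rnorm(n, sd = 1.5)

# 80/20 split into training and test sets
train_idx <- sample(seq_len(n), size = 0.8 * n)
train <- sim[train_idx, ]
test  <- sim[-train_idx, ]

# Fit on the training set only
fit <- lm(life_satisfaction ~ married * age, data = train)

# Mean squared error of the fitted model on a given data set
mse <- function(model, data) {
  mean((data$life_satisfaction - predict(model, newdata = data))^2)
}

mse_train <- mse(fit, train)
mse_test  <- mse(fit, test)
print(c(train = mse_train, test = mse_test))
```

A test MSE that is much larger than the training MSE would point to overfitting. With panel data, splitting by person rather than by row would avoid having the same individual in both sets.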

References (5 pt)

Abreu-Afonso, José, Maria Meireles Ramos, Inês Queiroz-Garcia, and Isabel Leal. 2022. “How Couple’s Relationship Lasts over Time? A Model for Marital Satisfaction.” Psychological Reports 125 (3): 1601–27.
Perelli-Harris, Brienna, and Niels Blom. 2021. “So Happy Together… Examining the Association Between Relationship Happiness, Socio-Economic Status, and Family Transitions in the UK.” Population Studies, 1–18.
Perelli-Harris, Brienna, Stefanie Hoherz, Trude Lappegård, and Ann Evans. 2019. “Mind the ‘Happiness’ Gap: The Relationship Between Cohabitation, Marriage, and Subjective Well-Being in the United Kingdom, Australia, Germany, and Norway.” Demography 56 (4): 1219–46.
Peterson, G. W., and K. R. Bush. 2012. Handbook of Marriage and the Family. Springer US. https://books.google.de/books?id=7c3-r5QmAn0C.
Tao, Hung-Lin. 2019. “Marriage and Happiness: Evidence from Taiwan.” Journal of Happiness Studies 20 (6): 1843–61.