Chapter 3 Second Republic

3.1 Electoral laws

MATTARELLUM (1994 - 2005)

  • Camera

    • 475 seats (75%) assigned through a first-past-the-post system in single-member constituencies.

    • 155 seats (25%) assigned through a proportional system within 26 multi-member constituencies and distributed among the lists using the largest remainder method with the Hare quota. Electoral threshold of 4%.

    • Candidates can run in only one constituency. Separate candidacies for the single-member constituency and the proportional quota.

  • Senato

    • 232 seats (75%) assigned through a first-past-the-post system in single-member constituencies.

    • 83 seats (25%) assigned on a regional basis through a proportional system. After summing the votes of each group of candidates, the votes used to elect winners in the single-member constituencies are subtracted, and the remaining votes are allocated to the lists using the D’Hondt method.

    • Candidates must be linked to at least 2 candidates in as many constituencies within the same Region.
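
The two allocation formulas named above can be illustrated with a short R sketch. It is only a minimal illustration under simplifying assumptions: the vote figures are invented, the function names `largest_remainder_hare()` and `dhondt()` are hypothetical, and the 4% threshold, the vote deduction and tie-breaking rules are all left out.

```r
# Largest remainder method with the Hare quota.
# votes: named numeric vector of list votes; seats: seats to distribute.
largest_remainder_hare <- function(votes, seats) {
  quota <- sum(votes) / seats          # Hare quota
  base <- floor(votes / quota)         # seats won with full quotas
  remainder <- votes / quota - base    # fractional part left over
  left <- seats - sum(base)
  # Remaining seats go to the largest remainders (ties are not handled)
  winners <- order(remainder, decreasing = TRUE)[seq_len(left)]
  base[winners] <- base[winners] + 1
  base
}

# D'Hondt method: divide votes by 1, 2, 3, ... and keep the highest quotients.
dhondt <- function(votes, seats) {
  quotients <- outer(votes, seq_len(seats), "/")   # rows = lists, columns = divisors
  top <- order(as.vector(quotients), decreasing = TRUE)[seq_len(seats)]
  # Column-major linear index -> row (list) that owns the quotient
  rows <- (top - 1) %% length(votes) + 1
  setNames(tabulate(rows, nbins = length(votes)), names(votes))
}

# Invented vote totals, 10 seats to assign
votes <- c(A = 47000, B = 16000, C = 15800, D = 12000, E = 6100, F = 3100)
largest_remainder_hare(votes, 10)   # A 5, B 2, C 1, D 1, E 1, F 0
dhondt(votes, 10)                   # A 5, B 2, C 2, D 1, E 0, F 0
```

With the same votes, the largest remainder method gives a seat to the small list E, while D'Hondt gives that seat to the larger list C.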

The analysis focuses only on the proportional vote data, even though only 25% of the seats were allocated in this way. In single-member districts, candidates were often supported by multiple parties within a coalition.

Since the votes in these districts were aggregated at the coalition level, it is not possible to accurately attribute them to individual parties.

Moreover, proportional vote data offer a more precise reflection of voters’ political preferences. In the majoritarian system, strategic voting and coalition agreements could distort the actual level of support for individual parties, as voters might choose a coalition-backed candidate rather than their preferred party. In contrast, proportional vote allows citizens to express their party preference directly.

The Mattarellum faced significant criticism starting in 2001 because of the so-called “liste civetta” (“decoy lists”), which were created to circumvent the scorporo (vote deduction) mechanism used to calculate the proportional share of seats in the Chamber of Deputies. Under the scorporo, the list linked to the winning candidate in a single-member constituency had votes subtracted from its proportional total: the votes the candidate needed to win the seat, that is, the runner-up’s votes plus one. To exploit this, fictitious lists were created and linked to the candidates in single-member districts, so that the deduction fell on them. The main party’s actual list (which only ran in the proportional segment) thus retained all its votes for the proportional distribution.
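
The effect of a decoy list can be shown with a few lines of R. This is only a sketch with invented numbers: the size of the deduction is passed in as a given value rather than computed from constituency results, and `votes_after_scorporo()` is a hypothetical helper, not part of the analysis code.

```r
# Proportional votes left to a list once the deduction is applied.
# list_votes: proportional votes of the list linked to the winning candidate
# deduction:  votes subtracted for the single-member seat (given as input)
votes_after_scorporo <- function(list_votes, deduction) {
  max(list_votes - deduction, 0)
}

main_list_votes  <- 40000   # invented figures, not real results
decoy_list_votes <- 300
deduction        <- 15000

# Winner linked to the main list: the main list pays the deduction
linked_to_main <- votes_after_scorporo(main_list_votes, deduction)

# Winner linked to a decoy list: the decoy absorbs what it can,
# and the main list keeps its full proportional total
decoy_after     <- votes_after_scorporo(decoy_list_votes, deduction)
main_with_decoy <- main_list_votes

c(linked_to_main = linked_to_main, main_with_decoy = main_with_decoy)
```

In this toy case the main list enters the proportional allocation with 25,000 votes when it is linked to the winner, and with all 40,000 when the winner is linked to the decoy.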

Additionally, Constitutional Law No. 1/2000 established the Circoscrizione Estero (Overseas Constituency), electing 12 deputies and 6 senators, which required a revision of the number of seats allocated to each constituency.

The Mattarellum was definitively replaced by Law No. 270 of 2005, known as the Calderoli Law or “Porcellum.”

PORCELLUM (2005-2015)

  • Camera

    • Seats allocated on a national basis across 27 constituencies.

    • Proportional system with electoral thresholds and a majority bonus.

    • Electoral thresholds:

      • 4% for lists not part of a coalition.

      • 10% for coalitions. Lists within a coalition qualify for seat allocation if they individually obtain at least 2% of the votes.

    • Majority bonus: 55% of the seats (340 seats)

    • Closed lists

  • Senato

    • Seats allocated on a regional basis, with constituencies corresponding to the regions.

    • Proportional system with electoral thresholds. Single-member districts in Valle d’Aosta (1 seat) and Trentino-Alto Adige (6 seats)

    • Thresholds:

      • Coalitions: 20% of valid votes, with individual lists within coalitions needing at least 3%.

      • Non-coalition lists and coalition lists that do not meet the 20% threshold: 8%.

    • Majority bonus: 55% of the seats

    • Closed lists

The Constitutional Court, with ruling no. 1/2014, declared the Porcellum unconstitutional for two main reasons:

  • The majority bonus without a minimum threshold, which violated the principle of democratic representation (Articles 1 and 3 of the Constitution).

  • Closed lists, which infringed on the principle of voting freedom and political representation (Article 48), as parliamentarians were selected by parties rather than voters.

The name “Porcellum” originates from an expression used by the law’s own author, Roberto Calderoli, who referred to it as “una porcata” (a disgrace).

After the rejection of the Porcellum, Italy temporarily operated under a proportional electoral system adjusted by the Court’s ruling, known as the Consultellum, until the approval of the Italicum (2015), introduced by the Renzi government.

The Italicum also sparked significant controversy and was later partially struck down by the Constitutional Court.

Specifically, the Italicum stipulated that if no party surpassed 40% of the vote, a runoff would occur between the two most-voted parties.

Additionally, lead candidates were locked at the top of the lists and could run in up to 10 different constituencies, choosing where to be elected after the vote.

The Italicum was never used to elect Parliament.

Subsequently, with Law no. 165 of November 3, 2017, Parliament introduced the Rosatellum, replacing the Italicum as modified by the Court’s decision.

The law passed with a broader parliamentary majority than that supporting the government (the Gentiloni government placed a confidence vote to secure its approval).

The Five Star Movement (M5S) and left-wing parties outside the Democratic Party (PD) were the only parliamentary forces to oppose the Rosatellum. The main criticism of the Rosatellum was that it favored coalitions, penalizing parties running alone (such as the M5S).

ROSATELLUM (2017 - )

  • Camera

    • 28 constituencies

      • 231 single-member constituencies (majoritarian system). One candidate per list or coalition.

      • 70 - 77 multi-member constituencies (386 seats assigned through proportional representation with national threshold). Closed lists with 2 to 4 candidates in alternating gender order.

      • 1 seat in Valle d’Aosta

      • 12 seats for Circoscrizioni Estere

    • Thresholds:

      • Coalitions: 10% (votes from parties that do not surpass the 1% threshold are not counted)

      • Single lists and intra-coalition lists (if the coalition does not surpass the 10% threshold): 3%

  • Senato

    • 20 constituencies (corresponding to regions)

      • 115 single-member constituencies (majoritarian). One candidate per list or coalition

      • One or more multi-member constituencies per region, based on population (193 seats assigned through proportional representation with a regional threshold). Closed lists with 2 to 4 candidates in alternating gender order

      • 1 seat Valle d’Aosta

      • 6 seats Circoscrizioni Estere

    • Thresholds: see Camera
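
The threshold rules listed for the Rosatellum can be sketched in R as follows. The shares are invented, the column names are hypothetical, and each threshold is read here as "at least" the stated percentage, which is an assumption about how to treat boundary cases.

```r
# Invented national shares (percent); not real results
lists <- data.frame(
  list      = c("A", "B", "C", "D", "E"),
  coalition = c("X", "X", "X", NA, "Y"),
  share     = c(8.0, 2.5, 0.7, 3.4, 2.0),
  stringsAsFactors = FALSE
)

# Coalition total: lists below 1% do not count towards it
counted <- !is.na(lists$coalition) & lists$share >= 1
coalition_total <- tapply(lists$share[counted], lists$coalition[counted], sum)

# Coalitions reaching 10% qualify
coalition_ok <- coalition_total >= 10

# Flag the lists that belong to a qualifying coalition
lists$in_qualified_coalition <- !is.na(lists$coalition) &
  lists$coalition %in% names(coalition_ok)[coalition_ok]

# 3% threshold, relevant for single lists and for lists
# whose coalition did not reach 10%
lists$passes_3pct <- lists$share >= 3

lists
```

Here coalition X reaches 10.5% because list C, at 0.7%, is excluded from the sum. Coalition Y stays at 2% and list D qualifies on its own with 3.4%.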

With the Rosatellum, specific rules to ensure gender representation came into effect.

Firstly, under penalty of inadmissibility, candidates must be placed in alternating gender order within the lists for multi-member constituencies, both for the Camera and the Senate. In the Camera, it is also stipulated that, in the totality of the candidacies presented by each list or coalition of lists in the single-member constituencies at the national level, neither gender can be represented by more than 60%. Additionally, in the totality of the lists in the multi-member constituencies presented by each list at the national level, neither gender can be represented in the leading candidate position by more than 60%, with rounding to the nearest whole number.

The same 60% limit applies in the Senate at the regional level.
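
Both rules are easy to express as checks on a vector of candidate genders. The following R sketch uses two hypothetical helpers, `alternates()` and `within_60_percent()`, and assumes that the 60% cap is computed as 60% of the number of positions, rounded to the nearest whole number.

```r
# TRUE if no two consecutive candidates on a list share the same gender
alternates <- function(genders) {
  length(genders) < 2 || all(head(genders, -1) != tail(genders, -1))
}

# TRUE if neither gender exceeds 60% of the positions,
# with the cap rounded to the nearest whole number
within_60_percent <- function(genders) {
  cap <- round(0.6 * length(genders))
  all(table(genders) <= cap)
}

alternates(c("F", "M", "F", "M"))                # TRUE
alternates(c("F", "F", "M"))                     # FALSE
within_60_percent(c("F", "F", "F", "M", "M"))    # 3 of 5, cap is 3: TRUE
within_60_percent(c("M", "M", "M", "M", "F"))    # 4 of 5, cap is 3: FALSE
```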

We will explore the issue of gender representation specifically in the next chapter.

3.2 Datasets overview

3.2.1 Camera

3.2.1.1 Mattarellum

# Generate paths for all files in the folder "camera2"
file_paths_camera2 <- list.files("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\camera2", full.names = TRUE)

# Extract file names without extensions to use as variable names
file_names_camera2 <- tools::file_path_sans_ext(basename(file_paths_camera2))

# Read all files and assign each to a variable named like the file
for (i in seq_along(file_paths_camera2)) {
  assign(file_names_camera2[i], tibble::as_tibble(read.csv(file_paths_camera2[i], sep = ";", fileEncoding = "UTF-8")))
}

print(file_names_camera2)
##  [1] "camera-19940327_Proporzionale"      "camera-19960421_Proporzionale"     
##  [3] "camera-20010513_Proporzionale"      "Camera_19940327_Uninom_Cand&Contr" 
##  [5] "Camera_19940327_Uninom_Scrutini"    "Camera_19960421_Uninom_Cand&Contr" 
##  [7] "Camera_19960421_Uninom_Scrutini"    "Camera_20010513_Uninom_Cand&Contr" 
##  [9] "Camera_20010513_Uninom_Scrutini"    "camera_italia-20060409"            
## [11] "camera_italia-20080413"             "camera_italia-20130224"            
## [13] "Camera_Italia_LivComune"            "camera_vaosta-20060409"            
## [15] "camera_vaosta-20080413"             "camera_vaosta-20130224"            
## [17] "Camera_VAosta_LivComune"            "camera1994_candidatilista"         
## [19] "camera1996_candidatilista"          "camera2001_candidatilista"         
## [21] "camera2006_candidatilista"          "camera2008_candidatilista"         
## [23] "camera2013_candidatilista"          "camera2018_candidatilista"         
## [25] "Camera2018_livComune"               "Camera2018_VAosta_CandidCollUninom"
## [27] "camera2022_candidatilista"
library(dplyr)
library(stringr)

camera_mattarellum_regex <- "^[cC]amera-\\d{8}_Proporzionale\\.txt$"

wrangling_camera_mattarellum <- function(data) {
  data <- data %>%
    mutate(across(c(VOTANTI, ELETTORI, ELETTORI_MASCHI, VOTANTI_MASCHI, VOTI_LISTA, SCHEDE_BIANCHE), as.numeric)) %>%
    mutate(across(c(CIRCOSCRIZIONE, COLLEGIO, COMUNE, LISTA), factor))
  return(data)
}
unified_camera_mattarellum <- process_data(file_paths_camera2, camera_mattarellum_regex, 8, 11, wrangling_camera_mattarellum)
#View(unified_camera_mattarellum)
# unique(unified_camera_mattarellum$CIRCOSCRIZIONE)

regioni_list <- list(
  "ABRUZZO" = c("ABRUZZI", "ABRUZZO"),
  "BASILICATA" = "BASILICATA",
  "CALABRIA" = "CALABRIA",
  "CAMPANIA" = c("CAMPANIA 1", "CAMPANIA 2"),
  "EMILIA-ROMAGNA" = c("EMILIA-ROMAGNA", "EMILIA ROMAGNA", " EMILIA ROMAGNA"),
  "FRIULI VENEZIA GIULIA" = c("FRIULI VENEZIA GIULIA", "FRIULI-VENEZIA GIULIA"),
  "LAZIO" = c("LAZIO 1", "LAZIO 2"),
  "LIGURIA" = "LIGURIA",
  "LOMBARDIA" = c("LOMBARDIA 1", "LOMBARDIA 2", "LOMBARDIA 3", "LOMBARDIA 4"),
  "MARCHE" = "MARCHE",
  "MOLISE" = "MOLISE",
  "PIEMONTE" = c("PIEMONTE 1", "PIEMONTE 2"),
  "PUGLIA" = "PUGLIA",
  "SARDEGNA" ="SARDEGNA",
  "SICILIA" = c("SICILIA 1", "SICILIA 2"),
  "TOSCANA" = "TOSCANA",
  "TRENTINO ALTO ADIGE" = c("TRENTINO ALTO ADIGE", "TRENTINO-ALTO ADIGE", "TRENTINO-ALTO ADIGE/S_DTIROL", "TRENTINO-ALTO ADIGE/SUDTIROL"),
  "UMBRIA" = "UMBRIA",
  "VENETO" = c("VENETO 1", "VENETO 2")
)

# Return the region a constituency label belongs to, or NA if it is not listed
get_region <- function(CIRCOSCRIZIONE) {
  for (region in names(regioni_list)) {
    if (CIRCOSCRIZIONE %in% regioni_list[[region]]) {
      return(region)
    }
  }
  return(NA)
}
unified_camera_mattarellum$REGIONE <- sapply(unified_camera_mattarellum$CIRCOSCRIZIONE, get_region)

unified_camera_mattarellum <- unified_camera_mattarellum %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm=TRUE), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)

#adding Valle d'Aosta data (single-member constituency)
camera_vda_mattarellum_regex <- "^[cC]amera_\\d{8}_Uninom_Cand&Contr\\.txt$"

wrangling_camera_vda_mattarellum <- function(data) {
  data <- data %>% filter(coll=="Aosta") %>%
    rename(LISTA = descrcontrass, VOTI_LISTA = TOTVOTI, CIRCOSCRIZIONE = circ) %>%
    mutate(VOTI_LISTA = as.numeric(gsub(",", ".", VOTI_LISTA))) %>%
    mutate(CIRCOSCRIZIONE = as.factor(CIRCOSCRIZIONE))
  return(data)
}

unified_camera_vda_mattarellum <- process_data(file_paths_camera2, camera_vda_mattarellum_regex, 8, 11, wrangling_camera_vda_mattarellum)
unified_camera_vda_mattarellum$REGIONE <- as.factor("VALLE D'AOSTA")

unified_camera_vda_mattarellum <- unified_camera_vda_mattarellum %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm=TRUE), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100) 

#sum(is.na(unified_camera_mattarellum$REGIONE))
#View(unified_camera_vda_mattarellum)

unified_camera_mattarellum <- full_join(unified_camera_mattarellum, unified_camera_vda_mattarellum)
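
The helper `process_data()` is called above but is not defined in this section. Judging only from how it is called, it appears to select the files matching a pattern, read the election year from fixed character positions of the file name (characters 8 to 11 of "camera-19940327_Proporzionale.txt" give "1994"), apply the wrangling function and stack the results. The sketch below is a hypothetical reconstruction under those assumptions, named `process_data_sketch()` to keep it distinct from the real helper, whose actual implementation may differ.

```r
# Hypothetical reconstruction of a process_data()-style helper
process_data_sketch <- function(file_paths, pattern, year_start, year_end, wrangle) {
  # Keep only the files whose name matches the pattern
  matched <- file_paths[grepl(pattern, basename(file_paths))]
  tables <- lapply(matched, function(path) {
    data <- read.csv(path, sep = ";", fileEncoding = "UTF-8")
    # The election year sits at fixed character positions of the file name
    data$YEAR <- substr(basename(path), year_start, year_end)
    wrangle(data)
  })
  # Stack the per-election tables (assumes they share the same columns)
  do.call(rbind, tables)
}

# Toy files in a temporary folder, so the example runs anywhere
demo_dir <- tempfile("camera")
dir.create(demo_dir)
writeLines(c("LISTA;VOTI_LISTA", "A;10", "B;5"),
           file.path(demo_dir, "camera-19940327_Proporzionale.txt"))
writeLines(c("LISTA;VOTI_LISTA", "A;7", "B;9"),
           file.path(demo_dir, "camera-19960421_Proporzionale.txt"))
writeLines(c("LISTA;VOTI_LISTA", "A;1"),
           file.path(demo_dir, "camera_italia-20060409.txt"))

demo <- process_data_sketch(
  list.files(demo_dir, full.names = TRUE),
  "^camera-\\d{8}_Proporzionale\\.txt$", 8, 11, identity
)
demo
```

Only the two files matching the pattern are read. The third file is skipped, which is how the same folder can serve several electoral laws with different patterns and year positions.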

3.2.1.2 Porcellum

library(dplyr)
camera_porcellum_regex <- "^[cC]amera_italia-\\d{8}\\.txt$"
#print(file_paths_camera2[grepl(camera_porcellum_regex, basename(file_paths_camera2), ignore.case = TRUE)])

wrangling_camera_porcellum <- function(data) {
  if ("VOTILISTA" %in% colnames(data)) {
    data <- data %>%
      rename(VOTI_LISTA = VOTILISTA)
  }
  data <- data %>%
    mutate(across(c(VOTANTI, ELETTORI, ELETTORI_MASCHI, VOTANTI_MASCHI, VOTI_LISTA, SCHEDE_BIANCHE), as.numeric)) %>%
    mutate(across(c(CIRCOSCRIZIONE, PROVINCIA, COMUNE, LISTA), factor))
  return(data)
}
unified_camera_porcellum <- process_data(file_paths_camera2, camera_porcellum_regex, 15, 18, wrangling_camera_porcellum)

unified_camera_porcellum$REGIONE <- sapply(unified_camera_porcellum$CIRCOSCRIZIONE, get_region)

unified_camera_porcellum <- unified_camera_porcellum %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)

camera_vda_porcellum_regex <- "^[cC]amera_vaosta-\\d{8}\\.txt$"

wrangling_camera_vda_porcellum <- function(data) {
  data <- data %>%
    mutate(VOTI_LISTA = as.numeric(gsub(",", ".", VOTI_LISTA))) %>%
    mutate(LISTA = as.factor(LISTA))
  return(data)
}
unified_camera_vda_porcellum <- process_data(file_paths_camera2, camera_vda_porcellum_regex, 15, 18, wrangling_camera_vda_porcellum)

unified_camera_vda_porcellum$REGIONE <- as.factor("VALLE D'AOSTA")

unified_camera_vda_porcellum <- unified_camera_vda_porcellum %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)

unified_camera_porcellum <- full_join(unified_camera_porcellum, unified_camera_vda_porcellum)
#View(unified_camera_porcellum)
#use to inspect blank and null ballots by region (violin plots/ridgeline)

#glimpse(read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\Camera_19940327_Uninom_Scrutini.txt", sep=";"))
# #camera candlista mattarellum
# read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera1994_candidatilista.txt", sep =";")
# 
# read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera1996_candidatilista.txt", sep=";") #same but without preferences (which were NA)
# 
# read.csv("data/camera2/camera2001_candidatilista.txt", sep=";") #same
# 
# #camera candlista con porcellum
# read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera2006_candidatilista.txt", sep=";")
# 
# read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera2008_candidatilista.txt", sep=";") #same, but "N" instead of "E" in cotipoeletto
# 
# read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera2013_candidatilista.txt", sep=";") #N
# 
# #rosatellum
# a <- read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera2018_candidatilista.txt", sep = ";") #added column "COLLEGIOPLURINOM", plus many other datasets
# 
# b <- read.csv("C:\\Users\\acer\\Desktop\\Electoral Differences\\data\\camera2\\camera2022_candidatilista.csv", sep = ";") #has collegioplur but no votilista
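
The region lookup used in these chunks calls `get_region()` once per row through `sapply()`, which can be slow on municipality-level data. A possible vectorized alternative is sketched below. It uses a shortened stand-in for `regioni_list` with two entries only, and the names `regioni_demo`, `region_lookup` and `get_region_vec()` are hypothetical.

```r
# Shortened stand-in for regioni_list (two entries only)
regioni_demo <- list(
  "CAMPANIA" = c("CAMPANIA 1", "CAMPANIA 2"),
  "LIGURIA"  = "LIGURIA"
)

# Flatten the list into a named character vector:
# names are the constituency labels, values the regions they belong to
region_lookup <- setNames(
  rep(names(regioni_demo), lengths(regioni_demo)),
  unlist(regioni_demo, use.names = FALSE)
)

# Vectorized counterpart of sapply(x, get_region); unmatched labels give NA
get_region_vec <- function(circoscrizione) {
  unname(region_lookup[as.character(circoscrizione)])
}

get_region_vec(factor(c("CAMPANIA 2", "LIGURIA", "AOSTA")))
# "CAMPANIA" "LIGURIA" NA
```

As with `get_region()`, labels that are missing from the list, such as "AOSTA" here, come back as NA, so counting the NA values remains a quick way to spot unmapped constituencies.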

3.2.1.3 Rosatellum

library(dplyr)

camera_rosatellum_18 <- read.csv("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\camera2\\Camera2018_livComune.txt", sep=";")
camera_rosatellum_18$REGIONE <- sapply(camera_rosatellum_18$CIRCOSCRIZIONE, get_region)
#unique(camera_rosatellum_18$CIRCOSCRIZIONE)
#View(camera_rosatellum_18)
camera_rosatellum_18$YEAR <- "2018"

camera_rosatellum_18 <- camera_rosatellum_18 %>% 
  filter(CIRCOSCRIZIONE != "AOSTA") %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>% 
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)

camera_rosatellum_18_vda <- read.csv("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\camera2\\Camera2018_VAosta_CandidCollUninom.txt", sep=";") 
camera_rosatellum_18_vda$REGIONE <- as.factor("VALLE D'AOSTA")
camera_rosatellum_18_vda$YEAR <- "2018"

camera_rosatellum_18_vda <- camera_rosatellum_18_vda %>%
  rename(VOTI_LISTA = TOTVOTI) %>%
  select(LISTA, VOTI_LISTA, REGIONE, YEAR) %>%
  mutate(
    VOTI_LISTA = as.integer(gsub(",", ".", VOTI_LISTA)),  
    REGIONE = as.factor(REGIONE),
    PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100
  )

camera_rosatellum_18_unified <- bind_rows(camera_rosatellum_18, camera_rosatellum_18_vda)
#View(camera_rosatellum_18_unified)

camera_rosatellum_18_unified <- camera_rosatellum_18_unified %>%
  filter(!is.na(VOTI_LISTA))

camera_rosatellum_22 <- read.csv("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\camera2\\Camera_Italia_LivComune.csv", sep=";")

camera_rosatellum_22$REGIONE <- sapply(camera_rosatellum_22$CIRC.REG, get_region)
#sum(is.na(camera_rosatellum_18$REGIONE))
#View(rosatellum_22)

camera_rosatellum_22$YEAR <- "2022"

camera_rosatellum_22 <- camera_rosatellum_22 %>% 
  rename(LISTA = DESCRLISTA, VOTI_LISTA = VOTILISTA) %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>% 
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)


camera_rosatellum_22_vda <- read.csv("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\camera2\\Camera_VAosta_LivComune.csv", sep=";")
#View(camera_rosatellum_22_vda)

camera_rosatellum_22_vda <- camera_rosatellum_22_vda %>%
  rename(LISTA = CONTRASSEGNO, VOTI_LISTA = TOTVOTI) %>%
  mutate(YEAR = "2022", REGIONE = as.factor(REGIONE)) %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>%  
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)


camera_rosatellum_22_unified <- full_join(camera_rosatellum_22, camera_rosatellum_22_vda)


unified_camera_rosatellum <- full_join(camera_rosatellum_18_unified, camera_rosatellum_22_unified)
#View(unified_camera_rosatellum)
library(dplyr)
unified_camera2 <- unified_camera_mattarellum %>% full_join(unified_camera_porcellum) %>% full_join(unified_camera_rosatellum)

DT::datatable(unified_camera2, filter = "top", options = list(
  scrollX = TRUE, autoWidth = TRUE
)) %>%
  DT::formatRound(columns = c("PERCENTAGE"), digits = 2)
saveRDS(unified_camera2, "unified_camera2.rds")
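
Every block above repeats the same two steps: list votes are summed by region, list and year, and then expressed as a percentage of the regional total for that year. As a cross-check, the same computation can be written in base R, so that it runs without additional packages. The helper `regional_shares()` and the toy data are hypothetical.

```r
# Sum list votes by region, list and year, then express them as a
# percentage of the regional total for that year
regional_shares <- function(data) {
  totals <- aggregate(VOTI_LISTA ~ REGIONE + LISTA + YEAR, data = data, FUN = sum)
  region_total <- ave(totals$VOTI_LISTA, totals$REGIONE, totals$YEAR, FUN = sum)
  totals$PERCENTAGE <- totals$VOTI_LISTA / region_total * 100
  totals
}

# Invented municipality-level rows
toy <- data.frame(
  REGIONE    = c("LIGURIA", "LIGURIA", "LIGURIA", "MOLISE"),
  LISTA      = c("A", "A", "B", "A"),
  YEAR       = "1994",
  VOTI_LISTA = c(30, 30, 40, 10)
)

shares <- regional_shares(toy)
shares

# Shares within each region and year should add up to 100
tapply(shares$PERCENTAGE, paste(shares$REGIONE, shares$YEAR), sum)
```

Running the final `tapply()` check on the real unified tables is a simple way to confirm that no region and year combination was split across differently spelled region names.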

3.2.2 Senato

# Generate paths for all files in the folder "senato2"
file_paths_senato2 <- list.files("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\senato2", full.names = TRUE)

# Extract file names without extensions to use as variable names
file_names_senato2 <- tools::file_path_sans_ext(basename(file_paths_senato2))

# Read all files and assign each to a variable named like the file
for (i in seq_along(file_paths_senato2)) {
  assign(file_names_senato2[i], tibble::as_tibble(read.csv(file_paths_senato2[i], sep = ";", fileEncoding = "UTF-8")))
}

#print(file_names_senato2)
#glimpse(`senato-19480418`)

3.2.2.1 Mattarellum

# Define the regex pattern for files like senato-yyyymmdd
senato_mattarellum_regex <- "^senato-\\d{8}\\.txt$"

wrangling_senato_mattarellum <- function(data) {
  data <- data %>%
    mutate(across(c(VOTANTI, ELETTORI, ELETTORI_MASCHI, VOTANTI_MASCHI, VOTI_LISTA, SCHEDE_BIANCHE), as.numeric)) %>%
    mutate(across(c(REGIONE, COLLEGIO, COMUNE, LISTA), factor)) %>%
    group_by(REGIONE, LISTA, YEAR) %>%
    summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>%
    group_by(REGIONE, YEAR) %>%
    mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)
  return(data)
}

#calling the function
unified_senato_mattarellum <- process_data(file_paths_senato2, senato_mattarellum_regex, 8, 11, wrangling_senato_mattarellum)
#View(unified_senato_mattarellum) #dati su elettorimaschi starting from 2001
# glimpse(`senato_italia-20060409`)
# glimpse(`senato_vaosta_trentino-20060409`)

3.2.2.2 Porcellum

# Define the regex pattern for files like senato_italia-yyyymmdd
senato_porcellum_regex <- "^senato_italia-\\d{8}\\.txt$"

wrangling_senato_porcellum <- function(data) {

  if ("VOTANTI_TOTALI" %in% colnames(data)) {
    data <- data %>%
      rename(VOTANTI = VOTANTI_TOTALI)
  }

  if ("ELETTORI_TOTALI" %in% colnames(data)) {
    data <- data %>%
      rename(ELETTORI = ELETTORI_TOTALI)
  }

  data <- data %>%
    mutate(across(c(VOTANTI, ELETTORI, ELETTORI_MASCHI, VOTANTI_MASCHI, VOTI_LISTA, SCHEDE_BIANCHE), as.numeric)) %>%
    mutate(across(c(REGIONE, PROVINCIA, COMUNE, LISTA), factor)) %>%
    mutate(REGIONE = str_replace_all(REGIONE, "FRIULI VENEZIA GIULIA", "FRIULI-VENEZIA GIULIA")) %>%
    group_by(REGIONE, LISTA, YEAR) %>%
    summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>%
    group_by(REGIONE, YEAR) %>%
    mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)
  return(data)
}


#calling the function
unified_senato_porcellum <- process_data(file_paths_senato2, senato_porcellum_regex, 15, 18, wrangling_senato_porcellum)
#View(unified_senato_porcellum) 
#unique(unified_senato_porcellum$VOTANTI_MASCHI) #votanti_maschi is present but elettori is NA


#data for VDA and TAA
senato_porcellum_VT_regex <- "^senato_vaosta_trentino[-_]\\d{8}\\.txt$"

wrangling_senato_VT_porcellum <- function(data) {

  if ("VOTANTI_TOTALI" %in% colnames(data)) {
    data <- data %>%
      rename(VOTANTI = VOTANTI_TOTALI)
  }

  if ("ELETTORI_TOTALI" %in% colnames(data)) {
    data <- data %>%
      rename(ELETTORI = ELETTORI_TOTALI)
  }

  if ("PROVINCIA" %in% colnames(data)) {
    data <- data %>%
      select(-PROVINCIA)
  }

  data <- data %>%
    mutate(across(c(VOTANTI, ELETTORI, ELETTORI_MASCHI, VOTANTI_MASCHI, VOTI_LISTA, SCHEDE_BIANCHE), as.numeric)) %>%
    mutate(across(c(REGIONE, COLLEGIO, COMUNE, LISTA), factor)) %>%
    mutate(REGIONE = str_replace_all(REGIONE, "TRENTINO ALTO ADIGE", "TRENTINO-ALTO ADIGE")) %>%
    group_by(REGIONE, LISTA, YEAR) %>%
    summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>%
    group_by(REGIONE, YEAR) %>%
    mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)
  return(data)
}

#calling the function
unified_senato_VT_porcellum <- process_data(file_paths_senato2, senato_porcellum_VT_regex, 24, 27, wrangling_senato_VT_porcellum)

unified_senato_porcellum <- full_join(unified_senato_porcellum, unified_senato_VT_porcellum)

#View(unified_senato_porcellum)

##2008: data for TAA and VDA are together

3.2.2.3 Rosatellum

#just 2 elections with rosatellum: we won't use process_data()

senato_rosatellum_18 <- read.csv("data/senato2/Senato2018_Liste_Italia_Liv_Reg.txt", sep = ";")
#glimpse(senato_rosatellum_18)

senato_rosatellum_18$YEAR <- "2018"

senato_rosatellum_18 <- senato_rosatellum_18 %>% 
  rename(VOTI_LISTA = VOTILISTA) %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>% 
  mutate(across(c(REGIONE, LISTA), as.factor),
         VOTI_LISTA = as.numeric(gsub(",", ".", VOTI_LISTA))) %>% 
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)


senato_rosatellum_18_vda <- read.csv("data/senato2/Senato2018_VAosta_CandidCollUninom.txt", sep=";")
senato_rosatellum_18_vda$YEAR <- "2018"
senato_rosatellum_18_vda$REGIONE <- as.factor("VALLE D'AOSTA")
#glimpse(senato_rosatellum_18_vda) 

senato_rosatellum_18_vda <- senato_rosatellum_18_vda %>% 
  rename(VOTI_LISTA = TOTVOTI) %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>% 
  mutate(
    LISTA = as.factor(LISTA),
    VOTI_LISTA = as.numeric(gsub(",", ".", VOTI_LISTA))
  ) %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)


senato_rosatellum_22 <- read.csv("C:\\Users\\acer\\Desktop\\ElectoralDifferences\\data\\senato2\\Senato_Italia_LivComune.csv", sep = ";")
#glimpse(senato_rosatellum_22)

senato_rosatellum_22$YEAR <- "2022"

senato_rosatellum_22 <- senato_rosatellum_22 %>% 
  rename(VOTI_LISTA = VOTILISTA, REGIONE = CIRC.REG, LISTA = DESCRLISTA) %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>% 
  mutate(across(c(REGIONE, LISTA), as.factor),
         VOTI_LISTA = as.numeric(gsub(",", ".", VOTI_LISTA))) %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)



senato_rosatellum_22_vt <- read.csv("data/senato2/Senato_VAosta&Trentino_LivComune.csv", sep=";")
senato_rosatellum_22_vt$YEAR <- "2022"
#glimpse(senato_rosatellum_22_vt)

senato_rosatellum_22_vt <- senato_rosatellum_22_vt%>% 
  rename(VOTI_LISTA = TOTVOTI, LISTA = CONTRASSEGNO) %>%
  select(YEAR, REGIONE, LISTA, VOTI_LISTA) %>% 
  mutate(
    LISTA = as.factor(LISTA),
    VOTI_LISTA = as.numeric(gsub(",", ".", VOTI_LISTA))
  ) %>%
  group_by(REGIONE, LISTA, YEAR) %>% 
  summarize(VOTI_LISTA = sum(VOTI_LISTA, na.rm = TRUE), .groups = "drop") %>% 
  group_by(REGIONE, YEAR) %>%  
  mutate(PERCENTAGE = (VOTI_LISTA / sum(VOTI_LISTA, na.rm = TRUE)) * 100)

#View(senato_rosatellum_22_vt)

unified_senato_rosatellum <- full_join(senato_rosatellum_18, senato_rosatellum_18_vda) %>% full_join(senato_rosatellum_22) %>% full_join(senato_rosatellum_22_vt)
#View(unified_senato_rosatellum)
unified_senato2 <- full_join(unified_senato_mattarellum, unified_senato_porcellum) %>% full_join(unified_senato_rosatellum)

DT::datatable(unified_senato2, filter = "top", options = list(
  scrollX = TRUE, autoWidth = TRUE
)) %>%
  DT::formatRound(columns = c("PERCENTAGE"), digits = 2)
saveRDS(unified_senato2, "unified_senato2.rds")