This readme file was generated on 2023-05-02 by Matheus H.C. Barboza

# 1 INTRODUCTORY INFORMATION
- Title: Data associated with the paper "A comparative analysis of leisure accessibility metrics and equity impacts of a new monorail line in São Paulo"
- Paper authors: Matheus H.C. Barboza, Mariana Giannotti, Anna B. Grigolon, and Karst T. Geurs
- Contact: matheushenriquebarboza@gmail.com
- Short description of files:
    - \input
        - \data_base
            - rmsp.pbf: an OpenStreetMap street network of São Paulo, used by the r5r package
            - SP20181016_Base.zip: GTFS of the São Paulo public transport network for the base scenario
        - \data_prop
            - rmsp.pbf: an OpenStreetMap street network of São Paulo, used by the r5r package
            - SP20181016_L15C.zip: GTFS of the São Paulo public transport network for the proposed scenario
        - \layers
            - \EQUIPAMENTOS_SHP_TEMA_CULTURA
                - SIRGAS_SHP_TEMA_-_CULTURA_BIBLIOTECAS.shp: location of libraries
                - SIRGAS_SHP_TEMA_-_CULTURA_ESPACOS_CULTURAIS.shp: location of cultural centers
                - SIRGAS_SHP_TEMA_-_CULTURA_MUSEUS.shp: location of museums
                - SIRGAS_SHP_TEMA_-_CULTURA_TEATRO-CINEMA-SHOW.shp: location of cinemas, theaters and concert halls
            - \SIRGAS_SHP_parquespde5
                - SIRGAS_SHP_parquespde5.shp: location of parks
            - Area_funcionamento_parques.csv: information about the parks, including opening and closing hours
            - Zonas2017b.shp: layer with the traffic zones
            - centroids_zones_coord.shp: centroids of the traffic zones
        - Basico_SP1.csv: data from the census
    - \results
        - persons_access.csv: accessibility results for each person
        - zones_access.csv: accessibility results for each zone
    - shared_version.R: code to process the data and generate the results and figures of the paper
 
# 2 METHODOLOGICAL INFORMATION
- Methods: described in the paper
- Software: RStudio 2023.03.0

# 3 SHARING AND ACCESS INFORMATION
- License: CC-BY

# 4 DATA SPECIFIC INFORMATION FOR \input\data_base\rmsp.pbf
- an OpenStreetMap street network of São Paulo, used by the setup_r5 function of the r5r package
- available from the research group

# 4 DATA SPECIFIC INFORMATION FOR \input\data_base\SP20181016_Base.zip
- General Transit Feed Specification (GTFS) of the São Paulo public transport network for the base scenario, including municipal buses and rail lines
- represents the network as of 2018-10-16
- available from the research group

# 4 DATA SPECIFIC INFORMATION FOR \input\data_prop\rmsp.pbf
- an OpenStreetMap street network of São Paulo, used by the setup_r5 function of the r5r package; the same file as in the base scenario
- available from the research group

# 4 DATA SPECIFIC INFORMATION FOR \input\data_prop\SP20181016_L15C.zip
- General Transit Feed Specification (GTFS) of the São Paulo public transport network for the proposed scenario. Contains the same information as the base scenario, plus the complete Line 15
- represents the network as of 2018-10-16
- designed by Freiberg (2022), following the “Relatório Integrado da CMSP” (Metrô, 2021)

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\EQUIPAMENTOS_SHP_TEMA_CULTURA\SIRGAS_SHP_TEMA_-_CULTURA_BIBLIOTECAS.shp
- location of 171 libraries; other attributes are not used

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\EQUIPAMENTOS_SHP_TEMA_CULTURA\SIRGAS_SHP_TEMA_-_CULTURA_ESPACOS_CULTURAIS.shp
- location of 321 cultural centers; other attributes are not used

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\EQUIPAMENTOS_SHP_TEMA_CULTURA\SIRGAS_SHP_TEMA_-_CULTURA_MUSEUS.shp
- location of 135 museums; other attributes are not used

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\EQUIPAMENTOS_SHP_TEMA_CULTURA\SIRGAS_SHP_TEMA_-_CULTURA_TEATRO-CINEMA-SHOW.shp
- location of 1062 cinemas, theaters and concert halls; other attributes are not used

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\SIRGAS_SHP_parquespde5\SIRGAS_SHP_parquespde5.shp
- Number of variables: 7
- Number of cases/rows: 501
- Variable List:
	- pde5_id: identifier;
	- pde5_nome: not used;
	- pde5_codig: not used;
	- pde5_esfer: not used;
	- pde5_area: not used;
	- pde5_categ: class of the park; only parks classified as "Parque Estadual de Proteção Integral", "Parque Estadual Urbano" or "Parque Municipal Existente" are kept.
- Missing data codes: NA
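
The category filter on pde5_categ can be illustrated with a minimal sketch. It is written in plain Python and is independent of shared_version.R; the function name `filter_parks` and the sample rows are hypothetical, and only the three category labels come from this readme.

```python
# Categories kept by the filter, as listed for pde5_categ above.
KEPT_CATEGORIES = {
    "Parque Estadual de Proteção Integral",
    "Parque Estadual Urbano",
    "Parque Municipal Existente",
}


def filter_parks(records):
    """Return only the records whose pde5_categ is one of the kept categories."""
    return [r for r in records if r.get("pde5_categ") in KEPT_CATEGORIES]


# Hypothetical sample rows; real values come from SIRGAS_SHP_parquespde5.shp.
sample = [
    {"pde5_id": 1, "pde5_categ": "Parque Municipal Existente"},
    {"pde5_id": 2, "pde5_categ": "Categoria hipotética"},
    {"pde5_id": 3, "pde5_categ": "Parque Estadual Urbano"},
]

kept_ids = [r["pde5_id"] for r in filter_parks(sample)]
print(kept_ids)
```

Rows with a missing or unlisted category are dropped, since `r.get("pde5_categ")` returns `None` for them.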

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\Area_funcionamento_parques.csv
- Number of variables: 10
- Number of cases/rows: 105
- Variable List: 
	- OBJECTID: identifier;
	- pq_nome: name of the park;
	- ORIG_FID: not used;
	- Abertura: opening hour of the park;
	- Fechamento: closing hour of the park;
	- OBJECTID_1: not used;
	- pq_nome_1: not used;
	- SHAPE_Length: not used;
	- SHAPE_Area: not used;
	- Area: not used.
- Missing data codes: NA

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\Zonas2017b.shp
- Number of variables: 8
- Number of cases/rows: 517
- Variable List:
	- NumeroZona: identifier;
	- NomeZona: associated name;
	- NumeroMuni: number of the municipality, used to filter the zones in São Paulo;
	- NomeMunici: associated name;
	- NumDistrit: number of the district;
	- NomeDistri: associated name;
	- Area_ha_2: area in hectares.
- Missing data codes: NA

# 4 DATA SPECIFIC INFORMATION FOR \input\layers\centroids_zones_coord.shp
- Number of variables: 10
- Number of cases/rows: 517
- Variable List:
	- NumeroZona: identifier;
	- NomeZona: associated name;
	- NumeroMuni: number of the municipality, used to filter the zones in São Paulo;
	- NomeMunici: associated name;
	- NumDistrit: number of the district;
	- NomeDistri: associated name;
	- Area_ha_2: area in hectares;
	- xcoord: longitude;
	- ycoord: latitude.
- Missing data codes: NA

# 4 DATA SPECIFIC INFORMATION FOR \input\Basico_SP1.csv
- Number of variables: 
- Number of cases/rows: 
- Variable List:
	- Cod_setor: identifier;
	- Cod_Grandes Regioes: not used;
	- Nome_Grande_Regiao: not used;
	- Cod_UF: not used;
	- Nome_da_UF: not used;
	- Cod_meso: not used;
	- Nome_da_meso: not used;
	- Cod_micro: not used;
	- Nome_da_micro: not used;
	- Cod_RM: not used;
	- Nome_da_RM: not used;
	- Cod_municipio: not used;
	- Nome_do_municipio: not used;
	- Cod_distrito: not used;
	- Nome_do_distrito: not used;
	- Cod_subdistrito: not used;
	- Nome_do_subdistrito: not used;
	- Cod_bairro: not used;
	- Nome_do_bairro: not used;
	- Situacao_setor: not used;
	- Tipo_setor: not used;
	- V001: not used;
	- V002: resident population in permanent private households;
	- V003: not used;
	- V004: not used;
	- V005: average monthly nominal income of the people responsible for permanent private households (with and without income);
	- V006: not used;
	- V007: not used;
	- V008: not used;
	- V009: not used;
	- V010: not used;
	- V011: not used;
	- V012: not used;
	- ...34: not used.
- Missing data codes: NA

# 4 DATA SPECIFIC INFORMATION FOR \results\persons_access.csv
- Number of variables: 222
- Number of cases/rows: 27059 
- Variable List:
	- ID_PESS: identifier of the person;
	- FE_PESS: expansion factor adopted by the household origin-destination (OD) survey;
	- ZONA: home zone;
	- ZONATRA1: job zone;
	- RENDA_FA: household monthly income;
	- quantile_income: income quantile (decile), from 1 to 10;
	- NomeDistri: name of the district where the person lives;
	- selec: boolean marking those who live in the selected zones;
	- all remaining columns: the name is split by "_"; the first part defines the measure, the last part defines the scenario, and the one to three parts in between (depending on the measure) mark the opportunity, the time threshold and the hour.
- Missing data codes: NA
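
The naming convention of the remaining columns can be unpacked with a short sketch. It is plain Python and not part of shared_version.R; the function name `parse_access_column` and the example column name `cum_parks_30_18_base` are hypothetical, so check them against the actual header of persons_access.csv. Because this readme does not say which middle part holds the opportunity, the time threshold or the hour for each measure, the sketch returns the middle parts as a list and does not label them.

```python
# Descriptive columns listed above; they are not accessibility measures.
DESCRIPTIVE_COLUMNS = {
    "ID_PESS", "FE_PESS", "ZONA", "ZONATRA1",
    "RENDA_FA", "quantile_income", "NomeDistri", "selec",
}


def parse_access_column(name):
    """Split an accessibility column name into measure, middle parts and scenario.

    Returns None for the descriptive columns, which do not follow the convention.
    """
    if name in DESCRIPTIVE_COLUMNS:
        return None
    parts = name.split("_")
    middle = parts[1:-1]
    # The readme states that one to three parts sit between measure and scenario.
    if not 1 <= len(middle) <= 3:
        raise ValueError(f"unexpected number of parts in column name: {name!r}")
    return {"measure": parts[0], "middle": middle, "scenario": parts[-1]}


# Hypothetical column name, used only to illustrate the convention.
example = parse_access_column("cum_parks_30_18_base")
print(example)
```

Descriptive columns are skipped explicitly because some of them, such as quantile_income, also contain an underscore.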

# 4 DATA SPECIFIC INFORMATION FOR \results\zones_access.csv
- Number of variables: 265
- Number of cases/rows: 342
- Variable List: 
	- idz: identifier;
	- NomeZona: associated name;
	- NumeroMuni: number of the municipality, used to filter the zones in São Paulo;
	- NomeMunici: associated name;
	- NumDistrit: number of the district;
	- NomeDistri: associated name;
	- Area_ha_2: area in hectares;
	- urban: boolean marking whether the zone is mainly urban;
	- P5 to P23: one column per hour, each with the area of parks that are open at that hour;
	- E5 to E23: one column per hour, each with the area of parks that are open at that hour in the extended opening hours scenario;
	- cpd: number of cultural opportunities during the day;
	- cpn: number of cultural opportunities during the night;
	- pop: population of the zone.
- Missing data codes: NA
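
The hourly park columns can be listed programmatically, as in the sketch below. It assumes the column names are the prefix followed directly by the hour, with no separator or zero padding (P5, P6, ..., P23), as the ranges above suggest; the helper name `hourly_columns` is hypothetical, and the result should be checked against the header of zones_access.csv.

```python
def hourly_columns(prefix, first_hour=5, last_hour=23):
    """Build the column names for one prefix, one per hour, both ends included."""
    return [f"{prefix}{hour}" for hour in range(first_hour, last_hour + 1)]


park_cols = hourly_columns("P")      # park area open per hour
extended_cols = hourly_columns("E")  # same, extended opening hours scenario

print(park_cols[0], park_cols[-1], len(park_cols))
```

These lists can then be used to select the park-area columns for a given scenario without typing each name by hand.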