
California Civil Rights Department (CRD) Data Cleansing Project

The Civil Rights Department (CRD) is the largest state civil rights agency in the country. CRD has been at the forefront of protecting the rights of Californians since its inception. The mission of the CRD is to protect the people of California from unlawful discrimination in employment, housing, and public accommodations (businesses) and from hate violence and human trafficking in accordance with the Fair Employment and Housing Act (FEHA), Unruh Civil Rights Act, Disabled Persons Act, and Ralph Civil Rights Act.

Data Cleansing

Challenge
Senate Bill (SB) 973, chaptered September 30, 2020, requires a private employer that has 100 or more employees and is required to file an annual Employer Information Report (EEO-1) under federal law to submit a Pay Data Report to CRD on or before March 31, 2021, and annually thereafter. The report contains specified confidential employment, demographic, and pay information. The bill further requires CRD to make the Pay Data Reports available to the Department of Industrial Relations, Division of Labor Standards Enforcement (DLSE), upon request, and permits CRD to use the Pay Data Reports to publish aggregate reports and as necessary for administrative enforcement or civil actions.
For each of reporting years 2020 and 2021, CRD received approximately 20,000 certified Pay Data Reports. The 2021 reports provided pay and demographic data for approximately 8.6 million employees at 178,000 establishments and included 3 million employee detail records. CRD provided the raw 2021 data from several data sources to Lume to cleanse and prepare for statistical analysis, and to assist CRD in creating reports and data visualizations of California establishment data at the statewide, industry, and regional levels.
Approach
After an initial analysis of the data files, Lume combined the raw data into a single consolidated data file. We then applied our data-cleaning approach, developing methods to identify and remove duplicate records, flag invalid data for manual review, and enrich records with missing data.
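The four steps named above (consolidate, de-duplicate, flag, enrich) can be illustrated with a minimal Python sketch. It is not Lume's actual method or CRD's schema: the Record fields, the duplicate key, the validity rule, and the lookup table are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


# Hypothetical record layout; the real Pay Data Report fields are not shown here.
@dataclass(frozen=True)
class Record:
    employer_id: str
    establishment_id: str
    naics_code: Optional[str]
    employee_count: int


def consolidate(*sources):
    """Combine records from several raw sources into one list."""
    return [rec for source in sources for rec in source]


def split_duplicates(records):
    """Keep the first record per (employer, establishment); set aside the rest."""
    seen = set()
    unique, duplicates = [], []
    for rec in records:
        key = (rec.employer_id, rec.establishment_id)  # assumed duplicate key
        if key in seen:
            duplicates.append(rec)
        else:
            seen.add(key)
            unique.append(rec)
    return unique, duplicates


def needs_review(rec):
    """Assumed validity rule: a non-positive head count goes to manual review."""
    return rec.employee_count <= 0


def enrich(rec, naics_by_employer):
    """Fill a missing industry code from a lookup keyed by employer."""
    if rec.naics_code is None and rec.employer_id in naics_by_employer:
        return replace(rec, naics_code=naics_by_employer[rec.employer_id])
    return rec


# Made-up sample rows from two hypothetical sources.
source_a = [Record("E1", "S1", "5415", 120), Record("E2", "S1", None, 300)]
source_b = [Record("E1", "S1", "5415", 120), Record("E3", "S1", "6221", 0)]

combined = consolidate(source_a, source_b)
unique, duplicates = split_duplicates(combined)
flagged = [r for r in unique if needs_review(r)]
clean = [enrich(r, {"E2": "4451"}) for r in unique if not needs_review(r)]
```

Keeping the duplicates and flagged records in their own lists, rather than discarding them, mirrors the idea of delivering a separate file of ineligible and duplicate records alongside the clean one.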
Outcome
The outcome was a clean, publication-ready 2021 data file for CRD containing the records eligible for reporting and analytics, along with a separate data file of ineligible and duplicate records. Lume continues this effort by providing data aggregation and visualization to CRD, enabling CRD to produce publication-ready data reports for public access.
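As a rough illustration of aggregating a clean file at the statewide, industry, and regional levels, the sketch below sums employee counts by each grouping. The row layout, industry and region labels, and counts are made-up placeholders, not CRD data or Lume's reporting logic.

```python
from collections import defaultdict

# Hypothetical cleaned rows: (industry, region, employee_count).
rows = [
    ("Health Care", "Bay Area", 120),
    ("Health Care", "Los Angeles", 300),
    ("Retail Trade", "Bay Area", 80),
]


def total_by(rows, key_index):
    """Sum employee counts grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


statewide = sum(row[2] for row in rows)  # single statewide total
by_industry = total_by(rows, 0)          # one total per industry
by_region = total_by(rows, 1)            # one total per region
```

The same grouping function serves every level of the report, so adding another dimension only requires another column index.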
