
The “unknown problem” is actually very familiar to anyone who analyzes applicant data. So, it’s not really “unknown” at all. But the problem is much broader than one might imagine, and has implications that ripple through several analyses in your Affirmative Action Plan.

Unknown Race and/or Gender Status in Applicant Data

I have never seen an applicant data set that didn’t have several records with race and/or gender left blank or marked as “unknown.” The question is how these records enter into the analysis, if at all. It might seem obvious to exclude all applicants for whom both the race and gender are unknown. But what about candidates who self-ID for one, but not the other?

And just in case you missed my last article, “Getting it Right the First Time,” I’ll mention here that you must ensure that candidates who are hired have their race and gender updated in the applicant listing if they self-ID at the time of hire. Most federal contractors have a database for applicant information that is separate from the HRIS. This means that a hire in the HRIS (with a known race and gender) may be listed in the applicant log with an unknown race and/or gender. Suffice it to say, these data anomalies must be corrected before starting your selection impact ratio analysis (IRA).
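
As a rough illustration of that cleanup step, here is a minimal Python sketch. The field names (`employee_id`, `hired`, `race`, `gender`) and the `"unknown"` marker are assumptions made for the example, not the layout of any particular applicant tracking system or HRIS. It fills in unknown values only for hires, and only from what the HRIS already records.

```python
UNKNOWN = "unknown"  # assumed marker for a missing self-ID value

def backfill_from_hris(applicant_log, hris_records):
    """Return a copy of the applicant log in which hires with unknown
    race/gender are updated from the matching HRIS record.

    applicant_log: list of dicts (hypothetical layout)
    hris_records:  dict mapping employee_id -> dict with race/gender
    """
    updated = []
    for row in applicant_log:
        row = dict(row)  # copy so the original log is left untouched
        hris = hris_records.get(row.get("employee_id"))
        if row.get("hired") and hris:
            for field in ("race", "gender"):
                known_in_hris = hris.get(field, UNKNOWN) != UNKNOWN
                if row.get(field, UNKNOWN) == UNKNOWN and known_in_hris:
                    row[field] = hris[field]
        updated.append(row)
    return updated

applicant_log = [
    {"applicant_id": 1, "employee_id": "E1", "hired": True,
     "race": UNKNOWN, "gender": UNKNOWN},
    {"applicant_id": 2, "employee_id": None, "hired": False,
     "race": UNKNOWN, "gender": "female"},
]
hris_records = {"E1": {"race": "Asian", "gender": "male"}}

cleaned = backfill_from_hris(applicant_log, hris_records)
print(cleaned[0]["race"], cleaned[0]["gender"])
```

Applicants who were not hired are left exactly as they were, so their unknowns remain unknowns.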

Now, let’s suppose we have 1000 candidates for a position: 900 self-ID for gender, but only 500 self-ID for both race and gender. Of the 500 candidates who provided both, 250 were female and 250 were male. Of the 400 who provided gender but not race, 300 were male and 100 were female. Suppose there were 150 selections, 100 male and 50 female. If we analyze only the candidates with known race and gender, the selection rate for females is 50/250 and the rate for males is 100/250, so the IRA would be .5, with a difference of 4.88 standard deviations. If we then add in the applicants with known gender and unknown race, the selection rate for females would be 50/350 and the selection rate for males would be 100/550. This produces an IRA of .79, with no statistical significance.
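
The arithmetic above can be checked with a short Python sketch. The article does not say which statistical test lies behind the standard deviation figure. The code assumes a pooled two-proportion test, which does reproduce the 4.88 figure. The function names are illustrative only.

```python
import math

def impact_ratio(sel_a, pool_a, sel_b, pool_b):
    """Selection rate of group A divided by selection rate of group B."""
    return (sel_a / pool_a) / (sel_b / pool_b)

def std_devs(sel_a, pool_a, sel_b, pool_b):
    """Difference in selection rates, in standard deviations.

    Assumption: pooled two-proportion test (not stated in the article).
    """
    pooled = (sel_a + sel_b) / (pool_a + pool_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / pool_a + 1 / pool_b))
    return abs(sel_a / pool_a - sel_b / pool_b) / se

# (female selections, female pool, male selections, male pool)
scenarios = {
    "known race and gender only": (50, 250, 100, 250),
    "plus known gender, unknown race": (50, 350, 100, 550),
}

for label, counts in scenarios.items():
    ira = impact_ratio(*counts)
    sd = std_devs(*counts)
    print(f"{label}: IRA = {ira:.2f}, standard deviations = {sd:.2f}")
```

Under that assumption the first scenario gives an IRA of .50 at 4.88 standard deviations. The second gives an IRA of .79 at roughly 1.53 standard deviations, below the two-standard-deviation level commonly treated as statistically significant.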

As we know, a different gender composition among the “unknown race/known gender” applicants will produce a different result. What is important to understand is that how you decide to handle unknowns will affect the results of your analysis. Further, to maintain data integrity in the process, you must apply your method consistently.

Unknown Disability and/or Veteran Status in Applicant Data

Now that we need to consider disability and veteran status in our data collection and analyses, should we exclude candidates who have not self-identified as an individual with a disability or as a veteran?

Why does this matter? Well, let’s suppose you have 1000 candidates for a position, 600 of whom self-ID for everything, and 300 of those 600 are individuals with disabilities (yes, I know this is unlikely, but stay with me here). If you make 300 selections, and 100 are individuals with a disability, then the selection rate for individuals with a disability is 100/300, or .33, and the rate for people without a disability is 200/300, or .67. That gives an IRA of .5 and a difference of 8.16 standard deviations. This is a problem.

At the SWARM conference last year, OFCCP Regional Director Melissa Speer confirmed that when we analyze our workforce for purposes of setting a utilization goal, we count unknowns in our workforce as persons without a disability. We should be able to do the same for our applicant data.

So, if we now include the 400 candidates with unknown status in our analysis as persons without a disability, the selection rate for individuals with a disability remains .33, but the selection rate for people without a disability drops to 200/700, or .29, with an IRA of 1.17. Now we have no adverse impact against persons with a disability.
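
To see how much the treatment of unknowns moves the result, here is a small Python sketch of the disability example. It assumes, as the figures above imply, that none of the 400 candidates with unknown status were among the 300 selections. The function and parameter names are made up for illustration.

```python
def disability_impact_ratio(sel_iwd, pool_iwd, sel_other, pool_other,
                            unknown=0, count_unknown_as_other=False):
    """Impact ratio for individuals with a disability (IWD) versus others.

    If count_unknown_as_other is True, candidates with unknown status are
    added to the pool of persons without a disability. Assumption: none of
    the unknown candidates were selected.
    """
    if count_unknown_as_other:
        pool_other += unknown
    rate_iwd = sel_iwd / pool_iwd
    rate_other = sel_other / pool_other
    return rate_iwd / rate_other

excluded = disability_impact_ratio(100, 300, 200, 300, unknown=400)
included = disability_impact_ratio(100, 300, 200, 300, unknown=400,
                                   count_unknown_as_other=True)

print(f"unknowns excluded: IRA = {excluded:.2f}")
print(f"unknowns counted as without a disability: IRA = {included:.2f}")
```

The same selections yield an IRA of .50 when unknowns are excluded and 1.17 when they are counted as persons without a disability, which is why the method has to be chosen deliberately and applied consistently.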

The regulations do not currently require an adverse impact analysis for disability, but if the data is there, the analysis can be done. This is an area where you will want to stay out in front.

Unknown Race and/or Gender Status in the Employee Database

OFCCP has an FAQ on their website stating the following:

  • What is the correct procedure for a contractor to obtain the ethnic information of its employees and applicants?

    OFCCP regulations 41 CFR 60-1.12(c) indicate that for any personnel or employment record a contractor maintains, it must be able to identify the gender, race, and ethnicity of each employee and, where possible, the gender, race and ethnicity of each applicant.

    OFCCP has not mandated a particular method of collecting the information. Self-identification is the most reliable method and preferred method for compiling information about a person's gender, race and ethnicity. Contractors are strongly encouraged to rely on employee self-identification to obtain this information. Visual observation is an acceptable method for identifying demographic data, although it may not be reliable in every instance. If self-identification is not feasible, post-employment records or visual observation may be used to obtain this information. Contractors should not guess or assume the gender, race or ethnicity of an applicant or employee. [emphasis added]

Though the regulation states a zero tolerance policy, the guidance from the OFCCP website anticipates that there may be employees with an unknown race and/or gender. With the new regulations regarding Gender Identity, this guidance is even more important. Why? Because guessing the gender of an employee could be a basis for discrimination, particularly if you guess incorrectly. So, leave the unknowns as unknowns.

Next, you need to decide how you will analyze the data with unknowns in your workforce. Your AAP software may not allow for unknowns, but OFCCP will want all employees accounted for in the AAP, regardless of whether you are missing any data points for some employees. To provide the most accurate picture of your workforce, unknowns should be included in as many analyses as possible. Many of the analyses do not rely on mathematical computations, so it is relatively easy to include unknowns in the Workforce Analysis, Job Group Analysis, and transactional summaries. Unknowns could then be excluded from the Internal Availability, External Availability, Two Factor Analysis, Utilization Analysis, and all Impact Ratio Analyses, because those computations rely on the race and gender data points for all employees included in the analysis.
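
One way to keep that treatment consistent is to write the rule down once and apply it everywhere. The Python sketch below encodes the split suggested above. The record layout and the `"unknown"` marker are assumptions for the example, and the mapping reflects this article's suggestion rather than any OFCCP requirement.

```python
UNKNOWN = "unknown"  # assumed marker for a missing race or gender value

# Whether each analysis keeps employees with unknown race or gender,
# following the split suggested in the paragraph above.
INCLUDE_UNKNOWNS = {
    "Workforce Analysis": True,
    "Job Group Analysis": True,
    "Transactional Summary": True,
    "Internal Availability": False,
    "External Availability": False,
    "Two Factor Analysis": False,
    "Utilization Analysis": False,
    "Impact Ratio Analysis": False,
}

def records_for(analysis, employees):
    """Return the employee records that should feed the named analysis."""
    if INCLUDE_UNKNOWNS[analysis]:
        return list(employees)
    return [e for e in employees
            if e["race"] != UNKNOWN and e["gender"] != UNKNOWN]

employees = [
    {"id": 1, "race": "White", "gender": "female"},
    {"id": 2, "race": UNKNOWN, "gender": "male"},
    {"id": 3, "race": "Black", "gender": UNKNOWN},
]

print(len(records_for("Workforce Analysis", employees)))
print(len(records_for("Utilization Analysis", employees)))
```

Keeping the rule in one table also makes it easier to show an auditor exactly how unknowns were handled in each analysis.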


It is almost impossible to eliminate “unknowns” from your data analyses. Even our best efforts to collect this information will rarely result in a 100% self-ID response for race, gender, disability, and veteran status. The best we can do is adapt to this reality and develop a consistent and reliable strategy for handling unknowns. Then be prepared to defend your approach when OFCCP knocks on your door.

For more information on how to analyze data with unknown factors, please contact Marilynn Schuyler at [email protected].

Please note: nothing in this article is intended as legal advice or as a substitute for any professional advice about your organization's particular circumstances. All original materials copyright © Schuyler Affirmative Action Practice.


