HR software companies: why is structuring your data crucial for your business?

Benchmarking Resume Parsing Solutions: Daxtra, Sovren, Hireability, Textkernel and Segmentr (by Riminder)

Mouhidine SEIV
HrFlow.ai (formerly Riminder)

--

If you work in the HR software industry or in an HR department, you have almost certainly thought about structuring your CV data. Projects that require resume structuring can be crucial to your business, and they range from simple user experience improvements to strategic product roadmap advancements. Here are some common use cases for structured CV data in the HR software industry:

  • creating an efficient and relevant talent search experience
  • getting market-relevant insights about your talent pools
  • building usable datasets for an AI-based job matching tool

Now you can parse all your resumes, no exceptions!

Resume Parsing, the inevitable solution to your problem

Copyright Riminder 2018

A resume parsing solution is software that takes a resume as input, in any media format (PDF, Word or image) and any template, and converts it into a structured data format such as XML or JSON.

The information extracted by a resume parser usually includes the following:

  • personal information: name, email, address, phone
  • list of experience: start date, end date, location, job title, company, description, …
  • list of education: start date, end date, location, degree, university, …
  • list of skills, …
  • list of interests
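To make the target format concrete, here is a minimal Python sketch of how such structured output could be modeled and serialized to JSON. The class and field names simply mirror the list above; they are assumptions made for illustration, not the schema of Segmentr or of any other parser.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class PersonalInfo:
    name: Optional[str] = None
    email: Optional[str] = None
    address: Optional[str] = None
    phone: Optional[str] = None


@dataclass
class Experience:
    start_date: Optional[str] = None
    end_date: Optional[str] = None
    location: Optional[str] = None
    job_title: Optional[str] = None
    company: Optional[str] = None
    description: Optional[str] = None


@dataclass
class Education:
    start_date: Optional[str] = None
    end_date: Optional[str] = None
    location: Optional[str] = None
    degree: Optional[str] = None
    university: Optional[str] = None


@dataclass
class ParsedResume:
    # Hypothetical container; every field is optional because a resume
    # may omit any of these sections.
    personal_info: PersonalInfo = field(default_factory=PersonalInfo)
    experiences: List[Experience] = field(default_factory=list)
    educations: List[Education] = field(default_factory=list)
    skills: List[str] = field(default_factory=list)
    interests: List[str] = field(default_factory=list)


def to_json(resume: ParsedResume) -> str:
    """Serialize the nested dataclasses to a JSON string."""
    return json.dumps(asdict(resume), indent=2)


# Made-up example data, for illustration only.
example = ParsedResume(
    personal_info=PersonalInfo(name="Jane Doe", email="jane.doe@example.com"),
    experiences=[
        Experience(start_date="2015-01", end_date="2018-06",
                   job_title="Data Analyst", company="Example Corp")
    ],
    skills=["Python", "SQL"],
)
example_json = to_json(example)
```

Keeping every field optional reflects the fact that real resumes are incomplete and inconsistent, which is a large part of what makes parsing hard.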

Seems easy? But the reality is hard!
No improvement in more than 10 years

Here are a few metrics:

  • More than 1.4 billion resumes are parsed every year.
  • More than 40% of resumes have a complex layout (multi-column, etc.).
  • More than 7% of resumes are either scans or images.

The first resume parsers were born in the late '90s to provide data structuring technology to HR software companies that were looking for a stand-alone packaged solution so that they could focus on their core business. Some of these first-mover solutions are:

  • Sovren (1996)
  • TextKernel (2001)
  • Daxtra (2002)

How are Daxtra, Sovren, Hireability, Textkernel and Segmentr (by Riminder) doing at this task?

Building a general and reliable parser requires many building blocks.
For instance, the system should be able to handle:

  • complex layouts (e.g. multi-column resumes, pictures with backgrounds, etc.)
  • ambiguous entities (e.g. Facebook as a former employer vs. Facebook as a social media skill)
  • different media formats (PDF, Word, image, etc.)
  • multiple languages
  • etc.
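As one small illustration of the "different media formats" item, a parser's front end can inspect the first bytes of a file to decide how to read it. The sketch below is a minimal example under that assumption. The function names and the routing rule are hypothetical, and they say nothing about how Segmentr or any other vendor works internally.

```python
def sniff_media_format(data: bytes) -> str:
    """Guess the container format of a file from its leading bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        # ZIP container: .docx files are ZIP archives, but so are many others.
        return "zip-container"
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        # OLE2 compound file, used by legacy .doc documents.
        return "ole2-container"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


def needs_ocr(media_format: str) -> bool:
    """Images carry no text layer, so they need OCR before any parsing."""
    return media_format in {"png", "jpeg"}
```

A scanned PDF would still be reported as "pdf" here, so a real pipeline would also have to check whether the PDF actually contains a text layer.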

The following comparison between some of Segmentr’s features and well-known existing resume parsing solutions is the result of extensive validation tests we ran at Riminder:

Features benchmark

Segmentr (by HrFlow.ai) is the only Resume Parser able to handle such examples

We’ve also computed the performance of each solution over a validation dataset of around 100 randomly sampled resumes. For each solution's output, we averaged the accuracy obtained across the multiple labels. Below is a graph summarizing the results:
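For readers who want to run a similar comparison, here is a minimal Python sketch of one way to average accuracy across labels. It assumes exact-match scoring per label and an unweighted mean over labels. The exact metric used in the benchmark above is not specified, so treat the function names and the scoring rule as assumptions.

```python
from statistics import mean
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def label_accuracy(predictions: List[Record], references: List[Record],
                   label: str) -> float:
    """Share of resumes where the predicted value equals the reference."""
    hits = [pred.get(label) == ref.get(label)
            for pred, ref in zip(predictions, references)]
    return sum(hits) / len(hits)


def average_accuracy(predictions: List[Record], references: List[Record],
                     labels: List[str]) -> float:
    """Unweighted mean of the per-label accuracies."""
    return mean([label_accuracy(predictions, references, label)
                 for label in labels])


# Made-up toy data: two resumes, two labels.
references = [
    {"name": "Jane Doe", "email": "jane@example.com"},
    {"name": "John Roe", "email": "john@example.com"},
]
predictions = [
    {"name": "Jane Doe", "email": "jane@example.com"},
    {"name": "John Roe", "email": None},  # email missed by the parser
]
score = average_accuracy(predictions, references, ["name", "email"])
```

On this toy data the name label scores 1.0 and the email label 0.5, so the averaged accuracy is 0.75.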

Extracted information accuracy overview

Segmentr example in Python

First, you have to send your data to the parsing endpoint with a POST request:
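As a rough sketch of what such a call can look like using only the Python standard library: the URL, header name and form field below are hypothetical placeholders, not Segmentr's actual API, so take the real values from the documentation.

```python
import urllib.request
import uuid

# Hypothetical placeholders, NOT the real Segmentr endpoint or header name.
PARSE_URL = "https://api.example.com/resumes/parse"
API_KEY_HEADER = "X-API-KEY"


def build_parse_request(file_name: str, file_bytes: bytes,
                        api_key: str) -> urllib.request.Request:
    """Build (without sending) a multipart/form-data POST carrying one file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # The form field name "file" is an assumption for this sketch.
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        PARSE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            API_KEY_HEADER: api_key,
        },
    )


request = build_parse_request("resume.pdf", b"%PDF-1.4 ...", "YOUR_API_KEY")

# Sending the request requires a real endpoint and key:
# with urllib.request.urlopen(request) as response:
#     parsed = response.read()
```

Building the request separately from sending it makes it easy to inspect the payload before pointing the code at a real endpoint.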

Here is the structure of the data that you’ll get:

What’s next for you?

Discover Segmentr Live
If you would like to know more about Segmentr, you can book a demo with us: https://hrflow.ai/book-us

You can also visit our labs at https://labs.hrflow.ai to see AI applied to HR in action.

Are you a Developer?
You can start using our self-service API right now, without any painful setup.
Get started in a few minutes with our documentation:

If you enjoyed this article, it would really help if you hit recommend below :)

Follow us on Twitter @hrflowai

If you want to work on AI + FAIRNESS, check our jobs page!

--

Founder and CEO at Riminder, revealing people’s full potential and making hiring bias-free. I share my entrepreneurial lessons learned the hard way.